In this post I'm going to talk about using Amazon Web Services (AWS) to develop applications. We will go over which services make sense for which needs and how to use them in a cost-effective way. AWS offers a lot of services, so we won't be able to cover them all, but here are the ones you should know about.
While AWS has created a secure cloud infrastructure, it is important to note that security is not Amazon's sole responsibility. Amazon uses what is called the Shared Responsibility Model, which states that AWS manages security of the cloud, while security in the cloud is the responsibility of the customer.
When you have EC2 instances and your application needs to access other AWS services (like SNS, S3, etc.), there are two ways to grant permissions to your EC2 instances. You can either manually assign credentials to the instance or you can use IAM Roles.
Best practice is to use the latter. Manually assigned credentials need to be rotated frequently, and that can be messy when you have a lot of instances. You also risk the credentials being extracted from an instance, which makes your applications more vulnerable.
Unless you are connecting from outside AWS (for example, from your own computer), you should always use IAM Roles to grant permissions.
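To make the IAM Role idea a bit more concrete, here is a minimal sketch of the two JSON documents a role for EC2 usually needs: a trust policy that lets the EC2 service assume the role, and a permissions policy for the application. The bucket name, topic ARN, account ID and function names are made up for illustration, and the exact actions you allow will depend on your application.

```python
import json


def ec2_trust_policy():
    """Trust policy: lets the EC2 service assume the role on the instance's behalf."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def app_permissions_policy(bucket_name, topic_arn):
    """Permissions policy: only what the application needs (S3 objects and one SNS topic)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {
                "Effect": "Allow",
                "Action": "sns:Publish",
                "Resource": topic_arn,
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical resource names, used only to show the generated JSON.
    topic = "arn:aws:sns:us-east-1:123456789012:example-topic"
    print(json.dumps(ec2_trust_policy(), indent=2))
    print(json.dumps(app_permissions_policy("example-app-bucket", topic), indent=2))
```

Once a role with these policies is attached to an instance, the AWS SDKs pick up temporary credentials from the instance automatically, so no access keys have to be stored in your code or configuration.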
Storage and Content Delivery
Simple Storage Service
Data storage for your applications can be costly. There are a few factors to consider when storing data on AWS: How often is the data accessed? How fast does retrieval need to be? Can the data be reproduced easily? How long does the data need to remain available?
A popular storage service on AWS is Simple Storage Service (S3). S3 can serve objects through the CloudFront CDN, host static HTML sites behind a Route 53 domain name, and it supports managed access and versioning of objects. With S3 you can also set lifecycle policies for your objects. For instance, you can specify a policy that deletes your objects or archives them to Amazon Glacier after a certain period of time.
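As an illustration of such a lifecycle policy, the sketch below builds a rule that archives objects to Glacier and later deletes them. The dictionary follows the shape S3 expects for a bucket lifecycle configuration (for example via boto3's put_bucket_lifecycle_configuration); the "logs/" prefix, the day counts and the function name are my own assumptions.

```python
import json


def lifecycle_configuration(prefix, archive_after_days, delete_after_days):
    """Build a lifecycle configuration: archive to Glacier first, delete later."""
    if archive_after_days >= delete_after_days:
        raise ValueError("objects must be archived before they are deleted")
    return {
        "Rules": [
            {
                "ID": f"archive-then-delete-{prefix.strip('/')}",
                "Filter": {"Prefix": prefix},  # only objects under this prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": delete_after_days},
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical policy: archive logs after 30 days, delete them after a year.
    print(json.dumps(lifecycle_configuration("logs/", 30, 365), indent=2))
```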
So if your application needs frequent access to the data, S3 is the service to choose. You can further reduce your cost by using S3 Reduced Redundancy Storage (RRS) instead of S3 standard storage. However, you should only use RRS for data that is easily reproducible (for example, thumbnails) or that you can afford to lose.
If your application does not need frequent access to the data, and retrieval time is not important when it does, Amazon Glacier would be suitable. Amazon Glacier offers an archival storage type. Storing and retrieving items in Amazon Glacier can take several hours, which is why it is suited to archival purposes only.
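The questions above can be boiled down to a small decision helper. This is only a sketch of the rules of thumb from this post, not an official AWS sizing tool, and the function and parameter names are invented for the example.

```python
def choose_storage(frequent_access, needs_fast_retrieval, easily_reproducible):
    """Pick a storage option using the rules of thumb described in this post."""
    # Rarely accessed data that can wait hours to be retrieved: archive it.
    if not frequent_access and not needs_fast_retrieval:
        return "Amazon Glacier"
    # Data you can regenerate (e.g. thumbnails) can live with less redundancy.
    if easily_reproducible:
        return "S3 Reduced Redundancy Storage"
    # Everything else stays on standard S3.
    return "S3 Standard"


if __name__ == "__main__":
    print(choose_storage(True, True, False))    # user uploads
    print(choose_storage(True, True, True))     # thumbnails
    print(choose_storage(False, False, False))  # old backups
```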
If your application needs a relational database, AWS offers RDS, a fully managed service for relational databases. Being fully managed means Amazon handles the underlying software updates and patches. RDS supports the following databases: PostgreSQL, MySQL, Oracle, SQL Server and Aurora. Aurora is a MySQL-compatible engine built by Amazon that is designed to be more efficient than stock MySQL. Amazon recommends using Aurora in production environments.
Another recommendation in production is RDS Multi-AZ deployment. When a Multi-AZ DB instance is provisioned, AWS automatically creates a primary DB instance and synchronously replicates the data to a standby instance in a different Availability Zone. In the event of a failure on the primary DB instance's infrastructure, AWS performs an automatic failover to the standby database, so your application should experience only minimal downtime. Multi-AZ deployment also helps when AWS performs updates and patches on the underlying database software. AWS updates the primary and the standby one at a time rather than both at once, which keeps the impact on your application small.
A popular NoSQL database in the community is MongoDB. Developers who want to keep all their services on AWS usually set up a MongoDB database on an EC2 instance. That means that, as a developer, you are responsible for any software updates to your database as well as for making sure the database is secure from the outside world. You also have to make sure the database is scalable. This already sounds like a lot of work and responsibility. An alternative would be DynamoDB, a fully managed NoSQL database service provided by AWS. DynamoDB is fully distributed, which makes it fault tolerant, and it scales automatically. AWS also manages the provisioning of all underlying hardware. As a developer you just have to specify the required throughput capacity and DynamoDB will handle the rest.
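To show what specifying throughput capacity involves, here is a rough sketch of the arithmetic. It relies on DynamoDB's documented unit sizes: one read capacity unit covers one strongly consistent read per second of an item up to 4 KB (eventually consistent reads cost half), and one write capacity unit covers one write per second of an item up to 1 KB. The function names and the sample workload are assumptions for illustration.

```python
import math


def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=True):
    """Read capacity units needed for a steady read workload."""
    units_per_read = math.ceil(item_size_kb / 4)  # reads are billed in 4 KB steps
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        total = math.ceil(total / 2)  # eventually consistent reads cost half
    return total


def write_capacity_units(item_size_kb, writes_per_second):
    """Write capacity units needed for a steady write workload."""
    return math.ceil(item_size_kb) * writes_per_second  # writes are billed in 1 KB steps


def provisioned_throughput(item_size_kb, reads_per_second, writes_per_second):
    """Capacity settings in the shape used when creating a provisioned table."""
    return {
        "ReadCapacityUnits": read_capacity_units(item_size_kb, reads_per_second),
        "WriteCapacityUnits": write_capacity_units(item_size_kb, writes_per_second),
    }


if __name__ == "__main__":
    # Hypothetical workload: 6 KB items, 10 reads and 5 writes per second.
    print(provisioned_throughput(6, 10, 5))
```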
You can further improve the performance of your application by implementing caching. For instance, you can cache database query results to avoid hitting the database on every request. You can also cache web sessions and any dynamically generated content. Amazon ElastiCache, a fully managed in-memory caching service, lets you use Redis or Memcached to achieve this.
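Caching query results usually follows the cache-aside pattern: look in the cache first, and only go to the database on a miss. The sketch below uses a plain Python dictionary as a stand-in for Redis or Memcached so that it runs anywhere; in a real setup you would talk to your ElastiCache endpoint through a Redis or Memcached client instead. The class name and the loader function are hypothetical.

```python
import time


class CacheAside:
    """Cache-aside lookup with a time-to-live, backed by an in-memory dict."""

    def __init__(self, load_from_db, ttl_seconds=60, clock=time.monotonic):
        self._load = load_from_db  # function that runs the real query
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the database is not touched
        value = self._load(key)  # cache miss or expired entry
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a cached entry, e.g. after the underlying row was updated."""
        self._store.pop(key, None)


if __name__ == "__main__":
    def slow_query(user_id):
        # Stand-in for a real database query.
        return {"id": user_id, "name": f"user-{user_id}"}

    cache = CacheAside(slow_query, ttl_seconds=30)
    print(cache.get(42))  # loaded through slow_query
    print(cache.get(42))  # served from the cache
```

The time-to-live keeps stale data from living forever; pick it based on how out of date your application can tolerate a result being.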
If you want to get up and running quickly or don't have the technical knowledge for building application environments, Elastic Beanstalk can be the deployment service of your choice. With Elastic Beanstalk you can deploy a full application environment automatically. The service integrates with other AWS services, including Elastic Load Balancer, Auto Scaling, and EC2.
CloudFormation offers developers an easy way to create and manage a collection of related AWS resources. It is essentially infrastructure as code, since CloudFormation templates are just JSON files. This is useful when you need to scale your application: you can use a CloudFormation template to build EC2 instances that belong to an Elastic Load Balancer. You can also use the templates in disaster recovery, reducing the time required to spin up a new environment.
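Because a template is just JSON, you can also generate it from code. The following sketch builds a small template with a few EC2 instances registered with a load balancer. The resource names (WebServer1, WebLoadBalancer), the AmiId parameter, the instance type and the port numbers are assumptions for the example, and I have not deployed this exact template, so treat it as a starting point rather than a finished stack.

```python
import json


def web_tier_template(instance_count=2, instance_type="t2.micro"):
    """Build a CloudFormation template: N EC2 instances behind one load balancer."""
    resources = {}
    instance_refs = []
    for i in range(1, instance_count + 1):
        name = f"WebServer{i}"
        resources[name] = {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": {"Ref": "AmiId"},  # AMI is supplied when the stack is created
                "InstanceType": instance_type,
            },
        }
        instance_refs.append({"Ref": name})

    resources["WebLoadBalancer"] = {
        "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
        "Properties": {
            "AvailabilityZones": {"Fn::GetAZs": ""},  # all zones in the stack's region
            "Instances": instance_refs,
            "Listeners": [
                {"LoadBalancerPort": "80", "InstancePort": "80", "Protocol": "HTTP"}
            ],
        },
    }

    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Sketch: EC2 instances registered with a load balancer",
        "Parameters": {"AmiId": {"Type": "AWS::EC2::Image::Id"}},
        "Resources": resources,
    }


if __name__ == "__main__":
    print(json.dumps(web_tier_template(instance_count=3), indent=2))
```

Scaling the web tier then becomes a matter of changing instance_count and updating the stack, and the same template can recreate the environment in another region during disaster recovery.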
This was just an overview of the services that AWS has to offer. There are many important services that we didn't cover, including networking services (Virtual Private Cloud, DNS, etc.).