Automating S3 Lifecycle Policies with Terraform

Amazon S3 storage is a popular service for storing backup and archive data. Infinite scalability, data tiering capabilities, and low cost make S3 a very attractive platform. While Amazon has an automated tiering option for S3, there are times when you may want to control how you tier your data yourself. The key is making sure that the data lifecycle rules for your S3 bucket align with the minimum retention requirements Amazon has defined for the various S3 tiers.

Amazon has thorough documentation on the rules that govern S3 pricing and tiering. Two details are worth calling out:
  • Data moved to S3 Infrequent Access (IA) needs to live in IA for at least 30 days. If you delete, overwrite, or transition data out of IA within the first 30 days, you are still charged for the full 30 days of IA storage.
  • Data moved to Glacier needs to live in Glacier for at least 90 days. Deleting or transitioning this data prematurely will likewise result in being charged for the full 90 days.
Any lifecycle policies you create on an S3 bucket, whether for object tiering or object deletion, should be written to adhere to these rules. Failing to do so can rob you of the value you should get from tiering your data.

HashiCorp Terraform allows you to create reusable code modules to standardize how developers launch cloud infrastructure. With a carefully written module, you can let your end-users define their own data retention patterns without enabling rules that violate Amazon’s minimum retention requirements.

I’ll start by creating variables to take in the retention values that a consumer of this module wants to set on her bucket:

variable "expiration_days" {
 description = "The number of days for objects to live"
 default     = "1460"
}

variable "noncurrent_expiration_days" {
 description = "The number of days for non-current objects to be retained"
 default     = "120"
# Only needed if versioning is enabled for this bucket
}

variable "transition_days_standard_ia" {
 description = “Number of days (>=30) after creation for transition to S3-IA”
 default = "180"
}

variable "transition_days_glacier" {
 description = “Number of days after creation for transition to GLACIER”
 default = "540"
}
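
To make these variables concrete, here is a minimal sketch of how a consumer might call the module; the module name, source path, and values are all hypothetical:

module "backup_bucket" {
 source = "./modules/s3-lifecycle-bucket" # hypothetical path to this module

 expiration_days             = "730"
 noncurrent_expiration_days  = "120"
 transition_days_standard_ia = "60"
 transition_days_glacier     = "365"
}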

Next, I’ll create two local values within my module to define the number of days non-current file versions must live in a given tier before they can be transitioned to a less expensive one. Two things to note about these values: first, they are only used if versioning is turned on for the bucket. Second, when versioning is turned on, the frequency with which an object is modified determines the volume of storage consumed by previous versions. If an object is modified frequently, it’s in the bucket owner’s best interest to move those previous versions to lower-cost storage tiers as soon as possible. Using these local values ensures previous versions are moved down a tier as soon as they are eligible:

locals {
 transition_days_standard_ia_noncurrent_version = "30"
 transition_days_glacier_noncurrent_version     = "90"
}

As part of our resource "aws_s3_bucket" code block, we’re going to include the lifecycle rules. First, we’ll cover the lifecycle rule that transitions data from standard S3 to S3-IA:

lifecycle_rule {
 id      = "standard_ia_transition"
 enabled = "${(var.expiration_days - var.transition_days_standard_ia > 30)}"

 transition {
   days          = "${var.transition_days_standard_ia}"
   storage_class = "STANDARD_IA"
 }

 noncurrent_version_transition {
    days          = "${(var.noncurrent_expiration_days - local.transition_days_standard_ia_noncurrent_version > 30) ? local.transition_days_standard_ia_noncurrent_version : (var.noncurrent_expiration_days + 30)}"
   storage_class = "STANDARD_IA"
 }
}

In this example, the code only allows the lifecycle rule to be enabled if the primary copy of the data will not expire before it has lived in S3-IA for 30 days. Because the rule’s enabled flag is driven by current object versions, I have to use a different technique to prevent tiering down previous object versions that won’t live in IA for the full 30 days: I set the transition day to 30 days after the noncurrent versions expire, effectively preventing them from ever moving down to S3-IA. For example, with the default noncurrent_expiration_days of 120, previous versions transition to IA on day 30; if that variable were set to 40 instead, the ternary would schedule the transition for day 70, after those versions have already been deleted.

I then add a second lifecycle rule, using a similar process, to transition objects from S3-IA to Glacier. A sketch of that rule, following the same pattern with Glacier’s 90-day minimum in place of IA’s 30 (the rule id here is my own naming), might look like this:
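
lifecycle_rule {
 id      = "glacier_transition"
 enabled = "${(var.expiration_days - var.transition_days_glacier > 90)}"

 transition {
   days          = "${var.transition_days_glacier}"
   storage_class = "GLACIER"
 }

 noncurrent_version_transition {
   days          = "${(var.noncurrent_expiration_days - local.transition_days_glacier_noncurrent_version > 90) ? local.transition_days_glacier_noncurrent_version : (var.noncurrent_expiration_days + 90)}"
   storage_class = "GLACIER"
 }
}

Lastly, I add a third lifecycle rule to control the expiration of objects: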

lifecycle_rule {
 id      = "expiration"
 enabled = true

 expiration {
   days = "${var.expiration_days}"
 }

 noncurrent_version_expiration {
   days = "${var.noncurrent_expiration_days}"
 }
}
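
For context, here is a rough sketch of how these pieces fit together inside the bucket resource. The resource and bucket names are placeholders, and versioning is shown enabled because the noncurrent-version rules only matter when it is:

resource "aws_s3_bucket" "backups" {
 bucket = "my-backup-bucket" # hypothetical bucket name

 versioning {
   enabled = true
 }

 # The three lifecycle_rule blocks shown above go here
}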

This isn’t the only way to accomplish safe and appropriate lifecycle policies for your S3 bucket; it is simply how I solved the problem of preventing consumers of my module from creating costly lifecycle policies. I also wrote this using Terraform 0.11.8. With all the changes in Terraform 0.12.x, it’s possible that there are more efficient and elegant methods for accomplishing the same end result. You can find more documentation on Terraform’s lifecycle rule syntax in the aws_s3_bucket resource documentation.