Automating Backups to AWS S3 Using Make and Cron

Sign up for AWS

Follow the instructions in this link.
You should now have an AWS account and the AWS CLI installed.

Set up S3 Bucket

  • I will be using the new console, so select "Opt In" on the right to opt in to the new console.
  • After logging in, select S3 from the Services dropdown menu.
  • Click create bucket.
  • Set a bucket name. Note that it must be unique across all of S3, not just within your account.
  • Select a region and click next.
  • Enable versioning.
  • Make sure your permissions are set to your liking.
    • Generally just leave it with your personal user having read/write access, and be sure public permissions are unchecked.
  • Review and create bucket.
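If you'd rather script these console steps, the same bucket can be created with the AWS CLI once it is configured (see Configure AWS CLI below). The following is only a sketch: create_backup_bucket and the AWS_CMD variable are hypothetical names, not part of the CLI. Setting AWS_CMD=echo prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: create_backup_bucket and AWS_CMD are hypothetical names.
# Set AWS_CMD=echo to print the commands instead of running them.

create_backup_bucket() {
  bucket=$1
  region=$2
  aws_cmd=${AWS_CMD:-aws}

  # us-east-1 rejects an explicit LocationConstraint; other regions need one.
  if [ "$region" = "us-east-1" ]; then
    "$aws_cmd" s3api create-bucket --bucket "$bucket" --region "$region" || return 1
  else
    "$aws_cmd" s3api create-bucket --bucket "$bucket" --region "$region" \
      --create-bucket-configuration "LocationConstraint=$region" || return 1
  fi

  # Enable versioning so overwritten or deleted files keep their old copies.
  "$aws_cmd" s3api put-bucket-versioning --bucket "$bucket" \
    --versioning-configuration Status=Enabled || return 1

  # Make sure nothing in the bucket can be made public.
  "$aws_cmd" s3api put-public-access-block --bucket "$bucket" \
    --public-access-block-configuration \
    "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true"
}

# Example (placeholders in square brackets, as elsewhere in this post):
#   create_backup_bucket [your-bucket-name] [your-region]
```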

Set up Bucket Lifecycle Rules

  • Click on the bucket name.
  • Create a new folder in the bucket (I named mine 'backups').
  • Click on the lifecycle tab.
  • Click Add lifecycle rule.
  • Enter a rule name.
  • Type the name of your folder in the prefix field, select prefix [your-folder-name], and click next.
  • Select Previous Versions and add transition.
  • Select Transition to Standard-IA after and enter 30 days.
  • Click add transition and select Transition to Amazon Glacier after and enter 60 days. Click next.
  • Select Previous versions, select Permanently Delete Previous Versions, and enter 61 days.
  • Select Clean up incomplete multipart uploads and click next.
  • Review and click save.
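For reference, the rule built in the steps above can also be written as a lifecycle configuration file and applied from the CLI. This is a sketch under a few assumptions: the rule ID, the lifecycle.json file name, and the BUCKET variable are made up for illustration, the prefix assumes the folder is named 'backups', and the 7-day window for cleaning up incomplete multipart uploads is an assumed value, since the steps above do not specify one.

```shell
#!/bin/sh
# Sketch only: the rule ID, file name, and BUCKET variable are hypothetical.
# The 7-day multipart cleanup window is an assumed value.

cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "backups-previous-versions",
      "Filter": { "Prefix": "backups/" },
      "Status": "Enabled",
      "NoncurrentVersionTransitions": [
        { "NoncurrentDays": 30, "StorageClass": "STANDARD_IA" },
        { "NoncurrentDays": 60, "StorageClass": "GLACIER" }
      ],
      "NoncurrentVersionExpiration": { "NoncurrentDays": 61 },
      "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 7 }
    }
  ]
}
EOF

# Apply the rule only when a bucket name has been supplied, e.g.
#   BUCKET=[your-bucket-name] sh this-script.sh
if [ -n "${BUCKET:-}" ]; then
  aws s3api put-bucket-lifecycle-configuration \
    --bucket "$BUCKET" \
    --lifecycle-configuration file://lifecycle.json
fi
```

Without BUCKET set, the script only writes lifecycle.json, so you can review the rule before applying it.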

About Lifecycle Rules

A bit about the lifecycle rules above: I set up the transitions to cut down on costs. Transitioning files that are not often changed to Standard-IA (Standard-Infrequent Access) and then to Glacier cuts storage costs quite a bit. I also like to transition my Current Version to Standard-IA after 30 days and to Glacier after 60 days (note that this does increase retrieval time). If you'd like to set that up, just follow the above instructions but select Current Version as well as Previous Versions.

Configure AWS CLI
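
A minimal sketch of this step, assuming the standard aws configure flow: run it once interactively and it prompts for your access key ID, secret access key, default region, and output format, then stores them under ~/.aws (the folder copied to /root in the Cron section). The has_aws_credentials helper below is a hypothetical name used only to check that the credentials file exists.

```shell
#!/bin/sh
# Run this once, interactively:
#   aws configure
# It prompts for the AWS Access Key ID, AWS Secret Access Key,
# default region name, and default output format.

# has_aws_credentials DIR: hypothetical helper, succeeds if DIR contains
# the credentials file that aws configure writes.
has_aws_credentials() {
  [ -f "$1/credentials" ]
}

if has_aws_credentials "$HOME/.aws"; then
  echo "credentials file found in $HOME/.aws"
else
  echo "no credentials file yet; run: aws configure"
fi
```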

Set up Makefile

Here's an example Makefile. Personally, I just store it in my home folder. After this is set up, you'll just have to run make start-backup to get the sync started.

Run touch Makefile in your directory of choice.
Open Makefile in your editor of choice.

# Note: every recipe line under start-backup must begin with a tab, not spaces.
export HOME=/root
start-backup:
	@echo "initiating backup."
	cd /usr/local/aws/bin/ && ./aws s3 sync [path/to/folder/you/want/to/sync] s3://[your-bucket-name]/[folder] --delete \
	--exclude '.*/' \
	--exclude '.*' \
	--exclude '*node_modules*' \
	--exclude '*tmp*' \
	--exclude 'Downloads*' \
	--exclude 'Library*'
	@echo "backup finished"

Let's go through this line by line:

export HOME=/root

This is needed so that Cron will be able to find the AWS CLI credentials (addressed in Cron section below).

start-backup:

This defines the make target named start-backup (which is what allows us to run make start-backup).

@echo "initiating backup."

This just prints the message in quotes to the terminal.

cd /usr/local/aws/bin/ && ./aws s3 sync [path/to/folder/you/want/to/sync] s3://[your-bucket-name]/[folder] --delete \
    --exclude '.*/' \
    --exclude '.*' \
    --exclude '*node_modules*' \
    --exclude '*tmp*' \
    --exclude 'Downloads*' \
    --exclude 'Library*'
  • This is actually one command broken up with '\'. The reason we cd to the /usr/local/aws/bin folder and run aws there is so that later Cron will be able to run the command successfully.
  • aws is the command for the AWS CLI, so aws s3 sync is telling the AWS CLI you are going to give it a sync command.
  • Why use sync: "sync updates any files that have a different size or modified time than files with the same name at the destination." This means after an initial large backup, following backups will only upload files that have changed.
  • Replace the items in square brackets with the corresponding information. The --delete option tells AWS to delete items that have been deleted locally. For instance, if I delete a .txt file locally, the --delete option will delete that file in my S3 bucket as well.
  • By default, everything is included in a sync command. The first two --exclude options here tell sync to exclude all files and folders that start with '.', which are often configuration files. The rest exclude the Library folder, the Downloads folder, and any node_modules or tmp folders found in the specified backup folder. These are the excludes that I use, so adjust them as you need.
  • Note: There is currently an issue with the AWS CLI that causes it to traverse the Library folder even when it is explicitly excluded, which can cause errors (see the GitHub issue).
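To get a feel for which paths the exclude patterns catch, here is a rough local approximation using shell case globs. It is only a sketch: is_excluded is a hypothetical helper, and it assumes the CLI matches each pattern against the path relative to the synced folder, with * allowed to match across '/'. Under that assumption '.*' also covers anything '.*/' would match, so only one of the two appears below.

```shell
#!/bin/sh
# Sketch only: is_excluded is a hypothetical helper that mimics the
# --exclude filters from the Makefile with shell globs.
# Assumption: patterns apply to the path relative to the synced folder,
# and * may match '/' as well.

is_excluded() {
  case $1 in
    .*|*node_modules*|*tmp*|Downloads*|Library*) return 0 ;;
    *) return 1 ;;
  esac
}

# Try a few sample relative paths.
for path in .bashrc Documents/notes.txt projects/app/node_modules/x.js Downloads/file.zip; do
  if is_excluded "$path"; then
    echo "skip  $path"
  else
    echo "sync  $path"
  fi
done
```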

Run the Initial Backup

Before we set up Cron, go ahead and run the initial backup. To do this, open your terminal and cd to the directory you saved your Makefile in. Run make start-backup to initiate the backup.
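Because --delete removes files from the bucket, it can be worth previewing the first run. The aws s3 sync command accepts a --dryrun flag that lists what would be uploaded or deleted without changing anything. A minimal sketch, where preview_backup and AWS_CMD are hypothetical names and the excludes simply mirror the Makefile:

```shell
#!/bin/sh
# Sketch only: preview_backup and AWS_CMD are hypothetical names.
# Set AWS_CMD=echo to print the command instead of running it.

preview_backup() {
  src=$1
  dest=$2
  # --dryrun lists the operations sync would perform without performing them.
  "${AWS_CMD:-aws}" s3 sync "$src" "$dest" --delete --dryrun \
    --exclude '.*/' \
    --exclude '.*' \
    --exclude '*node_modules*' \
    --exclude '*tmp*' \
    --exclude 'Downloads*' \
    --exclude 'Library*'
}

# Example (placeholders in square brackets, as in the Makefile):
#   preview_backup [path/to/folder/you/want/to/sync] s3://[your-bucket-name]/[folder]
```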

Set up Cron

  • Note: Do this after your initial backup has completed and after you are confident you are backing up everything you want to be backing up, since Cron provides very little useful output by default.
  • Copy aws CLI config folder to /root:
    • sudo mkdir -p /root
    • sudo cp -R ~/.aws/ /root/.aws
    • This allows Cron to have access to the AWS credentials.
  • sudo env EDITOR=vim crontab -e (I use vim as my editor, but feel free to use nano if you are unfamiliar with vim). This is done with sudo in order to allow Cron to run the AWS CLI successfully.
  • Paste in 00 11 * * * cd /path/to/makefile && make start-backup.
  • Note: If you do wish to see better output from Cron, put > /tmp/aws.log 2>&1 at the end of the command to have Cron output to a logfile.
  • Cron works by specifying the time to run a specific command, with * meaning "every value" for that field. So to translate this Cron command, we are saying: at minute 00 of hour 11 (11:00 AM), every day of the month, every month, every day of the week, run the command specified. Note that Cron will not run if your computer is sleeping.
  • After you've set the time you want the backup to start, save the file and quit. You should see crontab: installing new crontab output in the console. Verify that the crontab was installed correctly with crontab -l.
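To make the five time fields concrete, the sketch below splits a crontab line into its parts. The describe_cron_line function is a hypothetical helper written for this post, not part of Cron. The sample line is the entry from above with the optional log redirect appended.

```shell
#!/bin/sh
# Sketch only: describe_cron_line is a hypothetical helper.
# It splits a crontab entry into its five time fields and the command.

describe_cron_line() {
  # read splits on whitespace without glob-expanding the * fields;
  # the last variable receives the rest of the line.
  read -r minute hour dom month dow command <<EOF
$1
EOF
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
    "$minute" "$hour" "$dom" "$month" "$dow"
  printf 'command=%s\n' "$command"
}

describe_cron_line '00 11 * * * cd /path/to/makefile && make start-backup > /tmp/aws.log 2>&1'
```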

Costs

A note about costs: S3 is extremely inexpensive. Even with all of my files still in Standard storage (in other words, not yet moved to Standard-IA or Glacier by the lifecycle rules we set up above), storing 22 GB costs me about fifty cents per month. I expect that to only decrease as the Standard-IA and Glacier rules come into effect.
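As a back-of-the-envelope check, monthly storage cost is roughly size times the per-GB price. The sketch below assumes an illustrative Standard storage price of 0.023 USD per GB-month. That price is an assumption, not a figure from this post, and it varies by region and over time, so check current S3 pricing. The monthly_cost helper is a hypothetical name.

```shell
#!/bin/sh
# Sketch only: monthly_cost is a hypothetical helper.
# Usage: monthly_cost GB PRICE_PER_GB_MONTH  (prints the cost rounded to cents)

monthly_cost() {
  awk -v gb="$1" -v price="$2" 'BEGIN { printf "%.2f\n", gb * price }'
}

# 22 GB is the figure from this post; 0.023 USD per GB-month is an assumed price.
monthly_cost 22 0.023
```

With those inputs the result comes out at roughly fifty cents, in line with the figure above. Request and transfer charges are not included.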

All done

You should now have a cron job set up that automatically backs up your files to AWS S3. Please feel free to comment or ask questions below.

Ian Andersen
