Achieving S3 Read-After-Update Consistency

Engineering
By
Adam Mills
Lead Engineer @ Hatch

The team at Hatch spun up the Labour Exchange in a few days, repurposing our tech to help stood-down workers find employment during the COVID-19 crisis.

In order to get the system up and running in such a short time frame, we decided to use S3 as a flat-file data store to maintain our serverless batch job states and caches. After a very cursory search to satisfy ourselves that S3 would guarantee read-after-write consistency, we flew on.

From the documentation:

Amazon S3 provides read-after-write consistency for PUTS of new objects in your S3 bucket in all Regions with one caveat. The caveat is that if you make a HEAD or GET request to a key name before the object is created, then create the object shortly after that, a subsequent GET might not return the object due to eventual consistency.

Unfortunately, we missed the caveat above, and the part further down the page that explicitly describes our use case:

A process replaces an existing object and immediately tries to read it. Until the change is fully propagated, Amazon S3 might return the previous data.
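The failure mode is easy to reproduce with a toy in-memory model of an eventually consistent store. To be clear, this is a simulation for illustration only, not the AWS SDK or real S3 behaviour:

```typescript
// Toy model of an eventually consistent key/value store.
// A first write is immediately visible (read-after-write), but an
// overwrite is staged and reads keep returning the old value until
// propagate() runs. This mimics S3's documented overwrite behaviour
// at the time; it is an illustration, not real S3.
class EventuallyConsistentStore {
  private propagated = new Map<string, string>();
  private staged = new Map<string, string>();

  put(key: string, value: string): void {
    if (this.propagated.has(key)) {
      // Overwrite: not visible to readers yet.
      this.staged.set(key, value);
    } else {
      // First write: read-after-write consistency holds.
      this.propagated.set(key, value);
    }
  }

  get(key: string): string | undefined {
    return this.propagated.get(key);
  }

  // In real S3 this happens at some unknown later time.
  propagate(): void {
    for (const [k, v] of this.staged) this.propagated.set(k, v);
    this.staged.clear();
  }
}
```

A batch job that writes its state and reads it back on the next run can land exactly in the window where `get` still returns the previous value.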

As these processes run on a schedule (not as part of a user-facing API), we could afford to spend a few extra calls to S3 to roll our own read-after-update consistency. Knowing that S3 guarantees read-after-firstWrite, we can write a new file for every change, read the latest file, and make sure we clean up.

So every time we write a file we:

  • Append a timestamp to the filename
  • Remove older files

When we read a file we:

  • List all files with the key prefix (S3 guarantees listing files will be ordered by ascending UTF-8 binary order)
  • Get the newest file in the list
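The key scheme behind those steps can be checked in isolation, without touching S3. A minimal sketch (these helper names are mine, not the production code below):

```typescript
// Sketch of the versioned-key scheme. Assumption: millisecond epoch
// timestamps, which stay a fixed 13 digits wide until the year 2286,
// so lexicographic (UTF-8 binary) order matches numeric order.

/** Append a millisecond timestamp to the base key. */
const versionedKey = (key: string, now: number = Date.now()): string =>
  `${key}.${now}`;

/**
 * Given the keys returned by a prefix listing (already in ascending
 * UTF-8 binary order, as S3 guarantees), the last entry is the newest.
 */
const newestKey = (listedKeys: string[]): string | undefined =>
  listedKeys[listedKeys.length - 1];

/** Everything except the newest key is safe to delete. */
const staleKeys = (listedKeys: string[]): string[] =>
  listedKeys.slice(0, -1);
```

The full S3 wiring is below; it follows the same shape, with the listing and deletion done through the SDK.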

import { writeS3Obj, getS3FileAsObj, listObjects, deleteS3Files } from "./S3";

/**
 * S3 does not provide read-after-update consistency.
 * It does provide read-after-firstWrite consistency (as long as no GET or HEAD was requested before the first write).
 * We write a new file every time the state changes, and we read the latest file.
 * S3 guarantees listed keys are sorted in ascending UTF-8 binary order.
 *
 */

const cleanUp = async (key: string) => {
  const response = await listObjects({
    MaxKeys: 1000,
    Bucket: process.env.BUCKET_NAME!,
    Prefix: key,
  });
  const keys = response.Contents?.map((c) => c.Key!) || [];
  await deleteS3Files(keys.slice(0, keys.length - 1));
};

export const writeServiceState = async (key: string, state: any) => {
  await writeS3Obj(`${key}.${Date.now()}`, state);
  await cleanUp(key);
};
export const getServiceState = async <T>(key: string, defaultVal: T): Promise<T> => {
  const response = await listObjects({
    MaxKeys: 1000,
    Bucket: process.env.BUCKET_NAME!,
    Prefix: key,
  });
  if (!response.Contents || response.Contents.length === 0) {
    console.log("No state file for key " + key);
    return defaultVal;
  }

  return getS3FileAsObj(response.Contents[response.Contents.length - 1].Key!);
};

