Data Protection at Codebase


In light of recent events, we wanted to write a quick post outlining the measures we take to keep the data you entrust to us safe, and how we monitor those measures so that they are ready for use when we need them.

Hardware goes wrong, software does things it shouldn't and people make mistakes. It happens. We can't stop it. What we can do is everything in our power to limit the effect that any of these failures has on our customers.

Our data protection plan is designed to protect your data against any of the following situations:

  • A hardware failure of our SAN infrastructure.
  • A hardware failure of any of the disks in our SAN.
  • A software error where a bug causes data to be deleted from a database.
  • A human error where a bunch of data is removed directly from the file system.
  • A major disaster (fire, flood, etc.) at a core data centre.

To protect against any of these situations causing data loss, we have put the following in place:

  • We take hourly backups of all our storage & database servers. These are stored offsite in our redundancy data centre. We receive daily reports showing the status of every backup taken in the last 24 hours. If any backup fails, it is highlighted immediately.

  • Our databases are replicated in pairs in our core data centre. Each member of a pair is on a different set of SAN infrastructure. If this replication gets too far behind the master or fails entirely, we are notified immediately.

  • The master database is replicated off-site to our redundancy data centre. If this replication gets too far behind the master or fails entirely, we are notified immediately.

  • All repositories are stored on pairs of redundant storage servers. The data is replicated to both members in the pair and is also replicated off-site to our redundancy location. If this replication gets too far behind, we are notified immediately.

  • Our SAN takes daily snapshots of all our servers, any of which can be restored. Snapshots are kept for 5 days.
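
To make the alerting rules in this list a little more concrete, here is a minimal sketch in Python. It is purely illustrative and is not the tooling we run: the 60-second lag threshold, the names (ReplicaStatus, replication_alerts, failed_backups) and the data shapes are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold: the post does not say how far behind counts as "too far".
MAX_LAG_SECONDS = 60


@dataclass
class ReplicaStatus:
    name: str
    running: bool                 # False if replication has failed entirely
    lag_seconds: Optional[float]  # None when the lag cannot be measured


def replication_alerts(replicas):
    """Return one alert message per replica that has failed or fallen too far behind."""
    alerts = []
    for r in replicas:
        if not r.running or r.lag_seconds is None:
            alerts.append(f"{r.name}: replication has failed")
        elif r.lag_seconds > MAX_LAG_SECONDS:
            alerts.append(f"{r.name}: {r.lag_seconds:.0f}s behind the master")
    return alerts


def failed_backups(report):
    """Given {backup name: succeeded?} for the last 24 hours, list the failures."""
    return sorted(name for name, ok in report.items() if not ok)
```

In a sketch like this, a healthy system produces two empty lists, so anything that does appear is worth a notification.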

In the event of an issue, the most likely source of data will be one of our hot replicas, which should have an (almost) real-time copy of the data. If the hot replicas are affected too, we'll look to the cold replicas (offsite) and then to backups.
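
The recovery order described above can be sketched as a simple fallback list. This is an illustration only: the source names and the choose_recovery_source helper are hypothetical, and SAN snapshots are left out because this post does not say where they sit in the order.

```python
# Recovery sources in the order described above, most preferred first.
RECOVERY_ORDER = ["hot replica", "cold replica (offsite)", "backup"]


def choose_recovery_source(usable):
    """Return the first source in RECOVERY_ORDER that is marked usable.

    `usable` maps a source name to True/False; missing sources count as unusable.
    Returns None when no source is usable.
    """
    for source in RECOVERY_ORDER:
        if usable.get(source, False):
            return source
    return None
```

For example, if the hot replica has been hit by the same bad delete but the offsite copy has not, calling choose_recovery_source({"hot replica": False, "cold replica (offsite)": True, "backup": True}) picks the cold replica.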

Needless to say, we're not complacent about any of this and we're constantly monitoring & revising our policies as technology changes and our data set grows.

If you have any questions, please don't hesitate to get in touch. We're more than happy to discuss our disaster recovery plans in more detail with anyone who is interested.

A little bit about the author

Adam is the Head of Software at Krystal. Adam looks after our software engineering team by day and, when not doing that, enjoys hobbies including LEGO, collecting playing cards, building mechanical keyboards and home automation.