The client is a college with 240 academic programs serving about 6,000 students.
The client experienced a simultaneous double-disk failure on a disk-storage appliance that held the database and primary storage pools of their IBM Spectrum® Protect backup server. The drive failures crashed the client’s server application, corrupted its database, and destroyed all data in the storage pools. Coincidentally, the client was in the process of procuring a new tapeless data-protection solution at the time of the failure. They were concerned about whether their legacy backup solution could be recovered from off-site tape copies, and about how long the recovery would take. What’s more, the affected Spectrum Protect storage pools contained data with mandatory retention requirements.
After restoring the server database from the most recent full database backup on LTO-4 tape media, Sirius brought the client’s server application back online. The next challenge was recovering the primary storage pools. It became clear that the client would need to recall tapes from their off-site tape vault in order to recover the data. At a minimum, they would also need enough disk storage to restore the data from tape back into the respective Spectrum Protect disk pools. In total, 50 LTO-4 tapes were recalled and required for recovery. After another disk storage appliance was identified as a replacement for the failed unit, its storage was allocated to the Spectrum Protect server and formatted as file systems on the operating system. The recovery took only one week, despite two complications: tape library gripper calibration issues caused by earthquakes, and corruption on some of the tape media.
Recovery from tape storage can take weeks depending on the amount of data, the number of volumes, the number of tape drives, the number of tape slots in the tape library, and the time it takes to retrieve the volumes from off-site vaulting (if applicable). Sirius storage experts recovered the entire environment in one week, recalling 50 volumes from an off-site tape vault. The recovery was successful, with 99.99% of the data restored to the storage pools. Some corruption is not uncommon with tape media, and it is an acceptable risk given the efficiency and low cost of tape storage solutions. Sirius had already initialized the server application and brought it back online as soon as the database restore completed.
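To show how the number of volumes and tape drives drives the raw read time, here is a minimal back-of-envelope sketch. It is an illustration, not a record of the actual engagement: the function name, the `fill_fraction`, `efficiency`, and `mount_minutes_per_tape` parameters, and the drive counts in the usage lines are all hypothetical, since the case study does not state how many drives were available or how full the tapes were. The only fixed inputs are the published LTO-4 native figures (800 GB per cartridge, 120 MB/s).

```python
import math

# Published LTO-4 native (uncompressed) figures.
LTO4_NATIVE_CAPACITY_GB = 800
LTO4_NATIVE_RATE_MB_S = 120


def estimate_restore_hours(num_tapes, num_drives, fill_fraction=1.0,
                           efficiency=1.0, mount_minutes_per_tape=0.0):
    """Rough best-case hours to read `num_tapes` LTO-4 volumes.

    All parameters are illustrative assumptions:
      fill_fraction          - how full each tape is (0..1]
      efficiency             - achieved share of the native rate (0..1]
      mount_minutes_per_tape - mount/dismount/positioning overhead
    Off-site recall time, hardware faults, and media errors are NOT modeled.
    """
    if num_tapes < 0 or num_drives < 1:
        raise ValueError("need num_tapes >= 0 and num_drives >= 1")
    if not (0 < fill_fraction <= 1) or not (0 < efficiency <= 1):
        raise ValueError("fill_fraction and efficiency must be in (0, 1]")

    # Data on one tape, in MB (1 GB = 1000 MB, as in tape vendor figures).
    mb_per_tape = LTO4_NATIVE_CAPACITY_GB * 1000 * fill_fraction
    read_hours = mb_per_tape / (LTO4_NATIVE_RATE_MB_S * efficiency) / 3600
    mount_hours = mount_minutes_per_tape / 60

    # Each drive handles one tape at a time, so the busiest drive
    # reads ceil(num_tapes / num_drives) tapes back to back.
    tapes_on_busiest_drive = math.ceil(num_tapes / num_drives)
    return tapes_on_busiest_drive * (read_hours + mount_hours)


if __name__ == "__main__":
    # 50 recalled volumes, with hypothetical drive counts.
    for drives in (1, 2, 4):
        hours = estimate_restore_hours(50, drives)
        print(f"{drives} drive(s): about {hours:.1f} hours of streaming")
```

Even under these idealized assumptions, reading 50 full volumes on a single drive takes several days of continuous streaming. Vault retrieval, library slot limits, and media or hardware problems come on top of that, which is why tape recoveries can stretch into weeks.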
The client’s simultaneous double-disk failure underscored the importance of a modern disaster recovery (DR) solution and gave the client the justification it needed to invest in a disk-based one. The client engaged Sirius to design and implement a new tapeless data protection solution based on an IBM POWER9 processor-based Power Systems server to replace an aged Windows blade server, and on a Dell EMC Data Domain system to replace their legacy tape library. With a disk-only solution leveraging asynchronous replication, recovery occurs in minutes to hours, with no need for media handling or retrieval. The data is online and available for access at the DR site as soon as the replication context is synchronized.
The solution leverages container pools, cloud container pools and file-device pools where appropriate. It is designed to be far more efficient and better performing than the legacy environment, maximizing disk storage with inline data deduplication and LZ4 compression. The management of tape media (check-ins and check-outs) and tape media processes (mounts, dismounts, migrations, backup of storage pools, etc.) will be eliminated. Replication will be handled by either Data Domain or Spectrum Protect, depending on whether the storage pool is a container pool or a file-device pool.