by Matthew Star, CTO, Spectra Logic
LTFS, the Linear Tape File System, is sometimes called the long-term file system. No matter what you call it, LTFS lets tape behave like a removable disk. Having tested various LTFS applications, I can tell you it is shaping up to become the new standard in tape interchange, particularly in LTO-based archives. LTFS is an open standard that uses two partitions to separate the directory information from its associated file data. But what about other open formats like TAR (derived from “tape archive”), which are also compatible across multiple platforms? How does LTFS stack up in comparison?
Let’s look at both. TAR is an archive format, usually written to tape and designed around sequential media. LTFS is a format that makes tape look more like random-access media to the user or consumer of the storage. So, which is better? It depends on your needs and risk profile. TAR has been around for some 30 years and is available in source or binary form on nearly any operating system (OS) imaginable. LTFS currently runs on only about three operating systems. TAR is self-describing, but must be accessed in sequence. You cannot know the whole content of the archives on a tape without reading, at a minimum, the headers of each archive. In other words, TAR requires that you read through the whole tape. LTFS, on the other hand, stores its directory, or header, information on a separate partition, so it only needs to load a very small amount of data to fully describe the contents of the entire tape.
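To make the sequential nature of TAR concrete, here is a minimal sketch using Python's standard tarfile module. It is only an illustration: the archive is built in memory rather than on a tape drive, and the helper names build_sample_archive and list_members are made up for this example. Opening the archive in stream mode ("r|") forbids seeking, which is roughly the situation a tape reader is in.

```python
import io
import tarfile


def build_sample_archive(files):
    """Build an uncompressed tar archive in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


def list_members(stream):
    """Walk the archive one header at a time, front to back."""
    entries = []
    # Mode "r|" treats the input as a non-seekable stream, much like a tape device.
    with tarfile.open(fileobj=stream, mode="r|") as tar:
        for member in tar:
            # Reaching each header means passing over the previous file's data.
            entries.append((member.name, member.size))
    return entries


sample = build_sample_archive({"a.txt": b"hello", "b.txt": b"tape data"})
listing = list_members(sample)
print(listing)
```

There is no central index to consult here: the only way to learn that b.txt exists is to get past a.txt first. That is the gap the LTFS index partition is meant to close.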
There are some downsides to using tape as a random-access device. First and foremost, tape was not designed for a random access pattern. Writing millions of small files to an LTFS-formatted tape and then attempting to retrieve every other file on that tape can be a recipe for disaster, because the performance of the drive decreases significantly. This is where TAR works really well: TAR bundles all of those millions of tiny files into one archive, which is then stored as a single file on a single tape. Plus, TAR can restore data as fast as the drive can read it. If, on the other hand, you are writing hundreds of larger files to tape and want random access to any one of them, LTFS may be just the trick you’re looking for.
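The bundling idea can be sketched in a few lines of Python with the standard tarfile module. This is an assumption-laden illustration, not a production workflow: it writes 1,000 tiny files to a temporary directory and packs them into an ordinary file named bundle.tar instead of a tape device, and bundle_directory is a hypothetical helper name.

```python
import os
import tarfile
import tempfile


def bundle_directory(src_dir, archive_path):
    """Pack every file under src_dir into one uncompressed tar archive."""
    count = 0
    with tarfile.open(archive_path, mode="w") as tar:
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                # Store paths relative to src_dir so the archive is relocatable.
                tar.add(full, arcname=os.path.relpath(full, src_dir))
                count += 1
    return count


with tempfile.TemporaryDirectory() as work:
    src = os.path.join(work, "small_files")
    os.makedirs(src)
    # Create many tiny files, standing in for the "millions of small files" case.
    for i in range(1000):
        with open(os.path.join(src, f"file_{i:04d}.dat"), "wb") as fh:
            fh.write(b"x" * 64)

    archive = os.path.join(work, "bundle.tar")
    bundled = bundle_directory(src, archive)

    # Reopen the archive to confirm every file landed in the single bundle.
    with tarfile.open(archive) as tar:
        member_count = len(tar.getnames())

print(bundled, member_count)
```

The storage device then sees one large sequential write instead of a thousand small ones, which is the access pattern tape handles best.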
The other advantage LTFS has over TAR is its ease of use once the applications and drive stack are installed. LTFS makes the tape look just like a large USB key. TAR must be used from a command-line interface (for example, tar -tvf /dev/tape1) just to list the contents of one archive on a single tape.
So which one would I use? Both or either, depending on the environment and my needs. I don’t believe you should choose LTFS over TAR as the solution for your petabyte archive. But if you want an easy way to move data from place to place, or you are deploying a smaller archive, you should review LTFS’s features and benefits.