by Asif Khan
Data tiering is not new. IBM first introduced it in the 1960s, not long after magnetic disks allowed real-time access to stored data. Hierarchical Storage Management (HSM) and its successor, Information Lifecycle Management (ILM), enforced fixed rules to move data from one storage array to another (e.g., “after 90 days, move all inactive data to a cheaper storage platform”).
The flaw in those early tiering schemes was that the migration was mostly one-way. There was no easy way to quickly retrieve the inactive data and move it back to the primary array so that users could access it. More importantly, these schemes were expensive and cumbersome to implement, so they never enjoyed widespread use.
Enter Tiering v3.0. It was most likely introduced in 2010 by an independent storage vendor called 3PAR (which called it Adaptive Optimization), and most leading storage vendors today offer some variation of this capability.
NOTE1: 3PAR was acquired by HP later that same year in a fierce bidding war against Dell and has since become the flagship of HP’s storage portfolio.
NOTE2: I say 3PAR was “most likely” first because, while they may have announced it first, other vendors were working on similar tiering schemes at the same time, and they all introduced their respective offerings within about a year of each other. In this case, being first may give 3PAR bragging rights, but it is not a major competitive differentiator.
Regardless of who was first to market, the three coolest new features with this latest generation of tiering (common to most vendor implementations) are as follows:
- Tiering can now move far smaller amounts of data (i.e., “chunks”) than before. A “chunk” is a generally accepted term for an atomic unit of tiering (although the terms “chunklet”, “page” and “extent” are also commonly used). A chunk is loosely defined as “bigger than a block but smaller than a LUN” and varies by vendor (see chart below).
- The tiering algorithm itself can determine which chunks should move based on usage patterns (no elaborate configuration required: “just set it and forget it”).
- The chunks move within the array itself (e.g., “move inactive chunks of data within the array from expensive RAID10 FC disk to cheaper RAID5 SATA disk”). Prior tiering schemes only moved data from one array to another (usually to a “nearline” disk-based backup device or a tape device).
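To make the three features above concrete, here is a minimal sketch of a usage-based chunk tiering pass. It is an illustration only, not any vendor's actual algorithm; the tier names, default tier, I/O thresholds, and sampling-window behavior are all assumptions for the sake of the example.

```python
# Minimal sketch of usage-based chunk tiering within a single array.
# Tier names, thresholds, and chunk structure are illustrative
# assumptions, not any vendor's actual implementation.

TIERS = ["SSD", "FC", "SATA"]  # fastest/most expensive -> slowest/cheapest

class Chunk:
    def __init__(self, chunk_id, tier="FC"):
        self.chunk_id = chunk_id
        self.tier = tier
        self.io_count = 0  # I/Os observed in the current sampling window

def retier(chunks, hot_threshold=100, cold_threshold=5):
    """Promote hot chunks one tier up, demote cold chunks one tier down."""
    for chunk in chunks:
        idx = TIERS.index(chunk.tier)
        if chunk.io_count >= hot_threshold and idx > 0:
            chunk.tier = TIERS[idx - 1]   # promote toward SSD
        elif chunk.io_count <= cold_threshold and idx < len(TIERS) - 1:
            chunk.tier = TIERS[idx + 1]   # demote toward SATA
        chunk.io_count = 0  # reset for the next sampling window

# Example: one busy chunk, one idle chunk, both starting on FC
chunks = [Chunk(0), Chunk(1)]
chunks[0].io_count = 500   # heavily accessed
chunks[1].io_count = 0     # untouched
retier(chunks)
print(chunks[0].tier, chunks[1].tier)  # SSD SATA
```

The key point the sketch captures is “set it and forget it”: the administrator configures nothing per-volume; the array observes access counts per chunk and moves only those small chunks between tiers inside the same array.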
These new capabilities became possible in part because late-model arrays can crunch far more data (and manage more metadata) than previous generations. The new tiering schemes addressed many of the shortcomings of their predecessors, and tiering was quickly dubbed *the next big thing* in storage. In the last two years, most major storage vendors have released their own implementation of data tiering.
In fact, data tiering has emerged as a key feature of a larger trend: storage virtualization. Just as server virtualization abstracts the OS from its underlying compute hardware, storage virtualization abstracts the data volume (i.e., the LUN) from its underlying storage hardware.
And once you liberate your data from the underlying disks, you can do all sorts of interesting things with it like over-provision, compress, deduplicate and create space-efficient snaps and clones. Combine some of those cool features with tiering and suddenly…this ain’t your Daddy’s storage array anymore!
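One of those features, the space-efficient snapshot, can be sketched in a few lines using copy-on-write: the snapshot stores only blocks that change after it is taken, serving unchanged blocks from the live volume. The class names and block-map structure below are assumptions for illustration, not any product's design.

```python
# Illustrative sketch of a space-efficient (copy-on-write) snapshot,
# one of the features enabled once volumes are abstracted from disks.
# Names and structure are assumptions for illustration only.

class Volume:
    def __init__(self, blocks):
        self.blocks = dict(enumerate(blocks))  # block number -> data

class Snapshot:
    """Stores only the original data of blocks overwritten after creation."""
    def __init__(self, volume):
        self.volume = volume
        self.preserved = {}  # block number -> pre-overwrite data

    def read(self, block_no):
        # Unchanged blocks are served from the live volume: no copy made.
        return self.preserved.get(block_no, self.volume.blocks[block_no])

def write(volume, snapshots, block_no, data):
    # Copy-on-write: preserve the old data once per snapshot, then overwrite.
    for snap in snapshots:
        if block_no not in snap.preserved:
            snap.preserved[block_no] = volume.blocks[block_no]
    volume.blocks[block_no] = data

vol = Volume(["a", "b", "c"])
snap = Snapshot(vol)
write(vol, [snap], 1, "B")
print(vol.blocks[1], snap.read(1), len(snap.preserved))  # B b 1
```

The snapshot here consumes space proportional to the changed data (one block), not to the size of the volume, which is what makes it “space-efficient.”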
But what really propelled the rise of data tiering in particular (and storage virtualization in general) was the continuing price decline of solid state disks. When enterprise-class solid state disks were first introduced about 5 years ago, SSDs cost 30-40x as much as comparable FC disks. Today, SSDs are about 2-3x the price. I believe that the pricing of the next generation of SSDs will render FC disks unnecessary.