Bob Friesenhahn wrote:
On Fri, 1 Jan 2010, Erik Trimble wrote:

Maybe it's approaching time for vendors to just produce really stupid SSDs: that is, ones that just do wear-leveling, and expose their true page-size info (e.g. for MLC, how many blocks of X size have to be written at once) and that's about it. Let filesystem makers worry about scheduling writes appropriately, doing redundancy, etc.

From the benchmarks, it is clear that the drive interface is already often the bottleneck for these new SSDs. That implies that the current development path is in the wrong direction unless we are willing to accept legacy-sized devices implementing a complex legacy protocol. If the devices remain the same physical size with more storage then we are faced with the same current situation we have with rotating media, with huge media density and relatively slow I/O performance. We do need stupider SSDs which fit in a small form factor, offer considerable bandwidth (e.g. 300MB/second) per device, and use a specialized communication protocol which is not defined by legacy disk drives. This allows more I/O to occur in parallel, for much better I/O rates.

Oooh!   Oooh!  a whole cluster of USB thumb drives!  Yeah!    <wink>

That is not far from what we should have (small chassis-oriented modules), but without the crummy USB.

Bob
I tend to like the 2.5" form factor, for a lot of reasons (economies of scale, and all). And the new SATA III (i.e. 6Gbit/s) interface is really sufficient for reasonable I/O, at least until 12Gbit SAS comes along in a year or so (some rough interface numbers below). The 1.8" drive form factor might be useful as Flash densities go up (to keep the GB-to-interface-bandwidth ratio down), but physically that size is a bit of a pain: it's actually too small to be good for reliability, and it makes chassis design harder. I'm actually all for adding a second SATA/SAS I/O connector on a 2.5" drive (it's just possible, physically).
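For the bandwidth side of that, here's a back-of-the-envelope sketch in plain C. The 8b/10b encoding overhead is just the commonly quoted figure for these links (my assumption, not anything from a spec quoted in this thread), and real-world throughput is lower still once protocol overhead and the drive itself are counted:

/* Back-of-the-envelope payload ceiling for 8b/10b-encoded links.
 * Line rates are the marketing numbers; 10 line bits carry 8 data bits. */
#include <stdio.h>

int main(void)
{
    const double line_rates_gbps[] = { 3.0, 6.0, 12.0 };
    const char  *names[]           = { "SATA 3Gb/s", "SATA 6Gb/s", "SAS 12Gb/s" };

    for (int i = 0; i < 3; i++) {
        /* multiply by 0.8 for 8b/10b, divide by 8 to get bytes/second */
        double mb_per_sec = line_rates_gbps[i] * 1e9 * 0.8 / 8.0 / 1e6;
        printf("%-11s ~%4.0f MB/s raw payload ceiling\n", names[i], mb_per_sec);
    }
    return 0;
}

So a single 6Gbit/s port tops out around 600MB/sec of payload, which a handful of flash channels can already saturate; that's exactly the interface bottleneck Bob is pointing at.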

That all said, it certainly would be really nice to get an SSD controller which can really push the bandwidth, and the only way I see that happening now is to go the "stupid" route and dumb down the controller as much as possible. I really think we just want the controller to Do What I Say, and not attempt any optimizations of its own. There's simply much more benefit to doing the optimization up at the filesystem level than down at the device level. For a trivial case, consider the dreaded read-modify-write problem of MLC flash: to change even a single bit, a whole page has to be read, recomposed with the changed bits, and then written back. If the filesystem were aware that the drive had this kind of issue, in-RAM caching would almost always let it skip that initial "read" cycle, and performance goes back to a typical Copy-on-Write style stripe write (there's a toy sketch of this below).

I can see why "dumb" controllers might not appeal to the consumer/desktop market, but for the Enterprise market I think it's actually /more/ likely that they start showing up soon. Which would be a neat reversal of sorts: consumer drives using a complex controller with cheap flash (and a large "spare" capacity area), while Enterprise drives use a simple controller, higher-quality flash chips, and likely a much smaller spare capacity area. Which means I'd expect rough price parity between the two.
Whee!
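
To make the caching point concrete, here's a toy sketch (plain C, made-up names and a made-up 4KB page size; not real ZFS or driver code, just the bookkeeping) of why keeping a full page image in host RAM turns the read-modify-write into a plain write:

/* Toy model of the MLC page problem: changing a few bytes forces a
 * page-sized read, merge, and rewrite unless the full page image is
 * already sitting in host RAM.  Erase blocks are ignored for brevity. */
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 4096

static unsigned long reads, writes;

/* Simulated device: only whole pages move across the interface. */
static unsigned char flash_page[PAGE_SIZE];

static void dev_read_page(unsigned char *buf)        { memcpy(buf, flash_page, PAGE_SIZE); reads++;  }
static void dev_write_page(const unsigned char *buf) { memcpy(flash_page, buf, PAGE_SIZE); writes++; }

/* Dumb path: every small update is a device-level read-modify-write. */
static void update_rmw(size_t off, const unsigned char *data, size_t len)
{
    unsigned char page[PAGE_SIZE];
    dev_read_page(page);              /* the "extra" read cycle   */
    memcpy(page + off, data, len);    /* merge the changed bytes  */
    dev_write_page(page);
}

/* Filesystem-aware path: a cached copy of the page is kept in RAM,
 * so the merge happens in memory and only the write hits the device. */
static unsigned char page_cache[PAGE_SIZE];

static void update_cached(size_t off, const unsigned char *data, size_t len)
{
    memcpy(page_cache + off, data, len);
    dev_write_page(page_cache);       /* no device read needed    */
}

int main(void)
{
    unsigned char delta[8] = "newbits";

    reads = writes = 0;
    for (int i = 0; i < 1000; i++) update_rmw(i % (PAGE_SIZE - 8), delta, 8);
    printf("read-modify-write: %lu reads, %lu writes\n", reads, writes);

    reads = writes = 0;
    dev_read_page(page_cache);        /* one warm-up read          */
    for (int i = 0; i < 1000; i++) update_cached(i % (PAGE_SIZE - 8), delta, 8);
    printf("cached full page:  %lu reads, %lu writes\n", reads, writes);
    return 0;
}

The point being: the per-update read disappears entirely once the host already holds the page, which is exactly the kind of knowledge the filesystem has and a dumb controller doesn't need.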

--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
