Hi Edward,

Do you have a source for the 8KiB block size data? Whilst we can't avoid the SSD controller, in theory we can change the smallest size we present to the SSD to 8KiB fairly easily... I wonder if that would help the controller do a better job (especially with TRIM).
I might have to do some tests. So far the assumption (even inside Sun's sd driver) is that SSDs are really 4KiB even when they claim 512B; perhaps we should have an 8KiB option...

Thanks,
Deano
de...@cloudpixies.com

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
Sent: 28 January 2011 13:25
To: 'Eff Norwood'; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Lower latency ZIL Option?: SSD behind Controller BB Write Cache

> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Eff Norwood
>
> We tried all combinations of OCZ SSDs including their PCI based SSDs and
> they do NOT work as a ZIL. After a very short time performance degrades
> horribly and for the OCZ drives they eventually fail completely.

This was something interesting I found recently. Apparently for flash manufacturers, flash hard drives are like the pimple on the butt of the elephant. A vast majority of the flash production in the world goes into devices like smartphones, cameras, tablets, etc. Only a slim minority goes into hard drives. As a result, they optimize for these other devices, and one of the important side effects is that standard flash chips use an 8K page size. But hard drives use either 4K or 512B.

The SSD controller secretly remaps blocks internally, and aggregates small writes into a single 8K write, so there's really no way for the OS to know if it's writing to a 4K block which happens to share an 8K page with another 4K block. So it's unavoidable, and whenever it happens, the drive can't simply write. It must read-modify-write, which is obviously much slower.

Also, if you look up the specs of an SSD, for both IOPS and sustainable throughput... They lie. Well, technically they're not lying, because technically it is *possible* to reach whatever they say.
Optimize your usage patterns and only use blank drives which are new from the box, or have been fully TRIM'd. Pfffft... But in my experience, reality is about 50% of whatever they say.

Presently, the only way to deal with all this is via the TRIM command, which cannot eliminate read-modify-write cycles, but can reduce their occurrence. Make sure your OS supports TRIM. I'm not sure at what point ZFS added TRIM, or to what extent... Can't really measure the effectiveness myself.

Long story short, in the real world, you can expect the DDRDrive to crush and shame the performance of any SSD you can find. It's mostly a question of PCIe slot versus SAS/SATA slot, and other characteristics you might care about, like external power, etc.

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
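
To make the read-modify-write point in the thread concrete, here is a minimal Python sketch of the alignment arithmetic only. It assumes the 8K flash page size claimed above (which the reply asks for a source on), and it assumes every touched page already holds live data. The names `PAGE_SIZE`, `pages_touched` and `needs_read_modify_write` are hypothetical and exist only for this illustration. It does not model the controller's internal remapping or write aggregation, which the thread points out the OS cannot see.

```python
# Simplified model: which 8K flash pages does a logical write cover fully,
# and which only partially? A partially covered page that already holds
# data would need a read-modify-write rather than a plain write.

PAGE_SIZE = 8 * 1024  # assumption taken from the thread, not from a datasheet


def pages_touched(offset, length, page_size=PAGE_SIZE):
    """Return (full_pages, partial_pages) for a write of `length` bytes
    starting at byte `offset`."""
    if length <= 0:
        return (0, 0)
    end = offset + length
    first_page = offset // page_size
    last_page = (end - 1) // page_size
    full = 0
    partial = 0
    for page in range(first_page, last_page + 1):
        page_start = page * page_size
        page_end = page_start + page_size
        if offset <= page_start and end >= page_end:
            full += 1      # the write covers the whole page
        else:
            partial += 1   # the rest of the page must be preserved
    return (full, partial)


def needs_read_modify_write(offset, length, page_size=PAGE_SIZE):
    """True if at least one page is only partially overwritten."""
    _, partial = pages_touched(offset, length, page_size)
    return partial > 0


if __name__ == "__main__":
    for offset, length in [(0, 512), (0, 4096), (0, 8192), (4096, 8192)]:
        print(offset, length, pages_touched(offset, length),
              needs_read_modify_write(offset, length))
```

Under these assumptions, a 512B or 4K write always leaves part of an 8K page untouched, while an aligned 8K write does not. That is the intuition behind presenting 8KiB as the smallest size to the SSD; whether a real controller benefits is exactly what the thread leaves open.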