Re: [zfs-discuss] New SSD options
d == Don d...@blacksun.org writes:
hk == Haudy Kazemi kaze0...@umn.edu writes:

 d> You could literally split a sata cable and add in some
 d> capacitors for just the cost of the caps themselves.

no, this is no good. The energy only flows in and out of the capacitor when the voltage across it changes. In this respect they are different from batteries. It's normal to use (non-super) capacitors as you describe for filters next to things drawing power in a high-frequency noisy way, but to use them for energy storage across several seconds you need a switching supply to drain the energy from them. The step-down and voltage-pump kinds of switchers are non-isolated and might do fine, and are cheaper than full-fledged isolated DC-DC converters (isolated meaning the input and output can float wrt each other). You can charge from 12V and supply 5V if that's cheaper. :) Hope it works.

 hk> okay, we've waited 5 seconds for additional data to arrive to
 hk> be written. None has arrived in the last 5 seconds, so we're
 hk> going to write what we already have to better ensure data
 hk> integrity,

yeah, I am worried about corner cases like this. Example: input power to the SSD becomes scratchy or sags, but power to the host and controller remains fine. Writes arrive continuously. The SSD sees nothing wrong with its power and continues to accept and acknowledge writes. Meanwhile you burn through your stored power hiding the sagging supply until you can't, then the SSD loses power suddenly and drops a bunch of writes on the floor.

That is why I drew that complicated state diagram in which the pod disables and holds down the SATA connection once it's running on reserve power. Probably y'all don't give a fuck about such corners, though, nor do many of the manufacturers selling this stuff, so, whatever.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
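[Editor's note] The point above - that without a switching supply most of a capacitor's stored energy is stranded - can be quantified from E = ½CV². A back-of-the-envelope sketch; the 4.5 V direct-connection cutoff and the 2 V converter minimum-input voltage are illustrative assumptions, not from any datasheet:

```python
# Fraction of a capacitor's stored energy that is usable before its
# voltage falls below a cutoff: (V0^2 - Vmin^2) / V0^2, from E = 1/2*C*V^2.
# The capacitance cancels out, so this holds for any cap size.

def usable_fraction(v_start, v_cutoff):
    """Fraction of stored energy delivered discharging v_start -> v_cutoff."""
    return (v_start**2 - v_cutoff**2) / v_start**2

# Direct connection: a 5 V device may tolerate only ~10% sag (down to 4.5 V).
direct = usable_fraction(5.0, 4.5)

# Through a switching converter that keeps regulating down to a 2 V input.
with_converter = usable_fraction(5.0, 2.0)

print(f"direct: {direct:.0%}, via converter: {with_converter:.0%}")
```

Roughly a fifth of the energy is usable when connected directly, versus over four fifths through a converter - which is why the switcher matters for multi-second hold-up.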
Re: [zfs-discuss] New SSD options
This thread has grown giant, so apologies for screwing up threading with an out-of-place reply. :)

So, as far as SF-1500 based SSDs go, the only ones currently in existence are the Vertex 2 LE and Vertex 2 EX, correct (I understand the Vertex 2 Pro was never mass produced)? Both of these are based on MLC and not SLC -- why isn't that an issue for longevity?

Any other SF-1500 options out there? We continue to use UPS-backed Intel X25-E's for ZIL.

Ray
Re: [zfs-discuss] New SSD options
On Mon, May 24, 2010 at 11:30:20AM -0700, Ray Van Dolson wrote:
> This thread has grown giant, so apologies for screwing up threading
> with an out-of-place reply. :)
>
> So, as far as SF-1500 based SSDs go, the only ones currently in
> existence are the Vertex 2 LE and Vertex 2 EX, correct (I understand
> the Vertex 2 Pro was never mass produced)? Both of these are based on
> MLC and not SLC -- why isn't that an issue for longevity?
>
> Any other SF-1500 options out there? We continue to use UPS-backed
> Intel X25-E's for ZIL.

From earlier in the thread, it sounds like none of the SF-1500 based drives even has a supercap, so it doesn't seem that they'd necessarily be a better choice than the SLC-based X25-E at this point unless you need more write IOPS...

Ray
Re: [zfs-discuss] New SSD options
> From earlier in the thread, it sounds like none of the SF-1500 based
> drives even has a supercap, so it doesn't seem that they'd necessarily
> be a better choice than the SLC-based X25-E at this point unless you
> need more write IOPS...

I think the upcoming OCZ Vertex 2 Pro will have a supercap. I just bought an OCZ Vertex LE; it doesn't have a supercap, but it DOES have some awesome specs otherwise.
Re: [zfs-discuss] New SSD options
On 22 maj 2010, at 07.40, Don wrote:

>> The SATA power connector supplies 3.3, 5 and 12v. A complete solution
>> will have all three. Most drives use just the 5v, so you can probably
>> ignore 3.3v and 12v.
>
> I'm not interested in building something that's going to work for
> every possible drive config- just my config :) Both the Intel X25-E
> and the OCZ only use the 5V rail.
>
>> You'll need to use a step up DC-DC converter and be able to supply
>> ~100mA at 5v. It's actually easier/cheaper to use a LiPoly battery
>> charger and get a few minutes of power than to use an ultracap for a
>> few seconds of power. Most ultracaps are ~2.5v and LiPoly is 3.7v, so
>> you'll need a step up converter in either case.
>
> Ultracapacitors are available in voltage ratings beyond 12 volts, so
> there is no reason to use a boost converter with them. That eliminates
> high frequency switching transients right next to our SSD, which is
> always helpful. In this case- we have lots of room. We have a 3.5 x 1
> drive bay, but a 2.5 x 1/4 hard drive. There is ample room for several
> of the 6.3V ELNA 1F capacitors (and our SATA power rail is a 5V
> regulated rail so they should suffice)- either in series or parallel
> (depending on voltage or runtime requirements).
> http://www.elna.co.jp/en/capacitor/double_layer/catalog/pdf/dk_e.pdf
> You could put 2 caps in series for better voltage tolerance or in
> parallel for longer runtimes. Either way you probably don't need a
> charge controller, a boost or buck converter, or in fact any ICs at
> all. It's just a small board with some caps on it.

I know they have a certain internal resistance, but I am not familiar with the characteristics; is it high enough so you don't need to limit the inrush current, and is it low enough so that you don't need a voltage booster for output?

> Cost for a 5v only system should be $30 - $35 in one-off
> prototype-ready components with a 1100mAh battery (using prices from
> Sparkfun.com),
>
> You could literally split a sata cable and add in some capacitors for
> just the cost of the caps themselves. The issue there is whether the
> caps would present too large a current drain on initial charge up- If
> they do then you need to add in charge controllers and you've got the
> same problems as with a LiPo battery- although without the shorter
> service life.
>
> At the end of the day the real problem is whether we believe the
> drives themselves will actually use the quiet period on the now dead
> bus to write out their caches. This is something we should ask the
> manufacturers, and test for ourselves.

Indeed!

/ragge
Re: [zfs-discuss] New SSD options
Basic electronics, go!

The linked capacitor from Elna
(http://www.elna.co.jp/en/capacitor/double_layer/catalog/pdf/dk_e.pdf)
has an internal resistance of 30 ohms. Intel rate their 32GB X25-E at 2.4W active (we aren't interested in idle power usage; if it's idle, we don't need the capacitor in the first place) on the +5V rail; that's 0.48A (P=VI).

V=IR, supply is 5V, current through load is 480mA, hence R=10.4 ohms. The resistance of the X25-E under load is 10.4 ohms.

Now if you have a capacitor discharge circuit with the charged Elna DK-6R3D105T - the largest and most suitable from that datasheet - you have 40.4 ohms around the loop (cap and load). +5V over 40.4 ohms: the maximum current you can pull from that is I=V/R = 124mA. Around a quarter of what the X25-E wants in order to write. The setup won't work.

I'd suggest something more along the lines of:
http://www.cap-xx.com/products/products.htm
which have an ESR around 3 orders of magnitude lower.

t

On 22 May 2010 18:58, Ragnar Sundblad ra...@csc.kth.se wrote:
> [full quote of Ragnar's previous message snipped]
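[Editor's note] Tom's loop-current estimate above can be reproduced numerically, using the figures quoted in his message (30 Ω ESR for the Elna DK part, 2.4 W active draw for the X25-E on the 5 V rail):

```python
# Back-of-envelope check: treating the SSD as a resistive load, can the
# supercap's internal resistance (ESR) pass enough current to run the drive?

V = 5.0      # supply rail, volts
P = 2.4      # X25-E active power on the 5 V rail, watts (Intel's figure)
ESR = 30.0   # internal resistance of the Elna DK cap, ohms (datasheet)

I_load = P / V               # current the drive draws at 5 V: 0.48 A
R_load = V / I_load          # equivalent load resistance: ~10.4 ohms
I_max = V / (ESR + R_load)   # max loop current with the cap as the source

print(f"drive wants {I_load:.2f} A, cap can deliver at most {I_max:.3f} A")
```

The result is about 124 mA against a 480 mA requirement - roughly a quarter of what the drive needs, matching Tom's conclusion that the high-ESR part won't work.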
Re: [zfs-discuss] New SSD options
On Fri, 21 May 2010, Don wrote:
> You could literally split a sata cable and add in some capacitors for
> just the cost of the caps themselves. The issue there is whether the
> caps would present too large a current drain on initial charge up- If
> they do then you need to add in charge controllers and you've got the
> same problems as with a LiPo battery- although without the shorter
> service life.

Electricity does run both directions down a wire, and the capacitor would look like a short circuit to the supply when it is first turned on. You would need some circuitry which delays applying power to the drive until the capacitor is sufficiently charged, and some circuitry which shuts off the flow of energy back into the power supply when the power supply shuts off (could be a silicon diode if you don't mind the 0.7V drop).

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
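[Editor's note] Bob's inrush concern can be bounded with a one-line estimate: at the moment a fully discharged capacitor is connected, it looks like a short, so the peak current is limited only by the cap's ESR (plus whatever resistance the wiring and supply add, ignored here). The 0.03 Ω figure below is an assumed low-ESR supercap, for contrast:

```python
# Worst-case inrush when connecting a discharged capacitor: the cap itself
# looks like a short, so I_peak ~= V_rail / ESR (wiring resistance ignored).

def inrush_amps(v_rail, esr_ohms):
    return v_rail / esr_ohms

elna = inrush_amps(5.0, 30.0)      # high-ESR Elna DK part: fairly benign
low_esr = inrush_amps(5.0, 0.03)   # assumed low-ESR supercap: huge spike

print(f"Elna DK: {elna:.2f} A peak; low-ESR cap: {low_esr:.0f} A peak")
```

This is the trade-off in the thread: the high-ESR parts that charge gently can't power the drive, while the low-ESR parts that can power the drive need inrush limiting or a charge controller, as Bob suggests.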
Re: [zfs-discuss] New SSD options
Bob Friesenhahn wrote:
> Electricity does run both directions down a wire, and the capacitor
> would look like a short circuit to the supply when it is first turned
> on. You would need some circuitry which delays applying power to the
> drive until the capacitor is sufficiently charged, and some circuitry
> which shuts off the flow of energy back into the power supply when the
> power supply shuts off (could be a silicon diode if you don't mind the
> 0.7V drop).

You can also use an appropriately wired field effect transistor (FET)/MOSFET of sufficient current-carrying capacity as a one-way valve (diode) that has minimal voltage drop. More:
http://electronicdesign.com/article/power/fet-supplies-low-voltage-reverse-polarity-protecti.aspx
http://www.electro-tech-online.com/general-electronics-chat/32118-using-mosfet-diode-replacement.html

In regard to how long you need to continue supplying power... that comes down to how long the SSD waits before flushing cache to flash. If you can identify the maximum write cache flush interval, and size the battery or capacitor to exceed that maximum interval, you should be okay. The maximum write cache flush interval is determined by a timer that says something like "okay, we've waited 5 seconds for additional data to arrive to be written. None has arrived in the last 5 seconds, so we're going to write what we already have to better ensure data integrity, even though it is suboptimal from an absolute performance perspective."

In conventional terms of filling city buses... the bus leaves when it is full of people, or when 15 minutes have passed since the last bus left.

Does anyone know if there is a way to directly or indirectly measure the write cache flush interval? I know cache sizes can be found via performance testing, but what about write intervals?
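[Editor's note] The sizing rule above (hold-up time must exceed the maximum flush interval) can be turned into a capacitance estimate via the energy balance P·t = ½C(V0² − Vmin²). The numbers below are illustrative assumptions: the 5-second timer from the example, the 2.4 W X25-E draw quoted earlier in the thread, a tolerable sag from 5.0 V to 4.5 V, and ESR losses ignored:

```python
# Capacitance needed to supply power P for time t while the rail sags from
# v0 down to vmin, from the energy balance P*t = 1/2 * C * (v0^2 - vmin^2).

def required_capacitance(p_watts, t_seconds, v0, vmin):
    return 2.0 * p_watts * t_seconds / (v0**2 - vmin**2)

C = required_capacitance(2.4, 5.0, 5.0, 4.5)
print(f"need ~{C:.1f} F to ride through a 5 s flush window")
```

The answer comes out around 5 F - several of the 1 F parts discussed in the thread, which is why pinning down the real flush interval matters before choosing components.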
Re: [zfs-discuss] New SSD options
On the PCIe side, I noticed there's a new card coming from LSI that claims 150,000 4k random writes. Unfortunately this might end up being an OEM-only card.

I also notice on the ddrdrive site that they now have an OpenSolaris driver and are offering it in a beta program.
Re: [zfs-discuss] New SSD options
On May 20, 2010, at 7:17 PM, Ragnar Sundblad ra...@csc.kth.se wrote:
> On 21 maj 2010, at 00.53, Ross Walker wrote:
>> On May 20, 2010, at 6:25 PM, Travis Tabbal tra...@tabbal.net wrote:
>>> [discussion of running without a slog snipped]
>>
>> Just buy a caching RAID controller and run it in JBOD mode and have
>> the ZIL integrated with the pool. A 512MB-1024MB card with battery
>> backup should do the trick. It might not have the capacity of an SSD,
>> but in my experience it works well in the 1TB-data, moderately loaded
>> range. Have more data/activity, then try more cards and more pools;
>> otherwise pony up for a capacitor-backed SSD.
>
> It - again - depends on what problem you are trying to solve.
>
> If the RAID controller goes bad on you so that you lose the data in
> the write cache, your file system could be in pretty bad shape.
>
> Most RAID controllers can't be mirrored. That would hardly make a good
> replacement for a mirrored ZIL.
>
> As far as I know, there is no single silver bullet to this issue.

That is true, and there are finite budgets as well, and as with all things in life one must make a trade-off somewhere.

If you have 2 mirrored SSDs that don't support cache flush and your power goes out, your file system will be in the same bad shape. Difference is, in the first place you paid a lot less to have your data hosed.

-Ross
Re: [zfs-discuss] New SSD options
On Thu, May 20, 2010 at 8:46 PM, Don d...@blacksun.org wrote:
> I'm kind of flabbergasted that no one has simply stuck a capacitor on
> a more reasonable drive. I guess the market just isn't big enough- but
> I find that hard to believe.

I just spoke with a co-worker about doing something about it. He says he can design a small in-line UPS that will deliver 20-30 seconds of 3.3V, 5V, and 12V to the SATA power connector for about $50 in parts. It would be even less if only one voltage was needed. That should be enough for most any SSD to finish any pending writes.

Any design that we come up with will be made publicly available under a Creative Commons or other similar license.

-B

--
Brandon High : bh...@freaks.com
Re: [zfs-discuss] New SSD options
> I just spoke with a co-worker about doing something about it. He says
> he can design a small in-line UPS that will deliver 20-30 seconds of
> 3.3V, 5V, and 12V to the SATA power connector for about $50 in parts.
> It would be even less if only one voltage was needed. That should be
> enough for most any SSD to finish any pending writes.

Oh, I wasn't kidding when I said I was going to have to try this with my home server. I actually do some circuit board design and this would be an amusing project. All you probably need is 5V- I'll look into it.
Re: [zfs-discuss] New SSD options
On 05/22/10 12:31 PM, Don wrote:
> Oh, I wasn't kidding when I said I was going to have to try this with
> my home server. I actually do some circuit board design and this would
> be an amusing project. All you probably need is 5V- I'll look into it.

Two supercaps should do the trick. Drive connectors only have 5 and 12V.

--
Ian.
Re: [zfs-discuss] New SSD options
On Fri, May 21, 2010 at 5:31 PM, Don d...@blacksun.org wrote:
> Oh, I wasn't kidding when I said I was going to have to try this with
> my home server. I actually do some circuit board design and this would
> be an amusing project. All you probably need is 5V- I'll look into it.

The SATA power connector supplies 3.3, 5 and 12V. A complete solution will have all three. Most drives use just the 5V, so you can probably ignore 3.3V and 12V.

You'll need to use a step-up DC-DC converter and be able to supply ~100mA at 5V. (I can't find any specific numbers on power consumption. Intel claims 75mW - 150mW for the X25-M. USB is rated at 500mA at 5V, and all drives that I've seen can run in an un-powered USB case.)

It's actually easier/cheaper to use a LiPoly battery charger and get a few minutes of power than to use an ultracap for a few seconds of power. Most ultracaps are ~2.5V and LiPoly is 3.7V, so you'll need a step-up converter in either case.

If you're supplying more than one voltage, you should use a microcontroller to shut off all the charge pumps at once when the battery/ultracap runs low. If you're only supplying 5V, it doesn't matter.

Cost for a 5V-only system should be $30 - $35 in one-off prototype-ready components with a 1100mAh battery (using prices from Sparkfun.com), plus the cost of an enclosure, etc. A larger buy, a custom PCB, and a smaller battery would probably reduce the cost 20-50%.

-B

--
Brandon High : bh...@freaks.com
Re: [zfs-discuss] New SSD options
> The SATA power connector supplies 3.3, 5 and 12v. A complete solution
> will have all three. Most drives use just the 5v, so you can probably
> ignore 3.3v and 12v.

I'm not interested in building something that's going to work for every possible drive config- just my config :) Both the Intel X25-E and the OCZ only use the 5V rail.

> You'll need to use a step up DC-DC converter and be able to supply
> ~100mA at 5v. It's actually easier/cheaper to use a LiPoly battery
> charger and get a few minutes of power than to use an ultracap for a
> few seconds of power. Most ultracaps are ~2.5v and LiPoly is 3.7v, so
> you'll need a step up converter in either case.

Ultracapacitors are available in voltage ratings beyond 12 volts, so there is no reason to use a boost converter with them. That eliminates high frequency switching transients right next to our SSD, which is always helpful. In this case- we have lots of room. We have a 3.5 x 1 drive bay, but a 2.5 x 1/4 hard drive. There is ample room for several of the 6.3V ELNA 1F capacitors (and our SATA power rail is a 5V regulated rail so they should suffice)- either in series or parallel (depending on voltage or runtime requirements).
http://www.elna.co.jp/en/capacitor/double_layer/catalog/pdf/dk_e.pdf

You could put 2 caps in series for better voltage tolerance or in parallel for longer runtimes. Either way you probably don't need a charge controller, a boost or buck converter, or in fact any ICs at all. It's just a small board with some caps on it.

> Cost for a 5v only system should be $30 - $35 in one-off
> prototype-ready components with a 1100mAh battery (using prices from
> Sparkfun.com),

You could literally split a sata cable and add in some capacitors for just the cost of the caps themselves. The issue there is whether the caps would present too large a current drain on initial charge-up- if they do, then you need to add in charge controllers and you've got the same problems as with a LiPo battery- although without the shorter service life.

At the end of the day the real problem is whether we believe the drives themselves will actually use the quiet period on the now-dead bus to write out their caches. This is something we should ask the manufacturers, and test for ourselves.
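[Editor's note] For scale, the hold-up time of the 1 F Elna parts mentioned above can be estimated from the energy balance t = ½C(V0² − Vmin²)/P. This deliberately ignores those caps' high ESR (which, as noted elsewhere in the thread, actually dominates for that part); the 2.4 W draw and the 5.0→4.5 V allowable sag are the same assumptions used earlier:

```python
# Idealized hold-up time for a cap bank discharging from v0 to vmin into a
# constant-power load: t = 1/2 * C * (v0^2 - vmin^2) / P. ESR is ignored.

def holdup_seconds(c_farads, v0, vmin, p_watts):
    return 0.5 * c_farads * (v0**2 - vmin**2) / p_watts

P = 2.4  # assumed SSD active draw on the 5 V rail, watts

t_one = holdup_seconds(1.0, 5.0, 4.5, P)  # one 1 F cap
t_two = holdup_seconds(2.0, 5.0, 4.5, P)  # two 1 F caps in parallel

print(f"1 F: {t_one:.2f} s, 2 F parallel: {t_two:.2f} s")
```

Roughly one second per farad under these assumptions - enough to cushion a momentary glitch, but nowhere near a multi-second cache flush window without paralleling several caps.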
Re: [zfs-discuss] New SSD options
On 20 maj 2010, at 00.20, Don wrote:
>> You can lose all writes from the last committed transaction (i.e.,
>> the one before the currently open transaction).
>
> And I don't think that bothers me. As long as the array itself doesn't
> go belly up- then a few seconds of lost transactions are largely
> irrelevant- all of the QA virtual machines are going to have to be
> rolled back to their initial states anyway.

Ok - then you are in the dream situation, and your solution could be free of charge, a one-liner command, and perform better than any SSD on the market: disable the ZIL.

You will lose up to 30 seconds of the most recently written data, and if you use it as an NFS server your clients may get confused after a crash, since the server is not in the state it should be in. You could also turn down the ZFS transaction timeout to lose less than 30 seconds if you want. Your pool will always be in a consistent shape on disk (if you have hardware that behaves).

Remember to NEVER use this pool for anything that actually wants better data persistency - this is a pool tuned specifically for a very special case.

In very recent OpenSolaris there is a pool property for this; earlier you had to set a kernel flag when mounting the pool (and have it unset when mounting other pools, if you want them to have the ZIL enabled).

/ragge
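[Editor's note] The two mechanisms alluded to above look roughly like this. A sketch, not a recommendation: the per-dataset `sync` property only exists in recent builds (around snv_140), the `zil_disable` tunable is the older, system-wide method, and the pool/dataset names are examples:

```shell
# Recent OpenSolaris builds: per-dataset property (takes effect immediately).
zfs set sync=disabled tank/qa-vms    # up to ~30 s of writes lost on a crash
zfs get sync tank/qa-vms             # verify the setting

# Older builds: system-wide kernel tunable (affects ALL pools), needs reboot.
echo "set zfs:zil_disable = 1" >> /etc/system
```

Note the tunable's system-wide scope is exactly the caveat raised in the message: it cannot be limited to one specially tuned pool.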
Re: [zfs-discuss] New SSD options
> On May 19, 2010, at 2:29 PM, Don wrote:
>> The data risk is a few moments of data loss. However, if the order of
>> the uberblock updates is not preserved (which is why the caches are
>> flushed) then recovery from a reboot may require manual intervention.
>> The amount of manual intervention could be significant for builds
>> prior to b128.

This risk is mostly mitigated by UPS backup and auto-shutdown when the UPS detects power loss, correct? Outside of pulling the plug, that should solve power-related problems. Kernel panics should only be caused by hardware issues, which might corrupt the disk data anyway.

Obviously software can and does fail, but the biggest problem I hear about with ZIL devices is behavior in a sudden power loss situation. It seems to me that UPS backup, along with starting a shutdown cycle before complete power failure, should prevent most issues. Seems like that should help with issues like the X25-E not honoring cache flush as well; the UPS would give it time to finish the writes. Again, barring a firmware issue in the drive itself. Should be about the same as a supercap anyway.
Re: [zfs-discuss] New SSD options
On 20 maj 2010, at 20.35, David Magda wrote:
> On Thu, May 20, 2010 14:12, Travis Tabbal wrote:
>>> The data risk is a few moments of data loss. However, if the order
>>> of the uberblock updates is not preserved (which is why the caches
>>> are flushed) then recovery from a reboot may require manual
>>> intervention. The amount of manual intervention could be significant
>>> for builds prior to b128.
>>
>> This risk is mostly mitigated by UPS backup and auto-shutdown when
>> the UPS detects power loss, correct?
>
> Unless you have a contractor working in the server room who bumps into
> the UPS and causes a power glitch which causes a whole bunch of
> equipment to cycle. Happened at $WORK (in another office) just two
> weeks ago.

Or any of a zillion other failure modes with that setup, from problems with the UPS, to the auto-shutdown communication signaling system, a problem with the computer system, the electrical distribution, or anything else.

Building complex solutions to solve critical issues is IMHO seldom a very good solution. If you care about data integrity, buy stuff that does what it is supposed to do, and keep everything simple. Redundancy is often good, but keep the switchover mechanisms as simple and as few as possible. Choose mechanisms that can and will be tested regularly - and don't use systems that are almost never used and/or tested. Complex systems tend to fail, especially after some time when things have changed a bit, or even cause more outages in themselves. They are hard to test, maintain and understand, and they are often costly to buy too. KISS, you know.

In the Intel X25 case - bug them until they release new firmware; they have sold you a defective product that they still haven't fixed. If they don't fix it and you need it, get another drive.

It all depends on your level of paranoia. Either that, or you may have some kind of protocol, policy, contract, SLA or similar that you have to follow.

(In any case it is often really hard to even guess how much a certain change gives or takes in availability numbers.)

Just my 5 öre.

/ragge
Re: [zfs-discuss] New SSD options
d == Don d...@blacksun.org writes:

 d> Since it ignores the Cache Flush command and it doesn't have any
 d> persistent buffer storage, disabling the write cache is the
 d> best you can do. This actually brings up another question I
 d> had: What is the risk, beyond a few seconds of lost writes, if
 d> I lose power, there is no capacitor and the cache is not
 d> disabled?

why use a slog at all if it's not durable? You should disable the ZIL instead.

Compared to a slog that ignores cache flush, disabling the ZIL will provide the same guarantees to the application w.r.t. write ordering preserved, and the same problems with NFS server reboots, replicated databases, mail servers. It'll be faster than the fake slog. It'll be less risk of losing the pool because the slog went bad and then you accidentally exported the pool while trying to fix things. The only case where you are ahead with the fake slog is when the host goes down because of a kernel panic rather than power loss.

I don't know, though, what to do about these reports of devices that almost respect cache flushes but seem to lose exactly one transaction. AFAICT this should be a works/doesn't-work situation, not a continuum.
Re: [zfs-discuss] New SSD options
On 05/20/10 12:26, Miles Nordin wrote:
> I don't know, though, what to do about these reports of devices that
> almost respect cache flushes but seem to lose exactly one transaction.
> AFAICT this should be a works/doesn't-work situation, not a continuum.

But there's so much brokenness out there. I've seen similar tail-drop behavior before -- the last write or two before a hardware reset goes into the bit bucket, but the ones before that are durable.

So, IMHO, a cheap consumer SSD used as a ZIL may still be worth it (for some use cases) to narrow the window of data loss from ~30 seconds to a sub-second value.

- Bill
Re: [zfs-discuss] New SSD options
> why use a slog at all if it's not durable? You should disable the ZIL
> instead.

This is basically where I was going. There only seems to be one SSD that is considered working, the Zeus IOPS. Even if I had the money, I can't buy it.

As my application is a home server, not a datacenter, things like NFS breaking if I don't reboot the clients are a non-issue. As long as the on-disk data is consistent, so I don't have to worry about the entire pool going belly-up, I'm happy enough. I might lose 30 seconds of data, worst case, as a result of running without the ZIL. Considering that I can't buy a proper ZIL device at a cost I can afford, and an improper ZIL device is not worth much, I don't see a reason to bother with the ZIL at all. I'll just get a cheap large SSD for L2ARC, disable the ZIL, and call it a day.

For my use, I'd want a device in the $200 range to even consider a slog device. As nothing even remotely close to that price range exists that will work properly at all, let alone with decent performance, I see no point in a ZIL slog for my application. The performance hit is just too severe to continue using the ZIL without a slog, and there's no slog device I can afford that works properly, even if I ignore performance.
Re: [zfs-discuss] New SSD options
On May 20, 2010, at 6:25 PM, Travis Tabbal tra...@tabbal.net wrote:
> [...] Considering that I can't buy a proper ZIL at a cost I can
> afford, and an improper ZIL is not worth much, I don't see a reason
> to bother with ZIL at all. I'll just get a cheap large SSD for L2ARC,
> disable ZIL, and call it a day. [...]

Just buy a caching RAID controller, run it in JBOD mode, and have the
ZIL integrated with the pool. A 512MB-1024MB card with battery backup
should do the trick. It might not have the capacity of an SSD, but in my
experience it works well with around 1TB of moderately loaded data. If
you have more data/activity, try more cards and more pools; otherwise,
pony up for a capacitor-backed SSD.

-Ross
Re: [zfs-discuss] New SSD options
On 21 May 2010, at 00.53, Ross Walker wrote:
> Just buy a caching RAID controller, run it in JBOD mode, and have the
> ZIL integrated with the pool. A 512MB-1024MB card with battery backup
> should do the trick. [...]

It - again - depends on what problem you are trying to solve. If the
RAID controller goes bad on you so that you lose the data in the write
cache, your file system could be in pretty bad shape. Most RAID
controllers can't be mirrored, so that would hardly make a good
replacement for a mirrored ZIL.

As far as I know, there is no single silver bullet to this issue.

/ragge
Re: [zfs-discuss] New SSD options
On May 20, 2010, at 1:12 PM, Bill Sommerfeld wrote:
> [...] So, IMHO, a cheap consumer ssd used as a zil may still be worth
> it (for some use cases) to narrow the window of data loss from ~30
> seconds to a sub-second value.

+1
 -- richard

--
ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
http://nexenta-rotterdam.eventbrite.com/
Re: [zfs-discuss] New SSD options
> So, IMHO, a cheap consumer ssd used as a zil may still be worth it
> (for some use cases) to narrow the window of data loss from ~30
> seconds to a sub-second value.

There are lots of reasons to enable the ZIL now- I can throw four very
inexpensive SSDs in there now in a pair of mirrors, and then when a
better drive comes along I can replace each half of the mirror without
bringing anything down. My slots are already allocated, and it would be
nice to save a few extra seconds of writes- just in case. It's not a
great solution- but nothing is.

I don't have access to a Zeus- and even if I did- I wouldn't pay that
kind of money for what amounts to a Vertex 2 Pro but with SLC flash. I'm
kind of flabbergasted that no one has simply stuck a capacitor on a more
reasonable drive. I guess the market just isn't big enough- but I find
that hard to believe. Right now it seems like the options are all or
nothing. There's just no %^$#^ middle ground.
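The pair-of-mirrors-then-upgrade-later plan is straightforward at the
command line. A sketch only -- the pool name "tank" and all device names
below are hypothetical:

```shell
# Build two mirrored log devices out of four cheap SSDs
# ("tank" and the cXtYd0 names are placeholders):
zpool add tank log mirror c3t0d0 c3t1d0 mirror c3t2d0 c3t3d0

# Later, swap one half of a log mirror for a better drive without
# taking the pool down; the new device resilvers online:
zpool replace tank c3t0d0 c4t0d0
```

Repeating the replace for each half of each mirror upgrades the whole
slog with no downtime, which is exactly the appeal of starting with
cheap mirrored devices.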
Re: [zfs-discuss] New SSD options
40k IOPS sounds like "best case, you'll never see it in the real world"
marketing to me. There are a few benchmarks if you google, and they all
seem to indicate the performance is probably +/- 10% of an Intel X25-E.
I would personally trust Intel over one of these drives.

Is it even possible to buy a Zeus IOPS anywhere? I haven't been able to
find one. I get the impression they mostly sell to other vendors like
Sun? I'd be curious what the price on a 9GB Zeus IOPS is these days.
Re: [zfs-discuss] New SSD options
Don wrote:
> With that in mind- Is anyone using the new OCZ Vertex 2 SSD's as a
> ZIL? They're claiming 50k IOPS (4k Write- Aligned), 2 million hour
> MTBF, TRIM support, etc. [...] Needless to say I'd love to know if
> anyone has evaluated these drives to see if they make sense as a ZIL-
> for example- do they honor cache flush requests? Are those sustained
> IOPS numbers?

In my understanding, nearly the only relevant number is the number of
cache flushes a drive can handle per second, as this determines my
single-thread performance. Does anyone have an idea what numbers I can
expect from an Intel X25-E or an OCZ Vertex 2?

-Arne
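That flush-rate number can be probed crudely from the command line. A
sketch, assuming GNU dd (the Solaris dd of this era lacks oflag, so run
it on a Linux box or with GNU coreutils installed), with the output path
a placeholder for a file on the device under test:

```shell
# Each 4 KiB write is synchronized to media (oflag=dsync) before the
# next one starts, so the elapsed time for 1000 writes approximates
# single-threaded, flush-bounded write latency on the target device.
# 1000 writes in 2 seconds ~= 500 sync writes/s.
time dd if=/dev/zero of=./synctest.bin bs=4k count=1000 oflag=dsync
```

On a drive that honors cache flushes, expect this to be dramatically
slower than the headline IOPS figure; on one that ignores them, the
number will look suspiciously close to its cached write rate.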
Re: [zfs-discuss] New SSD options
On Tue, May 18, 2010 at 4:28 PM, Don d...@blacksun.org wrote:
> With that in mind- Is anyone using the new OCZ Vertex 2 SSD's as a
> ZIL?

The current Sandforce drives out don't have an ultra-capacitor on them,
so they could lose data if the system crashed. Enterprise-class drives
based on the chipset that do have an ultra-cap are supposed to be
released any day now.

> Needless to say I'd love to know if anyone has evaluated these drives
> to see if they make sense as a ZIL- for example- do they honor cache
> flush requests? Are those sustained IOPS numbers?

I don't think they do; the chipset was designed to use an ultra-cap to
avoid having to honor flushes. Then again, the X25-E has the same
problem.

-B

--
Brandon High : bh...@freaks.com
Re: [zfs-discuss] New SSD options
On 2010-05-19 08.32, sensille wrote:
> In my understanding nearly the only relevant number is the number of
> cache flushes a drive can handle per second, as this determines my
> single thread performance. Has anyone an idea what numbers I can
> expect from an Intel X25-E or an OCZ Vertex 2?

I don't know about the OCZ Vertex 2, but the Intel X25-E roughly halves
its IOPS number when you disable its write cache (IIRC, it was in the
range of 1300-1600 writes/s or so). Since it ignores the Cache Flush
command and doesn't have any persistent buffer storage, disabling the
write cache is the best you can do.

Note that there were reports of the Intel X25-E losing a write even
though you had the write cache disabled! Since they still haven't fixed
this, after more than a year on the market, I believe it rather
qualifies for the hardly-usable-toy class. I am very disappointed; I had
hopes for a new class of cheap but usable flash drives. Maybe some
day...

/ragge
Re: [zfs-discuss] New SSD options
Well- 40k IOPS is the current claim from Zeus- and they're the
benchmark. They used to be 17k IOPS. How real any of these numbers are,
from any manufacturer, is a guess.

Given Intel's refusal to honor a cache flush, and their performance
problems with the cache disabled- I don't trust them any more than
anyone else right now. As for the Vertex drives- if they are within
+/-10% of the Intel, they're still doing it for half of what the Intel
drive costs- so it's an option- not a great option- but still an option.
Re: [zfs-discuss] New SSD options
> As for the Vertex drives- if they are within +-10% of the Intel
> they're still doing it for half of what the Intel drive costs- so
> it's an option- not a great option- but still an option.

Yes, but the Intel is SLC. Much more endurance.
Re: [zfs-discuss] New SSD options
On Wed, May 19, 2010 02:09, thomas wrote:
> Is it even possible to buy a zeus iops anywhere? I haven't been able
> to find one. I get the impression they mostly sell to other vendors
> like sun?

Correct, their Zeus products are only available to OEMs.
Re: [zfs-discuss] New SSD options
Well, the larger size of the Vertex, coupled with their smaller claimed
write amplification, should result in sufficient service life for my
needs. Their claimed MTBF also matches the Intel X25-E's.
Re: [zfs-discuss] New SSD options
> Since it ignores the Cache Flush command and doesn't have any
> persistent buffer storage, disabling the write cache is the best you
> can do.

This actually brings up another question I had: what is the risk, beyond
a few seconds of lost writes, if I lose power, there is no capacitor,
and the cache is not disabled?

My ZFS system is shared storage for a large VMware-based QA farm. If I
lose power, then a few seconds of writes are the least of my concerns.
All of the QA tests will need to be restarted and all of the file
systems will need to be checked. A few seconds of writes won't make any
difference unless it has the potential to affect the integrity of the
pool itself. Considering the performance trade-off, I'd happily give up
a few seconds' worth of writes for significantly improved IOPS.
Re: [zfs-discuss] New SSD options
On May 19, 2010, at 2:29 PM, Don wrote:
> This actually brings up another question I had: What is the risk,
> beyond a few seconds of lost writes, if I lose power, there is no
> capacitor and the cache is not disabled?

The data risk is a few moments of data loss. However, if the order of
the uberblock updates is not preserved (which is why the caches are
flushed), then recovery from a reboot may require manual intervention.
The amount of manual intervention could be significant for builds prior
to b128.

> My ZFS system is shared storage for a large VMWare based QA farm.
> [...] Considering the performance trade-off, I'd happily give up a
> few seconds worth of writes for significantly improved IOPS.

Space, dependability, performance: pick two :-)
 -- richard

--
Richard Elling
rich...@nexenta.com +1-760-896-4422
ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
http://nexenta-rotterdam.eventbrite.com/
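For reference, the improvement in b128 is the transaction rewind added
to pool import, which replaces most of that manual intervention. A
sketch; the pool name "tank" is a placeholder:

```shell
# Dry run: report how far back the pool would be rewound and what
# would be discarded, without actually importing it:
zpool import -F -n tank

# Rewind to the last consistent transaction group, discarding the
# damaged trailing transactions, and import the pool:
zpool import -F tank
```

This is exactly the "go back to the previous transaction" recovery path:
you lose the trailing writes, but the pool itself comes back consistent.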
Re: [zfs-discuss] New SSD options
On Wed, May 19, 2010 at 02:29:24PM -0700, Don wrote:
> This actually brings up another question I had: What is the risk,
> beyond a few seconds of lost writes, if I lose power, there is no
> capacitor and the cache is not disabled?

You can lose all writes from the last committed transaction (i.e., the
one before the currently open transaction). (You also lose writes from
the currently open transaction, but that's unavoidable in any system.)
Nowadays the system will let you know at boot time that the last
transaction was not committed properly, and you'll have a chance to go
back to the previous transaction.

For me, getting much-better-than-disk performance out of an SSD with the
cache disabled is enough to make that SSD worthwhile, provided the price
is right of course.

Nico
--
Re: [zfs-discuss] New SSD options
> You can lose all writes from the last committed transaction (i.e.,
> the one before the currently open transaction).

And I don't think that bothers me. As long as the array itself doesn't
go belly-up- then a few seconds of lost transactions are largely
irrelevant- all of the QA virtual machines are going to have to be
rolled back to their initial states anyway.
Re: [zfs-discuss] New SSD options
> Space, dependability, performance: pick two :-)

I'll pick one- performance :)

Honestly- I wish I had a better grasp on the real-world performance of
these drives. 50k IOPS is nice- and considering the incredible
likelihood of data duplication in my environment- the SandForce
controller seems like a win. That said- does anyone have a good set of
real-world performance numbers for these drives that you can link to?
[zfs-discuss] New SSD options
I'm looking for alternative SSD options to the Intel X25-E and the Zeus
IOPS. The Zeus IOPS would probably cost as much as my entire current
disk system (80 15k SAS drives)- and that's just silly. The Intel is
much less expensive, and while fast- pales in comparison to the Zeus.
I've allocated 4 disk slots in my array for ZIL SSDs and I'm trying to
find the best performance for my dollar.

With that in mind- Is anyone using the new OCZ Vertex 2 SSDs as a ZIL?
http://www.ocztechnology.com/products/solid-state-drives/2-5--sata-ii/performance-enterprise-solid-state-drives/ocz-vertex-2-sata-ii-2-5--ssd.html

They're claiming 50k IOPS (4k Write- Aligned), 2 million hour MTBF, TRIM
support, etc. That's more write IOPS than the Zeus (40k IOPS, $) but at
half the price of an Intel X25-E (3.3k IOPS, $400). Needless to say, I'd
love to know if anyone has evaluated these drives to see if they make
sense as a ZIL- for example- do they honor cache flush requests? Are
those sustained IOPS numbers?