[SLUG] HOT SWAPPING - Internals of the sync/umount call
On 16/05/2009, at 4:32 PM, slug-requ...@slug.org.au wrote:

> From: Daniel Pittman dan...@rimspace.net
> Date: 16 May 2009 4:31:34 PM
> To: slug@slug.org.au
> Subject: Re: [SLUG] hot swapping hard drives
>
> Grahame Kelly grah...@wildpossum.com writes:

>> Of the seven systems I look after, three have hot-swapping HDAs via RAID5/6 drive enclosures, two systems have add-on SATAII caddies for hot-swap and the others are without hot-swapping. If you're interested, and to reset your angst a little, I have been in the computing/engineering industry for 25++ years. On the software level the OS only has to ensure that all dirty (written to) memory pages are written out to the drives and such buffering flushed, drive index tables updated and written, ALL before the drive is removed.

> That only handles the hot *UN*-plug side of things, and can cause significant grief to you if the driver doesn't cope: anything from several minutes in which *all* disks on that controller are unavailable during error handling, through to a controller hang.

Rather than stating what I suspect is just a belief, have you looked at the Kernel source code at all? If so I would be very interested in exactly where you state such activity happens. According to Linux Internals Doco (and herein I refer to the Linux Drivers themselves), once the device has been un-mounted the OS warrants that the device, its linked control blocks, buffers etc. are indeed flushed and data secured on the device medium. The applicable drivers HAVE already unloaded any cache data before the umount command returns with its resultant response.

> (Admittedly, the last is only on really bad hardware, but hey, that hardware is out there and still within the reasonable life of machines for home users.) Anyway, once the hardware doesn't die completely you still need the driver stack to notice and remove the now absent hardware from the software shadow representation.

Crap controllers are just that - crap ;-)

> After all, you don't want /dev/sdb hanging about when the disk itself has been removed, taking up a slot and making life miserable. :)

I have never experienced this in all the years working with Linux. Either you haven't un-mounted the device correctly (that is, checked the return status byte if in a script), or the OS release you refer to is/was buggy.

> (Oh, and, of course, the hardware needs to be able to notify the driver that the device did actually go away, which not all hardware can.)

Again - read the source code.

>> The CLI command umount does this within the Linux / Unix OS.

> That should have the filesystem flush data, but doesn't actually push out dirty pages for the device; if you accessed it raw at any point this will not be sufficient.

Mounting raw was never mentioned. As everyone should know - you're on your own if you mount any device raw, as you become the only one responsible for its connectivity, data control and reliability.

> (Also, lower layers such as LVM, software RAID, etc, might not flush their data during the unmount process.)

Yep, every driver should - otherwise they are badly designed and implemented.

>> The sync command/programming API call is another way to do this programmatically.

> That will flush raw blocks from the device also.

>> That is all that is required.

> Those are necessary, but not sufficient, steps, I fear. Also, on the hotplug side, where a new device is added, your driver needs to cope with detecting the device addition, probing it and ensuring the hardware copes, and with reporting that up the software stack.

Yes but that is my point! - This is all part of the kernel drivers' responsibility - read all about this in the source code... and the kernel internals. Hence, there is no need to portray the overside of hot swapping as problematic, as you put it.

>> On the hardware side, the PSU socket must ensure that power is presented to the drive before logic is connected (ground first). This is why the +12v, +5v and GND pins are usually extended about 8mm before the rest of the pins are connected.

> FWIW, SATA devices are hot-swap and there are ... a little less than 8mm of coverage for those connections. Just sayin'

SATA I, II and forthcoming III specifications originally covered hot-swapping. So it would be expected at the hardware level.

Cheers.
Grahame

> Regards,
> Daniel

--
SLUG - Sydney Linux User Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html
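To make the sync/umount discussion above concrete, here is a minimal Python sketch of flushing a file's data before a drive is removed. The function name `write_durably` and the demo path are hypothetical; `os.fsync` and `os.sync` are the standard library wrappers around the fsync and sync system calls. It only shows what the software side can ask for; whether the drive has really committed its own cache is up to the hardware.

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> int:
    """Write data to path and ask the kernel to push it out to the device."""
    with open(path, "wb") as f:
        written = f.write(data)
        f.flush()             # move data from Python's buffer into the kernel page cache
        os.fsync(f.fileno())  # ask the kernel to write this file's dirty pages to the device
    os.sync()                 # same idea as the sync command: flush all dirty buffers
    return written


if __name__ == "__main__":
    # Hypothetical demo location; a real use would target a file on the removable drive.
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "demo.bin")
        print(write_durably(demo, b"hello"))
```

After this returns, the filesystem would still need to be unmounted (and the umount exit status checked) before the drive is pulled.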
Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call
On Sat, May 16, 2009, Grahame Kelly wrote:

> Rather than stating what I suspect is just a belief, have you looked at the Kernel source code at all? If so I would be very interested in exactly where you state such activity happens. According to Linux Internals Doco (and herein I refer to the Linux Drivers themselves), once the device has been un-mounted the OS warrants that the device, its linked control blocks, buffers etc. are indeed flushed and data secured on the device medium. The applicable drivers HAVE already unloaded any cache data before the umount command returns with its resultant response.

And I assume that you 100% believe that when the drive says YES SIR I HAVE SYNCED it has actually done this? :)

> I have never experienced this in all the years working with Linux. Either you haven't un-mounted the device correctly (that is, checked the return status byte if in a script), or the OS release you refer to is/was buggy.

Or you've been lucky!

>> FWIW, SATA devices are hot-swap and there are ... a little less than 8mm of coverage for those connections. Just sayin'

> SATA I, II and forthcoming III specifications originally covered hot-swapping. So it would be expected at the hardware level.

It's optional. And it is not always implemented correctly. I have some notes somewhere from some previous experiments with various desktop-y SATA chipsets under FreeBSD/Linux and I found that they didn't all do hotswap as advertised. ;)

Adrian
Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call
>> (Admittedly, the last is only on really bad hardware, but hey, that hardware is out there and still within the reasonable life of machines for home users.) Anyway, once the hardware doesn't die completely you still need the driver stack to notice and remove the now absent hardware from the software shadow representation.

> Crap controllers are just that - crap ;-)

Returning to the original inquiry, and now that I know that I have a

    82801GB/GR/GH (ICH7 Family) SATA IDE Controller (Intel)

how do I go about finding out if it's safe to hot-swap? Google hasn't been very helpful thus far, perhaps because I'm asking the wrong question.

Of course, someone might say "hey, I've got one of those and it's fine", but if someone was reading this in the archives and wanted to find out about slightly different hardware, how would you find out? I'm old fashioned enough that it's about the fishing rather than the fish ;-)
Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call
david da...@kenpro.com.au writes:

>>> (Admittedly, the last is only on really bad hardware, but hey, that hardware is out there and still within the reasonable life of machines for home users.) Anyway, once the hardware doesn't die completely you still need the driver stack to notice and remove the now absent hardware from the software shadow representation.

>> Crap controllers are just that - crap ;-)

> Returning to the original inquiry, and now that I know that I have a 82801GB/GR/GH (ICH7 Family) SATA IDE Controller (Intel), how do I go about finding out if it's safe to hot-swap?

Did you try the libata status report page I posted the link to a while back? That should confirm that your ICH7 supports hotplug.

Regards,
Daniel
Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call
Grahame Kelly grah...@wildpossum.com writes:

> From: Daniel Pittman dan...@rimspace.net
>> Grahame Kelly grah...@wildpossum.com writes:

[...]

>> That only handles the hot *UN*-plug side of things, and can cause significant grief to you if the driver doesn't cope: anything from several minutes in which *all* disks on that controller are unavailable during error handling, through to a controller hang.

> Rather than stating what I suspect is just a belief, have you looked at the Kernel source code at all?

I am a little curious why you suspect this is just a belief, but to answer your question, yes. I /do/ know what the kernel does, as well as having some idea of how various controllers handle (or fail to handle) the hotplug events.

I even tested some improvements to the libata error handler, way back when, when it turned out that I owned one of the controllers that needed a little extra hand-holding in the error handler after a hotplug event.

> If so I would be very interested in exactly where you state such activity happens.

What, you mean hardware or drivers that don't cope? Well, the NVIDIA SATA drivers had some problems that would cause a long, long delay trying error recovery if a device got unplugged. IIRC, an inverted bit in the sense data returned from the controller was responsible, but it has been some time.

> According to Linux Internals Doco (and herein I refer to the Linux Drivers themselves), once the device has been un-mounted the OS warrants that the device, its linked control blocks, buffers etc. are indeed flushed and data secured on the device medium.

Sure. What that has to do with drivers that don't cope with error recovery from a hot-unplug of a device, though, I don't quite follow.

[...]

>> After all, you don't want /dev/sdb hanging about when the disk itself has been removed, taking up a slot and making life miserable. :)

> I have never experienced this in all the years working with Linux.

Well, I am surprised. Certainly, on non-hotplug hardware the behaviour of a Linux block device driver is to keep the device around and report appropriate errors.

> Either you haven't un-mounted the device correctly (that is, checked the return status byte if in a script), or the OS release you refer to is/was buggy.

On the other hand, it seems we are talking about different things here. Yes, if a driver that supports hotplug, with hardware that supports hotplug, fails to remove the software device after the hardware is gone, it has a bug.

[...]

>> (Also, lower layers such as LVM, software RAID, etc, might not flush their data during the unmount process.)

> Yep, every driver should - otherwise they are badly designed and implemented.

I don't think you quite follow: if you unmount a filesystem in an LV, but keep the PV active, LVM can quite reasonably keep metadata active and in memory. You have to deactivate the PV for that to change, which is the equivalent LVM operation to unmount for a filesystem. Software RAID behaves in a similar fashion.

[...]

> Yes but that is my point! - This is all part of the kernel drivers' responsibility - read all about this in the source code... and the kernel internals. Hence, there is no need to portray the overside of hot swapping as problematic, as you put it.

Sorry, I don't follow you. I don't recall, and can't find in my text, a reference to the overside of hot swapping. Can you clarify what you were responding to there?

>>> On the hardware side, the PSU socket must ensure that power is presented to the drive before logic is connected (ground first). This is why the +12v, +5v and GND pins are usually extended about 8mm before the rest of the pins are connected.

>> FWIW, SATA devices are hot-swap and there are ... a little less than 8mm of coverage for those connections. Just sayin'

> SATA I, II and forthcoming III specifications originally covered hot-swapping. So it would be expected at the hardware level.

My point was that there is not an 8mm long electrical connector on a SATA cable, or device. Nothing more than that.

Regards,
Daniel
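The layering point in this message (a filesystem on an LV, on a PV, possibly on software RAID) can be sketched as an ordered teardown. The Python below only builds the command list and runs nothing. The function name and the example mount point, volume group and md device are hypothetical, and using `vgchange -an` as the "deactivate" step is an assumption about how the LVM operation described above maps onto the command-line tools.

```python
from typing import List, Optional


def teardown_commands(mountpoint: str,
                      volume_group: Optional[str] = None,
                      md_device: Optional[str] = None) -> List[List[str]]:
    """Return the commands to release a storage stack, top layer first."""
    cmds = [["umount", mountpoint]]                     # filesystem layer
    if volume_group:
        cmds.append(["vgchange", "-an", volume_group])  # deactivate LVM volumes (assumed mapping)
    if md_device:
        cmds.append(["mdadm", "--stop", md_device])     # stop the software RAID array
    return cmds


if __name__ == "__main__":
    # Hypothetical names, for illustration only; nothing is executed.
    for cmd in teardown_commands("/mnt/backup", "vg_backup", "/dev/md0"):
        print(" ".join(cmd))
```

Keeping this as a dry run is deliberate: each step should be checked for a zero exit status before moving to the next layer down.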
Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call
Adrian Chadd adr...@creative.net.au writes:

> On Sat, May 16, 2009, Grahame Kelly wrote:

[...]

>>> FWIW, SATA devices are hot-swap and there are ... a little less than 8mm of coverage for those connections. Just sayin'

>> SATA I, II and forthcoming III specifications originally covered hot-swapping. So it would be expected at the hardware level.

> It's optional. And it is not always implemented correctly.

Hot-swap is mandatory in the SATA spec, but not all controller chips report it in a meaningful way to the OS. Unfortunately, while they (theoretically) have to support the operation, no one told their interface designers about that.[1]

> I have some notes somewhere from some previous experiments with various desktop-y SATA chipsets under FreeBSD/Linux and I found that they didn't all do hotswap as advertised. ;)

*nod* Likewise, Linux was a lot more troublesome. The OP has recent enough hardware that his life is fine, provided he is using the ICH7 in AHCI mode. Lucky him, and lucky the rest of us now that hotplug is pretty much a standard feature.

Regards,
Daniel

Footnotes:
[1] ...well, in fairness, a bunch of the early hardware was just a PATA controller with a SATA bridge plugged in, or operated in a mode that looked just like a PATA controller with no hot-plug support.
Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call
Daniel Pittman wrote:

> david da...@kenpro.com.au writes:

>>>> (Admittedly, the last is only on really bad hardware, but hey, that hardware is out there and still within the reasonable life of machines for home users.) Anyway, once the hardware doesn't die completely you still need the driver stack to notice and remove the now absent hardware from the software shadow representation.

>>> Crap controllers are just that - crap ;-)

>> Returning to the original inquiry, and now that I know that I have a 82801GB/GR/GH (ICH7 Family) SATA IDE Controller (Intel), how do I go about finding out if it's safe to hot-swap?

> Did you try the libata status report page I posted the link to a while back? That should confirm that your ICH7 supports hotplug.

http://ata.wiki.kernel.org/index.php/SATA_hardware_features

When you posted I didn't know which controller was in there; now I do.

    Chip         Driver          NCQ   DMA++  hotplug  PMP
    ICH7 family  ata_piix, ahci  AHCI  AHCI   AHCI     no

Since I still don't want to fry the drive, the question still remains (for me at least, given that I'm not as erudite as some): will this hotplug? Especially since other controllers on this page are listed simply as "hotplug: yes" rather than "hotplug: AHCI".

I guess I could just get an old SATA drive and try it, but from the discussion there also seems to be a question mark about frying the PSU.

many thanks,
David
Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call
david da...@kenpro.com.au writes:

> Daniel Pittman wrote:
>> david da...@kenpro.com.au writes:

>>>>> (Admittedly, the last is only on really bad hardware, but hey, that hardware is out there and still within the reasonable life of machines for home users.) Anyway, once the hardware doesn't die completely you still need the driver stack to notice and remove the now absent hardware from the software shadow representation.

>>>> Crap controllers are just that - crap ;-)

>>> Returning to the original inquiry, and now that I know that I have a 82801GB/GR/GH (ICH7 Family) SATA IDE Controller (Intel), how do I go about finding out if it's safe to hot-swap?

>> Did you try the libata status report page I posted the link to a while back? That should confirm that your ICH7 supports hotplug.

> http://ata.wiki.kernel.org/index.php/SATA_hardware_features
>
> When you posted I didn't know which controller was in there; now I do.
>
>     Chip         Driver          NCQ   DMA++  hotplug  PMP
>     ICH7 family  ata_piix, ahci  AHCI  AHCI   AHCI     no
>
> Since I still don't want to fry the drive, the question still remains (for me at least, given that I'm not as erudite as some): will this hotplug?

Yes, if you are running it in AHCI mode. Specifically, you have to be in something other than compatibility mode in the BIOS, and it has to identify as an AHCI controller during boot. Check the kernel messages after boot to confirm that:

    dmesg | grep -i ahci

> Especially since other controllers on this page are listed simply as "hotplug: yes" rather than "hotplug: AHCI".

Fair point. Sorry, I should have been clearer.

> I guess I could just get an old SATA drive and try it, but from the discussion there also seems to be a question mark about frying the PSU.

If it makes you feel any better, you can't *physically* damage a SATA device by hotplugging it. You could corrupt the data on it if Linux hadn't written everything out, and you could crash your system, but nothing worse than that.

Regards,
Daniel
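For anyone who wants to script the kernel-message check suggested in this reply, here is a rough Python equivalent of `dmesg | grep -i ahci`. The function names are hypothetical, and the sketch assumes the `dmesg` command is installed and readable by the current user; if it cannot be run, the result is simply empty.

```python
import subprocess
from typing import Iterable, List


def grep_ahci(lines: Iterable[str]) -> List[str]:
    """Case-insensitive filter, like `grep -i ahci`."""
    return [line for line in lines if "ahci" in line.lower()]


def kernel_ahci_messages() -> List[str]:
    """Run dmesg and keep only the lines that mention AHCI."""
    try:
        result = subprocess.run(["dmesg"], capture_output=True,
                                text=True, check=False)
    except OSError:
        return []  # dmesg missing or not executable on this system
    return grep_ahci(result.stdout.splitlines())


if __name__ == "__main__":
    for line in kernel_ahci_messages():
        print(line)
```

An empty result suggests, but does not prove, that the controller came up in a compatibility mode rather than as AHCI; the kernel log may also have rotated since boot.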
Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call
On 16/05/2009, at 5:53 PM, Adrian Chadd wrote:

> On Sat, May 16, 2009, Grahame Kelly wrote:

>> Rather than stating what I suspect is just a belief, have you looked at the Kernel source code at all? If so I would be very interested in exactly where you state such activity happens. According to Linux Internals Doco (and herein I refer to the Linux Drivers themselves), once the device has been un-mounted the OS warrants that the device, its linked control blocks, buffers etc. are indeed flushed and data secured on the device medium. The applicable drivers HAVE already unloaded any cache data before the umount command returns with its resultant response.

> And I assume that you 100% believe that when the drive says YES SIR I HAVE SYNCED it has actually done this? :)

Hi Adrian.

It is all part of the standards each industry strives for. SATA drive manufacturers validate and belong to the applicable standards groups just for these reasons.

I am not disputing that some drives or controllers may not be standards conforming (at times this is more than likely). If and only if a drive and/or its controller conform to such standards, then whatever data stream needs to be written by the subsystem on the completion of a sync or in response to a umount is supposed to ensure that such data is stored on the media either before the status response is returned to the driving s/w or is warranted to have done so.

If this didn't happen then all hell would break loose (which is what you're saying). I don't believe much if anything at all. We both have discovered via our experiences that when things don't work, a.k.a. don't conform to a standard, this is when structures or such methodologies break.

Under POSIX umount is supposed to warrant such for the device, its controlling structures and associated kernel drive tables. If the system(s) don't, then they simply are non-conforming implementations. That is ALL.

>> I have never experienced this in all the years working with Linux. Either you haven't un-mounted the device correctly (that is, checked the return status byte if in a script), or the OS release you refer to is/was buggy.

> Or you've been lucky!

Whatever.

>>> FWIW, SATA devices are hot-swap and there are ... a little less than 8mm of coverage for those connections. Just sayin'

>> SATA I, II and forthcoming III specifications originally covered hot-swapping. So it would be expected at the hardware level.

> It's optional. And it is not always implemented correctly. I have some notes somewhere from some previous experiments with various desktop-y SATA chipsets under FreeBSD/Linux and I found that they didn't all do hotswap as advertised. ;)

You're correct to say "And it is not always implemented correctly" -- that is exactly what I am trying to show through this discussion.

It is your experiments that I and others are interested in. We may together be able to:

A. Narrow the problem down and, if it is a Linux or driver implementation issue, make and forward a patch to make the OS better compliant.
B. If it's a drive issue, advise the manufacturer, or simply advise others not to purchase the same because of these issues.
C. Find a work-around, be it an operational, hardware or software one, and advise others not just in SLUG but in the wider Linux/FreeBSD world.

That's my main point of this followup. Not a person-to-person argument, but rather a technical follow-up to narrow down and implement a solution, as I have already pointed out.

Keep tracking those problematic issues. Set up a controlled test, document it and forward it to others so they may test the same and return their results to you.

Cheers.
Grahame

> Adrian
Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call
On Sun, May 17, 2009 at 11:04:33AM +1000, Grahame Kelly wrote:

> On 16/05/2009, at 5:53 PM, Adrian Chadd wrote:
>> On Sat, May 16, 2009, Grahame Kelly wrote:

>>> Rather than stating what I suspect is just a belief, have you looked at the Kernel source code at all? If so I would be very interested at

[snip]

> I am not disputing that some drives or controllers may not be standards conforming (at times this is more than likely). If and only if a drive and/or its controller conform to such standards, then whatever data stream needs to be written by the subsystem on the completion of a sync or in response to a umount is supposed to ensure that such data is stored on the media either before the status response is returned to the driving s/w or is warranted to have done so.

> If this didn't happen then all hell would break loose (which is what you're saying). I don't believe much if anything at all. We both have discovered via our experiences that when things don't work, a.k.a. don't conform to a standard, this is when structures or such methodologies break.

> Under POSIX umount is supposed to warrant such for the device, its controlling structures and associated kernel drive tables. If the system(s) don't, then they simply are non-conforming implementations. That is ALL.

I think you missed the point about partitions sitting on LVM sitting on RAID. If you umount an LVM partition, the block device provided by LVM is unmounted, but the LVM volume group and potentially the RAID device underneath are not.

Your above statement is only really true when we use the drive directly and not via DM or LVM.

[snip]

> Cheers.
> Grahame

>> Adrian

--
The legislature's job is to write law. It's the executive branch's job to interpret law. - George W. Bush 11/22/2000 Austin, TX
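One way to see the stacking described in this reply is to ask sysfs what is still sitting on top of a disk after the filesystem has been unmounted. The Python sketch below reads the `holders` directories under `/sys/block`, where the kernel lists device-mapper (LVM) and md devices that are using a disk or partition. The function name and the disk name `sdb` are illustrative assumptions; the `sysfs_block` parameter exists only so the function can be pointed at a test directory.

```python
import os
from typing import Dict, List


def find_holders(disk: str, sysfs_block: str = "/sys/block") -> Dict[str, List[str]]:
    """Map the disk and each of its partitions to the devices stacked on top."""
    disk_dir = os.path.join(sysfs_block, disk)
    result: Dict[str, List[str]] = {}
    if not os.path.isdir(disk_dir):
        return result  # no such disk in sysfs
    # The disk itself, plus partition subdirectories such as sdb1, sdb2.
    names = [disk] + sorted(n for n in os.listdir(disk_dir) if n.startswith(disk))
    for name in names:
        base = disk_dir if name == disk else os.path.join(disk_dir, name)
        holders_dir = os.path.join(base, "holders")
        if os.path.isdir(holders_dir):
            holders = sorted(os.listdir(holders_dir))
            if holders:
                result[name] = holders  # e.g. an md or dm-N device still active
    return result


if __name__ == "__main__":
    # "sdb" is a hypothetical disk name; an empty dict means nothing is stacked on it.
    print(find_holders("sdb"))
```

A non-empty result after umount means a lower layer (LVM or software RAID) is still active on the disk, so it is not yet safe to pull.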