[SLUG] HOT SWAPPING - Internals of the sync/umount call

2009-05-16 Thread Grahame Kelly


On 16/05/2009, at 4:32 PM, slug-requ...@slug.org.au wrote:

From: Daniel Pittman dan...@rimspace.net
Date: 16 May 2009 4:31:34 PM
To: slug@slug.org.au
Subject: Re: [SLUG] hot swapping hard drives


Grahame Kelly grah...@wildpossum.com writes:

Of the seven systems I look after, three have hot-swapping HDAs via
RAID5/6 drive enclosures, two systems have add-on SATA II caddies for
hot-swap and the others are without hot-swapping.  If you're interested,
and to reset your angst a little, I have been in the
computing/engineering industry for 25++ years.

On the software level the OS only has to ensure that all dirty
(written to) memory pages are written out to the drives and such
buffering flushed, drive index tables updated and written ALL before
the drive is removed.


That only handles the hot *UN*-plug side of things, and can cause
significant grief to you if the driver doesn't cope: anything from
several minutes in which *all* disks on that controller are unavailable
during error handling, through to a controller hang.


Rather than stating what I suspect is just a belief, have you looked
at the kernel source code at all? If so, I would be very interested in
exactly where you say such activity happens.
According to the Linux internals doco (and herein I refer to the Linux
drivers themselves), once the device has been un-mounted the OS
warrants that the device, its linked control blocks, buffers etc.
are indeed flushed and the data secured on the device medium.  The
applicable driver has already unloaded any cached data before the
umount command returns with its resultant response.




(Admittedly, the last is only on really bad hardware, but hey, that
hardware is out there and still within the reasonable life of machines
for home users.)

Anyway, once the hardware doesn't die completely you still need the
driver stack to notice and remove the now absent hardware from the
software shadow representation.



Crap controllers are just that - crap ;-)


After all, you don't want /dev/sdb hanging about when the disk itself
has been removed, taking up a slot and making life miserable. :)


I have never experienced this in all the years working with Linux.
Either you haven't un-mounted the device correctly (that is, checked
the return status byte if in a script), or the OS release you refer to
is/was buggy.




(Oh, and, of course, the hardware needs to be able to notify the driver
that the device did actually go away, which not all hardware can.)


Again - read the source code.




The CLI command umount does this within the Linux / Unix OS.


That should have the filesystem flush data, but doesn't actually push
out dirty pages for the device — if you accessed it raw at any point
this will not be sufficient.


Mounting raw was never mentioned.  As everyone should know, you're on
your own if you mount any device raw, as you become the only
one responsible for its connectivity, data control and reliability.




(Also, lower layers such as LVM, software RAID, etc, might not flush
their data during the unmount process.)


Yep every driver should - otherwise they are badly designed and  
implemented.





The sync command/programming API call is another way to do this
programmatically.


That will flush raw blocks from the device also.


That is all that is required.


Those are necessary, but not sufficient, steps, I fear.
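
For readers following the sync discussion above, here is a minimal sketch of the programmatic route in Python, assuming a POSIX system. The helper name write_durably is made up for this illustration. os.fsync asks the kernel to push one file's dirty pages out to the device, and os.sync does the same for every filesystem. Neither call proves that the drive's own write cache has reached the medium, which is the caveat raised later in the thread.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data to path, then ask the kernel to flush it to the device.

    Hypothetical helper for illustration; returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while written < len(data):
            written += os.write(fd, data[written:])
        # Flush this file's dirty pages (and metadata) to the block device.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written
```

A caller would typically follow this with os.sync() before unmounting, and still treat the result as "the kernel has handed the data over" rather than "the platters have it".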




Also, on the hotplug side, where a new device is added, your driver
needs to cope with detecting the device addition, probing it and
ensuring the hardware copes, and with reporting that up the software
stack.


Yes, but that is my point! This is all part of the kernel drivers'
responsibility - read all about this in the source code... and the
kernel internals.  Hence, there is no need to portray the overside
of hot swapping as problematic, as you put it.



On the hardware side, the PSU socket must ensure that power is
presented to the drive before logic is connected (ground first). This
is why the +12v, +5v and GND pins are usually extended about 8mm
before the rest of the pins are connected.


FWIW, SATA devices are hot-swap and there are ... a little less than 8mm
of coverage for those connections.  Just sayin'


SATA I, II and forthcoming III specifications originally covered
hot-swapping. So it would be expected at the hardware level.


Cheers.
Grahame



Regards,
   Daniel



--
SLUG - Sydney Linux User Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html




Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call

2009-05-16 Thread Adrian Chadd
On Sat, May 16, 2009, Grahame Kelly wrote:

 Rather than stating what I suspect is just a belief, have you looked
 at the kernel source code at all? If so, I would be very interested in
 exactly where you say such activity happens.
 According to the Linux internals doco (and herein I refer to the Linux
 drivers themselves), once the device has been un-mounted the OS
 warrants that the device, its linked control blocks, buffers etc.
 are indeed flushed and the data secured on the device medium.  The
 applicable driver has already unloaded any cached data before the
 umount command returns with its resultant response.

And I assume that you 100% believe that when the drive says YES SIR
I HAVE SYNCED it has actually done this? :)

 I have never experienced this in all the years working with Linux.
 Either you haven't un-mounted the device correctly (that is, checked
 the return status byte if in a script), or the OS release you refer to
 is/was buggy.

Or you've been lucky!

 FWIW, SATA devices are hot-swap and there are ... a little less than 8mm
 of coverage for those connections.  Just sayin'

 SATA I, II and forthcoming III specifications originally covered
 hot-swapping. So it would be expected at the hardware level.

It's optional. And it is not always implemented correctly.

I have some notes somewhere from some previous experiments with various
desktop-y SATA chipsets under FreeBSD/Linux and I found that they didn't all
do hotswap as advertised. ;)




Adrian



Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call

2009-05-16 Thread david


 (Admittedly, the last is only on really bad hardware, but hey, that
 hardware is out there and still within the reasonable life of machines
 for home users.)

 Anyway, once the hardware doesn't die completely you still need the
 driver stack to notice and remove the now absent hardware from the
 software shadow representation.


 Crap controllers are just that - crap ;-)


Returning to the original inquiry, and now that I know that I have a
82801GB/GR/GH (ICH7 Family) SATA IDE Controller (Intel)
How do I go about finding out if it's safe to hot-swap? Google hasn't been very 
helpful thus far, perhaps because I'm asking the wrong question.


Of course, someone might say hey I've got one of those and it's fine but if 
someone was reading this in the archives and wanted to find out about slightly 
different hardware, how would you find out?



I'm old fashioned enough that it's about the fishing rather than the fish ;-)


Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call

2009-05-16 Thread Daniel Pittman
david da...@kenpro.com.au writes:
 (Admittedly, the last is only on really bad hardware, but hey, that
 hardware is out there and still within the reasonable life of machines
 for home users.)

 Anyway, once the hardware doesn't die completely you still need the
 driver stack to notice and remove the now absent hardware from the
 software shadow representation.

 Crap controllers are just that - crap ;-)

 Returning to the original inquiry, and now that I know that I have a
 82801GB/GR/GH (ICH7 Family) SATA IDE Controller (Intel) How do I go
 about finding out if it's safe to hot-swap?

Did you try the libata status report page I posted the link to a while
back?  That should confirm that your ICH7 supports hotplug.

Regards,
Daniel


Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call

2009-05-16 Thread Daniel Pittman
Grahame Kelly grah...@wildpossum.com writes:
 From: Daniel Pittman dan...@rimspace.net
 Grahame Kelly grah...@wildpossum.com writes:

[...]

 That only handles the hot *UN*-plug side of things, and can cause
 significant grief to you if the driver doesn't cope: anything from
 several minutes in which *all* disks on that controller are unavailable
 during error handling, through to a controller hang.

 Rather than stating what I suspect is just a belief, have you looked
 at the kernel source code at all?

I am a little curious why you suspect this is just a belief, but to
answer your question, yes.  I /do/ know what the kernel does, as well as
having some idea of how various controllers handle (or fail to) the
hotplug events.

I even tested some improvements to the libata error handler, way back
when, when it turned out that I owned one of the controllers that needed
a little extra hand-holding in the error handler after a hotplug event.

 If so, I would be very interested in exactly where you say such
 activity happens.

What, you mean hardware or drivers that don't cope?  Well, the NVIDIA
SATA drivers had some problems that would cause a long, long delay
trying error recovery if a device got unplugged.  IIRC, an inverted bit
in the sense data returned from the controller was responsible, but it
has been some time.

 According to the Linux internals doco (and herein I refer to the Linux
 drivers themselves), once the device has been un-mounted the OS
 warrants that the device, its linked control blocks, buffers etc. are
 indeed flushed and the data secured on the device medium.

Sure.  What that has to do with drivers that don't cope with error
recovery from a hot-unplug of a device, though, I don't quite follow.

[...]

 After all, you don't want /dev/sdb hanging about when the disk itself
 has been removed, taking up a slot and making life miserable. :)

 I have never experienced this in all the years working with Linux.

Well, I am surprised.  Certainly, on non-hotplug hardware the behaviour
of a Linux block device driver is to keep the device around and report
appropriate errors.

 Either you haven't un-mounted the device correctly (that is, checked
 the return status byte if in a script), or the OS release you refer to
 is/was buggy.

On the other hand, it seems we are talking about different things here.

Yes, if a driver that supports hotplug, with hardware that supports
hotplug, fails to remove the software device after the hardware is
gone it has a bug.
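
For anyone hitting the stale /dev/sdb case on hardware that cannot report the unplug, a rough Python sketch of checking for the leftover entry and asking the kernel to drop it. It assumes the disk sits behind the SCSI layer (as libata disks do) so that the usual sysfs "delete" attribute is present. The function names are made up for this example, and the write needs root and an already unmounted filesystem.

```python
import os


def sysfs_delete_path(disk, sysfs_root="/sys"):
    """Return the sysfs file used to ask the SCSI layer to drop a disk.

    disk is a bare name such as "sdb" (an assumption of this sketch).
    """
    if not disk or "/" in disk:
        raise ValueError("expected a bare device name such as 'sdb'")
    return os.path.join(sysfs_root, "block", disk, "device", "delete")


def is_registered(disk, sysfs_root="/sys"):
    """True if the kernel still has a block device entry for the disk."""
    return os.path.isdir(os.path.join(sysfs_root, "block", disk))


def request_removal(disk, sysfs_root="/sys"):
    """Ask the kernel to remove the device. Needs root; unmount first."""
    with open(sysfs_delete_path(disk, sysfs_root), "w") as handle:
        handle.write("1\n")
```

Checking is_registered("sdb") after pulling a drive is a quick way to see whether the driver noticed the removal on its own.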

[...]

 (Also, lower layers such as LVM, software RAID, etc, might not flush
 their data during the unmount process.)

 Yep every driver should - otherwise they are badly designed and
 implemented.

I don't think you quite follow: if you unmount a filesystem in an LV,
but keep the PV active, LVM can quite reasonably keep metadata active
and in memory.

You have to deactivate the PV for that to change, which is the
equivalent LVM operation to unmount for a filesystem.

Software RAID behaves in a similar fashion.
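
To make the layering concrete, a small Python sketch of the teardown order for a filesystem on LVM on software RAID. The mount point, volume group and md device names are placeholders, and the function names are invented for this example. LVM's tools deactivate at the volume group or logical volume level, so the sketch uses vgchange -an for the LVM step, and mdadm --stop for the array. Treat it as an outline of the order of operations, not a tested procedure for any particular system.

```python
import subprocess


def teardown_commands(mountpoint, volume_group=None, md_device=None):
    """Build the commands, top of the stack first, for releasing a disk set."""
    commands = [["umount", mountpoint]]
    if volume_group:
        # Deactivate every logical volume in the group.
        commands.append(["vgchange", "-an", volume_group])
    if md_device:
        # Stop the software RAID array underneath.
        commands.append(["mdadm", "--stop", md_device])
    return commands


def run_teardown(commands, runner=subprocess.run):
    """Run each command in order, stopping at the first non-zero exit status."""
    for command in commands:
        runner(command, check=True)
```

For example, teardown_commands("/mnt/data", "vg0", "/dev/md0") yields the umount, vgchange and mdadm steps in that order, and run_teardown raises as soon as one of them fails, so a later step never runs against a layer that is still busy.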

[...]

 Yes, but that is my point! This is all part of the kernel drivers'
 responsibility - read all about this in the source code... and the
 kernel internals.  Hence, there is no need to portray the overside
 of hot swapping as problematic, as you put it.

Sorry, I don't follow you.  I don't recall, and can't find in my text, a
reference to the overside of hot swapping.

Can you clarify what you were responding to there?

 On the hardware side, the PSU socket must ensure that power is
 presented to the drive before logic is connected (ground first). This
 is why the +12v, +5v and GND pins are usually extended about 8mm
 before the rest of the pins are connected.

 FWIW, SATA devices are hot-swap and there are ... a little less than
 8mm of coverage for those connections.  Just sayin'

 SATA I, II and forthcoming III specifications originally covered
 hot-swapping. So it would be expected at the hardware level.

My point was that there is not an 8mm long electrical connector on a
SATA cable, or device.  Nothing more than that.

Regards,
Daniel


Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call

2009-05-16 Thread Daniel Pittman
Adrian Chadd adr...@creative.net.au writes:
 On Sat, May 16, 2009, Grahame Kelly wrote:

[...]

 FWIW, SATA devices are hot-swap and there are ... a little less than
 8mm of coverage for those connections.  Just sayin'

 SATA I, II and forthcoming III specifications originally covered
 hot-swapping. So it would be expected at the hardware level.

 It's optional. And it is not always implemented correctly.

Hot-swap is mandatory in the SATA spec, but not all controller chips
report it in a meaningful way to the OS.  Unfortunately, while they
(theoretically) have to support the operation no one told their
interface designers about that.[1]

 I have some notes somewhere from some previous experiments with
 various desktop-y SATA chipsets under FreeBSD/Linux and I found that
 they didn't all do hotswap as advertised. ;)

*nod*  Likewise, Linux was a lot more troublesome.  The OP has recent
enough hardware that his life is fine, if he is using the ICH7 in AHCI
mode though.  Lucky him, and lucky the rest of us now that hotplug is
pretty much a standard feature.

Regards,
Daniel

Footnotes: 
[1]  ...well, in fairness, a bunch of the early hardware was just a PATA
 controller with a SATA bridge plugged in, or operated in a mode
 that looked just like a PATA controller with no hot-plug support.



Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call

2009-05-16 Thread david

Daniel Pittman wrote:
 david da...@kenpro.com.au writes:
 (Admittedly, the last is only on really bad hardware, but hey, that
 hardware is out there and still within the reasonable life of machines
 for home users.)

 Anyway, once the hardware doesn't die completely you still need the
 driver stack to notice and remove the now absent hardware from the
 software shadow representation.
 Crap controllers are just that - crap ;-)
 Returning to the original inquiry, and now that I know that I have a
 82801GB/GR/GH (ICH7 Family) SATA IDE Controller (Intel) How do I go
 about finding out if it's safe to hot-swap?

 Did you try the libata status report page I posted the link to a while
 back?  That should confirm that your ICH7 supports hotplug.


http://ata.wiki.kernel.org/index.php/SATA_hardware_features

When you posted I didn't know which controller was in there, now I do.

Chip         Driver          NCQ   DMA++  hotplug  PMP
ICH7 family  ata_piix, ahci  AHCI  AHCI   AHCI     no

Since I still don't want to fry the drive, the question still remains (for me 
at least, given that I'm not as erudite as some) Will this hotplug?? 
especially since other controllers on this page are listed simply as hotplug: 
yes rather than hotplug: AHCI.


I guess I could just get an old SATA drive and try it, but from the discussion 
there also seems to be a question mark about frying the PSU.


many thanks,

David


Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call

2009-05-16 Thread Daniel Pittman
david da...@kenpro.com.au writes:
 Daniel Pittman wrote:
 david da...@kenpro.com.au writes:
 (Admittedly, the last is only on really bad hardware, but hey, that
 hardware is out there and still within the reasonable life of machines
 for home users.)

 Anyway, once the hardware doesn't die completely you still need the
 driver stack to notice and remove the now absent hardware from the
 software shadow representation.
 Crap controllers are just that - crap ;-)
 Returning to the original inquiry, and now that I know that I have a
 82801GB/GR/GH (ICH7 Family) SATA IDE Controller (Intel) How do I go
 about finding out if it's safe to hot-swap?

 Did you try the libata status report page I posted the link to a while
 back?  That should confirm that your ICH7 supports hotplug.


 http://ata.wiki.kernel.org/index.php/SATA_hardware_features

 When you posted I didn't know which controller was in there, now I do.

 Chip         Driver          NCQ   DMA++  hotplug  PMP
 ICH7 family  ata_piix, ahci  AHCI  AHCI   AHCI     no

 Since I still don't want to fry the drive, the question still remains
 (for me at least, given that I'm not as erudite as some) Will this
 hotplug??

Yes, if you are running it in AHCI mode.  Specifically, you have to be
in something other than compatibility mode in the BIOS, and it has to
identify as an AHCI controller during boot.

Check the kernel messages after boot to confirm that:

dmesg | grep -i ahci
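
The same check can be scripted. Below is a minimal Python sketch, assuming you feed it kernel log text (for example the output of dmesg) and that your kernel exposes the usual /sys/class/scsi_host/host*/proc_name files naming the driver behind each host. The function names are made up for this example.

```python
import glob
import re


def ahci_lines(kernel_log):
    """Return the kernel log lines that mention AHCI, case-insensitively."""
    return [
        line
        for line in kernel_log.splitlines()
        if re.search("ahci", line, re.IGNORECASE)
    ]


def host_drivers(pattern="/sys/class/scsi_host/host*/proc_name"):
    """Map each SCSI host (host0, host1, ...) to the driver name it reports."""
    drivers = {}
    for path in sorted(glob.glob(pattern)):
        host = path.split("/")[-2]
        with open(path) as handle:
            drivers[host] = handle.read().strip()
    return drivers
```

If host_drivers() reports ata_piix rather than ahci for the port in question, the controller is most likely running in compatibility mode and the BIOS setting is the place to look.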

 especially since other controllers on this page are listed simply as
 hotplug: yes rather than hotplug: AHCI.

Fair point.  Sorry, I should have been clearer.

 I guess I could just get an old SATA drive and try it, but from the
 discussion there also seems to be a question mark about frying the
 PSU.

If it makes you feel any better you can't *physically* damage a SATA
device hotplugging it.  You could corrupt the data on it if Linux hadn't
written everything out, and you could crash your system, but nothing
worse than that.

Regards,
Daniel


Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call

2009-05-16 Thread Grahame Kelly


On 16/05/2009, at 5:53 PM, Adrian Chadd wrote:


On Sat, May 16, 2009, Grahame Kelly wrote:


Rather than stating what I suspect is just a belief, have you looked
at the kernel source code at all? If so, I would be very interested in
exactly where you say such activity happens.
According to the Linux internals doco (and herein I refer to the Linux
drivers themselves), once the device has been un-mounted the OS
warrants that the device, its linked control blocks, buffers etc.
are indeed flushed and the data secured on the device medium.  The
applicable driver has already unloaded any cached data before the
umount command returns with its resultant response.


And I assume that you 100% believe that when the drive says YES SIR
I HAVE SYNCED it has actually done this? :)



Hi Adrian.

It is all part of the standards each industry strives for. SATA drive
manufacturers validate and belong to the applicable standards groups
just for these reasons.


I am not disputing that some drives or controllers may not be
standards conforming (at times this is more than likely). If and only
if a drive and/or its controller conforms to such standards, then for
whatever data stream needs to be written by the subsystem on the
completion of a sync or in response to a umount, it is supposed to
ensure that such data is stored on the media either before the status
response is returned to the driving s/w, or is warranted to have done
so.  If this didn't happen then all hell would break loose (which is
what you're saying).


I don't believe much if anything at all.

We both have discovered via our experiences that when things don't work,
a.k.a. don't conform to a standard, this is when structures or such
methodologies break.
Under POSIX, umount is supposed to warrant such for the device, its
controlling structures and associated kernel drive tables.  If the
system(s) don't, then they simply are non-conforming implementations.
That is ALL.



I have never experienced this in all the years working with Linux.
Either you haven't un-mounted the device correctly (that is, checked
the return status byte if in a script), or the OS release you refer to
is/was buggy.


Or you've been lucky!


Whatever.

FWIW, SATA devices are hot-swap and there are ... a little less than 8mm
of coverage for those connections.  Just sayin'


SATA I, II and forthcoming III specifications originally covered
hot-swapping. So it would be expected at the hardware level.


It's optional. And it is not always implemented correctly.

I have some notes somewhere from some previous experiments with various
desktop-y SATA chipsets under FreeBSD/Linux and I found that they didn't all
do hotswap as advertised. ;)


You're correct to say "And it is not always implemented correctly" --
that is exactly what I am trying to show through this discussion.


It is your experiments that I and others are interested in.
We may together be able to:
A) Narrow the problem down - and if it is a Linux or driver
implementation issue - make and forward a patch to make the OS more
compliant.
B) If it's a drive issue, advise the manufacturer, or simply advise
others not to purchase it because of these issues.
C) Find a work-around - be it an operational, hardware or software one -
and advise others, not just in SLUG but in the wider Linux/FreeBSD world.


That's my main point in this follow-up.  Not a person-to-person
agreement - rather a technical follow-up to narrow down and implement
a solution, as I have already pointed out.


Keep tracking those problematic issues,
Set up a controlled test, document it and forward it to others so they  
may test the same and return their results to you.


Cheers.
Grahame



Adrian





Re: [SLUG] HOT SWAPPING - Internals of the sync/umount call

2009-05-16 Thread Alex Samad
On Sun, May 17, 2009 at 11:04:33AM +1000, Grahame Kelly wrote:

 On 16/05/2009, at 5:53 PM, Adrian Chadd wrote:

 On Sat, May 16, 2009, Grahame Kelly wrote:

 Rather than stating what I suspect is just a belief, have you looked
 at the kernel source code at all? If so, I would be very interested in

[snip]

 I am not disputing that some drives or controllers may not be standards
 conforming (at times this is more than likely). If and only if a drive
 and/or its controller conforms to such standards, then for whatever data
 stream needs to be written by the subsystem on the completion of a
 sync or in response to a umount, it is supposed to ensure that such data
 is stored on the media either before the status response is returned to
 the driving s/w, or is warranted to have done so.  If this didn't happen
 then all hell would break loose (which is what you're saying).

 I don't believe much if anything at all.

 We both have discovered via our experiences that when things don't work,
 a.k.a. don't conform to a standard, this is when structures or such
 methodologies break.
 Under POSIX, umount is supposed to warrant such for the device, its
 controlling structures and associated kernel drive tables.  If the
 system(s) don't, then they simply are non-conforming implementations.
 That is ALL.

I think you missed the point about partitions sitting on LVM sitting on
RAID.  If you umount an LVM partition, the block device provided by LVM is
unmounted - but the LVM group and potentially the RAID device underneath
isn't.

Your above statement is only really true when we use the drive directly
and not via DM or LVM.

[snip]

 Cheers.
 Grahame


 Adrian




-- 
The legislature's job is to write law. It's the executive branch's job to 
interpret law.

- George W. Bush
11/22/2000
Austin, TX

