> But you're not attempting hotswap, you're doing hot plug....

Do you mean hot UNplug? Because I'm not trying to get this thing to recognize 
any new disks without a restart... Honest. I'm just trying to prevent the 
machine from freezing up when a drive fails. I have no problem restarting the 
machine with a new drive in it later so that it recognizes the new disk.

> and unless you're using the onboard bios' concept of an actual
> RAID array, you don't have an array, you've got a JBOD and
> it's not a real JBOD - it's a PC motherboard which does _not_
> have the same electronic and electrical protections that a
> JBOD has *by design*.

I'm confused about what your definition of a RAID array is - and, for that 
matter, what a JBOD is. I've got plenty of experience with both, but just to 
make sure I wasn't off my rocker, I consulted the demigod:

http://en.wikipedia.org/wiki/RAID
http://en.wikipedia.org/wiki/JBOD

and I think what I'm doing is indeed RAID... I'm not using a controller card or 
any specialized hardware, so it's certainly not hardware RAID (and thus doesn't 
have any of the fancy electronic or electrical protections you mentioned), but 
lacking those protections doesn't preclude the machine from being considered a 
RAID. All the disks are the same capacity, the OS sees the zpool I've created 
as one large volume, and since I'm using RAID-Z (the single-parity analogue of 
RAID5), it should be redundant... What other qualifiers are necessary before a 
system can be called a RAID?

If it's hot-swap capability, or a controller that hides the details from the 
OS and presents a single volume, then I would argue those things are extras - 
not fundamental prerequisites for a system to be called a RAID.
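
To make it concrete, a RAID-Z pool like the one I'm describing gets built with 
something along these lines (the pool name, disk names, and disk count here are 
placeholders rather than my exact setup):

    # one single-parity RAID-Z vdev built from plain, same-sized disks
    zpool create tank raidz c0d0 c0d1 c1d0 c1d1

    # the OS then sees a single pool, with parity spread across the members
    zpool status tank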

Furthermore, while I'm not sure what the difference between a "real JBOD" and a 
plain old JBOD is, this set-up certainly wouldn't qualify for either. I mean, 
there is no concatenation going on, redundancy should be present (but due to 
this issue, I haven't been able to verify that yet), and all the drives are the 
same size... Am I missing something in the definition of a JBOD?

I don't think so...
 
> And you're right, it can. But what you've been doing is outside
> the bounds of what IDE hardware on a PC motherboard is designed
> to cope with.

Well, yes, you're right, but it's not like I'm making some radical departure 
from what the hardware was designed for... As long as the departure isn't 
unreasonable, it shouldn't be a problem, because that's where software comes 
in: when the hardware can't cut it, software picks up the slack.

Now, obviously, I'm not saying software can do anything with any piece of 
hardware you give it - no matter how many lines of code you write, your 
keyboard isn't going to turn into a speaker - but when it comes to reasonable 
stuff like ensuring a machine doesn't crash because a user did something with 
the hardware that he or she wasn't supposed to do? Prime target for software.

And that's the way it's always been... The whole push behind the ZFS promise 
(or, to make it less specific, the attractiveness of RAID in general) was that 
"RAID-Z [wouldn't] require any special hardware. It doesn't need NVRAM for 
correctness, and it doesn't need write buffering for good performance. With 
RAID-Z, ZFS makes good on the original RAID promise: it provides fast, reliable 
storage using cheap, commodity disks." 
(http://blogs.sun.com/bonwick/entry/raid_z)

> Well sorry, it does. Welcome to an OS which does care.

The half-hearted apology wasn't necessary... I understand that OpenSolaris 
cares about how those disks are attached to the motherboard; what I don't 
understand is why that limitation exists in the first place. It seems much 
better to have an OS that doesn't care (but developers who do) and just finds 
a way to work, than one that does care (but developers who don't) and ends up 
inflexible and picky... I'm not saying OpenSolaris is the latter, but I'm not 
getting the impression it's the former either...

> If the controlling electronics for your disk can't
> handle it, then you're hosed. That's why FC, SATA (in SATA
> mode) and SAS are much more likely to handle this out of
> the box. Parallel SCSI requires funky hardware, which is why
> those old 6- or 12-disk multipacks are so useful to have.
> 
> Of the failure modes that you suggest above, only one
> is going to give you anything other than catastrophic
> failure (drive motor degradation) - and that is because the
> drive's electronics will realise this, and send warnings to
> the host.... which should have its drivers written so
> that these messages are logged for the sysadmin to act upon.
> 
> The other failure modes are what we call catastrophic. And
> where your hardware isn't designed with certain protections
> around drive connections, you're hosed. No two ways
> about it. If your system suffers that sort of failure, would
> you seriously expect that non-hardened hardware would survive it?

Yes, I would. At the risk of sounding repetitive, I'll summarize what I've been 
getting at in my previous responses: I certainly _do_ think it's reasonable to 
expect non-hardened hardware to survive this type of failure. In fact, I think 
it's unreasonable _not_ to expect it to. The Linux kernel, the BSD kernels, and 
the NT kernel (or whatever chunk of code runs Windows) all provide this type of 
functionality, and have for some time. Granted, they may all do it in 
different ways, but at the end of the day, unplugging an IDE hard drive from a 
software RAID5 array in OpenSuSE, RedHat, FreeBSD, or Windows XP Professional 
will not bring the machine down. And it shouldn't in OpenSolaris either. There 
might be some sort of noticeable bump (Windows, for example, pauses for a few 
seconds while it tries to figure out what the hell just happened to one of its 
disks), but there isn't anything show-stopping...

> If you've got newer hardware, which can support SATA
> in native SATA mode, USE IT.

I'll see what I can do - this might be some sort of BIOS setting that can be 
configured.
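
If it does turn out to be a BIOS option, I'm assuming something like the 
following would show whether the controller is actually running in native SATA 
mode once it's switched (just a sketch - I haven't verified either command on 
this box yet):

    # list attachment points; SATA ports show up when the native framework is in use
    cfgadm -al

    # see which driver is bound to the controller (e.g. ahci vs. the legacy ata driver)
    prtconf -D | grep -i ata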
 
> > I'm grateful for your help, but is there another way that you can think
> > of to get this to work?
> You could start by taking us seriously when we tell
> you that what you've been doing is not a good idea, and
> find other ways to simulate drive failures.

Let's drop the confrontational attitude - I'm not trying to dick around with you 
here. I've done my due diligence in researching this issue on Google, these 
forums, and Sun's documentation before making a post, I've provided any 
clarifying information that has been requested by those kind enough to post a 
response, and I've yet to resort to any witty or curt remarks in my 
correspondence with you, tcook, or myxiplx. Whatever is causing you to think 
I'm not taking anyone seriously, let me reassure you, I am.

The only thing I'm doing is testing a system by applying the worst case 
scenario of survivable torture to it and seeing how it recovers. If that's not 
a good idea, then I guess we disagree. But that's ok - you're James C. 
McPherson, Senior Kernel Software Engineer, Solaris, and I'm just some user 
who's trying to find a solution to his problem. My bad for expecting the same 
level of respect I've given two other members of this community to be returned 
in kind by one of its leaders.

So, aside from telling me to "[never] try this sort of thing with IDE", does 
anyone have any other ideas on how to prevent OpenSolaris from locking up 
whenever an IDE drive is abruptly disconnected from a ZFS RAID-Z array?
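
In the meantime, the closest software-only approximation of a failure I can 
think of runs something like this (pool and device names are placeholders, and 
it obviously doesn't exercise the abrupt-disconnect case I actually care 
about):

    # take one member of the RAID-Z set offline temporarily (-t = only until reboot)
    zpool offline -t tank c0d1

    # confirm the pool keeps running, just in a degraded state
    zpool status -x tank

    # bring the disk back and let ZFS resilver it
    zpool online tank c0d1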

-Todd
 
 