Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-09-03 Thread Joe S
On Fri, Aug 29, 2008 at 10:32 PM, Todd H. Poole <[EMAIL PROTECTED]> wrote:
> I can't agree with you more. I'm beginning to understand what the phrase 
> "Sun's software is great - as long as you're running it on Sun's hardware" 
> means...
>
> Whether it's deserved or not, I feel like this OS isn't mature yet. And maybe 
> it's not the whole OS, maybe it's some specific subsection (like ZFS), but my 
> general impression of OpenSolaris has been... not stellar.
>
> I don't think it's ready yet for a prime time slot on commodity hardware.

I agree, but with careful research you can find the *right* hardware.
In my quest (it took weeks) to find reports of reliable hardware, I found
that the AMD chipsets were far too buggy. I also noticed that in the
workstations Sun sells, they use nVidia nForce chipsets for the AMD
CPUs and the Intel X38 (the only Intel desktop chipset that supports ECC) for
the Intel CPUs. I read good and bad stories about various hardware and
decided I would stay close to what Sun sells. I've found NO Sun
hardware using the same chipset as yours.

There are a couple of AHCI bugs with the AMD/ATI SB600 chipset. Both
Linux and Solaris were affected. Linux put in a workaround that may
hurt performance slightly. Sun still has the bug open, but for what
it's worth, who's gonna use or care about a buggy desktop chipset in a
storage server?

I have an nVidia nForce 750a chipset (not the same as the Sun
workstations, which use the nForce Pro, but it's not too different) and the
same CPU (45-watt dual core!) you have. My system works great (so
far). I haven't tried the disconnect-a-drive test yet, though. I will try
it tonight.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-30 Thread dick hoogendijk
On Sat, 30 Aug 2008 09:35:31 -0300
Toby Thain <[EMAIL PROTECTED]> wrote:
> On 30-Aug-08, at 2:32 AM, Todd H. Poole wrote:
> > I can't agree with you more. I'm beginning to understand what the  
> > phrase "Sun's software is great - as long as you're running it on  
> > Sun's hardware" means...
> 
> Totally OT, but this is also why Apple doesn't sell OS X for
> whitebox junk. :)

There are also a lot of whiteboxes that -do- run Solaris very well.
"Some apples are rotten, others are healthy." That's quite normal.

-- 
Dick Hoogendijk -- PGP/GnuPG key: 01D2433D
++ http://nagual.nl/ + SunOS sxce snv95 ++
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-30 Thread Toby Thain

On 30-Aug-08, at 2:32 AM, Todd H. Poole wrote:

>> Wrt. what I've experienced and read on the zfs-discuss etc. lists,
>> I have the
>> __feeling__ that we would have got really into trouble using
>> Solaris
>> (even the most recent one) on that system ...
>> So if one asks me whether to run Solaris+ZFS on a production
>> system, I
>> usually say: definitely, but only if it is a Sun server ...
>>
>> My 2¢ ;-)
>
> I can't agree with you more. I'm beginning to understand what the  
> phrase "Sun's software is great - as long as you're running it on  
> Sun's hardware" means...
> ...

Totally OT, but this is also why Apple doesn't sell OS X for whitebox  
junk. :)

--Toby

>
> -Todd
> --
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-29 Thread Todd H. Poole
> Wrt. what I've experienced and read on the zfs-discuss etc. lists, I have the
> __feeling__ that we would have got really into trouble using Solaris
> (even the most recent one) on that system ... 
> So if one asks me whether to run Solaris+ZFS on a production system, I
> usually say: definitely, but only if it is a Sun server ...
> 
> My 2¢ ;-)

I can't agree with you more. I'm beginning to understand what the phrase "Sun's 
software is great - as long as you're running it on Sun's hardware" means...

Whether it's deserved or not, I feel like this OS isn't mature yet. And maybe 
it's not the whole OS, maybe it's some specific subsection (like ZFS), but my 
general impression of OpenSolaris has been... not stellar.

I don't think it's ready yet for a prime time slot on commodity hardware.

While I don't intend to fan any flames that might already exist (remember, 
I've only just joined within the past week, and thus haven't been around long 
enough to figure out whether any flames exist), I believe I'm justified in 
making the above statement. Just off the top of my head, here is a list of red 
flags I've run into in seven days' time:

 - If I don't wait for at least 2 minutes before logging into my system after 
I've powered everything up, my machine freezes.
 - If I yank a hard drive out of a (supposedly redundant) RAID5 array (or 
"RAID-Z zpool," as it's called) that has an NFS mount attached to it, not only 
does that mount point get severed, but _all_ NFS connections to all mount 
points are dropped, regardless of whether they were on the zpool or not. Oh, 
and then my machine freezes.
 - If I just yank a hard drive out of a (supposedly redundant) RAID5 array (or 
"RAID-Z zpool," as it's called), forgetting about NFS entirely, my machine 
freezes.
 - If I query a zpool for its status, but don't do so under the right 
circumstances, my machine freezes.

I've had to use the hard reset button on my case more times than I've had the 
ability to shut down the machine properly from a non-frozen console or GUI. 

That shouldn't happen.

I dunno. If this sounds like bitching, that's fine: I'll file bug reports and 
then move on. It's just that sometimes, software needs to grow a bit more 
before it's ready for production, and I feel like trying to run OpenSolaris + 
ZFS on commodity hardware just might be one of those times.

Just two more cents to add to yours.

As Richard said, the only way to fix things is to file bug reports. Hopefully, 
the most helpful things to come out of this thread will be those forms of 
constructive criticism.

As for now, it looks like a return to LVM2, XFS, and one of the Linux or BSD 
kernels might be a more stable decision, but don't worry - I haven't been 
completely dissuaded, and I definitely plan on checking back in a few releases 
to see how things are going in the ZFS world. ;)

Thanks everyone for your help, and keep improving! :)

-Todd
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-28 Thread James C. McPherson
Hi Todd,
sorry for the delay in responding, been head down rewriting
a utility for the last few days.


Todd H. Poole wrote:
> Howdy James,
> 
> While responding to halstead's post (see below), I had to restart several
> times to complete some testing. I'm not sure if that's important to these
> commands or not, but I just wanted to put it out there anyway.
> 
>> A few commands that you could provide the output from
>> include:
>>
>>
>> (these two show any FMA-related telemetry)
>> fmadm faulty
>> fmdump -v
> 
> This is the output from both commands:
> 
> [EMAIL PROTECTED]:~# fmadm faulty
> --------------- ------------------------------------  -------------- ---------
> TIME            EVENT-ID                              MSG-ID         SEVERITY
> --------------- ------------------------------------  -------------- ---------
> Aug 27 01:07:08 0d9c30f1-b2c7-66b6-f58d-9c6bcb95392a  ZFS-8000-FD    Major
> 
> Fault class : fault.fs.zfs.vdev.io
> Description : The number of I/O errors associated with a ZFS device exceeded
> acceptable levels.  Refer to 
> http://sun.com/msg/ZFS-8000-FD
>  for more information.
> Response: The device has been offlined and marked as faulted.  An attempt
> will be made to activate a hot spare if available.
> Impact  : Fault tolerance of the pool may be compromised.
> Action  : Run 'zpool status -x' and replace the bad device.
 >
> [EMAIL PROTECTED]:~# fmdump -v
> TIME                 UUID                                 SUNW-MSG-ID
> Aug 27 01:07:08.2040 0d9c30f1-b2c7-66b6-f58d-9c6bcb95392a ZFS-8000-FD
>  100%  fault.fs.zfs.vdev.io
> 
>Problem in: zfs://pool=mediapool/vdev=bfaa3595c0bf719
>   Affects: zfs://pool=mediapool/vdev=bfaa3595c0bf719
>   FRU: -
>  Location: -


In other emails in this thread you've mentioned the desire to
get an email (or some sort of notification) when Problems Happen(tm)
in your system, and the FMA framework is how we achieve that
in OpenSolaris.



# fmadm config
MODULE   VERSION STATUS  DESCRIPTION
cpumem-retire1.1 active  CPU/Memory Retire Agent
disk-transport   1.0 active  Disk Transport Agent
eft  1.16active  eft diagnosis engine
fabric-xlate 1.0 active  Fabric Ereport Translater
fmd-self-diagnosis   1.0 active  Fault Manager Self-Diagnosis
io-retire2.0 active  I/O Retire Agent
snmp-trapgen 1.0 active  SNMP Trap Generation Agent
sysevent-transport   1.0 active  SysEvent Transport Agent
syslog-msgs  1.0 active  Syslog Messaging Agent
zfs-diagnosis1.0 active  ZFS Diagnosis Engine
zfs-retire   1.0 active  ZFS Retire Agent


You'll notice that we've got an SNMP agent there... and you
can acquire a copy of the FMA mib from the Fault Management
community pages (http://opensolaris.org/os/community/fm and
http://opensolaris.org/os/community/fm/mib/).
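
If you want something simpler than SNMP in the interim, a small cron job
that polls fmadm and mails anything it finds will do the job - this is a
rough, untested sketch only, and [EMAIL PROTECTED] is just a placeholder
for whichever address you'd use:

#!/bin/sh
# mail the FMA fault report if any faults are outstanding
FAULTS=`/usr/sbin/fmadm faulty 2>&1`
if [ -n "$FAULTS" ]; then
        echo "$FAULTS" | /usr/bin/mailx -s "FMA faults on `hostname`" [EMAIL PROTECTED]
fi

(On recent builds fmadm faulty prints nothing when the system is healthy;
if your build prints headers regardless, adjust the test accordingly.)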




>> (this shows your storage controllers and what's
>> connected to them) cfgadm -lav
> 
> This is the output from cfgadm -lav
> 
> [EMAIL PROTECTED]:~# cfgadm -lav
> Ap_Id  Receptacle   Occupant Condition  
> Information
> When Type Busy Phys_Id
> usb2/1 emptyunconfigured ok
> unavailable  unknown  n/devices/[EMAIL 
> PROTECTED],0/pci1458,[EMAIL PROTECTED]:1
> usb2/2 connectedconfigured   ok
> Mfg: Microsoft  Product: Microsoft 3-Button Mouse with IntelliEye(TM)
> NConfigs: 1  Config: 0  
> unavailable  usb-mousen/devices/[EMAIL 
> PROTECTED],0/pci1458,[EMAIL PROTECTED]:2
> usb3/1 emptyunconfigured ok
[snip]
> usb7/2 emptyunconfigured ok
> unavailable  unknown  n/devices/[EMAIL 
> PROTECTED],0/pci1458,[EMAIL PROTECTED],1:2
> 
> You'll notice that the only thing listed is my USB mouse... is that expected?

Yup. One of the artefacts of the cfgadm architecture. cfgadm(1m)
works by using plugins - usb, FC, SCSI, SATA, pci hotplug, InfiniBand...
but not IDE.

I think you also were wondering how to tell what controller
instances your disks were using in IDE mode - two basic ways
of achieving this:

/usr/bin/iostat -En

and

/usr/sbin/format

Your IDE disks will attach using the cmdk driver and show up like this:

c1d0
c1d1
c2d0
c2d1

In AHCI/SATA mode they'd show up as

c1t0d0
c1t1d0
c1t2d0
c1t3d0

or something similar, depending on how the bios and the actual
controllers sort themselves out.
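
If you just want the quick inventory without walking format's menus,
something like this does it (a sketch - controller numbers will differ
on your box):

# /usr/bin/iostat -En | grep -i errors
(one summary line per disk, keyed by c#d# or c#t#d#)

# echo | /usr/sbin/format
(prints the disk list, then exits at the "Specify disk" prompt)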


>> You'll also find messages in /var/adm/messages which
>> might prove
>> useful to review.
> 
> If you really want, I can list the output from /var/adm/messages, but it
> doesn't seem to add anything new to what I've already copied and pasted.

No need - you've got them if you need them.

[snip]

>> http://docs.sun.com/app/docs/coll/40.1

Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-27 Thread Todd H. Poole
Ah yes - that video is what got this whole thing going in the first place... I 
referenced it in one of my other posts much earlier. Heh... there's something 
gruesomely entertaining about brutishly taking a drill or sledge hammer to a 
piece of precision hardware like that.

But yes, that's the kind of torture test I would like to conduct. However, I'm 
operating on a limited test budget right now, and I have to get the damn thing 
working in the first place before I start performing tests I can't easily 
reverse (I still have yet to fire up Bonnie++ and do some benchmarking), and 
most definitely before I can put on a show for those who control the purse 
strings...

But, imagine: walking into... oh say, I dunno... your manager's office, for 
example, and asking him to beat the hell out of one of your server's hard 
drives all the while promising him that no data would be lost, and none of his 
video on demand customers would ever notice an interruption in service. He 
might think you're crazy, but if it still works at the end of the day, your 
annual budget just might get a sizable increase to help you make all the other 
servers "sledge hammer resistant" like the first one. ;)

But that's just an example. That functionality could (and probably does) prove 
useful almost anywhere.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-27 Thread Miles Nordin
> "thp" == Todd H Poole <[EMAIL PROTECTED]> writes:

>> Would try this with
>> your pci/pci-e cards in this system? I think not.

   thp> Unplugging one of them seems like a fine test to me

I've done it, with 32-bit 5 volt PCI, I forget why.  I might have been
trying to use a board, but bypass the broken etherboot ROM on the
board.  It was something like that.

IIRC it works sometimes, crashes the machine sometimes, and fries the
hardware eventually if you keep doing it long enough.  

The exact same three cases are true of cold-plugging a PCI
card.  It just works a lot more often if you power down
first.

Does massively inappropriate hotplugging possibly weaken the hardware
so that it's more likely to pop later?  maybe.  Can you think of a
good test for that?

Believe it or not, sometimes accurate information is worth more than a
motherboard that cost $50 five years ago.  Sometimes saving ten
minutes is worth more.  Or... recovering an openprom password.

Testing availability claims rather than accepting them on faith, or 
rather than gaining experience in a slow, oozing, anecdotal way on
production machinery, is definitely not stupid.  Testing them in a way
that compares one system to another is double-un-stupid.







___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-27 Thread Miles Nordin
> "m" == MC  <[EMAIL PROTECTED]> writes:

 m> file another bug about how Solaris recognizes your AHCI SATA
 m> hardware as old IDE hardware.

I don't have that board, but AIUI the driver attachment is selectable in
the BIOS Blue Screen of Setup, by setting the controller to
``Compatibility'' mode (pci-ide) or ``Native'' mode (AHCI).  This
particular chip must be run in Compatibility mode because of bug
6665032.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-27 Thread Richard Elling
Todd H. Poole wrote:
> And I want this test to be as rough as it gets. I don't want to play 
> nice with this system... I want to drag it through the most tortuous 
> worst-case scenario tests I can imagine, and if it survives with all 
> my test data intact, then (and only then) will I begin to trust it.

http://www.youtube.com/watch?v=naKd9nARAes
:-)
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-27 Thread Tim
On Wed, Aug 27, 2008 at 1:18 AM, MC <[EMAIL PROTECTED]> wrote:

> Okay, so your AHCI hardware is not using an AHCI driver in Solaris.  A
> crash when pulling a cable is still not great, but it is understandable
> because that driver is old and bad and doesn't support hot swapping at all.
>

His AHCI hardware is not using AHCI because he's set it not to.  If Linux is
somehow ignoring the BIOS configuration and attempting to load an AHCI driver
for the hardware anyway, that's *BROKEN* behavior.  I've yet to see WHAT driver
Linux was using, because he was too busy having a pissing match to get that
USEFUL information back to the list.
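
For the record, finding that out on the Linux side is only a line or two
(a sketch - exact output depends on the distro and kernel):

# dmesg | grep -iE 'ahci|sata|pata'
(shows which libata driver claimed the controller)

# lsmod | grep -E '^(ahci|ata_|sata_|pata_)'
(lists whichever ATA driver modules are actually loaded)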

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure, resumes when disk is replaced

2008-08-27 Thread Ross
Hi Todd,

Having finally gotten the time to read through this entire thread, I think Ralf 
said it best.  ZFS can provide data integrity, but you're reliant on hardware 
and drivers for data availability.

In this case either your SATA controller, or the drivers for it don't cope at 
all well with a device going offline, so what you need is a SATA card that can 
handle that.  Provided you have a controller that can cope with the disk 
errors, it should be able to return the appropriate status information to ZFS, 
which will in turn ensure your data is ok.

The technique obviously works or Sun's x4500 servers wouldn't be doing anywhere 
near as well as they are.  The problem we all seem to be having is finding 
white box hardware that supports it.

I suspect your best bet would be to pick up a SAS controller based on the LSI 
chipsets used in the new x4540 server.  There's been a fair bit of discussion 
here on these, and while there's a limitation in that you will have to manually 
keep track of drive names, I would expect it to handle disk failures (and 
pulling disks) much better.  You would probably be well advised to ask the 
folks on the forums running those SAS controllers whether they've been able to 
pull disks successfully.

I think the solution you need is definitely a better disk controller, and 
your choice is either a plain SAS controller, or a RAID controller that can 
present individual disks in pass-through mode, since those *definitely* are 
designed to handle failures.

Ross
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-27 Thread Todd H. Poole
I plan on fiddling around with this failmode property in a few hours. I'll be 
using http://docs.sun.com/app/docs/doc/817-2271/gftgp?l=en&a=view as a 
reference.
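
Something along these lines is what I have in mind (untested on my end as
of yet; "mediapool" is the pool from my earlier output):

# zpool get failmode mediapool
(should report "wait", the default)

# zpool set failmode=continue mediapool
(the accepted values being wait, continue and panic)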

I'll let you know what I find out.

-Todd
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-27 Thread Florin Iucha
On Tue, Aug 26, 2008 at 11:18:51PM -0700, MC wrote:
> The two bonus things to do are: come to the forum and bitch about the bugs to 
> give them some attention, and come to the forum asking for help on making 
> Solaris recognize your AHCI SATA hardware properly :)

Been there, done that.  No t-shirt, though...

The Solaris kernel might be the best thing since MULTICS, but the lack
of drivers really hampers its spread.

florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-27 Thread Todd H. Poole
Howdy James,

While responding to halstead's post (see below), I had to restart several times 
to complete some testing. I'm not sure if that's important to these commands or 
not, but I just wanted to put it out there anyway.

> A few commands that you could provide the output from
> include:
> 
> 
> (these two show any FMA-related telemetry)
> fmadm faulty
> fmdump -v

This is the output from both commands:

[EMAIL PROTECTED]:~# fmadm faulty
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Aug 27 01:07:08 0d9c30f1-b2c7-66b6-f58d-9c6bcb95392a  ZFS-8000-FD    Major

Fault class : fault.fs.zfs.vdev.io
Description : The number of I/O errors associated with a ZFS device exceeded
acceptable levels.  Refer to http://sun.com/msg/ZFS-8000-FD
 for more information.
Response: The device has been offlined and marked as faulted.  An attempt
will be made to activate a hot spare if available.
Impact  : Fault tolerance of the pool may be compromised.
Action  : Run 'zpool status -x' and replace the bad device.



[EMAIL PROTECTED]:~# fmdump -v
TIME                 UUID                                 SUNW-MSG-ID
Aug 27 01:07:08.2040 0d9c30f1-b2c7-66b6-f58d-9c6bcb95392a ZFS-8000-FD
 100%  fault.fs.zfs.vdev.io

   Problem in: zfs://pool=mediapool/vdev=bfaa3595c0bf719
  Affects: zfs://pool=mediapool/vdev=bfaa3595c0bf719
  FRU: -
 Location: -


> (this shows your storage controllers and what's
> connected to them) cfgadm -lav

This is the output from cfgadm -lav

[EMAIL PROTECTED]:~# cfgadm -lav
Ap_Id  Receptacle   Occupant Condition  Information
When Type Busy Phys_Id
usb2/1 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED]:1
usb2/2 connectedconfigured   ok
Mfg: Microsoft  Product: Microsoft 3-Button Mouse with IntelliEye(TM)
NConfigs: 1  Config: 0  
unavailable  usb-mousen/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED]:2
usb3/1 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],2:1
usb3/2 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],2:2
usb4/1 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],3:1
usb4/2 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],3:2
usb5/1 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],4:1
usb5/2 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],4:2
usb6/1 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],5:1
usb6/2 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],5:2
usb6/3 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],5:3
usb6/4 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],5:4
usb6/5 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],5:5
usb6/6 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],5:6
usb6/7 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],5:7
usb6/8 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],5:8
usb6/9 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],5:9
usb6/10emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],5:10
usb7/1 emptyunconfigured ok
unavailable  unknown  n/devices/[EMAIL PROTECTED],0/pci1458,[EMAIL 
PROTECTED],1:1
usb7/2 emptyunconfigur

Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-26 Thread MC
Okay, so your AHCI hardware is not using an AHCI driver in Solaris.  A crash 
when pulling a cable is still not great, but it is understandable, because that 
driver is old and bad and doesn't support hot swapping at all.

So there are two things to do here.  File a bug about how pulling a SATA cable 
crashes Solaris when the device is using the old IDE driver.  And file another 
bug about how Solaris recognizes your AHCI SATA hardware as old IDE hardware.

The two bonus things to do are: come to the forum and bitch about the bugs to 
give them some attention, and come to the forum asking for help on making 
Solaris recognize your AHCI SATA hardware properly :)

Good luck...
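
For the bug reports it helps to show which driver actually attached to the 
controller.  A quick check (rough sketch, the exact node names vary by board):

# prtconf -D | egrep -i 'ide|ahci|sata'

The driver name in parentheses tells you whether you ended up on pci-ide or 
ahci.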

> Gotcha. But just to let you know, there are 4 SATA
> ports on the motherboard, with each drive getting its
> own port... how should I go about testing to see
> whether pulling one IDE drive (remember, they're
> really SATA drives, but they're being presented to
> the OS by the pci-ide driver) locks the entire IDE
> channel if there's only one drive per channel? Or do
> you think it's possible that two ports on the
> motherboard could be on one "logical channel" (for
> lack of a  better phrase) while the other two are on
> the other, and thus we could test one drive while
> another on the same "logical channel" is unplugged?
> 
> Also, remember that OpenSolaris freezes when this
> occurs, so I'm only going to have 2-3 seconds to
> execute a command before Terminal and - after a few
> more seconds, the rest of the machine - stop
> responding to input... 
> 
> I'm all for trying to test this, but I might need
> some instruction.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-26 Thread MC
> James isn't being a jerk because he hates your or
> anything...
> 
> Look, yanking the drives like that can seriously
> damage the drives or your motherboard. Solaris
> doesn't let you do it and assumes that something's
> gone seriously wrong if you try it. That Linux
> ignores the behavior and lets you do it sounds more
> like a bug in linux than anything else.

Solaris crashing is a Linux bug.  That's a new one, folks.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-26 Thread Todd H. Poole
> I think that your expectations from ZFS are
> reasonable.  However, it is useful to determine if pulling the IDE drive locks
> the entire IDE channel, which serves the other disks as well. This
> could happen at a hardware level, or at a device driver level. If this
> happens, then there is nothing that ZFS can do.

Gotcha. But just to let you know, there are 4 SATA ports on the motherboard, 
with each drive getting its own port... how should I go about testing to see 
whether pulling one IDE drive (remember, they're really SATA drives, but 
they're being presented to the OS by the pci-ide driver) locks the entire IDE 
channel if there's only one drive per channel? Or do you think it's possible 
that two ports on the motherboard could be on one "logical channel" (for lack 
of a  better phrase) while the other two are on the other, and thus we could 
test one drive while another on the same "logical channel" is unplugged?

Also, remember that OpenSolaris freezes when this occurs, so I'm only going to 
have 2-3 seconds to execute a command before Terminal and - after a few more 
seconds, the rest of the machine - stop responding to input... 

I'm all for trying to test this, but I might need some instruction.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-26 Thread Todd H. Poole
PS: I also think it's worth noting the level of supportive and constructive 
feedback that many others have provided, and how much I appreciate it. Thanks! 
Keep it coming!
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-26 Thread Todd H. Poole
> Since OpenSolaris is open source, perhaps some brave
> soul can investigate the issues with the IDE device driver and
> send a patch.

Fearing that other Senior Kernel Engineers, Solaris, might exhibit similar 
responses, or join in and play “antagonize the noob,” I decided that I would 
try to solve my problem on my own. I tried my best to unravel the source tree 
that is OpenSolaris with some help from a friend, but I'll be the first to 
admit - we didn't even know where to begin, much less understand what we were 
looking at.

To say that he and I were lost would be an understatement.

I’m familiar with some subsections of the Linux kernel, and I can read and 
write code in a pinch, but there's a reason why most of my work is done for 
small, personal projects, or just for fun... Some people out there can see 
things like Neo sees the Matrix… I am not one of them.

I wish I knew how to write and then submit those types of patches. If I did, 
you can bet I would have been all over that days ago! :)

-Todd
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-26 Thread Todd H. Poole
> The behavior of ZFS to an error reported by an underlying device
> driver is tunable by the zpool failmode property.  By default, it is
> set to "wait."  For root pools, the installer may change this
> to "continue."  The key here is that you can argue with the choice
> of default behavior, but don't argue with the option to change.

I didn't want to argue with the option to change... trust me. Being able to 
change those types of options and having that type of flexibility in the first 
place is what makes a very large part of my day possible.

> qv. zpool failmode property, at least when you are operating in the
> zfs code.  I think the concerns here are that hangs can, and do, occur
> at other places in the software stack.  Please report these in the
> appropriate forums and bug categories.
>  -- richard

Now _that's_ a great constructive suggestion! Very good - I'll research this in 
a few hours, and report back on what I find.

Thanks for the pointer!

-Todd
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-25 Thread Miles Nordin
> "jcm" == James C McPherson <[EMAIL PROTECTED]> writes:
> "thp" == Todd H Poole <[EMAIL PROTECTED]> writes:
> "mh" == Matt Harrison <[EMAIL PROTECTED]> writes:
> "js" == John Sonnenschein <[EMAIL PROTECTED]> writes:
> "re" == Richard Elling <[EMAIL PROTECTED]> writes:
> "cg" == Carson Gaspar <[EMAIL PROTECTED]> writes:

   jcm> Don't _ever_ try that sort of thing with IDE. As I mentioned
   jcm> above, IDE is not designed to be able to cope with [unplugging
   jcm> a cable]

It shouldn't have to be designed for it, if there's controller
redundancy.  On Linux, one drive per IDE bus (not using any ``slave''
drives) seems like it should be enough for any electrical issue, but
is not quite good enough in my experience, when there are two PATA
busses per chip.  But one hard drive per chip seems to be mostly okay.
In this SATA-based case, not even that much separation was necessary
for Linux to survive on the same hardware, but I agree with you and
haven't found that level with PATA either.

OTOH, if the IDE drivers are written such that a confusing interaction
with one controller chip brings down the whole machine, then I expect
the IDE drivers to do better.  If they don't, why advise people to buy
twice as much hardware ``because, you know, controllers can also fail,
so you should have some controller redundancy''---the advice is worse
than a waste of money, it's snake oil---a false sense of security.

   jcm> You could start by taking us seriously when we tell you that
   jcm> what you've been doing is not a good idea, and find other ways
   jcm> to simulate drive failures.

Well, you could suggest a method.

Except that the whole point of the story is, Linux, without any
blather about ``green-line'' and ``self-healing,'' without any
concerted platform-wide effort toward availability at all, simply
works more reliably.

   thp> So aside from telling me to "[never] try this sort of thing
   thp> with IDE" does anyone else have any other ideas on how to
   thp> prevent OpenSolaris from locking up whenever an IDE drive is
   thp> abruptly disconnected from a ZFS RAID-Z array?

yeah, get a Sil3124 card, which will run in native SATA mode and be
more likely to work.  Then, redo your test and let us know what
happens.

The not-fully-voiced suggestion to run your ATI SB600 in native/AHCI
mode instead of pci-ide/compatibility mode is probably a bad one
because of bug 6665032: the chip is only reliable in compatibility
mode.  You could trade your ATI board for an nVidia board for about
the same price as the Sil3124 add-on card.  AIUI from Linux wiki:

 http://ata.wiki.kernel.org/index.php/SATA_hardware_features

...says the old nVidia chips use nv_sata driver, and the new ones use
the ahci driver, so both of these are different from pci-ide and more
likely to work.  Get an old one (MCP61 or older), and a new one (MCP65
or newer), repeat your test and let us know what happens.

If the Sil3124 doesn't work, and nv_sata doesn't work, and AHCI on
newer-nVidia doesn't work, then hook the drives up to Linux running
IET on basically any old chip, and mount them from Solaris using the
built-in iSCSI initiator.

If you use iSCSI, you will find: 

you will get a pause like with NT.  Also, if one of the iSCSI targets
is down, 'zpool status' might hang _every time_ you run it, not just
the first time when the failure is detected.  The pool itself will
only hang the first time.  Also, you cannot boot unless all iSCSI
targets are available, but you can continue running if some go away
after booting.  

Overall IMHO it's not as good as LVM2, but it's more robust than
plugging the drives into Solaris.  It also gives you the ability to
run smartctl on the drives (by running it natively on Linux) with full
support for all commands, while someone here who I told to run
smartctl reported that on Solaris 'smartctl -a' worked but 'smartctl
-t' did not.  I still have performance problems with iSCSI.  I'm not
sure yet if they're unresolvable: there are a lot of tweakables with
iSCSI, like disabling Nagle's algorithm, and enabling RED on the
initiator switchport, but first I need to buy faster CPU's for the
targets.
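
fwiw the initiator side on Solaris is only a few commands once IET is
serving the LUNs---sketch only, with 192.168.1.10 standing in for
whatever the Linux target box is:

# iscsiadm add discovery-address 192.168.1.10:3260
# iscsiadm modify discovery --sendtargets enable
# devfsadm -i iscsi

after that the targets show up as ordinary c#t#d# disks and you build
the pool on them just like local drives.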

mh> Dying or dead disks will still normally be able to
mh> communicate with the driver to some extent, so they are still
mh> "there".

The dead disks I have which don't spin also don't respond to
IDENTIFY(0), so they don't really communicate with the driver at all.
Now, possibly, *possibly* they are still responsive after they fail,
and become unresponsive after the first time they're
rebooted---because I think they load part of their firmware off the
platters.  Also, the ATAPI standard says that while ``still
communicating'' drives are allowed to take up to 30sec to answer each
command, which is probably too long to freeze a whole system for.  And
still, just because ``possibly,'' it doesn't make sense to replace a
tested-working system with a tested-broken system, not even after
someone tells a comp

Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-25 Thread Jens Elkner
On Mon, Aug 25, 2008 at 08:17:55PM +1200, Ian Collins wrote:
> John Sonnenschein wrote:
> >
> > Look, yanking the drives like that can seriously damage the drives
> > or your motherboard. Solaris doesn't let you do it ...

Haven't seen an android/"universal soldier" shipping with Solaris ... ;-)

> > and assumes that something's gone seriously wrong if you try it. That Linux 
> > ignores the behavior and lets you do it sounds more like a bug in linux 
> > than anything else.

Not sure whether everything that can't be understood is "likely a bug"
- maybe Linux is just "more forgiving" and tries its best to solve the problem
without taking you out of business (see below), even if that requires some
hacks not in line with the specifications ...

> One point that's been overlooked in all the chest thumping - PCs vibrate
> and cables fall out.  I had this happen with an SCSI connector.  Luckily

Yes - and a colleague told me that he's had the same problem once.
He also managed a Fujitsu Siemens server where the SCSI controller card 
had a tiny hairline crack: very odd behavior, usually not reproducible;
IIRC, the 4th service engineer finally replaced the card ...

> So pulling a drive is a possible, if rare, failure mode.

Definitely!

And expecting strange controller (or hardware in general) behavior is
possibly a big + for an OS which targets SMEs and "home users" as well
(everybody knows about Far East and other cheap HW producers, which 
sometimes seem to say: let's ship it now, and later we'll build a special 
driver for MS Windows that works around the bug/problem ...).
 
"Similar" story: ~ 2000+ we had a WG server with 4 IDE channels PATA,
one HDD on each. HDD0 on CH0 mirrored to HDD2 on CH2, HDD1 on CH1 mirrored
to HDD3 on CH3, using Linux Softraid driver. We found out, that when
HDD1 on CH1 got on the blink, for some reason the controller got on the
blink as well, i.e. took CH0 and vice versa down too. After reboot, we
were able to force the md raid to re-take the bad marked drives and even
found out, that the problem starts, when a certain part of a partition
was accessed (which made the ops on that raid really slow for some
minutes - but after the driver marked the drive(s) as bad, performance
was back). Thus disabling the partition gave us the time to get a new
drive... During all these ops nobody (except sysadmins) realized, that we
had a problem - thanx to the md raid1 (with xfs btw.). And also we did not
have any data corruption (at least, nobody has complained about it ;-)).

Wrt. what I've experienced and read on the zfs-discuss etc. lists, I have the
__feeling__ that we would have got really into trouble using Solaris
(even the most recent one) on that system ... 
So if one asks me whether to run Solaris+ZFS on a production system, I
usually say: definitely, but only if it is a Sun server ...

My 2¢ ;-)

Regards,
jel.

PS: And yes, all the vendor-specific workarounds/hacks are a problem for the
Linux kernel folks as well - at least on Torvalds' side they are
discouraged, IIRC ...
-- 
Otto-von-Guericke University http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany Tel: +49 391 67 12768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-25 Thread Bob Friesenhahn
On Mon, 25 Aug 2008, Carson Gaspar wrote:
>
> B) The driver does not detect the removal. Commands must time out before
> a problem is detected. Due to driver layering, timeouts increase
> rapidly, causing the OS to "hang" for unreasonable periods of time.
>
> We really need to fix (B). It seems the "easy" fixes are:
>
> - Configure faster timeouts and fewer retries on redundant devices,

I don't think that any of these "easy" fixes are wise.  Any fix based 
on timeouts is going to cause problems with devices mysteriously 
timing out and being resilvered.

Device drivers should know the expected behavior of the device and act 
appropriately. For example, if the device is in a powered-down state, 
then the device driver can expect that it will take at least 30 
seconds for the device to return after being requested to power-up but 
that some weak devices might take a minute.  As far as device drivers 
go, I expect that IDE device drivers are at the very bottom of the 
feeding chain in Solaris since Solaris is optimized for enterprise 
hardware.

Since OpenSolaris is open source, perhaps some brave soul can 
investigate the issues with the IDE device driver and send a patch.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-25 Thread Bob Friesenhahn
On Sun, 24 Aug 2008, Todd H. Poole wrote:

> So aside from telling me to "[never] try this sort of thing with 
> IDE" does anyone else have any other ideas on how to prevent 
> OpenSolaris from locking up whenever an IDE drive is abruptly 
> disconnected from a ZFS RAID-Z array?

I think that your expectations from ZFS are reasonable.  However, it 
is useful to determine if pulling the IDE drive locks the entire IDE 
channel, which serves the other disks as well. This could happen at a 
hardware level, or at a device driver level. If this happens, then 
there is nothing that ZFS can do.
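
One way to check this (a sketch only - c2d0s2 is just an example; pick a 
disk that should be on the other controller instance):

# iostat -xnz 2 &
# dd if=/dev/rdsk/c2d0s2 of=/dev/null bs=1024k &

Then pull the other drive.  If the dd and the iostat output stall as well, 
the whole channel or driver is locking up rather than just the missing disk.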

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-25 Thread Richard Elling
Todd H. Poole wrote:
> Howdy 404, thanks for the response.
>
> But I dunno man... I think I disagree... I'm kinda of the opinion that 
> regardless of what happens to hardware, an OS should be able to work around 
> it, if it's possible. If a sysadmin wants to yank a hard drive out of a 
> motherboard (despite the risk of damage to the drive and board), then no OS 
> in the world is going to stop him, so instead of the sysadmin trying to work 
> around the OS, shouldn't the OS instead try to work around the sysadmin?
>   

The behavior of ZFS to an error reported by an underlying device
driver is tunable by the zpool failmode property.  By default, it is
set to "wait."  For root pools, the installer may change this
to "continue."  The key here is that you can argue with the choice
of default behavior, but don't argue with the option to change.

> I mean, as great of an OS as it is, Solaris can't possibly hope to stop me 
> from doing anything I want to do... so when it assumes that something's gone 
> seriously wrong (which yanking a disk drive would hopefully cause it to 
> assume), instead of just freezing up and becoming totally useless, why not do 
> something useful like eject the disk from it's memory, degrade the array, 
> send out an e-mail to a designated sysadmin, and then keep on chugging along?
>   

If this does not occur, then please file a bug against the appropriate
device driver (you're not operating in ZFS code here).

> Or, for a greater level of control, why not just read from some configuration 
> set by the sysadmin, and then decide to either do the above or shut down 
> entirely, as per the wishes of the sysadmin? Anything would be better than 
> just going into a catatonic state in less than five seconds.
>   

qv. zpool failmode property, at least when you are operating in the
zfs code.  I think the concerns here are that hangs can, and do, occur
at other places in the software stack.  Please report these in the
appropriate forums and bug categories.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-25 Thread Heikki Suonsivu on list forwarder


Justin wrote:
> Howdy Matt. Just to make it absolutely clear, I appreciate your
> response. I would be quite lost if it weren't for all of the input.
> 
>> Unplugging a drive (actually pulling the cable out) does not
>> simulate a drive failure, it simulates a drive getting unplugged,
>> which is something the hardware is not capable of dealing with.
>> 
>> If your drive were to suffer something more realistic, along the
>> lines of how you would normally expect a drive to die, then the
>> system should cope with it a whole lot better.
> 
> Hmmm... I see what you're saying. But, ok, let me play devil's
> advocate. What about the times when a drive fails in a way the system
> didn't expect? What you said was right - most of the time, when a
> hard drive goes bad, SMART will pick up on it's impending doom long
> before it's too late - but what about the times when the cause of the
> problem is larger or more abrupt than that (like tin whiskers causing
> shorts, or a server room technician yanking the wrong drive)?

I read a research paper by Google about this a while ago.  Their 
conclusion was that SMART is a poor predictor of disk failure, even though 
they did find some useful indications.  Google for "google disk 
failure"; it came up as the second link a moment ago, titled "Failure Trends 
in a Large Disk Drive Population".

The problem is that trying to predict disk failures with SMART parameters 
only catches a certain percentage of failing disks, and that percentage 
is not all that great.  Many disks will still decide to fail 
catastrophically, most often early in the morning on December 25th, in 
particular if there is a huge snowstorm going on :)

Heikki

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-25 Thread Ralf Ramge
Ralf Ramge wrote:
[...]

Oh, and please excuse the grammar mistakes and typos. I'm in a hurry, 
not a retard ;-) At least I think so.

-- 

Ralf Ramge
Senior Solaris Administrator, SCNA, SCSA

Tel. +49-721-91374-3963 
[EMAIL PROTECTED] - http://web.de/

1&1 Internet AG
Brauerstraße 48
76135 Karlsruhe

Amtsgericht Montabaur HRB 6484

Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Thomas 
Gottschlich, Matthias Greve, Robert Hoffmann, Markus Huhn, Oliver Mauss, Achim 
Weiss 
Aufsichtsratsvorsitzender: Michael Scheeren

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-25 Thread Ralf Ramge
Todd H. Poole wrote:
> Hmmm... I see what you're saying. But, ok, let me play devil's advocate. What 
> about the times when a drive fails in a way the system didn't expect? What 
> you said was right - most of the time, when a hard drive goes bad, SMART will 
> pick up on it's impending doom long before it's too late - but what about the 
> times when the cause of the problem is larger or more abrupt than that (like 
> tin whiskers causing shorts, or a server room technician yanking the wrong 
> drive)?
>
> To imply that OpenSolaris with a RAID-Z array of IDE drives will _only_ 
> protect me from data loss during _specific_ kinds of failures (the one's 
> which OpenSolaris considers "normal") is a pretty big implication... and is 
> certainly a show-stopping one at that. Nobody is going to want to rely on an 
> OS/RAID solution that can only survive certain types of drive failures, while 
> there are others out there that can survive the same and more... 
>
> But then again, I'm not sure if that's what you meant... is that what you 
> were getting at, or did I misunderstand?
>   

I think there's a misunderstanding concerning the underlying concepts. I'll 
try to explain my thoughts; please excuse me if this becomes a bit 
lengthy. Oh, and I am not a Sun employee or ZFS fan, I'm just a customer 
who loves and hates ZFS at the same time ;-)

You know, ZFS is designed for high *reliability*. This means that ZFS 
tries to keep your data as safe as possible. This includes faulty 
hardware, missing hardware (like in your testing scenario) and, to a 
certain degree, even human mistakes.
But there are limits. For instance, ZFS does not make a backup 
unnecessary. If there's a fire and your drives melt, then ZFS can't do 
anything. Or if the hardware is lying about the drive geometry. ZFS is 
part of the operating environment and, as a consequence, relies on the 
hardware. 
So ZFS can't make unreliable hardware reliable. All it can do is try 
to protect the data you saved on it. But it cannot guarantee this to you 
if the hardware becomes its enemy.
A real world example: I have a 32-core Opteron server here, with 4 
FibreChannel controllers and 4 JBODs with a total of 64 FC drives connected 
to it, running a RAID 10 using ZFS mirrors. Sounds a lot like high end 
hardware compared to your NFS server, right? But ... I have exactly the 
same symptom. If one drive fails, an entire JBOD with all 16 included 
drives hangs, and all zpool access freezes. The reason for this is the 
miserable JBOD hardware. There's only one FC loop inside of it, the 
drives are connected serially to each other, and if one drive dies, the 
drives behind it go downhill, too. ZFS immediately starts caring about 
the data, the zpool command hangs (but I still have traffic on the other 
half of the ZFS mirror!), and it does the right thing by doing so: 
whatever happens, my data must not be damaged.
A "bad" filesystem like Linux ext2 or ext3 with LVM would just continue, 
even if the Volume Manager noticed the missing drive or not. That's what 
you experienced. But you run in the real danger of having to use fsck at 
some point. Or, in my case, fsck'ing 5 TB of data on 64 drives. That's 
not much fun and results in a lot more downtime than replacing the 
faulty drive.

What can you expect from ZFS in your case? You can expect it to detect 
that a drive is missing and to make sure that your _data integrity_ 
isn't compromised. By any means necessary. This may even require 
making a system completely unresponsive until a timeout has passed.




But what you described is not a case of reliability. You want something 
completely different. You expect it to deliver *availability*.

And availability is something ZFS doesn't promise. It simply can't 
deliver this. You have the impression that NTFS and various other 
filesystems do so, but that's an illusion. The next reboot followed by an 
fsck run will show you why. Availability requires full reliability of 
every included component of your server as a minimum, and you can't 
expect ZFS or any other filesystem to deliver this with cheap IDE 
hardware.

Usually people want to save money when buying hardware, and ZFS is a 
good choice to deliver the *reliability* then. But the conceptual 
stalemate between reliability and availability of such cheap hardware 
still exists - the hardware is cheap, the file system and services may 
be reliable, but as soon as you want *availability*, it's getting 
expensive again, because you have to buy every hardware component at 
least twice.


So, you have the choice:

a) If you want *availability*, stay with your old solution. But you have 
no guarantee that your data is always intact. You'll always be able to 
stream your video, but you have no guarantee that the client will 
receive a stream without dropouts forever.

b) If you want *data integrity*, ZFS is your best friend. But you may 
have slight availability issues when it comes to hardware defects. Yo

Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-25 Thread Carson Gaspar
John Sonnenschein wrote:

> Look, yanking the drives like that can seriously damage the drives or
> your motherboard. Solaris doesn't let you do it and assumes that
> something's gone seriously wrong if you try it. That Linux ignores
> the behavior and lets you do it sounds more like a bug in linux than
> anything else.

OK, so far we've had a lot of knee jerk defense of Solaris. Sorry, but 
that isn't helping. Let's get back to science here, shall we?

What happens when you remove a disk?

A) The driver detects the removal and informs the OS. Solaris appears to 
behave reasonably well in this case.

B) The driver does not detect the removal. Commands must time out before 
a problem is detected. Due to driver layering, timeouts increase 
rapidly, causing the OS to "hang" for unreasonable periods of time.

We really need to fix (B). It seems the "easy" fixes are:

- Configure faster timeouts and fewer retries on redundant devices, 
similar to drive manufacturers' RAID edition firmware. This could be via 
driver config file, or (better) automatically via ZFS, similar to write 
cache behaviour.

- Propagate timeouts quickly between layers (immediate soft fail without 
retry) or perhaps just to the fault management system
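
For anyone who wants to experiment with the first option today, the only 
widely documented knob I know of is the sd driver's I/O timeout in 
/etc/system - illustration only, since it covers sd/ssd targets rather than 
the pci-ide/cmdk path this thread is about, and it applies globally rather 
than per redundant device:

* shorten the per-command timeout for sd-attached disks (default 60s)
set sd:sd_io_time=10

Which is exactly why a per-vdev, ZFS-driven policy would be nicer.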

-- 
Carson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-25 Thread Justin
Alright, alright, but it's your fault. You left your workstation logged on - what 
was I supposed to do? Not chime in?

grotty yank
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-25 Thread Todd H. Poole
jalex? As in Justin Alex?

If you're who I think you are, don't you have a pretty long list of things you 
need to get done for Jerry before your little vacation?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-25 Thread Ian Collins
John Sonnenschein wrote:
> James isn't being a jerk because he hates your or anything...
>
> Look, yanking the drives like that can seriously damage the drives or your 
> motherboard. Solaris doesn't let you do it and assumes that something's gone 
> seriously wrong if you try it. That Linux ignores the behavior and lets you 
> do it sounds more like a bug in linux than anything else.
>  
>   
One point that's been overlooked in all the chest thumping - PCs vibrate
and cables fall out.  I had this happen with an SCSI connector.  Luckily
for me, it fell in a fan and made a lot of noise!

So pulling a drive is a possible, if rare, failure mode.

Ian
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-25 Thread Todd H. Poole
Howdy 404, thanks for the response.

But I dunno man... I think I disagree... I'm kinda of the opinion that 
regardless of what happens to hardware, an OS should be able to work around it, 
if it's possible. If a sysadmin wants to yank a hard drive out of a motherboard 
(despite the risk of damage to the drive and board), then no OS in the world is 
going to stop him, so instead of the sysadmin trying to work around the OS, 
shouldn't the OS instead try to work around the sysadmin?

I mean, as great of an OS as it is, Solaris can't possibly hope to stop me from 
doing anything I want to do... so when it assumes that something's gone 
seriously wrong (which yanking a disk drive would hopefully cause it to 
assume), instead of just freezing up and becoming totally useless, why not do 
something useful like eject the disk from its memory, degrade the array, send 
out an e-mail to a designated sysadmin, and then keep on chugging along?

Or, for a greater level of control, why not just read from some configuration 
set by the sysadmin, and then decide to either do the above or shut down 
entirely, as per the wishes of the sysadmin? Anything would be better than just 
going into a catatonic state in less than five seconds.

Which is exactly what Linux, BSD, and even Windows _don't_ do, and why their 
continual operation even under such failures wouldn't be considered a bug.

When I yank a drive in a RAID5 array - any drive, be it IDE, SATA, USB, or 
FireWire - in OpenSuSE or RedHat, the kernel will immediately notice its 
absence and inform lvm and mdadm (the software responsible for keeping the 
RAID array together). mdadm will then degrade the array and consult whatever 
instructions root gave it when the sysadmin was configuring the array. If the 
sysadmin wanted the array to "stay up as long as it could," then it would 
continue to do that. If root wanted the array to be "brought down after any 
sort of drive failure," then the array would be unmounted. If root wanted to 
"power the machine down," then the machine would dutifully turn off.

Shouldn't OpenSolaris do the same thing?

And as for James not being a jerk because he hates me, does that mean he's just 
always like that? lol, it's alright: let's not try to explain or excuse trollish 
behavior, and instead just call it out and expose it for what it is, and then 
be done with it. 

I certainly am.

As always, thanks for the input.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-25 Thread Todd H. Poole
Howdy Matt. Just to make it absolutely clear, I appreciate your response. I 
would be quite lost if it weren't for all of the input.

> Unplugging a drive (actually pulling the cable out) does not simulate a 
> drive failure, it simulates a drive getting unplugged, which is 
> something the hardware is not capable of dealing with.
> 
> If your drive were to suffer something more realistic, along the lines 
> of how you would normally expect a drive to die, then the system should 
> cope with it a whole lot better.

Hmmm... I see what you're saying. But, ok, let me play devil's advocate. What 
about the times when a drive fails in a way the system didn't expect? What you 
said was right - most of the time, when a hard drive goes bad, SMART will pick 
up on its impending doom long before it's too late - but what about the times 
when the cause of the problem is larger or more abrupt than that (like tin 
whiskers causing shorts, or a server room technician yanking the wrong drive)?

To imply that OpenSolaris with a RAID-Z array of IDE drives will _only_ protect 
me from data loss during _specific_ kinds of failures (the ones which 
OpenSolaris considers "normal") is a pretty big implication... and is certainly 
a show-stopping one at that. Nobody is going to want to rely on an OS/RAID 
solution that can only survive certain types of drive failures while there are 
others out there that can survive the same and more...

But then again, I'm not sure if that's what you meant... is that what you were 
getting at, or did I misunderstand?

> Unfortunately, hard drives don't come with a big button saying "simulate 
> head crash now" or "make me some bad sectors" so it's going to be 
> difficult to simulate those failures.

lol, if only they did - just having a button to push would make testing these 
types of things a lot easier. ;)

> All I can say is that unplugging a drive yourself will not simulate a 
> failure, it merely causes the disk to disappear. 

But isn't that a perfect example of a failure!? One in which the drive just 
seems to pop out of existence? lol, forgive me if I'm sounding pedantic, but 
why is there even a distinction between the two? This is starting to sound more 
and more like a bug...
 
> I hope this has been of some small help, even just to
> explain why the system didn't cope as you expected.

It has, thank you - I appreciate the response.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-24 Thread Peter Bortas
On Mon, Aug 25, 2008 at 5:19 AM, John Sonnenschein
<[EMAIL PROTECTED]> wrote:
> James isn't being a jerk because he hates you or anything...
>
> Look, yanking the drives like that can seriously damage the drives or your 
> motherboard.

It can, but it's not very likely to.

> Solaris doesn't let you do it and assumes that something's gone seriously 
> wrong if you try it. That Linux ignores the behavior and lets you do it 
> sounds more like a bug in linux than anything else.

That, if anything, sounds more like defensiveness. Pulling out the cable isn't 
advisable, but it simulates the controller card on the disk going belly up 
pretty well - unless he pulls the power at the same time, in which case it also 
simulates a power failure.

If a piece of hardware stops responding you might do well to stop
talking to it, but there is nothing admirable about locking up the OS
if there is enough redundancy to continue without that particular
chunk of metal.

-- 
Peter Bortas
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-24 Thread Justin
Aye mate, I had the exact same problem, but where I work we pay some pretty 
serious dollars for a direct 24/7 line to some of Sun's engineers, so I decided 
to call them up. After spending some time with tech support, I never really got 
the thing resolved, and I instead ended up going back to Debian for all of our 
simple IDE-based file servers.

If you really just want ZFS, you can add it to whatever installation you've got 
now (openSuSE?) through something like zfs-fuse, but you might take a 10-15% 
performance hit. If you don't want that, and you're not too concerned with 
violating a few licenses, you can just add it to your installation yourself - 
the source code is out there. You know, roll your own. ;-)

You just might be trying too hard to force a round peg into a square hole.

Hey, besides, where do you work? I registered because I know a guy with the 
same name.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-24 Thread John Sonnenschein
James isn't being a jerk because he hates you or anything...

Look, yanking the drives like that can seriously damage the drives or your 
motherboard. Solaris doesn't let you do it and assumes that something's gone 
seriously wrong if you try it. That Linux ignores the behavior and lets you do 
it sounds more like a bug in Linux than anything else.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-24 Thread Matt Harrison
Todd H. Poole wrote:
>> But you're not attempting hotswap, you're doing hot plug
> 
> Do you mean hot UNplug? Because I'm not trying to get this thing to recognize 
> any new disks without a restart... Honest. I'm just trying to prevent the 
> machine from freezing up when a drive fails. I have no problem restarting the 
> machine with a new drive in it later so that it recognizes the new disk.
> 
>> and unless you're using the onboard bios' concept of an actual
>> RAID array, you don't have an array, you've got a JBOD and
>> it's not a real JBOD - it's a PC motherboard which does _not_
>> have the same electronic and electrical protections that a
>> JBOD has *by design*.
> 
> I'm confused by what your definition of a RAID array is, and for that matter, 
> what a JBOD is... I've got plenty of experience with both, but just to make 
> sure I wasn't off my rocker, I consulted the demigod:
> 
> http://en.wikipedia.org/wiki/RAID
> http://en.wikipedia.org/wiki/JBOD
> 
> and I think what I'm doing is indeed RAID... I'm not using some sort of 
> controller card, or any specialized hardware, so it's certainly not Hardware 
> RAID (and thus doesn't contain any of the fancy electronic or electrical 
> protections you mentioned), but lacking said protections doesn't preclude the 
> machine from being considered a RAID. All the disks are the same capacity, 
> the OS still sees the zpool I've created as one large volume, and since I'm 
> using RAID-Z (RAID5), it should be redundant... What other qualifiers out 
> there are necessary before a system can be called RAID compliant?
> 
> If it's hot-swappable technology, or a controller hiding the details from the 
> OS and instead  presenting a single volume, then I would argue those things 
> are extra - not a fundamental prerequisite for a system to be called a RAID.
> 
> Furthermore, while I'm not sure what the difference between a "real JBOD" and 
> a plain old JBOD is, this set-up certainly wouldn't qualify for either. I 
> mean, there is no concatenation going on, redundancy should be present (but 
> due to this issue, I haven't been able to verify that yet), and all the 
> drives are the same size... Am I missing something in the definition of a 
> JBOD?
> 
> I don't think so...
>  
>> And you're right, it can. But what you've been doing is outside
>> the bounds of what IDE hardware on a PC motherboard is designed
>> to cope with.
> 
> Well, yes, you're right, but it's not like I'm making some sort of radical 
> departure outside of the bounds of the hardware... It really shouldn't be a 
> problem so long as it's not an unreasonable departure because that's where 
> software comes in. When the hardware can't cut it, that's where software 
> picks up the slack.
> 
> Now, obviously, I'm not saying software can do anything with any piece of 
> hardware you give it - no matter how many lines of code you write, your 
> keyboard isn't going to turn into a speaker - but when it comes to reasonable 
> stuff like ensuring a machine doesn't crash because a user did something with 
> the hardware that he or she wasn't supposed to do? Prime target for software.
> 
> And that's the way it's always been... The whole push behind that whole ZFS 
> Promise thing (or if you want to make it less specific, the attractiveness of 
> RAID in general), was that "RAID-Z [wouldn't] require any special hardware. 
> It doesn't need NVRAM for correctness, and it doesn't need write buffering 
> for good performance. With RAID-Z, ZFS makes good on the original RAID 
> promise: it provides fast, reliable storage using cheap, commodity disks." 
> (http://blogs.sun.com/bonwick/entry/raid_z)
> 
>> Well sorry, it does. Welcome to an OS which does care.
> 
> The half-hearted apology wasn't necessary... I understand that OpenSolaris 
> cares about the method those disks use to plug into the motherboard, but what 
> I don't understand is why that limitation exists in the first place. It would 
> seem much better to me to have an OS that doesn't care (but developers that 
> do) and just finds a way to work, versus one that does care (but developers 
> that don't) and instead isn't as flexible and gets picky... I'm not saying 
> OpenSolaris is the latter, but I'm not getting the impression it's the former 
> either...
> 
>> If the controlling electronics for your disk can't
>> handle it, then you're hosed. That's why FC, SATA (in SATA
>> mode) and SAS are much more likely to handle this out of
>> the box. Parallel SCSI requires funky hardware, which is why
>> those old 6- or 12-disk multipacks are so useful to have.
>>
>> Of the failure modes that you suggest above, only one
>> is going to give you anything other than catastrophic
>> failure (drive motor degradation) - and that is because the
>> drive's electronics will realise this, and send warnings to
>> the host which should have its drivers written so
>> that these messages are logged for the sysadmin to act upon.
>>
>> The other failure modes

Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-24 Thread Todd H. Poole
> But you're not attempting hotswap, you're doing hot plug

Do you mean hot UNplug? Because I'm not trying to get this thing to recognize 
any new disks without a restart... Honest. I'm just trying to prevent the 
machine from freezing up when a drive fails. I have no problem restarting the 
machine with a new drive in it later so that it recognizes the new disk.

> and unless you're using the onboard bios' concept of an actual
> RAID array, you don't have an array, you've got a JBOD and
> it's not a real JBOD - it's a PC motherboard which does _not_
> have the same electronic and electrical protections that a
> JBOD has *by design*.

I'm confused by what your definition of a RAID array is, and for that matter, 
what a JBOD is... I've got plenty of experience with both, but just to make 
sure I wasn't off my rocker, I consulted the demigod:

http://en.wikipedia.org/wiki/RAID
http://en.wikipedia.org/wiki/JBOD

and I think what I'm doing is indeed RAID... I'm not using some sort of 
controller card, or any specialized hardware, so it's certainly not Hardware 
RAID (and thus doesn't contain any of the fancy electronic or electrical 
protections you mentioned), but lacking said protections doesn't preclude the 
machine from being considered a RAID. All the disks are the same capacity, the 
OS still sees the zpool I've created as one large volume, and since I'm using 
RAID-Z (RAID5), it should be redundant... What other qualifiers out there are 
necessary before a system can be called RAID compliant?

If it's hot-swappable technology, or a controller hiding the details from the 
OS and instead  presenting a single volume, then I would argue those things are 
extra - not a fundamental prerequisite for a system to be called a RAID.

Furthermore, while I'm not sure what the difference between a "real JBOD" and a 
plain old JBOD is, this set-up certainly wouldn't qualify for either. I mean, 
there is no concatenation going on, redundancy should be present (but due to 
this issue, I haven't been able to verify that yet), and all the drives are the 
same size... Am I missing something in the definition of a JBOD?

I don't think so...
 
> And you're right, it can. But what you've been doing is outside
> the bounds of what IDE hardware on a PC motherboard is designed
> to cope with.

Well, yes, you're right, but it's not like I'm making some sort of radical 
departure outside of the bounds of the hardware... It really shouldn't be a 
problem so long as it's not an unreasonable departure because that's where 
software comes in. When the hardware can't cut it, that's where software picks 
up the slack.

Now, obviously, I'm not saying software can do anything with any piece of 
hardware you give it - no matter how many lines of code you write, your 
keyboard isn't going to turn into a speaker - but when it comes to reasonable 
stuff like ensuring a machine doesn't crash because a user did something with 
the hardware that he or she wasn't supposed to do? Prime target for software.

And that's the way it's always been... The whole push behind that whole ZFS 
Promise thing (or if you want to make it less specific, the attractiveness of 
RAID in general), was that "RAID-Z [wouldn't] require any special hardware. It 
doesn't need NVRAM for correctness, and it doesn't need write buffering for 
good performance. With RAID-Z, ZFS makes good on the original RAID promise: it 
provides fast, reliable storage using cheap, commodity disks." 
(http://blogs.sun.com/bonwick/entry/raid_z)

> Well sorry, it does. Welcome to an OS which does care.

The half-hearted apology wasn't necessary... I understand that OpenSolaris 
cares about the method those disks use to plug into the motherboard, but what I 
don't understand is why that limitation exists in the first place. It would 
seem much better to me to have an OS that doesn't care (but developers that do) 
and just finds a way to work, versus one that does care (but developers that 
don't) and instead isn't as flexible and gets picky... I'm not saying 
OpenSolaris is the latter, but I'm not getting the impression it's the former 
either...

> If the controlling electronics for your disk can't
> handle it, then you're hosed. That's why FC, SATA (in SATA
> mode) and SAS are much more likely to handle this out of
> the box. Parallel SCSI requires funky hardware, which is why
> those old 6- or 12-disk multipacks are so useful to have.
> 
> Of the failure modes that you suggest above, only one
> is going to give you anything other than catastrophic
> failure (drive motor degradation) - and that is because the
> drive's electronics will realise this, and send warnings to
> the host which should have its drivers written so
> that these messages are logged for the sysadmin to act upon.
> 
> The other failure modes are what we call catastrophic. And
> where your hardware isn't designed with certain protections
> around drive connections, you're hosed. No two ways
> about it. If your system

Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-24 Thread James C. McPherson
Todd H. Poole wrote:
> Hmmm. Alright, but supporting hot-swap isn't the issue, is it? I mean,
> like I said in my response to myxiplx, if I have to bring down the
> machine in order to replace a faulty drive, that's perfectly acceptable -
> I can do that whenever it's most convenient for me.
> 
> What is _not_ perfectly acceptable (indeed, what is quite _unacceptable_)
> is if the machine hangs/freezes/locks up or is otherwise brought down by
> an isolated failure in a supposedly redundant array... Yanking the drive
> is just how I chose to simulate that failure. I could just as easily have
> decided to take a sledgehammer or power drill to it,

But you're not attempting hotswap, you're doing hot plug
and unless you're using the onboard bios' concept of an actual
RAID array, you don't have an array, you've got a JBOD and
it's not a real JBOD - it's a PC motherboard which does _not_
have the same electronic and electrical protections that a
JBOD has *by design*.

> http://www.youtube.com/watch?v=CN6iDzesEs0 (fast-forward to the 2:30
> part) http://www.youtube.com/watch?v=naKd9nARAes
> 
> and the machine shouldn't have skipped a beat. After all, that's the
> whole point behind the "redundant" part of RAID, no?

Sigh.

> And besides, RAID's been around for almost 20 years now... It's nothing
> new. I've seen (countless times, mind you) plenty of regular old IDE
> drives fail in a simple software RAID5 array and not bring the machine
> down at all. Granted, you still had to power down to re-insert a new one
> (unless you were using some fancy controller card), but the point
> remains: the machine would still work perfectly with only 3 out of 4
> drives present... So I know for a fact this type of stability can be
> achieved with IDE.

And you're right, it can. But what you've been doing is outside
the bounds of what IDE hardware on a PC motherboard is designed
to cope with.

> What I'm getting at is this: I don't think the method by which the drives
> are connected - or whether or not that method supports hot-swap - should
> matter.

Well sorry, it does. Welcome to an OS which does care.

> A machine _should_not_ crash when a single drive (out of a 4
> drive ZFS RAID-Z array) is ungracefully removed, regardless of how
> abruptly that drive is excised (be it by a slow failure of the drive
> motor's spindle, by yanking the drive's power cable, by yanking the
> drive's SATA connector, by smashing it to bits with a sledgehammer, or by
> drilling into it with a power drill).

If the controlling electronics for your disk can't handle
it, then you're hosed. That's why FC, SATA (in SATA mode)
and SAS are much more likely to handle this out of the box.
Parallel SCSI requires funky hardware, which is why those
old 6- or 12-disk multipacks are so useful to have.

Of the failure modes that you suggest above, only one is
going to give you anything other than catastrophic failure
(drive motor degradation) - and that is because the drive's
electronics will realise this, and send warnings to the
host which should have its drivers written so that these
messages are logged for the sysadmin to act upon.

The other failure modes are what we call catastrophic. And
where your hardware isn't designed with certain protections
around drive connections, you're hosed. No two ways about it.
If your system suffers that sort of failure, would you seriously
expect that non-hardened hardware would survive it?

> So we've established that one potential work around is to use the ahci
> instead of the pci-ide driver. Good! I like this kind of problem solving!
> But that's still side-stepping the problem... While this machine is
> entirely SATA II, what about those who have a mix between SATA and IDE?
> Or even much larger entities whose vast majority of hardware is only a
> couple of years old, and still entirely IDE?

If you've got newer hardware, which can support SATA in
native SATA mode, USE IT.

Don't _ever_ try that sort of thing with IDE. As I mentioned
above, IDE is not designed to be able to cope with what
you've been inflicting on this machine.
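
If you do flip the BIOS over to AHCI (assuming the board exposes such a
setting), it's worth confirming that the ahci driver actually bound to
the controller afterwards. A rough check, nothing more:

# prtconf -D | grep -i -e ahci -e pci-ide

Expect the disks to show up under different device names in AHCI mode;
ZFS finds its disks by their on-disk labels, so if the pool doesn't
come up by itself a "zpool import" (possibly with -f) should pick it
back up.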

> I'm grateful for your help, but is there another way that you can think
> of to get this to work?

You could start by taking us seriously when we tell you
that what you've been doing is not a good idea, and find
other ways to simulate drive failures.


James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-24 Thread James C. McPherson
Tim wrote:
> I'm pretty sure pci-ide doesn't support hot-swap.  I believe you need ahci.

You're correct, it doesn't. Furthermore, to the best of
my knowledge, it won't ever support hotswap.


James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-24 Thread Todd H. Poole
Hmmm. Alright, but supporting hot-swap isn't the issue, is it? I mean, like I 
said in my response to myxiplx, if I have to bring down the machine in order to 
replace a faulty drive, that's perfectly acceptable - I can do that whenever 
it's most convenient for me. 

What is _not_ perfectly acceptable (indeed, what is quite _unacceptable_) is if 
the machine hangs/freezes/locks up or is otherwise brought down by an isolated 
failure in a supposedly redundant array... Yanking the drive is just how I 
chose to simulate that failure. I could just as easily have decided to take a 
sledgehammer or power drill to it,

http://www.youtube.com/watch?v=CN6iDzesEs0 (fast-forward to the 2:30 part)
http://www.youtube.com/watch?v=naKd9nARAes

and the machine shouldn't have skipped a beat. After all, that's the whole 
point behind the "redundant" part of RAID, no?

And besides, RAID's been around for almost 20 years now... It's nothing new. 
I've seen (countless times, mind you) plenty of regular old IDE drives fail in 
a simple software RAID5 array and not bring the machine down at all. Granted, 
you still had to power down to re-insert a new one (unless you were using some 
fancy controller card), but the point remains: the machine would still work 
perfectly with only 3 out of 4 drives present... So I know for a fact this type 
of stability can be achieved with IDE.

What I'm getting at is this: I don't think the method by which the drives are 
connected - or whether or not that method supports hot-swap - should matter. A 
machine _should_not_ crash when a single drive (out of a 4 drive ZFS RAID-Z 
array) is ungracefully removed, regardless of how abruptly that drive is 
excised (be it by a slow failure of the drive motor's spindle, by yanking the 
drive's power cable, by yanking the drive's SATA connector, by smashing it to 
bits with a sledgehammer, or by drilling into it with a power drill).

So we've established that one potential work around is to use the ahci instead 
of the pci-ide driver. Good! I like this kind of problem solving! But that's 
still side-stepping the problem... While this machine is entirely SATA II, what 
about those who have a mix between SATA and IDE? Or even much larger entities 
whose vast majority of hardware is only a couple of years old, and still 
entirely IDE?

I'm grateful for your help, but is there another way that you can think of to 
get this to work?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-24 Thread Tim
I'm pretty sure pci-ide doesn't support hot-swap.  I believe you need ahci.





On 8/24/08, Todd H. Poole <[EMAIL PROTECTED]> wrote:
> Ah, yes - all four hard drives are connected to the motherboard's onboard
> SATA II ports. There is one additional drive I have neglected to mention
> thus far (the boot drive) but that is connected via the motherboard's IDE
> channel, and has remained untouched since the install... I don't really
> consider it part of the problem, but I thought I should mention it just in
> case... you never know...
>
> As for the drivers... well, I'm not sure of the command to determine that
> directly, but going under System > Administration > Device Driver Utility
> yields the following information under the "Storage" entry:
>
> Components: "ATI Technologies Inc. SB600 IDE"
> Driver: pci-ide
> --Driver Information--
> Driver: pci-ide
> Instance: 1
> Attach Status: Attached
> --Hardware Information--
> Vendor ID: 0x1002
> Device ID: 0x438c
> Class Code: 0001018a
> DevPath: /[EMAIL PROTECTED],0/[EMAIL PROTECTED],1
>
> and
>
> Components: "ATI Technologies Inc. SB600 Non-Raid-5 SATA"
> Driver: pci-ide
> --Driver Information--
> Driver: pci-ide
> Instance: 0
> Attach Status: Attached
> --Hardware Information--
> Vendor ID: 0x1002
> Device ID: 0x4380
> Class Code: 0001018f
> DevPath: /[EMAIL PROTECTED],0/[EMAIL PROTECTED]
>
> Furthermore, there is one Driver Problem detected but the error is under the
> "USB" entry. There are seven items listed:
>
> Components: ATI Technologies Inc. SB600 USB Controller (EHCI)
> Driver: ehci
>
> Components: ATI Technologies Inc. SB600 USB (OHCI4)
> Driver: ohci
>
> Components: ATI Technologies Inc. SB600 USB (OHCI3)
> Driver: ohci
>
> Components: ATI Technologies Inc. SB600 USB (OHCI2)
> Driver: ohci
>
> Components: ATI Technologies Inc. SB600 USB (OHCI1)
> Driver: ohci (Driver Misconfigured)
>
> Components: ATI Technologies Inc. SB600 USB (OHCI0)
> Driver: ohci
>
> Components: Microsoft Corp. Wheel Mouse Optical
> Driver: hid
>
> As you can tell, the OHCI1 device isn't properly configured, but I don't
> know how to configure it (there's only a "Help" "Submit...", and "Close"
> button to click, no "Install Driver"). And, to tell you the truth, I'm not
> even sure it's worth mentioning because I don't have anything but my mouse
> plugged into USB, and even so... it's a mouse... plugged into USB... hardly
> something that is going to bring my machine to a grinding halt every time a
> SATA II disk gets yanked from a RAID-Z array (at least, I should hope the
> two don't have anything in common!).
>
> And... wait... you mean to tell me that I can't just untick the checkbox
> that says "Hey, freeze my system when a drive dies" to solve this problem?
> Ugh. And here I was hoping for a quick fix... ;)
>
> Anyway, how does the above sound? What else can I give you?
>
> -Todd
>
> PS: Thanks, by the way, for the support - I'm not sure where else to turn to
> for this kind of stuff!
>
>
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-24 Thread James C. McPherson
Todd H. Poole wrote:
> Hmm... I'm leaning away a bit from the hardware, but just in case you've
> got an idea, the machine is as follows:
> 
> CPU: AMD Athlon X2 4850e 2.5GHz Socket AM2 45W Dual-Core Processor Model
> ADH4850DOBOX
> (http://www.newegg.com/Product/Product.aspx?Item=N82E16819103255)
> 
> Motherboard: GIGABYTE GA-MA770-DS3 AM2+/AM2 AMD 770 ATX All Solid
> Capacitor AMD Motherboard
> (http://www.newegg.com/Product/Product.aspx?Item=N82E16813128081)


..
> The reason why I don't think there's a hardware issue is because before I
> got OpenSolaris up and running, I had a fully functional install of
> openSuSE 11.0 running (with everything similar to the original server) to
> make sure that none of the components were damaged during shipping from
> Newegg. Everything worked as expected.

Yes, but you're running a new operating system, new filesystem...
that's a mountain of difference right in front of you.


A few commands that you could provide the output from include:


(these two show any FMA-related telemetry)
fmadm faulty
fmdump -v

(this shows your storage controllers and what's connected to them)
cfgadm -lav

You'll also find messages in /var/adm/messages which might prove
useful to review.


Apart from that, your description of what you're doing to simulate
failure is

"however, whenever I unplug the SATA cable from one of the drives (to 
simulate a catastrophic drive failure) while doing moderate reading from the 
zpool (such as streaming HD video), not only does the video hang on the 
remote machine (which is accessing the zpool via NFS), but the server 
running OpenSolaris seems to either hang, or become incredibly unresponsive."


First and foremost, for me, this is a stupid thing to do. You've
got common-or-garden PC hardware which almost *definitely* does not
support hot plug of devices. Which is what you're telling us that
you're doing. Would you try this with your PCI/PCIe cards in this
system? I think not.


If you absolutely must do something like this, then please use
what's known as "coordinated hotswap" using the cfgadm(1m) command.


Viz:

(detect fault in disk c2t3d0, in some way)

# cfgadm -c unconfigure c2::dsk/c2t3d0
# cfgadm -c disconnect c2::dsk/c2t3d0

(go and swap the drive, plugin new drive with same cable)

# zpool replace -f poolname c2t3d0


What this will do is tell the kernel to do things in the
right order, and - for zpool - tell it to do an in-place
replacement of device c2t3d0 in your pool.
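
Depending on the controller and how it surfaces the new disk, you may
also need to tell cfgadm about the replacement before running the
zpool replace - something along these lines, with the caveat that the
exact attachment-point names will differ on your system:

# cfgadm -c connect c2::dsk/c2t3d0
# cfgadm -c configure c2::dsk/c2t3d0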


There are manpages and admin guides you could have a look
through, too:

http://docs.sun.com/app/docs/coll/40.17 (manpages)
http://docs.sun.com/app/docs/coll/47.23 (system admin collection)
http://docs.sun.com/app/docs/doc/817-2271 ZFS admin guide
http://docs.sun.com/app/docs/doc/819-2723 devices + filesystems guide
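
One more knob that may or may not be relevant here, depending on the
build: pools have a "failmode" property (wait, continue, or panic)
that controls what ZFS does in the event of catastrophic pool failure,
such as losing access to the underlying devices. Whether it changes
anything when the pci-ide driver itself blocks below ZFS is a separate
question, but checking and setting it is just:

# zpool get failmode poolname
# zpool set failmode=continue poolname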



James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure, resumes when disk is replaced

2008-08-24 Thread Todd H. Poole
Hmm... You know, that's a good question. I'm not sure if those SATA II ports 
support hot swap or not. The motherboard is fairly new, but taking a look at 
the specifications provided by Gigabyte 
(http://www.gigabyte.com.tw/Products/Motherboard/Products_Spec.aspx?ProductID=2874)
 doesn't seem to yield anything. To tell you the truth, I think they're just 
plain ol' dumb SATA II ports - nothing fancy here.

But that's alright, because hot swappable isn't something I'm necessarily 
chasing after. It would be nice, of course, but the thing that we want the most 
is stability during hardware failures. For this particular server, it is _far_ 
more important for the thing to keep chugging along and blow right through as 
many hardware failures as it can. If it's still got 3 of those 4 drives (which 
implies at least 2 data and 1 parity, or 3 data and no parity) then I still 
want to be able to read and write to those NFS exports like nothing happened. 
Then, at the end of the day, if we need to bring the machine down in order to 
install a new disk and resilver the RAID-Z array, that is perfectly acceptable. 
We could do that around 6:00 or so when everyone goes home for the day, when 
it's much more convenient for us and the users, and let the 
resilvering/repairing operation run overnight.
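
Roughly the sequence I have in mind for that end-of-day swap, just as a sketch 
(the pool and device names below are placeholders, not my real ones):

# zpool status -x tank
(shut down, swap in the new disk at the same slot, boot back up)
# zpool replace tank c1t3d0
# zpool status tank
(the pool stays online while the resilver runs)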

I also read the PDF summary you included in your link to your other post. And 
it seems we're seeing similar behavior here. Although, in this case, things are 
even simpler: there are only 4 drives in the case (not 8), and there is no 
extra controller card (just the ports on the motherboard)... It's hard to get 
any more basic than that.

As for testing in other OSes, unfortunately I don't readily have a copy of 
Windows available. But even if I did, I wouldn't know where to begin: almost 
all of my experience in server administration has been with Linux. For what 
it's worth, I have already established the above (that is, the seamless 
experience) with OpenSuSE 11.0 as the operating system, LVM as the volume 
manager, mdadm as the RAID manager, and XFS as the filesystem, so I know it can 
work...

I just want to get it working with OpenSolaris and ZFS. :)
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure, resumes when disk is replaced

2008-08-24 Thread Ross
PS.  Does your system definitely support SATA hot swap?  Could you, for 
example, test it under Windows to see if it runs fine there?

I suspect this is a Solaris driver problem, but it would be good to have 
confirmation that the hardware handles this fine.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-24 Thread Todd H. Poole
Ah, yes - all four hard drives are connected to the motherboard's onboard SATA 
II ports. There is one additional drive I have neglected to mention thus far 
(the boot drive) but that is connected via the motherboard's IDE channel, and 
has remained untouched since the install... I don't really consider it part of 
the problem, but I thought I should mention it just in case... you never know...

As for the drivers... well, I'm not sure of the command to determine that 
directly, but going under System > Administration > Device Driver Utility 
yields the following information under the "Storage" entry:

Components: "ATI Technologies Inc. SB600 IDE"
Driver: pci-ide
--Driver Information--
Driver: pci-ide
Instance: 1
Attach Status: Attached
--Hardware Information--
Vendor ID: 0x1002
Device ID: 0x438c
Class Code: 0001018a
DevPath: /[EMAIL PROTECTED],0/[EMAIL PROTECTED],1

and

Components: "ATI Technologies Inc. SB600 Non-Raid-5 SATA"
Driver: pci-ide
--Driver Information--
Driver: pci-ide
Instance: 0
Attach Status: Attached
--Hardware Information--
Vendor ID: 0x1002
Device ID: 0x4380
Class Code: 0001018f
DevPath: /[EMAIL PROTECTED],0/[EMAIL PROTECTED]

Furthermore, there is one Driver Problem detected but the error is under the 
"USB" entry. There are seven items listed:

Components: ATI Technologies Inc. SB600 USB Controller (EHCI)
Driver: ehci

Components: ATI Technologies Inc. SB600 USB (OHCI4)
Driver: ohci

Components: ATI Technologies Inc. SB600 USB (OHCI3)
Driver: ohci

Components: ATI Technologies Inc. SB600 USB (OHCI2)
Driver: ohci

Components: ATI Technologies Inc. SB600 USB (OHCI1)
Driver: ohci (Driver Misconfigured)

Components: ATI Technologies Inc. SB600 USB (OHCI0)
Driver: ohci

Components: Microsoft Corp. Wheel Mouse Optical
Driver: hid

As you can tell, the OHCI1 device isn't properly configured, but I don't know 
how to configure it (there's only a "Help," "Submit...," and "Close" button to 
click, no "Install Driver"). And, to tell you the truth, I'm not even sure it's 
worth mentioning because I don't have anything but my mouse plugged into USB, 
and even so... it's a mouse... plugged into USB... hardly something that is 
going to bring my machine to a grinding halt every time a SATA II disk gets 
yanked from a RAID-Z array (at least, I should hope the two don't have anything 
in common!).

And... wait... you mean to tell me that I can't just untick the checkbox that 
says "Hey, freeze my system when a drive dies" to solve this problem? Ugh. And 
here I was hoping for a quick fix... ;)

Anyway, how does the above sound? What else can I give you?

-Todd

PS: Thanks, by the way, for the support - I'm not sure where else to turn to 
for this kind of stuff!
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure, resumes when disk is replaced

2008-08-24 Thread Ross
You're seeing exactly the same behaviour I found on my server, using a 
Supermicro AOC-SAT2-MV8 SATA controller.  It's detailed on the forums under the 
topics "Supermicro AOC-SAT2-MV8 hang when drive removed", but unfortunately 
that topic split into 3 or 4 pieces so it's a pain to find.

I also reported it as a bug here:
http://bugs.opensolaris.org/view_bug.do?bug_id=6735931
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-23 Thread Tim
On Sat, Aug 23, 2008 at 11:41 PM, Todd H. Poole <[EMAIL PROTECTED]>wrote:

> Hmm... I'm leaning away a bit from the hardware, but just in case you've
> got an idea, the machine is as follows:
>
> CPU: AMD Athlon X2 4850e 2.5GHz Socket AM2 45W Dual-Core Processor Model
> ADH4850DOBOX (
> http://www.newegg.com/Product/Product.aspx?Item=N82E16819103255)
>
> Motherboard: GIGABYTE GA-MA770-DS3 AM2+/AM2 AMD 770 ATX All Solid Capacitor
> AMD Motherboard (
> http://www.newegg.com/Product/Product.aspx?Item=N82E16813128081)
>
> RAM: G.SKILL 4GB (2 x 2GB) 240-Pin DDR2 SDRAM DDR2 800 (PC2 6400) Dual
> Channel Kit Desktop Memory Model F2-6400CL5D-4GBPQ (
> http://www.newegg.com/Product/Product.aspx?Item=N82E16820231122)
>
> HDD (x4): Western Digital Caviar GP WD10EACS 1TB 5400 to 7200 RPM SATA
> 3.0Gb/s Hard Drive (
> http://www.newegg.com/Product/Product.aspx?Item=N82E16822136151)
>
> The reason why I don't think there's a hardware issue is because before I
> got OpenSolaris up and running, I had a fully functional install of openSuSE
> 11.0 running (with everything similar to the original server) to make sure
> that none of the components were damaged during shipping from Newegg.
> Everything worked as expected.
>
> Furthermore, before making my purchases, I made sure to check the HCL and
> my processor and motherboard combination are supported:
> http://www.sun.com/bigadmin/hcl/data/systems/details/3079.html
>
> But, like I said earlier, I'm new here, so you might be on to something
> that never occurred to me.
>
> Any ideas?
>
>
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>


What are you using to connect the HDs to the system?  The onboard ports?
What driver is being used?  AHCI, or IDE compatibility mode?

I'm not saying the hardware is bad, I'm saying the hardware is most likely
the cause by way of the driver.  There really isn't any *setting* in Solaris
I'm aware of that says "hey, freeze my system when a drive dies".  That just
sounds like hot-swap isn't working as it should be.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-23 Thread Todd H. Poole
Hmm... I'm leaning away a bit from the hardware, but just in case you've got an 
idea, the machine is as follows:

CPU: AMD Athlon X2 4850e 2.5GHz Socket AM2 45W Dual-Core Processor Model 
ADH4850DOBOX (http://www.newegg.com/Product/Product.aspx?Item=N82E16819103255)

Motherboard: GIGABYTE GA-MA770-DS3 AM2+/AM2 AMD 770 ATX All Solid Capacitor AMD 
Motherboard (http://www.newegg.com/Product/Product.aspx?Item=N82E16813128081)

RAM: G.SKILL 4GB (2 x 2GB) 240-Pin DDR2 SDRAM DDR2 800 (PC2 6400) Dual Channel 
Kit Desktop Memory Model F2-6400CL5D-4GBPQ 
(http://www.newegg.com/Product/Product.aspx?Item=N82E16820231122)

HDD (x4): Western Digital Caviar GP WD10EACS 1TB 5400 to 7200 RPM SATA 3.0Gb/s 
Hard Drive (http://www.newegg.com/Product/Product.aspx?Item=N82E16822136151)

The reason why I don't think there's a hardware issue is because before I got 
OpenSolaris up and running, I had a fully functional install of openSuSE 11.0 
running (with everything similar to the original server) to make sure that none 
of the components were damaged during shipping from Newegg. Everything worked 
as expected.

Furthermore, before making my purchases, I made sure to check the HCL and my 
processor and motherboard combination are supported: 
http://www.sun.com/bigadmin/hcl/data/systems/details/3079.html 

But, like I said earlier, I'm new here, so you might be on to something that 
never occurred to me.

Any ideas?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hangs/freezes after disk failure, resumes when disk is replaced

2008-08-23 Thread Tim
On Sat, Aug 23, 2008 at 11:06 PM, Todd H. Poole <[EMAIL PROTECTED]>wrote:

> Howdy yall,
>
> Earlier this month I downloaded and installed the latest copy of
> OpenSolaris (2008.05) so that I could test out some of the newer features
> I've heard so much about, primarily ZFS.
>
> My goal was to replace our aging linux-based (SuSE 10.1) file and media
> server with a new machine running Sun's OpenSolaris and ZFS. Our old server
> ran your typical RAID5 setup with 4 500GB disks (3 data, 1 parity), used
> lvm, mdadm, and xfs to help keep things in order, and relied on NFS to
> export users' shares. It was solid, stable, and worked wonderfully well.
>
> I would like to replicate this experience using the tools OpenSolaris has
> to offer, taking advantages of ZFS. However, there are enough differences
> between the two OSes - especially with respect to the filesystems and (for
> lack of a better phrase) "RAID managers" - to cause me to consult (on
> numerous occasions) the likes of Google, these forums, and other places for
> help.
>
> I've been successful in troubleshooting all problems up until now.
>
> On our old media server (the SuSE 10.1 one), when a disk failed, the
> machine would send out an e-mail detailing the type of failure, and
> gracefully fall into a degraded state, but would otherwise continue to
> operate using the remaining 3 disks in the system. After the faulty disk was
> replaced, all of the data from the old disk would be replicated onto the new
> one (I think the term is "resilvered" around here?), and after a few hours,
> the RAID5 array would be seamlessly promoted from "degraded" back up to a
> healthy "clean" (or "online") state.
>
> Throughout the entire process, there would be no interruptions to the end
> user: all NFS shares still remained mounted, there were no noticeable drops
> in I/O, files, directories, and any other user-created data still remained
> available, and if everything went smoothly, no one would notice a failure
> had even occurred.
>
> I've tried my best to recreate something similar in OpenSolaris, but I'm
> stuck on making it all happen seamlessly.
>
> For example, I have a standard beige box machine running OS 2008.05 with a
> zpool that contains 4 disks, similar to what the old SuSE 10.1 server had.
> However, whenever I unplug the SATA cable from one of the drives (to
> simulate a catastrophic drive failure) while doing moderate reading from the
> zpool (such as streaming HD video), not only does the video hang on the
> remote machine (which is accessing the zpool via NFS), but the server
> running OpenSolaris seems to either hang, or become incredibly unresponsive.
>
> And when I write unresponsive, I mean that when I type the command "zpool
> status" to see what's going on, the command hangs, followed by a frozen
> Terminal a few seconds later. After just a few more seconds, the entire GUI
> - mouse included - locks up or freezes, and all NFS shares become
> unavailable from the perspective of the remote machines. The whole machine
> locks up hard.
>
> The machine then stays in this frozen state until I plug the hard disk back
> in, at which point everything, quite literally, pops back into existence all
> at once: the output of the "zpool status" command flies by (with all disks
> listed as "ONLINE" and all "READ," "WRITE," and "CKSUM," fields listed as
> "0"), the mouse jumps to a different part of the screen, the NFS share
> becomes available again, and the movie resumes right where it had left off.
>
> While such a quick resume is encouraging, I'd like to avoid the freeze in
> the first place.
>
> How can I keep any hardware failures like the above transparent to my
> users?
>
> -Todd
>
> PS: I've done some researching, and while my problem is similar to the
> following:
>
> http://opensolaris.org/jive/thread.jspa?messageID=151719#151719
> http://opensolaris.org/jive/thread.jspa?messageID=240481#240481
>
> most of these posts are quite old, and do not offer any solutions.
>
> PSS: I know I haven't provided any details on hardware, but I feel like
> this is more likely a higher-level issue (like some sort of configuration
> file or setting is needed) rather than a lower-level one (like faulty
> hardware). However, if someone were to give me a command to run, I'd gladly
> do it... I'm just not sure which ones would be helpful, or if I even know
> which ones to run. It took me half an hour of searching just to find out how
> to list the disks installed in this system (it's "format") so that I could
> build my zpool in the first place. It's not quite as simple as writing out
> /dev/hda, /dev/hdb, /dev/hdc, /dev/hdd. ;)
>
>
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>



It's a lower-level one.  What hardware are you running?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

[zfs-discuss] ZFS hangs/freezes after disk failure, resumes when disk is replaced

2008-08-23 Thread Todd H. Poole
Howdy y'all,

Earlier this month I downloaded and installed the latest copy of OpenSolaris 
(2008.05) so that I could test out some of the newer features I've heard so 
much about, primarily ZFS. 

My goal was to replace our aging linux-based (SuSE 10.1) file and media server 
with a new machine running Sun's OpenSolaris and ZFS. Our old server ran your 
typical RAID5 setup with 4 500GB disks (3 data, 1 parity), used lvm, mdadm, and 
xfs to help keep things in order, and relied on NFS to export users' shares. It 
was solid, stable, and worked wonderfully well.

I would like to replicate this experience using the tools OpenSolaris has to 
offer, taking advantage of ZFS. However, there are enough differences between 
the two OSes - especially with respect to the filesystems and (for lack of a 
better phrase) "RAID managers" - to cause me to consult (on numerous occasions) 
the likes of Google, these forums, and other places for help.

I've been successful in troubleshooting all problems up until now.

On our old media server (the SuSE 10.1 one), when a disk failed, the machine 
would send out an e-mail detailing the type of failure, and gracefully fall 
into a degraded state, but would otherwise continue to operate using the 
remaining 3 disks in the system. After the faulty disk was replaced, all of the 
data from the old disk would be replicated onto the new one (I think the term 
is "resilvered" around here?), and after a few hours, the RAID5 array would be 
seamlessly promoted from "degraded" back up to a healthy "clean" (or "online") 
state.

Throughout the entire process, there would be no interruptions to the end user: 
all NFS shares still remained mounted, there were no noticeable drops in I/O, 
files, directories, and any other user-created data still remained available, 
and if everything went smoothly, no one would notice a failure had even 
occurred.

I've tried my best to recreate something similar in OpenSolaris, but I'm stuck 
on making it all happen seamlessly.

For example, I have a standard beige box machine running OS 2008.05 with a 
zpool that contains 4 disks, similar to what the old SuSE 10.1 server had. 
However, whenever I unplug the SATA cable from one of the drives (to simulate a 
catastrophic drive failure) while doing moderate reading from the zpool (such 
as streaming HD video), not only does the video hang on the remote machine 
(which is accessing the zpool via NFS), but the server running OpenSolaris 
seems to either hang, or become incredibly unresponsive. 

And when I write unresponsive, I mean that when I type the command "zpool 
status" to see what's going on, the command hangs, followed by a frozen 
Terminal a few seconds later. After just a few more seconds, the entire GUI - 
mouse included - locks up or freezes, and all NFS shares become unavailable 
from the perspective of the remote machines. The whole machine locks up hard.

The machine then stays in this frozen state until I plug the hard disk back in, 
at which point everything, quite literally, pops back into existence all at 
once: the output of the "zpool status" command flies by (with all disks listed 
as "ONLINE" and all "READ," "WRITE," and "CKSUM," fields listed as "0"), the 
mouse jumps to a different part of the screen, the NFS share becomes available 
again, and the movie resumes right where it had left off.

While such a quick resume is encouraging, I'd like to avoid the freeze in the 
first place.

How can I keep any hardware failures like the above transparent to my users?

-Todd

PS: I've done some researching, and while my problem is similar to the 
following:

http://opensolaris.org/jive/thread.jspa?messageID=151719#151719
http://opensolaris.org/jive/thread.jspa?messageID=240481#240481

most of these posts are quite old, and do not offer any solutions.

PSS: I know I haven't provided any details on hardware, but I feel like this is 
more likely a higher-level issue (like some sort of configuration file or 
setting is needed) rather than a lower-level one (like faulty hardware). 
However, if someone were to give me a command to run, I'd gladly do it... I'm 
just not sure which ones would be helpful, or if I even know which ones to run. 
It took me half an hour of searching just to find out how to list the disks 
installed in this system (it's "format") so that I could build my zpool in the 
first place. It's not quite as simple as writing out /dev/hda, /dev/hdb, 
/dev/hdc, /dev/hdd. ;)
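
In case the setup itself matters, the pool and the NFS export boiled down to 
something like the following - treat the pool name and the cXtYd0 device names 
as placeholders rather than my exact configuration:

# format
(note the four SATA disks, e.g. c1t0d0 through c1t3d0, then quit)
# zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0
# zfs set sharenfs=on tank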
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss