[zfs-discuss] ZFS related kernel panic

2006-12-04 Thread Douglas Denny

Last Friday, one of our V880s kernel panicked with the following
message. This is a SAN-connected ZFS pool attached to one LUN. From
this, it appears that the SAN 'disappeared' and then there was a panic
shortly after.

Am I reading this correctly?

Is this normal behavior for ZFS?

This is a mostly patched Solaris 10 6/06 install. Before patching this
system we did have a couple of NFS-related panics, always on Fridays!
This is the fourth panic, and the first time with a ZFS error. There are no
errors in zpool status.

Dec  1 20:30:21 foobar scsi: [ID 107833 kern.warning] WARNING:
/[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],1 (sd17):
Dec  1 20:30:21 foobar SCSI transport failed: reason 'incomplete':
retrying command
Dec  1 20:30:21 foobar scsi: [ID 107833 kern.warning] WARNING:
/[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],1 (sd17):
Dec  1 20:30:21 foobar SCSI transport failed: reason 'incomplete':
retrying command
Dec  1 20:30:21 foobar scsi: [ID 107833 kern.warning] WARNING:
/[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],1 (sd17):
Dec  1 20:30:21 foobar disk not responding to selection
Dec  1 20:30:21 foobar scsi: [ID 107833 kern.warning] WARNING:
/[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],1 (sd17):
Dec  1 20:30:21 foobar disk not responding to selection
Dec  1 20:30:21 foobar scsi: [ID 107833 kern.warning] WARNING:
/[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],1 (sd17):
Dec  1 20:30:21 foobar disk not responding to selection
Dec  1 20:30:21 foobar scsi: [ID 107833 kern.warning] WARNING:
/[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],1 (sd17):
Dec  1 20:30:21 foobar disk not responding to selection
Dec  1 20:30:22 foobar scsi: [ID 107833 kern.warning] WARNING:
/[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],1 (sd17):
Dec  1 20:30:22 foobar disk not responding to selection
Dec  1 20:30:22 foobar unix: [ID 836849 kern.notice]
Dec  1 20:30:22 foobar ^Mpanic[cpu2]/thread=2a100aedcc0:
Dec  1 20:30:22 foobar unix: [ID 809409 kern.notice] ZFS: I/O failure
(write on unknown off 0: zio 3004c0ce540 [L0 unallocated]
2L/2P DVA
[0]=0:2ae190:2 fletcher2 uncompressed BE contiguous
birth=586818 fill=0
cksum=102297a2db39dfc:cc8e38087da7a38f:239520856ececf15:c2fd36
9cea9db4a1): error 5
Dec  1 20:30:22 foobar unix: [ID 10 kern.notice]
Dec  1 20:30:22 foobar genunix: [ID 723222 kern.notice]
02a100aed740 zfs:zio_done+284 (3004c0ce540, 0, a8, 70513bf0, 0,
60001374940)
Dec  1 20:30:22 foobar genunix: [ID 179002 kern.notice]   %l0-3:
03006319fc80 70513800 0005 0005
Dec  1 20:30:22 foobar   %l4-7: 7b224278 0002
0008f442 0005
Dec  1 20:30:22 foobar genunix: [ID 723222 kern.notice]
02a100aed940 zfs:zio_vdev_io_assess+178 (3004c0ce540, 8000, 10, 0,
0, 10)
Dec  1 20:30:22 foobar genunix: [ID 179002 kern.notice]   %l0-3:
0002 0001  0005
Dec  1 20:30:22 foobar   %l4-7: 0010 35a536bc
 00043d7293172cfc
Dec  1 20:30:22 foobar genunix: [ID 723222 kern.notice]
02a100aeda00 genunix:taskq_thread+1a4 (600012a0c38, 600012a0be0,
50001, 43d72c8bfb810,
2a100aedaca, 2a100aedac8)
Dec  1 20:30:22 foobar genunix: [ID 179002 kern.notice]   %l0-3:
0001 0600012a0c08 0600012a0c10 0600012a0c12
Dec  1 20:30:22 foobar   %l4-7: 030060946320 0002
 0600012a0c00
Dec  1 20:30:22 foobar unix: [ID 10 kern.notice]
Dec  1 20:30:22 foobar genunix: [ID 672855 kern.notice] syncing file systems...
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS related kernel panic

2006-12-04 Thread James C. McPherson



Douglas Denny wrote:
 Last Friday, one of our V880s kernel panicked with the following
 message.This is a SAN connected ZFS pool attached to one LUN. From
 this, it appears that the SAN 'disappeared' and then there was a panic
 shortly after.

 Am I reading this correctly?

Yes.

 Is this normal behavior for ZFS?

Yes. You have no redundancy (from ZFS' point of view at least),
so ZFS has no option except panicking in order to maintain the
integrity of your data.
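
If the pool has to live on SAN storage, that redundancy has to be given to
ZFS explicitly, e.g. by mirroring two LUNs presented over independent paths.
A minimal sketch (the device names below are hypothetical):

   # mirror two SAN LUNs so ZFS can survive losing one of them
   zpool create tank mirror c4t1d0 c6t1d0

   # confirm both halves of the mirror are healthy
   zpool status tank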

 This is a mostly patched Solaris 10 6/06 install. Before patching this
 system we did have a couple of NFS related panics, always on Fridays!
 This is the fourth panic, first time with a ZFS error. There are no
 errors in zpool status.

Without data, it is difficult to suggest what might have caused
your NFS panics.


James C. McPherson
--
Solaris kernel software engineer, system admin and troubleshooter
  http://www.jmcp.homeunix.com/blog
Find me on LinkedIn @ http://www.linkedin.com/in/jamescmcpherson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS related kernel panic

2006-12-04 Thread Douglas Denny

On 12/4/06, James C. McPherson [EMAIL PROTECTED] wrote:

  Is this normal behavior for ZFS?

Yes. You have no redundancy (from ZFS' point of view at least),
so ZFS has no option except panicing in order to maintain the
integrity of your data.


This is interesting from an implementation point of view. Will any singly
attached SAN connection that loses its switch/back end cause ZFS to panic?
Why not wait and see whether the device comes back? Should all SAN-connected
ZFS pools have redundancy built in, with dual HBAs to dual SAN
switches/controllers?

-Doug
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS related kernel panic

2006-12-04 Thread Richard Elling

Douglas Denny wrote:

On 12/4/06, James C. McPherson [EMAIL PROTECTED] wrote:

  Is this normal behavior for ZFS?

Yes. You have no redundancy (from ZFS' point of view at least),
so ZFS has no option except panicing in order to maintain the
integrity of your data.


This is interesting from a implementation point of view. Any singly
attached SAN connection that has a disconnect from its switch/backend
will cause the ZFS to panic, why would it not wait and see if the
device came back? Should all SAN connected ZFS pools have redundancy
built in with dual HBAs to dual SAN switches/controllers?


UFS will panic on EIO also.  Most other file systems will, too.
You can put UFS on top of SVM, but unless SVM is configured for
redundancy, UFS would still panic in such situations.  ZFS
doesn't bring anything new here, but I sense a change in expectations
that I can't quite reconcile.
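
For comparison, the SVM-plus-UFS arrangement mentioned above looks roughly
like this (a sketch only; the slice names are hypothetical and the state
database replicas must exist first):

   # state database replicas (once per host)
   metadb -a -f c0t0d0s7 c1t0d0s7

   # two-way mirror, then UFS on top of it
   metainit d11 1 1 c0t0d0s0
   metainit d12 1 1 c1t0d0s0
   metainit d10 -m d11
   metattach d10 d12
   newfs /dev/md/rdsk/d10
   mount /dev/md/dsk/d10 /export/data

Unless the mirror (or ZFS's own redundancy) is there, an EIO on the single
underlying device still takes the filesystem down with it.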
 -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] replacing a drive in a raidz vdev

2006-12-04 Thread Krzys
I am having no luck replacing my drive as well. A few days ago I replaced my
drive and it's completely messed up now.


  pool: mypool2
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 8.70% done, 8h19m to go
config:

NAME  STATE READ WRITE CKSUM
mypool2   DEGRADED 0 0 0
  raidz   DEGRADED 0 0 0
c3t0d0ONLINE   0 0 0
c3t1d0ONLINE   0 0 0
c3t2d0ONLINE   0 0 0
c3t3d0ONLINE   0 0 0
c3t4d0ONLINE   0 0 0
c3t5d0ONLINE   0 0 0
replacing DEGRADED 0 0 0
  c3t6d0s0/o  UNAVAIL  0 0 0  cannot open
  c3t6d0  ONLINE   0 0 0

errors: No known data errors

This is what I get; I am running Solaris 10 U2.
Two days ago I saw it in the 2.00% range with something like 10h remaining; now
it's still going, and it has already been at least a few days since it started.


when I do: zpool list
NAME      SIZE   USED   AVAIL   CAP   HEALTH     ALTROOT
mypool2   952G   684G   268G    71%   DEGRADED   -

I have almost 1TB of space.
When I do df -k it shows me only 277GB; that is at least better than the 12GB
it displayed yesterday.

mypool2/d3   277900047  12022884 265877163   5% /d/d3

when I do zfs list I get:
mypool2                    684G   254G    52K  /mypool2
mypool2/d                  191G   254G   189G  /mypool2/d
mypool2/[EMAIL PROTECTED]   653M      -   145G  -
mypool2/[EMAIL PROTECTED]  31.2M      -   145G  -
mypool2/[EMAIL PROTECTED]  36.8M      -   144G  -
mypool2/[EMAIL PROTECTED]  37.9M      -   144G  -
mypool2/[EMAIL PROTECTED]  31.7M      -   145G  -
mypool2/[EMAIL PROTECTED]  27.7M      -   145G  -
mypool2/[EMAIL PROTECTED]  34.0M      -   146G  -
mypool2/[EMAIL PROTECTED]  26.8M      -   149G  -
mypool2/[EMAIL PROTECTED]  34.4M      -   151G  -
mypool2/[EMAIL PROTECTED]   141K      -   189G  -
mypool2/d3                 492G   254G  11.5G  legacy

I am so confused by all of this... Why is it taking so long to replace that one
bad disk? Why such different results? What is going on? Is there a problem with
my zpool/zfs combination? Did I do anything wrong? Did I actually lose data on
my drive? If I had known it would be this bad I would have just destroyed my
whole zpool and zfs and started from the beginning, but I wanted to see how it
would go through a replacement to see what the process is like... I am so happy
I have not used ZFS in my production environment yet, to be honest with you...


Chris



On Sat, 2 Dec 2006, Theo Schlossnagle wrote:

I had a disk malfunction in a raidz pool today.  I had an extra one in the
enclosure and performed a: zpool replace pool old new, and several unexpected
behaviors have transpired:


the zpool replace command hung for 52 minutes during which no zpool 
commands could be executed (like status, iostat or list).


When it finally returned, the drive was marked as replacing, as I expected
from reading the man page.  However, its progress counter has not been
monotonically increasing.  It started at 1%, then went to 5%, and then back
to 2%, etc. etc.


I just logged in to see if it was done and ran zpool status and received:

pool: xsr_slow_2
state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
  continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scrub: resilver in progress, 100.00% done, 0h0m to go
config:

  NAME   STATE READ WRITE CKSUM
  xsr_slow_2 ONLINE   0 0 0
raidzONLINE   0 0 0
  c4t600039316A1Fd0s2ONLINE   0 0 0
  c4t600039316A1Fd1s2ONLINE   0 0 0
  c4t600039316A1Fd2s2ONLINE   0 0 0
  c4t600039316A1Fd3s2ONLINE   0 0 0
  replacing  ONLINE   0 0 0
c4t600039316A1Fd4s2  ONLINE   2.87K   251 0
c4t600039316A1Fd6ONLINE   0 0 0
  c4t600039316A1Fd5s2ONLINE   0 0 0


I thought to myself: if it is 100% done, why is it still replacing?  I waited
about 15 seconds and ran the command again, only to find something rather
disconcerting:


pool: xsr_slow_2
state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
  continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scrub: resilver in progress, 0.45% done, 27h27m to go
config:

  NAME

Re: [zfs-discuss] ZFS related kernel panic

2006-12-04 Thread Jason J. W. Williams

Hi all,

Having experienced this, it would be nice if there were an option to take the
filesystem offline instead of kernel panicking, on a per-zpool basis. If it's a
system-critical partition like a database, I'd prefer it to kernel panic and
thereby trigger a fail-over of the application. However, if it's a zpool
hosting some file shares, I'd prefer it to stay online. Putting that level of
control in would alleviate a lot of the complaints, it seems to me... or at
least give them less of a leg to stand on. ;-)

A nasty little notice telling you that the system will kernel panic if a vdev
becomes unavailable wouldn't be bad either when you're creating a striped
zpool. Even the best of us forget these things.

Best Regards,
Jason

On 12/4/06, Richard Elling [EMAIL PROTECTED] wrote:

Douglas Denny wrote:
 On 12/4/06, James C. McPherson [EMAIL PROTECTED] wrote:
   Is this normal behavior for ZFS?

 Yes. You have no redundancy (from ZFS' point of view at least),
 so ZFS has no option except panicing in order to maintain the
 integrity of your data.

 This is interesting from a implementation point of view. Any singly
 attached SAN connection that has a disconnect from its switch/backend
 will cause the ZFS to panic, why would it not wait and see if the
 device came back? Should all SAN connected ZFS pools have redundancy
 built in with dual HBAs to dual SAN switches/controllers?

UFS will panic on EIO also.  Most other file systems, too.
You can put UFS on top of SVM, but unless SVM is configured for
redundancy, it (UFS) would still panic in such situations.  ZFS
doesn't bring anything new here, but I sense a change in expectations
that I can't quite reconcile.
  -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] replacing a drive in a raidz vdev

2006-12-04 Thread Bill Sommerfeld
On Mon, 2006-12-04 at 13:56 -0500, Krzys wrote:
 mypool2/[EMAIL PROTECTED]  34.4M  -   151G  -
 mypool2/[EMAIL PROTECTED]  141K  -   189G  -
 mypool2/d3 492G   254G  11.5G  legacy
 
 I am so confused with all of this... Why its taking so long to replace that 
 one 
 bad disk?

To work around a bug where a pool traverse gets lost when the snapshot
configuration of a pool changes, both scrubs and resilvers start over
again any time you create or delete a snapshot.

Unfortunately, this workaround has problems of its own -- if your
inter-snapshot interval is less than the time required to complete a
scrub, the resilver will never complete.

The open bug is:

6343667 scrub/resilver has to start over when a snapshot is taken

If it's not going to be fixed any time soon, perhaps we need a better
workaround:

Ideas:
  - perhaps snapshots should be made to fail while a resilver (not a
scrub!) is in progress...

  - or maybe snapshots should fail only when a *restarted* resilver is in
progress -- that way, if you can complete the resilver between two
snapshot times, you don't miss any snapshots, but if it takes longer
than that, snapshots are sacrificed in the name of pool integrity.
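
In the meantime, a user-side guard along these lines can keep a periodic
snapshot job from restarting an in-progress resilver (just a sketch; the
pool name, dataset and snapshot naming are hypothetical):

   #!/bin/sh
   # skip this run's snapshot while mypool2 is resilvering
   if zpool status mypool2 | grep "resilver in progress" > /dev/null; then
       echo "resilver in progress, skipping snapshot" >&2
       exit 0
   fi
   zfs snapshot mypool2/d@`date +%Y%m%d-%H%M`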


- Bill


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS on multi-volume

2006-12-04 Thread Albert Shih
Hi all

Sorry if my question is not very clear; I'm not very familiar with ZFS (which
is why I'm asking).

Suppose I have a lot of low-cost RAID arrays (like Brownie, i.e. IDE/SATA
disks), all SCSI-attached (about 10 of them, roughly 20 TB in total). Now, if I
buy a big "high-end" RAID array with FC attachment and a big Sun server, can I
create a ZFS filesystem over all the disks such that:

all data is on the new big RAID array (using hardware RAID)

and

all data is mirrored onto the sum of my old low-cost RAID arrays

?

If it's possible, what do you think of the performance?

The purpose is to make a big NFS server with the primary data on the high-end
RAID array, but using ZFS to mirror all the data onto all the old arrays.

Regards.
--
Albert SHIH
Universite de Paris 7 (Denis DIDEROT)
U.F.R. de Mathematiques.
7 ième étage, plateau D, bureau 10
Heure local/Local time:
Mon Dec 4 23:04:04 CET 2006
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS related kernel panic

2006-12-04 Thread Matthew Ahrens

Jason J. W. Williams wrote:

Hi all,

Having experienced this, it would be nice if there was an option to
offline the filesystem instead of kernel panicking on a per-zpool
basis. If its a system-critical partition like a database I'd prefer
it to kernel-panick and thereby trigger a fail-over of the
application. However, if its a zpool hosting some fileshares I'd
prefer it to stay online. Putting that level of control in would
alleviate a lot of the complaints it seems to me...or at least give
less of a leg to stand on. ;-)


Agreed, and we are working on this.

--matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: ZFS related kernel panic

2006-12-04 Thread Peter Eriksson
 If you take a look at these messages the somewhat unusual condition 
 that may lead to unexpected behaviour (ie. fast giveup) is that 
 whilst this is a SAN connection it is achieved through a non- 
 Leadville config, note the fibre-channel and sd references. In a 
 Leadville compliant installation this would be the ssd driver, hence 
 you'd have to investigate the specific semantics and driver tweaks 
 that this system has applied to sd in this instance.

If only it were possible to use the Leadville drivers... We've seen the same
problems here (*instant* panic if the FC switch reboots, thanks to ZFS - I
wouldn't mind if it kept on retrying a tad bit longer - preferably
configurable). And to panic? How can that in any sane way be a good way to
protect the application? *BANG* - no chance at all for the application to
handle the problem...


Btw. in our case we have also wrapped the raw FC-attached disks with SVM
metadevices first, because when a disk in an A3500FC unit goes bad we hit the
_other_ failure mode of ZFS - a total hang - until I noticed that wrapping the
device with a layer of SVM metadevices insulated ZFS from that problem. Now it
correctly notices that the disk is gone/dead and displays that when doing
zpool status etc.
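
The wrapping described above amounts to something like this (a sketch with
hypothetical device and metadevice names; each metadevice is just a simple
one-slice concat in front of a LUN):

   metainit d101 1 1 c2t40d0s2
   metainit d102 1 1 c2t41d0s2
   zpool create tank mirror /dev/md/dsk/d101 /dev/md/dsk/d102

(The mirror layout is only for illustration; the point is that ZFS sits on
the metadevices rather than on the raw sd devices.)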


(We (Lysator ACS - a students' computer club) can't use the Leadville driver,
since the 'ifp' driver (and hence the ssd disks) for the Qlogic QLA2100
HBA boards is based on an older Qlogic firmware that only supports a max of 16
LUNs per target, and we want more... So we use the Qlogic 'qla2100' driver
instead, which works really nicely, but then it uses the sd disk devices
instead.)

Being a computer club with limited funds means one finds ways to use old 
hardware in new and interesting ways :-)

Hardware in use: Primary file server: Sun Ultra 450, two Qlogic QLA2100 HBAs. 
One connected via an 8-port FC-AL *hub* to two Sun A5000 JBOD boxes (filled 
with 9 and 18GB FC disks), the other via a Brocade 2400 8-port switch (running 
in QuickLoop mode) to a Compaq StorageWorks RA8000 RAID and two A3500FC 
systems. 

Now... What can *possibly* go wrong with that setup? :-)

I'll tell you a couple:

1. When the server entered multiuser and started serving NFS to all the users'
$HOME directories, many, many disks in the A5000 started resetting themselves
again and again and again... Solution: tune down the maximum number of tagged
commands sent to the disks in /kernel/drv/qla2100.conf:
   hba1-max-iocb-allocation=7; # was 256
   hba1-execution-throttle=7; # was 31
(This problem wasn't there with the old Sun ifp driver, probably because it
has less aggressive limits - but since that driver is totally nonconfigurable
it's impossible to tell.)

2. The power cord to the Brocade switch came slightly loose, causing it to
reboot and sending the server into an *instant PANIC thanks to ZFS*.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS related kernel panic

2006-12-04 Thread Jason J. W. Williams

Any chance we might get a short refresher warning when creating a
striped zpool? O:-)

Best Regards,
Jason

On 12/4/06, Matthew Ahrens [EMAIL PROTECTED] wrote:

Jason J. W. Williams wrote:
 Hi all,

 Having experienced this, it would be nice if there was an option to
 offline the filesystem instead of kernel panicking on a per-zpool
 basis. If its a system-critical partition like a database I'd prefer
 it to kernel-panick and thereby trigger a fail-over of the
 application. However, if its a zpool hosting some fileshares I'd
 prefer it to stay online. Putting that level of control in would
 alleviate a lot of the complaints it seems to me...or at least give
 less of a leg to stand on. ;-)

Agreed, and we are working on this.

--matt


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS related kernel panic

2006-12-04 Thread James C. McPherson

Peter Eriksson wrote:
If you take a look at these messages the somewhat unusual condition
that may lead to unexpected behaviour (ie. fast giveup) is that whilst
this is a SAN connection it is achieved through a non-Leadville
config, note the fibre-channel and sd references. In a Leadville
compliant installation this would be the ssd driver, hence you'd have
to investigate the specific semantics and driver tweaks that this
system has applied to sd in this instance.


If only it was possible to use the Leadville drivers... We've seen the
same problems here (*instant* panic if the FC switch reboots due to ZFS -
I wouldn't mind if it kept on retrying a tad bit longer - preferably
configurable). And to panic? How can that in any sane way be good way to
protect the application? *BANG* - no chance at all for the application
to handle the problem...


The *application* should not be worrying about handling error
conditions in the kernel. That's the kernel's job, and in this
case, ZFS' job.

ZFS protects *your data* by preventing any more writes from
occurring when it cannot guarantee the integrity of your data.


Btw. in our case we have also wrapped the raw FC-attached disks with
SVM metadevices first because if a disk in a A3500FC units goes bad then
we had the _other_ failure mode of ZFS - total hang until I noticed that
by wrapping the device with a layer of SVM metadevices insulated ZFS from
that problem - now it correctly notices that the disk is gone/dead and
displays that when doing zfs status etc.


Hm. An extra layer of complexity. Kinda defeats one of the stated goals
of ZFS.


(We (Lysator ACS - a students computer club) can't use the Leadville
driver, since the 'ifp driver (and hence use the ssd disks) for the
Qlogic QLA2100 HBA boards is based on an older Qlogic firmware that only
supports max 16 LUNs per target and we want more... So we use the Qlogic
qla2100 driver instead which works really nicely but then it uses the
sd disk devices instead.

Being a computer club with limited funds means one finds ways to use old
hardware in new and interesting ways :-)


Ebay.se ?


Hardware in use: Primary file server: Sun Ultra 450, two Qlogic QLA2100
HBAs. One connected via an 8-port FC-AL *hub* to two Sun A5000 JBOD boxes
(filled with 9 and 18GB FC disks), the other via a Brocade 2400 8-port
switch (running in QuickLoop mode) to a Compaq StorageWorks RA8000 RAID
and two A3500FC systems.
Now... What can *possibly* go wrong with that setup? :-)


Hmmm, let's start with the mere existence of the EOL'd A3500FC
hardware in your config. Kinda goes downhill from there :)


I'll tell you a couple:

1. When the server entered multiuser and started serving NFS to all the
users $HOME - many many disks in the A5000 started resetting themself
again and again and again... Solution: Tune down the maximum number of
tagged commands that was sent to the disks in /kernel/drv/qla2100.conf: 
hba1-max-iocb-allocation=7; # was 256 hba1-execution-throttle=7; # was 31

 (This problem wasn't there with the old Sun ifp driver, probably
because it has less agressive limits - but since that driver is totally
nonconfigurable it's impossible to tell).


Ebay.se


2. The power cord got slightly lose to the Brocade switch causing it to
reboot causing the server into an *Instant PANIC thanks to ZFS*


Yes, as noted, this is by design, in order to *protect your data*.


James C. McPherson
--
Solaris kernel software engineer
Sun Microsystems
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS related kernel panic

2006-12-04 Thread Dale Ghent

Matthew Ahrens wrote:

Jason J. W. Williams wrote:

Hi all,

Having experienced this, it would be nice if there was an option to
offline the filesystem instead of kernel panicking on a per-zpool
basis. If its a system-critical partition like a database I'd prefer
it to kernel-panick and thereby trigger a fail-over of the
application. However, if its a zpool hosting some fileshares I'd
prefer it to stay online. Putting that level of control in would
alleviate a lot of the complaints it seems to me...or at least give
less of a leg to stand on. ;-)


Agreed, and we are working on this.


Similar to UFS's onerror mount option, I take it?

/dale
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: ZFS on multi-volume

2006-12-04 Thread Anton B. Rang
It is possible to configure ZFS in the way you describe, but your performance
will be limited by the older arrays.

All mirror writes have to be stored on both arrays before they are considered
complete, so writes will be as slow as the slowest disk or array involved.

ZFS does not currently consider performance when selecting a mirror side for
reads, so half of the reads will run at the speed of the new array and half at
the speed of the old arrays.

If you need to use both types of arrays (20 TB is a lot of space to give up!),
consider creating two pools, one composed of newer arrays and one of older
arrays, at least if your data is easily split into fast and slow sets (e.g.
fresh data vs. archival, or database logs vs. infrequently accessed tables).
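
A sketch of that two-pool layout (device names hypothetical; the new FC array
is presented as hardware-RAID LUNs, the old arrays as their own LUNs):

   # fast pool on the new FC array
   zpool create fastpool c6t40d0 c6t41d0

   # slow pool on the old SCSI-attached arrays
   zpool create slowpool c2t0d0 c2t1d0 c3t0d0 c3t1d0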
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: ZFS related kernel panic

2006-12-04 Thread Anton B. Rang
 And to panic? How can that in any sane way be good
 way to protect the application?
 *BANG* - no chance at all for the application to
 handle the problem...

I agree -- a disk error should never be fatal to the system; at worst, the file 
system should appear to have been forcibly unmounted (and worst really means 
that critical metadata, like the superblock/uberblock, can't be updated on any 
of the disks in the pool). That at least gives other applications which aren't 
using the file system the chance to keep going.

An I/O error detected when writing a file can be reported at write() time,
fsync() time, or close() time. Any application which doesn't check all three of
these won't handle all I/O errors properly; and applications which care about
knowing that their data is on disk must either use synchronous writes
(O_SYNC/O_DSYNC) or call fsync() before closing the file. ZFS should report
these errors back in all cases and avoid panicking (obviously).
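
A minimal C sketch of the checking described above (the target path is
hypothetical, and real code would do something smarter than exit on error):

   #include <fcntl.h>
   #include <stdio.h>
   #include <stdlib.h>
   #include <unistd.h>

   int main(void)
   {
       const char buf[] = "important data\n";
       int fd = open("/tank/data/file", O_WRONLY | O_CREAT | O_TRUNC, 0644);
       if (fd < 0) { perror("open"); exit(1); }

       /* 1. the error may be reported by write() itself... */
       if (write(fd, buf, sizeof(buf) - 1) != (ssize_t)(sizeof(buf) - 1)) {
           perror("write"); exit(1);
       }
       /* 2. ...or only when the dirty data is flushed by fsync()... */
       if (fsync(fd) != 0) { perror("fsync"); exit(1); }

       /* 3. ...or, as a last chance, at close(). */
       if (close(fd) != 0) { perror("close"); exit(1); }
       return 0;
   }

(Opening with O_DSYNC instead would push the error back to each write() call,
at the cost of a synchronous write per call.)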

That said, it also appears that the device drivers (either the FibreChannel or 
SCSI disk drivers in this case) are misbehaving. The FC driver appears to be 
reporting back an error which is interpreted as fatal by the SCSI disk driver 
when one or the other should be retrying the I/O. (It also appears that either 
the FC driver, SCSI disk driver, or ZFS is misbehaving in the observed hang.)

So ZFS should be more resilient against write errors, and the SCSI disk or FC 
drivers should be more resilient against LIPs (the most likely cause of your 
problem) or other transient errors. (Alternatively, the ifp driver should be 
updated to support the maximum number of targets on a loop, which might also 
solve your second problem.)
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS related kernel panic

2006-12-04 Thread James C. McPherson

Anton B. Rang wrote:
Peter Eriksson wrote:

And to panic? How can that in any sane way be good way to protect the
application? *BANG* - no chance at all for the application to handle
the problem...


I agree -- a disk error should never be fatal to the system; at worst,
the file system should appear to have been forcibly unmounted (and
worst really means that critical metadata, like the
superblock/uberblock, can't be updated on any of the disks in the pool).
That at least gives other applications which aren't using the file system
the chance to keep going.


But it's still not the application's problem to handle the underlying
device failure.

...


That said, it also appears that the device drivers (either the
FibreChannel or SCSI disk drivers in this case) are misbehaving. The FC
driver appears to be reporting back an error which is interpreted as
fatal by the SCSI disk driver when one or the other should be retrying
the I/O. (It also appears that either the FC driver, SCSI disk driver, or
ZFS is misbehaving in the observed hang.)


In this case it is most likely that it's the qla2x00 driver which is at
fault. The Leadville drivers do the appropriate retries. The sd driver
and ZFS also do the appropriate retries.


So ZFS should be more resilient against write errors, and the SCSI disk
or FC drivers should be more resilient against LIPs (the most likely
cause of your problem) or other transient errors. (Alternatively, the ifp
driver should be updated to support the maximum number of targets on a
loop, which might also solve your second problem.)


Your alternative option isn't going to happen. The ifp driver and
the card it supports have both been long since EOLd.



James C. McPherson
--
Solaris kernel software engineer
Sun Microsystems
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS related kernel panic

2006-12-04 Thread Richard Elling

Anton B. Rang wrote:

And to panic? How can that in any sane way be good
way to protect the application?
*BANG* - no chance at all for the application to
handle the problem...


I agree -- a disk error should never be fatal to the system; at worst, the file system 
should appear to have been forcibly unmounted (and worst really means that critical 
metadata, like the superblock/uberblock, can't be updated on any of the disks in the 
pool). That at least gives other applications which aren't using the file system the 
chance to keep going.


This is not always the desired behavior.  In particular, for a high availability
cluster, if one node is having difficulty and another is not, then we'd really
like to have the services relocated to the good node ASAP.  I think this case is
different, though...

An I/O error detected when writing a file can be reported at write() time, fsync() time, 
or close() time. Any application which doesn't check all three of these won't handle 
all I/O errors properly; and applications which care about knowing that their data is 
on disk must either use synchronous writes (O_SYNC/O_DSYNC) or call fsync before 
closing the file. ZFS should report back these errors in all cases and avoid panicing 
(obviously).


From what I recall of previous discussions on this topic (search the archives),
the difficulty is attributing a failure temporally, given that you want a file
system to have better performance by caching.

That said, it also appears that the device drivers (either the FibreChannel or SCSI 
disk drivers in this case) are misbehaving. The FC driver appears to be reporting back 
an error which is interpreted as fatal by the SCSI disk driver when one or the other 
should be retrying the I/O. (It also appears that either the FC driver, SCSI disk 
driver, or ZFS is misbehaving in the observed hang.)


Agree 110%.  When debugging layered software/firmware, it is essential to 
understand
all of the assumptions made at all interfaces.  Currently, ZFS assumes that a 
fatal
write error is in fact fatal.

So ZFS should be more resilient against write errors, and the SCSI disk or FC drivers 
should be more resilient against LIPs (the most likely cause of your problem) or other 
transient errors. (Alternatively, the ifp driver should be updated to support the 
maximum number of targets on a loop, which might also solve your second problem.)


NB. LIPs are a normal part of everyday life for fibre channel; they are not an
error.

But I think Anton is right here: the way that the driver deals with incurred
exceptions is key to the upper layers being stable.  This can be tuned, but
remember that tuning may lead to instability.  We might be dealing with an
instability case here, not a functional spec problem.
 -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS related kernel panic

2006-12-04 Thread Richard Elling

Dale Ghent wrote:

Matthew Ahrens wrote:

Jason J. W. Williams wrote:

Hi all,

Having experienced this, it would be nice if there was an option to
offline the filesystem instead of kernel panicking on a per-zpool
basis. If its a system-critical partition like a database I'd prefer
it to kernel-panick and thereby trigger a fail-over of the
application. However, if its a zpool hosting some fileshares I'd
prefer it to stay online. Putting that level of control in would
alleviate a lot of the complaints it seems to me...or at least give
less of a leg to stand on. ;-)


Agreed, and we are working on this.


Similar to UFS's onerror mount option, I take it?


Actually, it would be interesting to see how many customers change the
onerror setting.  We have some data, just need more hours in the day.
 -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS related kernel panic

2006-12-04 Thread Dale Ghent

Richard Elling wrote:


Actually, it would be interesting to see how many customers change the
onerror setting.  We have some data, just need more hours in the day.


I'm pretty sure you'd find that info in over 6 years of submitted 
Explorer output :)


I imagine that stuff is sandboxed away in a far off department, though.

/dale
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] need Clarification on ZFS

2006-12-04 Thread dudekula mastan
Hi All,

I am new to Solaris. Please clarify the following questions for me.

1) On Linux, to detect the presence of an ext2/ext3 file system on a device we
use the tune2fs command. Similar to tune2fs, is there any command to detect the
presence of a ZFS file system on a device?

2) When a device is shared between two machines, what our project does is:

- Create an ext2 file system on the device
a) Mount the device on machine 1
b) Write data on the device
c) Unmount the device from machine 1
d) Mount the device on machine 2
e) Read the data on the device
f) Compare the current read data with the previously written data and report
the result
g) Unmount the device from machine 2
h) Go to step a.

Like this, can we share a ZFS file system between two machines? If so, please
explain how.

3) Can we create ZFS pools (or ZFS file systems) on VxVM volumes? If so, how?

4) Can we share ZFS pools (ZFS file systems) between two machines?

5) Like the fsck command on Linux, is there any command to check the
consistency of a ZFS file system?

Your help is appreciated.

Thanks & Regards
Masthan

 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] need Clarification on ZFS

2006-12-04 Thread Jason A. Hoffman

Hi Mastan,

On Dec 4, 2006, at 11:13 PM, dudekula mastan wrote:


Hi All,

I am new to solaris. Please clarify me on the following questions.

1) On Linux to know the presence of ext2/ext3 file systems on a  
device we use tune2fs command. Similar to tune2fs command is there  
any command to know the presence of ZFS file system on a device ?


zpool import will list any zpools even when they're not currently  
visible in a zpool list


zfs get -r all zpool-name will list all zfs filesystems


 2) When a device is shared between two machines , What our project  
does is,


- Create ext2 file system on device
a) Mount the device on machine 1
 b) Write data on the device
c) unmount the device from machine 1
d)mount the device on machine 2
e) read the data on the device
f) compare the current read data with previous write data  and  
report the result

g) unmount the device from machine 2
h) Goto step a.

Like this , Can We share zfs file system between two machines. If  
so please explain it.


It's always going from machine 1 to machine 2?

zfs send [EMAIL PROTECTED] | ssh [EMAIL PROTECTED] zfs recv filesystem-on-machine2

will stream a snapshot from the first machine to a filesystem/device/snapshot
on machine2.
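
Spelled out with hypothetical pool, filesystem and host names, one pass of
the round trip looks like:

   # on machine1: snapshot the filesystem and stream it across
   zfs snapshot tank/shared@pass1
   zfs send tank/shared@pass1 | ssh machine2 zfs recv backup/shared

   # on machine2: the received copy shows up under backup/shared
   # (assumes a pool/dataset called 'backup' already exists there)
   zfs list -r backup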



3) Can we create ZFS pools (or ZFS file system ) on VxVm volumes ?  
if so, how ?


It's been so long since I've cared about VxVm volumes, I don't know.


4) Can we share ZFS pools ( ZFS file ststem ) between two machines ?


Yes but what are the requirements here?

5)  Like fsck command on Linux, is there any command  to check the  
consistency of the ZFS file system ?


ZFS is a transactional, copy-on-write filesystem. It's always  
consistent.


There is output from zpool status that includes this information, for example:


[strongspace12(zone):/] root# zpool status
  pool: thumper12
state: ONLINE
scrub: none requested
config:

NAME        STATE READ WRITE CKSUM
thumper ONLINE   0 0 0
  raidz2ONLINE   0 0 0
c5t4d0  ONLINE   0 0 0
c4t4d0  ONLINE   0 0 0
c7t4d0  ONLINE   0 0 0
c6t4d0  ONLINE   0 0 0
c1t4d0  ONLINE   0 0 0
c0t4d0  ONLINE   0 0 0
c4t0d0  ONLINE   0 0 0
c7t0d0  ONLINE   0 0 0
c6t0d0  ONLINE   0 0 0
c1t0d0  ONLINE   0 0 0
c0t0d0  ONLINE   0 0 0
  raidz2ONLINE   0 0 0
c5t5d0  ONLINE   0 0 0
c4t5d0  ONLINE   0 0 0
c7t5d0  ONLINE   0 0 0
c6t5d0  ONLINE   0 0 0
c1t5d0  ONLINE   0 0 0
c0t5d0  ONLINE   0 0 0
c4t1d0  ONLINE   0 0 0
c7t1d0  ONLINE   0 0 0
c6t1d0  ONLINE   0 0 0
c1t1d0  ONLINE   0 0 0
c0t1d0  ONLINE   0 0 0
  raidz2ONLINE   0 0 0
c5t6d0  ONLINE   0 0 0
c4t6d0  ONLINE   0 0 0
c7t6d0  ONLINE   0 0 0
c6t6d0  ONLINE   0 0 0
c1t6d0  ONLINE   0 0 0
c0t6d0  ONLINE   0 0 0
c4t2d0  ONLINE   0 0 0
c7t2d0  ONLINE   0 0 0
c6t2d0  ONLINE   0 0 0
c1t2d0  ONLINE   0 0 0
c0t2d0  ONLINE   0 0 0
  raidz2ONLINE   0 0 0
c5t7d0  ONLINE   0 0 0
c4t7d0  ONLINE   0 0 0
c7t7d0  ONLINE   0 0 0
c6t7d0  ONLINE   0 0 0
c1t7d0  ONLINE   0 0 0
c0t7d0  ONLINE   0 0 0
c4t3d0  ONLINE   0 0 0
c7t3d0  ONLINE   0 0 0
c6t3d0  ONLINE   0 0 0
c1t3d0  ONLINE   0 0 0
c0t3d0  ONLINE   0 0 0
spares
  c5t1d0AVAIL
  c5t2d0AVAIL
  c5t3d0AVAIL

errors: No known data errors


Regards, Jason


Jason A. Hoffman, PhD | Founder, CTO, Joyent Inc.
Applications = http://joyent.com/
Hosting  = http://textdrive.com/
Backups  = http://strongspace.com/
Weblog   = http://joyeur.com/
Email= [EMAIL PROTECTED] or [EMAIL PROTECTED]
Mobile   = (858)342-2179




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] need Clarification on ZFS

2006-12-04 Thread Darren Dunham
   1) On Linux to know the presence of ext2/ext3 file systems on a device we 
 use tune2fs command. Similar to tune2fs command is there any command to know 
 the presence of ZFS file system on a device ?


You can use 'zpool import' to check normal disk devices, or give an
optional list of devices/directories to search specifically for zfs
presence, or you can use 'fstyp' to guess at a filesystem on any type of
named device based on signature.

# fstyp /dev/rdsk/c0t8d0s0
ufs
# fstyp /dev/rdsk/c1t8d0s0
zfs

   2) When a device is shared between two machines , What our project does is,

   - Create ext2 file system on device 
   a) Mount the device on machine 1
b) Write data on the device 
   c) unmount the device from machine 1
   d)mount the device on machine 2
   e) read the data on the device
   f) compare the current read data with previous write data  and report the 
 result
   g) unmount the device from machine 2
   h) Goto step a.

   Like this , Can We share zfs file system between two machines. If so
please explain it.

Yes.  'zpool export' and 'zpool import' can be used to unmount and
remount the pool and filesystems on different machines at separate
times.
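
For example, with a hypothetical pool name, run on each host in turn:

   machine1# zpool export mypool
   machine2# zpool import mypool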

   3) Can we create ZFS pools (or ZFS file system ) on VxVm volumes ?
   if so, how ?

Haven't tried it, but you should be able to pass the volume in on the
zpool create command line as a device.
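
Something along these lines ought to work, assuming a VxVM volume vol01 in
disk group datadg (untested, as noted above):

   zpool create tank /dev/vx/dsk/datadg/vol01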

   4) Can we share ZFS pools ( ZFS file ststem ) between two machines ?

Not simultaneously at this point.

   5)  Like fsck command on Linux, is there any command  to check the 
 consistency of the ZFS file system ?

Not in exactly the same way (because it's not needed in the same way),
but it can be scrubbed periodically to verify that all the data
checksums correctly.  Also, you can do this while it is online.
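
For example, with a hypothetical pool name:

   zpool scrub mypool
   zpool status mypool     # shows scrub progress and any checksum errors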

-- 
Darren Dunham   [EMAIL PROTECTED]
Senior Technical Consultant TAOShttp://www.taos.com/
Got some Dr Pepper?   San Francisco, CA bay area
  This line left intentionally blank to confuse you. 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss