Re: [zfs-discuss] Freezing OpenSolaris with ZFS

2009-03-16 Thread Markus Denhoff
Hi again. I read through your thread, Blake, and I'm not sure we have exactly the same problem: I get different output, and the system doesn't reboot automatically.

Controller is an Adaptec RAID 31605 and the board is a Supermicro X7DBE.

Here is some information that may be useful.

"fmdump -eV" gives the following output:

Mar 16 2009 09:22:25.383935668 ereport.io.scsi.cmd.disk.dev.uderr
nvlist version: 0
class = ereport.io.scsi.cmd.disk.dev.uderr
ena = 0x139594f76b1
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = dev
device-path = /p...@0,0/pci8086,2...@1c/pci8086,3...@0/pci9005,2...@e/d...@0,0

devid = id1,s...@tadaptec_31605___7ae2bc31
(end detector)

driver-assessment = fail
op-code = 0x1a
cdb = 0x1a 0x0 0x8 0x0 0x18 0x0
pkt-reason = 0x0
pkt-state = 0x1f
pkt-stats = 0x0
stat-code = 0x0
un-decode-info = sd_cache_control: Mode Sense caching page code mismatch 0

un-decode-value =
__ttl = 0x1
__tod = 0x49be0c41 0x16e264b4

Mar 16 2009 09:22:25.385150895 ereport.io.scsi.cmd.disk.dev.uderr
nvlist version: 0
class = ereport.io.scsi.cmd.disk.dev.uderr
ena = 0x1395a77fe51
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = dev
device-path = /p...@0,0/pci8086,2...@1c/pci8086,3...@0/pci9005,2...@e/d...@1,0

devid = id1,s...@tadaptec_31605___6a16ac31
(end detector)

driver-assessment = fail
op-code = 0x1a
cdb = 0x1a 0x0 0x8 0x0 0x18 0x0
pkt-reason = 0x0
pkt-state = 0x1f
pkt-stats = 0x0
stat-code = 0x0
un-decode-info = sd_cache_control: Mode Sense caching page code mismatch 0

un-decode-value =
__ttl = 0x1
__tod = 0x49be0c41 0x16f4efaf

Mar 16 2009 09:22:25.386359711 ereport.io.scsi.cmd.disk.dev.uderr
nvlist version: 0
class = ereport.io.scsi.cmd.disk.dev.uderr
ena = 0x1395b9f5321
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = dev
device-path = /p...@0,0/pci8086,2...@1c/pci8086,3...@0/pci9005,2...@e/d...@2,0

devid = id1,s...@tadaptec_31605___7206bc31
(end detector)

driver-assessment = fail
op-code = 0x1a
cdb = 0x1a 0x0 0x8 0x0 0x18 0x0
pkt-reason = 0x0
pkt-state = 0x1f
pkt-stats = 0x0
stat-code = 0x0
un-decode-info = sd_cache_control: Mode Sense caching page code mismatch 0

un-decode-value =
__ttl = 0x1
__tod = 0x49be0c41 0x1707619f

Mar 16 2009 09:22:25.387617744 ereport.io.scsi.cmd.disk.dev.uderr
nvlist version: 0
class = ereport.io.scsi.cmd.disk.dev.uderr
ena = 0x1395cd262d1
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = dev
device-path = /p...@0,0/pci8086,2...@1c/pci8086,3...@0/pci9005,2...@e/d...@3,0

devid = id1,s...@tadaptec_31605___1a2ebc31
(end detector)

driver-assessment = fail
op-code = 0x1a
cdb = 0x1a 0x0 0x8 0x0 0x18 0x0
pkt-reason = 0x0
pkt-state = 0x1f
pkt-stats = 0x0
stat-code = 0x0
un-decode-info = sd_cache_control: Mode Sense caching page code mismatch 0

un-decode-value =
__ttl = 0x1
__tod = 0x49be0c41 0x171a93d0

Mar 16 2009 09:22:25.388910486 ereport.io.scsi.cmd.disk.dev.uderr
nvlist version: 0
class = ereport.io.scsi.cmd.disk.dev.uderr
ena = 0x1395e0ddda1
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = dev
device-path = /p...@0,0/pci8086,2...@1c/pci8086,3...@0/pci9005,2...@e/d...@4,0

devid = id1,s...@tadaptec_31605___6c66cc31
(end detector)

driver-assessment = fail
op-code = 0x1a
cdb = 0x1a 0x0 0x8 0x0 0x18 0x0
pkt-reason = 0x0
pkt-state = 0x1f
pkt-stats = 0x0
stat-code = 0x0
un-decode-info = sd_cache_control: Mode Sense caching page code mismatch 0

un-decode-value =
__ttl = 0x1
__tod = 0x49be0c41 0x172e4d96

Mar 16 2009 09:22:25.390144519 ereport.io.scsi.cmd.disk.dev.uderr
nvlist version: 0
class = ereport.io.scsi.cmd.disk.dev.uderr
ena = 0x1395f3b4431
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = dev
device-path = /p...@0,0/pci8086,2...@1c/pci8086,3...@0/pci9005,2...@e/d...@5,0

devid = id1,s...@tadaptec_31605___508acc
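The uderr ereports above repeat once per target (d@0 through d@5), one per disk behind the Adaptec controller. As an aside, a quick way to tally saved `fmdump -eV` output per device id is a few lines of Python; this is just a sketch, and the devids in the sample below are placeholders, not the real ones from this system:

```python
import re
from collections import Counter

def tally_uderr(fmdump_text):
    """Count how many uderr ereports mention each device id in `fmdump -eV` text."""
    counts = Counter()
    current_class = None
    for line in fmdump_text.splitlines():
        line = line.strip()
        if line.startswith("class = "):
            # Remember the ereport class so we only count uderr reports.
            current_class = line.split("= ", 1)[1]
        m = re.match(r"devid = (\S+)", line)
        if m and current_class == "ereport.io.scsi.cmd.disk.dev.uderr":
            counts[m.group(1)] += 1
    return counts

# Illustrative sample with placeholder devids (DISK_A, DISK_B):
sample = """\
class = ereport.io.scsi.cmd.disk.dev.uderr
devid = id1,sd@tDISK_A
class = ereport.io.scsi.cmd.disk.dev.uderr
devid = id1,sd@tDISK_A
class = ereport.io.scsi.cmd.disk.dev.uderr
devid = id1,sd@tDISK_B
"""
print(tally_uderr(sample))
```

If one devid dominates the counts, that points at a single disk rather than the controller.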

Re: [zfs-discuss] Freezing OpenSolaris with ZFS

2009-03-15 Thread Tim
On Sun, Mar 15, 2009 at 6:42 PM, Blake  wrote:

>
>
>
> On Sun, Mar 15, 2009 at 1:17 PM, Markus Denhoff 
> wrote:
> > Hi there,
> >
> > [...]


It might also be helpful to provide the version of OpenSolaris you're on, as
well as the ZFS pool and filesystem versions.
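Something like the following should show all of that (command names as of the 2008.11-era OpenSolaris tools; with no arguments, the upgrade subcommands just print versions and change nothing):

```
$ cat /etc/release
$ uname -a
$ zpool upgrade     # prints the on-disk version of each pool
$ zfs upgrade       # prints the filesystem version of each dataset
```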

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Freezing OpenSolaris with ZFS

2009-03-15 Thread Blake
This sounds quite like the problems I've been having with a spotty
SATA controller and/or motherboard.  See my thread from last week
about copying large amounts of data forcing a reboot.  Lots of
good info from engineers and users in that thread.



On Sun, Mar 15, 2009 at 1:17 PM, Markus Denhoff  wrote:
> Hi there,
>
> [...]


[zfs-discuss] Freezing OpenSolaris with ZFS

2009-03-15 Thread Markus Denhoff

Hi there,

we set up an OpenSolaris/ZFS based storage server with two zpools:  
rpool is a mirror for the operating system. tank is a raidz for data  
storage.


The system is used to store large video files and has 12x 1TB
SATA drives attached (2 mirrored for the system). Every time large files are
copied around, the system hangs without apparent reason, with 50% kernel CPU
usage (so one core is fully occupied) and about 2GB of free RAM (8GB
installed). When idle, nothing crashes. Furthermore, every scrub on tank
hangs the system before it reaches 1% done. Neither /var/adm/messages
nor /var/log/syslog contains any errors or warnings. We
limited the ZFS ARC cache to 4GB with an entry in /etc/system.
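The usual form of such an entry is the zfs_arc_max tunable; shown here for reference (the exact line on our box may differ), with 4 GB expressed in bytes:

```
* Cap the ZFS ARC at 4 GB (0x100000000 bytes)
set zfs:zfs_arc_max = 0x100000000
```

A reboot is required for /etc/system changes to take effect.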


Does anyone have an idea what's happening there and how to solve the
problem?


Below some outputs which may help.

Thanks and greetings from Germany,

Markus Denhoff,
Sebastian Friederichs

# zpool status tank
  pool: tank
 state: ONLINE
 scrub: none requested
config:

NAME STATE READ WRITE CKSUM
tank ONLINE   0 0 0
  raidz1 ONLINE   0 0 0
c6t2d0   ONLINE   0 0 0
c6t3d0   ONLINE   0 0 0
c6t4d0   ONLINE   0 0 0
c6t5d0   ONLINE   0 0 0
c6t6d0   ONLINE   0 0 0
c6t7d0   ONLINE   0 0 0
c6t8d0   ONLINE   0 0 0
c6t9d0   ONLINE   0 0 0
c6t10d0  ONLINE   0 0 0
c6t11d0  ONLINE   0 0 0

errors: No known data errors

# zpool iostat
   capacity operationsbandwidth
pool used  avail   read  write   read  write
--  -  -  -  -  -  -
rpool   37.8G   890G  3  2  94.7K  17.4K
tank2.03T  7.03T112  0  4.62M906
--  -  -  -  -  -  -

# zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
rpool 39.8G   874G72K  /rpool
rpool/ROOT35.7G   874G18K  legacy
rpool/ROOT/opensolaris35.6G   874G  35.3G  /
rpool/ROOT/opensolaris-1  89.9M   874G  2.47G  /tmp/tmp8CN5TR
rpool/dump2.00G   874G  2.00G  -
rpool/export   172M   874G19K  /export
rpool/export/home  172M   874G21K  /export/home
rpool/swap2.00G   876G24K  -
tank  1.81T  6.17T  32.2K  /tank
tank/data 1.81T  6.17T  1.77T  /data
tank/public-share 34.9K  6.17T  34.9K  /public-share