Please note: you will not want to leave the ZIL disabled, as this could
lead to NFS client corruption.

Disabling the ZIL for a test will let you see whether the ZFS write cache,
with many small files on a large filesystem, is causing the problems.
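For a reversible test, the same tunable can be toggled at runtime with the kernel debugger instead of editing /etc/system and rebooting. This is a sketch, assuming Solaris 10 with mdb available; do not leave the value set to 1:

```shell
# Check the current value of the ZIL tunable (0 = enabled).
echo "zil_disable/D" | mdb -k

# Disable the ZIL for the duration of the test (affects filesystems
# mounted afterwards; a remount of the shared dataset may be needed).
echo "zil_disable/W 1" | mdb -kw

# ... run the NFS small-file workload and watch for crashes ...

# Re-enable the ZIL immediately after the test.
echo "zil_disable/W 0" | mdb -kw
```

The /etc/system route quoted below this does the same thing persistently across reboots, which is exactly why it is dangerous to forget about.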


> /etc/system
> set zfs:zil_disable=1
> 
> This may help
> 
> 
> > It seems that the whole community moved from the Solaris
> > forums here, so I'm actually re-posting my original post.
> > 
> > I have Solaris 10U4 running on x86/64 with the 127729-07
> > patch applied (nfs large file panic zfs). I have a 1TB
> > ZFS pool with over a million files (mainly small
> > mail files) and folders shared and accessed over NFSv4.
> > Before applying the mentioned patch, the machine was
> > crashing every 2 hours; after applying it, crashes became
> > rare - roughly once every 1-2 days.
> > 
> > After a crash the ZFS pool shows a strange error:
> > ----------------
> > # zpool status -xv
> >   pool: box5
> >  state: ONLINE
> > status: One or more devices has experienced an error resulting in data
> >         corruption.  Applications may be affected.
> > action: Restore the file in question if possible.  Otherwise restore the
> >         entire pool from backup.
> >    see: www.sun.com/msg/ZFS-8000-8A
> >  scrub: none requested
> > config:
> > 
> >         NAME        STATE     READ WRITE CKSUM
> >         box5        ONLINE       0     0     0
> >           mirror    ONLINE       0     0     0
> >             c1d0    ONLINE       0     0     0
> >             c2d0    ONLINE       0     0     0
> >           mirror    ONLINE       0     0     0
> >             c2d1    ONLINE       0     0     0
> >             c1d1    ONLINE       0     0     0
> > 
> > errors: Permanent errors have been detected in the
> > following files:
> > 
> >        box5:<0x0>
> > ---------
> > 
> > I removed this permanent error by running a scrub,
> > which identified the checksum errors, and then
> > clearing them.
> > 
> > ----------------
> > # zpool status -xv
> >   pool: box5
> >  state: ONLINE
> > status: One or more devices has experienced an error resulting in data
> >         corruption.  Applications may be affected.
> > action: Restore the file in question if possible.  Otherwise restore the
> >         entire pool from backup.
> >    see: www.sun.com/msg/ZFS-8000-8A
> >  scrub: scrub in progress, 1.20% done, 176h37m to go
> > config:
> > 
> >         NAME        STATE     READ WRITE CKSUM
> >         box5        ONLINE       0     0     4
> >           mirror    ONLINE       0     0     2
> >             c1d0    ONLINE       0     0     4
> >             c2d0    ONLINE       0     0     4
> >           mirror    ONLINE       0     0     2
> >             c2d1    ONLINE       0     0     4
> >             c1d1    ONLINE       0     0     4
> > 
> > errors: Permanent errors have been detected in the
> > following files:
> > 
> >        box5:<0x0>
> > ---------
> > 
> > However, the ZFS error (box5:<0x0>) reappeared as soon
> > as the system crashed again.
> > 
> > I've attached stack traces, status, and other info for
> > the last 4 crash dumps. I kept the last one for further
> > debugging, just in case. Basically the stacks are very
> > similar.
> > 
> > The only suspicion I had was my non-ECC RAM modules,
> > but I replaced all of the modules a couple of days ago
> > and it didn't help.
> > 
> > Does anyone have similar symptoms? Any temporary or
> > permanent solutions appreciated.
> > 
> > Message was edited by: 
> >         rstml
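For reference, the scrub-then-clear procedure described above maps onto two standard zpool commands (pool name taken from the quoted output):

```shell
# Scrub the pool to verify every block's checksum; check progress
# periodically, since large pools can take many hours.
zpool scrub box5
zpool status -v box5

# Once the scrub completes, clear the error counters and the
# logged permanent-error record for the pool.
zpool clear box5
```

As the poster found, clearing only removes the record of the error; if the underlying cause (here, the crashes) persists, the box5:<0x0> entry will come back.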
 
 
This message posted from opensolaris.org
_______________________________________________
opensolaris-discuss mailing list
[email protected]
