Re: [zfs-discuss] [o.seib...@cs.ru.nl: A broken ZFS pool...]

2012-02-16 Thread Olaf Seibert
On Wed 15 Feb 2012 at 14:49:14 +0100, Olaf Seibert wrote:
        NAME                     STATE     READ WRITE CKSUM
        tank                     FAULTED      0     0     2
          raidz2-0               DEGRADED     0     0     8
            da0                  ONLINE       0     0     0
            da1                  ONLINE       0     0     0
            da2                  ONLINE       0     0     0
            da3                  ONLINE       0     0     0
            3758301462980058947  UNAVAIL      0     0     0  was /dev/da4
            da5                  ONLINE       0     0     0

Current status: I've been running zdb -bcsvL -e -L -p /dev tank, a
magical command I found at
http://sigtar.com/2009/10/19/opensolaris-zfs-recovery-after-kernel-panic/.
Apparently I had to export tank first.

It has been running overnight now, and the only output so far was

fourquid.0:/tmp$ sudo zdb -bcsvL -e -L -p /dev tank

Traversing all blocks to verify checksums ...
zdb_blkptr_cb: Got error 122 reading 42, 0, 3, 0 DVA[0]=0:508c6a90c00:3000 
DVA[1]=0:1813ba6c800:3000 [L3 DMU dnode] fletcher4 lzjb LE contiguous unique 
double size=4000L/1c00P birth=244334305L/244334305P fill=18480533 
cksum=2a43556fd2b:95a3245729a27:15e3e48f3c6a490e:70fa77061df61a76 -- skipping
zdb_blkptr_cb: Got error 122 reading 42, 0, 3, 3 DVA[0]=0:508c6aa2000:3000 
DVA[1]=0:1813ba72800:3000 [L3 DMU dnode] fletcher4 lzjb LE contiguous unique 
double size=4000L/1e00P birth=244334321L/244334321P fill=16777409 
cksum=2ad6a555e8f:a1dcced71be6c:191abf84e5905b05:e8564e4004372491 -- skipping


with the error 122 messages appearing after an hour or so.
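
To make these records easier to read, here is a minimal Python sketch that
splits one of them into its fields. It rests on assumptions that are not
stated in the output itself: that the four numbers after "reading" are
objset, object, level and block id (block id in hex), and that each DVA is
vdev:offset:asize with offset and asize in hex. The function name
parse_zdb_error is made up for illustration. zdb normally prints angle
brackets around these fields, so the patterns accept them as optional.

```python
import re

# Assumed layout after "reading": objset, object, level, blkid (blkid in hex).
BOOKMARK = re.compile(
    r"Got error (\d+) reading <?(\d+), (\d+), (-?\d+), ([0-9a-f]+)>?"
)
# Assumed DVA layout: vdev (decimal), offset (hex), allocated size (hex).
DVA = re.compile(r"DVA\[(\d)\]=<?(\d+):([0-9a-f]+):([0-9a-f]+)>?")


def parse_zdb_error(record):
    """Split one zdb_blkptr_cb error record into a dict, or return None."""
    text = " ".join(record.split())  # undo the line wrapping of the mail
    m = BOOKMARK.search(text)
    if m is None:
        return None
    err, objset, obj, level, blkid = m.groups()
    copies = [
        {"copy": int(n), "vdev": int(vdev),
         "offset": int(off, 16), "asize": int(asize, 16)}
        for n, vdev, off, asize in DVA.findall(text)
    ]
    return {
        "errno": int(err),
        "objset": int(objset),
        "object": int(obj),
        "level": int(level),
        "blkid": int(blkid, 16),
        "copies": copies,
    }


# First record from the output above, wrapped as in the mail.
record = """zdb_blkptr_cb: Got error 122 reading 42, 0, 3, 0 DVA[0]=0:508c6a90c00:3000
DVA[1]=0:1813ba6c800:3000 [L3 DMU dnode] fletcher4 lzjb LE contiguous unique
double size=4000L/1c00P birth=244334305L/244334305P fill=18480533"""

info = parse_zdb_error(record)
print("objset", info["objset"], "object", info["object"],
      "level", info["level"], "blkid", info["blkid"])
for c in info["copies"]:
    print("copy", c["copy"], "vdev", c["vdev"],
          "offset", hex(c["offset"]), "asize", hex(c["asize"]))
```

Under those assumptions both records point into object 0 of objset 42 at
level 3, and each has two copies (DVA[0] and DVA[1]) on the same vdev.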

Would these 2 errors be the 2 in the CKSUM column?

I haven't yet checked whether this has automagically fixed or unlinked
these blocks, but if it hasn't, how would I do that? How can I see whether
these blocks are important, i.e. whether access to much data depends on
them? Would running it on OpenIndiana or similar instead of on FreeBSD
make a difference?

-Olaf.
-- 
Pipe rene = new PipePicture(); assert(Not rene.GetType().Equals(Pipe));
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] [o.seib...@cs.ru.nl: A broken ZFS pool...]

2012-02-15 Thread Olaf Seibert
At the moment I am feverishly seeking advice on how to fix a broken ZFS
raidz2 pool I have (using FreeBSD 8.2-STABLE).

This is the current status:

$ zpool status
  pool: tank
 state: FAULTED
status: One or more devices could not be opened.  There are insufficient
replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
  scan: scrub repaired 0 in 49h3m with 2 errors on Fri Jan 20 15:10:35 2012
config:

        NAME                     STATE     READ WRITE CKSUM
        tank                     FAULTED      0     0     2
          raidz2-0               DEGRADED     0     0     8
            da0                  ONLINE       0     0     0
            da1                  ONLINE       0     0     0
            da2                  ONLINE       0     0     0
            da3                  ONLINE       0     0     0
            3758301462980058947  UNAVAIL      0     0     0  was /dev/da4
            da5                  ONLINE       0     0     0

The strange thing is that the pool is FAULTED while its only vdev,
raidz2-0, is merely DEGRADED.
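
To put numbers on that, here is a minimal Python sketch that parses the
config table above and compares the count of missing disks with what raidz2
is designed to survive (two). The function name parse_config is made up for
illustration, and the sketch assumes a single top-level vdev and plain
integer counters. It only shows that one missing disk out of six accounts
for DEGRADED on raidz2-0; it does not explain FAULTED on the pool.

```python
# Config section as printed by "zpool status" above.
CONFIG = """\
NAME                     STATE     READ WRITE CKSUM
tank                     FAULTED      0     0     2
  raidz2-0               DEGRADED     0     0     8
    da0                  ONLINE       0     0     0
    da1                  ONLINE       0     0     0
    da2                  ONLINE       0     0     0
    da3                  ONLINE       0     0     0
    3758301462980058947  UNAVAIL      0     0     0  was /dev/da4
    da5                  ONLINE       0     0     0
"""

RAIDZ2_PARITY = 2  # raidz2 is designed to survive two missing disks


def parse_config(text):
    """Turn the config table into a list of dicts, one per row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] == "NAME":
            continue  # skip the header and blank lines
        rows.append({
            "name": fields[0],
            "state": fields[1],
            # Assumes plain integers; large counters may be abbreviated.
            "read": int(fields[2]),
            "write": int(fields[3]),
            "cksum": int(fields[4]),
            "note": " ".join(fields[5:]),
        })
    return rows


rows = parse_config(CONFIG)
# Assumes one top-level vdev: row 0 is the pool, row 1 the vdev, rest disks.
pool, vdev, disks = rows[0], rows[1], rows[2:]
missing = [d["name"] for d in disks if d["state"] != "ONLINE"]

print(pool["name"], pool["state"], "cksum", pool["cksum"])
print(vdev["name"], vdev["state"], "cksum", vdev["cksum"])
print("missing disks:", len(missing), "of", len(disks),
      "(raidz2 tolerates", RAIDZ2_PARITY, ")")
```

With one disk missing the vdev is still within its parity budget, so the
nonzero CKSUM counters on tank and raidz2-0 look like the more relevant
clue.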

da4 failed recently and was replaced with a new disk, but no resilvering is
taking place.

I've already tried lots of things, including exporting the pool and then
running zpool import -nFX tank. (I could only get it imported again with
zpool import -V tank.) The -nFX (extreme rewind) option gives no output,
but there is a lot of I/O activity going on, as if it is rewinding forever
or stuck in a loop.

One thing that may or may not complicate things is the following. Quite a
while ago a directory suddenly became so corrupted that zfs reported I/O
errors for various files in it. I could not even remove them; in the end I
moved the other files to a new directory, put the original directory to
the side, and made it mode 000. (If rewinding wants to go back to before
this happened, I can understand that this takes a while, but I left it
running overnight and it didn't make visible progress.)

zdb and various other commands complain about the pool not being
available, or I/O errors. For instance:

fourquid.1:~$ sudo zpool clear -nF tank
fourquid.1:~$ sudo zpool clear -F tank
cannot clear errors for tank: I/O error
fourquid.1:~$ sudo zpool clear -nFX tank
(no output, uses some cpu, some I/O)

zdb -v                         ok
zdb -v -c tank                 zdb: can't open 'tank': Input/output error
zdb -v -l /dev/da[01235]       ok
zdb -v -u tank                 zdb: can't open 'tank': Input/output error
zdb -v -l -u /dev/da[01235]    ok
zdb -v -m tank                 zdb: can't open 'tank': Input/output error
zdb -v -m -X tank              no output, uses cpu and I/O
zdb -v -i tank                 zdb: can't open 'tank': Input/output error
zdb -v -i -F tank              zdb: can't open 'tank': Input/output error
zdb -v -i -X tank              no output, uses cpu and I/O
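
A table like this can be collected mechanically. The following is a minimal
Python sketch, not a tested recovery tool: it runs each command with a time
limit and records "ok", "error" or "timeout", so the variants that only
burn CPU and I/O do not block the rest. The names probe and survey and the
time limits are made up for illustration. The demo runs harmless stand-in
commands; the zdb list is only defined, not executed.

```python
import subprocess
import sys


def probe(argv, timeout):
    """Run one command; return (status, detail) with status ok/error/timeout."""
    try:
        done = subprocess.run(argv, stdout=subprocess.PIPE,
                              stderr=subprocess.PIPE, timeout=timeout)
    except subprocess.TimeoutExpired:
        return "timeout", ""  # still running when the limit was reached
    except OSError as exc:
        return "error", str(exc)  # e.g. command not found
    if done.returncode == 0:
        return "ok", ""
    return "error", done.stderr.decode(errors="replace").strip()


def survey(commands, timeout):
    """Probe every command in turn and collect (argv, status, detail)."""
    return [(argv,) + probe(argv, timeout) for argv in commands]


# Some of the commands from the table above; would need root and the pool.
ZDB_COMMANDS = [
    ["zdb", "-v", "-c", "tank"],
    ["zdb", "-v", "-u", "tank"],
    ["zdb", "-v", "-m", "-X", "tank"],
]

# Harmless stand-ins so the sketch can be tried anywhere.
DEMO_COMMANDS = [
    [sys.executable, "-c", "pass"],
    [sys.executable, "-c", "import sys; sys.exit('cannot open')"],
    [sys.executable, "-c", "import time; time.sleep(5)"],
]

results = survey(DEMO_COMMANDS, timeout=1)
for argv, status, detail in results:
    print(status.ljust(8), " ".join(argv[1:]), detail)
```

For the real commands one would call survey(ZDB_COMMANDS, timeout=...) with
a much longer limit, since a full traversal can take hours.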

Are there any hints you can give me? I have full FreeBSD source online
so I can modify some tools, if needed.

Thanks in advance,
-Olaf.
-- 
Pipe rene = new PipePicture(); assert(Not rene.GetType().Equals(Pipe));