On September 11, 2015 20:57:46 CEST, "Watson, Dan" <dan.wat...@bcferries.com> wrote:
>Hi all,
>
>I've been enjoying OI for quite a while, but I'm running into a problem
>accessing a zpool on disk image files (sitting on ZFS, accessed via
>lofi) that I hope someone can give me a hint on.
>
>To recover data from a zpool I've copied slice 0 off of all the disks
>to a different host under /alt (zfs file system)
>root@represent:/alt# ls
>c1t50014EE0037B0FF3d0s0.dd  c1t50014EE0AE25CF55d0s0.dd 
>c1t50014EE2081874CAd0s0.dd  c1t50014EE25D6CDE92d0s0.dd 
>c1t50014EE25D6DDBC7d0s0.dd  c1t50014EE2B2C380C3d0s0.dd
>c1t50014EE0037B105Fd0s0.dd  c1t50014EE0AE25EFD1d0s0.dd 
>c1t50014EE20818C0ECd0s0.dd  c1t50014EE25D6DCF0Ed0s0.dd 
>c1t50014EE2B2C27AE2d0s0.dd  c1t50014EE6033DD776d0s0.dd
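>
>(For reference, each image is just a raw copy of slice 0, made with
>something along the lines of
>  dd if=/dev/rdsk/c1t50014EE0037B0FF3d0s0 of=/alt/c1t50014EE0037B0FF3d0s0.dd bs=1M
>the exact invocation isn't shown here and the block size is illustrative.)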
>
>I use lofiadm to access the disk images as devices, because for some
>reason ZFS can't access a "device"-formatted vdev when it's a plain file:
>root@represent:/alt# lofiadm
>Block Device             File                           Options
>/dev/lofi/1              /alt/c1t50014EE0037B0FF3d0s0.dd        -
>/dev/lofi/2              /alt/c1t50014EE0037B105Fd0s0.dd        -
>/dev/lofi/3              /alt/c1t50014EE0AE25CF55d0s0.dd        -
>/dev/lofi/4              /alt/c1t50014EE0AE25EFD1d0s0.dd        -
>/dev/lofi/5              /alt/c1t50014EE2081874CAd0s0.dd        -
>/dev/lofi/6              /alt/c1t50014EE20818C0ECd0s0.dd        -
>/dev/lofi/7              /alt/c1t50014EE25D6CDE92d0s0.dd        -
>/dev/lofi/8              /alt/c1t50014EE25D6DCF0Ed0s0.dd        -
>/dev/lofi/9              /alt/c1t50014EE25D6DDBC7d0s0.dd        -
>/dev/lofi/10             /alt/c1t50014EE2B2C27AE2d0s0.dd        -
>/dev/lofi/11             /alt/c1t50014EE2B2C380C3d0s0.dd        -
>/dev/lofi/12             /alt/c1t50014EE6033DD776d0s0.dd        -
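>
>(The images were attached one at a time with lofiadm -a, e.g.
>  root@represent:/alt# lofiadm -a /alt/c1t50014EE0037B0FF3d0s0.dd
>  /dev/lofi/1
>and so on for the remaining files.)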
>
>The zpool is identifiable
>root@represent:/alt# zpool import -d /dev/lofi
>   pool: oldtank
>     id: 13463599998639852818
>  state: ONLINE
> status: One or more devices are missing from the system.
> action: The pool can be imported using its name or numeric identifier.
>   see: http://illumos.org/msg/ZFS-8000-2Q
> config:
>        oldtank                  ONLINE
>          raidz2-0               ONLINE
>            /dev/lofi/4          ONLINE
>            /dev/lofi/2          ONLINE
>            /dev/lofi/1          ONLINE
>            /dev/lofi/3          ONLINE
>            /dev/lofi/8          ONLINE
>            /dev/lofi/10         ONLINE
>            /dev/lofi/11         ONLINE
>            /dev/lofi/7          ONLINE
>            /dev/lofi/6          ONLINE
>            /dev/lofi/9          ONLINE
>            /dev/lofi/5          ONLINE
>            /dev/lofi/12         ONLINE
>        cache
>          c1t50015178F36728A3d0
>          c1t50015178F3672944d0
>
>And I import the zpool (this command never exits)
>root@represent:/alt# zpool import -d /dev/lofi oldtank
>
>In another window it is evident that the system has managed to add the
>zpool
>                        extended device statistics       ---- errors ---
>    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b s/w h/w trn tot device
>  101.1    0.0    1.7    0.0  0.3  2.8    2.9   27.5  28 100   0   0   0   0 lofi1
>  118.6    0.0    1.3    0.0  0.3  2.9    2.4   24.3  28 100   0   0   0   0 lofi2
>  123.8    0.0    1.0    0.0  0.3  2.9    2.7   23.3  31  94   0   0   0   0 lofi3
>  133.1    0.0    1.1    0.0  0.4  2.8    2.7   20.7  34  92   0   0   0   0 lofi4
>  144.8    0.0    1.6    0.0  0.2  2.7    1.3   18.7  17  97   0   0   0   0 lofi5
>  132.3    0.0    1.2    0.0  0.2  2.5    1.4   18.7  17  95   0   0   0   0 lofi6
>  100.3    0.0    1.0    0.0  0.2  2.7    1.9   26.6  18 100   0   0   0   0 lofi7
>  117.3    0.0    1.2    0.0  0.2  2.7    1.9   23.3  21  99   0   0   0   0 lofi8
>  142.1    0.0    1.0    0.0  0.3  2.5    1.9   17.3  26  85   0   0   0   0 lofi9
>  142.8    0.0    1.0    0.0  0.2  2.5    1.5   17.4  20  83   0   0   0   0 lofi10
>  144.1    0.0    0.9    0.0  0.3  2.7    2.0   19.0  28  96   0   0   0   0 lofi11
>  101.8    0.0    0.8    0.0  0.2  2.7    2.2   26.1  21  96   0   0   0   0 lofi12
> 1502.1    0.0   13.7    0.0 3229.1 35.3 2149.7  23.5 100 100   0   0   0   0 oldtank
>...
>  195.6    0.0    5.8    0.0  0.0  6.1    0.0   31.4   0  95   0   0   0   0 c0t50014EE25F8307D2d0
>  200.9    0.0    5.8    0.0  0.0  7.5    0.0   37.2   0  97   0   0   0   0 c0t50014EE2B4CAA6D3d0
>  200.1    0.0    5.8    0.0  0.0  7.0    0.0   35.1   0  97   0   0   0   0 c0t50014EE25F74EC15d0
>  197.9    0.0    5.9    0.0  0.0  7.2    0.0   36.2   0  96   0   0   0   0 c0t50014EE25F74DD46d0
>  198.1    0.0    5.5    0.0  0.0  6.7    0.0   34.0   0  95   0   0   0   0 c0t50014EE2B4D7C1C9d0
>  202.4    0.0    5.9    0.0  0.0  6.9    0.0   34.1   0  97   0   0   0   0 c0t50014EE2B4CA8F9Bd0
>  223.9    0.0    6.9    0.0  0.0  8.8    0.0   39.1   0 100   0   0   0   0 c0t50014EE20A2DAE1Ed0
>  201.6    0.0    5.9    0.0  0.0  6.6    0.0   32.9   0  96   0   0   0   0 c0t50014EE25F74F90Fd0
>  210.9    0.0    6.0    0.0  0.0  8.7    0.0   41.5   0 100   0   0   0   0 c0t50014EE20C083E31d0
>  222.9    0.0    6.5    0.0  0.0  9.1    0.0   40.7   0  99   0   0   0   0 c0t50014EE2B6B2FA22d0
>  214.4    0.0    6.1    0.0  0.0  8.9    0.0   41.6   0 100   0   0   0   0 c0t50014EE20C07F3F3d0
>  222.1    0.0    6.5    0.0  0.0  9.7    0.0   43.6   0 100   0   0   0   0 c0t50014EE2615D7B2Ed0
>  219.1    0.0    6.2    0.0  0.0  9.5    0.0   43.3   0  99   0   2   8  10 c0t50014EE2B6B3FB99d0
>  217.6    0.0    6.2    0.0  0.0  8.6    0.0   39.3   0  98   0   0   0   0 c0t50014EE20C07E598d0
>  216.4    0.0    6.1    0.0  0.0  7.7    0.0   35.5   0 100   0   0   0   0 c0t50014EE2615D7ADAd0
>  216.4    0.0    6.2    0.0  0.0  9.1    0.0   42.1   0 100   0   0   0   0 c0t50014EE2B6B3F65Ed0
>...
> 3360.1    0.0   97.3    0.0 447.0 129.1  133.0   38.4 100 100   0   0   0   0 tank
>
>But eventually the system panics, core dumps and reboots.
>
>Looking at the core dump I get the following
>> ::status
>debugging crash dump vmcore.0 (64-bit) from represent
>operating system: 5.11 oi_151a9 (i86pc)
>image uuid: 19b88adb-6510-e6e9-a723-95f098c85108
>panic message: I/O to pool 'oldtank' appears to be hung.
>dump content: kernel pages only
>> $c
>vpanic()
>vdev_deadman+0xda(ffffff01ceb3e800)
>vdev_deadman+0x37(ffffff01cff9f000)
>vdev_deadman+0x37(ffffff01d53c4800)
>spa_deadman+0x69(ffffff01ce186580)
>cyclic_softint+0xdc(fffffffffbc30640, 0)
>cbe_low_level+0x17()
>av_dispatch_softvect+0x5f(2)
>dispatch_softint+0x34(0, 0)
>switch_sp_and_call+0x13()
>dosoftint+0x59(ffffff0007a05ad0)
>do_interrupt+0x114(ffffff0007a05ad0, 1)
>_interrupt+0xba()
>mach_cpu_idle+6()
>cpu_idle+0xaf()
>cpu_idle_adaptive+0x19()
>idle+0x114()
>thread_start+8()
>>
>
>I have been able to reproduce this problem several times, although the
>import has managed to get far enough to rename the original zpool.
>
>Has anyone else encountered this issue with lofi-mounted zpools?
>I'm using mpt_sas with SATA drives, and I _DO_ have error counters
>climbing for some of those drives; could that be the cause?
>Any other ideas?
>
>I'd greatly appreciate any suggestions.
>
>Thanks!
>Dan
>

From the 'zpool import' output I see the pool also refers to cache disks. Are those 
device names actually available (present and not in use by another pool)? Can you 
remove them from the pool after you've imported it?
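
For example, something along these lines once the import completes ('zpool remove' 
handles cache devices; the device names are the ones from your listing above):

  zpool remove oldtank c1t50015178F36728A3d0 c1t50015178F3672944d0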

Consider importing with '-N' so that filesystems from this pool are not automounted 
(and autoshared), and with '-R /a' or some other empty/absent altroot path to avoid 
conflicts when you do mount them (an altroot import also keeps the pool out of the 
zpool.cache file, so it won't be auto-imported later). At the least, mounting and 
sharing is a (partially) kernel-side operation that might time out...
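
As a rough sketch, reusing the paths from your report, that import would look 
something like:

  zpool import -d /dev/lofi -N -R /a oldtank

after which datasets can be mounted by hand (under /a) once the import has settled.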

Also, you might want to tune or disable the deadman timer and increase other 
acceptable latencies (see OI wiki or other resources).
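
Purely as an illustration (tunable names differ between illumos builds, so verify 
them on your kernel with mdb before relying on this), the deadman is usually relaxed 
via /etc/system along these lines:

  * assumes these ZFS tunables exist on this build
  set zfs:zfs_deadman_enabled = 0
  * or, instead of disabling it, raise the hang threshold
  * (zfs_deadman_synctime on older builds, zfs_deadman_synctime_ms on newer ones)
  set zfs:zfs_deadman_synctime_ms = 6000000

followed by a reboot for /etc/system changes to take effect.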

How much RAM does the box have? You pay for ARC caching twice, once for oldtank and 
once for the pool that hosts the dd files, so maybe tune down primary/secondary 
caching on the filesystem that stores the images.
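
For instance, assuming the images sit on a dataset named something like tank/alt 
(adjust to your actual layout):

  zfs set primarycache=metadata tank/alt
  zfs set secondarycache=none tank/alt

so the ARC/L2ARC is left mostly to oldtank itself.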

How did you get into this recovery situation? Maybe oldtank is corrupted and is 
trying to recover during import? E.g. I once had a deduped pool where I deleted lots 
of data and the kernel wanted more RAM than I had to process the delete queue of 
blocks; it took dozens of panic-reboots to complete (progress can be tracked with 
zdb).
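
If you suspect something similar here, one rough way to peek at the pool without 
importing it is zdb in exported-pool mode, e.g. (slow, and the exact options may 
vary by build, so treat this as a sketch):

  zdb -e -p /dev/lofi -bb oldtank

which traverses the pool and prints block statistics, including deferred frees.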

Alternatively, you can import the pool read-only, which may avoid these recovery 
attempts altogether if you only want to retrieve the data.
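
Something along these lines, combined with the flags above:

  zpool import -d /dev/lofi -N -R /a -o readonly=on oldtank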

Jim

--
Typos courtesy of K-9 Mail on my Samsung Android
