[zfs-discuss] ZFS recovery tool for Solaris 10 with a dead slog?
I just had an SSD blow out on me, taking a v10 zpool with it. The pool currently shows up as UNAVAIL, missing device. The system is currently running U9, which has `import -F`, but not `import -m`. My understanding is the pool would need to be version >= 19 for that to work regardless. I have copies of zpool.cache from when the SSD was alive, and its GUID. Looking at https://github.com/pjjw/logfix it appears all I really need to do is mock up a new log device and update the labels in ZFS. That's all. However, logfix appears to want some version of Nevada. Does anyone have any tools for Solaris 10 that will accomplish this? Barring that, I suppose I could put SXCE b130 on it and give logfix a shot.

Cheers.

--
bdha
cyberpunk is dead. long live cyberpunk.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
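[Editor's note: the "mock up a new log device" step mentioned above can be sketched in shell. This is a hedged sketch, not a tested recovery procedure: the 64 MiB size and all paths are placeholders (the stand-in must match the dead SSD's size), and the label rewrite itself is exactly what logfix automates.]

```shell
# Sketch only: create a blank stand-in for the dead slog as a sparse file.
# 64 MiB is a placeholder size; match the dead SSD in practice.
dd if=/dev/zero of=/tmp/fake-slog bs=1 count=0 seek=$((64 * 1024 * 1024)) 2>/dev/null
ls -l /tmp/fake-slog

# On the affected system, the remaining steps (not runnable here) would be roughly:
#   zdb -l /tmp/fake-slog    # confirm the stand-in carries no ZFS labels yet
#   (run logfix, or equivalent label surgery, to stamp the old slog GUID
#    recorded in the saved zpool.cache onto the stand-in)
#   zpool import -F <pool>   # retry the import with the mocked-up log present
```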
Re: [zfs-discuss] sharesmb should be ignored if filesystem is not mounted
> On 10/28/10 08:40 AM, Richard L. Hamilton wrote:
> > I have sharesmb=on set for a bunch of filesystems, including three that
> > weren't mounted. Nevertheless, all of those are advertised. Needless to
> > say, the ones that aren't mounted can't be accessed remotely, even though,
> > since advertised, it looks like they could be.
>
> When you say advertised, do you mean that it appears in /etc/dfs/sharetab
> when the dataset is not mounted, and/or that you can see it from a client
> with 'net view'? I'm using a recent build and I see the smb share disappear
> from both when the dataset is unmounted.

I could see it in Finder on a Mac client; presumably, were I on a Windows client, it would have appeared with 'net view'. I've since turned off the sharesmb property on those filesystems, so I may need to reboot (which I'd much rather not) to re-create the problem. But if recent builds don't have the problem, that's the main thing.

-- This message posted from opensolaris.org
Re: [zfs-discuss] ZFS send/receive and locking
The problem is not with how the replication is done. The locking happens during basic zfs operations. We noticed the following on server2 (which is quite busy serving maildirs):

  zfs create tank/newfs
  rsync 4GB from someotherserver to /tank/newfs
  zfs destroy tank/newfs

Destroying newfs took more than 30 minutes, and during this time the production filesystem was inaccessible via NFS. We got a hint in private mail that we should try the following experiment:

  zfs create tank/tmp
  dd of=/tank/tmp/data [...]
  zpool scrub tank
  zfs destroy tank/tmp

At this point the zfs destroy command gets suspended. Issuing 'zpool scrub -s' causes the destroy to finish immediately. The other part of the hint is that it is an issue with the I/O scheduler and that we should upgrade. We will provide the details as soon as we have sorted this out.
Re: [zfs-discuss] [OpenIndiana-discuss] format dumps the core
I somehow doubt the problem is the same - looks more like cfgadm can't see my devices. I first tried with directly attached storage (one SAS cable to each disk). Now that has been replaced with a SAS expander (4x SAS to the expander, 12 drives on the expander). format still dumps core, and cfgadm doesn't seem to like my drives somehow. Any ideas?

r...@tos-backup:~# format
Searching for disks...Arithmetic Exception (core dumped)
r...@tos-backup:~# ls -l /dev/rdsk/core
-rw------- 1 root root 2463431 2010-11-04 17:41 /dev/rdsk/core
r...@tos-backup:~# pstack /dev/rdsk/core
core '/dev/rdsk/core' of 1217: format
 fee62e4a UDiv     (4, 0, 8046c80, 80469a0, 8046a30, 8046a50) + 2a
 08079799 auto_sense (4, 0, 8046c80, 0) + 281
 080751a6 add_device_to_disklist (80479c0, 80475c0, fefd995b, feffb140) + 62a
 080746ff do_search (0, 1, 8047e28, 8066576) + 273
 0806658d main     (1, 8047e58, 8047e60, 8047e4c) + c1
 0805774d _start   (1, 8047f00, 0, 8047f07, 8047f0b, 8047f1f) + 7d
r...@tos-backup:~# zpool status
  pool: rpool
 state: ONLINE
 scan: none requested
config:

        NAME                       STATE   READ WRITE CKSUM
        rpool                      ONLINE     0     0     0
          c4t5000C50019891202d0s0  ONLINE     0     0     0

errors: No known data errors
r...@tos-backup:~# cfgadm -a
Ap_Id                    Type        Receptacle  Occupant      Condition
c6                       scsi-sas    connected   configured    unknown
c6::es/ses0              ESI         connected   configured    unknown
c6::smp/expd0            smp         connected   configured    unknown
c6::w5000c50019891202,0  disk-path   connected   configured    unknown
c6::w5000c50019890fed,0  disk-path   connected   configured    unknown
c7                       scsi-sas    connected   unconfigured  unknown
usb8/1                   unknown     empty       unconfigured  ok
usb8/2                   unknown     empty       unconfigured  ok
usb9/1                   unknown     empty       unconfigured  ok
usb9/2                   usb-device  connected   configured    ok
usb10/1                  unknown     empty       unconfigured  ok
usb10/2                  unknown     empty       unconfigured  ok
usb10/3                  unknown     empty       unconfigured  ok
usb10/4                  unknown     empty       unconfigured  ok
usb11/1                  unknown     empty       unconfigured  ok
usb11/2                  unknown     empty       unconfigured  ok
usb12/1                  unknown     empty       unconfigured  ok
usb12/2                  unknown     empty       unconfigured  ok
usb13/1                  unknown     empty       unconfigured  ok
usb13/2                  unknown     empty       unconfigured  ok
usb14/1                  usb-hub     connected   configured    ok
usb14/1.1                unknown     empty       unconfigured  ok
usb14/1.2                unknown     empty       unconfigured  ok
usb14/1.3                usb-hub     connected   configured    ok
usb14/1.3.1              usb-device  connected   configured    ok
usb14/1.3.2              unknown     empty       unconfigured  ok
usb14/1.3.3              unknown     empty       unconfigured  ok
usb14/1.3.4              unknown     empty       unconfigured  ok
usb14/1.4                unknown     empty       unconfigured  ok
usb14/2                  unknown     empty       unconfigured  ok
usb14/3                  unknown     empty       unconfigured  ok
usb14/4                  unknown     empty       unconfigured  ok
usb14/5                  unknown     empty       unconfigured  ok
usb14/6                  unknown     empty       unconfigured  ok
r...@tos-backup:~#

- Original Message -
Moazam,

Thanks for the update. I hope this is Roy's issue too. I can see that format would freak out over ext3, but it shouldn't core dump.

Cindy

On 11/02/10 17:00, Moazam Raja wrote:
Fixed! It turns out the problem was that we pulled these two disks from a Linux box and they were formatted with ext3 on partition 0 for the whole disk, which was somehow causing 'format' to freak out. So we fdisk'ed the p0 slice to delete the Linux partition and then created a SOLARIS2 type partition on it. It worked, and no more crash during the format command.

Cindy, please let the format team know about this, since I'm sure others will also run into this problem at some point if they have a mixed Linux/Solaris environment.

-Moazam

On Tue, Nov 2, 2010 at 3:15 PM, Cindy Swearingen cindy.swearin...@oracle.com wrote:
Hi Moazam,

The initial diagnosis is that the LSI controller is reporting bogus information. It looks like Roy is using a similar controller. You might report this
Re: [zfs-discuss] [OpenIndiana-discuss] format dumps the core
also, this last test was with two 160gig drives only; the 2TB drives and the SSD are all disconnected...

- Original Message -
I somehow doubt the problem is the same - looks more like cfgadm can't see my devices. I first tried with directly attached storage (one SAS cable to each disk). Now that has been replaced with a SAS expander (4x SAS to the expander, 12 drives on the expander). format still dumps core, and cfgadm doesn't seem to like my drives somehow. Any ideas?

[format core dump, pstack, zpool status, and cfgadm -a output quoted above]

- Original Message -
Moazam,

Thanks for the update. I hope this is Roy's issue too. I can see that format would freak out over ext3, but it shouldn't core dump.

Cindy

On 11/02/10 17:00, Moazam Raja wrote:
Fixed! It turns out the problem was that we pulled these two disks from a Linux box and they were formatted with ext3 on partition 0 for the whole disk, which was somehow causing 'format' to freak out. So we fdisk'ed the p0 slice to delete the Linux partition and then created a SOLARIS2 type partition on it. It worked, and no more crash during the format command.

Cindy, please let the format team know about this, since I'm sure others will also run into this problem at some point if they have a mixed Linux/Solaris environment.

-Moazam

On Tue, Nov 2, 2010 at 3:15 PM, Cindy Swearingen cindy.swearin...@oracle.com wrote:
Hi Moazam,

The initial diagnosis is that the LSI controller is reporting bogus information. It looks like Roy is using a similar controller. You might report this problem to LSI, but I will pass this issue along to the format folks.

Thanks,
Cindy

On 11/02/10 15:26, Moazam Raja wrote:
I'm having the same problem after adding 2 SSD disks to my machine. The controller is LSI SAS9211-8i PCI Express.

# format
Searching for disks...Arithmetic Exception (core dumped)
# pstack core.format.1016
core 'core.format.1016' of 1016: format
 fee62e4a UDiv     (4, 0, 8046bf0, 8046910, 80469a0, 80469c0) + 2a
 08079799 auto_sense (4, 0, 8046bf0, 1c8) + 281
 080751a6 add_device_to_disklist (8047930, 8047530, feffb8f4, 804716c) + 62a
 080746ff do_search (0, 1, 8047d98, 8066576) + 273
 0806658d main     (1, 8047dd0, 8047dd8, 8047d8c) + c1
 0805774d _start   (1, 8047e88, 0, 8047e8f, 8047e99, 8047ead) + 7d

I'm on b147.

# uname -a
SunOS geneva5 5.11 oi_147 i86pc i386 i86pc Solaris

On Tue, Nov 2, 2010 at 7:17 AM, Joerg Schilling joerg.schill...@fokus.fraunhofer.de wrote:
Roy Sigurd Karlsbakk
[zfs-discuss] zfs record size implications
I have read some conflicting things regarding the ZFS record size setting. Could you guys verify/correct these statements? (They reflect my understanding, not necessarily the facts!)

1) The ZFS record size (volblocksize) in a zvol is the unit that dedup happens at. So, for a volume that is shared to an NTFS machine, if the NTFS cluster size is smaller than the zvol record size, dedup will get dramatically worse, since it won't dedup clusters that are positioned differently within zvol records.

2) For shared folders, the record size is the allocation unit size, so large records can waste a substantial amount of space in cases with lots of very small files. This is different from a HW RAID stripe size, which only affects performance, not space usage.

3) Although small record sizes have a large RAM overhead for dedup tables, as long as the dedup table working set fits in RAM, and the rest fits in L2ARC, performance will be good.

Thanks,
Rob
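[Editor's note: the inverse relationship in statement 3 between record size and dedup-table RAM can be made concrete. A back-of-the-envelope sketch, assuming the often-cited figure of roughly 320 bytes of core per DDT entry (an approximation, not an exact number) and entirely unique data:]

```python
def ddt_ram_bytes(unique_data_bytes, recordsize, bytes_per_entry=320):
    """Rough in-core dedup table size: one entry per unique block,
    at an assumed ~320 bytes of RAM per entry."""
    return (unique_data_bytes // recordsize) * bytes_per_entry

TIB = 1024 ** 4
GIB = 1024 ** 3

# 10 TiB of unique data: 128 KiB records vs 8 KiB records
print(ddt_ram_bytes(10 * TIB, 128 * 1024) / GIB)  # 25.0 GiB
print(ddt_ram_bytes(10 * TIB, 8 * 1024) / GIB)    # 400.0 GiB
```

Halving the record size doubles the entry count, so the 8 KiB case needs 16x the RAM of the 128 KiB case for the same data; whatever portion of the table doesn't fit in RAM has to be served from L2ARC or disk.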
Re: [zfs-discuss] sharesmb should be ignored if filesystem is not mounted
On 11/ 4/10 03:54 AM, Richard L. Hamilton wrote:
> > > I have sharesmb=on set for a bunch of filesystems, including three that
> > > weren't mounted. Nevertheless, all of those are advertised. Needless to
> > > say, the one that isn't mounted can't be accessed remotely, even though
> > > since advertised, it looks like it could be.
> >
> > When you say advertised do you mean that it appears in /etc/dfs/sharetab
> > when the dataset is not mounted and/or you can see it from a client with
> > 'net view'? I'm using a recent build and I see the smb share disappear
> > from both when the dataset is unmounted.
>
> I could see it in Finder on a Mac client; presumably were I on a Windows
> client, it would have appeared with net view. I've since turned off the
> sharesmb property on those filesystems, so I may need to reboot (which I'd
> much rather not) to re-create the problem.

That's fine. If you see it again, try:

  svcadm restart smb/server

My guess is that smbd had stale cache entries for those shares. This area was reworked in snv_149 and that smbd cache was eliminated.

> But if recent builds don't have the problem, that's the main thing.

The following update was pushed to snv_149:

PSARC/2010/154 Unified sharing system call
6968897 sharefs: Unified sharing system call

Alan