[zfs-discuss] ZFS recovery tool for Solaris 10 with a dead slog?

2010-11-04 Thread Bryan Horstmann-Allen
I just had an SSD blow out on me, taking a v10 zpool with it. The pool
currently shows up as UNAVAIL, missing device.

The system is currently running U9, which has `import -F`, but not `import -m`.
My understanding is the pool would need to be version >= 19 for that to work regardless.
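
(For comparison, on a build that does have it, a missing-log import should look
roughly like this -- pool and device names below are placeholders, not mine:)

zpool import -m tank        # import even though the log device is missing
zpool status tank           # the dead slog should show up as removed/unavailable
zpool remove tank c1t2d0    # drop the failed log device (log removal needs pool version >= 19)
zpool add tank log c1t3d0   # optionally attach a replacement SSD later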

I have copies of zpool.cache from when the SSD was alive, and its GUID.
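
(For the record, the GUID and pool config can presumably be read back out of the
saved cache file and the surviving disks' labels with zdb -- paths and pool name
here are just examples:)

zdb -l /dev/rdsk/c1t0d0s0                   # print the vdev labels on a surviving pool disk
zdb -C -U /var/tmp/zpool.cache.saved tank   # dump the cached pool config, including the slog GUID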

Looking at https://github.com/pjjw/logfix it appears all I really need to do is
mock up a new log device and update the labels in ZFS. However, logfix
appears to want some version of Nevada.

Does anyone have any tools for Solaris 10 that will accomplish this?

Barring that, I suppose I could put SXCE b130 on it and give logfix a shot.

Cheers.
-- 
bdha
cyberpunk is dead. long live cyberpunk.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] sharesmb should be ignored if filesystem is not mounted

2010-11-04 Thread Richard L. Hamilton
 On 10/28/10 08:40 AM, Richard L. Hamilton wrote:
  I have sharesmb=on set for a bunch of filesystems,
  including three that weren't mounted. Nevertheless,
  all of those are advertised. Needless to say, the ones
  that aren't mounted can't be accessed remotely, even
  though, since they are advertised, it looks like they
  could be.
 
 When you say advertised, do you mean that it appears in
 /etc/dfs/sharetab when the dataset is not mounted, and/or
 that you can see it from a client with 'net view'?
 
 I'm using a recent build and I see the smb share disappear
 from both when the dataset is unmounted.

I could see it in Finder on a Mac client; presumably were
I on a Windows client, it would have appeared with net view.
I've since turned off the sharesmb property on those filesystems,
so I may need to reboot (which I'd much rather not) to re-create
the problem.

But if recent builds don't have the problem, that's the main thing.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive and locking

2010-11-04 Thread Byte Internet
The problem is not with how the replication is done.  The locking happens 
during the basic zfs operations. 

We noticed:
on server2 (which is quite busy serving maildirs) we did

zfs create tank/newfs
rsync 4GB from someotherserver to /tank/newfs
zfs destroy tank/newfs

Destroying newfs took more than 30 minutes, and during this time the production 
filesystem was inaccessible via NFS.

We got a hint off-list that we should try the following experiment:

zfs create tank/tmp  
dd of=/tank/tmp/data [...]
zpool scrub tank 
zfs destroy tank/tmp

At this point the zfs destroy command gets suspended.
Issuing
zpool scrub -s tank
causes the destroy to finish immediately.
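
(Illustrative commands for seeing where the destroy is sitting while it hangs,
using the names from the example above:)

zpool status tank                              # confirm the scrub is still running
pstack `pgrep -xn zfs`                         # user-level stack of the hung zfs destroy
echo "::pgrep zfs | ::walk thread | ::findstack -v" | mdb -k   # kernel stacks for the same process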

The other part of the hint is that it is an issue with the I/O scheduler and that
we should upgrade.
We will provide the details as soon as we have sorted this out.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [OpenIndiana-discuss] format dumps the core

2010-11-04 Thread Roy Sigurd Karlsbakk
I somehow doubt the problem is the same - looks more like cfgadm can't see my 
devices. I first tried with directly attached storage (1 SAS cable to each 
disk). Now, that has been replaced with a SAS expander (4xSAS to the expander, 
12 drives on the expander). Format still dumps the core, and cfgadm doesn't 
seem to like my drives somehow.

Any ideas?

r...@tos-backup:~# format
Searching for disks...Arithmetic Exception (core dumped)
r...@tos-backup:~# ls -l /dev/rdsk/core
-rw--- 1 root root 2463431 2010-11-04 17:41 /dev/rdsk/core
r...@tos-backup:~# pstack /dev/rdsk/core
core '/dev/rdsk/core' of 1217:  format
 fee62e4a UDiv (4, 0, 8046c80, 80469a0, 8046a30, 8046a50) + 2a
 08079799 auto_sense (4, 0, 8046c80, 0) + 281
 080751a6 add_device_to_disklist (80479c0, 80475c0, fefd995b, feffb140) + 62a
 080746ff do_search (0, 1, 8047e28, 8066576) + 273
 0806658d main (1, 8047e58, 8047e60, 8047e4c) + c1
 0805774d _start   (1, 8047f00, 0, 8047f07, 8047f0b, 8047f1f) + 7d
r...@tos-backup:~# zpool status
  pool: rpool
 state: ONLINE
 scan: none requested
config:

NAME   STATE READ WRITE CKSUM
rpool  ONLINE   0 0 0
  c4t5000C50019891202d0s0  ONLINE   0 0 0

errors: No known data errors
r...@tos-backup:~# cfgadm -a
Ap_Id                    Type         Receptacle   Occupant     Condition
c6                       scsi-sas     connected    configured   unknown
c6::es/ses0              ESI          connected    configured   unknown
c6::smp/expd0            smp          connected    configured   unknown
c6::w5000c50019891202,0  disk-path    connected    configured   unknown
c6::w5000c50019890fed,0  disk-path    connected    configured   unknown
c7                       scsi-sas     connected    unconfigured unknown
usb8/1                   unknown      empty        unconfigured ok
usb8/2                   unknown      empty        unconfigured ok
usb9/1                   unknown      empty        unconfigured ok
usb9/2                   usb-device   connected    configured   ok
usb10/1                  unknown      empty        unconfigured ok
usb10/2                  unknown      empty        unconfigured ok
usb10/3                  unknown      empty        unconfigured ok
usb10/4                  unknown      empty        unconfigured ok
usb11/1                  unknown      empty        unconfigured ok
usb11/2                  unknown      empty        unconfigured ok
usb12/1                  unknown      empty        unconfigured ok
usb12/2                  unknown      empty        unconfigured ok
usb13/1                  unknown      empty        unconfigured ok
usb13/2                  unknown      empty        unconfigured ok
usb14/1                  usb-hub      connected    configured   ok
usb14/1.1                unknown      empty        unconfigured ok
usb14/1.2                unknown      empty        unconfigured ok
usb14/1.3                usb-hub      connected    configured   ok
usb14/1.3.1              usb-device   connected    configured   ok
usb14/1.3.2              unknown      empty        unconfigured ok
usb14/1.3.3              unknown      empty        unconfigured ok
usb14/1.3.4              unknown      empty        unconfigured ok
usb14/1.4                unknown      empty        unconfigured ok
usb14/2                  unknown      empty        unconfigured ok
usb14/3                  unknown      empty        unconfigured ok
usb14/4                  unknown      empty        unconfigured ok
usb14/5                  unknown      empty        unconfigured ok
usb14/6                  unknown      empty        unconfigured ok
r...@tos-backup:~# 
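
(For what it's worth, the core can be poked at a little further with mdb --
illustrative only:)

mdb /dev/rdsk/core
::status       # at the mdb prompt: shows the terminating signal (SIGFPE, the arithmetic exception)
$C             # backtrace with frame pointers, same frames as the pstack output above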


- Original Message -
 Moazam,
 
 Thanks for the update. I hope this is Roy's issue too.
 
 I can see that format would freak out over ext3, but it
 shouldn't core dump.
 
 Cindy
 
 On 11/02/10 17:00, Moazam Raja wrote:
  Fixed!
 
  It turns out the problem was that we pulled these two disks from a
  Linux box and they were formatted with ext3 on partition 0 for the
  whole disk, which was somehow causing 'format' to freak out.
 
  So, we fdisk'ed the p0 slice to delete the Linux partition and then
  created a SOLARIS2 type partition on it. It worked and no more crash
  during format command.
 
  Cindy, please let the format team know about this since I'm sure
  others will also run into this problem at some point if they have a
  mixed Linux/Solaris environment.
 
 
  -Moazam
 
  On Tue, Nov 2, 2010 at 3:15 PM, Cindy Swearingen
  cindy.swearin...@oracle.com wrote:
  Hi Moazam,
 
  The initial diagnosis is that the LSI controller is reporting bogus
  information. It looks like Roy is using a similar controller.
 
 You might report this problem to LSI, but I will pass this issue along to the format folks.

Re: [zfs-discuss] [OpenIndiana-discuss] format dumps the core

2010-11-04 Thread Roy Sigurd Karlsbakk
also, this last test was with two 160gig drives only, the 2TB drives and the 
SSD are all disconnected...

- Original Message -
 I somehow doubt the problem is the same - looks more like cfgadm can't
 see my devices. I first tried with directly attached storage (1 SAS
 cable to each disk). Now, that has been replaced with a SAS expander
 (4xSAS to the expander, 12 drives on the expander). Format still dumps
 the core, and cfgadm doesn't seem to like my drives somehow.
 
 Any ideas?
 
 r...@tos-backup:~# format
 Searching for disks...Arithmetic Exception (core dumped)
 r...@tos-backup:~# ls -l /dev/rdsk/core
 -rw--- 1 root root 2463431 2010-11-04 17:41 /dev/rdsk/core
 r...@tos-backup:~# pstack /dev/rdsk/core
 core '/dev/rdsk/core' of 1217: format
 fee62e4a UDiv (4, 0, 8046c80, 80469a0, 8046a30, 8046a50) + 2a
 08079799 auto_sense (4, 0, 8046c80, 0) + 281
 080751a6 add_device_to_disklist (80479c0, 80475c0, fefd995b, feffb140)
 + 62a
 080746ff do_search (0, 1, 8047e28, 8066576) + 273
 0806658d main (1, 8047e58, 8047e60, 8047e4c) + c1
 0805774d _start (1, 8047f00, 0, 8047f07, 8047f0b, 8047f1f) + 7d
 r...@tos-backup:~# zpool status
 pool: rpool
 state: ONLINE
 scan: none requested
 config:
 
 NAME STATE READ WRITE CKSUM
 rpool ONLINE 0 0 0
 c4t5000C50019891202d0s0 ONLINE 0 0 0
 
 errors: No known data errors
 r...@tos-backup:~# cfgadm -a
 Ap_Id Type Receptacle Occupant Condition
 c6 scsi-sas connected configured unknown
 c6::es/ses0 ESI connected configured unknown
 c6::smp/expd0 smp connected configured unknown
 c6::w5000c50019891202,0 disk-path connected configured unknown
 c6::w5000c50019890fed,0 disk-path connected configured unknown
 c7 scsi-sas connected unconfigured unknown
 usb8/1 unknown empty unconfigured ok
 usb8/2 unknown empty unconfigured ok
 usb9/1 unknown empty unconfigured ok
 usb9/2 usb-device connected configured ok
 usb10/1 unknown empty unconfigured ok
 usb10/2 unknown empty unconfigured ok
 usb10/3 unknown empty unconfigured ok
 usb10/4 unknown empty unconfigured ok
 usb11/1 unknown empty unconfigured ok
 usb11/2 unknown empty unconfigured ok
 usb12/1 unknown empty unconfigured ok
 usb12/2 unknown empty unconfigured ok
 usb13/1 unknown empty unconfigured ok
 usb13/2 unknown empty unconfigured ok
 usb14/1 usb-hub connected configured ok
 usb14/1.1 unknown empty unconfigured ok
 usb14/1.2 unknown empty unconfigured ok
 usb14/1.3 usb-hub connected configured ok
 usb14/1.3.1 usb-device connected configured ok
 usb14/1.3.2 unknown empty unconfigured ok
 usb14/1.3.3 unknown empty unconfigured ok
 usb14/1.3.4 unknown empty unconfigured ok
 usb14/1.4 unknown empty unconfigured ok
 usb14/2 unknown empty unconfigured ok
 usb14/3 unknown empty unconfigured ok
 usb14/4 unknown empty unconfigured ok
 usb14/5 unknown empty unconfigured ok
 usb14/6 unknown empty unconfigured ok
 r...@tos-backup:~#
 
 
 - Original Message -
  Moazam,
 
  Thanks for the update. I hope this is Roy's issue too.
 
  I can see that format would freak out over ext3, but it
  shouldn't core dump.
 
  Cindy
 
  On 11/02/10 17:00, Moazam Raja wrote:
   Fixed!
  
   It turns out the problem was that we pulled these two disks from a
   Linux box and they were formatted with ext3 on partition 0 for the
   whole disk, which was somehow causing 'format' to freak out.
  
   So, we fdisk'ed the p0 slice to delete the Linux partition and
   then
   created a SOLARIS2 type partition on it. It worked and no more
   crash
   during format command.
  
   Cindy, please let the format team know about this since I'm sure
   others will also run into this problem at some point if they have
   a
   mixed Linux/Solaris environment.
  
  
   -Moazam
  
   On Tue, Nov 2, 2010 at 3:15 PM, Cindy Swearingen
   cindy.swearin...@oracle.com wrote:
   Hi Moazam,
  
   The initial diagnosis is that the LSI controller is reporting
   bogus
   information. It looks like Roy is using a similar controller.
  
   You might report this problem to LSI, but I will pass this issue
   along to the format folks.
  
   Thanks,
  
   Cindy
  
   On 11/02/10 15:26, Moazam Raja wrote:
   I'm having the same problem after adding 2 SSD disks to my
   machine.
   The controller is LSI SAS9211-8i PCI Express.
  
   # format
   Searching for disks...Arithmetic Exception (core dumped)
  
  
  
   # pstack core.format.1016
   core 'core.format.1016' of 1016: format
fee62e4a UDiv (4, 0, 8046bf0, 8046910, 80469a0, 80469c0) + 2a
08079799 auto_sense (4, 0, 8046bf0, 1c8) + 281
080751a6 add_device_to_disklist (8047930, 8047530, feffb8f4,
804716c) +
   62a
080746ff do_search (0, 1, 8047d98, 8066576) + 273
0806658d main (1, 8047dd0, 8047dd8, 8047d8c) + c1
0805774d _start (1, 8047e88, 0, 8047e8f, 8047e99, 8047ead) + 7d
  
  
   I'm on b147.
  
   # uname -a
   SunOS geneva5 5.11 oi_147 i86pc i386 i86pc Solaris
  
  
   On Tue, Nov 2, 2010 at 7:17 AM, Joerg Schilling
   joerg.schill...@fokus.fraunhofer.de wrote:
   Roy Sigurd Karlsbakk 

[zfs-discuss] zfs record size implications

2010-11-04 Thread Rob Cohen
I have read some conflicting things regarding the ZFS record size setting.
Could you guys verify/correct these statements:

(These reflect my understanding, not necessarily the facts!)

1) The ZFS record size in a zvol (the volblocksize) is the unit at which dedup happens.  So, for a
volume that is shared to an NTFS machine, if the NTFS cluster size is smaller
than the zvol block size, dedup will get dramatically worse, since it won't
dedup clusters that are positioned differently within zvol blocks.

2) For shared folders, the record size is the allocation unit size, so large
records can waste a substantial amount of space in cases with lots of very
small files.  This is different from a HW RAID stripe size, which only affects
performance, not space usage.

3) Although small record sizes have a large RAM overhead for dedup tables, as 
long as the dedup table working set fits in RAM, and the rest fits in L2ARC, 
performance will be good.
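
(For concreteness, the settings I'm asking about, plus what I gather is a way to
size the dedup table -- pool/dataset names are made up:)

zfs create -o recordsize=16K tank/smallfiles        # per-filesystem record size (an upper bound per file)
zfs create -V 100G -o volblocksize=4K tank/ntfsvol  # zvol block size, e.g. matched to a 4K NTFS cluster
zdb -S tank                                         # simulate dedup on existing data, print a DDT histogram
zpool status -D tank                                # dedup-table statistics for a pool already deduping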

Thanks,
   Rob
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] sharesmb should be ignored if filesystem is not mounted

2010-11-04 Thread Alan Wright

On 11/ 4/10 03:54 AM, Richard L. Hamilton wrote:

On 10/28/10 08:40 AM, Richard L. Hamilton wrote:

I have sharesmb=on set for a bunch of filesystems,
including three that weren't mounted. Nevertheless,
all of those are advertised. Needless to say, the ones
that aren't mounted can't be accessed remotely, even
though, since they are advertised, it looks like they
could be.

When you say advertised, do you mean that it appears in
/etc/dfs/sharetab when the dataset is not mounted, and/or
that you can see it from a client with 'net view'?

I'm using a recent build and I see the smb share disappear
from both when the dataset is unmounted.


I could see it in Finder on a Mac client; presumably were
I on a Windows client, it would have appeared with net view.
I've since turned off the sharesmb property on those filesystems,
so I may need to reboot (which I'd much rather not) to re-create
the problem.


That's fine.  If you see it again, try

svcadm restart smb/server

My guess is that smbd had stale cache entries for those shares.
This area was reworked in snv_149 and that smbd cache was eliminated.
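
If you do see it again, a quick way to check for that state before restarting
(dataset name is only an example):

zfs get -H -o value sharesmb,mounted tank/export/foo   # property setting vs. actual mount state
grep foo /etc/dfs/sharetab                             # is the unmounted dataset still in sharetab?
sharemgr show -vp                                      # what the share manager is currently publishing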


But if recent builds don't have the problem, that's the main thing.


The following update was pushed to snv_149:

PSARC/2010/154 Unified sharing system call
6968897 sharefs: Unified sharing system call

Alan
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss