Re: [zfs-discuss] Saving scrub results before scrub completes

2006-12-28 Thread Siegfried Nikolaivich


On 27-Dec-06, at 9:45 PM, George Wilson wrote:


Siegfried,

Can you provide the panic string that you are seeing? We should be  
able to pull out the persistent error log information from the  
corefile. You can take a look at the spa_get_errlog() function as a  
starting point.




This is the panic string that I am seeing:

Dec 26 18:55:51 FServe unix: [ID 836849 kern.notice]
Dec 26 18:55:51 FServe ^Mpanic[cpu1]/thread=fe8000929c80:
Dec 26 18:55:51 FServe genunix: [ID 683410 kern.notice] BAD TRAP:  
type=e (#pf Page fault) rp=fe8000929980 addr=ff00b3e621f0

Dec 26 18:55:51 FServe unix: [ID 10 kern.notice]
Dec 26 18:55:51 FServe unix: [ID 839527 kern.notice] sched:
Dec 26 18:55:51 FServe unix: [ID 753105 kern.notice] #pf Page fault
Dec 26 18:55:51 FServe unix: [ID 532287 kern.notice] Bad kernel fault  
at addr=0xff00b3e621f0
Dec 26 18:55:51 FServe unix: [ID 243837 kern.notice] pid=0,  
pc=0xf3eaa2b0, sp=0xfe8000929a78, eflags=0x10282
Dec 26 18:55:51 FServe unix: [ID 211416 kern.notice] cr0:  
8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f0<xmme,fxsr,pge,mce,pae,pse>
Dec 26 18:55:51 FServe unix: [ID 354241 kern.notice] cr2:  
ff00b3e621f0 cr3: a3ec000 cr8: c
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice]rdi:  
fe80dd69ad40 rsi: ff00b3e62040 rdx:0
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice]rcx:  
9c6bd6ce  r8:1  r9: 
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice]rax:  
ff00b3e62208 rbx: ff00b3e62040 rbp: fe8000929ab0
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice]r10:  
982421c8 r11:1 r12: ff00b3e62208
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice]r13:  
81204468 r14:  1c8 r15: fe80dd69ad40
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice]fsb:  
8000 gsb: 80f1d000  ds:   43
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice]  
es:   43  fs:0  gs:  1c3
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] 
trp:e err:0 rip: f3eaa2b0
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice]  
cs:   28 rfl:10282 rsp: fe8000929a78
Dec 26 18:55:51 FServe unix: [ID 266532 kern.notice]  
ss:   30

Dec 26 18:55:51 FServe unix: [ID 10 kern.notice]
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice]  
fe8000929890 unix:real_mode_end+6ad1 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice]  
fe8000929970 unix:trap+d77 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice]  
fe8000929980 unix:cmntrap+13f ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice]  
fe8000929ab0 zfs:vdev_queue_offset_compare+0 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice]  
fe8000929ae0 genunix:avl_add+1f ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice]  
fe8000929b60 zfs:vdev_queue_io_to_issue+1ec ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice]  
fe8000929ba0 zfs:zfsctl_ops_root+33bc48b1 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice]  
fe8000929bc0 zfs:vdev_disk_io_done+11 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice]  
fe8000929bd0 zfs:vdev_io_done+12 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice]  
fe8000929be0 zfs:zio_vdev_io_done+1b ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice]  
fe8000929c60 genunix:taskq_thread+bc ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice]  
fe8000929c70 unix:thread_start+8 ()

Dec 26 18:55:51 FServe unix: [ID 10 kern.notice]
Dec 26 18:55:51 FServe genunix: [ID 672855 kern.notice] syncing file  
systems...

Dec 26 18:55:51 FServe genunix: [ID 733762 kern.notice]  3
Dec 26 18:55:52 FServe genunix: [ID 904073 kern.notice]  done
Dec 26 18:55:53 FServe genunix: [ID 111219 kern.notice] dumping to 
/dev/dsk/c1d0s1, offset 1719074816, content: kernel
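
Presumably the resulting crash dump could then be examined with mdb along 
these lines (the file names under /var/crash/FServe and the availability of 
the ZFS dcmds on this build are assumptions on my part):

  # cd /var/crash/FServe
  # mdb unix.0 vmcore.0
  > ::status        (panic summary)
  > ::msgbuf        (console messages leading up to the panic)
  > $C              (stack of the panicking thread)
  > ::spa -v        (pool and vdev state at the time of the crash)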



Additionally, but perhaps not related, I came across this while  
looking at the logs:


Dec 26 17:53:00 FServe marvell88sx: [ID 812950 kern.warning] WARNING:  
marvell88sx0: error on port 1:
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info]
SError interrupt
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info]   EDMA  
self disabled
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info]
command request queue parity error

Dec 26 17:53:00 FServe marvell88sx: [ID 131198 kern.info]   SErrors:
Dec 26 17:53:00 FServe marvell88sx: [ID 517869  
kern.info]   Recovered communication error
Dec 26 17:53:00 FServe marvell88sx: [ID 517869  
kern.info]   PHY ready change
Dec 26 17:53:00 FServe marvell88sx: [ID 517869  
kern.info]   10-bit to 8-bit decode error
Dec 26 17:53:00 FServe marvell88sx: [ID 517869  
kern.info]   Disparity error


This happened right before 

[zfs-discuss] Checksum errors...

2006-12-28 Thread John
Background:
Large ZFS pool built on a couple of Sun 3511 SATA arrays. RAID-5 is done in the 
3511s. ZFS is non-redundant. We have been using this setup for a couple of 
months now with no issues.

Problem:
Yesterday afternoon we started getting checksum errors.  There have been no 
hardware errors reported at either the Solaris level or the hardware level.  
3511 logs are clean. Here is the zpool status:

tsmsun1 - /home/root > zpool status -xv
  pool: z_tsmsun1_pool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
z_tsmsun1_pool  ONLINE   0 0   180
  c22t600C0FF000678A0A86F3D901d0s0  ONLINE   0 0 0
  c22t600C0FF000678A0A86F3D900d0s0  ONLINE   0 0 0
  c22t600C0FF00068190A86F3D901d0s0  ONLINE   0 0 0
  c22t600C0FF00068190A86F3D900d0s0  ONLINE   0 0 0
  c22t600C0FF00068191A598ED500d0s0  ONLINE   0 0 0
  c22t600C0FF000678A1A598ED500d0s0  ONLINE   0 0 0
  c22t600C0FF00068191A598ED501d0s0  ONLINE   0 0 0
  c22t600C0FF000681943A7223100d0s0  ONLINE   0 0 0
  c22t600C0FF000681943A7223101d0ONLINE   0 0 0
  c22t600C0FF000681932BBD24400d0s0  ONLINE   0 0 0
  c22t600C0FF000681932BBD24401d0s0  ONLINE   0 0 0
  c22t600C0FF000678A43A7223100d0s0  ONLINE   0 0   180
  c22t600C0FF000678A2055211B01d0s0  ONLINE   0 0 0
  c22t600C0FF000678A2055211B00d0s0  ONLINE   0 0 0
  c22t600C0FF000678A32BBD24401d0s0  ONLINE   0 0 0
  c22t600C0FF000678A1A598ED501d0s0  ONLINE   0 0 0
  c22t600C0FF000678A32BBD24400d0s0  ONLINE   0 0 0
  c22t600C0FF000678A43A7223101d0s0  ONLINE   0 0 0
  c22t600C0FF00068192055211B00d0s0  ONLINE   0 0 0
  c22t600C0FF00068192055211B01d0s0  ONLINE   0 0 0
  c22t600C0FF000678A44F3D81B00d0s0  ONLINE   0 0 0
  c22t600C0FF000678A44F3D81B01d0s0  ONLINE   0 0 0
  c22t600C0FF000681944F3D81B00d0s0  ONLINE   0 0 0
  c22t600C0FF000681944F3D81B01d0s0  ONLINE   0 0 0

errors: The following persistent errors have been detected:

  DATASET  OBJECT  RANGE
  z_tsmsun1_pool/tsmsrv1_pool  2620    8464760832-8464891904

Looks like I possibly have a single file that is corrupted.  My question is: how 
do I find the file?  Is it as simple as doing a find command using -inum 
2620?
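
I.e., something along these lines, assuming the dataset is mounted at 
/tsmsrv1_pool:

tsmsun1 - /home/root > find /tsmsrv1_pool -inum 2620 -print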

TIA,
john
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Checksum errors...

2006-12-28 Thread John
Ok... guess I answered my own question... LOL!

I did the find with the -inum... gave me a file name... so i did:

tsmsun1 - /tsmsrv1_pool > dd if=000203db.bfs of=/dev/null bs=128k
read: I/O error
64581+0 records in
64581+0 records out

So it would appear the file is poo-poo 

Now the interesting thoughts... These 3511's have been around for a couple of 
years.  We were using them with Veritas VxFS...  We only recently switched over 
to ZFS to take advantage of compression... So is it safe to say that I was 
lucky using VxFS and never had any corruption, or was I suffering from silent 
corruption under VxFS? Hmmm.

thanks!
john
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Checksum errors...

2006-12-28 Thread Robert Milkowski
Hello John,

Thursday, December 28, 2006, 12:59:34 PM, you wrote:

J Ok... guess I answered my own question... LOL!

J I did the find with the -inum... gave me a file name... so i did:

J tsmsun1 - /tsmsrv1_pool dd if=000203db.bfs of=/dev/null bs=128k
J read: I/O error
J 64581+0 records in
J 64581+0 records out

J So it would appear the file is poo-poo 

J Now the interesting thoughts... These 3511's have been around for
J a couple of years.  We were using them with Veritas VxFS...  We
J only recently switched over to ZFS to take advantage of
J compression So is it safe to say that I was lucky using VxFS
J and never had any corruption or was I suffering from silent
J corruption under VxFS hmmm.

I guess you got silent corruption before.
With the 3511 with SATA drives I also get checksum errors from time to
time, while on the 3510 with FC drives I haven't seen them (yet).
And it would explain why we had to run fsck every few months before...

-- 
Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs services , label and packages

2006-12-28 Thread storage-disk
Hi there

I have 3 questions regarding zfs.

1. what are zfs packages?

2. what services need to be started in order for zfs working properly?

3. Is there an explanation of the ZFS label? This is the output from zdb -l 
/dev/dsk/C_t_d. I'd like to know what each line means. I have some idea 
what it means, but not every line. 


LABEL 3

version=2
name='viper'
state=0
txg=14693
pool_guid=514982409923329758
top_guid=5076607254487322717
guid=9253340189361483228
vdev_tree
type='mirror'
id=0
guid=5076607254487322717
metaslab_array=13
metaslab_shift=20
ashift=9
asize=129499136
children[0]
type='disk'
id=0
guid=9253340189361483228
path='/dev/dsk/c3t53d0s1'
devid='id1,disk wwn/b'
whole_disk=0
DTL=19
children[1]
type='disk'
id=1
guid=6007798096730764430
path='/dev/dsk/c3t54d0s1'
devid='id1,disk wwn /b'
whole_disk=0
DTL=18

Thank you very much.
Giang
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Re: Checksum errors...

2006-12-28 Thread John
Thanks for the reply!

As it turns out I ran a parity check on the suspect 3511... sure enough it 
popped an error!  So ZFS did detect the problem with the 3511...
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Solaris 10 11/06

2006-12-28 Thread George Wilson
Now that Solaris 10 11/06 is available, I wanted to post the complete list of 
ZFS features and bug fixes that were included in that release. I'm also 
including the necessary patches for anyone wanting to get all the ZFS features 
and fixes via patches (NOTE: later patch revision may already be available):

Solaris 10 Update 3 (11/06) Patches
sparc Patches

* 118833-36 SunOS 5.10: kernel patch
* 124204-03 SunOS 5.10: zfs patch
* 122660-07 SunOS 5.10: zones jumbo patch
* 120986-07 SunOS 5.10: mkfs and newfs patch
* 123839-01 SunOS 5.10: Fault Manager patch

i386 Patches

* 118855-36 SunOS 5.10_x86: kernel Patch
* 122661-05 SunOS 5.10_x86: zones jumbo patch
* 124205-04 SunOS 5.10_x86: zfs/zpool patch
* 120987-07 SunOS 5.10_x86: mkfs, newfs, other ufs utils patch
* 123840-01 SunOS 5.10_x86: Fault Manager patch


ZFS Features/Projects

PSARC 2006/223 ZFS Hot Spares
PSARC 2006/303 ZFS Clone Promotion
PSARC 2006/388 snapshot -r

ZFS Bug Fixes/RFEs

4034947 anon_swap_adjust() should call kmem_reap() if availrmem is low.
6276916 support for clone swap
6288488 du reports misleading size on RAID-Z
6354408 libdiskmgt needs to handle sysevent failures in miniroot or failsafe 
environments better
6366301 CREATE with owner_group attribute is not set correctly with NFSv4/ZFS
6373978 want to take lots of snapshots quickly ('zfs snapshot -r')
6385436 zfs set  returns an error, but still sets property value
6393490 libzfs should be a real library
6397148 fbufs debug code should be removed from buf_hash_insert()
6401400 zfs(1) usage output is excessively long
6405330 swap on zvol isn't added during boot
6405966 Hot Spare support in ZFS
6409228 typo in aclutils.h
6409302 passing a non-root vdev via zpool_create() panics system
6415739 assertion failed: !(zio->io_flags & 0x00040)
6416482 filebench oltp workload hangs in zfs
6416759 ::dbufs does not find bonus buffers anymore
6416794 zfs panics in dnode_reallocate during incremental zfs restore
6417978 double parity RAID-Z a.k.a. RAID6
6420204 root filesystem's delete queue is not running
6421216 ufsrestore should use acl_set() for setting ACLs
6424554 full block re-writes need not read data in
6425111 detaching an offline device can result in import confusion
6425740 assertion failed: new_state != old_state
6430121 3-way deadlock involving tc_lock within zfs
6433208 should not be able to offline/online a spare
6433264 crash when adding spare: nvlist_lookup_string(cnv, path, path) == 0
6433406 zfs_open() can leak memory on failure
6433408 namespace_reload() can leak memory on allocation failure
6433679 zpool_refresh_stats() has poor error semantics
6433680 changelist_gather() ignores libuutil errors
6433717 offline devices should not be marked persistently unavailable
6435779 6433679 broke zpool import
6436502 fsstat needs to support file systems greater than 2TB
6436514 zfs share on /var/mail needs to be run explicitly after system boots
6436524 importing a bogus pool config can panic system
6436526 delete_queue thread reporting drained when it may not be true
6436800 ztest failure: spa_vdev_attach() returns EBUSY instead of ENOTSUP
6439102 assertion failed: dmu_buf_refcount(dd->dd_dbuf) == 2 in 
dsl_dir_destroy_check()
6439370 assertion failures possible in dsl_dataset_destroy_sync()
6440499 zil should avoid txg_wait_synced() and use dmu_sync() to issue parallel 
IOs when fsyncing
6443585 zpool create of poolname > 250 and < 256 characters panics in debug 
printout
6444346 zfs promote fails in zone
6446569 deferred list is hooked on flintstone vitamins
6447377 ZFS prefetch is inconsistent
6447381 dnode_free_range() does not handle non-power-of-two blocksizes correctly
6451860 'zfs rename' a filesystem|clone to its direct child will cause 
internal error
6447452 re-creating zfs files can lead to failure to unmount
6448371 'zfs promote' of a volume clone fails with EBUSY
6448999 panic: used == ds->ds_phys->ds_unique_bytes
6449033 PIT nightly fails due to the fix for 6436514
6449078 Makefile for fsstat contains '-g' option
6450292 unmount original file system, 'zfs promote' cause system panic.
6451124 assertion failed: rc->rc_count >= number
6451412 renaming snapshot with 'mv' makes unmounting snapshot impossible
6452372 assertion failed: dnp->dn_nlevels == 1
6452420 zfs_get_data() of page data panics when blocksize is less than pagesize
6452923 really out of space panic even though ms_map.sm_space > 0
6453304 s10u3_03 integration for 6405966 breaks on10-patch B3 feature build
6458781 random spurious ENOSPC failures

Thanks,
George
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs services , label and packages

2006-12-28 Thread George Wilson



storage-disk wrote:

Hi there

I have 3 questions regarding zfs.

1. what are zfs packages?


SUNWzfsr, SUNWzfskr, and SUNWzfsu. Note that ZFS has dependencies on 
other components of Solaris, so installing just these packages is not 
supported.
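
For example, to confirm that they are present on a given system (pkginfo 
prints one line per installed package):

# pkginfo SUNWzfsr SUNWzfskr SUNWzfsu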




2. what services need to be started in order for zfs working properly?


ZFS relies on three SMF services:

1) svc:/system/filesystem/local:default
2) svc:/system/device/local:default
3) svc:/system/fmd:default

The last service may be disabled, but you would lose the ability to receive FMA 
events.
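
As a quick sanity check you can list their state with svcs, and re-enable fmd 
with svcadm if it has been turned off, e.g.:

# svcs filesystem/local device/local fmd
# svcadm enable svc:/system/fmd:default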




3. Is there an explanation for zfs label This is the out put from zdb -l /dev/dsk/C_t_d. I'd like to know what they mean each line. I have some idea what it means but not every line. 



LABEL 3

version=2
name='viper'
state=0
txg=14693
pool_guid=514982409923329758
top_guid=5076607254487322717
guid=9253340189361483228
vdev_tree
type='mirror'
id=0
guid=5076607254487322717
metaslab_array=13
metaslab_shift=20
ashift=9
asize=129499136
children[0]
type='disk'
id=0
guid=9253340189361483228
path='/dev/dsk/c3t53d0s1'
devid='id1,disk wwn/b'
whole_disk=0
DTL=19
children[1]
type='disk'
id=1
guid=6007798096730764430
path='/dev/dsk/c3t54d0s1'
devid='id1,disk wwn /b'
whole_disk=0
DTL=18


Take a look at chapter 1 of the ZFS On-Disk Format guide (specifically pages 
9-14), as it covers the majority of the lines above.


Thanks,
George



Thank you very much.
Giang
 
 
This message posted from opensolaris.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] using zpool attach/detach to migrate drives from one controller to another

2006-12-28 Thread Richard Elling

I think ZFS might be too smart here.  The feature we like is that ZFS
will find the devices no matter what their path is.  This is very much
a highly desired feature.  If there are multiple paths to the same LUN,
then it does expect an intermediary to handle that: MPxIO, PowerPath, etc.
Unfortunately, the FC disks and hubs in the A5000 are quite limited in
their ability to do clever things, as is the socal interface.  It is no
wonder they were EOLed long ago.  As-is, the ability of ZFS to rediscover
the A5000+socal disks when a loop is dead may be a manual process or
automatic on reboot (q.v. discussions here on [un]desired panics).
 -- richard

Derek E. Lewis wrote:

On Thu, 28 Dec 2006, George Wilson wrote:

Your best bet is to export and re-import the pool after moving 
devices. You might also try to 'zpool offline' the device, move it and 
then 'zpool online' it. This should force a reopen of the device, and 
then it would only have to resilver the transactions written during the time 
that the device was offline. I have not tried the latter, but it should 
work.
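
In sketch form, with a made-up pool name and device (not the actual devices 
from this thread):

# zpool export tank
(move or re-cable the device)
# zpool import tank

or, per device:

# zpool offline tank c1t0d0
(move the device)
# zpool online tank c1t0d0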


George,

I haven't moved any devices around. I have two physical paths to the 
JBOD, which allows the system to see all the disks on two different 
controllers (c1t53d0 and c2t53d0 are already there). 'zpool 
online/offline' and 'zpool import/export' aren't going to help at all 
unless I physically swap the fibre paths. This won't work because I have 
other pools on the JBOD.


If this were a production system, exporting the entire pool would not be 
ideal, just to change the controller the mirrored pool devices are using. If 
ZFS cannot do this without (1) exporting the pool and importing it or 
(2) doing a complete resilver of the disk(s), this sounds like a valid 
RFE for a more intelligent 'zpool replace' or 'zpool attach/detach'.


Thanks,

Derek E. Lewis
[EMAIL PROTECTED]
http://delewis.blogspot.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Solaris 10 11/06

2006-12-28 Thread Luke Lonergan
Thanks for all the hard work on ZFS performance fixes George!  U3 works
great.

- Luke


On 12/28/06 9:18 AM, George Wilson [EMAIL PROTECTED] wrote:

 Now that Solaris 10 11/06 is available, I wanted to post the complete list of
 ZFS features and bug fixes that were included in that release. I'm also
 including the necessary patches for anyone wanting to get all the ZFS features
 and fixes via patches (NOTE: later patch revision may already be available):
 
 [... remainder of quoted patch and bug-fix list snipped; see George's post above ...]

[zfs-discuss] Deadlock with a pool using files on another zfs?

2006-12-28 Thread Jason Austin
When messing around with zfs trying to break it, I created a new pool using 
files on an existing zfs filesystem.  It seemed to work fine until I created a 
snapshot of the original filesystem and then tried to destroy the pool using 
the files.  The system appeared to deadlock and had to be rebooted.  When it 
came back up, the file-backed pool was in an error state and could be destroyed.

I don't see much value in being able to do that, but it might be a good idea to 
have zpool error out instead of creating a pool that could crash the system.
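
Roughly the sequence, with hypothetical names (the backing file has to be at 
least 64MB for zpool to accept it):

# zfs create tank/files
# mkfile 128m /tank/files/vdev0
# zpool create filepool /tank/files/vdev0
# zfs snapshot tank/files@snap1
# zpool destroy filepool      <- this is where it appeared to hang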
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Fix dot-dot permissions without unmount?

2006-12-28 Thread Jason Austin
After importing some pools after a re-install of the OS, I hit that "..: 
Permission denied" problem.  I figured out I could unmount, chmod, and mount to 
fix it, but that wouldn't be a good situation on a production box.  Is there 
any way to fix this problem without unmounting?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Fix dot-dot permissions without unmount?

2006-12-28 Thread Nicholas Senedzuk

The solution that I am going to give you has worked on UFS and VxFS for me,
and I do not see why it would not work on ZFS.

Share out the directory that you are having problems with, let's say /opt.
Once it is shared, mount it to /a. Then you can do the chmod on /a, then umount
/a, and you should have fixed the underlying permissions problem.
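
In command form, using the same example paths (assuming /a exists and is 
empty; 755 is just an example mode, and the rw=localhost option only keeps the 
temporary share private):

# share -F nfs -o rw=localhost /opt
# mount -F nfs localhost:/opt /a
# chmod 755 /a
# umount /a
# unshare /opt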


If you have any questions just let me know.


Nick

On 12/28/06, Jason Austin [EMAIL PROTECTED] wrote:


After importing some pools after a re-install of the OS, i hit that ..:
Permission denied problem.  I figured out I could unmount, chmod, and mount
to fix it but that wouldn't be a good situation on a production box.  Is
there anyway to fix this problem without unmounting?


This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Fix dot-dot permissions without unmount?

2006-12-28 Thread Jason Austin
I should clarify.  Say I have a zfs with the mount point /u00 that I import on 
the system.  When it creates the /u00 directory on the UFS root, it's created 
with 700, and then the zfs is mounted and appears to have the permissions of 
the root of the zfs (755 in this case).  

But, if a non-root user tries 'cd ..' while in /u00, they get a permission 
denied because the /u00 directory is 700, even though it doesn't show those 
permissions in ls and they are not changeable with chmod.  The only way to fix 
it is to unmount /u00, chmod the mount point, and then remount.  That's fine on 
my test system, but in production, where I've already started up my database 
that people are using, I can't just shut everything down and unmount the /u00 
directory.

I probably wouldn't even have noticed this, but bash seems to traverse up the 
directory tree to determine the CWD.  That creates an error (non-fatal in this 
case) in my Oracle startup script, which does a su - oracle -c 
/u00/my/start/script.sh  

To reproduce, just unmount any zfs, chmod its mount point to 700, remount, and 
then try to cd .. as a non-root user from the mount point directory.
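
For example, with a hypothetical dataset tank/u00 mounted at /u00:

# zfs umount /u00
# chmod 700 /u00
# zfs mount tank/u00
# su - someuser
$ cd /u00
$ cd ..
..: Permission denied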
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Fix dot-dot permissions without unmount?

2006-12-28 Thread Nicholas Senedzuk

I understand what you are saying and have had the same problem. I used the
fix that I mentioned to fix this problem on 2 of my production DB systems.

On 12/28/06, Jason Austin [EMAIL PROTECTED] wrote:


I should clarify.  Say I have a zfs with the mount point /u00 that I
import on the system.  When it creates the /u00 directory on the UFS root,
it's created with 700, and then the zfs is mounted and it appears to have
the permissions of the root of the zfs.  755 in this case.

But, if a non-root user tries cd .. while in /u00, they get a permission
denied because the /u00 directory is 700 even though it doesn't show those
permissions in ls and they are not changeable with chmod . The only way to
fix it is unmount /u00, chmod the mount point, and then remount.  That's
fine on my test system but in production where I've already started up my
database that people are using, I can't just shut everything down and
unmount the /u00 directory.

I probably wouldn't even have noticed this but bash seems to traverse up
the directory tree to determine CWD.  That creates an error (non fatal in
this case) in my oracle startup script that does a su - oracle -c
/u00/my/start/script.sh

To reproduce, just unmount any zfs, chmod it's mount point to 700,
remount, and then try to cd .. from a non-root user from the mount point
directory.


This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss