Re: [zfs-discuss] ZFS problems which scrub can't find?

2008-12-18 Thread Marcin Szychowski
Do you use any form of compression?

I changed compression from none to gzip-9, got a message about changing 
properties of the boot pool (or fs), copied and moved all files under /usr and /etc 
to force recompression, rebooted, and - guess what message I got.
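
For reference, a minimal sketch of that property change (the dataset name is an 
assumption; compression only applies to blocks written after the property is set, 
which is why the existing files had to be copied/moved):

# zfs set compression=gzip-9 rpool/ROOT/opensolaris
# zfs get compression,compressratio rpool/ROOT/opensolaris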
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems which scrub can't find?

2008-11-07 Thread Matt Ingenthron
Off the lists, someone suggested to me that the "Inconsistent 
filesystem" may be the boot archive and not the ZFS filesystem (though I 
still don't know what's wrong with booting b99).

Regardless, I tried rebuilding the boot_archive with bootadm 
update-archive -vf and verified it by mounting it  and peeking inside.  
I also tried both with and without /etc/hostid.  I still get the same 
behavior.
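
For reference, a minimal sketch of that verification (the amd64 archive path comes 
from the GRUB entry below; the filesystem type inside the archive varies by build, 
so the ufs mount option is an assumption):

# bootadm update-archive -vf
# lofiadm -a /platform/i86pc/amd64/boot_archive
/dev/lofi/1
# mount -F ufs -o ro /dev/lofi/1 /mnt
# ls /mnt/platform/i86pc/kernel
# umount /mnt; lofiadm -d /dev/lofi/1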

Any thoughts?

Thanks in advance,

- Matt

[EMAIL PROTECTED] wrote:
> Hi,
>
> After a recent pkg image-update to OpenSolaris build 100, my system 
> booted once and now will no longer boot.  After exhausting other 
> options, I am left wondering if there is some kind of ZFS issue a 
> scrub won't find.
>
> The current behavior is that it will load GRUB, but trying to boot the
> most recent boot environment (b100 based) I get "Error 16: Inconsistent
> filesystem structure".  The pool has gone through two scrubs from a 
> livecd based on b101a without finding anything wrong.  If I select the 
> previous boot environment (b99 based), I get a kernel panic.
>
> I've tried replacing the /etc/hostid based on a hunch from one of the 
> engineers working on Indiana and ZFS boot.  I also tried rebuilding 
> the boot_archive and reloading the GRUB based on build 100.  I then 
> tried reloading the build 99 grub to hopefully get to where I could 
> boot build 99.  No luck with any of these thus far.
>
> More below, and some comments in this bug:
> http://defect.opensolaris.org/bz/show_bug.cgi?id=3965, though may need
> to be a separate bug.
>
> I'd appreciate any suggestions and be glad to gather any data to 
> diagnose this if possible.
>
>
> == Screen when trying to boot b100 after boot menu ==
>
>  Booting 'opensolaris-15'
>
> bootfs rpool/ROOT/opensolaris-15
> kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
> loading '/platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS' ...
> cpu: 'GenuineIntel' family 6 model 15 step 11
> [BIOS accepted mixed-mode target setting!]
>   [Multiboot-kludge, loadaddr=0xbffe38, text-and-data=0x1931a8, bss=0x0,
> entry=0xc0]
> '/platform/i86pc/kernel/amd64/unix -B
> zfs-bootfs=rpool/391,bootpath="[EMAIL PROTECTED],0/pci1179,[EMAIL 
> PROTECTED],2/[EMAIL PROTECTED],0:a",diskdevid="id1,[EMAIL PROTECTED]/a"' 
>
> is loaded
> module$ /platform/i86pc/$ISADIR/boot_archive
> loading '/platform/i86pc/$ISADIR/boot_archive' ...
>
> Error 16: Inconsistent filesystem structure
>
> Press any key to continue...
>
>
>
> == Booting b99 ==
> (by selecting the grub entry from the GRUB menu and adding -kd then 
> doing a :c to continue I get the following stack trace)
>
> debug_enter+37 ()
> panicsys+40b ()
> vpanic+15d ()
> panic+9c ()
> (lines above typed in from ::stack, lines below typed in from when it 
> dropped into the debugger)
> unix:die+ea ()
> unix:trap+3d0 ()
> unix:cmntrap+e9 ()
> unix:mutex_owner_running+d ()
> genunix:lookuppnat+bc ()
> genunix:vn_removeat+7c ()
> genunix:vn_remove+28 ()
> zfs:spa_config_write+18d ()
> zfs:spa_config_sync+102 ()
> zfs:spa_open_common+24b ()
> zfs:spa_open+1c ()
> zfs:dsl_dsobj_to_dsname+37 ()
> zfs:zfs_parse_bootfs+68 ()
> zfs:zfs_mountroot+10a ()
> genunix:fsop_mountroot+1a ()
> genunix:rootconf+d5 ()
> genunix:vfs_mountroot+65 ()
> genunix:main+e6 ()
> unix:_locore_start+92 ()
>
> panic: entering debugger (no dump device, continue to reboot)
> Loaded modules: [ scsi_vhci uppc sd zfs specfs pcplusmp cpu.generic ]
> kmdb: target stopped at:
> kmdb_enter+0xb: movq   %rax,%rdi
>
>
>
> == Output from zdb ==
> 
> LABEL 0
> 
>version=10
>name='rpool'
>state=1
>txg=327816
>pool_guid=6981480028020800083
>hostid=95693
>hostname='opensolaris'
>top_guid=5199095267524632419
>guid=5199095267524632419
>vdev_tree
>type='disk'
>id=0
>guid=5199095267524632419
>path='/dev/dsk/c4t0d0s0'
>devid='id1,[EMAIL PROTECTED]/a'
>phys_path='/[EMAIL PROTECTED],0/pci1179,[EMAIL PROTECTED],2/[EMAIL 
> PROTECTED],0:a'
>whole_disk=0
>metaslab_array=14
>metaslab_shift=29
>ashift=9
>asize=90374406144
>is_log=0
>DTL=161
> 
> LABEL 1
> 
>version=10
>name='rpool'
>state=1
>txg=327816
>pool_guid=6981480028020800083
>hostid=95693
>hostname='opensolaris'
>top_guid=5199095267524632419
>guid=5199095267524632419
>vdev_tree
>type='disk'
>id=0
>guid=5199095267524632419
>path='/dev/dsk/c4t0d0s0'
>devid='id1,[EMAIL PROTECTED]/a'
>phys_path='/[EMAIL PROTECTED],0/pci1179,[EMAIL PROTECTED],2/[EMAIL 
> PROTECTED],0:a'
>whole_disk=0
>metaslab_array=14
>metaslab_shift=29
>ashift=9
>asize=90374406144
>is_log=0
>DTL=161
> -

[zfs-discuss] ZFS problems which scrub can't find?

2008-11-06 Thread Matt . Ingenthron
Hi,

After a recent pkg image-update to OpenSolaris build 100, my system 
booted once and now will no longer boot.  After exhausting other 
options, I am left wondering if there is some kind of ZFS issue a scrub 
won't find.

The current behavior is that it will load GRUB, but trying to boot the
most recent boot environment (b100 based) I get "Error 16: Inconsistent
filesystem structure".  The pool has gone through two scrubs from a 
livecd based on b101a without finding anything wrong.  If I select the 
previous boot environment (b99 based), I get a kernel panic.

I've tried replacing the /etc/hostid based on a hunch from one of the 
engineers working on Indiana and ZFS boot.  I also tried rebuilding the 
boot_archive and reloading the GRUB based on build 100.  I then tried 
reloading the build 99 grub to hopefully get to where I could boot build 
99.  No luck with any of these thus far.
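
For reference, a rough sketch of the GRUB reload from the live CD environment (the 
device and boot-environment names are taken from the zdb output and GRUB entry 
below; the exact sequence is an assumption, not a transcript of what was run):

# zpool import -f -R /a rpool
# zfs mount rpool/ROOT/opensolaris-15
# bootadm update-archive -R /a
# installgrub /a/boot/grub/stage1 /a/boot/grub/stage2 /dev/rdsk/c4t0d0s0
# zpool export rpool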

More below, and some comments in this bug:
http://defect.opensolaris.org/bz/show_bug.cgi?id=3965, though it may need
to be a separate bug.

I'd appreciate any suggestions and be glad to gather any data to 
diagnose this if possible.


== Screen when trying to boot b100 after boot menu ==

  Booting 'opensolaris-15'

bootfs rpool/ROOT/opensolaris-15
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
loading '/platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS' ...
cpu: 'GenuineIntel' family 6 model 15 step 11
[BIOS accepted mixed-mode target setting!]
   [Multiboot-kludge, loadaddr=0xbffe38, text-and-data=0x1931a8, bss=0x0,
entry=0xc0]
'/platform/i86pc/kernel/amd64/unix -B
zfs-bootfs=rpool/391,bootpath="[EMAIL PROTECTED],0/pci1179,[EMAIL 
PROTECTED],2/[EMAIL PROTECTED],0:a",diskdevid="id1,[EMAIL PROTECTED]/a"'
is loaded
module$ /platform/i86pc/$ISADIR/boot_archive
loading '/platform/i86pc/$ISADIR/boot_archive' ...

Error 16: Inconsistent filesystem structure

Press any key to continue...



== Booting b99 ==
(by selecting the grub entry from the GRUB menu and adding -kd then 
doing a :c to continue I get the following stack trace)

debug_enter+37 ()
panicsys+40b ()
vpanic+15d ()
panic+9c ()
(lines above typed in from ::stack, lines below typed in from when it 
dropped into the debugger)
unix:die+ea ()
unix:trap+3d0 ()
unix:cmntrap+e9 ()
unix:mutex_owner_running+d ()
genunix:lookuppnat+bc ()
genunix:vn_removeat+7c ()
genunix:vn_remove+28 ()
zfs:spa_config_write+18d ()
zfs:spa_config_sync+102 ()
zfs:spa_open_common+24b ()
zfs:spa_open+1c ()
zfs:dsl_dsobj_to_dsname+37 ()
zfs:zfs_parse_bootfs+68 ()
zfs:zfs_mountroot+10a ()
genunix:fsop_mountroot+1a ()
genunix:rootconf+d5 ()
genunix:vfs_mountroot+65 ()
genunix:main+e6 ()
unix:_locore_start+92 ()

panic: entering debugger (no dump device, continue to reboot)
Loaded modules: [ scsi_vhci uppc sd zfs specfs pcplusmp cpu.generic ]
kmdb: target stopped at:
kmdb_enter+0xb: movq   %rax,%rdi



== Output from zdb ==

LABEL 0

version=10
name='rpool'
state=1
txg=327816
pool_guid=6981480028020800083
hostid=95693
hostname='opensolaris'
top_guid=5199095267524632419
guid=5199095267524632419
vdev_tree
type='disk'
id=0
guid=5199095267524632419
path='/dev/dsk/c4t0d0s0'
devid='id1,[EMAIL PROTECTED]/a'
phys_path='/[EMAIL PROTECTED],0/pci1179,[EMAIL PROTECTED],2/[EMAIL 
PROTECTED],0:a'
whole_disk=0
metaslab_array=14
metaslab_shift=29
ashift=9
asize=90374406144
is_log=0
DTL=161

LABEL 1

version=10
name='rpool'
state=1
txg=327816
pool_guid=6981480028020800083
hostid=95693
hostname='opensolaris'
top_guid=5199095267524632419
guid=5199095267524632419
vdev_tree
type='disk'
id=0
guid=5199095267524632419
path='/dev/dsk/c4t0d0s0'
devid='id1,[EMAIL PROTECTED]/a'
phys_path='/[EMAIL PROTECTED],0/pci1179,[EMAIL PROTECTED],2/[EMAIL 
PROTECTED],0:a'
whole_disk=0
metaslab_array=14
metaslab_shift=29
ashift=9
asize=90374406144
is_log=0
DTL=161

LABEL 2

version=10
name='rpool'
state=1
txg=327816
pool_guid=6981480028020800083
hostid=95693
hostname='opensolaris'
top_guid=5199095267524632419
guid=5199095267524632419
vdev_tree
type='disk'
id=0
guid=5199095267524632419
path='/dev/dsk/c4t0d0s0'
devid='id1,[EMAIL PROTECTED]/a'
phys_path='/[EMAIL PROTECTED],0/pci1179,[EMAIL PROTECTED],2/[EMAIL 
PROTECTED],0:a'
whole_disk=0
metaslab_array=14
metaslab_shift=29
ashift=9
asize=90374406144
is_log=0
DTL=161
--

Re: [zfs-discuss] ZFS Problems under vmware

2008-06-17 Thread Anthony Worrall
Raw Device Mapping (RDM) is a feature of ESX 2.5 and above which allows a guest OS 
to access a LUN on a Fibre Channel or iSCSI SAN.

See http://www.vmware.com/pdf/esx25_rawdevicemapping.pdf for more details.

You may be able to do something similar with raw disks under Workstation;
see http://www.vmware.com/support/reference/linux/osonpartition_linux.html


Since I added the RDM to one of my guest OSes, all of them have started 
working using virtual disks after running

# zpool export tank
# zpool import -f tank


Maybe adding the RDM changed some behaviour of ESX, or maybe I just got lucky.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Problems under vmware

2008-06-16 Thread Dave Bechtel
Hi - I'm interested in your solution as my current ZFS/VMware experiment is 
stalled.
I have a 6-disk SCSI rack (6 @ 9GB each) attached as raw disks to the VM 
(Workstation 6), and have been getting ZFS pool corruption on reboot.  VMware 
is allowing the Solaris guest to write a disk label that is one cylinder beyond 
the physical cylinder count of the disk.

Could you please post more detailed info? TIA
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Problems under vmware

2008-06-16 Thread Anthony Worrall
I added a vdev using RDM and that seems to be stable over reboots.

However, the pools based on a virtual disk now also seem to be stable after 
doing an export and import -f.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Problems under vmware

2008-06-16 Thread Anthony Worrall
I am seeing the same problem using a separate virtual disk for the pool.
This is happening with Solaris 10 U3, U4 and U5.


SCSI reservations are known to be an issue with clustered Solaris: 
http://blogs.sun.com/SC/entry/clustering_solaris_guests_that_run

I wonder if this is the same problem. Maybe we have to use Raw Device Mapping 
(RDM) to get ZFS to work under VMware.

Anthony Worrall
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems with USB Storage devices

2008-06-06 Thread Paulo Soeiro
Hi Ricardo,

I'll try that.

Thanks (Obrigado)
Paulo Soeiro



On 6/5/08, Ricardo M. Correia <[EMAIL PROTECTED]> wrote:
>
> On Ter, 2008-06-03 at 23:33 +0100, Paulo Soeiro wrote:
>
> 6)Remove and attached the usb sticks:
>
> zpool status
> pool: myPool
> state: UNAVAIL
> status: One or more devices could not be used because the label is missing
> or invalid. There are insufficient replicas for the pool to continue
> functioning.
> action: Destroy and re-create the pool from a backup source.
> see: http://www.sun.com/msg/ZFS-8000-5E
> scrub: none requested
> config:
> NAME STATE READ WRITE CKSUM
> myPool UNAVAIL 0 0 0 insufficient replicas
> mirror UNAVAIL 0 0 0 insufficient replicas
> c6t0d0p0 FAULTED 0 0 0 corrupted data
> c7t0d0p0 FAULTED 0 0 0 corrupted data
>
>
> This could be a problem of USB devices getting renumbered (or something to
> that effect).
> Try doing "zpool export myPool" and "zpool import myPool" at this point, it
> should work fine and you should be able to get your data back.
>
> Cheers,
> Ricardo
>   --
>*Ricardo Manuel Correia*
> Lustre Engineering
>
> *Sun Microsystems, Inc.*
> Portugal
> Phone +351.214134023 / x58723
> Mobile +351.912590825
> Email [EMAIL PROTECTED]
>
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems with USB Storage devices

2008-06-04 Thread Ricardo M. Correia
On Ter, 2008-06-03 at 23:33 +0100, Paulo Soeiro wrote:
> 6)Remove and attached the usb sticks:
> 
> zpool status
> pool: myPool
> state: UNAVAIL
> status: One or more devices could not be used because the label is
> missing 
> or invalid. There are insufficient replicas for the pool to continue
> functioning.
> action: Destroy and re-create the pool from a backup source.
> see: http://www.sun.com/msg/ZFS-8000-5E
> scrub: none requested
> config:
> NAME STATE READ WRITE CKSUM
> myPool UNAVAIL 0 0 0 insufficient replicas
> mirror UNAVAIL 0 0 0 insufficient replicas
> c6t0d0p0 FAULTED 0 0 0 corrupted data
> c7t0d0p0 FAULTED 0 0 0 corrupted data


This could be a problem of USB devices getting renumbered (or something
to that effect).
Try doing "zpool export myPool" and "zpool import myPool" at this point,
it should work fine and you should be able to get your data back.
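
A minimal sketch of that sequence (the -d hint is an assumption for the case where
the renumbered devices are not picked up automatically):

# zpool export myPool
# zpool import myPool        # or: zpool import -d /dev/dsk myPool
# zpool status -v myPool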

Cheers,
Ricardo

--

Ricardo Manuel Correia
Lustre Engineering

Sun Microsystems, Inc.
Portugal
Phone +351.214134023 / x58723
Mobile +351.912590825
Email [EMAIL PROTECTED]
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems with USB Storage devices

2008-06-03 Thread Bill McGonigle
On Jun 3, 2008, at 18:34, Paulo Soeiro wrote:

> This test was done without the hub:

FWIW, I bought 9 microSDs and 9 USB controller units for them from  
NewEgg to replicate the famous ZFS demo video, and I had problems  
getting them working with OpenSolaris (on VMware on OS X, in this case).

After getting frustrated and thinking about it for a while, I decided  
to test each microSD card and controller independently (using dd), and  
one of the adapters turned out to be flaky at just writing zeros.   
It also happened to be the #0 adapter, which threw me off for a  
while, since that's where I started.  I was still having  
problems after that (though I had tested the remaining units), so I went home for  
the weekend and left them plugged into their hubs (the i-rocks brand  
seems OK so far), and came back to a system log full of a second  
adapter dropping out several times over the weekend (though it had  
survived a quick dd).  Taken off the hub, it did the same thing  
for me if I waited long enough (10 minutes or so - I assume it was  
getting warmed up).
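
A rough sketch of that kind of per-device dd check (the device name is borrowed 
from elsewhere in the thread as an example; writing zeros to the raw device 
destroys any label and data on the stick):

# dd if=/dev/zero of=/dev/rdsk/c6t0d0p0 bs=1024k count=256
# dd if=/dev/rdsk/c6t0d0p0 of=/dev/null bs=1024k count=256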

I've also had to replace a server mobo which had a faulty USB  
implementation (Compaq brand, one of the early USB 2.0 chips).

Just food for thought - there's a lot that can go wrong before ZFS sees it,  
and USB gear isn't always well-made.

-Bill

-
Bill McGonigle, Owner   Work: 603.448.4440
BFC Computing, LLC  Home: 603.448.1668
[EMAIL PROTECTED]   Cell: 603.252.2606
http://www.bfccomputing.com/Page: 603.442.1833
Blog: http://blog.bfccomputing.com/
VCard: http://bfccomputing.com/vcard/bill.vcf

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems with USB Storage devices

2008-06-03 Thread Paulo Soeiro
This test was done without the hub:

On Tue, Jun 3, 2008 at 11:33 PM, Paulo Soeiro <[EMAIL PROTECTED]> wrote:

> Did the same test again and here is the result:
>
> 1)
>
> zpool create myPool mirror c6t0d0p0 c7t0d0p0
>
> 2)
>
> -bash-3.2# zfs create myPool/myfs
>
> -bash-3.2# zpool status
>
> pool: myPool
>
> state: ONLINE
>
> scrub: none requested
>
> config:
>
> NAME STATE READ WRITE CKSUM
>
> myPool ONLINE 0 0 0
>
> mirror ONLINE 0 0 0
>
> c6t0d0p0 ONLINE 0 0 0
>
> c7t0d0p0 ONLINE 0 0 0
>
> errors: No known data errors
>
> pool: rpool
>
> state: ONLINE
>
> scrub: none requested
>
> config:
>
> NAME STATE READ WRITE CKSUM
>
> rpool ONLINE 0 0 0
>
> c5t0d0s0 ONLINE 0 0 0
>
> errors: No known data errors
>
> 3)Copy a file to /myPool/myfs
>
> ls -ltrh
>
> total 369687
>
> -rwxr-xr-x 1 root root 184M Jun 3 22:38 test.bin
>
> 4)Copy a second file
>
> cp test.bin test2.bin &
>
> And shutdown
>
> Startup
>
> 5)
>
> -bash-3.2# zpool status
>
> pool: myPool
>
> state: UNAVAIL
>
> status: One or more devices could not be opened. There are insufficient
>
> replicas for the pool to continue functioning.
>
> action: Attach the missing device and online it using 'zpool online'.
>
> see: http://www.sun.com/msg/ZFS-8000-3C
>
> scrub: none requested
>
> config:
>
> NAME STATE READ WRITE CKSUM
>
> myPool UNAVAIL 0 0 0 insufficient replicas
>
> mirror UNAVAIL 0 0 0 insufficient replicas
>
> c6t0d0p0 UNAVAIL 0 0 0 cannot open
>
> c7t0d0p0 UNAVAIL 0 0 0 cannot open
>
> pool: rpool
>
> state: ONLINE
>
> scrub: none requested
>
> config:
>
> NAME STATE READ WRITE CKSUM
>
> rpool ONLINE 0 0 0
>
> c5t0d0s0 ONLINE 0 0 0
>
> errors: No known data errors
>
> 6)Remove and attached the usb sticks:
>
> zpool status
>
> pool: myPool
>
> state: UNAVAIL
>
> status: One or more devices could not be used because the label is missing
>
> or invalid. There are insufficient replicas for the pool to continue
>
> functioning.
>
> action: Destroy and re-create the pool from a backup source.
>
> see: http://www.sun.com/msg/ZFS-8000-5E
>
> scrub: none requested
>
> config:
>
> NAME STATE READ WRITE CKSUM
>
> myPool UNAVAIL 0 0 0 insufficient replicas
>
> mirror UNAVAIL 0 0 0 insufficient replicas
>
> c6t0d0p0 FAULTED 0 0 0 corrupted data
>
> c7t0d0p0 FAULTED 0 0 0 corrupted data
>
> pool: rpool
>
> state: ONLINE
>
> scrub: none requested
>
> config:
>
> NAME STATE READ WRITE CKSUM
>
> rpool ONLINE 0 0 0
>
> c5t0d0s0 ONLINE 0 0 0
>
> errors: No known data errors
>
> ---
>
> So it's not a hub problem; it seems to be a ZFS & USB storage problem.
> I just hope ZFS works fine on hard disks, because it's not working on USB
> sticks. It would be nice if somebody from Sun could fix this problem...
>
>
>
> Thanks & Regards
>
> Paulo
>
>
>   On Tue, Jun 3, 2008 at 8:19 PM, Paulo Soeiro <[EMAIL PROTECTED]> wrote:
>
>> I'll try the same without the hub.
>>
>> Thanks & Regards
>> Paulo
>>
>>
>>
>>
>> On 6/2/08, Thommy M. <[EMAIL PROTECTED]> wrote:
>>>
>>> Paulo Soeiro wrote:
>>> > Greetings,
>>> >
>>> > I was experimenting with zfs, and i made the following test, i shutdown
>>> > the computer during a write operation
>>> > in a mirrored usb storage filesystem.
>>> >
>>> > Here is my configuration
>>> >
>>> > NGS USB 2.0 Minihub 4
>>> > 3 USB Silicom Power Storage Pens 1 GB each
>>> >
>>> > These are the ports:
>>> >
>>> > hub devices
>>> > /---\
>>> > | port 2 | port  1  |
>>> > | c10t0d0p0  | c9t0d0p0  |
>>> > -
>>> > | port 4 | port  4  |
>>> > | c12t0d0p0  | c11t0d0p0|
>>> > \/
>>> >
>>> > Here is the problem:
>>> >
>>> > 1)First i create a mirror with port2 and port1 devices
>>> >
>>> > zpool create myPool mirror c10t0d0p0 c9t0d0p0
>>> > -bash-3.2# zpool status
>>> >   pool: myPool
>>> >  state: ONLINE
>>> >  scrub: none requested
>>> > config:
>>> >
>>> > NAME   STATE READ WRITE CKSUM
>>> > myPool ONLINE   0 0 0
>>> >   mirror   ONLINE   0 0 0
>>> > c10t0d0p0  ONLINE   0 0 0
>>> > c9t0d0p0   ONLINE   0 0 0
>>> >
>>> > errors: No known data errors
>>> >
>>> >   pool: rpool
>>> >  state: ONLINE
>>> >  scrub: none requested
>>> > config:
>>> >
>>> > NAMESTATE READ WRITE CKSUM
>>> > rpool   ONLINE   0 0 0
>>> >   c5t0d0s0  ONLINE   0 0 0
>>> >
>>> > errors: No known data errors
>>> >
>>> > 2)zfs create myPool/myfs
>>> >
>>> > 3)created a random file (file.txt - more or less 100MB size)
>>> >
>>> > digest -a md5 file.txt
>>> > 3f9d17531d6103ec75ba9762cb250b4c
>>> >
>>> > 4)While making a second copy of the file:
>>> >
>>> > cp file.txt test &
>>> >
>>> > I've shutdown the computer while the file was being copied. And
>>> > restarted the computer again. And here is the result:
>>> >
>>> >
>>> > -bash-3.2# zpool status
>>> >   pool: myPool
>>> >

Re: [zfs-discuss] ZFS problems with USB Storage devices

2008-06-03 Thread Paulo Soeiro
Did the same test again and here is the result:

1)

zpool create myPool mirror c6t0d0p0 c7t0d0p0

2)

-bash-3.2# zfs create myPool/myfs

-bash-3.2# zpool status

pool: myPool

state: ONLINE

scrub: none requested

config:

NAME STATE READ WRITE CKSUM

myPool ONLINE 0 0 0

mirror ONLINE 0 0 0

c6t0d0p0 ONLINE 0 0 0

c7t0d0p0 ONLINE 0 0 0

errors: No known data errors

pool: rpool

state: ONLINE

scrub: none requested

config:

NAME STATE READ WRITE CKSUM

rpool ONLINE 0 0 0

c5t0d0s0 ONLINE 0 0 0

errors: No known data errors

3)Copy a file to /myPool/myfs

ls -ltrh

total 369687

-rwxr-xr-x 1 root root 184M Jun 3 22:38 test.bin

4)Copy a second file

cp test.bin test2.bin &

And shutdown

Startup

5)

-bash-3.2# zpool status

pool: myPool

state: UNAVAIL

status: One or more devices could not be opened. There are insufficient

replicas for the pool to continue functioning.

action: Attach the missing device and online it using 'zpool online'.

see: http://www.sun.com/msg/ZFS-8000-3C

scrub: none requested

config:

NAME STATE READ WRITE CKSUM

myPool UNAVAIL 0 0 0 insufficient replicas

mirror UNAVAIL 0 0 0 insufficient replicas

c6t0d0p0 UNAVAIL 0 0 0 cannot open

c7t0d0p0 UNAVAIL 0 0 0 cannot open

pool: rpool

state: ONLINE

scrub: none requested

config:

NAME STATE READ WRITE CKSUM

rpool ONLINE 0 0 0

c5t0d0s0 ONLINE 0 0 0

errors: No known data errors

6)Remove and attached the usb sticks:

zpool status

pool: myPool

state: UNAVAIL

status: One or more devices could not be used because the label is missing

or invalid. There are insufficient replicas for the pool to continue

functioning.

action: Destroy and re-create the pool from a backup source.

see: http://www.sun.com/msg/ZFS-8000-5E

scrub: none requested

config:

NAME STATE READ WRITE CKSUM

myPool UNAVAIL 0 0 0 insufficient replicas

mirror UNAVAIL 0 0 0 insufficient replicas

c6t0d0p0 FAULTED 0 0 0 corrupted data

c7t0d0p0 FAULTED 0 0 0 corrupted data

pool: rpool

state: ONLINE

scrub: none requested

config:

NAME STATE READ WRITE CKSUM

rpool ONLINE 0 0 0

c5t0d0s0 ONLINE 0 0 0

errors: No known data errors

---

So it's not a hub problem; it seems to be a ZFS & USB storage problem. I
just hope ZFS works fine on hard disks, because it's not working on USB
sticks. It would be nice if somebody from Sun could fix this problem...



Thanks & Regards

Paulo


On Tue, Jun 3, 2008 at 8:19 PM, Paulo Soeiro <[EMAIL PROTECTED]> wrote:

> I'll try the same without the hub.
>
> Thanks & Regards
> Paulo
>
>
>
>
> On 6/2/08, Thommy M. <[EMAIL PROTECTED]> wrote:
>>
>> Paulo Soeiro wrote:
>> > Greetings,
>> >
>> > I was experimenting with zfs, and i made the following test, i shutdown
>> > the computer during a write operation
>> > in a mirrored usb storage filesystem.
>> >
>> > Here is my configuration
>> >
>> > NGS USB 2.0 Minihub 4
>> > 3 USB Silicom Power Storage Pens 1 GB each
>> >
>> > These are the ports:
>> >
>> > hub devices
>> > /---\
>> > | port 2 | port  1  |
>> > | c10t0d0p0  | c9t0d0p0  |
>> > -
>> > | port 4 | port  4  |
>> > | c12t0d0p0  | c11t0d0p0|
>> > \/
>> >
>> > Here is the problem:
>> >
>> > 1)First i create a mirror with port2 and port1 devices
>> >
>> > zpool create myPool mirror c10t0d0p0 c9t0d0p0
>> > -bash-3.2# zpool status
>> >   pool: myPool
>> >  state: ONLINE
>> >  scrub: none requested
>> > config:
>> >
>> > NAME   STATE READ WRITE CKSUM
>> > myPool ONLINE   0 0 0
>> >   mirror   ONLINE   0 0 0
>> > c10t0d0p0  ONLINE   0 0 0
>> > c9t0d0p0   ONLINE   0 0 0
>> >
>> > errors: No known data errors
>> >
>> >   pool: rpool
>> >  state: ONLINE
>> >  scrub: none requested
>> > config:
>> >
>> > NAMESTATE READ WRITE CKSUM
>> > rpool   ONLINE   0 0 0
>> >   c5t0d0s0  ONLINE   0 0 0
>> >
>> > errors: No known data errors
>> >
>> > 2)zfs create myPool/myfs
>> >
>> > 3)created a random file (file.txt - more or less 100MB size)
>> >
>> > digest -a md5 file.txt
>> > 3f9d17531d6103ec75ba9762cb250b4c
>> >
>> > 4)While making a second copy of the file:
>> >
>> > cp file.txt test &
>> >
>> > I've shutdown the computer while the file was being copied. And
>> > restarted the computer again. And here is the result:
>> >
>> >
>> > -bash-3.2# zpool status
>> >   pool: myPool
>> >  state: UNAVAIL
>> > status: One or more devices could not be used because the label is
>> missing
>> > or invalid.  There are insufficient replicas for the pool to
>> continue
>> > functioning.
>> > action: Destroy and re-create the pool from a backup source.
>> >see: http://www.sun.com/msg/ZFS-8000-5E
>> >  scrub: none requested
>> > config:
>> >
>> > NAME   STATE READ WRITE CKSUM
>> > myPool UNAVAIL  0

Re: [zfs-discuss] ZFS problems with USB Storage devices

2008-06-02 Thread Thommy M.
Justin Vassallo wrote:
> Thommy,
> 
> If I read correctly your post stated that the pools did not automount on
> startup, not that they would go corrupt. It seems to me that Paulo is
> actually experiencing a corrupt fs

Nah, I also had indications of "corrupted data" if you read my posts.
But the data was there after I fiddled with the sticks and
exported/imported the pool.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems with USB Storage devices

2008-06-02 Thread Justin Vassallo
Thommy,

If I read correctly your post stated that the pools did not automount on
startup, not that they would go corrupt. It seems to me that Paulo is
actually experiencing a corrupt fs

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Thommy M.
Sent: 02 June 2008 13:19
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS problems with USB Storage devices

Paulo Soeiro wrote:
> Greetings,
> 
> I was experimenting with zfs, and i made the following test, i shutdown
> the computer during a write operation
> in a mirrored usb storage filesystem.
> 
> Here is my configuration
> 
> NGS USB 2.0 Minihub 4
> 3 USB Silicom Power Storage Pens 1 GB each
> 
> These are the ports:
> 
> hub devices
> /---\  
> | port 2 | port  1  |
> | c10t0d0p0  | c9t0d0p0  |
> -
> | port 4 | port  4  |
> | c12t0d0p0  | c11t0d0p0|
> \/
> 
> Here is the problem:
> 
> 1)First i create a mirror with port2 and port1 devices
> 
> zpool create myPool mirror c10t0d0p0 c9t0d0p0
> -bash-3.2# zpool status
>   pool: myPool
>  state: ONLINE
>  scrub: none requested
> config:
> 
> NAME   STATE READ WRITE CKSUM
> myPool ONLINE   0 0 0
>   mirror   ONLINE   0 0 0
> c10t0d0p0  ONLINE   0 0 0
> c9t0d0p0   ONLINE   0 0 0
> 
> errors: No known data errors
> 
>   pool: rpool
>  state: ONLINE
>  scrub: none requested
> config:
> 
> NAMESTATE READ WRITE CKSUM
> rpool   ONLINE   0 0 0
>   c5t0d0s0  ONLINE   0 0 0
> 
> errors: No known data errors
> 
> 2)zfs create myPool/myfs
> 
> 3)created a random file (file.txt - more or less 100MB size)
> 
> digest -a md5 file.txt
> 3f9d17531d6103ec75ba9762cb250b4c
> 
> 4)While making a second copy of the file:
> 
> cp file.txt test &
> 
> I've shutdown the computer while the file was being copied. And
> restarted the computer again. And here is the result:
> 
> 
> -bash-3.2# zpool status
>   pool: myPool
>  state: UNAVAIL
> status: One or more devices could not be used because the label is missing
> or invalid.  There are insufficient replicas for the pool to continue
> functioning.
> action: Destroy and re-create the pool from a backup source.
>see: http://www.sun.com/msg/ZFS-8000-5E
>  scrub: none requested
> config:
> 
> NAME   STATE READ WRITE CKSUM
> myPool UNAVAIL  0 0 0  insufficient replicas
>   mirror   UNAVAIL  0 0 0  insufficient replicas
> c12t0d0p0  OFFLINE  0 0 0
> c9t0d0p0   FAULTED  0 0 0  corrupted data
> 
>   pool: rpool
>  state: ONLINE
>  scrub: none requested
> config:
> 
> NAMESTATE READ WRITE CKSUM
> rpool   ONLINE   0 0 0
>   c5t0d0s0  ONLINE   0 0 0
> 
> errors: No known data errors
> 
> ---
> 
> I was expecting that only one of the files was corrupted, not the all
> the filesystem.

This looks exactly like the problem I had (thread "USB stick unavailable
after restart") and the answer I got was that you can't rely on the hub ...

I haven't tried another hub yet but will eventually test the Adaptec
XHub 4 (AUH-4000), which is on the HCL...




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems with USB Storage devices

2008-06-02 Thread Thommy M.
Paulo Soeiro wrote:
> Greetings,
> 
> I was experimenting with zfs, and i made the following test, i shutdown
> the computer during a write operation
> in a mirrored usb storage filesystem.
> 
> Here is my configuration
> 
> NGS USB 2.0 Minihub 4
> 3 USB Silicom Power Storage Pens 1 GB each
> 
> These are the ports:
> 
> hub devices
> /---\  
> | port 2 | port  1  |
> | c10t0d0p0  | c9t0d0p0  |
> -
> | port 4 | port  4  |
> | c12t0d0p0  | c11t0d0p0|
> \/
> 
> Here is the problem:
> 
> 1)First i create a mirror with port2 and port1 devices
> 
> zpool create myPool mirror c10t0d0p0 c9t0d0p0
> -bash-3.2# zpool status
>   pool: myPool
>  state: ONLINE
>  scrub: none requested
> config:
> 
> NAME   STATE READ WRITE CKSUM
> myPool ONLINE   0 0 0
>   mirror   ONLINE   0 0 0
> c10t0d0p0  ONLINE   0 0 0
> c9t0d0p0   ONLINE   0 0 0
> 
> errors: No known data errors
> 
>   pool: rpool
>  state: ONLINE
>  scrub: none requested
> config:
> 
> NAMESTATE READ WRITE CKSUM
> rpool   ONLINE   0 0 0
>   c5t0d0s0  ONLINE   0 0 0
> 
> errors: No known data errors
> 
> 2)zfs create myPool/myfs
> 
> 3)created a random file (file.txt - more or less 100MB size)
> 
> digest -a md5 file.txt
> 3f9d17531d6103ec75ba9762cb250b4c
> 
> 4)While making a second copy of the file:
> 
> cp file.txt test &
> 
> I've shutdown the computer while the file was being copied. And
> restarted the computer again. And here is the result:
> 
> 
> -bash-3.2# zpool status
>   pool: myPool
>  state: UNAVAIL
> status: One or more devices could not be used because the label is missing
> or invalid.  There are insufficient replicas for the pool to continue
> functioning.
> action: Destroy and re-create the pool from a backup source.
>see: http://www.sun.com/msg/ZFS-8000-5E
>  scrub: none requested
> config:
> 
> NAME   STATE READ WRITE CKSUM
> myPool UNAVAIL  0 0 0  insufficient replicas
>   mirror   UNAVAIL  0 0 0  insufficient replicas
> c12t0d0p0  OFFLINE  0 0 0
> c9t0d0p0   FAULTED  0 0 0  corrupted data
> 
>   pool: rpool
>  state: ONLINE
>  scrub: none requested
> config:
> 
> NAMESTATE READ WRITE CKSUM
> rpool   ONLINE   0 0 0
>   c5t0d0s0  ONLINE   0 0 0
> 
> errors: No known data errors
> 
> ---
> 
> I was expecting that only one of the files was corrupted, not the all
> the filesystem.

This looks exactly like the problem I had (thread "USB stick unavailable
after restart") and the answer I got was that you can't rely on the hub ...

I haven't tried another hub yet but will eventually test the Adaptec
XHub 4 (AUH-4000), which is on the HCL...




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS problems with USB Storage devices

2008-06-01 Thread Paulo Soeiro
Greetings,

I was experimenting with ZFS and made the following test: I shut down the
computer during a write operation
on a mirrored USB storage filesystem.

Here is my configuration

NGS USB 2.0 Minihub 4
3 USB Silicom Power Storage Pens 1 GB each

These are the ports:

hub devices
/---\
| port 2 | port  1  |
| c10t0d0p0  | c9t0d0p0  |
-
| port 4 | port  4  |
| c12t0d0p0  | c11t0d0p0|
\/

Here is the problem:

1)First i create a mirror with port2 and port1 devices

zpool create myPool mirror c10t0d0p0 c9t0d0p0
-bash-3.2# zpool status
  pool: myPool
 state: ONLINE
 scrub: none requested
config:

NAME   STATE READ WRITE CKSUM
myPool ONLINE   0 0 0
  mirror   ONLINE   0 0 0
c10t0d0p0  ONLINE   0 0 0
c9t0d0p0   ONLINE   0 0 0

errors: No known data errors

  pool: rpool
 state: ONLINE
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
rpool   ONLINE   0 0 0
  c5t0d0s0  ONLINE   0 0 0

errors: No known data errors

2)zfs create myPool/myfs

3)created a random file (file.txt - more or less 100MB size)

digest -a md5 file.txt
3f9d17531d6103ec75ba9762cb250b4c

4)While making a second copy of the file:

cp file.txt test &

I've shutdown the computer while the file was being copied. And restarted
the computer again. And here is the result:


-bash-3.2# zpool status
  pool: myPool
 state: UNAVAIL
status: One or more devices could not be used because the label is missing
or invalid.  There are insufficient replicas for the pool to continue
functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-5E
 scrub: none requested
config:

NAME   STATE READ WRITE CKSUM
myPool UNAVAIL  0 0 0  insufficient replicas
  mirror   UNAVAIL  0 0 0  insufficient replicas
c12t0d0p0  OFFLINE  0 0 0
c9t0d0p0   FAULTED  0 0 0  corrupted data

  pool: rpool
 state: ONLINE
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
rpool   ONLINE   0 0 0
  c5t0d0s0  ONLINE   0 0 0

errors: No known data errors

---

I was expecting that only one of the files would be corrupted, not the whole
filesystem.


Thanks & Regards
Paulo
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Problems under vmware

2008-05-28 Thread Gabriele Bulfon
Hello, I'm having the exact same situation on one VM, and not on another VM on 
the same infrastructure.
The only difference is that on the failing VM I initially created the pool with 
a name and then changed the mountpoint to another name.
Did you find a solution to the issue?
Should I consider going back to UFS on this infrastructure?
Thanx a lot
Gabriele.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS Problems under vmware

2008-05-12 Thread Paul B. Henson

I have a test bed S10U5 system running under vmware ESX that has a weird
problem.

I have a single virtual disk, with some slices allocated as UFS filesystem
for the operating system, and s7 as a ZFS pool.

Whenever I reboot, the pool fails to open:

May  8 17:32:30 niblet fmd: [ID 441519 daemon.error] SUNW-MSG-ID: ZFS-8000-CS, 
TYPE: Fault, VER: 1, SEVERITY: Major
May  8 17:32:30 niblet EVENT-TIME: Thu May  8 17:32:30 PDT 2008
May  8 17:32:30 niblet PLATFORM: VMware Virtual Platform, CSN: VMware-50 35 75 
0b a3 b3 e5 d4-38 3f 00 7a 10 c0 e2 d7, HOSTNAME: niblet
May  8 17:32:30 niblet SOURCE: zfs-diagnosis, REV: 1.0
May  8 17:32:30 niblet EVENT-ID: f163d843-694d-4659-81e8-aa15bb72e2e0
May  8 17:32:30 niblet DESC: A ZFS pool failed to open.  Refer to 
http://sun.com/msg/ZFS-8000-CS for more information.
May  8 17:32:30 niblet AUTO-RESPONSE: No automated response will occur.
May  8 17:32:30 niblet IMPACT: The pool data is unavailable
May  8 17:32:30 niblet REC-ACTION: Run 'zpool status -x' and either attach the 
missing device or
May  8 17:32:30 niblet  restore from backup.


According to 'zpool status', the device could not be opened:

[EMAIL PROTECTED] ~ # zpool status
  pool: ospool
 state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
ospool  UNAVAIL  0 0 0  insufficient replicas
  c1t0d0s7  UNAVAIL  0 0 0  cannot open


However, according to format, the device is perfectly accessible, and
format even indicates that slice 7 is an active pool:

[EMAIL PROTECTED] ~ # format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
   0. c1t0d0 
  /[EMAIL PROTECTED],0/pci1000,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
Specify disk (enter its number): 0
selecting c1t0d0
[disk formatted]
Warning: Current Disk has mounted partitions.
/dev/dsk/c1t0d0s0 is currently mounted on /. Please see umount(1M).
/dev/dsk/c1t0d0s1 is currently used by swap. Please see swap(1M).
/dev/dsk/c1t0d0s3 is currently mounted on /usr. Please see umount(1M).
/dev/dsk/c1t0d0s4 is currently mounted on /var. Please see umount(1M).
/dev/dsk/c1t0d0s5 is currently mounted on /opt. Please see umount(1M).
/dev/dsk/c1t0d0s6 is currently mounted on /home. Please see umount(1M).
/dev/dsk/c1t0d0s7 is part of active ZFS pool ospool. Please see zpool(1M).


Trying to import it does not find it:

[EMAIL PROTECTED] ~ # zpool import
no pools available to import


Exporting it works fine:

[EMAIL PROTECTED] ~ # zpool export ospool


But then the import indicates that the pool may still be in use:

[EMAIL PROTECTED] ~ # zpool import ospool
cannot import 'ospool': pool may be in use from other system


Adding the -f flag imports successfully:

[EMAIL PROTECTED] ~ # zpool import -f ospool

[EMAIL PROTECTED] ~ # zpool status
  pool: ospool
 state: ONLINE
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
ospool  ONLINE   0 0 0
  c1t0d0s7  ONLINE   0 0 0

errors: No known data errors


And then everything works perfectly fine, until I reboot again, at which
point the cycle repeats.

I have a similar test bed running on actual x4100 hardware that doesn't
exhibit this problem.

Any idea what's going on here?


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems in dCache

2007-08-22 Thread Xavier Canehan
We have the same issue (using dCache on Thumpers, data on ZFS).
A workaround has been to move the directory to a local UFS filesystem created 
with a low nbpi parameter.

However, this is not a solution.

Doesn't look like a threading problem - thanks anyway, Jens!
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems in dCache

2007-08-22 Thread Jens Elkner
On Wed, Aug 01, 2007 at 09:49:26AM -0700, Sergey Chechelnitskiy wrote:
Hi Sergey,
> 
> I have a flat directory with a lot of small files inside. And I have a java 
> application that reads all these files when it starts. If this directory is 
> located on ZFS the application starts fast (15 mins) when the number of files 
> is around 300,000 and starts very slow (more than 24 hours) when the number 
> of files is around 400,000. 
> 
> The question is why ? 
> Let's set aside the question why this application is designed this way.
> 
> I still needed to run this application. So, I installed a linux box with XFS, 
> mounted this XFS directory to the Solaris box and moved my flat directory 
> there. Then my application started fast ( < 30 mins) even if the number of 
> files (in the linux operated XFS directory mounted thru NSF to the Solaris 
> box) was 400,000 or more. 
> 
> Basicly, what I want to do is to run this application on a Solaris box. Now I 
> cannot do it.

Just a rough guess - this might be a Solaris threading problem. See
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6518490

So perhaps starting the app with -XX:-UseThreadPriorities may help ...
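
For example, something along these lines (the jar name and any other options are 
placeholders, since the actual dCache launch command isn't shown here):

$ java -XX:-UseThreadPriorities -jar dcache-pool.jar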

Regards,
jel.
-- 
Otto-von-Guericke University http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany Tel: +49 391 67 12768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems in dCache

2007-08-08 Thread Steven Gelsie
I think I am having the same problem using a different application (Windchill). 
 zfs is consuming hugh amounts of memory and system (T2000) is performing 
poorly. Occasionally it will take a long time (several hours) to do a snapshot. 
Normally a snapshot will take a second or two. The application will allow me to 
break the one directory which has almost 600,000 files in to several 
directories. I am in the process of doing this now. I never thought it was a 
good idea to have that many files in one directory.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems in dCache

2007-08-01 Thread Sergey Chechelnitskiy
Hi All, 

Thank you for answers. 
I am not really comparing anything. 
I have a flat directory with a lot of small files inside. And I have a java 
application that reads all these files when it starts. If this directory is 
located on ZFS the application starts fast (15 mins) when the number of files 
is around 300,000 and starts very slow (more than 24 hours) when the number 
of files is around 400,000. 

The question is why ? 
Let's set aside the question why this application is designed this way.

I still needed to run this application. So, I installed a Linux box with XFS, 
mounted this XFS directory on the Solaris box and moved my flat directory 
there. Then my application started fast (< 30 mins) even if the number of 
files (in the Linux-hosted XFS directory mounted through NFS to the Solaris 
box) was 400,000 or more. 

Basically, what I want to do is run this application on a Solaris box. Now I 
cannot do it.

Thanks, 
Sergey

On August 1, 2007 08:15 am, [EMAIL PROTECTED] wrote:
> > On 01/08/2007, at 7:50 PM, Joerg Schilling wrote:
> > > Boyd Adamson <[EMAIL PROTECTED]> wrote:
> > >> Or alternatively, are you comparing ZFS(Fuse) on Linux with XFS on
> > >> Linux? That doesn't seem to make sense since the userspace
> > >> implementation will always suffer.
> > >>
> > >> Someone has just mentioned that all of UFS, ZFS and XFS are
> > >> available on
> > >> FreeBSD. Are you using that platform? That information would be
> > >> useful
> > >> too.
> > >
> > > FreeBSD does not use what Solaris calls UFS.
> > >
> > > Both Solaris and FreeBSD did start with the same filesystem code but
> > > Sun did start enhancing UFD in the late 1980's while BSD did not
> > > take over
> > > the changes. Later BSD started a fork on the filesystemcode.
> > > Filesystem
> > > performance thus cannot be compared.
> >
> > I'm aware of that, but they still call it UFS. I'm trying to
> > determine what the OP is asking.
>
>   I seem to remember many daemons that used large grouping of files such as
> this changing to a split out directory tree starting in the late 80's to
> avoid slow stat issues.  Is this type of design (tossing 300k+ files into
> one flat directory) becoming more acceptable again?
>
>
> -Wade
>
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems in dCache

2007-08-01 Thread Wade . Stuart




> On 01/08/2007, at 7:50 PM, Joerg Schilling wrote:
> > Boyd Adamson <[EMAIL PROTECTED]> wrote:
> >
> >> Or alternatively, are you comparing ZFS(Fuse) on Linux with XFS on
> >> Linux? That doesn't seem to make sense since the userspace
> >> implementation will always suffer.
> >>
> >> Someone has just mentioned that all of UFS, ZFS and XFS are
> >> available on
> >> FreeBSD. Are you using that platform? That information would be
> >> useful
> >> too.
> >
> > FreeBSD does not use what Solaris calls UFS.
> >
> > Both Solaris and FreeBSD did start with the same filesystem code but
> > Sun did start enhancing UFD in the late 1980's while BSD did not
> > take over
> > the changes. Later BSD started a fork on the filesystemcode.
> > Filesystem
> > performance thus cannot be compared.
>
> I'm aware of that, but they still call it UFS. I'm trying to
> determine what the OP is asking.


  I seem to remember many daemons that used large groupings of files such as
this changing to split-out directory trees starting in the late 80's to
avoid slow stat issues.  Is this type of design (tossing 300k+ files into
one flat directory) becoming more acceptable again?


-Wade

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems in dCache

2007-08-01 Thread Boyd Adamson
On 01/08/2007, at 7:50 PM, Joerg Schilling wrote:
> Boyd Adamson <[EMAIL PROTECTED]> wrote:
>
>> Or alternatively, are you comparing ZFS(Fuse) on Linux with XFS on
>> Linux? That doesn't seem to make sense since the userspace
>> implementation will always suffer.
>>
>> Someone has just mentioned that all of UFS, ZFS and XFS are  
>> available on
>> FreeBSD. Are you using that platform? That information would be  
>> useful
>> too.
>
> FreeBSD does not use what Solaris calls UFS.
>
> Both Solaris and FreeBSD did start with the same filesystem code but
> Sun did start enhancing UFD in the late 1980's while BSD did not  
> take over
> the changes. Later BSD started a fork on the filesystemcode.  
> Filesystem
> performance thus cannot be compared.

I'm aware of that, but they still call it UFS. I'm trying to  
determine what the OP is asking.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems in dCache

2007-08-01 Thread Joerg Schilling
Boyd Adamson <[EMAIL PROTECTED]> wrote:

> Or alternatively, are you comparing ZFS(Fuse) on Linux with XFS on
> Linux? That doesn't seem to make sense since the userspace
> implementation will always suffer.
>
> Someone has just mentioned that all of UFS, ZFS and XFS are available on
> FreeBSD. Are you using that platform? That information would be useful
> too.

FreeBSD does not use what Solaris calls UFS.

Both Solaris and FreeBSD did start with the same filesystem code, but
Sun started enhancing UFS in the late 1980s while BSD did not take over 
the changes. Later, BSD started a fork of the filesystem code. Filesystem 
performance thus cannot be compared.

Jörg

-- 
 EMail:[EMAIL PROTECTED] (home) Jörg Schilling D-13353 Berlin
   [EMAIL PROTECTED](uni)  
   [EMAIL PROTECTED] (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems in dCache

2007-07-31 Thread Boyd Adamson
Sergey Chechelnitskiy <[EMAIL PROTECTED]> writes:

> Hi All,
>
> We have a problem running a scientific application dCache on ZFS.
> dCache is a java based software that allows to store huge datasets in
> pools.  One dCache pool consists of two directories pool/data and
> pool/control. The real data goes into pool/data/ For each file in
> pool/data/ the pool/control/ directory contains two small files, one
> is 23 bytes, another one is 989 bytes.  When dcache pool starts it
> consecutively reads all the files in control/ directory.  We run a
> pool on ZFS.
>
> When we have approx 300,000 files in control/ the pool startup time is
> about 12-15 minutes. When we have approx 350,000 files in control/ the
> pool startup time increases to 70 minutes. If we setup a new zfs pool
> with the smalles possible blocksize and move control/ there the
> startup time decreases to 40 minutes (in case of 350,000 files).  But
> if we run the same pool on XFS the startup time is only 15 minutes.
> Could you suggest to reconfigure ZFS to decrease the startup time.
>
> When we have approx 400,000 files in control/ we were not able to
> start the pool in 24 hours. UFS did not work either in this case, but
> XFS worked.
>
> What could be the problem ?  Thank you,

I'm not sure I understand what you're comparing. Is there an XFS
implementation for Solaris that I don't know about?

Are you comparing ZFS on Solaris vs XFS on Linux? If that's the case it
seems there is much more that's different than just the filesystem.

Or alternatively, are you comparing ZFS(Fuse) on Linux with XFS on
Linux? That doesn't seem to make sense since the userspace
implementation will always suffer.

Someone has just mentioned that all of UFS, ZFS and XFS are available on
FreeBSD. Are you using that platform? That information would be useful
too.

Boyd

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS problems in dCache

2007-07-31 Thread Sergey Chechelnitskiy
Hi All, 

We have a problem running a scientific application, dCache, on ZFS. 
dCache is Java-based software that allows huge datasets to be stored in pools.
One dCache pool consists of two directories, pool/data and pool/control. The 
real data goes into pool/data/.
For each file in pool/data/ the pool/control/ directory contains two small 
files, one of 23 bytes and another of 989 bytes. 
When a dCache pool starts, it consecutively reads all the files in the control/ 
directory.
We run a pool on ZFS.

When we have approx 300,000 files in control/ the pool startup time is about 
12-15 minutes. 
When we have approx 350,000 files in control/ the pool startup time increases 
to 70 minutes. 
If we set up a new ZFS pool with the smallest possible blocksize and move 
control/ there, the startup time decreases to 40 minutes (in the case of 350,000 
files). 
But if we run the same pool on XFS the startup time is only 15 minutes. 
Could you suggest how to reconfigure ZFS to decrease the startup time?
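
For reference, a minimal sketch of that small-recordsize variant (pool and dataset 
names are placeholders; the minimum recordsize is 512 bytes, and 1K already covers 
the 23- and 989-byte control files; atime=off is an extra assumption, suggested 
only because startup is a pure read of many small files):

# zpool create ctrlpool c2t0d0
# zfs create -o recordsize=1k -o atime=off ctrlpool/control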

When we have approx 400,000 files in control/ we were not able to start the 
pool in 24 hours. UFS did not work either in this case, but XFS worked.

What could be the problem ? 
Thank you,

-- 
--
Best Regards, 
Sergey Chechelnitskiy ([EMAIL PROTECTED])
WestGrid/SFU
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re[2]: [zfs-discuss] ZFS problems

2006-12-11 Thread Robert Milkowski
Hello James,

Saturday, November 18, 2006, 11:34:52 AM, you wrote:
JM> as far as I can see, your setup does not meet the minimum
JM> redundancy requirements for a Raid-Z, which is 3 devices.
JM> Since you only have 2 devices you are out on a limb.


Actually, only two disks for raid-z is fine and you get redundancy.
However, it would make more sense to do a mirror with just two disks -
performance would be better and the available space would be the same.
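
For comparison, a minimal sketch of both two-disk layouts (device names are 
placeholders; the commands are alternatives, not steps to run together):

# zpool create tank raidz c0t0d0 c0t1d0     # two-device raid-z: redundant, usable space of one disk
# zpool create tank mirror c0t0d0 c0t1d0    # two-way mirror: same usable space, better performance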




-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems

2006-11-26 Thread Richard Elling

David Dyer-Bennet wrote:

On 11/26/06, Al Hopper <[EMAIL PROTECTED]> wrote:


[4] I proposed this solution to a user on the [EMAIL PROTECTED]
list - and it resolved his problem.  His problem - the system would reset
after getting about 1/2 way through a Solaris install.  The installer was
simply acting as a good system exerciser and heating up his CPU until it
glitched out.  After he removed the CPU fan and cleaned up his heatsink -
he loaded up Solaris successfully.


I just identified and fixed exactly this symptom on my mother's
Windows system, in fact; it'd get half-way through an install, then
start getting flakier and flakier, and fairly soon refuse to boot at
all.  This made me think "heat", and on examination the fan on the CPU
cooler wasn't spinning *at all*.  It's less than two years old -- but
one of the three wires seems to be broken off right at the fan, so
that may be the problem.  It's not seized up physically, though it's a
bit stiff.

Anyway, while the software here isn't Solaris, the basic diagnostic
issue is the same.  This kind of thing is remarkably common, in fact!


Yep, the top 4 things that tend to break are: fans, power supplies,
disks, and memory (in no particular order).  The enterprise-class
systems should monitor the fan speed and alert when they are not
operating normally.
 -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems

2006-11-26 Thread David Dyer-Bennet

On 11/26/06, Al Hopper <[EMAIL PROTECTED]> wrote:


[4] I proposed this solution to a user on the [EMAIL PROTECTED]
list - and it resolved his problem.  His problem - the system would reset
after getting about 1/2 way through a Solaris install.  The installer was
simply acting as a good system exerciser and heating up his CPU until it
glitched out.  After he removed the CPU fan and cleaned up his heatsink -
he loaded up Solaris successfully.


I just identified and fixed exactly this symptom on my mother's
Windows system, in fact; it'd get half-way through an install, then
start getting flakier and flakier, and fairly soon refuse to boot at
all.  This made me think "heat", and on examination the fan on the CPU
cooler wasn't spinning *at all*.  It's less than two years old -- but
one of the three wires seems to be broken off right at the fan, so
that may be the problem.  It's not seized up physically, though it's a
bit stiff.

Anyway, while the software here isn't Solaris, the basic diagnostic
issue is the same.  This kind of thing is remarkably common, in fact!

This one has a nearly-good ending, since nothing appears to have
cooked enough to be permanently ruined.  Only nearly-good because I
had to bend the heatsink to get the replacement 70mm fan to fit; the
screw holes lined up, but the new one was physically slightly too
large, about a mm, to fit on the heatsink.

--
David Dyer-Bennet
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems

2006-11-26 Thread Al Hopper
On Sat, 25 Nov 2006 [EMAIL PROTECTED] wrote:

 reformatted ...
> First thing is I would like to thank everyone for their replies/help.
> This machine has been running for two years under Linux, but for last
^

Ugh Oh - possible CPU fan "fatigue" time... more below.

> two or three months has had Nexenta Solaris on it. This machine has
> never once crashed. I rebooted with a Knoppix disk in and ran memtest86.
> Within 30 minutes it counted several hundred errors which after cleaning
> the connections still occurred in the same locations. I replaced the RAM
> module and retested with no errors. My md5sums all verified no data was
> lost making me very happy. I did a zpool scrub which came back 100%
> clean. I still don't understand how the machine ran reliably with bad
> ram. That being said, a few days later I did a zpool status and saw 20
> checksum errors on one drive and 30 errors on the other.

You're still chasing a hardware issue (or several) IMHO.  First, ensure that
you are blowing air over the HDA (Head Disk Assembly) of your installed hard
drives.  The drives don't care if the airflow is from back to front, left
to right, right to left, front to back, etc., and it does not have to be a
lot of air, as long as there is positive airflow over the HDA and the
disk drive controller electronics.  Otherwise, it's likely that the disk
drives will overheat while there is a lot of head movement taking place.
My suggestion is to get one or more 92mm fans with a hard disk type connector
and jury-rig them to blow air across the drives.  Do whatever it takes
to secure the fans in position - bent wire hangers secured to the case
will work!  It may not look pretty - but it'll get the job done.  Or ...
mount the drives in drive canisters with built-in fans.

Next is to check for hotspots within the box.  Check that the memory SIMMs are
getting good airflow.  A great way to resolve this type of issue is to use
the Zalman Fan Bracket (FB123) and one or more 92mm fans.  The bracket
itself is hard to explain - but it allows you to attach up to 4 fans in
slots and position them above anything that is a hot-spot - including the
motherboard chipset, RAM SIMMs, graphics boards, gigabit ethernet cards,
etc.  A picture is worth a thousand words:

http://www.endpcnoise.com/cgi-bin/e/std/sku=fb123

Note: this is not an endorsement of this site - just a good picture -
since the Zalman site (zalmanusa.com) is a pain to navigate.

Still on the cooling thread - the Seasonic PSUs are highly rated and very
quiet.  But ... they don't move enough air through your box and should be
supplemented with an intake fan (if your box has provision to add one) and a
rear panel mounted exhaust fan.  Many PC users have upgraded their PSUs
and been careful to select a quiet PSU - but they did not realize that the
quiet PSU, with its slow moving fan, greatly reduced the existing airflow
through the box.  The PSU can run effectively with the reduced airflow -
but not the other components in the system.

If you want to apply science and actually measure your box for hotspots, I
suggest you run the box at the usual ambient temp, with the usual active
workload then carefully remove the side cover very quickly (while the box
is still running) and use a Fluke IR (Infra Red) thermal probe[1] to
measure for hot spots.  Record the CPU heatsink temp, RAM DIMMs, HDA,
motherboard chipset etc.  You can also busy out the box by running SETI
and/or beat up on the disk drives and take more measurements[2].  Then
after you apply the fixes ... retest.

A couple of pointers that may help.  If your box has an 80mm exhaust fan -
replace it with a 92mm (or 120mm) fan and use a plastic 92mm-to-80mm
adapter.  This'll increase airflow without increasing the noise.  Also,
Zalman makes a small "gizmo" that you put inline with a fan, which allows
you to vary the fan speed and set it to get the best noise/cooling
tradeoff for your box.  It's called the "Fan Mate 2".

Last item on cooling (sorry) - many older systems that used small CPU fan
based coolers, die after only 2 years.  But in many cases, the fan does
not actually stop turning - but slows down dramatically.  And, sometimes
it'll slow down only after it heats up a little.  So if you take the side
cover off after the system has been running for a couple of hours, you'll
see the fan turning slowly - and touching the CPU heatsink will probably
burn your finger.  If you check it a minute after first powering up the
system - it'll look normal and completely fool you.  When this happens
(the fan slows down), the CPU temp will increase, its leakage current will
rise, and it'll draw more power ... which will generate even more
heat.  This is the classic symptom of what we call "thermal runaway".  A
slightly more subtle variant of this issue is with the AMD factory-based
coolers.  After you remove the CPU heatsink fan, you'll notice a lot of
dirt/dust blocking up to 1/2 the area of the heatsink fins.

Re: [zfs-discuss] ZFS problems

2006-11-25 Thread zfs
First thing is I would like to thank everyone for their replies/help. This 
machine has been running for two years under Linux, but for last two or three 
months has had Nexenta Solaris on it. This machine has never once crashed. I 
rebooted with a Knoppix disk in and ran memtest86. Within 30 minutes it counted 
several hundred errors which after cleaning the connections still occurred in 
the same locations. I replaced the RAM module and retested with no errors. My 
md5sums all verified no data was lost making me very happy. I did a zpool scrub 
which came back 100% clean. I still don't understand how the machine ran 
reliably with bad ram. That being said, a few days later I did a zpool status 
and saw 20 checksum errors on one drive and 30 errors on the other. 
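
As an aside, a minimal sketch of that kind of checksum audit (assuming the GNU
md5sum shipped with Nexenta; the paths are only illustrative):

  # record a baseline of every file in the pool
  find /amber -type f -print0 | xargs -0 md5sum > /var/tmp/amber.md5
  # re-run later and list only the files that no longer match
  md5sum -c /var/tmp/amber.md5 | grep -v ': OK$'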

Does anyone have any idea why I have to do "zpool export amber" followed by 
"zpool import amber" for my zpool to be mounted on reboot? zfs set mountpoint 
does nothing.
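
A minimal sketch of the first things worth checking for the mount-on-reboot
problem (stock OpenSolaris paths and SMF service names are assumed; Nexenta
may differ):

  # where ZFS thinks the filesystem should mount, and whether it is mounted now
  zfs get mountpoint,mounted amber
  # the pool cache that is read at boot; if the pool is missing from it,
  # it will not come back on its own after a reboot
  ls -l /etc/zfs/zpool.cache
  # any errors from the service that mounts local filesystems at boot
  svcs -x svc:/system/filesystem/local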

BTW to answer some other concerns, the Seasonic supply is 400 watts with a
guaranteed minimum efficiency of 80%. Using a Kill A Watt meter I have about
120 watts of power consumption. The machine is on a UPS.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems

2006-11-18 Thread Frank Cusack

[ I've seen the response where one astute list participant noticed you're
running a 2-way raidz device, when the documentation clearly states that
the minimum raidz volume consists of 3 devices ]


Not very astute.  The documentation clearly states that the minimum is
2 devices.

zpool(1M):

    A raidz group with N disks of size X can hold approximately
    (N-1)*X bytes and can withstand one device failing before
    data integrity is compromised. The minimum number of
    devices in a raidz group is 2. The recommended number
    is between 3 and 9.

If the minimum were actually 3, this configuration wouldn't work at all.
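
A quick way to see this for yourself, without touching real disks, is a
throwaway pool built on files; a minimal sketch (pool name and scratch paths
are invented for the demo):

  # two small backing files are enough for a test pool
  mkfile 128m /var/tmp/rz0 /var/tmp/rz1
  # zpool accepts a 2-device raidz top-level vdev, as the manpage says
  zpool create rzdemo raidz /var/tmp/rz0 /var/tmp/rz1
  zpool status rzdemo
  # clean up
  zpool destroy rzdemo
  rm /var/tmp/rz0 /var/tmp/rz1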

-frank
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems

2006-11-18 Thread Al Hopper
On Sat, 18 Nov 2006 [EMAIL PROTECTED] wrote:

> I'm new to this group, so hello everyone! I am having some issues with

Welcome!

> my Nexenta system I set up about two months ago as a zfs/zraid server. I
> have two new Maxtor 500GB Sata drives and an Adaptec controller which I
> believe has a Silicon Image chipset. Also I have a Seasonic 80+ power
> supply, so the power should be as clean as you can get. I had an issue

Just wondering (out loud) if your PSU is capable of meeting the demands of
your current hardware - including the zfs related disk drives you just
added and if the system is on a UPS.  Just questions for you to answer and
off topic for this list.  But you'll see that this thought process is
relevant to your particular problem - see more below.

> with Nexenta where I had to reinstall, and since then everytime I reboot
> I have to type
>
> zpool export amber
> zpool import amber
>
> to get my zfs volume mounted. A week ago I noticed a couple of CKSUM
> errors when I did a zpool status, so I did a zpool scrub. This is the
> output after:
>
> # zpool status
>   pool: amber
>  state: ONLINE
> status: One or more devices has experienced an unrecoverable error.  An
> attempt was made to correct the error.  Applications are unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
> using 'zpool clear' or replace the device with 'zpool replace'.
>see: http://www.sun.com/msg/ZFS-8000-9P
>  scrub: scrub completed with 0 errors on Mon Nov 13 04:49:35 2006
> config:
>
> NAMESTATE READ WRITE CKSUM
> amber   ONLINE   0 0 0
>   raidz1ONLINE   0 0 0
> c4d0ONLINE   0 051
> c5d0ONLINE   0 041
>
> errors: No known data errors
>
>
> I have md5sums on a lot of the files and it looks like maybe 5% of my
> files are corrupted. Does anyone have any ideas? I was under the
> impression that zfs was pretty reliable but I guess with any software it
> needs time to get the bugs ironed out.

[ I've seen the response where one astute list participant noticed you're
running a 2-way raidz device, when the documentation clearly states that
the minimum raidz volume consists of 3 devices ]

Going back to zero day (my terminology) for ZFS, when it was first
integrated, if you read the zfs related blogs, you'll realize that zfs is
arguably one of the most extensively tested bodies of software _ever_
added to (Open)Solaris.  If there was a basic issue with zfs, like you
describe above, zfs would never have been integrated (into (Open)Solaris).
You can imagine that there were a lot of willing zfs testers ("please can
I be on the beta test...")[0] - but there were also a few cases of "this
issue has *got* to be ZFS related" - because there were no other
_rational_ explanations.  One such case is mentioned here:

http://blogs.sun.com/roller/page/elowe?anchor=zfs_saves_the_day_ta

I would suggest that you look for some basic hardware problems within your
system.  The first place to start is to download/burn a copy of the
Ultimate Boot CD ROM (UBCD) [1] and run the latest version of
memtest86 for 24 hours.  It's likely that you have hardware issues.

Please keep the list informed

[0] including this author who built hardware specifically to eval/test/use
ZFS and get it into production ASAP to solve a business storage problem
for $6k instead of $30k to $40k.

[1] http://www.ultimatebootcd.com/

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
   Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
 OpenSolaris Governing Board (OGB) Member - Feb 2006
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems

2006-11-18 Thread Toby Thain


On 18-Nov-06, at 2:01 PM, Bill Moore wrote:


Hi Michael.  Based on the output, there should be no user-visible file
corruption.  ZFS saw a bunch of checksum errors on the disk, but was
able to recover in every instance.

While 2-disk RAID-Z is really a fancy (and slightly more expensive,
CPU-wise) way of doing mirroring, at no point should your data be at
risk.

I've been working on ZFS a long time, and if what you say is true, it
will be the first instance I have ever seen (or heard) of such a
phenomenon.  I strongly doubt that somehow ZFS returned corrupted data
without knowing about it.



Also, I'd check your RAM.

--Toby


How are you sure that some application on
your box didn't modify the contents of the files?


--Bill


On Sat, Nov 18, 2006 at 02:01:39AM -0800, [EMAIL PROTECTED]  
wrote:
I'm new to this group, so hello everyone! I am having some issues  
with my Nexenta system I set up about two months ago as a zfs/ 
zraid server. I have two new Maxtor 500GB Sata drives and an  
Adaptec controller which I believe has a Silicon Image chipset.  
Also I have a Seasonic 80+ power supply, so the power should be as  
clean as you can get. I had an issue with Nexenta where I had to  
reinstall, and since then everytime I reboot I have to type


zpool export amber
zpool import amber

to get my zfs volume mounted. A week ago I noticed a couple of  
CKSUM errors when I did a zpool status, so I did a zpool scrub.  
This is the output after:


# zpool status
  pool: amber
 state: ONLINE
status: One or more devices has experienced an unrecoverable  
error.  An
attempt was made to correct the error.  Applications are  
unaffected.
action: Determine if the device needs to be replaced, and clear  
the errors
using 'zpool clear' or replace the device with 'zpool  
replace'.

   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed with 0 errors on Mon Nov 13 04:49:35 2006
config:

NAMESTATE READ WRITE CKSUM
amber   ONLINE   0 0 0
  raidz1ONLINE   0 0 0
c4d0ONLINE   0 051
c5d0ONLINE   0 041

errors: No known data errors


I have md5sums on a lot of the files and it looks like maybe 5% of  
my files are corrupted. Does anyone have any ideas? I was under  
the impression that zfs was pretty reliable but I guess with any  
software it needs time to get the bugs ironed out.


Michael


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems

2006-11-18 Thread Bill Moore
Hi Michael.  Based on the output, there should be no user-visible file
corruption.  ZFS saw a bunch of checksum errors on the disk, but was
able to recover in every instance.

While 2-disk RAID-Z is really a fancy (and slightly more expensive,
CPU-wise) way of doing mirroring, at no point should your data be at
risk.

I've been working on ZFS a long time, and if what you say is true, it
will be the first instance I have ever seen (or heard) of such a
phenomenon.  I strongly doubt that somehow ZFS returned corrupted data
without knowing about it.  How are you sure that some application on
your box didn't modify the contents of the files?
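
One way to answer that programmatically; a minimal sketch assuming a reference
file whose timestamp marks when the checksums were taken (the path is only
illustrative):

  # list files in the pool modified after the md5 baseline was written
  find /amber -type f -newer /var/tmp/amber.md5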


--Bill


On Sat, Nov 18, 2006 at 02:01:39AM -0800, [EMAIL PROTECTED] wrote:
> I'm new to this group, so hello everyone! I am having some issues with my 
> Nexenta system I set up about two months ago as a zfs/zraid server. I have 
> two new Maxtor 500GB Sata drives and an Adaptec controller which I believe 
> has a Silicon Image chipset. Also I have a Seasonic 80+ power supply, so the 
> power should be as clean as you can get. I had an issue with Nexenta where I 
> had to reinstall, and since then everytime I reboot I have to type
> 
> zpool export amber
> zpool import amber
> 
> to get my zfs volume mounted. A week ago I noticed a couple of CKSUM errors 
> when I did a zpool status, so I did a zpool scrub. This is the output after:
> 
> # zpool status
>   pool: amber
>  state: ONLINE
> status: One or more devices has experienced an unrecoverable error.  An
> attempt was made to correct the error.  Applications are unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
> using 'zpool clear' or replace the device with 'zpool replace'.
>see: http://www.sun.com/msg/ZFS-8000-9P
>  scrub: scrub completed with 0 errors on Mon Nov 13 04:49:35 2006
> config:
> 
> NAMESTATE READ WRITE CKSUM
> amber   ONLINE   0 0 0
>   raidz1ONLINE   0 0 0
> c4d0ONLINE   0 051
> c5d0ONLINE   0 041
> 
> errors: No known data errors
> 
> 
> I have md5sums on a lot of the files and it looks like maybe 5% of my files 
> are corrupted. Does anyone have any ideas? I was under the impression that 
> zfs was pretty reliable but I guess with any software it needs time to get 
> the bugs ironed out.
> 
> Michael
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems

2006-11-18 Thread James McPherson

On 11/18/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
...

 scrub: scrub completed with 0 errors on Mon Nov 13 04:49:35 2006
config:

NAMESTATE READ WRITE CKSUM
amber   ONLINE   0 0 0
  raidz1ONLINE   0 0 0
c4d0ONLINE   0 051
c5d0ONLINE   0 041

errors: No known data errors


I have md5sums on a lot of the files and it looks like maybe 5% of
my files are corrupted. Does anyone have any ideas?


Michael,
as far as I can see, your setup does not meet the minimum
redundancy requirements for a Raid-Z, which is 3 devices.
Since you only have 2 devices you are out on a limb.

Please read the manpage for the zpool command and pay
close attention to the restrictions in the section on raidz.



I was under the impression that zfs was pretty reliable but I
guess with any software it needs time to get the bugs ironed out.


ZFS is reliable. I use it - mirrored - at home. If I was going to
use raidz or raidz2 I would make sure that I followed the
instructions in the manpage about the number of devices
I need in order to guarantee redundancy and thus reliability,
rather than making an assumption.

You should also check the output of "iostat -En" and see
whether your devices are listed there with any error counts.
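
A minimal sketch of that check, using the device names from the status
output above (iostat -En prints per-device soft, hard and transport error
counts):

  iostat -En c4d0 c5d0
  # and the pool-level counters for comparison
  zpool status -v amber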


James C. McPherson
--
Solaris kernel software engineer, system admin and troubleshooter
 http://www.jmcp.homeunix.com/blog
Find me on LinkedIn @ http://www.linkedin.com/in/jamescmcpherson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS problems

2006-11-18 Thread zfs
I'm new to this group, so hello everyone! I am having some issues with my 
Nexenta system I set up about two months ago as a zfs/zraid server. I have two 
new Maxtor 500GB Sata drives and an Adaptec controller which I believe has a 
Silicon Image chipset. Also I have a Seasonic 80+ power supply, so the power 
should be as clean as you can get. I had an issue with Nexenta where I had to 
reinstall, and since then everytime I reboot I have to type

zpool export amber
zpool import amber

to get my zfs volume mounted. A week ago I noticed a couple of CKSUM errors 
when I did a zpool status, so I did a zpool scrub. This is the output after:

# zpool status
  pool: amber
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed with 0 errors on Mon Nov 13 04:49:35 2006
config:

NAMESTATE READ WRITE CKSUM
amber   ONLINE   0 0 0
  raidz1ONLINE   0 0 0
c4d0ONLINE   0 051
c5d0ONLINE   0 041

errors: No known data errors


I have md5sums on a lot of the files and it looks like maybe 5% of my files are 
corrupted. Does anyone have any ideas? I was under the impression that zfs was 
pretty reliable but I guess with any software it needs time to get the bugs 
ironed out.

Michael
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss