Re: [zfs-discuss] Assistance needed expanding RAIDZ with larger drives

2008-01-13 Thread Chris Murray
 About that issue, please check my post in:
 http://www.opensolaris.org/jive/thread.jspa?threadID=48483&tstart=0

Thanks - when I originally tried to replace the first drive, my intention was 
to:
1. Move solaris box and drives
2. Power up to test it still works
3. Power down
4. Replace drive.

I suspect I may have missed out steps 2 & 3, and ran into the same situation that 
you did.

Anyhow, I seem to now be in an even bigger mess than earlier - when I tried to 
simply swap out one of the old drives with a new one and perform a replace, I 
ran into problems:

1. The hard drive light on the PC lit up, and I heard lots of disk noise, as 
you would expect
2. The light went off. My continuous ping did the following:

Reply from 192.168.0.10: bytes=32 time<1ms TTL=255
Reply from 192.168.0.10: bytes=32 time<1ms TTL=255
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from 192.168.0.10: bytes=32 time=2092ms TTL=255
Reply from 192.168.0.10: bytes=32 time<1ms TTL=255
Reply from 192.168.0.10: bytes=32 time<1ms TTL=255

3. The light came back on again ... more disk noise. Good - perhaps the pause 
was just a momentary blip
4. Light goes off (this is about 20 minutes since the start)
5. zpool status reports that a resilver completed, and there are errors in 
zp/storage and zp/VMware, and suggests that I should restore from backup
6. I nearly cry, as these are the only 2 files I use.
7. I have heard of ZFS thinking that there are unrecoverable errors before, so 
I run zpool scrub and then zpool clear a number of times. It seems to make no 
difference.
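
The scrub/clear sequence I kept repeating was essentially:

# re-read and verify every block in the pool
zpool scrub zp
# once the scrub finishes, list anything still flagged as damaged
zpool status -v zp
# reset the error counters and try again
zpool clear zp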

This whole project started when I wanted to move 900 GB of data from a server 
2003 box containing the 4 old disks, to a solaris box. I borrowed 2 x 500 GB 
drives from a friend, copied all the data onto them, put the 4 old drives into 
the solaris box, created the zpool, created my storage and VMware volumes, 
shared them out using iSCSI, created NTFS volumes on the server 2003 box and 
copied the data back onto them. Aside from a couple of networking issues, this 
worked absolutely perfectly. Then I decided I'd like some more space, and 
that's where it all went wrong.
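
Roughly, the Solaris side of that setup looked like the following (the device 
names and volume sizes here are guesses, but the pool and volume names are the 
ones I refer to later):

# raidz pool from the 4 old disks (hypothetical device names)
zpool create zp raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0
# the two volumes that back the Windows NTFS disks (sizes are a guess)
zfs create -V 500g zp/storage
zfs create -V 400g zp/VMware
# export both volumes as iSCSI targets
zfs set shareiscsi=on zp/storage
zfs set shareiscsi=on zp/VMware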

Despite the reports of corruption, the storage and VMware drives do still 
work in Windows. The iSCSI initiator still picks them up, and if I dir /a /s, 
I can see all of the files that were on these NTFS volumes before I tried this 
morning's replace. However, should I trust this? I suspect that even if I ran a 
chkdsk /f, a successful result may not be all that it seems. I still have the 
2 x 500 GB drives with my data from weeks ago. I'd be sad to lose a few weeks' 
worth of work, but that would be better than assuming that ZFS is incorrect in 
saying the volumes are corrupt and then discovering in months' time that I 
cannot get at the NTFS files because of this root cause.
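
One check I can at least do from the Solaris side is to see exactly what ZFS 
flags as damaged, and to force a full read of each volume so that any bad 
blocks show up as read errors (the volume paths are from my pool; the rest is 
standard):

# for zvols, the damaged "files" are reported as the volume itself
zpool status -v zp
# read every block of each volume; checksum failures surface as I/O errors
dd if=/dev/zvol/rdsk/zp/storage of=/dev/null bs=1048576
dd if=/dev/zvol/rdsk/zp/VMware of=/dev/null bs=1048576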

Since the report of corruption in these 2 volumes, I had a genius 
troubleshooting idea - what if the problem is not with ZFS, but instead with 
Solaris not liking the drives in general? I exported my current zpool, 
disconnected all drives, plugged in the 4 new ones, and waited for the system 
to boot again... nothing. The system had stopped in the BIOS, requesting that I 
press F1 as SMART reported that one of the drives is bad! Already?!? I only 
bought the drives a few days ago!!! Now the problem is that I know which of 
these drives is bad, but I don't know whether it was the one that was plugged 
in when zpool status reported all the read/write/checksum errors.
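
To work out which physical disk is which (and whether the SMART-failing one is 
the drive I used in the failed replace), matching serial numbers from Solaris 
should do it, assuming the drive is still visible at all:

# export before shuffling disks around
zpool export zp
# after reconnecting, map controller/target names to drive serial numbers
iostat -En | egrep 'Errors|Serial No'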

So maybe I have a duff batch of drives. I leave the remaining 3 plugged in and 
create a brand new zpool called test. No problems at all. I create a 1300 GB 
volume on it. Also no problem. I'm currently overwriting it with random data:

dd if=/dev/urandom of=/dev/zvol/rdsk/test/test bs=1048576 count=1331200

I throw in the odd zpool scrub to see how things are doing, and as yet there 
hasn't been a single error of any sort. So, 3 of the WD drives (0430739, 
0388708, 0417089) appear to be fine and one is dead already (0373211).
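
For completeness, the rest of the test around that dd looked something like 
this (whether the scratch pool is raidz or a plain stripe doesn't matter much 
here; device names are hypothetical):

# scratch pool from the three good new drives
zpool create test c3t0d0 c3t1d0 c3t2d0
# the 1300 GB volume being overwritten above
zfs create -V 1300g test/test
# after (or during) the overwrite, scrub and check for errors
zpool scrub test
zpool status -v test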

So this leads me to the conclusion that (ignoring the bad one), these drives 
work fine with Solaris. They work fine with ZFS too. It's just the act of 
trying to replace a drive from my old zpool with a new one that causes issues.

My next step will be to run the WD diagnostics on all drives, send the broken 
one back, and then have 4 fully functioning 750 GB drives. I'll also import the 
old zpool into the solaris box - it'll undoubtedly complain that one of the 
drives is missing (the one that I tried to add earlier and got all the errors), 
so I think I'll try one more replace to get all 4 old drives back in the pool. 
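
In command terms, that plan is roughly the following (the device name for the 
re-inserted old drive is hypothetical):

# force the import even though one drive is missing
zpool import -f zp
# put the original drive back in place of the failed new one, and resilver
zpool replace zp c2t3d0
zpool status zp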

So, what do I do after that?

1. Create a brand new pool out of the WD drives, share it using iSCSI, and copy 
my data from my friend's drives onto it? I'll have lost a good few weeks of 
work, but I'll be confident that it isn't corrupt.
2. Ignore the fact 

Re: [zfs-discuss] Panic on Zpool Import (Urgent)

2008-01-13 Thread Prabahar Jeyaram
Your system seems to have hit bug 6458218:

http://bugs.opensolaris.org/view_bug.do?bug_id=6458218

It is fixed in snv_60. As far as ZFS is concerned, snv_43 is quite old.
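
A quick way to confirm which build a system is running:

# the kernel build shows up in both of these
uname -v
head -1 /etc/release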

--
Prabahar.

On Jan 12, 2008, at 11:15 PM, Ben Rockwood wrote:

 Today, suddenly, without any apparent reason that I can find, I'm
 getting panics during zpool import.  The system panicked earlier today
 and has been suffering since.  This is snv_43 on a thumper.  Here's the
 stack:

 panic[cpu0]/thread=99adbac0: assertion failed: ss != NULL, file:
 ../../common/fs/zfs/space_map.c, line: 145

 fe8000a240a0 genunix:assfail+83 ()
 fe8000a24130 zfs:space_map_remove+1d6 ()
 fe8000a24180 zfs:space_map_claim+49 ()
 fe8000a241e0 zfs:metaslab_claim_dva+130 ()
 fe8000a24240 zfs:metaslab_claim+94 ()
 fe8000a24270 zfs:zio_dva_claim+27 ()
 fe8000a24290 zfs:zio_next_stage+6b ()
 fe8000a242b0 zfs:zio_gang_pipeline+33 ()
 fe8000a242d0 zfs:zio_next_stage+6b ()
 fe8000a24320 zfs:zio_wait_for_children+67 ()
 fe8000a24340 zfs:zio_wait_children_ready+22 ()
 fe8000a24360 zfs:zio_next_stage_async+c9 ()
 fe8000a243a0 zfs:zio_wait+33 ()
 fe8000a243f0 zfs:zil_claim_log_block+69 ()
 fe8000a24520 zfs:zil_parse+ec ()
 fe8000a24570 zfs:zil_claim+9a ()
 fe8000a24750 zfs:dmu_objset_find+2cc ()
 fe8000a24930 zfs:dmu_objset_find+fc ()
 fe8000a24b10 zfs:dmu_objset_find+fc ()
 fe8000a24bb0 zfs:spa_load+67b ()
 fe8000a24c20 zfs:spa_import+a0 ()
 fe8000a24c60 zfs:zfs_ioc_pool_import+79 ()
 fe8000a24ce0 zfs:zfsdev_ioctl+135 ()
 fe8000a24d20 genunix:cdev_ioctl+55 ()
 fe8000a24d60 specfs:spec_ioctl+99 ()
 fe8000a24dc0 genunix:fop_ioctl+3b ()
 fe8000a24ec0 genunix:ioctl+180 ()
 fe8000a24f10 unix:sys_syscall32+101 ()

 syncing file systems... done

 This is almost identical to a post to this list over a year ago titled
 "ZFS Panic".  There was follow-up on it but the results didn't make it
 back to the list.

 I spent time doing a full sweep for any hardware failures, pulled 2
 drives that I suspected were problematic but weren't flagged as such,
 etc, etc, etc.  Nothing helps.

 Bill suggested a 'zpool import -o ro' on the other post, but that's not
 working either.

 I _can_ use 'zpool import' to see the pool, but I have to force the
 import.  A simple 'zpool import' returns output in about a minute.
 'zpool import -f poolname' takes almost exactly 10 minutes every single
 time, like it hits some timeout and then panics.

 I did notice that while the 'zpool import' is running 'iostat' is
 useless, just hangs.  I still want to believe this is some device
 misbehaving but I have no evidence to support that theory.

 Any and all suggestions are greatly appreciated.  I've put around 8
 hours into this so far and I'm getting absolutely nowhere.

 Thanks

 benr.



Re: [zfs-discuss] Panic on Zpool Import (Urgent)

2008-01-13 Thread Akhilesh Mritunjai
Hi Ben

Not that I know much, but while monitoring the posts I read some time ago that 
there was a bug/race condition in the slab allocator which results in a panic 
on a double free (ss != NULL).

I think the zpool is fine but your system is tripping on this bug. Since it is 
snv_43, I'd suggest upgrading. Is an LU/fresh install possible? Can you quickly 
try importing it on a BeleniX live CD/USB?
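
If you do try it from a newer live environment, a cautious first step might be 
a forced read-only import, along the lines of the 'zpool import -o ro' Ben 
mentioned trying (pool name is a placeholder):

# read-only import avoids writing anything while you investigate
zpool import -f -o ro poolname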

- Akhilesh

PS: I'll post the bug# if I find it.
 
 


Re: [zfs-discuss] Panic on Zpool Import (Urgent)

2008-01-13 Thread Akhilesh Mritunjai
Most probable culprit (close, but not identical stacktrace):

http://bugs.opensolaris.org/view_bug.do?bug_id=6458218

Fixed since snv_60.
 
 


Re: [zfs-discuss] Panic on Zpool Import (Urgent)

2008-01-13 Thread Rob Logan
As it's been pointed out, it's likely 6458218,
but a zdb -e poolname
will tell you a little more.

Rob
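
(That runs against the pool while it's still un-imported; capturing the output 
for the list is handy, e.g.:)

# examine the pool's on-disk state without importing it, keeping the output
zdb -e poolname 2>&1 | tee zdb-poolname.out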



Re: [zfs-discuss] Phenom support in b78

2008-01-13 Thread Al Hopper
On Sat, 12 Jan 2008, Alan Romeril wrote:

[  reformatted  ]
 Hello All,

In a moment of insanity I've upgraded from a 5200+ to a Phenom 
 9600 on my zfs server and I've had a lot of problems with hard hangs 
 when accessing the pool. The motherboard is an Asus M2N32-WS, which 
 has had the latest available BIOS upgrade installed to support the 
 Phenom.

 bash-3.2# psrinfo -pv
 The physical processor has 4 virtual processors (0-3)
  x86 (AuthenticAMD 100F22 family 16 model 2 step 2 clock 2310 MHz)
AMD Phenom(tm) 9600 Quad-Core Processor

The pool is spread across 12 disks (3 x 4-disk raidz groups) 
 attached to both the motherboard and a Supermicro AOC-SAT2-MV8 in a 
 PCI-X slot (marvell88sx driver).  The hangs occur during large 
 writes to the pool, i.e. a 10G mkfile, usually just after the 
 physical disk accesses start, and the file is not created in the 
 directory on the pool at all.  The system hard hangs at this point; 
 even when booting under kmdb there's no panic string, and after 
 setting snooping=1 in /etc/system there's no crash dump created 
 after it reboots.  Doing the same operation to a single UFS disk 
 attached to the motherboard's ATA133 interface doesn't cause a 
 problem, neither does writing to a raidz pool created from 4 files 
 on that ATA disk.  If I use psradm and disable any 2 cores on the 
 Phenom there's no problem with the mkfile either, but turn a third 
 on and it'll hang.  This is with the virtualization and PowerNow! 
 extensions disabled in the BIOS.

So, before I go and shout at the motherboard manufacturer, are 
 there any components in b78 that might not be expecting a quad-core 
 AMD CPU?  Possibly in the marvell88sx driver?  Or is there anything 
 more I can do to track this issue down?
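
A minimal sketch of the core-disabling test Alan describes, assuming processor 
IDs 2 and 3 and a pool mounted at /pool:

# take two of the four cores offline
psradm -f 2 3
psrinfo                    # confirm they show as off-line
# the large sequential write that triggers the hang with 3+ cores online
mkfile 10g /pool/bigfile
# bring the cores back online afterwards
psradm -n 2 3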

Please read the tomshardware.com article[1], whose author found that Phenom 
upgrade compatibility is not what AMD would have 
expected/predicted/published.  It's also possible that your CPU VRM 
(voltage regulators) can't supply the necessary current when the 
Phenom gets really busy.

The only way to diagnose this issue is to apply swap-tronics to the 
motherboard and power supply.  Welcome to the bleeding edge!  :(

IMHO Phenom is far from ready for prime time.  And this is coming from 
an AMD fanboy who has built, bought and recommended AMD based systems 
exclusively for the last 2 1/2 years+.

Squawking at the motherboard maker is unlikely to get you any 
satisfaction IMHO.  Cut your losses and go back to the 5200+ or build 
a system based on a Penryn chip when the less expensive Penryn family 
members become available - proba-bobly[2] within 60 days.

As an aside, with ZFS, you gain more by maxing out your memory than by 
spending the equivalent dollars on a CPU upgrade.  And memory has 
*never* been this inexpensive.  Recommendation: max out your memory 
and tune your 5200+ based system for max memory throughput[3].

PS: IMHO Phenom won't be a real contender until they triple the L3 
cache.  The architecture is sound, but currently cache-starved IMHO.

PPS: On a Sun x2200 system (bottom-of-the-line config [2*2.2GHz dual 
core CPUs] purchased during Sun's anniversary sale) we pushed in a 
SAS controller, two 140Gb SAS disks and 24Gb of 3rd party RAM[4]. 
Yes - configured for ZFS boot and ZFS based filesystems exclusively 
and currently running snv_68 (due to be upgraded when build 80 ships). 
You cannot believe how responsive this system is - mainly due to the 
RAM.  For a highly performant ZFS system, there are 3 things that you 
should maximize/optimize:

1) RAM capacity
2) RAM capacity
3) RAM capacity

PPPS: Sorry to beat this horse into submission - but!  If you have a 
choice (at a given budget) of 800MHz memory parts at N gigabytes 
(capacity), or 667MHz (or 533MHz) memory parts at N * 2 gigabytes - 
*always*[5] go with the config that gives you the maximum memory 
capacity.  You really won't notice the difference between 800MHz 
memory parts and 667MHz memory parts, but you *will* notice the 
difference between the system with 8Gb of RAM and (the same system 
with) 16Gb of RAM when it comes to ZFS (and overall) performance.

[1] http://www.tomshardware.com/2007/12/26/phenom_motherboards/
[2] deliberate new word - represents techno uncertainty
[3] memtest86 v3 is your friend.  Available on the UBCD (Ultimate 
Boot CD ROM)
[4] odd mixture of 1Gb and 2Gb parts
[5] there are some very rare exceptions to this rule - for really 
unusual workload scenarios (like scientific computing).

HTH.

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
Graduate from sugar-coating school?  Sorry - I never attended! :)