Re: [zfs-discuss] ZFS + OpenSolaris for home NAS?
I had a problem like that on my laptop, which also has an rge interface: ping worked fine, but ssh and ftp didn't. To get around it I had to add

    set ip:dohwcksum = 0

to /etc/system and reboot. That worked for me and is worth a try for you :)

Cheers,
Alan
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
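For reference, that tunable disables hardware checksum offload globally in the IP stack, working around NICs (like some rge parts) whose offload engine corrupts TCP checksums. A minimal sketch of the edit, shown here against a scratch copy of /etc/system so it can be run safely (the `grep` guard against duplicate entries is my addition; on a real machine you would edit /etc/system itself and reboot):

```shell
# Work around broken NIC checksum offload by disabling it in the IP stack.
# Demonstrated on a scratch copy; substitute /etc/system on a real system.
conf=$(mktemp /tmp/system.XXXXXX)
printf '* /etc/system tunables\n' > "$conf"

# Append the tunable only if it is not already present.
grep -q 'ip:dohwcksum' "$conf" || echo 'set ip:dohwcksum = 0' >> "$conf"

grep 'dohwcksum' "$conf"   # prints: set ip:dohwcksum = 0
# (a reboot is required for /etc/system changes to take effect)
```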
Re: [zfs-discuss] Phenom support in b78
Hi Al,

Thanks for the tips. I've maxed the memory on the board now (up to 8GB from 4GB) and you are dead right about it being cheap to do so. I'd upgraded the power supply, as I thought that was an issue since the original couldn't provide enough start-up current, but that didn't make much difference to the hard hangs. However, after moaning to ASUS I was given a beta BIOS (version 1802, if anyone else needs to chase it up) and that has made a big difference to the system. It's now stable! I'm going to keep an eye on things and see how well it performs; hopefully it'll be worth the upgrade cost and hassle.

Cheers,
Alan

> > So, before I go and shout at the motherboard manufacturer are
> > there any components in b78 that might not be expecting a quad core
> > AMD cpu? Possibly in the marvell88sx driver? Or is there anything
> > more I can do to track this issue down.
>
> Please read the tomshardware.com article[1] where he found that Phenom
> upgrade compatibility is not what AMD would have
> expected/predicted/published.  It's also possible that your CPU VRM
> (voltage regulators) can't supply the necessary current when the
> Phenom gets really busy.
>
> The only way to diagnose this issue is to apply "swap-tronics" to the
> motherboard and power supply.  Welcome to the bleeding edge! :(
>
> IMHO Phenom is far from ready for prime time.  And this is coming from
> an AMD fanboy who has built, bought and recommended AMD based systems
> exclusively for the last 2 1/2 years+.
>
> Squawking at the motherboard maker is unlikely to get you any
> satisfaction IMHO.  Cut your losses and go back to the 5200+ or build
> a system based on a Penryn chip when the less expensive Penryn family
> members become available - proba-bobly[2] within 60 days.
>
> As an aside, with ZFS, you gain more by maxing out your memory than by
> spending the equivalent dollars on a CPU upgrade.  And memory has
> never* been this inexpensive.  Recommendation: max out your memory
> and tune your 5200+ based system for max memory throughput[3].
>
> PS: IMHO Phenom won't be a real contender until they triple the L3
> cache.  The architecture is sound, but currently cache-starved IMHO.
>
> PPS: On a Sun x2200 system (bottom-of-the-line config [2*2.2GHz dual
> core CPUs] purchased during Sun's anniversary sale) we "pushed in" a
> SAS controller, two 140Gb SAS disks and 24Gb of 3rd party RAM[4].
> Yes - configured for ZFS boot and ZFS based filesystems exclusively,
> and currently running snv_68 (due to be upgraded when build 80 ships).
> You cannot believe how responsive this system is - mainly due to the
> RAM.  For a highly performant ZFS system, there are 3 things that you
> should maximize/optimize:
>
> 1) RAM capacity
> 2) RAM capacity
> 3) RAM capacity
>
> PPPS: Sorry to beat this horse into submission - but!  If you have a
> choice (at a given budget) of 800MHz memory parts at N gigabytes
> (capacity), or 667MHz (or 533MHz) memory parts at N * 2 gigabytes -
> *always*[5] go with the config that gives you the maximum memory
> capacity.  You really won't notice the difference between 800MHz
> memory parts and 667MHz memory parts, but you *will* notice the
> difference between the system with 8Gb of RAM and (the same system
> with) 16Gb of RAM when it comes to ZFS (and overall) performance.
>
> [1] http://www.tomshardware.com/2007/12/26/phenom_motherboards/
> [2] deliberate new word - represents techno uncertainty
> [3] memtest86 v3 is your friend.  Available on the UBCD (Ultimate
>     Boot CD-ROM)
> [4] odd mixture of 1Gb and 2Gb parts
> [5] there are some very rare exceptions to this rule - for really
>     unusual workload scenarios (like scientific computing).
>
> HTH.
>
> Regards,
>
> Al Hopper  Logical Approach Inc, Plano, TX.
> [EMAIL PROTECTED]
> Voice: 972.379.2133  Fax: 972.379.2134  Timezone: US CDT
> OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
> http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
> Graduate from "sugar-coating school"?  Sorry - I never attended! :)
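Al's capacity-over-clock advice can be sanity-checked with a little arithmetic: a DDR2 channel moves 8 bytes per transfer, so the theoretical peak gap between DDR2-800 and DDR2-667 parts is only about 20%, while doubling capacity doubles the room ZFS's ARC has to cache data. A quick back-of-the-envelope calculation (figures are theoretical per-channel peaks, not measurements):

```shell
# Theoretical peak bandwidth of one DDR2 channel: transfers/s * 8 bytes.
# The step of 133 MT/s walks from DDR2-667 to DDR2-800.
awk 'BEGIN {
    for (mts = 667; mts <= 800; mts += 133) {
        gbps = mts * 8 / 1000      # GB/s per channel (decimal GB)
        printf "DDR2-%d: %.2f GB/s per channel\n", mts, gbps
    }
}'
# prints:
# DDR2-667: 5.34 GB/s per channel
# DDR2-800: 6.40 GB/s per channel
```

In other words the faster parts buy roughly 1 GB/s of peak that real workloads rarely saturate, whereas the extra gigabytes of ARC avoid disk I/O entirely.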
[zfs-discuss] Phenom support in b78
Hello All,

In a moment of insanity I've upgraded from a 5200+ to a Phenom 9600 on my ZFS server, and I've had a lot of problems with hard hangs when accessing the pool. The motherboard is an Asus M2N32-WS, which has had the latest available BIOS upgrade installed to support the Phenom.

bash-3.2# psrinfo -pv
The physical processor has 4 virtual processors (0-3)
  x86 (AuthenticAMD 100F22 family 16 model 2 step 2 clock 2310 MHz)
        AMD Phenom(tm) 9600 Quad-Core Processor

The pool is spread across 12 disks (3 x 4-disk raidz groups) attached to both the motherboard and a Supermicro AOC-SAT2-MV8 in a PCI-X slot (marvell88sx driver).

The hangs occur during large writes to the pool, e.g. a 10GB mkfile, usually just after the physical disk accesses start, and the file is not created in the directory on the pool at all. The system hard hangs at this point; even when booting under kmdb there's no panic string, and after setting snooping=1 in /etc/system there's no crash dump created after it reboots.

Doing the same operation to a single UFS disk attached to the motherboard's ATA133 interface doesn't cause a problem, and neither does writing to a raidz pool created from 4 files on that ATA disk. If I use psradm and disable any 2 cores on the Phenom there's no problem with the mkfile either, but turn a third back on and it'll hang. This is with the virtualization and power-now extensions disabled in the BIOS.

So, before I go and shout at the motherboard manufacturer: are there any components in b78 that might not be expecting a quad core AMD cpu? Possibly in the marvell88sx driver? Or is there anything more I can do to track this issue down?

Thanks,
Alan
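For anyone wanting to reproduce the large sequential write without Solaris's mkfile(1M), dd gives an equivalent workload. A hedged sketch (the target path is illustrative, and the size is scaled down here; raise count to 10240 for the 10GB write Alan describes):

```shell
# Approximate `mkfile 10g /pool/testfile` with dd: a large sequential
# write of zeroed blocks. Sized at 16 MB here for a safe demonstration.
target=${1:-/tmp/zfs-write-test}
dd if=/dev/zero of="$target" bs=1M count=16 2>/dev/null

# Confirm the file reached the expected size (16 * 1048576 bytes).
ls -l "$target" | awk '{print $5}'   # prints: 16777216
```

Note that mkfile and dd with /dev/zero produce highly compressible data, so on a pool with compression enabled the physical write load will differ from the logical file size.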
[zfs-discuss] zpool iostat -v oddity with snv_73
Having recently upgraded from snv_57 to snv_73 I've noticed some strange behaviour with the -v option to zpool iostat. Without the -v option on an idle pool things look reasonable:

bash-3.00# zpool iostat 1
              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  ----  -----  -----  -----  -----  -----
raidpool    165G  2.92T    109     36  12.5M  3.74M
raidpool    165G  2.92T      0      0      0      0
raidpool    165G  2.92T      0      0      0      0
raidpool    165G  2.92T      0      0      0      0
raidpool    165G  2.92T      0      0      0      0
raidpool    165G  2.92T      0      0      0      0
raidpool    165G  2.92T      0      0      0      0

bash-3.00# iostat -x 1
                  extended device statistics
device      r/s    w/s   kr/s   kw/s  wait  actv  svc_t  %w  %b
cmdk0       0.0    0.0    0.0    0.0   0.0   0.0    0.0   0   0
sd1         0.0    0.0    0.0    0.0   0.0   0.0    0.0   0   0
sd2         0.0    0.0    0.0    0.0   0.0   0.0    0.0   0   0
sd3         0.0    0.0    0.0    0.0   0.0   0.0    0.0   0   0
sd4         0.0    0.0    0.0    0.0   0.0   0.0    0.0   0   0
sd7         0.0    0.0    0.0    0.0   0.0   0.0    0.0   0   0
sd8         0.0    0.0    0.0    0.0   0.0   0.0    0.0   0   0
sd9         0.0    0.0    0.0    0.0   0.0   0.0    0.0   0   0
sd10        0.0    0.0    0.0    0.0   0.0   0.0    0.0   0   0
sd11        0.0    0.0    0.0    0.0   0.0   0.0    0.0   0   0
sd12        0.0    0.0    0.0    0.0   0.0   0.0    0.0   0   0
sd13        0.0    0.0    0.0    0.0   0.0   0.0    0.0   0   0
sd14        0.0    0.0    0.0    0.0   0.0   0.0    0.0   0   0

But with the -v option on the idle pool this seems to constantly generate a mix of reads / writes:

bash-3.00# zpool iostat -v 1
              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  ----  -----  -----  -----  -----  -----
raidpool    165G  2.92T    111     35  12.7M  3.80M
  raidz1   55.4G  1.03T     37     11  4.27M  1.27M
    c1d0       -      -     20      7  1.41M   448K
    c2d0       -      -     20      7  1.40M   448K
    c3d0       -      -     20      7  1.41M   448K
    c4d0       -      -     20      6  1.40M   448K
  raidz1   54.9G  1.03T     37     11  4.23M  1.27M
    c7t0d0     -      -     19      6  1.39M   434K
    c7t1d0     -      -     19      6  1.39M   434K
    c7t6d0     -      -     20      7  1.40M   434K
    c7t7d0     -      -     20      6  1.40M   434K
  raidz1   54.8G   873G     37     11  4.22M  1.27M
    c7t2d0     -      -     19      7  1.39M   434K
    c7t3d0     -      -     20      8  1.39M   434K
    c7t4d0     -      -     20      8  1.40M   434K
    c7t5d0     -      -     20      6  1.40M   433K
----------  ----  -----  -----  -----  -----  -----

              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  ----  -----  -----  -----  -----  -----
raidpool    165G  2.92T    108     35  12.3M  3.68M
  raidz1   55.4G  1.03T     36     11  4.13M  1.23M
    c1d0       -      -     19      7  1.36M   437K
    c2d0       -      -     19      7  1.36M   437K
    c3d0       -      -     19      7  1.36M   437K
    c4d0       -      -     20      6  1.36M   437K
  raidz1   54.9G  1.03T     36     11  4.09M  1.23M
    c7t0d0     -      -     19      6  1.35M   421K
    c7t1d0     -      -     19      6  1.35M   421K
    c7t6d0     -      -     19      7  1.36M   421K
    c7t7d0     -      -     19      6  1.36M   420K
  raidz1   54.8G   873G     36     11  4.09M  1.23M
    c7t2d0     -      -     19      7  1.35M   421K
    c7t3d0     -      -     19      8  1.35M   421K
    c7t4d0     -      -     19      8  1.35M   421K
    c7t5d0     -      -     20      6  1.35M   420K
----------  ----  -----  -----  -----  -----  -----

and so on. This is also seen in the standard iostat, and there is certainly audible disk activity on the machine:

                  extended device statistics
device      r/s     w/s   kr/s    kw/s  wait  actv  svc_t  %w  %b
cmdk0       0.0    10.0    0.0    79.6   0.0   0.0    0.2   0   0
sd1         0.0   175.1    0.0   265.7   0.0   0.0    0.2   0   3
sd2         0.0   175.1    0.0   265.7   0.0   0.0    0.2   0   3
sd3         0.0   175.1    0.0   265.7   0.0   0.0    0.2   0   2
sd4         0.0   173.1    0.0   265.7   0.0   0.0    0.2   0   3
sd7         0.0   178.1    0.0   267.7   0.0   0.1    0.6   0   5
sd8         0.0   128.1    0.0   214.7   0.0   0.0    0.2   0   2
sd9         0.0   188.1    0.0   267.7   0.0   0.0    0.3   0   3
sd10        0.0   128.1    0.0   214.7   0.0   0.0    0.2   0   2
sd11        0.0   196.1    0.0  1605.7   0.0   0.0    0.2   1   3
sd12        4.0   197.1   17.0  1605.7   0.0   0.0    0.2   1   3
sd13        4.0   203.2   17.0  1606.7   0.1   0.1    0.8   4   7
sd14        4.0   143.1   17.0  1554.2   0.1   0.1    1.0   5   7

The zpool iostat -v 1 output is very slow, especially when i/o is done to the pool, and the values f
[zfs-discuss] Re: A Plea for Help: Thumper/ZFS/NFS/B43
Hi Ben,

Your sar output shows one core pegged pretty much constantly! From the Solaris Performance and Tools book, the SLP state value covers "the remainder of important events such as disk and network waits, along with other kernel wait events... kernel locks or condition variables also accumulate time in this state".

ZFS COUNT
  zfs_create        4178
ZFS AVG TIME
  zfs_create        71215587
ZFS SUM TIME
  zfs_create        297538724997

I think it looks like the system must be spinning in zfs_create(). Looking in usr/src/uts/common/fs/zfs/zfs_vnops.c there are a couple of places it could loop:

1129         /*
1130          * Create a new file object and update the directory
1131          * to reference it.
1132          */
...
1154         error = dmu_tx_assign(tx, zfsvfs->z_assign);
1155         if (error) {
1156                 zfs_dirent_unlock(dl);
1157                 if (error == ERESTART &&
1158                     zfsvfs->z_assign == TXG_NOWAIT) {
1159                         dmu_tx_wait(tx);
1160                         dmu_tx_abort(tx);
1161                         goto top;
1162                 }

and

1201         /*
1202          * Truncate regular files if requested.
1203          */
1204         if ((ZTOV(zp)->v_type == VREG) &&
1205             (zp->z_phys->zp_size != 0) &&
1206             (vap->va_mask & AT_SIZE) && (vap->va_size == 0)) {
1207                 error = zfs_freesp(zp, 0, 0, mode, TRUE);
1208                 if (error == ERESTART &&
1209                     zfsvfs->z_assign == TXG_NOWAIT) {
1210                         /* NB: we already did dmu_tx_wait() */
1211                         zfs_dirent_unlock(dl);
1212                         VN_RELE(ZTOV(zp));
1213                         goto top;

I think the snoop would be very useful to pore over.

Cheers,
Alan
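The COUNT / AVG TIME / SUM TIME figures above have the shape of a DTrace fbt aggregation on the zfs module. A hypothetical reconstruction of how such numbers could be gathered (this is my sketch, not the script Ben actually ran, and it must run as root on a Solaris box with the zfs module loaded):

```shell
# Time every zfs_create() call and aggregate the count, average and
# total elapsed nanoseconds -- roughly the shape of the numbers above.
dtrace -n '
fbt:zfs:zfs_create:entry  { self->ts = timestamp; }
fbt:zfs:zfs_create:return /self->ts/
{
        @count["zfs_create"] = count();
        @avg["zfs_create"]   = avg(timestamp - self->ts);
        @sum["zfs_create"]   = sum(timestamp - self->ts);
        self->ts = 0;
}'
```

With SUM TIME around 297 seconds across 4178 calls (about 71ms average per create), a significant slice of wall time is going into this one vnode op, which is consistent with the dmu_tx_assign()/goto top retry loops quoted above.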
[zfs-discuss] Re: Re: Managed to corrupt my pool
Hold fire on the re-init until one of the devs chips in, maybe I'm barking up the wrong tree ;)

--a
[zfs-discuss] Re: Re: Managed to corrupt my pool
Hi Jim,

That looks interesting, though. I'm not a zfs expert by any means, but look at some of the properties of the children elements of the mirror:

    version=3
    name='zmir'
    state=0
    txg=770
    pool_guid=5904723747772934703
    vdev_tree
        type='root'
        id=0
        guid=5904723747772934703
        children[0]
            type='mirror'
            id=0
            guid=15067187713781123481
            metaslab_array=15
            metaslab_shift=28
            ashift=9
            asize=36690722816
            children[0]
                type='disk'
                id=0
                guid=8544021753105415508
                path='/dev/dsk/c3t3d0s0'
                devid='id1,[EMAIL PROTECTED]/a'
                whole_disk=1
                is_spare=1
                DTL=19
            children[1]
                type='disk'
                id=1
                guid=3579059219373561470
                path='/dev/dsk/c3t4d0s0'
                devid='id1,[EMAIL PROTECTED]/a'
                whole_disk=1
                is_spare=1
                DTL=20

If those are the original path ids, and you didn't move the disks on the bus, why is the is_spare flag set on both children?

There are a lot of options to zdb; some can produce a lot of output. Try:

    zdb zmir

Check the drive label contents with:

    zdb -l /dev/dsk/c3t0d0s0
    zdb -l /dev/dsk/c3t1d0s0
    zdb -l /dev/dsk/c3t3d0s0
    zdb -l /dev/dsk/c3t4d0s0

Uberblock info with:

    zdb -uuu zmir

And dataset info with:

    zdb -dd zmir

There are more options, and they give even more info if you repeat the option letter more times (especially the -d flag...). These might be worth posting to help one of the developers spot something.

Cheers,
Alan
[zfs-discuss] Re: SPEC SFS97 benchmark of ZFS,UFS,VxFS
> PxFS performance improvements of the order of 5-6 times are possible
> depending on the workload using the Fastwrite option.

Fantastic! Has this been targeted at directory operations? We've had issues with large directories full of small files being very slow to handle over PxFS.

Are there plans for PxFS on ZFS any time soon :) ? Or any plans to release PxFS as part of OpenSolaris?

Cheers,
Alan
[zfs-discuss] Re: Possible corruption after disk hiccups...
Eh, maybe it's not a problem after all - the scrub has completed cleanly...

--a

bash-3.00# zpool status -v
  pool: raidpool
 state: ONLINE
 scrub: scrub completed with 0 errors on Tue May 9 21:10:55 2006
config:

        NAME        STATE     READ WRITE CKSUM
        raidpool    ONLINE       0     0     0
          raidz     ONLINE       0     0     0
            c2d0    ONLINE       0     0     0
            c3d0    ONLINE       0     0     0
            c4d0    ONLINE       0     0     0
            c5d0    ONLINE       0     0     0
            c6d0    ONLINE       0     0     0
            c6d1    ONLINE       0     0     0
            c7d0    ONLINE       0     0     0
            c7d1    ONLINE       0     0     0

errors: No known data errors
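For anyone following along: a scrub is the standard way to verify every block's checksum after a scare like this. The usual command pair (pool name taken from the status output above; the scrub runs in the background, so status is re-run to watch progress):

```shell
# Walk and verify all data in the pool; repaired/unrepairable errors
# show up in the READ/WRITE/CKSUM columns of the status output.
zpool scrub raidpool
zpool status -v raidpool
```

A clean scrub like the one above means every allocated block was read back and matched its checksum, so the earlier disk timeouts did not leave latent corruption.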
[zfs-discuss] Possible corruption after disk hiccups...
I'm not sure exactly what happened with my box here, but something caused a hiccup on multiple sata disks...

May 9 16:40:33 sol scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED] (ata6):
May 9 16:47:43 sol scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED] (ata3):
May 9 16:47:43 sol     timeout: abort request, target=0 lun=0
May 9 16:47:43 sol scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED] (ata3):
May 9 16:40:33 sol scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED] (ata6):
May 9 16:47:43 sol scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED] (ata3):
May 9 16:47:43 sol     timeout: abort request, target=0 lun=0
May 9 16:47:43 sol scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED] (ata3):
May 9 16:47:43 sol     timeout: abort device, target=0 lun=0
May 9 16:47:43 sol scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED] (ata3):
May 9 16:47:43 sol     timeout: reset target, target=0 lun=0
May 9 16:47:43 sol scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED] (ata3):
May 9 16:47:43 sol     timeout: reset bus, target=0 lun=0
May 9 16:47:43 sol gda: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (Disk1):
May 9 16:47:43 sol     Error for command 'write sector'    Error Level: Informational
May 9 16:47:43 sol gda: [ID 107833 kern.notice]    Sense Key: aborted command
May 9 16:47:43 sol gda: [ID 107833 kern.notice]    Vendor 'Gen-ATA ' error code: 0x3
May 9 16:47:43 sol gda: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (Disk1):
May 9 16:47:43 sol     Error for command 'read sector'     Error Level: Informational
May 9 16:47:43 sol gda: [ID 107833 kern.notice]    Sense Key: aborted command
May 9 16:47:43 sol gda: [ID 107833 kern.notice]    Vendor 'Gen-ATA ' error code: 0x3
May 9 16:47:43 sol gda: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (Disk6):
May 9 16:47:43 sol     Error for command 'write sector'    Error Level: Informational
May 9 16:47:43 sol gda: [ID 107833 kern.notice]    Sense Key: aborted command
May 9 16:47:43 sol gda: [ID 107833 kern.notice]    Vendor 'Gen-ATA ' error code: 0x3
May 9 16:47:43 sol gda: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (Disk5):
May 9 16:47:43 sol     Error for command 'write sector'    Error Level: Informational
May 9 16:47:43 sol gda: [ID 107833 kern.notice]    Sense Key: aborted command
May 9 16:47:43 sol gda: [ID 107833 kern.notice]    Vendor 'Gen-ATA ' error code: 0x3
May 9 16:47:43 sol gda: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (Disk4):
May 9 16:47:43 sol     Error for command 'write sector'    Error Level: Informational
May 9 16:47:43 sol gda: [ID 107833 kern.notice]    Sense Key: aborted command
May 9 16:47:43 sol gda: [ID 107833 kern.notice]    Vendor 'Gen-ATA ' error code: 0x3
May 9 16:47:43 sol gda: [ID 107833 kern.notice]    Sense Key: aborted command
May 9 16:47:43 sol gda: [ID 107833 kern.notice]    Vendor 'Gen-ATA ' error code: 0x3
May 9 16:47:43 sol gda: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (Disk4):
May 9 16:47:43 sol     Error for command 'write sector'    Error Level: Informational
May 9 16:47:43 sol gda: [ID 107833 kern.notice]    Sense Key: aborted command
May 9 16:47:43 sol gda: [ID 107833 kern.notice]    Vendor 'Gen-ATA ' error code: 0x3
May 9 16:47:43 sol gda: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (Disk2):
May 9 16:47:43 sol     Error for command 'write sector'    Error Level: Informational
May 9 16:47:43 sol gda: [ID 107833 kern.notice]    Sense Key: aborted command
May 9 16:47:43 sol gda: [ID 107833 kern.notice]    Vendor 'Gen-ATA ' error code: 0x3
May 9 16:47:43 sol unix: [ID 836849 kern.notice]
May 9 16:47:43 sol ^Mpanic[cpu0]/thread=fe8000581c80:
May 9 16:47:43 sol genunix: [ID 809409 kern.notice] ZFS: I/O failure (write on off 0: zio fe81a5972340 [L0 ZIL intent log] 2000L/2000P DVA[0]=<0:25786c7000:2800> zilog uncompressed LE contiguous birth=1468445 fill=0 cksum=4392a2279563047e:1b7716cbbf370c72:ac:6b): error 5
May 9 16:47:4