Re: [zfs-discuss] lost zpool when server restarted.
Looking at the txg numbers, it's clear that the labels on the devices that are now unavailable may be stale:

Krzys wrote:
> When I do zdb on emcpower3a which seems to be ok from zpool perspective I get
> the following output:
> bash-3.00# zdb -lv /dev/dsk/emcpower3a
>
> LABEL 0
>
> version=3
> name='mypool'
> state=0
> txg=4367380
> pool_guid=4148251638983938048
> top_guid=9690155374174551757
> guid=9690155374174551757
> vdev_tree
> type='disk'
> id=2
> guid=9690155374174551757
> path='/dev/dsk/emcpower3a'
> whole_disk=0
> metaslab_array=1813
> metaslab_shift=30
> ashift=9
> asize=134208815104

Here we have txg=4367380, but on the other two devices (probably; at least on one of them) txg=4367379:

> But when I do zdb on emcpower0a which seems to be not that ok and get the
> following output:
> bash-3.00# zdb -lv /dev/dsk/emcpower0a
>
> LABEL 0
>
> version=3
> name='mypool'
> state=0
> txg=4367379
> pool_guid=4148251638983938048
> top_guid=14125143252243381576
> guid=14125143252243381576
> vdev_tree
> type='disk'
> id=0
> guid=14125143252243381576
> path='/dev/dsk/emcpower0a'
> whole_disk=0
> metaslab_array=13
> metaslab_shift=29
> ashift=9
> asize=107365269504
> DTL=727
>
> that also is the same for emcpower2a in my pool.

What does 'zdb -uuu mypool' say?

> Is there a way to be able to fix failed LABELs 2 and 3? I know you need 4 of
> them, but is there a way to reconstruct them in any way?

It looks like the problem is not that labels 2 and 3 are missing, but that labels 0 and 1 are stale.

> Or is my pool lost completely and I need to recreate it?
> It would be odd that reboot of a server could cause such disaster.

There's a Dirty Time Log (DTL) object allocated for the device with unreadable labels, which means that the device in question was unavailable for some time, so something weird may have been going on with your storage a while back (prior to the reboot)...

> But I was unable to find anywhere where people would be able to
> repair or recreate those LABELS. How would I recover my zpools? Any
> help or suggestion is greatly appreciated.

Have you seen this thread - http://www.opensolaris.org/jive/thread.jspa?messageID=220125 ? I think some of that experience may be applicable to this case as well.

Btw, what kind of Solaris are you running?

wbr,
victor
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
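The txg comparison above can be automated when there are many devices to check. The following is a minimal sketch, assuming the text of `zdb -l` for each device has already been captured into a string; the helper names `label_txgs` and `stale_devices` are made up for illustration and are not part of any ZFS tool.

```python
import re


def label_txgs(zdb_output_by_device):
    """Map each device to the first txg found in its captured label text."""
    txgs = {}
    for device, text in zdb_output_by_device.items():
        # Anchor on a line that is exactly "txg=<number>" (optionally indented).
        match = re.search(r"^\s*txg=(\d+)\s*$", text, re.MULTILINE)
        txgs[device] = int(match.group(1)) if match else None
    return txgs


def stale_devices(txgs):
    """Return the devices whose label txg is behind the newest txg seen."""
    known = [t for t in txgs.values() if t is not None]
    if not known:
        return []
    newest = max(known)
    return sorted(d for d, t in txgs.items() if t is not None and t < newest)


if __name__ == "__main__":
    # Values taken from the two labels quoted in this thread.
    sample = {
        "/dev/dsk/emcpower3a": "version=3\nname='mypool'\nstate=0\ntxg=4367380\n",
        "/dev/dsk/emcpower0a": "version=3\nname='mypool'\nstate=0\ntxg=4367379\n",
    }
    txgs = label_txgs(sample)
    print(txgs)
    print("possibly stale:", stale_devices(txgs))
```

A lower txg only hints that a label is older than its peers; it does not by itself say why the device fell behind.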
Re: [zfs-discuss] recovering data from a dettach mirrored vdev
Oh, you're right! Well, that will simplify things! All we have to do is convince a few bits of code to ignore ub_txg == 0. I'll try a couple of things and get back to you in a few hours... Jeff On Fri, May 02, 2008 at 03:31:52AM -0700, Benjamin Brumaire wrote: > Hi, > > while diving deeply in zfs in order to recover data I found that every > uberblock in label0 does have the same ub_rootbp and a zeroed ub_txg. Does it > means only ub_txg was touch while detaching? > > Hoping it is the case, I modified ub_txg from one uberblock to match the tgx > from the label and now I try to calculate the new SHA256 checksum but I > failed. Can someone explain what I did wrong? And of course how to do it > correctly? > > bbr > > > The example is from a valid uberblock which belongs an other pool. > > Dumping the active uberblock in Label 0: > > # dd if=/dev/dsk/c0d1s4 bs=1 iseek=247808 count=1024 | od -x > 1024+0 records in > 1024+0 records out > 000 b10c 00ba 0009 > 020 8bf2 8eef f6db c46f 4dcc > 040 bba8 481a 0001 > 060 05e6 0003 0001 > 100 05e6 005b 0001 > 120 44e9 00b2 0001 0703 800b > 140 > 160 8bf2 > 200 0018 a981 2f65 0008 > 220 e734 adf2 037a cedc d398 c063 > 240 da03 8a6e 26fc 001c > 260 > * > 0001720 7a11 b10c da7a 0210 > 0001740 3836 20fb e2a7 a737 a947 feed 43c5 c045 > 0001760 82a8 133d 0ba7 9ce7 e5d5 64e2 2474 3b03 > 0002000 > > Checksum is at pos 01740 01760 > > I try to calculate it assuming only uberblock is relevant. > #dd if=/dev/dsk/c0d1s4 bs=1 iseek=247808 count=168 | digest -a sha256 > 168+0 records in > 168+0 records out > 710306650facf818e824db5621be394f3b3fe934107bdfc861bbc82cb9e1bbf3 > > Helas not matching :-( > > > This message posted from opensolaris.org > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
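For anyone following along with a raw dump, here is a small sketch that decodes the five 64-bit fields at the start of an uberblock slot and flags a zeroed ub_txg. It assumes the layout ub_magic, ub_version, ub_txg, ub_guid_sum, ub_timestamp followed by the 128-byte ub_rootbp (which adds up to the 168 bytes used in the dd command above), and the magic value 0x00bab10c visible at the start of the dump. The function name is hypothetical, and the sketch does not attempt the checksum.

```python
import struct

UBERBLOCK_MAGIC = 0x00BAB10C
HEADER = "5Q"  # ub_magic, ub_version, ub_txg, ub_guid_sum, ub_timestamp


def parse_uberblock_header(raw):
    """Decode the leading uberblock fields, trying both byte orders."""
    size = struct.calcsize("<" + HEADER)  # 40 bytes
    if len(raw) < size:
        raise ValueError("need at least %d bytes" % size)
    for prefix, order in (("<", "little"), (">", "big")):
        magic, version, txg, guid_sum, timestamp = struct.unpack(
            prefix + HEADER, raw[:size]
        )
        if magic == UBERBLOCK_MAGIC:
            return {
                "byte_order": order,
                "version": version,
                "txg": txg,
                "guid_sum": guid_sum,
                "timestamp": timestamp,
                "txg_zeroed": txg == 0,
            }
    raise ValueError("no uberblock magic found in either byte order")


if __name__ == "__main__":
    # Synthetic slot: valid magic, version 9, ub_txg zeroed, empty rootbp.
    slot = struct.pack("<5Q", UBERBLOCK_MAGIC, 9, 0, 123, 456) + bytes(128)
    print(parse_uberblock_header(slot))
```

Feeding it the first 168 bytes read with dd from each slot would show which slots carry a zeroed txg.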
[zfs-discuss] Debugging filesystem lock-ups
Hello,

I'm using snv_81 x86 as a file server and occasional CPU server at home. It consists of one system disk with normal UFS/swap and one pool of six disks in a raidz1 configuration.

Every now and again the raidz file systems will lock up hard. Any access to them will block in IO-wait. Trying to reboot will lock up the system, so pressing the reset button is the only option. After a reboot everything works fine again.

I can usually trigger the problem within 12 hours by doing lots of compilations in parallel, but just leaving it alone serving files via Samba and NFS will trigger it within a couple of weeks. The problem has been there ever since I installed snv_55 on it way back, so my guess is that it's not a systematic error in ZFS, but rather a driver problem or a hardware glitch. The trick is figuring out which of those two it is so I can correct it.

I should mention that we are talking about disks, whose natural state is of course full:

% df -h /famine
Filesystem   size   used  avail capacity  Mounted on
famine       2.7T    72K    14G     1%    /famine

So...

1. The root shell still works. How do I go about trying to debug things when the filesystems lock up?

2. This is a pretty well ventilated chassis, but trying to determine if things get too hot or pull too much power is always prudent. Where should I look for information on how to set up MB sensors and SMART access?

3. The pool is divided between two SIL-3114 cards flashed to the non-RAID BIOS version, running on a single P4 CPU. Any known problems with that configuration?

TIA,
--
Peter Bortas
Re: [zfs-discuss] cp -r hanged copying a directory
Wow, thanks Dave. Looks like you've had this hell too :)

So, that makes me happy that the disks and pool are probably OK, but it does seem to be an issue with the NVidia MCP 55 chipset, or at least perhaps the nv_sata driver. From reading the bug list below, it seems the problem might be a more general disk driver problem, perhaps not just limited to the nv_sata driver.

I looked at the post you listed, and followed a chain of bug reports:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6658565 (Accepted)
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6642154 (dup of 6662881)
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6662881 (fixed in snv_87)
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6669134 (fixed in snv_90)
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6662400 (dup of 6669134)
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6671523 (Accepted)
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6672202 (Need more info)

Maybe I'll try snv_87 again, or just wait until snv_90 is released. My workaround until this issue is fixed is to use 'rsync' for intra-pool copies, as it seems to stress the system less and thus prevents the ZFS file system lockup.

Thanks to everyone who helped. I might post a link to this thread in the storage-discuss group to see if anyone there knows any more details on this.

That Supermicro AOC MV8 card looks good, but I would prefer not to have to buy new hardware to fix what should hopefully turn out to be a driver problem.
Re: [zfs-discuss] cp -r hanged copying a directory
I have similar, but not exactly the same drives: format> inq Vendor: ATA Product: WDC WD7500AYYS-0 Revision: 4G30 Same firmware revision. I have no problems with drive performance, although I use them under UFS and for backing stores for iscsi disks. FYI, I had random lockups and crashes on my Tyan MB with the MCP55 chipset. I bought Supermicro AOL-SAT2-MV8's and moved all my disks to them. Haven't had a problem since. http://de.opensolaris.org/jive/thread.jspa?messageID=204736 -- Dave On 05/03/2008 01:44 PM, Simon Breden wrote: > @Max: I've not tried this with other file systems, and not with multiple dd > commands at the same time with raw disks. I suppose this is not possible to > do with my disks which are currently part of this RAIDZ1 vdev in the pool > without corrupting data? I'll assume not. > > @Rob: OK, let's assume that, like you say, it's not a ZFS issue, but in fact > a drive, firmware etc issue. That said, where should I create a new thread -- > in storage-discuss ? I will refer to these 2 threads here for all the gory > details ;-) > > If this can be proven to be a disk problem then I want to return them under > warranty and get some different ones. Normally these disks have absolutely > excellent user feedback on newegg.com, so I'm quite surprised if the disks > are ALL bad. > > I wonder if, in fact, there could be some issue between the motherboard's > BIOS and the drives, or is this not possible? The motherboard is an "Asus > M2N-SLI Deluxe" and it uses an NVidia controller (MCP 55?) which is part of > the NVidia 570 SLI chipset, again not exactly an exotic, unused chipset. > > If it's possible for the BIOS to affect the disk interface then I have seen > that a new BIOS is available, which I could try. > > Also I could update to snv_b87, which is the latest one, although I first saw > this problem with that build (87), so the OS upgrade might not help. 
Re: [zfs-discuss] cp -r hanged copying a directory
@Max: I've not tried this with other file systems, and not with multiple dd commands at the same time with raw disks. I suppose this is not possible to do with my disks which are currently part of this RAIDZ1 vdev in the pool without corrupting data? I'll assume not. @Rob: OK, let's assume that, like you say, it's not a ZFS issue, but in fact a drive, firmware etc issue. That said, where should I create a new thread -- in storage-discuss ? I will refer to these 2 threads here for all the gory details ;-) If this can be proven to be a disk problem then I want to return them under warranty and get some different ones. Normally these disks have absolutely excellent user feedback on newegg.com, so I'm quite surprised if the disks are ALL bad. I wonder if, in fact, there could be some issue between the motherboard's BIOS and the drives, or is this not possible? The motherboard is an "Asus M2N-SLI Deluxe" and it uses an NVidia controller (MCP 55?) which is part of the NVidia 570 SLI chipset, again not exactly an exotic, unused chipset. If it's possible for the BIOS to affect the disk interface then I have seen that a new BIOS is available, which I could try. Also I could update to snv_b87, which is the latest one, although I first saw this problem with that build (87), so the OS upgrade might not help. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] cp -r hanged copying a directory
Hi Simon, Simon Breden wrote: > Thanks Max, and the fact that rsync stresses the system less would help > explain why rsync works, and cp hangs. The directory was around 11GB in size. > > If Sun engineers are interested in this problem then I'm happy to run > whatever commands they give me -- after all, I have a pure goldmine here for > them to debug ;-) And it *is* running on a ZFS filesystem. Opportunities like > this don't come along every day :) Tempted? :) > > Well, if I can't tempt Sun, then for anyone who has the same disks, I would > be interested to see what happens on your machine: > Model Number: WD7500AAKS-00RBA0 > Firmware revision: 4G30 > > I use three of these disks in a RAIDZ1 vdev within the pool. > > I think Rob Logan is probably correct, and there is a problem with the disks, not zfs. Have you tried this with a different file system (ufs), or multiple dd commands running at the same time with the raw disks? max ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] cp -r hanged copying a directory
Thanks Max, and the fact that rsync stresses the system less would help explain why rsync works, and cp hangs. The directory was around 11GB in size. If Sun engineers are interested in this problem then I'm happy to run whatever commands they give me -- after all, I have a pure goldmine here for them to debug ;-) And it *is* running on a ZFS filesystem. Opportunities like this don't come along every day :) Tempted? :) Well, if I can't tempt Sun, then for anyone who has the same disks, I would be interested to see what happens on your machine: Model Number: WD7500AAKS-00RBA0 Firmware revision: 4G30 I use three of these disks in a RAIDZ1 vdev within the pool. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] cp -r hanged copying a directory
oops, I lied... according to myself
http://mail.opensolaris.org/pipermail/zfs-discuss/2008-January/045141.html
"wait" are queued in Solaris and "active" > 1 are in the drive's NCQ.

so the question is: where are the drives' commands getting dropped across 3 disks at the same time? and in all cases it's not a zfs issue, but a disk, controller or [EMAIL PROTECTED] issue.

Rob
Re: [zfs-discuss] cp -r hanged copying a directory
no amount of playing with cp will fix a drive FW issue. but as pointed out the slower rsync will tax the FW less. Looking at

device r/s w/s kr/s kw/s wait actv svc_t %w %b s/w h/w trn tot us sy wt id
sd0    0.0 0.0  0.0  0.0 35.0  0.0   0.0 100  0   0   0   0   0

it seems the disk still has requests queued but not active, so the

echo "set sata:sata_max_queue_depth = 0x1" >> /etc/system

didn't work.. (perhaps running bits older than snv_74?)
http://bugs.opensolaris.org/view_bug.do?bug_id=6589306
so let's try the older

echo "set sata:sata_func_enable = 0x7" >> /etc/system

but of course fixing the drive FW is the answer.

ref: http://mail.opensolaris.org/pipermail/storage-discuss/2008-January/004428.html

Rob
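The "queued but not active" reading of the iostat row can be checked mechanically over a long capture. This is a rough sketch, assuming each device row has whitespace-separated columns in the order r/s w/s kr/s kw/s wait actv svc_t %w %b (as in the header above); the function names and the "stuck" heuristic are illustrative assumptions, not an established diagnostic.

```python
COLUMNS = ("r/s", "w/s", "kr/s", "kw/s", "wait", "actv", "svc_t", "%w", "%b")


def parse_iostat_row(line):
    """Split one 'iostat -x' device row into a dict of floats."""
    fields = line.split()
    if len(fields) < 1 + len(COLUMNS):
        raise ValueError("row has too few columns: %r" % line)
    stats = dict(zip(COLUMNS, (float(v) for v in fields[1:1 + len(COLUMNS)])))
    stats["device"] = fields[0]
    return stats


def looks_stuck(stats):
    """Heuristic: requests are waiting in the host queue, none are active."""
    return stats["wait"] > 0 and stats["actv"] == 0


if __name__ == "__main__":
    row = "sd0 0.0 0.0 0.0 0.0 35.0 0.0 0.0 100 0 0 0 0 0"
    stats = parse_iostat_row(row)
    print(stats["device"], "stuck" if looks_stuck(stats) else "ok")
```

A single sample proves little; the pattern is only interesting when it persists across many consecutive intervals.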
Re: [zfs-discuss] cp -r hanged copying a directory
Hi Simon, Simon Breden wrote: > The plot thickens. I replaced 'cp' with 'rsync' and it worked -- I ran it a > few times and it didn't hang so far. > > So on the face of it, it appears that 'cp' is doing something that causes my > system to hang if the files are read from and written to the same pool, but > simply replacing 'cp' with 'rsync' works. Hmmm... anyone have a clue about > what I can do next to home in on the problem with 'cp' ? > > Here is the output using 'rsync' : > > bash-3.2$ truss -topen rsync -a z1 z2 > open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT > The rsync command and cp command work very differently. cp mmaps up to 8MB of the input file and writes from the returned address of mmap, faulting in the pages as it writes (unless you are a normal user on Indiana, in which case cp is gnu's cp which reads/writes (so, why are there 2 versions?)). Rsync forks and sets up a socketpair between parent and child processes then reads/writes. It should be much slower than cp, and put much less stress on the disk. It would be great to have a way to reproduce this. I have not had any problems. How large is the directory you are copying? Either the disk has not sent a response to an I/O operation, or the response was somehow lost. If I could reproduce the problem, I might try to dtrace the commands being sent to the HBA and responses coming back... Hopefully someone here who has experience with the disks you are using will be able to help. max ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
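To make the difference between the two copy strategies concrete, here is a minimal Python sketch of both: one maps the input in windows of up to 8MB and writes straight from the mapping (so pages are faulted in during the write), the other uses a plain read/write loop. It only illustrates the access patterns described above; it is not the code of the Solaris cp or of rsync, and the function names are invented.

```python
import mmap
import os
import tempfile

WINDOW = 8 * 1024 * 1024  # map at most 8MB of the input at a time


def copy_mmap(src, dst, window=WINDOW):
    """Copy by mapping windows of the source and writing from the mapping."""
    # 'window' must be a multiple of mmap.ALLOCATIONGRANULARITY.
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        size = os.fstat(fin.fileno()).st_size
        offset = 0
        while offset < size:
            length = min(window, size - offset)
            with mmap.mmap(fin.fileno(), length,
                           access=mmap.ACCESS_READ, offset=offset) as view:
                fout.write(view)  # source pages are faulted in here
            offset += length


def copy_read_write(src, dst, bufsize=64 * 1024):
    """Copy with an explicit read()/write() loop through a small buffer."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(bufsize)
            if not chunk:
                break
            fout.write(chunk)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        with open(src, "wb") as f:
            f.write(os.urandom(1024 * 1024))
        copy_mmap(src, os.path.join(tmp, "a"))
        copy_read_write(src, os.path.join(tmp, "b"))
        print("copied", os.path.getsize(src), "bytes both ways")
```

With the mapped variant a read fault can occur in the middle of a write call, which is the shape of the kernel stack shown elsewhere in this thread (zfs_write leading into a page fault and zfs_getpage).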
Re: [zfs-discuss] cp -r hanged copying a directory
The plot thickens. I replaced 'cp' with 'rsync' and it worked -- I ran it a few times and it didn't hang so far. So on the face of it, it appears that 'cp' is doing something that causes my system to hang if the files are read from and written to the same pool, but simply replacing 'cp' with 'rsync' works. Hmmm... anyone have a clue about what I can do next to home in on the problem with 'cp' ? Here is the output using 'rsync' : bash-3.2$ truss -topen rsync -a z1 z2 open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT open("/lib/libsocket.so.1", O_RDONLY) = 3 open("/lib/libnsl.so.1", O_RDONLY) = 3 open("/lib/libc.so.1", O_RDONLY)= 3 open("/usr/lib/locale/en_GB.UTF-8/en_GB.UTF-8.so.3", O_RDONLY) = 3 open("/usr/lib/locale/common/methods_unicode.so.3", O_RDONLY) = 3 open64("/etc/popt", O_RDONLY) Err#2 ENOENT open64("/export/home/simon/.popt", O_RDONLY)Err#2 ENOENT open("/usr/lib/iconv/UTF-8%UTF-8.so", O_RDONLY) = 3 open64("/var/run/name_service_door", O_RDONLY) = 3 open64("z1/testdir/f01", O_RDONLY) = 5 open64("z1/testdir/f02", O_RDONLY) = 5 open64("z1/testdir/f03", O_RDONLY) = 5 open64("z1/testdir/f04", O_RDONLY) = 5 open64("z1/testdir/f05", O_RDONLY) = 5 open64("z1/testdir/f06", O_RDONLY) = 5 open64("z1/testdir/f07", O_RDONLY) = 5 open64("z1/testdir/f08", O_RDONLY) = 5 open64("z1/testdir/f09", O_RDONLY) = 5 open64("z1/testdir/f10", O_RDONLY) = 5 open64("z1/testdir/f11", O_RDONLY) = 5 open64("z1/testdir/f12", O_RDONLY) = 5 open64("z1/testdir/f13", O_RDONLY) = 5 open64("z1/testdir/f14", O_RDONLY) = 5 open64("z1/testdir/f15", O_RDONLY) = 5 open64("z1/testdir/f16", O_RDONLY) = 5 Received signal #18, SIGCLD, in pollsys() [caught] siginfo: SIGCLD CLD_EXITED pid=910 status=0x bash-3.2$ This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] cp -r hanged copying a directory
Well, I had some more ideas and ran some more tests:

1. cp -r testdir ~/z1
This copied the testdir directory from the zfs pool into my home directory on the IDE boot drive, so not part of the zfs pool, and this worked.

2. cp -r ~/z1 .
This copied the files back from my home directory on the IDE boot disk and into the ZFS pool. This worked.

3. cp -r z1 z2
This copied the files from the ZFS pool to another directory in the ZFS pool and this has not worked -- it hung again, but differently this time. It copied a couple of files, then hung. The mouse wouldn't move and the keyboard was inactive; I hit loads of keys including ALT TAB and finally the mouse was back. The copy continued for 2 or 3 more files and then hung again. This time no more files are being copied, and it has hung on a different file from the other times.

So from these tests, it appears that copying the test directory out of the ZFS pool is successful, copying it in from outside the pool is successful, but reading and writing the files completely within the pool is failing.

My gut instinct is that reading and writing purely within the pool is stressing the disks due to more I/O demands on the disks used for the ZFS pool, but this may be completely wrong.
Re: [zfs-discuss] ZFS still crashing after patch
Hello Rustam, Saturday, May 3, 2008, 9:16:41 AM, you wrote: R> I don't think that this is hardware issue, however i don't except this. I'll try to explain why. R> 1. I've replaced all memory modules which are more likely to cause such a problem. R> 2. There are many different applications running on that server R> (Apache, PostgreSQL, etc.). However, if you look at the four R> different crash dump stack traces you see the same picture: R> -- crash dump st1 -- R> mutex_enter+0xb() R> zio_buf_alloc+0x1a() R> zio_read+0xba() R> spa_scrub_io_start+0xf1() R> spa_scrub_cb+0x13d() R> -- crash dump st2 -- R> mutex_enter+0xb() R> zio_buf_alloc+0x1a() R> zio_read+0xba() R> arc_read+0x3cc() R> dbuf_prefetch+0x11d() R> dmu_prefetch+0x107() R> zfs_readdir+0x408() R> fop_readdir+0x34() R> -- crash dump st3 -- R> mutex_enter+0xb() R> zio_buf_alloc+0x1a() R> zio_read+0xba() R> arc_read+0x3cc() R> dbuf_prefetch+0x11d() R> dmu_prefetch+0x107() R> zfs_readdir+0x408() R> fop_readdir+0x34() R> -- crash dump st4 -- R> mutex_enter+0xb() R> zio_buf_alloc+0x1a() R> zio_read+0xba() R> arc_read+0x3cc() R> dbuf_prefetch+0x11d() R> dmu_prefetch+0x107() R> zfs_readdir+0x408() R> fop_readdir+0x34() R> All four crash dumps show problem at zio_read/zio_buf_alloc. Three R> of these appeared during metadata prefetch (dmu_prefetch) and one R> during scrubbing. I don't think that it's coincidence. IMHO, R> checksum errors are the result of this inconsistency. Which would happen if you have problem with HW and you're getting wring checksums on both side of your mirrors. Maybe PS? Try memtest anyway or sunvts -- Best regards, Robert Milkowskimailto:[EMAIL PROTECTED] http://milek.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] cp -r hanged copying a directory
Thanks Max, I have done a few tests with what you suggest and I have listed the output below. I wait a few minutes before deciding it's failed, and there is never any console output about anything failing, and nothing in any log files I've looked in: /var/adm/messages or /var/log/syslog. Maybe if I left it 2 hours I might see a message somewhere, but who knows? This is a nasty problem, as: 1. it appears to be failing on different files, although I think I'm seeing a common pattern where it fails on the third file often. 2. it copied all the files successfullly once, see log below for run #2, but then I do run #3 immediately afterwards and it fails again, so I list debug output for run #3 3. I cannot kill the hanging cp command and then my whole zfs filesystem is locked up meaning I have to reboot. Even doing an 'ls' command hangs often due to the hanging 'cp' command. 4. I cannot use 'shutdown -y -g 0 -i 5' to shutdown the machine, as it seems to be blocked by the hanging cp command 5. The way to shutdown the machine is to hit the reset button, and I don't like doing this when there are, theoretically, write operations occurring, or at least pending. Anyway here is the long output. 
Perhaps if people reply they can avoid keeping this text as part of their reply or we'll be lost in a sea of dump output :) output: run #1 (after reboot) bash-3.2$ truss -topen cp -r testdir z open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT open("/lib/libc.so.1", O_RDONLY)= 3 open("/usr/lib/locale/en_GB.UTF-8/en_GB.UTF-8.so.3", O_RDONLY) = 3 open("/usr/lib/locale/common/methods_unicode.so.3", O_RDONLY) = 3 open("/lib/libsec.so.1", O_RDONLY) = 3 open("/lib/libcmdutils.so.1", O_RDONLY) = 3 open("/lib/libavl.so.1", O_RDONLY) = 3 open64("testdir/f06", O_RDONLY) = 4 open64("testdir/f15", O_RDONLY) = 4 open64("testdir/f12", O_RDONLY) = 4 # mdb -k Loading modules: [ unix genunix specfs dtrace cpu.generic cpu_ms.AuthenticAMD.15 uppc pcplusmp scsi_vhci ufs ip hook neti sctp arp usba s1394 nca lofs zfs random md sppp smbsrv nfs ptm crypto ipc ] > ::pgrep cp SPID PPID PGIDSIDUID FLAGS ADDR NAME R910909909869501 0x4a004000 ff01d96e6e30 cp > ff01d96e6e30::walk thread | ::threadlist -v ADDR PROC LWP CLS PRIWCHAN ff01d4371b60 ff01d96e6e30 ff01d9b28930 2 60 ff01f4aa52c0 PC: _resume_from_idle+0xf1CMD: cp -r testdir z stack pointer for thread ff01d4371b60: ff000949f260 [ ff000949f260 _resume_from_idle+0xf1() ] swtch+0x17f() cv_wait+0x61() zio_wait+0x5f() dmu_buf_hold_array_by_dnode+0x214() dmu_read+0xd4() zfs_fillpage+0x15e() zfs_getpage+0x187() fop_getpage+0x9f() segvn_fault+0x9ef() as_fault+0x5ae() pagefault+0x95() trap+0x1286() 0xfb8001d9() fuword8+0x21() zfs_write+0x147() fop_write+0x69() write+0x2af() write32+0x1e() sys_syscall32+0x101() > bash-3.2$ iostat -xce 1 extended device statistics errors --- cpu devicer/sw/s kr/s kw/s wait actv svc_t %w %b s/w h/w trn tot us sy wt id cmdk033.55.7 431.0 32.6 0.4 0.2 14.5 6 12 0 0 0 0 4 2 0 94 sd0 1.34.3 111.1 45.0 0.1 0.0 17.3 0 1 0 0 0 0 sd1 2.04.4 210.0 45.1 0.1 0.0 14.5 0 1 0 0 0 0 sd2 1.34.3 111.1 45.2 0.1 0.0 17.4 0 1 0 0 0 0 sd3 0.00.00.00.0 0.0 0.00.0 0 0 0 5 0 5 extended device statistics errors --- cpu devicer/sw/s kr/s kw/s 
wait actv svc_t %w %b s/w h/w trn tot us sy wt id cmdk0 0.00.00.00.0 0.0 0.00.0 0 0 0 0 0 0 1 48 0 51 sd0 325.6 56.3 36340.5 146.7 26.2 0.9 71.0 92 92 0 0 0 0 sd1 518.6 49.2 65734.5 117.1 25.9 0.9 47.0 86 85 0 0 0 0 sd2 327.6 57.3 36983.7 144.7 27.3 0.9 73.4 94 93 0 0 0 0 sd3 0.00.00.00.0 0.0 0.00.0 0 0 0 5 0 5 extended device statistics errors --- cpu devicer/sw/s kr/s kw/s wait actv svc_t %w %b s/w h/w trn tot us sy wt id cmdk0 0.00.00.00.0 0.0 0.00.0 0 0 0 0 0 0 0 43 0 57 sd0 301.11.0 33550.00.0 23.6 0.8 80.8 84 84 0 0 0 0 sd1 556.21.0 69661.10.0 26.1 0.8 48.3 92 83 0 0 0 0 sd2 300.15.0 33229.94.0 23.8 0.8 80.7 84 84 0 0 0 0 sd3 0.00.00.00.0 0.0 0.00.0 0 0 0 5 0 5 ex
Re: [zfs-discuss] Endian relevance for decoding lzjb blocks
Hi Benjamin, Benjamin Brumaire wrote: > I 'm trying to decode a lzjb compressed blocks and I have some hard times > regarding big/little endian. I'm on x86 working with build 77. > > #zdb - ztest > ... > rootbp = [L0 DMU objset] 400L/200P DVA[0]=<0:e0c98e00:200> > ... > > ## zdb -R ztest:c0d1s4:e0c98e00:200: > Found vdev: /dev/dsk/c0d1s4 > > ztest:c0d1s4:e0c98e00:200: > 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef > 00: 0003020e0a00 dd0304050020b601 .. . > 10: c505048404040504 35b558231002047c |...#X.5 > > Using the modified zdb, you should be able to do: # zdb -R ztest:c0d1s4:e0c98e00:200:d,lzjb,400 2>/tmp/foo Then you can od /tmp/foo. I am not sure what happens if you run zdb with a zfs file system that is different endianess from the machine on which you are running zdb. It may just work... The "d:lzjb:400" says to use lzjb decompression with a logical (after decompression) size of 0x400 bytes. It dumps raw data to stderr, hence the "2>/tmp/foo". max > Looking at this blocks with dd: > dd if=/dev/dsk/c0d1s4 iseek=7374023 bs=512 count=1 | od -x > 000: 0a00 020e 0003 b601 0020 0405 dd03 > > od -x is responsible for swapping every two bytes. I have on disk > 000: 000a 0e02 0300 01b6 0200 0504 03dd > > Comparing with the zdb output is every 8 bytes reversed. > > Now I don't know how to pass this to my lzjb decoding programm? > > Should I read the 512 bytes and pass them: >- from the end >- from the start and reverse every 8 bytes >- or something else > > thanks for any advice > > bbr > > > This message posted from opensolaris.org > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > > ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
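For a standalone decoder, the following is a sketch of LZJB decompression in Python, written from my recollection of the lzjb.c in the OpenSolaris tree; treat the constants and the control flow as assumptions and check them against the real source before relying on it. The input is the raw block exactly as it sits on disk, starting at byte 0, with no word swapping, and d_len is the logical size (0x400 in the example above).

```python
NBBY = 8
MATCH_BITS = 6
MATCH_MIN = 3
OFFSET_MASK = (1 << (16 - MATCH_BITS)) - 1


def lzjb_decompress(src, d_len):
    """Decompress an LZJB stream 'src' into exactly d_len bytes."""
    dst = bytearray()
    pos = 0
    copymask = 1 << (NBBY - 1)
    copymap = 0
    while len(dst) < d_len:
        copymask <<= 1
        if copymask == (1 << NBBY):
            # Every 8 items, a fresh bitmap byte says which items are matches.
            copymask = 1
            copymap = src[pos]
            pos += 1
        if copymap & copymask:
            # Two-byte match: 6 bits of length, 10 bits of backward offset.
            mlen = (src[pos] >> (NBBY - MATCH_BITS)) + MATCH_MIN
            offset = ((src[pos] << NBBY) | src[pos + 1]) & OFFSET_MASK
            pos += 2
            start = len(dst) - offset
            if offset == 0 or start < 0:
                raise ValueError("match points outside the output buffer")
            for i in range(mlen):
                if len(dst) >= d_len:
                    break
                dst.append(dst[start + i])  # may overlap the bytes just written
        else:
            dst.append(src[pos])  # literal byte
            pos += 1
    return bytes(dst)


if __name__ == "__main__":
    # Hand-built stream: literals "abc", then a 3-byte match at offset 3.
    packed = bytes([0x08, ord("a"), ord("b"), ord("c"), 0x00, 0x03])
    print(lzjb_decompress(packed, 6))
```

Usage against a real block would look like `lzjb_decompress(raw_512_bytes, 0x400)`, where `raw_512_bytes` is a hypothetical name for the bytes read with dd.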
Re: [zfs-discuss] modification to zdb to decompress blocks
Thanks for the quick reaction. I now have a working binary for my system.

I don't understand why these changes should go through a project. The hooks are already there, so once the code is written not much work has to be done. But that's another story.

Let's decode lzjb blocks now :-)

bbr
[zfs-discuss] Endian relevance for decoding lzjb blocks
I'm trying to decode an lzjb-compressed block and I'm having a hard time with big/little endian. I'm on x86 working with build 77.

#zdb - ztest
...
rootbp = [L0 DMU objset] 400L/200P DVA[0]=<0:e0c98e00:200>
...

## zdb -R ztest:c0d1s4:e0c98e00:200:
Found vdev: /dev/dsk/c0d1s4

ztest:c0d1s4:e0c98e00:200:
0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef
00: 0003020e0a00 dd0304050020b601 .. .
10: c505048404040504 35b558231002047c |...#X.5

Looking at this block with dd:

dd if=/dev/dsk/c0d1s4 iseek=7374023 bs=512 count=1 | od -x
000: 0a00 020e 0003 b601 0020 0405 dd03

od -x is responsible for swapping every two bytes. On disk I have

000: 000a 0e02 0300 01b6 0200 0504 03dd

Compared with the zdb output, every 8 bytes are reversed.

Now I don't know how to pass this to my lzjb decoding program. Should I read the 512 bytes and pass them:
- from the end
- from the start, reversing every 8 bytes
- or something else

thanks for any advice

bbr
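The three views of the same bytes can be reproduced in a few lines, which may help decide what to hand to a decoder. This sketch takes the second 64-bit word from the zdb output above (dd0304050020b601), assumes zdb printed it as a native little-endian word, and shows that od -x is simply the 16-bit little-endian view of the same byte sequence; the helper names are made up.

```python
import struct


def as_u64_words(raw):
    """Render raw bytes as little-endian 64-bit words, the way zdb printed them."""
    return ["%016x" % w for w in struct.unpack("<%dQ" % (len(raw) // 8), raw)]


def as_u16_words(raw):
    """Render raw bytes as little-endian 16-bit words, like 'od -x' on x86."""
    return ["%04x" % w for w in struct.unpack("<%dH" % (len(raw) // 2), raw)]


if __name__ == "__main__":
    # Byte sequence on disk implied by the zdb word dd0304050020b601.
    raw = struct.pack("<Q", 0xDD0304050020B601)
    print("disk bytes:", raw.hex())
    print("zdb view:  ", as_u64_words(raw))
    print("od -x view:", as_u16_words(raw))
```

Under that assumption both tools are showing one unchanged byte stream, so a byte-oriented decoder would take the block from the start as it sits on disk, with no reversal.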
Re: [zfs-discuss] ZFS still crashing after patch
I don't think that this is a hardware issue, although I don't exclude it. I'll try to explain why.

1. I've replaced all memory modules, which are the most likely to cause such a problem.

2. There are many different applications running on that server (Apache, PostgreSQL, etc.). However, if you look at the four different crash dump stack traces you see the same picture:

-- crash dump st1 --
mutex_enter+0xb()
zio_buf_alloc+0x1a()
zio_read+0xba()
spa_scrub_io_start+0xf1()
spa_scrub_cb+0x13d()

-- crash dump st2 --
mutex_enter+0xb()
zio_buf_alloc+0x1a()
zio_read+0xba()
arc_read+0x3cc()
dbuf_prefetch+0x11d()
dmu_prefetch+0x107()
zfs_readdir+0x408()
fop_readdir+0x34()

-- crash dump st3 --
mutex_enter+0xb()
zio_buf_alloc+0x1a()
zio_read+0xba()
arc_read+0x3cc()
dbuf_prefetch+0x11d()
dmu_prefetch+0x107()
zfs_readdir+0x408()
fop_readdir+0x34()

-- crash dump st4 --
mutex_enter+0xb()
zio_buf_alloc+0x1a()
zio_read+0xba()
arc_read+0x3cc()
dbuf_prefetch+0x11d()
dmu_prefetch+0x107()
zfs_readdir+0x408()
fop_readdir+0x34()

All four crash dumps show the problem at zio_read/zio_buf_alloc. Three of these appeared during metadata prefetch (dmu_prefetch) and one during scrubbing. I don't think that it's a coincidence. IMHO, the checksum errors are the result of this inconsistency.

I tend to think that the problem is in ZFS and that it exists even in the latest Solaris version (maybe OpenSolaris as well).

> Lots of CKSUM errors like you see is often indicative
> of bad hardware. Run
> memtest for 24-48 hours.
>
> -marc
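The "same picture" claim can be checked by computing the frames that all four traces share, starting from the innermost one. A small sketch, with hypothetical helper names, using the traces exactly as posted:

```python
def frame_names(trace):
    """Strip the '+0x...()' suffix so only function names are compared."""
    return [frame.split("+", 1)[0] for frame in trace]


def common_leading_frames(traces):
    """Return the function names shared by every trace, innermost first."""
    common = []
    for frames in zip(*(frame_names(t) for t in traces)):
        if len(set(frames)) != 1:
            break
        common.append(frames[0])
    return common


if __name__ == "__main__":
    scrub = ["mutex_enter+0xb()", "zio_buf_alloc+0x1a()", "zio_read+0xba()",
             "spa_scrub_io_start+0xf1()", "spa_scrub_cb+0x13d()"]
    prefetch = ["mutex_enter+0xb()", "zio_buf_alloc+0x1a()", "zio_read+0xba()",
                "arc_read+0x3cc()", "dbuf_prefetch+0x11d()",
                "dmu_prefetch+0x107()", "zfs_readdir+0x408()",
                "fop_readdir+0x34()"]
    # st1 is the scrub path; st2, st3 and st4 are the identical prefetch path.
    print(common_leading_frames([scrub, prefetch, prefetch, prefetch]))
```

Shared innermost frames show where the crashes land, not what corrupted the state they tripped over, so this supports either reading of the cause.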
Re: [zfs-discuss] modification to zdb to decompress blocks
Hi,

Great stuff. Will this change make it into OpenSolaris? Looking at the current code I couldn't find the modification.

I tried replacing zdb.c in the OpenSolaris main tree before compiling with nightly, but the compiler wasn't happy with it. Can you write down the right options?

bbr
Re: [zfs-discuss] cp -r hanged copying a directory
Simon Breden wrote: > set sata:sata_max_queue_depth = 0x1 > > = > > Anyway, after adding the line above into /etc/system, I rebooted and then > re-tried the copy with truss: > > truss cp -r testdir z4 > > It seems to hang on random files -- so it's not always the same file that it > hangs on. > > On this particular run here are the last few lines of truss output, although > they're probably not useful: > Hi Simon, Try with: truss -topen cp -r testdir z4 This will only show you the files being opened. The last file opened in testdir is the one it is hanging on. (Unless it is hanging in getdents(2), but I don't think so based on the kernel stacktrace). But, if it is hanging on random files, this is not going to help either. How long do you wait before deciding it's hung? I think usually you should get console output saying I/O has been retried if the device does not respond to a previously sent I/O. max ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
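Picking the last opened file out of a long truss log is easy to script. This is a minimal sketch that assumes the `truss -topen` line format seen in this thread, i.e. `open("path", ...)` or `open64("path", ...)`; the function name and the optional prefix filter are illustrative.

```python
import re

# Matches open("...") and open64("...") at the start of a truss line.
OPEN_RE = re.compile(r'^open(?:64)?\("([^"]+)",')


def last_opened(truss_lines, prefix=""):
    """Return the last path opened whose name starts with 'prefix'."""
    last = None
    for line in truss_lines:
        match = OPEN_RE.match(line.strip())
        if match and match.group(1).startswith(prefix):
            last = match.group(1)
    return last


if __name__ == "__main__":
    log = [
        'open("/lib/libavl.so.1", O_RDONLY) = 3',
        'open64("testdir/f06", O_RDONLY) = 4',
        'open64("testdir/f15", O_RDONLY) = 4',
        'open64("testdir/f12", O_RDONLY) = 4',
    ]
    print("probably hanging on:", last_opened(log, prefix="testdir/"))
```

As noted above, this only points at the right file if the hang is in the copy itself and not in getdents(2).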