Re: [zfs-discuss] zfs hangs with B141 when filebench runs
zap_increment_int+0x68(ff05028c8940, , 0, fffeffef7e00, ff0511d9bc80)
ff002035c9f0 do_userquota_update+0x69(ff05028c8940, 100108000, 3, 0, 0, 1, ff0511d9bc80)
ff002035ca50 dmu_objset_do_userquota_updates+0xde(ff05028c8940, ff0511d9bc80)
ff002035cad0 dsl_pool_sync+0x112(ff0502ceac00, f34)
ff002035cb80 spa_sync+0x37b(ff0501269580, f34)
ff002035cc20 txg_sync_thread+0x247(ff0502ceac00)
ff002035cc30 thread_start+8()

ff05123ce048::zio -r
ADDRESS          TYPE  STAGE            WAITER
ff05123ce048     NULL  CHECKSUM_VERIFY  ff002035cc40
ff051a9a9338     READ  VDEV_IO_START    -
ff050e3a4050     READ  VDEV_IO_DONE     -
ff0519173c90     READ  VDEV_IO_START    -

ff0519173c90::print zio_t io_done
io_done = vdev_cache_fill

The zio ff0519173c90 is a vdev cache read request that never completes, so txg_sync_thread is blocked. I don't know why this zio cannot be satisfied and move to the DONE stage. I tried to dd the raw device that backs the pool while zfs hangs this way, and that works fine.

Thanks
Zhihui

On Mon, Jul 5, 2010 at 7:56 PM, zhihui Chen zhch...@gmail.com wrote:

I tried to run zfs list on my system, but it looks like the command hangs. It does not return even if I press Ctrl+C:

r...@intel7:/export/bench/io/filebench/results# zfs list
^C^C^C^C ^C^C^C^C ..

When this happens, I am running the filebench benchmark with the oltp workload. But zpool status shows that all pools are in good state:

r...@intel7:~# zpool status
  pool: rpool
 state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the pool will no longer be accessible on older software versions.
  scan: none requested
config:

        NAME        STATE   READ WRITE CKSUM
        rpool       ONLINE     0     0     0
          c8t0d0s0  ONLINE     0     0     0

errors: No known data errors

  pool: tpool
 state: ONLINE
  scan: none requested
config:

        NAME       STATE   READ WRITE CKSUM
        tpool      ONLINE     0     0     0
          c10t1d0  ONLINE     0     0     0

errors: No known data errors

My system is running B141 and tpool is using the latest version, 26. I tried truss -p `pgrep zfs`, but it fails:

r...@intel7:~# truss -p `pgrep zfs`
truss: unanticipated system error: 5060

It looks like zfs is in a deadlock state, but I don't know the cause. I have run the filebench/oltp workload several times, and each time it leads to this state. But if I run filebench with another workload such as fileserver or webserver, this issue does not happen.

Thanks
Zhihui
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
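An aside on the analysis above: since the stuck zio's io_done is vdev_cache_fill, one experiment is to turn the vdev read-ahead cache off and see whether the hang still reproduces. This is a sketch from the editor, not something tried in the thread, and it assumes an OpenSolaris-era kernel where zfs_vdev_cache_size is an exposed tunable:

```shell
# Disable the ZFS vdev read-ahead cache as an experiment, to test
# whether the stuck vdev_cache_fill zio is implicated in the hang.

# Live change on the running kernel (lost at reboot):
echo 'zfs_vdev_cache_size/W 0t0' | mdb -kw

# Persistent change, effective after the next reboot:
echo 'set zfs:zfs_vdev_cache_size = 0' >> /etc/system
```

If the hang stops reproducing with the cache disabled, that narrows the problem to the vdev cache path rather than the device itself.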
Re: [zfs-discuss] zfs hangs with B141 when filebench runs
Thanks, I have filed the bug, but I don't know how to provide the crash dump. If this bug is accepted, the RE can get the crash dump file from me.

On Thu, Jul 15, 2010 at 10:33 PM, George Wilson george.r.wil...@oracle.com wrote:

I don't recall seeing this issue before. Best thing to do is file a bug and include a pointer to the crash dump.

- George

zhihui Chen wrote:

Looks like the txg_sync_thread for this pool has been blocked and never returns, which causes many other threads to block. I have tried changing zfs_vdev_max_pending from 10 to 35 and retested the workload several times; the issue does not happen. But if I change it back to 10, it happens very easily. Any known bug on this, or any suggestion to solve this issue?

ff0502c3378c::wchaninfo -v
ADDR             TYPE  NWAITERS  THREAD            PROC
ff0502c3378c     cond  1730:     ff051cc6b500      go_filebench
                                 ff051ce61020      go_filebench
                                 ff051cc4e4e0      go_filebench
                                 ff051d115120      go_filebench
                                 ff051e9ed000      go_filebench
                                 ff051bf644c0      go_filebench
                                 ff051c65b000      go_filebench
                                 ff051c728500      go_filebench
                                 ff050d83a8c0      go_filebench
                                 ff051c528c00      go_filebench
                                 ff051b750800      go_filebench
                                 ff051cdd7520      go_filebench
                                 ff051ce71bc0      go_filebench
                                 ff051cb5e840      go_filebench
                                 ff051cbdec60      go_filebench
                                 ff0516473c60      go_filebench
                                 ff051d132820      go_filebench
                                 ff051d13a400      go_filebench
                                 ff050fbf0b40      go_filebench
                                 ff051ce7a400      go_filebench
                                 ff051b781820      go_filebench
                                 ff051ce603e0      go_filebench
                                 ff051d1bf840      go_filebench
                                 ff051c6c24c0      go_filebench
                                 ff051d204100      go_filebench
                                 ff051cbdf160      go_filebench
                                 ff051ce52c00      go_filebench
                                 ...

ff051cc6b500::findstack -v
stack pointer for thread ff051cc6b500: ff0020a76ac0
[ ff0020a76ac0 _resume_from_idle+0xf1() ]
  ff0020a76af0 swtch+0x145()
  ff0020a76b20 cv_wait+0x61(ff0502c3378c, ff0502c33700)
  ff0020a76b70 zil_commit+0x67(ff0502c33700, 6b255, 14)
  ff0020a76d80 zfs_write+0xaaf(ff050b5c9140, ff0020a76e40, 40, ff0502dab258, 0)
  ff0020a76df0 fop_write+0x6b(ff050b5c9140, ff0020a76e40, 40, ff0502dab258, 0)
  ff0020a76ec0 pwrite64+0x244(1a, b6f2a000, 800, b841a800, 0)
  ff0020a76f10 sys_syscall32+0xff()

From the zil_commit code, I tried to find the thread whose stack contains a call to zil_commit_writer. That thread has not returned from zil_commit_writer, so it will not call cv_broadcast to wake up the waiting threads.

ff051d10fba0::findstack -v
stack pointer for thread ff051d10fba0: ff0021ab9a10
[ ff0021ab9a10 _resume_from_idle+0xf1() ]
  ff0021ab9a40 swtch+0x145()
  ff0021ab9a70 cv_wait+0x61(ff051ae1b988, ff051ae1b980)
  ff0021ab9ab0 zio_wait+0x5d(ff051ae1b680)
  ff0021ab9b20 zil_commit_writer+0x249(ff0502c33700, 6b250, e)
  ff0021ab9b70 zil_commit+0x91(ff0502c33700, 6b250, e)
  ff0021ab9d80 zfs_write+0xaaf(ff050b5c9540, ff0021ab9e40, 40, ff0502dab258, 0)
  ff0021ab9df0 fop_write+0x6b(ff050b5c9540, ff0021ab9e40, 40, ff0502dab258, 0)
  ff0021ab9ec0 pwrite64+0x244(14, bfbfb800, 800, 88f3f000, 0)
  ff0021ab9f10 sys_syscall32+0xff()

ff051ae1b680::zio -r
ADDRESS          TYPE   STAGE            WAITER
ff051ae1b680     NULL   CHECKSUM_VERIFY  ff051d10fba0
 ff051a9c1978    WRITE  VDEV_IO_START    -
 ff052454d348    WRITE  VDEV_IO_START    -
 ff051572b960    WRITE  VDEV_IO_START    -
 ff050accb330    WRITE  VDEV_IO_START    -
 ff0514453c80    WRITE  VDEV_IO_START    -
 ff0524537648    WRITE  VDEV_IO_START    -
 ff05090e9660    WRITE  VDEV_IO_START    -
 ff05151cb698    WRITE  VDEV_IO_START    -
 ff0514668658    WRITE  VDEV_IO_START    -
 ff0514835690    WRITE
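The zfs_vdev_max_pending workaround described above (raising it from 10 to 35) can be applied live and made persistent. A minimal sketch, assuming an OpenSolaris-era kernel; 0t35 is mdb's notation for decimal 35:

```shell
# Raise the per-vdev I/O queue depth from the default 10 to 35.

# Live change on the running kernel (lost at reboot):
echo 'zfs_vdev_max_pending/W 0t35' | mdb -kw

# Persistent change, effective after the next reboot:
echo 'set zfs:zfs_vdev_max_pending = 35' >> /etc/system
```

Note this only masks the deadlock by changing I/O timing; it is a mitigation to keep a test system running, not a fix.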
[zfs-discuss] zfs hangs with B141 when filebench runs
I tried to run zfs list on my system, but it looks like the command hangs. It does not return even if I press Ctrl+C:

r...@intel7:/export/bench/io/filebench/results# zfs list
^C^C^C^C ^C^C^C^C ..

When this happens, I am running the filebench benchmark with the oltp workload. But zpool status shows that all pools are in good state:

r...@intel7:~# zpool status
  pool: rpool
 state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the pool will no longer be accessible on older software versions.
  scan: none requested
config:

        NAME        STATE   READ WRITE CKSUM
        rpool       ONLINE     0     0     0
          c8t0d0s0  ONLINE     0     0     0

errors: No known data errors

  pool: tpool
 state: ONLINE
  scan: none requested
config:

        NAME       STATE   READ WRITE CKSUM
        tpool      ONLINE     0     0     0
          c10t1d0  ONLINE     0     0     0

errors: No known data errors

My system is running B141 and tpool is using the latest version, 26. I tried truss -p `pgrep zfs`, but it fails:

r...@intel7:~# truss -p `pgrep zfs`
truss: unanticipated system error: 5060

It looks like zfs is in a deadlock state, but I don't know the cause. I have run the filebench/oltp workload several times, and each time it leads to this state. But if I run filebench with another workload such as fileserver or webserver, this issue does not happen.

Thanks
Zhihui
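When truss cannot attach to the hung process, the kernel debugger can still show where its threads are blocked. A hedged sketch of the usual mdb commands (an editor's suggestion, not from the original post):

```shell
# Summarize unique kernel stacks, keeping only those with frames
# in the zfs module; blocked txg_sync/zil threads show up here.
echo '::stacks -m zfs' | mdb -k

# Or dump every kernel stack of the hung zfs(1M) process itself
# (0t prefixes a decimal PID in mdb).
echo "0t$(pgrep -x zfs)::pid2proc | ::walk thread | ::findstack -v" | mdb -k
```

This is how stacks like the findstack output later in this thread are obtained without needing the process to respond to /proc requests.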
Re: [zfs-discuss] how to convert zio->io_offset to disk block number?
Thanks. After fixing the following two issues, I can get the right value:
(1) Divide the offset 0x657800 (6649856) by 512 and use that as the iseek value.
(2) Run the dd command on device c2t0d0s0, not c2t0d0.

Zhihui

2009/6/26 m...@bruningsystems.com:

Hi Zhihui Chen,

zhihui Chen wrote:

Found that zio->io_offset is the absolute offset on the device, not in sector units. And if we need to use zdb -R to dump the block, we should use the offset (zio->io_offset-0x40).

2009/6/25 zhihui Chen zhch...@gmail.com:

I use the following dtrace script to trace the position of one file on zfs:

#!/usr/sbin/dtrace -qs
zio_done:entry
/((zio_t *)(arg0))->io_vd/
{
        zio = (zio_t *)arg0;
        printf("Offset:%x and Size:%x\n", zio->io_offset, zio->io_size);
        printf("vd:%x\n", (unsigned long)(zio->io_vd));
        printf("process name:%s\n", execname);
        tracemem(zio->io_data, 40);
        stack();
}

When I run the dd command dd if=/export/dsk1/test1 bs=512 count=1, the dtrace script generates the following output:

Offset:657800 and Size:200
vd:ff02d6a1a700
process name:sched
  zfs`zio_execute+0xa0
  genunix`taskq_thread+0x193
  unix`thread_start+0x8
^C

The tracemem output is the right content of file test1, which is a 512-byte text file. zpool status has the following output:

  pool: tpool
 state: ONLINE
 scrub: none requested
config:

        NAME      STATE   READ WRITE CKSUM
        tpool     ONLINE     0     0     0
          c2t0d0  ONLINE     0     0     0

errors: No known data errors

My question is how to translate the zio->io_offset (0x657800, equal to decimal 6649856) output by dtrace into a block number on disk c2t0d0? I tried dd if=/dev/dsk/c2t0d0 of=text iseek=6650112 bs=512 count=1 as a check, but the result is not right.

I assume that 6650112 is the offset plus 0x40? Try dividing 6650112 by 512 and using that as the iseek value.

max

Thanks
Zhihui
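The arithmetic behind fix (1) is easy to verify on its own: dd's iseek counts 512-byte sectors, so the byte offset reported by dtrace must be divided by 512. A quick sketch:

```shell
# Convert the zio io_offset reported by dtrace (0x657800 bytes)
# into a 512-byte-sector seek value for dd's iseek operand.
offset=$((0x657800))        # byte offset on the slice = 6649856
iseek=$((offset / 512))     # sector index for dd
echo "offset=$offset iseek=$iseek"
```

This prints offset=6649856 iseek=12988, so the working check becomes dd if=/dev/dsk/c2t0d0s0 of=text iseek=12988 bs=512 count=1.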
[zfs-discuss] how to convert zio->io_offset to disk block number?
I use the following dtrace script to trace the position of one file on zfs:

#!/usr/sbin/dtrace -qs
zio_done:entry
/((zio_t *)(arg0))->io_vd/
{
        zio = (zio_t *)arg0;
        printf("Offset:%x and Size:%x\n", zio->io_offset, zio->io_size);
        printf("vd:%x\n", (unsigned long)(zio->io_vd));
        printf("process name:%s\n", execname);
        tracemem(zio->io_data, 40);
        stack();
}

When I run the dd command dd if=/export/dsk1/test1 bs=512 count=1, the dtrace script generates the following output:

Offset:657800 and Size:200
vd:ff02d6a1a700
process name:sched
  zfs`zio_execute+0xa0
  genunix`taskq_thread+0x193
  unix`thread_start+0x8
^C

The tracemem output is the right content of file test1, which is a 512-byte text file. zpool status has the following output:

  pool: tpool
 state: ONLINE
 scrub: none requested
config:

        NAME      STATE   READ WRITE CKSUM
        tpool     ONLINE     0     0     0
          c2t0d0  ONLINE     0     0     0

errors: No known data errors

My question is how to translate the zio->io_offset (0x657800, equal to decimal 6649856) output by dtrace into a block number on disk c2t0d0? I tried dd if=/dev/dsk/c2t0d0 of=text iseek=6650112 bs=512 count=1 as a check, but the result is not right.

Thanks
Zhihui
Re: [zfs-discuss] how to convert zio->io_offset to disk block number?
Found that zio->io_offset is the absolute offset on the device, not in sector units. And if we need to use zdb -R to dump the block, we should use the offset (zio->io_offset-0x40).

2009/6/25 zhihui Chen zhch...@gmail.com:

I use the following dtrace script to trace the position of one file on zfs:

#!/usr/sbin/dtrace -qs
zio_done:entry
/((zio_t *)(arg0))->io_vd/
{
        zio = (zio_t *)arg0;
        printf("Offset:%x and Size:%x\n", zio->io_offset, zio->io_size);
        printf("vd:%x\n", (unsigned long)(zio->io_vd));
        printf("process name:%s\n", execname);
        tracemem(zio->io_data, 40);
        stack();
}

When I run the dd command dd if=/export/dsk1/test1 bs=512 count=1, the dtrace script generates the following output:

Offset:657800 and Size:200
vd:ff02d6a1a700
process name:sched
  zfs`zio_execute+0xa0
  genunix`taskq_thread+0x193
  unix`thread_start+0x8
^C

The tracemem output is the right content of file test1, which is a 512-byte text file. zpool status has the following output:

  pool: tpool
 state: ONLINE
 scrub: none requested
config:

        NAME      STATE   READ WRITE CKSUM
        tpool     ONLINE     0     0     0
          c2t0d0  ONLINE     0     0     0

errors: No known data errors

My question is how to translate the zio->io_offset (0x657800, equal to decimal 6649856) output by dtrace into a block number on disk c2t0d0? I tried dd if=/dev/dsk/c2t0d0 of=text iseek=6650112 bs=512 count=1 as a check, but the result is not right.

Thanks
Zhihui
[zfs-discuss] About ZFS compatibility
I created a pool on external storage with B114. Then I exported this pool and imported it on another system running B110, but the import fails with this error:

cannot import 'tpool': pool is formatted using a newer ZFS version.

Was there a big change in ZFS in B114 that leads to this compatibility issue?

Thanks
Zhihui
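This is the expected behavior: each build supports pool versions up to some maximum, a pool is created at the creating build's maximum by default, and older builds cannot import a newer-version pool. If the pool must move between builds, it can be created pinned at the older version. A sketch; the version number 14 below is illustrative, not confirmed for B110 - check zpool upgrade -v on the older system for the highest version it actually supports:

```shell
# Show the pool versions this build understands (run on the B110 box
# to learn the highest version it can import).
zpool upgrade -v

# Create the pool at an explicit older on-disk version so the older
# build can still import it. Replace 14 with the B110 maximum.
zpool create -o version=14 tpool c10t1d0
```

The trade-off is that features introduced after the pinned version are unavailable until the pool is upgraded with zpool upgrade.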