Hello Robert,

Friday, April 13, 2007, 2:07:11 AM, you wrote:
RM> Hello Enda,
RM> Thursday, April 12, 2007, 2:36:39 PM, you wrote:

EOCSMSI>> Robert Milkowski wrote:
>>> Hello Enda,
>>>
>>> Wednesday, April 11, 2007, 4:21:35 PM, you wrote:
>>>
>>> EOCSMSI> Robert Milkowski wrote:
>>>>> Hello zfs-discuss,
>>>>>
>>>>> In order to get IDR126199-01 I need to install 120473-05 first.
>>>>> I can get 120473-07, but everything newer than -05 is marked as
>>>>> incompatible with IDR126199-01, so I do not want to force it.
>>>>>
>>>>> Local Sun support has problems getting 120473-05 as well, so I'm
>>>>> stuck for now and I would really like to get that IDR running.
>>>>>
>>>>> Can someone help?
>>>
>>> EOCSMSI> Hi
>>> EOCSMSI> This patch will be on SunSolve possibly later today, tomorrow
>>> EOCSMSI> at the latest I suspect, as it has only just been pushed out
>>> EOCSMSI> from testing. I have sent the patch in another mail for now.
>>>
>>> Thank you for the patch - it installed properly along with the IDR.
>>>
>>> However, it seems the problem is not solved by the IDR :(

EOCSMSI>> Hi Robert
EOCSMSI>> This IDR has two bugs fixed:
EOCSMSI>> 6458218 assertion failed: ss == NULL
EOCSMSI>> 6495013 Loops and recursion in metaslab_ff_alloc can kill performance,
EOCSMSI>> even on a pool with lots of free data
EOCSMSI>> I have added the IDR's requestors so they can comment on which of the
EOCSMSI>> above fixes was not solved via this IDR in your testing.

RM> I'm talking about 6495013 "Loops and recursion in metaslab_ff_alloc
RM> can kill performance".

RM> I've got to test again with dtrace during peak hours to see whether
RM> the recursion still happens (on that scale) or not. The problem is
RM> that during peak hours I can still (again) see performance problems.
RM> At first glance it looks like this time the CPU is probably NOT the
RM> problem, so it would seem that 6495013 is indeed fixed and the problem
RM> is somewhere else.
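As an aside on 6495013: the pathology being discussed can be sketched with a toy model (plain Python, hypothetical names; this is an illustration only and not the ZFS metaslab/space-map code). A first-fit allocator walks free segments in order until one is large enough, so on a fragmented free-space map every allocation loops over the many small leading fragments first:

```python
# Toy model of first-fit allocation over an ordered free-segment list.
# Hypothetical sketch, not ZFS code: free_segs holds free-segment sizes
# in address order, size is the allocation request.

def first_fit(free_segs, size):
    """Return (index, walk_count) of the first segment >= size,
    or (None, walk_count) if no segment fits."""
    walks = 0
    for i, seg in enumerate(free_segs):
        walks += 1
        if seg >= size:
            return i, walks
    return None, walks

# A compact map: one large free segment up front.
compact = [1024, 8, 8, 8]
# A fragmented map: many small segments before a large one.
fragmented = [8] * 100 + [1024]

_, walks_compact = first_fit(compact, 512)
_, walks_fragmented = first_fit(fragmented, 512)

print(walks_compact)     # 1 walk: the first segment already fits
print(walks_fragmented)  # 101 walks: every small fragment is visited first
```

This is why a send|recv cycle (which rewrites data compactly) can temporarily restore performance: it shortens the walk, without changing the allocator.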
RM> As I did a send|recv of all file systems to another server and then
RM> back to the same server (after re-creating the pool) before applying
RM> the IDR, it solved the performance problem for some time. Now it is
RM> again getting worse day by day, and the IDR does not seem to help. So
RM> perhaps there is a problem with zfs data fragmentation - that's why
RM> zfs send|recv helped and 6495013 only made it worse.
RM> ??

RM> Do you have any specific dtrace script in mind to test if 6495013 is
RM> solved, or should I use the same one as in the case?

bash-3.00# uname -a
SunOS nfs-10-1.srv 5.10 Generic_125100-04 sun4u sparc SUNW,Sun-Fire-V440
bash-3.00# showrev -p | grep IDR
Patch: IDR126199-01 Obsoletes: Requires: 120473-05 Incompatibles: 120473-06 Packages: SUNWzfskr
bash-3.00#

Right now this is during off hours, and while I see no problems on the
clients it seems much slower than similar configs on UFS (zil_disable=1).
There are not many writes. I'll send output from peak hours later.

For example:

bash-3.00# ptime sync

real     1:29.199
user        0.001
sys         0.032
bash-3.00#

bash-3.00# dtrace -n profile-97'{@[stack()]=count();}' -n END'{trunc(@,10);printa(@);}'
dtrace: description 'profile-97' matched 1 probe
dtrace: description 'END' matched 1 probe
^C
CPU     ID                    FUNCTION:NAME
  3      2                           :END


              unix`i_ddi_splhigh+0x14
              unix`disp_getwork+0x38
              unix`idle+0xd4
              unix`thread_start+0x4
              116

              unix`disp_getwork+0x1a8
              unix`idle+0xd4
              unix`thread_start+0x4
              117

              unix`idle+0x12c
              unix`thread_start+0x4
              124

              unix`i_ddi_splx
              unix`disp_getwork+0x160
              unix`idle+0xd4
              unix`thread_start+0x4
              127

              unix`disp_getwork+0x10c
              unix`idle+0xd4
              unix`thread_start+0x4
              128

              zfs`space_map_seg_compare+0x4
              genunix`avl_find+0x2c
              zfs`space_map_add+0x12c
              zfs`space_map_load+0x218
              zfs`metaslab_activate+0x3c
              zfs`metaslab_group_alloc+0x1d4
              zfs`metaslab_alloc_dva+0x114
              zfs`metaslab_alloc+0x2c
              zfs`zio_dva_allocate+0x4c
              zfs`zio_write_compress+0x1ec
              zfs`arc_write+0xe4
              zfs`dbuf_sync+0x6c0
              zfs`dnode_sync+0x35c
              zfs`dmu_objset_sync_dnodes+0x6c
              zfs`dmu_objset_sync+0x54
              zfs`dsl_dataset_sync+0xc
              zfs`dsl_pool_sync+0x64
              zfs`spa_sync+0x1b0
              zfs`txg_sync_thread+0x134
              unix`thread_start+0x4
              128

              unix`disp_getwork+0x90
              unix`idle+0xd4
              unix`thread_start+0x4
              132

              unix`disp_getwork+0x7c
              unix`idle+0xd4
              unix`thread_start+0x4
              139

              zfs`space_map_seg_compare
              genunix`avl_find+0x2c
              zfs`space_map_add+0x12c
              zfs`space_map_load+0x218
              zfs`metaslab_activate+0x3c
              zfs`metaslab_group_alloc+0x1d4
              zfs`metaslab_alloc_dva+0x114
              zfs`metaslab_alloc+0x2c
              zfs`zio_dva_allocate+0x4c
              zfs`zio_write_compress+0x1ec
              zfs`arc_write+0xe4
              zfs`dbuf_sync+0x6c0
              zfs`dnode_sync+0x35c
              zfs`dmu_objset_sync_dnodes+0x6c
              zfs`dmu_objset_sync+0x54
              zfs`dsl_dataset_sync+0xc
              zfs`dsl_pool_sync+0x64
              zfs`spa_sync+0x1b0
              zfs`txg_sync_thread+0x134
              unix`thread_start+0x4
              140

              unix`i_ddi_splx+0x1c
              unix`disp_getwork+0x160
              unix`idle+0xd4
              unix`thread_start+0x4
              318
bash-3.00#

bash-3.00# ./avl_impact_on_zfs.d
^C
  Loops in metaslab_ff_alloc
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@                  2249
               1 |@@@@@                                    485
               2 |@@@                                      334
               4 |@@                                       241
               8 |@@                                       170
              16 |@                                        125
              32 |@                                        76
              64 |@                                        61
             128 |                                         36
             256 |                                         19
             512 |                                         15
            1024 |                                         8
            2048 |                                         8
            4096 |                                         5
            8192 |                                         6
           16384 |                                         9
           32768 |                                         3
           65536 |                                         5
          131072 |                                         5
          262144 |                                         6
          524288 |                                         0

  Time per avl_walk in ns
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |                                         138
               1 |                                         0
               2 |                                         0
               4 |                                         0
               8 |                                         0
              16 |                                         0
              32 |                                         0
              64 |@@@@@@@@@@@@@@@@                         1523563
             128 |@@@@@                                    527718
             256 |@@@@@@@@@                                860982
             512 |@@@@@@@                                  696676
            1024 |@@@                                      254440
            2048 |                                         25446
            4096 |                                         643
            8192 |                                         603
           16384 |                                         1368
           32768 |                                         292
           65536 |                                         23
          131072 |                                         0

  Loops in avl_find
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |                                         86
               1 |                                         188
               2 |                                         1100
               4 |                                         2178
               8 |@                                        5261
              16 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 288941
              32 |                                         0

  Time per avl_find in ns
           value  ------------- Distribution ------------- count
              64 |                                         0
             128 |                                         27
             256 |                                         104
             512 |                                         591
            1024 |@                                        5094
            2048 |@@@@@@@@@@@@@@@@@@@                      142584
            4096 |@@@@@@@@@@@@@@@@@@@@                     146899
            8192 |                                         1277
           16384 |                                         685
           32768 |                                         480
           65536 |                                         12
          131072 |                                         0
          262144 |                                         0
          524288 |                                         1
         1048576 |                                         0

  Time per avl_insert in ns
           value  ------------- Distribution ------------- count
              32 |                                         0
              64 |@@@@@@@@@@@@@                            727720
             128 |@@@@@@@@@@@@@@@@@                        906701
             256 |@@@@@@@@@@                               556764
             512 |                                         1213
            1024 |                                         122
            2048 |                                         122
            4096 |                                         26
            8192 |                                         40
           16384 |                                         257
           32768 |                                         73
           65536 |                                         5
          131072 |                                         0

        Time spent in avl_find          : 1284633600 ns
        Time spent in avl_insert        : 445301000 ns
        Time spent in avl_walk          : 1620582500 ns
        Elapsed time                    : 65060492600 ns

#### avl script

bash-3.00# cat avl_impact_on_zfs.d
#!/usr/sbin/dtrace -s

#pragma D option quiet

BEGIN
{
        self->in_metaslab = 0;
        self->in_space_map_add = 0;
        self->in_space_map_remove = 0;
        Start = timestamp;
        TotalTimeIn_avl_walk = 0;
        TotalTimeIn_avl_find = 0;
        TotalTimeIn_avl_insert = 0;
}

fbt::metaslab_ff_alloc:entry
/self->in_metaslab == 0/
{
        self->in_metaslab = 1;
        self->loopcount = 0;
        self->metaslab_alloc_start = timestamp;
}

fbt::space_map_add:entry
/self->in_space_map_add == 0/
{
        self->in_space_map_add = 1;
}

fbt::avl_walk:entry
/self->in_metaslab/
{
        self->loopcount++;
        self->avl_walk_start_time = vtimestamp;
}

fbt::avl_walk:return
/self->in_metaslab/
{
        @avl_walk_time["Time per avl_walk in ns"] =
            quantize(vtimestamp - self->avl_walk_start_time);
        TotalTimeIn_avl_walk += vtimestamp - self->avl_walk_start_time;
}

fbt::avl_insert:entry
/self->in_space_map_remove/
{
        self->avl_insert_start_time = vtimestamp;
}

fbt::avl_insert:return
/self->in_space_map_remove/
{
        @avl_insert_time["Time per avl_insert in ns"] =
            quantize(vtimestamp - self->avl_insert_start_time);
        TotalTimeIn_avl_insert += vtimestamp - self->avl_insert_start_time;
}

fbt::avl_find:entry
/self->in_space_map_add/
{
        self->loops_in_avl_find = 0;
        self->avl_find_start_time = vtimestamp;
}

fbt::space_map_seg_compare:entry
/self->in_space_map_add/
{
        self->loops_in_avl_find++;
}

fbt::avl_find:return
/self->in_space_map_add/
{
        @avl_find_loops["Loops in avl_find"] = quantize(self->loops_in_avl_find);
        @avl_find_time["Time per avl_find in ns"] =
            quantize(vtimestamp - self->avl_find_start_time);
        TotalTimeIn_avl_find += vtimestamp - self->avl_find_start_time;
}

fbt::space_map_add:return
/self->in_space_map_add/
{
        self->in_space_map_add = 0;
}

fbt::metaslab_ff_alloc:return
/self->in_metaslab/
{
        self->in_metaslab = 0;
        @loops["Loops in metaslab_ff_alloc"] = quantize(self->loopcount);
        self->loopcount = 0;
}

fbt::space_map_remove:entry
/self->in_space_map_remove == 0/
{
        self->in_space_map_remove = 1;
}

fbt::space_map_remove:return
/self->in_space_map_remove/
{
        self->in_space_map_remove = 0;
}

END
{
        printa(@loops);
        printa(@avl_find_loops);
        printa(@avl_find_time);
        printa(@avl_insert_time);
        printa(@avl_walk_time);
        printf("\tTime spent in avl_find\t\t: %d ns\n", TotalTimeIn_avl_find);
        printf("\tTime spent in avl_insert\t: %d ns\n", TotalTimeIn_avl_insert);
        printf("\tTime spent in avl_walk\t\t: %d ns\n", TotalTimeIn_avl_walk);
        printf("\tElapsed time\t\t\t: %d ns\n", timestamp - Start);
}

--
Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
                                 http://milek.blogspot.com

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss