> With recent bits, ZFS compression is now handled concurrently, with
> many CPUs working on different records. So this load will burn more
> CPUs and achieve its results (compression) faster.
>
> So the observed pauses should be consistent with those of a load
> generating high system time. The assumption is that compression now
> goes faster than when it was single-threaded.
>
> Is this undesirable? We might seek a way to slow down compression in
> order to limit the system load.
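Before deciding whether compression needs a throttle, it might be worth quantifying how much CPU it actually burns per second of txg sync. Something along these lines could do it (an untested sketch: it assumes zio_write_compress() is not inlined and is therefore visible to the fbt provider, and it charges CPU time via vtimestamp deltas):

```
#!/usr/sbin/dtrace -qs

/* on-CPU nanoseconds charged to zio_write_compress(), per second */
fbt:zfs:zio_write_compress:entry
{
        self->vts = vtimestamp;
}

fbt:zfs:zio_write_compress:return
/self->vts/
{
        @cpu["zio_write_compress CPU ns"] = sum(vtimestamp - self->vts);
        self->vts = 0;
}

tick-1s
{
        printf("%Y", walltimestamp);
        printa(@cpu);
        trunc(@cpu);
}
```

If the per-second sums approach (number of CPUs) x 1e9 during the pauses, that would support the "compression saturates all CPUs" explanation.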
According to this dtrace script...

#!/usr/sbin/dtrace -s

sdt:genunix::taskq-enqueue
/((taskq_ent_t *)arg1)->tqent_func == (task_func_t *)&`zio_write_compress/
{
        @where[stack()] = count();
}

tick-5s
{
        printa(@where);
        trunc(@where);
}

... I see bursts of ~1000 zio_write_compress() [gzip] taskq calls enqueued into the "spa_zio_issue" taskq by zfs`spa_sync() and its children:

  0  76337                       :tick-5s
...
              zfs`zio_next_stage+0xa1
              zfs`zio_wait_for_children+0x5d
              zfs`zio_wait_children_ready+0x20
              zfs`zio_next_stage_async+0xbb
              zfs`zio_nowait+0x11
              zfs`dbuf_sync_leaf+0x1b3
              zfs`dbuf_sync_list+0x51
              zfs`dbuf_sync_indirect+0xcd
              zfs`dbuf_sync_list+0x5e
              zfs`dbuf_sync_indirect+0xcd
              zfs`dbuf_sync_list+0x5e
              zfs`dnode_sync+0x214
              zfs`dmu_objset_sync_dnodes+0x55
              zfs`dmu_objset_sync+0x13d
              zfs`dsl_dataset_sync+0x42
              zfs`dsl_pool_sync+0xb5
              zfs`spa_sync+0x1c5
              zfs`txg_sync_thread+0x19a
              unix`thread_start+0x8
             1092

  0  76337                       :tick-5s

It seems that after such a batch of compress requests is submitted to the "spa_zio_issue" taskq, the kernel is busy for several seconds working on these taskq entries, and that this blocks all other taskq activity inside the kernel...
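To substantiate that individual gzip entries run long enough to monopolize the taskq worker threads, one could also time each execution between the taskq-exec-start and taskq-exec-end probes (a sketch in the same style as the script above, untested; it assumes both probes fire in the same worker thread, so a thread-local variable pairs them up):

```
#!/usr/sbin/dtrace -qs

/* wall-clock time spent executing each zio_write_compress() taskq entry */
sdt:genunix::taskq-exec-start
/((taskq_ent_t *)arg1)->tqent_func == (task_func_t *)&`zio_write_compress/
{
        self->ts = timestamp;
}

sdt:genunix::taskq-exec-end
/self->ts/
{
        @lat["zio_write_compress exec time (ns)"] = quantize(timestamp - self->ts);
        self->ts = 0;
}

tick-5s
{
        printa(@lat);
        trunc(@lat);
}
```

A distribution skewed toward tens of milliseconds per entry, times ~1000 entries per burst, would account for multi-second busy periods on the taskq.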
This dtrace script counts the number of zio_write_compress() calls enqueued / execed by the kernel per second:

#!/usr/sbin/dtrace -qs

sdt:genunix::taskq-enqueue
/((taskq_ent_t *)arg1)->tqent_func == (task_func_t *)&`zio_write_compress/
{
        this->tqe = (taskq_ent_t *)arg1;
        @enq[this->tqe->tqent_func] = count();
}

sdt:genunix::taskq-exec-end
/((taskq_ent_t *)arg1)->tqent_func == (task_func_t *)&`zio_write_compress/
{
        this->tqe = (taskq_ent_t *)arg1;
        @exec[this->tqe->tqent_func] = count();
}

tick-1s
{
        /* printf("%Y\n", walltimestamp); */
        printf("TS(sec): %u\n", timestamp / 1000000000);
        printa("enqueue %a: %@d\n", @enq);
        printa("exec %a: %@d\n", @exec);
        trunc(@enq);
        trunc(@exec);
}

I see bursts of zio_write_compress() calls enqueued / execed, and periods of time where no zio_write_compress() taskq calls are enqueued or execed:

10# ~jk/src/dtrace/zpool_gzip7.d
TS(sec): 7829
TS(sec): 7830
TS(sec): 7831
TS(sec): 7832
TS(sec): 7833
TS(sec): 7834
TS(sec): 7835
enqueue zfs`zio_write_compress: 1330
exec zfs`zio_write_compress: 1330
TS(sec): 7836
TS(sec): 7837
TS(sec): 7838
TS(sec): 7839
TS(sec): 7840
TS(sec): 7841
TS(sec): 7842
TS(sec): 7843
TS(sec): 7844
enqueue zfs`zio_write_compress: 1116
exec zfs`zio_write_compress: 1116
TS(sec): 7845
TS(sec): 7846
TS(sec): 7847
TS(sec): 7848
TS(sec): 7849
TS(sec): 7850
TS(sec): 7851
TS(sec): 7852
TS(sec): 7853
TS(sec): 7854
TS(sec): 7855
TS(sec): 7856
TS(sec): 7857
enqueue zfs`zio_write_compress: 932
exec zfs`zio_write_compress: 932
TS(sec): 7858
TS(sec): 7859
TS(sec): 7860
TS(sec): 7861
TS(sec): 7862
TS(sec): 7863
TS(sec): 7864
TS(sec): 7865
TS(sec): 7866
TS(sec): 7867
enqueue zfs`zio_write_compress: 5
exec zfs`zio_write_compress: 5
TS(sec): 7868
enqueue zfs`zio_write_compress: 774
exec zfs`zio_write_compress: 774
TS(sec): 7869
TS(sec): 7870
TS(sec): 7871
TS(sec): 7872
TS(sec): 7873
TS(sec): 7874
TS(sec): 7875
TS(sec): 7876
enqueue zfs`zio_write_compress: 653
exec zfs`zio_write_compress: 653
TS(sec): 7877
TS(sec):
7878
TS(sec): 7879
TS(sec): 7880
TS(sec): 7881

And a final dtrace script, which monitors scheduler activity while filling a gzip-compressed pool:

#!/usr/sbin/dtrace -qs

sched:::off-cpu,
sched:::on-cpu,
sched:::remain-cpu,
sched:::preempt
{
        /* @[probename, stack()] = count(); */
        @[probename] = count();
}

tick-1s
{
        printf("%Y", walltimestamp);
        printa(@);
        trunc(@);
}

It shows periods of time with absolutely *no* scheduling activity (I guess this is when the "spa_zio_issue" taskq is working on such a big batch of submitted gzip compression calls):

21# ~jk/src/dtrace/zpool_gzip9.d
2007 May 6 21:38:12  preempt 13  off-cpu 808  on-cpu 808
2007 May 6 21:38:13  preempt 1  off-cpu 446  on-cpu 446
2007 May 6 21:38:14  preempt 6  off-cpu 2308  on-cpu 2308
2007 May 6 21:38:15  preempt 1  off-cpu 429  on-cpu 429
2007 May 6 21:38:16  off-cpu 412  on-cpu 412
2007 May 6 21:38:17  off-cpu 711  on-cpu 711
2007 May 6 21:38:18  preempt 2  off-cpu 238  on-cpu 238
2007 May 6 21:38:19  preempt 48  off-cpu 7740  on-cpu 7740
2007 May 6 21:38:20  preempt 114  off-cpu 20823  on-cpu 20823
2007 May 6 21:38:21  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May 6 21:38:22  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May 6 21:38:23  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May 6 21:38:24  preempt 181  off-cpu 32486  on-cpu 32486
2007 May 6 21:38:25  preempt 1  off-cpu 1612  on-cpu 1612
2007 May 6 21:38:26  preempt 6  off-cpu 1068  on-cpu 1068
2007 May 6 21:38:27
2007 May 6 21:38:28  preempt 56  off-cpu 9756  on-cpu 9756
2007 May 6 21:38:29  preempt 200  off-cpu 36195  on-cpu 36195
2007 May 6 21:38:30  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May 6 21:38:31  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May 6 21:38:32  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May 6 21:38:33  preempt 144  off-cpu 25982  on-cpu 25982
2007 May 6 21:38:34  preempt 118  off-cpu 20705  on-cpu 20705
2007 May 6 21:38:35  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May 6 21:38:36  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May 6 21:38:37  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May 6 21:38:38  preempt 231  off-cpu 42003  on-cpu
42003
2007 May 6 21:38:39  preempt 42  off-cpu 4781  on-cpu 4781
2007 May 6 21:38:40  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May 6 21:38:41  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May 6 21:38:42  preempt 4  off-cpu 486  on-cpu 486

This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss