On Sep 4, 2023, at 18:39, Mark Millard <mark...@yahoo.com> wrote:

> On Sep 4, 2023, at 10:05, Alexander Motin <m...@freebsd.org> wrote:
>
>> On 04.09.2023 11:45, Mark Millard wrote:
>>> On Sep 4, 2023, at 06:09, Alexander Motin <m...@freebsd.org> wrote:
>>>> per_txg_dirty_frees_percent is directly related to the delete delays we
>>>> see here. You are forcing ZFS to commit transactions each 5% of the dirty ARC
>>>> limit, which is 5% of 10% of memory size. I haven't looked at that code
>>>> recently, but I guess setting it too low can make ZFS commit transactions
>>>> too often, increasing write inflation for the underlying storage. I would
>>>> propose you restore the default and try again.
>>>
>>> While this machine is different, the original problem was worse than
>>> the issue here: the load average was less than 1 for most of
>>> the parallel bulk build when 30 was used. The fraction of time spent
>>> waiting was much longer than with 5. If I understand right, both too
>>> high and too low a value for a given type of context can lead to
>>> increased elapsed time, and getting it set to near optimal is a
>>> non-obvious exploration.
>>
>> IIRC this limit has been modified several times since it was originally
>> implemented. Maybe it could benefit from another look, if the default 30%
>> is not good. It would be good if generic ZFS issues like this were
>> reported to OpenZFS upstream to be visible to a wider public.
>> Unfortunately I have several other projects I must work on, so if it is
>> not a regression I can't promise I'll take it on right now; anybody else
>> is welcome.
>
> As I understand it, there are contexts where 5 is inappropriate
> and 30 works fairly well: no good single answer as to what
> value range will avoid problems.
>
>>> An overall point for the goal of my activity is: what makes a
>>> good test context for checking if ZFS is again safe to use?
>>> Maybe other tradeoffs make, say, 4 hardware threads more
>>> reasonable than 32.
>>
>> Thank you for your testing.
>> The best test is one that nobody else runs. It
>> also correlates with the topic of "safe to use", which also depends on
>> what it is used for. :)
>
> Looks like use of an M.2 Samsung SSD 960 PRO 1TB with a
> non-debug FreeBSD build is suitable for the bulk -a -J128
> test (no ALLOW_MAKE_JOBS variants enabled, USE_TMPFS=no in
> use) on the 32 hardware thread system. (The swap partition
> in use is from the normal environment's PCIe Optane media.)
> The %idle, the load averages, and %user stayed reasonable
> in a preliminary test. One thing it does introduce is trim
> management (both available and potentially useful). (Optane
> media does not support or need it.) No
> per_txg_dirty_frees_percent adjustment involved (still 5).
>
> I've learned not to use ^T for fear of /bin/sh aborting
> and messing up poudriere's context. So I now monitor with:
>
> # poudriere status -b
>
> in a separate ssh session.
>
> I'll note that I doubt I'd try for a complete bulk -a .
> I'd probably stop it if I noticed that the number of
> active builders dropped off for a notable time (normal
> waiting for prerequisites appearing to be why).
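For reference, the knob discussed above is exposed as a sysctl on FreeBSD. A minimal sketch of inspecting and adjusting it (30 is the upstream OpenZFS default mentioned in the thread; persisting the value in /etc/sysctl.conf is an assumption about how one would keep it across reboots):

```shell
# Inspect the current setting: the percent of a TXG's dirty data
# that frees may consume before ZFS forces an earlier commit.
sysctl vfs.zfs.per_txg_dirty_frees_percent

# Restore the OpenZFS default of 30 for a comparison run.
sysctl vfs.zfs.per_txg_dirty_frees_percent=30

# To keep a non-default choice (e.g. 5) across reboots, one could add:
#   vfs.zfs.per_txg_dirty_frees_percent=5
# to /etc/sysctl.conf.
```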
So much for that idea. It has reached a state of staying under
3500 w/s and up to 4.5ms/w (normally above 2ms/w). %busy is
wandering in the range 85% to 101%. Lots of top output showing
tx->tx. There is some read and other activity as well. Of course
the kBps figures are larger than in the earlier USB3 context
(larger kB figures).

It reached about 1350 port->package builds over the first hour
after the 2nd "Builder started". autotrim is off. Doing a
"zpool trim -w zamd64" leads to . . . larger w/s figures during
the process. So more exploring to do at some point. Possibly:

autotrim
per_txg_dirty_frees_percent

For now, I'm just running "zpool trim -w zamd64" once in a while
so the process continues better.

Still no evidence of deadlocks. No evidence of builds failing
for corruptions.

. . .

At around the end of the 2nd hour: 2920 or so built. Still no
evidence of deadlocks. No evidence of builds failing for
corruptions.

. . .

I've turned on autotrim without stopping the bulk build.

. . .

At around the end of the 3rd hour: 4080 or so built. Still no
evidence of deadlocks. No evidence of builds failing for
corruptions.

Looks like the %idle has been high for a significant time. I
think I'll stop this specific test and clean things out. Looks
like lang/guile* are examples of not respecting the lack of
ALLOW_MAKE_JOBS use. Hmm.
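For anyone reproducing this, the trim-related operations mentioned above are sketched below (zamd64 is the pool name used in this run; the commands themselves are standard OpenZFS ones):

```shell
# One-shot manual trim; -w blocks until the trim completes.
zpool trim -w zamd64

# Show per-vdev trim status/progress.
zpool status -t zamd64

# Enable automatic trim on the pool (what "turned on autotrim" does);
# this pool property can be changed while the pool is in use.
zpool set autotrim=on zamd64

# Confirm the property took effect.
zpool get autotrim zamd64
```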
The ^C ended up with:

^C[03:41:07] Error: Signal SIGINT caught, cleaning up and exiting
[main-amd64-bulk_a-default] [2023-09-04_17h49m28s] [sigint:] Queued: 34588 Built: 4331 Failed: 12 Skipped: 303 Ignored: 335 Fetched: 0 Tobuild: 29607  Time: 03:41:05
[03:41:07] Logs: /usr/local/poudriere/data/logs/bulk/main-amd64-bulk_a-default/2023-09-04_17h49m28s
[03:41:19] [95] [00:06:59] Finished databases/php80-pdo_sqlite | php80-pdo_sqlite-8.0.30: Failed: build-depends
[03:41:20] [60] [00:10:53] Finished www/php82-session | php82-session-8.2.9: Success
[03:41:20] [19] [00:06:42] Finished databases/pecl-memcache@php80 | php80-pecl-memcache-8.2: Failed: build-depends
[03:41:23] [62] [00:06:40] Finished net-mgmt/icingaweb2-module-incubator@php81 | icingaweb2-module-incubator-php81-0.20.0: Success
[03:41:24] Cleaning up
exit: cannot open ./var/run/116_nohang.pid: No such file or directory
exit: cannot open ./var/run/93_nohang.pid: No such file or directory

and JID 253's processes are stuck in the STOP state, along with
the related JID 0 processes. These are for x11-toolkits/libXaw .
I do not know how long it might have been stuck.
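The per-process kernel stacks shown below can be gathered with procstat; a sketch of collecting the same information for the stuck processes (the PID and JID values are examples taken from this run):

```shell
# List processes in jail JID 253 with their states (T/STOP, etc.).
ps -o pid,state,command -J 253

# Dump the kernel stack of a given PID (e.g. the stuck bulk.sh sh).
procstat -kk 3896

# Or scan every process at once for suspicious wait channels.
procstat -kk -a | grep -e killpg -e thread_suspend
```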
/usr/local/libexec/poudriere/sh -e /usr/local/share/poudriere/bulk.sh -J128 -jmain-amd64-bulk_a -a
is stuck in killpg state:

 3896 101784 sh - mi_switch+0xbb _sx_xlock_hard+0x3e1 killpg1+0x193 kern_kill+0x23c amd64_syscall+0x109 fast_syscall_common+0xf8

Most recent process birth time, listed-first order, for the STOPs:

86826 634820 sh - mi_switch+0xbb thread_suspend_switch+0xc3 cursig+0x7b9 ast_sig+0x1df ast_handler+0x88 ast+0x20 doreti_ast+0x1c
86825 321240 sh - mi_switch+0xbb thread_suspend_check+0x260 sig_intr+0x4c fork1+0x186 sys_fork+0x54 amd64_syscall+0x109 fast_syscall_common+0xf8
86550 308525 sh - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 pipe_read+0x420 dofileread+0x86 sys_read+0xbe amd64_syscall+0x109 fast_syscall_common+0xf8
 3998 637224 gmake - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
 3976 298566 sh - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
 3974 309336 gmake - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
 3962 577230 gmake - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
 3958 635598 sh - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
 3915 668148 make - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
 3912 638108 sh - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
46142 212631 sh -
mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_timedwait_sig+0x12 _cv_timedwait_sig_sbt+0x10c seltdwait+0x75 kern_select+0x95e sys_select+0x57 amd64_syscall+0x109 fast_syscall_common+0xf8

===
Mark Millard
marklmi at yahoo.com