main aarch64: poudriere-devel [UFS context] cpdup stuck in pgnslp state

2024-03-21 Thread Mark Millard
Note, more recent process creations towards top, older ones towards bottom:

  PID   JID USERNAME  PRI NICE    SIZE     RES STATE    C   TIME    CPU COMMAND
 . .
33693    19 root       68    0  6524Ki  3252Ki wait     3   0:00  0.00% /usr/bin/make -C /usr/ports/lang/gcc13 build
33692     0 root       68    0 15728Ki  3552Ki wait     0   0:00  0.00% sh: poudriere[main-CA7-default][02]: build_pkg (gcc13-13.2.0_4) (sh)
30174     0 root       68    0 15728Ki  3564Ki select   3   0:00  0.00% sh: poudriere[main-CA7-default][02]: build_pkg (gcc13-13.2.0_4) (sh)
26338     0 root       66    0 17740Ki  5044Ki pgnslp   0   0:01  0.00% cpdup -i0 -s0 -f -x ref 01
26308     0 root       68    0 15728Ki  3556Ki wait     0   0:00  0.00% sh: poudriere[main-CA7-default][01]: build_pkg (boost-libs-1.84.0) (sh)
33592     0 root       26    0 15728Ki  3388Ki piperd   2   0:01  0.00% sh: poudriere[main-CA7-default]: pkg_cacher_main (sh)
29205     0 root       68    0 15728Ki  3392Ki nanslp   2   1:52  0.14% sh: poudriere[main-CA7-default]: html_json_main (sh)
28834     0 root       20    0 15728Ki  3548Ki select   3   0:01  0.00% /usr/local/libexec/poudriere/sh -e /usr/local/share/poudriere/bulk.sh -jmain-CA7 -c -f /root/origins/CA7-origins.txt
28833     0 root       20    0 13560Ki  1924Ki wait     3   0:00  0.00% /bin/sh /root/build-ports-main-CA7.sh -c
 . .

pgnslp seems to be from: vm_page_acquire_unlocked in sys/vm/vm_page.c .
That in turn looks to be using vm_page_grab_sleep :

        if (!vm_page_grab_sleep(object, m, pindex, "pgnslp",
            allocflags, false))
                return (false);

and:

/*
 *  vm_page_grab_sleep
 *
 *  Sleep for busy according to VM_ALLOC_ parameters.  Returns true
 *  if the caller should retry and false otherwise.
 *
 *  If the object is locked on entry the object will be unlocked with
 *  false returns and still locked but possibly having been dropped
 *  with true returns.
 */
static bool
vm_page_grab_sleep(vm_object_t object, vm_page_t m, vm_pindex_t pindex,
    const char *wmesg, int allocflags, bool locked)
{

        if ((allocflags & VM_ALLOC_NOWAIT) != 0)
                return (false);

        /*
         * Reference the page before unlocking and sleeping so that
         * the page daemon is less likely to reclaim it.
         */
        if (locked && (allocflags & VM_ALLOC_NOCREAT) == 0)
                vm_page_reference(m);

        if (_vm_page_busy_sleep(object, m, pindex, wmesg, allocflags, locked) &&
            locked)
                VM_OBJECT_WLOCK(object);
        if ((allocflags & VM_ALLOC_WAITFAIL) != 0)
                return (false);

        return (true);
}
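
The return-value contract described in the comment above can be modeled in
user space. The following is only a sketch under stated assumptions: the
sk_* names and flag values are hypothetical (the real VM_ALLOC_* constants
are defined in the kernel headers and differ), and a counter stands in for
the actual sleep in _vm_page_busy_sleep. It shows the three outcomes:
NOWAIT returns false without sleeping, WAITFAIL sleeps and then returns
false, and everything else sleeps and asks the caller to retry.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical flag values for this sketch only; they are NOT the
 * kernel's VM_ALLOC_* values.
 */
enum {
    SK_ALLOC_NOWAIT   = 1 << 0,
    SK_ALLOC_WAITFAIL = 1 << 1,
};

/* Counts simulated sleeps; stands in for _vm_page_busy_sleep(). */
static int sk_sleep_count;

/* Returns true if the caller should retry, false otherwise. */
static bool
sk_grab_sleep(int allocflags)
{
    if ((allocflags & SK_ALLOC_NOWAIT) != 0)
        return (false);         /* never sleeps */

    sk_sleep_count++;           /* a real thread would sleep here */

    if ((allocflags & SK_ALLOC_WAITFAIL) != 0)
        return (false);         /* slept, but the caller must not retry */

    return (true);              /* slept, the caller retries */
}
```

In this model a thread stuck in the sleep step corresponds to the cpdup
process shown in the pgnslp state: it is waiting for a busy page and has
not yet returned to its caller.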

 . .
[10:08:06] [01] [00:00:00] Building devel/boost-libs | boost-libs-1.84.0
 . .

# poudriere status -b
[main-CA7-default] [2024-03-21_06h23m31s] [parallel_build] Queued: 265 Built: 213 Failed: 0   Skipped: 0   Ignored: 0   Fetched: 0   Tobuild: 52   Time: 10:50:40
 ID  TOTAL    ORIGIN           | PKGNAME           PHASE    TIME     TMPFS      CPU% MEM%
[01] 00:42:40 devel/boost-libs | boost-libs-1.84.0 starting 00:42:40 951.54 MiB
 . .

Unfortunately:

A) The booted kernel is my personal build based on -mcpu=cortex-a76
   and LSE_ATOMICS . (It is in use on a RPi5 booted via EDK2.)

B) The booted world is a PkgBase world.

C) The poudriere jail's world directory tree is my personal armv7
   world build based on -mcpu=cortex-a7 .

All are based on: main-n268827-75464941dc17 . (Well, PkgBase
commit identification/verification for world does not exist.
I happened to update PkgBase during a long lull for commits
to main. In the context, the boot-world seems unlikely to be
involved here.)

The boot media is a U2 Optane 960 GB used via a USB3 adaptor.

I've done bunches of builds in the (A)-(C) context on the RPi5
and have not seen this before, so: does not look to be readily
repeatable.

(Unfortunately, the purpose of the build was to find out how long
the particular build configuration took to finish building the
265 packages from scratch, for comparison to other builds.)

I may wait for the system to become fairly idle and then see about
forcing a crash dump. It may be a while before the poudriere bulk
runs out of packages it can build, absent building boost-libs .


Side note:
As far as I can tell, how to identify a context that allows
identification of what commit vintage a PkgBase world is based on
is unspecified so far. For a PkgBase kernel uname -apKU may well
report the kernel-commit identification well. (Hard to verify.)

===
Mark Millard
marklmi at yahoo.com




Re: Problem with make installworld

2024-03-21 Thread tuexen
> On 21. Mar 2024, at 18:12, Dimitry Andric  wrote:
> 
> On 21 Mar 2024, at 01:12, tue...@freebsd.org wrote:
>> 
>>> On 21. Mar 2024, at 00:27, Dimitry Andric  wrote:
>>> 
>>> On 20 Mar 2024, at 21:44, tue...@freebsd.org wrote:
 
>>>> I'm trying to run make buildworld / make installworld on a recent main
>>>> branch (some days old).
>>>> 
>>>> The problem is related to lib/libc/tests/ssp/Makefile
>>>> which contains:
>>>> _libclang_rt_ubsan= ${SYSROOT}${SANITIZER_LIBDIR}/libclang_rt.ubsan_standalone-${CRTARCH}.a
>>>> .if exists(${_libclang_rt_ubsan})
>>>> PROGS+= h_raw
>>>> LDADD.h_raw+=   ${SANITIZER_LDFLAGS}
>>>> 
>>>> When running make buildworld, we have
>>>> ${SYSROOT} = /usr/obj/usr/home/tuexen/freebsd-src/powerpc.powerpc64/tmp
>>>> ${SANITIZER_LIBDIR} = /usr/lib/clang/17/lib/freebsd
>>>> and so the script is looking for
>>>> /usr/obj/usr/home/tuexen/freebsd-src/powerpc.powerpc64/tmp/usr/lib/clang/17/lib/freebsd/libclang_rt.ubsan_standalone-powerpc64.a
>>>> which does not exist:
>>>> tuexen@blackbird:~ % ls -l /usr/obj/usr/home/tuexen/freebsd-src/powerpc.powerpc64/tmp/usr/lib/clang/17/lib/freebsd/
>>>> total 652
>>>> -r--r--r--  1 root wheel 284316 Mar 20 18:03 libclang_rt.profile-powerpc.a
>>>> -r--r--r--  1 root wheel 380704 Mar 20 17:41 libclang_rt.profile-powerpc64.a
>>>> 
>>>> Therefore, h_raw is NOT built.
>>> 
>>> As far as I can see, for powerpc64 it should have been built somewhere 
>>> during the libraries stage. So it's a bit strange that you don't have the 
>>> file. Did you use any special options to build?
>> No, not any I'm aware of. I can run tests or provide further information.
> 
> This was my mistake: I recently refactored lib/libclang_rt/Makefile to make 
> it more readable and maintainable, but it accidentally broke building of most 
> of the libclang_rt*.a files for powerpc64. It should now be fixed by 
> https://cgit.freebsd.org/src/commit/?id=f0620ceeccf0 .
Hi Dimitry,

I tested the main branch with your fix and I can confirm that the
problem is fixed.

Thank you very much for the quick fix!

Best regards
Michael
> 
> -Dimitry
> 




Re: Problem with make installworld

2024-03-21 Thread Dimitry Andric
On 21 Mar 2024, at 01:12, tue...@freebsd.org wrote:
> 
>> On 21. Mar 2024, at 00:27, Dimitry Andric  wrote:
>> 
>> On 20 Mar 2024, at 21:44, tue...@freebsd.org wrote:
>>> 
>>> I'm trying to run make buildworld / make installworld on a recent main
>>> branch (some days old).
>>> 
>>> The problem is related to lib/libc/tests/ssp/Makefile
>>> which contains:
>>> _libclang_rt_ubsan= ${SYSROOT}${SANITIZER_LIBDIR}/libclang_rt.ubsan_standalone-${CRTARCH}.a
>>> .if exists(${_libclang_rt_ubsan})
>>> PROGS+= h_raw
>>> LDADD.h_raw+=   ${SANITIZER_LDFLAGS}
>>> 
>>> When running make buildworld, we have
>>> ${SYSROOT} = /usr/obj/usr/home/tuexen/freebsd-src/powerpc.powerpc64/tmp
>>> ${SANITIZER_LIBDIR} = /usr/lib/clang/17/lib/freebsd
>>> and so the script is looking for
>>> /usr/obj/usr/home/tuexen/freebsd-src/powerpc.powerpc64/tmp/usr/lib/clang/17/lib/freebsd/libclang_rt.ubsan_standalone-powerpc64.a
>>> which does not exist:
>>> tuexen@blackbird:~ % ls -l /usr/obj/usr/home/tuexen/freebsd-src/powerpc.powerpc64/tmp/usr/lib/clang/17/lib/freebsd/
>>> total 652
>>> -r--r--r--  1 root wheel 284316 Mar 20 18:03 libclang_rt.profile-powerpc.a
>>> -r--r--r--  1 root wheel 380704 Mar 20 17:41 libclang_rt.profile-powerpc64.a
>>> 
>>> Therefore, h_raw is NOT built.
>> 
>> As far as I can see, for powerpc64 it should have been built somewhere 
>> during the libraries stage. So it's a bit strange that you don't have the 
>> file. Did you use any special options to build?
> No, not any I'm aware of. I can run tests or provide further information.

This was my mistake: I recently refactored lib/libclang_rt/Makefile to make it 
more readable and maintainable, but it accidentally broke building of most of 
the libclang_rt*.a files for powerpc64. It should now be fixed by 
https://cgit.freebsd.org/src/commit/?id=f0620ceeccf0 .

-Dimitry
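
The exists() check quoted in this message amounts to composing a path from
three make variables and testing whether the file is present. A minimal C
sketch of that logic follows, for illustration only: the sk_* function
names are hypothetical and not part of the FreeBSD build system, and the
real check is performed by make(1), not by C code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Compose
 *   ${SYSROOT}${SANITIZER_LIBDIR}/libclang_rt.ubsan_standalone-${CRTARCH}.a
 * into buf. Returns 0 on success, -1 if buf is too small or on error.
 */
static int
sk_ubsan_path(char *buf, size_t len, const char *sysroot,
    const char *libdir, const char *crtarch)
{
    int n;

    n = snprintf(buf, len, "%s%s/libclang_rt.ubsan_standalone-%s.a",
        sysroot, libdir, crtarch);
    return ((n < 0 || (size_t)n >= len) ? -1 : 0);
}

/* Rough analogue of make's exists(): nonzero if the path is present. */
static int
sk_exists(const char *path)
{
    return (access(path, F_OK) == 0);
}
```

With the SYSROOT and SANITIZER_LIBDIR values from the report and
"powerpc64" as CRTARCH, sk_ubsan_path() yields the same path the report
says is missing, so sk_exists() would return 0 there and h_raw would be
skipped.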




Re: Request for Testing: TCP RACK

2024-03-21 Thread Drew Gallatin
The entire point is to *NOT* go through the overhead of scheduling something 
asynchronously, but to take advantage of the fact that a user/kernel transition 
is going to trash the cache anyway.

In the common case of a system which has fewer than the threshold number of 
connections, we access the tcp_hpts_softclock function pointer, make one 
function call, and access hpts_that_need_softclock, and then return.  So that's 
2 variables and a function call.

I think it would be preferable to avoid that call, and to move the declaration 
of tcp_hpts_softclock and hpts_that_need_softclock so that they are in the same 
cacheline.  Then we'd be hitting just a single line in the common case.  (I've 
made comments on the review to that effect).
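
As a rough user-space illustration of that suggestion (not the code under
review): the struct, the sk_* names, and the 64-byte line size below are
all assumptions made for this sketch. The function pointer and the counter
sit in one cache-line-aligned structure, and the counter is tested inline
so the indirect call is skipped in the common case.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define SK_CACHE_LINE 64    /* assumed line size; varies by CPU */

/*
 * Both hot-path variables share one aligned structure, so the
 * common-case check touches a single cache line.
 */
struct sk_hpts_hot {
    alignas(SK_CACHE_LINE) void (*softclock)(void); /* models tcp_hpts_softclock */
    int need_softclock;             /* models hpts_that_need_softclock */
};

static struct sk_hpts_hot sk_hot;
static int sk_calls;                /* counts calls, for demonstration */

static void
sk_softclock(void)
{
    sk_calls++;
}

/* Models the hook that runs on return to user space. */
static void
sk_userret_hook(void)
{
    /* Inline check first: no function call unless work is pending. */
    if (sk_hot.softclock != NULL && sk_hot.need_softclock > 0)
        sk_hot.softclock();
}
```

With sk_hot.softclock set to sk_softclock and need_softclock at zero, the
hook returns without making the call; only a nonzero counter triggers it.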

Also, I wonder if the threshold could get higher by default, so that hpts is 
never called in this context unless we're to the point where we're scheduling 
thousands of runs of the hpts thread (and taking all those clock interrupts).

Drew

On Wed, Mar 20, 2024, at 8:17 PM, Konstantin Belousov wrote:
> On Tue, Mar 19, 2024 at 06:19:52AM -0400, rrs wrote:
> > Ok I have created
> > 
> > https://reviews.freebsd.org/D44420
> > 
> > To address the issue. I also attach a short version of the patch that Nuno
> > can try and validate it works. Drew you may want to try this and validate
> > the optimization does kick in since I can only now test that it does not
> > on my local box :)
> The patch still causes access to all cpu's cachelines on each userret.
> It would be much better to inc/check the threshold and only schedule the
> call when exceeded.  Then the call can occur in some dedicated context,
> like per-CPU thread, instead of userret.
> 
> > 
> > 
> > R
> > 
> > 
> > 
> > On 3/18/24 3:42 PM, Drew Gallatin wrote:
> > > No.  The goal is to run on every return to userspace for every thread.
> > > 
> > > Drew
> > > 
> > > On Mon, Mar 18, 2024, at 3:41 PM, Konstantin Belousov wrote:
> > > > On Mon, Mar 18, 2024 at 03:13:11PM -0400, Drew Gallatin wrote:
> > > > > I got the idea from
> > > > > https://people.mpi-sws.org/~druschel/publications/soft-timers-tocs.pdf
> > > > > > The gist is that the TCP pacing stuff needs to run frequently, and
> > > > > > rather than run it out of a clock interrupt, it's more efficient to run
> > > > > > it out of a system call context at just the point where we return to
> > > > > > userspace and the cache is trashed anyway. The current implementation
> > > > > > is fine for our workload, but probably not ideal for a generic system.
> > > > > > Especially one where something is banging on system calls.
> > > > > >
> > > > > > ASTs could be the right tool for this, but I'm super unfamiliar with
> > > > > > them, and I can't find any docs on them.
> > > > >
> > > > > Would ast_register(0, ASTR_UNCOND, 0, func) be roughly equivalent to
> > > > > what's happening here?
> > > > This call would need some AST number added, and then it registers the
> > > > ast to run on next return to userspace, for the current thread.
> > > > 
> > > > Is it enough?
> > > > >
> > > > > Drew
> > > > 
> > > > >
> > > > > On Mon, Mar 18, 2024, at 2:33 PM, Konstantin Belousov wrote:
> > > > > > On Mon, Mar 18, 2024 at 07:26:10AM -0500, Mike Karels wrote:
> > > > > > > On 18 Mar 2024, at 7:04, tue...@freebsd.org wrote:
> > > > > > >
> > > > > > > > >> On 18. Mar 2024, at 12:42, Nuno Teixeira wrote:
> > > > > > > >>
> > > > > > > >> Hello all!
> > > > > > > >>
> > > > > > > >> It works just fine!
> > > > > > > >> System performance is OK.
> > > > > > > >> Using patch on main-n268841-b0aaf8beb126(-dirty).
> > > > > > > >>
> > > > > > > >> ---
> > > > > > > >> net.inet.tcp.functions_available:
> > > > > > > > >> Stack                           D Alias                            PCB count
> > > > > > > > >> freebsd                           freebsd                          0
> > > > > > > > >> rack                            * rack                             38
> > > > > > > >> ---
> > > > > > > >>
> > > > > > > > >> It would be so nice that we can have a sysctl tunable for this
> > > > > > > > >> patch so we could do more tests without recompiling kernel.
> > > > > > > > Thanks for testing!
> > > > > > > >
> > > > > > > > > @gallatin: can you come up with a patch that is acceptable for
> > > > > > > > > Netflix and allows to mitigate the performance regression.
> > > > > > >
> > > > > > > > Ideally, tcphpts could enable this automatically when it starts to be
> > > > > > > > used (enough?), but a sysctl could select auto/on/off.
> > > > > > > There is already a well-known mechanism to request execution of the
> > > > > > > specific function on return to userspace, namely AST.  The difference
> > > > > > > with the current hack is that the execution is requested for one callback
> > > > > > > in the context of the specific thread.
> > > > > >
> > > > > > > Still, it might be worth a try to use it; what is the reason to
> > > > > > > hit a thread
> > > > > > that