On Thursday, 20 June 2024 20:06:17 BST Eli Schwartz wrote:
> On 6/20/24 8:46 AM, Peter Humphrey wrote:
> > On interrupting one such hang, I found that 32 install jobs had been
> > waiting to run; is this limit hard coded? I also saw "too many jobs" or
> > something, and "could not read job counter".
> > 
> > Is it now bug-report time?
> 
> It's not clear to me what "stalling" means here. Did portage stop doing
> any work as verified by ps or htop? Did it just spend a long time
> showing no progress?

All activity had stopped: top showed no portage processes at all. I didn't 
check ps.

> I do know what the 32 install jobs "waiting to run" is though. Or at
> least I'm pretty sure I know what it means.
> 
> Recent portage has this change:
> https://gitweb.gentoo.org/proj/portage.git/commit/?id=825db01b91a37dcd9890ee5bf9f462ea524ac5cc
> 
> "Add merge-wait FEATURES setting enabled by default"
> 
> From the changelog:
> 
> portage-3.0.62 (2024-02-22)
> --------------
> 
> * FEATURES: Add FEATURES="merge-wait", enabled by default, to control
>   whether we do parallel merges of images to the live filesystem (bug
>   #663324).
> 
>   If enabled, we serialize these merges.
> 
>   For now, this makes FEATURES="parallel-install" a no-op, but in
>   future, it will be improved to allow parallel merges, just not while
>   any packages are compiling.

That isn't what I saw, though. Parallel make jobs ran happily, but none of 
them were installed. Portage then stopped and waited until they had been. 
Nothing was happening at all (other than background OS tasks: kworker etc.), 
as confirmed by top. I waited a few minutes, then CTRL-C'd it. Instantly, the 
whole batch of 32 install tasks was released, appearing on the terminal as 
fast as it could display them; they ran in parallel until they'd all finished. 
I could then restart the emerging of @plasma, which is my set of plasma 
packages to be built on top of xorg; it causes >600 package emerges.

prh@wstn ~ $ ls /etc/portage/sets
apps  base  core  plasma  utils  xorg
prh@wstn ~ $ wc -l /etc/portage/sets/*
  20 /etc/portage/sets/apps
  32 /etc/portage/sets/base
  10 /etc/portage/sets/core
  34 /etc/portage/sets/plasma
   9 /etc/portage/sets/utils
  13 /etc/portage/sets/xorg
 118 total

The same thing happened twice more before the whole @plasma set was finished.

--->8

> In https://bugs.gentoo.org/934382 portage is adding additional options:
> 
> --jobs-merge-wait-threshold=X will cause portage to stop starting new
> jobs when X number of packages are in pending-merge state, and portage
> will copy the installed package files from the image to the root
> filesystem. Otherwise, portage will get there eventually but it might
> take a bit longer

I wonder if that's the culprit. I could test it by starting another system 
build, if I can disable that threshold and if you need me to - but remember 
the "too many jobs" and "could not read job counter" messages.
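
To make the threshold behaviour described in the quoted text concrete, here
is a toy model in Python. It is only a sketch of the idea as worded above
(stop starting new jobs once X packages are pending merge, then merge them);
it is not portage's scheduler code. The names simulate and max_backlog are
invented for illustration, and the model builds one package at a time rather
than in parallel.

```python
def simulate(packages, threshold=None):
    """Toy model of a merge-wait threshold (hypothetical, not portage code).

    Each package is "built" and then waits in a pending-merge list. When the
    list reaches `threshold`, no new build is started until everything
    pending has been merged. With threshold=None, merges are deferred until
    all builds are done.
    """
    pending = []  # packages built but not yet merged
    log = []      # ordered (event, package) tuples
    for pkg in packages:
        log.append(("build", pkg))
        pending.append(pkg)
        if threshold is not None and len(pending) >= threshold:
            # Threshold reached: drain the backlog before building more.
            log.extend(("merge", p) for p in pending)
            pending.clear()
    # Whatever is still pending gets merged at the end.
    log.extend(("merge", p) for p in pending)
    return log


def max_backlog(log):
    """Return the largest number of packages pending merge at any point."""
    backlog = peak = 0
    for event, _ in log:
        backlog += 1 if event == "build" else -1
        peak = max(peak, backlog)
    return peak


if __name__ == "__main__":
    pkgs = ["pkg%d" % i for i in range(1, 6)]
    print(max_backlog(simulate(pkgs, threshold=2)))
    print(max_backlog(simulate(pkgs)))
```

In this model a threshold caps how many finished builds can pile up before
they are merged, whereas without one the whole batch waits until the end,
which resembles the burst of install tasks released after CTRL-C.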

-- 
Regards,
Peter.



