On Thursday, 20 June 2024 20:06:17 BST Eli Schwartz wrote:
> On 6/20/24 8:46 AM, Peter Humphrey wrote:
> > On interrupting one such hang, I found that 32 install jobs had been
> > waiting to run; is this limit hard coded? I also saw "too many jobs" or
> > something, and "could not read job counter".
> >
> > Is it now bug-report time?
>
> It's not clear to me what "stalling" means here. Did portage stop doing
> any work as verified by ps or htop? Did it just spend a long time
> showing no progress?

All activity had stopped: top showed no portage processes at all. I didn't
check ps.

> I do know what the 32 install jobs "waiting to run" is though. Or at
> least I'm pretty sure I know what it means.
>
> Recent portage has this change:
> https://gitweb.gentoo.org/proj/portage.git/commit/?id=825db01b91a37dcd9890ee5bf9f462ea524ac5cc
>
> "Add merge-wait FEATURES setting enabled by default"
>
> From the changelog:
>
> portage-3.0.62 (2024-02-22)
> --------------
>
> * FEATURES: Add FEATURES="merge-wait", enabled by default, to control
>   whether we do parallel merges of images to the live filesystem (bug
>   #663324).
>
>   If enabled, we serialize these merges.
>
>   For now, this makes FEATURES="parallel-install" a no-op, but in
>   future, it will be improved to allow parallel merges, just not while
>   any packages are compiling.

That isn't what I saw, though. Parallel make jobs ran happily, but none of
them were installed. Portage then stopped and waited until they had been.
Nothing was happening at all (other than background OS tasks: kworker etc.),
as confirmed by top. I waited a few minutes, then CTRL-C'd it. Instantly, the
whole batch of 32 install tasks was released, appearing on the terminal as
fast as it could display them; they ran in parallel until they'd all finished.

I could then restart the emerging of @plasma, which is my set of plasma
packages to be built on top of xorg; it causes >600 package emerges.

prh@wstn ~ $ ls /etc/portage/sets
apps  base  core  plasma  utils  xorg
prh@wstn ~ $ wc -l /etc/portage/sets/*
  20 /etc/portage/sets/apps
  32 /etc/portage/sets/base
  10 /etc/portage/sets/core
  34 /etc/portage/sets/plasma
   9 /etc/portage/sets/utils
  13 /etc/portage/sets/xorg
 118 total

The same thing happened twice more before the whole @plasma set was finished.

--->8

> In https://bugs.gentoo.org/934382 portage is adding additional options:
>
> --jobs-merge-wait-threshold=X will cause portage to stop starting new
> jobs when X number of packages are in pending-merge state, and portage
> will copy the installed package files from the image to the root
> filesystem. Otherwise, portage will get there eventually but it might
> take a bit longer

I wonder if that's the culprit. I could test it by starting another system
build, if I can disable that threshold and if you need me to - but remember
the "too many jobs" and "could not read job counter".

-- 
Regards,
Peter.