Repeated build failure of GDK-Pixbuf 2.38.1
Hi all,
	I've been having the following build failure during ‘guix upgrade’:

[…]
building /gnu/store/js4ix1yhhhk2j184bc19ynq3cap1f3a5-gdk-pixbuf-2.38.1.drv...
building /gnu/store/qjnfwlnhs8x4dn23wzsa0815fc4kknms-llvm-7.0.1.drv...
builder for `/gnu/store/js4ix1yhhhk2j184bc19ynq3cap1f3a5-gdk-pixbuf-2.38.1.drv' failed with exit code 1
build of /gnu/store/js4ix1yhhhk2j184bc19ynq3cap1f3a5-gdk-pixbuf-2.38.1.drv failed
View build log at '/var/log/guix/drvs/js/4ix1yhhhk2j184bc19ynq3cap1f3a5-gdk-pixbuf-2.38.1.drv.bz2'.
cannot build derivation `/gnu/store/crrrvbdh0bhg305l70fxyx04mwpz5ff7-gdk-pixbuf+svg-2.38.1.drv': 1 dependencies couldn't be built
building /gnu/store/wyib6fc7l9y6rgs66mpj9x3dpblkx93z-wayland-protocols-1.17.tar.xz.drv...
building /gnu/store/5nfc4y0i7bgz65fdcsfwvrmn113gr3yl-xauth-1.0.10.tar.bz2.drv...
cannot build derivation `/gnu/store/8h7ps1kdf6glgf0wags3ywsxg9d266gv-gajim-1.1.3.drv': 1 dependencies couldn't be built
guix upgrade: error: build of `/gnu/store/8h7ps1kdf6glgf0wags3ywsxg9d266gv-gajim-1.1.3.drv' failed
[…]

	I think that this has been the case for all but the most recent pull generation that I haven't GC'd, as my only package generation is the one that remained after a GC that I did in early December:

Generation 37	Dec 09 2019 01:44:33	(current)
  glibc-utf8-locales	2.28	out	/gnu/store/94k5w17z54w25lgp90czdqfv9m4hwzhq-glibc-utf8-locales-2.28
  tiled	1.2.3	out	/gnu/store/pijdpwzmk27xxi9ay6h0n5069m8x4lfx-tiled-1.2.3
  gajim	1.1.3	out	/gnu/store/7ba6j1r1ns2fa71dnba4m3lvi3s2hll5-gajim-1.1.3
  nss-certs	3.43	out	/gnu/store/6w65nzbc3ah30y5kr4zx9rcgknpjr1f5-nss-certs-3.43
  youtube-dl	2019.04.30	out	/gnu/store/idg86l68r4qia0l0qs2r0w7vg7gf2189-youtube-dl-2019.04.30
  darcs	2.14.2	out	/gnu/store/f9zs0z6kmd4j60p8443ki2dgw07gwm7x-darcs-2.14.2
  gajim-omemo	2.6.28	out	/gnu/store/sz4fryksqlcihz8ln4j05dpm3jzf9jc5-gajim-omemo-2.6.28

	I'm pretty sure that no ‘guix upgrade’ since that GC has succeeded, else I'd have more than 1 package generation.
	‘guix pull’ has been succeeding, as I have several pull generations:

$ guix pull --list-generations 2>/dev/null | sed --regexp-extended --quiet -- "s/^(commit: .*)$/\1/p"
commit: bc587eb178799ccb9bd051f8f46569e1673a9991
commit: e3388d6361dedabb2f6df9cdd5cc98e6cac7f457
commit: 8c5cde2546b8bcca2285d5fa6545adeb5076b74e
commit: 762867313cb26bee32fafb6821a7ef7584c537c2
commit: 704719edade1368f798c9301f3a8197a0df5c930
commit: d63e0ee78d00ff1ccef9f74307b237b7e46ea15d
commit: 47ea2ad196a41a3c3de3e5fa970b6881495c0e8f
commit: fcb510c541e83291ea6682cba87020a913c64914
commit: 7ee8acbb76bd440a8985a4b2a20b75521c6853ed

	If I understand correctly, the GDK-Pixbuf build regression was introduced between commits bc587eb178799ccb9bd051f8f46569e1673a9991 and e3388d6361dedabb2f6df9cdd5cc98e6cac7f457, sometime between Wednesday 11th December and Wednesday 22nd January, which narrows it down to the 6 weeks in which I didn't use Guix. I hope the detail helps you fix the regression.

Regards,
James R. Haigh.
-- 
4 days, 3 hours, and 58 minutes left until FOSDEM 2020 (Saturday)!
5 days, 3 hours, and 58 minutes left until FOSDEM 2020 (Sunday)!
Wealth doesn't bring happiness, but poverty brings sadness.
https://wiki.FSFE.org/Fellows/JRHaigh
Sent from Debian with Claws Mail, using email subaddressing as an alternative to error-prone heuristical spam filtering.
Re: Default autogroup niceness of Guix build daemon
Hi Giovanni,

At 2020-01-27Mon18:37:53+01, Giovanni Biscuolo sent:
> […]
> Since Debian 9 [uses] systemd, should be possible by configuring a limit in
> the systemd service unit file [1]; I've never tried but try adding
> "LimitNICE=19" in the [Service] stanza.

	Thanks for the suggestion. I tried it, and it seemed to affect only process niceness, not autogroup niceness. Htop reported process niceness values of 19 on the relevant processes, and I already knew that setting these manually from Htop does not fix the lag. Autogroup niceness still defaults to 0:

$ (for P in $(ps --group="guixbuild" --format="pid="); do F="/proc/$P/autogroup"; echo "$F"; cat "$F"; done)
/proc/26741/autogroup
/autogroup-12141 nice 0

> Documentation on that parameter here:
> https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Process%20Properties

	Thank you. There does not seem to be anything on autogroup niceness there, which is apparently deliberate, unfortunately. I also tried ‘LimitNICE=+19’ and ‘CPUSchedulingPolicy=idle’, but neither prevented Guix builds from hogging the CPU. As well as reloading the systemd configuration, I also tried interrupting the build and starting it again, but the lag returned as soon as the build got going again.
	As a temporary workaround, I tried disabling CFS's autogrouping feature with:

sudo tee /proc/sys/kernel/sched_autogroup_enabled <<<0

…but it did not seem to do anything, despite Htop reporting that the relevant processes have a process niceness of 19. It seems that autogrouping is still enabled, despite the attempt to disable it, because the command in my previous email (repeated below) to imperatively change the autogroup niceness (to 19 and then back to 0) continues to have an unmistakable effect on interactivity.
	As all other attempts so far have had no noticeable effect, that command remains the only workaround so far:

$ (for P in $(ps --group="guixbuild" --format="pid="); do F="/proc/$P/autogroup"; echo "$F"; cat "$F" && sudo tee "$F" <<<19; done)
/proc/26741/autogroup
/autogroup-12141 nice 0
[sudo] password for JRHaigh: 
19
$ (for P in $(ps --group="guixbuild" --format="pid="); do F="/proc/$P/autogroup"; echo "$F"; cat "$F" && sudo tee "$F" <<<0; done)
/proc/26741/autogroup
/autogroup-12141 nice 19
0
$ (for P in $(ps --group="guixbuild" --format="pid="); do F="/proc/$P/autogroup"; echo "$F"; cat "$F" && sudo tee "$F" <<<19; done)
/proc/26741/autogroup
/autogroup-12141 nice 0
19

	These commands very clearly made my system responsive, laggy, and then responsive again, which was totally unexpected, seeing as I had supposedly disabled autogrouping. Given that there are reasons why autogrouping was added and has been the default on many distros for many years, it would be best if the solution worked with autogrouping enabled anyway, so I don't really want to waste time investigating why autogrouping does not seem to get disabled.
	Are there any modifications that could be made to the Guix daemon to fix this? Seeing as systemd apparently does not support autogroup niceness, perhaps the next best alternative is to fix the CPUSchedulingPolicy setting. I'm guessing that it currently affects only the Guix daemon itself and does not cascade to its builders. Fixing the CPUSchedulingPolicy setting might provide a decent, declarative solution for build deprioritisation that could be used instead of niceness.

Regards,
James R. Haigh.
-- 
4 days, 5 hours, and 4 minutes left until FOSDEM 2020 (Saturday)!
5 days, 5 hours, and 4 minutes left until FOSDEM 2020 (Sunday)!
Wealth doesn't bring happiness, but poverty brings sadness.
https://wiki.FSFE.org/Fellows/JRHaigh
Sent from Debian with Claws Mail, using email subaddressing as an alternative to error-prone heuristical spam filtering.
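The imperative loop above could be wrapped into a small reusable function, so the workaround does not have to be retyped each time. This is only a sketch of the same workaround under the same assumptions as above (a ‘guixbuild’ build-user group and sudo access), not a fix:

```shell
# Sketch: set the autogroup niceness of every process running under a
# given group.  Same mechanism as the manual loop above.
set_autogroup_nice() {
  # $1 = group name (e.g. guixbuild); $2 = niceness to write (0..19)
  local p
  for p in $(ps --group="$1" --format="pid="); do
    # Each process's autogroup niceness is exposed in /proc/PID/autogroup.
    echo "$2" | sudo tee "/proc/$p/autogroup" > /dev/null
  done
}

# Usage:
#   set_autogroup_nice guixbuild 19   # deprioritise builds
#   set_autogroup_nice guixbuild 0    # restore the default
```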
Re: Default autogroup niceness of Guix build daemon
Hi James,

"J. R. Haigh (re. Guix)" writes:
> Hi all,
> 	I've been using Guix on Debian 9 Stretch [...]
> 	Is there a way to declaratively set the default autogroup
> niceness of Guix's build daemon?

Since Debian 9 uses systemd, it should be possible by configuring a limit in the systemd service unit file [1]; I've never tried it, but try adding "LimitNICE=19" in the [Service] stanza.

Documentation on that parameter here:
https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Process%20Properties

Remember to "systemctl daemon-reload" after editing the systemd service unit file.

HTH! Gio'

[1] /etc/systemd/system/guix-daemon.service

[...]

-- 
Giovanni Biscuolo

Xelera IT Infrastructures
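Applied concretely, the suggestion might look like the following systemd drop-in (a sketch only; the drop-in directory and file name are assumptions, and editing the unit file [1] directly works just as well):

```
# /etc/systemd/system/guix-daemon.service.d/override.conf  (hypothetical path)
[Service]
LimitNICE=19
```

Then reload with "systemctl daemon-reload" and restart guix-daemon for the limit to take effect.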
Re: Guix and openmpi in a container environment
‐‐‐ Original Message ‐‐‐
On Sunday, 19 January 2020 11:25, Todor Kondić wrote:

> I am getting mpirun errors when trying to execute a simple
>
>     mpirun -np 1 program
>
> (where program is e.g. 'ls') command in a container environment.
>
> The error is usually:
>
>     All nodes which are allocated for this job are already filled.
>
> which makes no sense, as I am trying this on my workstation (single socket,
> four cores -- your off-the-shelf i5 CPU) with no scheduling system enabled.
>
> I set up the container with this command:
>
>     guix environment -C -N --ad-hoc -m default.scm
>
> where default.scm:
>
>     (use-modules (guix packages))
>     (specifications->manifest
>      `(;; Utilities
>        "less"
>        "bash"
>        "make"
>        "openssh"
>        "guile"
>        "nano"
>        "glibc-locales"
>        "gcc-toolchain@7.4.0"
>        "gfortran-toolchain@7.4.0"
>        "python"
>        "openmpi"
>        "fftw"
>        "fftw-openmpi"
>        ,@(map (lambda (x) (package-name x)) %base-packages)))
>
> Simply installing openmpi (guix package -i openmpi) in my usual Guix profile
> just works out of the box. So, there has to be some quirk where the openmpi
> container installation is blind to some settings within the usual environment.

For the environment above, if the mpirun invocation is changed to provide the hostname,

    mpirun --host $HOSTNAME:4 -np 4 ls

ls is executed in four processes and the output is four times the contents of the current directory, as expected. Of course, ls is not an MPI program. However, testing this elementary Fortran MPI code,

---
program testrun2
  use mpi
  implicit none
  integer :: ierr
  call mpi_init(ierr)
  call mpi_finalize(ierr)
end program testrun2
---

fails with runtime errors on any number of processes. The compilation line was:

    mpif90 test2.f90 -o testrun2

The mpirun command:

    mpirun --host $HOSTNAME:4 -np 4 ./testrun2

Let me reiterate: in the normal user environment there is no need to declare the host and its maximal number of slots, and there the runtime errors are gone.
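As a possible convenience while experimenting (untested in the container, so treat it as an assumption), Open MPI can also read the host and slot count from a hostfile instead of passing --host on every invocation:

```
# hostfile (arbitrary name): declare the local machine with four slots
localhost slots=4
```

which would then be used as: mpirun --hostfile hostfile -np 4 ./testrun2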
	Could it be that the openmpi package needs a few other basic dependencies, not present in the package declaration, for the particular case of a single-node (normal PC) machine?
	Also, I noted that gfortran/mpif90 ignores the "CPATH" and "LIBRARY_PATH" environment variables; I had to specify these paths explicitly via -I and -L flags to the compiler.
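Since gfortran/mpif90 apparently ignores those variables, one workaround is to expand the colon-separated lists into explicit flags at the command line. A small sketch (the helper name is mine; paths are whatever the container profile sets):

```shell
# Hypothetical helper: turn a colon-separated path list (like CPATH or
# LIBRARY_PATH) into explicit compiler flags (-I or -L).
path_to_flags() {
  # $1 = flag prefix (-I or -L); $2 = colon-separated path list
  local IFS=':' p out=""
  for p in $2; do
    # Skip empty components (e.g. a leading or doubled colon).
    [ -n "$p" ] && out="$out $1$p"
  done
  printf '%s' "${out# }"
}

# Usage:
#   mpif90 test2.f90 -o testrun2 \
#     $(path_to_flags -I "$CPATH") $(path_to_flags -L "$LIBRARY_PATH")
```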