Repeated build failure of GDK-Pixbuf 2.38.1

2020-01-27 Thread J. R. Haigh (re. Guix)
Hi all,
I've been having the following build failure during ‘guix upgrade’:
[…]
building /gnu/store/js4ix1yhhhk2j184bc19ynq3cap1f3a5-gdk-pixbuf-2.38.1.drv...
building /gnu/store/qjnfwlnhs8x4dn23wzsa0815fc4kknms-llvm-7.0.1.drv...
builder for 
`/gnu/store/js4ix1yhhhk2j184bc19ynq3cap1f3a5-gdk-pixbuf-2.38.1.drv' failed with 
exit code 1
build of /gnu/store/js4ix1yhhhk2j184bc19ynq3cap1f3a5-gdk-pixbuf-2.38.1.drv 
failed
View build log at 
'/var/log/guix/drvs/js/4ix1yhhhk2j184bc19ynq3cap1f3a5-gdk-pixbuf-2.38.1.drv.bz2'.
cannot build derivation 
`/gnu/store/crrrvbdh0bhg305l70fxyx04mwpz5ff7-gdk-pixbuf+svg-2.38.1.drv': 1 
dependencies couldn't be built
building 
/gnu/store/wyib6fc7l9y6rgs66mpj9x3dpblkx93z-wayland-protocols-1.17.tar.xz.drv...
building /gnu/store/5nfc4y0i7bgz65fdcsfwvrmn113gr3yl-xauth-1.0.10.tar.bz2.drv...
cannot build derivation 
`/gnu/store/8h7ps1kdf6glgf0wags3ywsxg9d266gv-gajim-1.1.3.drv': 1 dependencies 
couldn't be built
guix upgrade: error: build of 
`/gnu/store/8h7ps1kdf6glgf0wags3ywsxg9d266gv-gajim-1.1.3.drv' failed
[…]
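
(For reference, the build log mentioned above is bzip2-compressed; it can be read with something like

$ bzcat '/var/log/guix/drvs/js/4ix1yhhhk2j184bc19ynq3cap1f3a5-gdk-pixbuf-2.38.1.drv.bz2' | less

in case the actual error from the gdk-pixbuf build is needed.)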

I think that this has been the case for all but the most recent pull generation 
that I haven't GC'd, as my only package generation is the one that remained 
after a GC that I did in early December:
Generation 37   Dec 09 2019 01:44:33    (current)
  glibc-utf8-locales  2.28        out  /gnu/store/94k5w17z54w25lgp90czdqfv9m4hwzhq-glibc-utf8-locales-2.28
  tiled               1.2.3       out  /gnu/store/pijdpwzmk27xxi9ay6h0n5069m8x4lfx-tiled-1.2.3
  gajim               1.1.3       out  /gnu/store/7ba6j1r1ns2fa71dnba4m3lvi3s2hll5-gajim-1.1.3
  nss-certs           3.43        out  /gnu/store/6w65nzbc3ah30y5kr4zx9rcgknpjr1f5-nss-certs-3.43
  youtube-dl          2019.04.30  out  /gnu/store/idg86l68r4qia0l0qs2r0w7vg7gf2189-youtube-dl-2019.04.30
  darcs               2.14.2      out  /gnu/store/f9zs0z6kmd4j60p8443ki2dgw07gwm7x-darcs-2.14.2
  gajim-omemo         2.6.28      out  /gnu/store/sz4fryksqlcihz8ln4j05dpm3jzf9jc5-gajim-omemo-2.6.28

I'm pretty sure that no ‘guix upgrade’ since that GC has succeeded, or else I'd 
have more than 1 package generation. ‘guix pull’ has been succeeding, as I have 
several pull generations:

$ guix pull --list-generations 2>/dev/null | sed --regexp-extended --quiet -- 
"s/^(commit: .*)$/\1/p"
commit: bc587eb178799ccb9bd051f8f46569e1673a9991
commit: e3388d6361dedabb2f6df9cdd5cc98e6cac7f457
commit: 8c5cde2546b8bcca2285d5fa6545adeb5076b74e
commit: 762867313cb26bee32fafb6821a7ef7584c537c2
commit: 704719edade1368f798c9301f3a8197a0df5c930
commit: d63e0ee78d00ff1ccef9f74307b237b7e46ea15d
commit: 47ea2ad196a41a3c3de3e5fa970b6881495c0e8f
commit: fcb510c541e83291ea6682cba87020a913c64914
commit: 7ee8acbb76bd440a8985a4b2a20b75521c6853ed

If I understand correctly, the GDK-Pixbuf build regression was introduced 
between commits bc587eb178799ccb9bd051f8f46569e1673a9991 and 
e3388d6361dedabb2f6df9cdd5cc98e6cac7f457, sometime between Wednesday 11th 
December and Wednesday 22nd January, which narrows it down to the 6 weeks in 
which I didn't use Guix.
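
In case it helps to narrow this down further, intermediate commits could 
presumably be tested without touching the profile, with something like this 
(untested here; it assumes ‘guix time-machine’ is available in this Guix):

$ guix time-machine --commit=<commit> -- build gdk-pixbuf

substituting <commit> for candidates between the two commits above.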
I hope the detail helps you fix the regression.

Regards,
James R. Haigh.
-- 
4 days, 3 hours, and 58 minutes left until FOSDEM 2020 (Saturday)!
5 days, 3 hours, and 58 minutes left until FOSDEM 2020 (Sunday)!
Wealth doesn't bring happiness, but poverty brings sadness.
https://wiki.FSFE.org/Fellows/JRHaigh
Sent from Debian with Claws Mail, using email subaddressing as an alternative 
to error-prone heuristical spam filtering.



Re: Default autogroup niceness of Guix build daemon

2020-01-27 Thread J. R. Haigh (re. Guix)
Hi Giovanni,

At 2020-01-27Mon18:37:53+01, Giovanni Biscuolo sent:
> […]
> Since Debian 9 [uses] systemd, should be possible by configuring a limit in 
> the systemd service unit file [1]; I've never tried but try adding 
> "LimitNICE=19" in the [Service] stanza.

Thanks for the suggestion. I tried it and it seemed to only affect process 
niceness, not autogroup niceness. Htop reported process niceness values of 19 
on the relevant processes, and I already knew that setting these manually from 
Htop does not fix the lag. Autogroup niceness still defaults to 0:

$ (for P in $(ps --group="guixbuild" --format="pid="); do 
F="/proc/$P/autogroup"; echo "$F"; cat "$F"; done)
/proc/26741/autogroup
/autogroup-12141 nice 0

> Documentation on that parameter here:
> https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Process%20Properties

Thank you. There does not seem to be anything on autogroup niceness there, 
which is apparently deliberate, unfortunately. I also tried ‘LimitNICE=+19’ and 
‘CPUSchedulingPolicy=idle’ but neither prevented Guix builds from hogging the 
CPU. As well as reloading the SystemD configuration, I also tried interrupting 
the build and starting it again, but the lag returned as soon as the build got 
going again.
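(For concreteness, reloading and restarting here means something along the 
lines of ‘sudo systemctl daemon-reload’ followed by ‘sudo systemctl restart 
guix-daemon’, assuming the unit is named guix-daemon as in the file Giovanni 
pointed to.)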
As a temporary workaround, I tried disabling CFS's autogrouping feature 
with:

sudo tee /proc/sys/kernel/sched_autogroup_enabled <<<0

…but it did not seem to do anything, despite Htop reporting that the relevant 
processes have a process niceness of 19. It seems that autogrouping is still 
enabled, despite the attempt to disable it, because the command from my previous 
email (repeated below) to imperatively change the autogroup niceness (to 19 and 
then back to 0) continues to have an unmistakable effect on interactivity. As 
all other attempts so far have had no noticeable effect, that command remains 
the only workaround so far:

$ (for P in $(ps --group="guixbuild" --format="pid="); do 
F="/proc/$P/autogroup"; echo "$F"; cat "$F" && sudo tee "$F" <<<19; done)
/proc/26741/autogroup
/autogroup-12141 nice 0
[sudo] password for JRHaigh: 
19
$ (for P in $(ps --group="guixbuild" --format="pid="); do 
F="/proc/$P/autogroup"; echo "$F"; cat "$F" && sudo tee "$F" <<<0; done)  ##
/proc/26741/autogroup
/autogroup-12141 nice 19
0
$ (for P in $(ps --group="guixbuild" --format="pid="); do 
F="/proc/$P/autogroup"; echo "$F"; cat "$F" && sudo tee "$F" <<<19; done)
/proc/26741/autogroup
/autogroup-12141 nice 0
19

These commands very clearly made my system responsive, laggy, and then 
responsive again, which was totally unexpected seeing as I had supposedly 
disabled autogrouping. Given that there are reasons why autogrouping was added 
and has been the default on many distros for many years, it would be best if 
the solution worked with autogrouping enabled anyway, so I don't really want to 
waste time investigating why autogrouping does not seem to get disabled.
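In the meantime, that imperative workaround could presumably be wrapped into a 
small helper script, something along these lines (a hypothetical, untested 
sketch built from the loop above):

#!/bin/sh
# renice-guix-autogroups (hypothetical name): set the autogroup niceness of
# every process running under the guixbuild group to $1 (default 19).
NICE="${1:-19}"
for P in $(ps --group="guixbuild" --format="pid="); do
    F="/proc/$P/autogroup"
    [ -e "$F" ] && echo "$NICE" | sudo tee "$F" > /dev/null
done
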
Are there any modifications that could be made to the Guix daemon to 
fix this? Seeing as SystemD apparently does not support autogroup niceness, 
perhaps the next best alternative is to fix the CPUSchedulingPolicy setting; 
I'm guessing that it currently only affects the Guix daemon itself and does not 
cascade to its builders. If fixed, CPUSchedulingPolicy might provide a decent, 
declarative solution for build deprioritisation that could be used instead of 
niceness.
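One way to check whether the policy actually reaches the builders might be 
something like this, run while a build is in progress (an untested sketch; it 
assumes util-linux's ‘chrt’ is available):

$ for P in $(ps --group="guixbuild" --format="pid="); do chrt --pid "$P"; done

If the builder processes report SCHED_IDLE, the policy does cascade; if they 
still report SCHED_OTHER, it does not.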

Regards,
James R. Haigh.
-- 
4 days, 5 hours, and 4 minutes left until FOSDEM 2020 (Saturday)!
5 days, 5 hours, and 4 minutes left until FOSDEM 2020 (Sunday)!
Wealth doesn't bring happiness, but poverty brings sadness.
https://wiki.FSFE.org/Fellows/JRHaigh
Sent from Debian with Claws Mail, using email subaddressing as an alternative 
to error-prone heuristical spam filtering.



Re: Default autogroup niceness of Guix build daemon

2020-01-27 Thread Giovanni Biscuolo
Hi James,

"J. R. Haigh (re. Guix)"  writes:

> Hi all,
>   I've been using Guix on Debian 9 Stretch

[...]

>   Is there a way to declaratively set the default autogroup
>   niceness of Guix's build daemon?

Since Debian 9 uses systemd, it should be possible by configuring a limit
in the systemd service unit file [1]; I've never tried it, but try adding
"LimitNICE=19" in the [Service] stanza.

Documentation on that parameter here:
https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Process%20Properties

Remember to run "systemctl daemon-reload" after editing the systemd
service unit file.

HTH! Gio'

[1] /etc/systemd/system/guix-daemon.service

[...]

-- 
Giovanni Biscuolo

Xelera IT Infrastructures




Re: Guix and openmpi in a container environment

2020-01-27 Thread Todor Kondić
‐‐‐ Original Message ‐‐‐
On Sunday, 19 January 2020 11:25, Todor Kondić  wrote:

> I am getting mpirun errors when trying to execute a simple
>
> mpirun -np 1 program
>
> (where program is e.g. 'ls') command in a container environment.
>
> The error is usually:
>
> All nodes which are allocated for this job are already filled.
>
> which makes no sense, as I am trying this on my workstation (single socket, 
> four cores -- your off-the-shelf i5 cpu) and no scheduling system enabled.
>
> I set up the container with this command:
>
> guix environment -C -N --ad-hoc -m default.scm
>
> where default.scm:
>
> (use-modules (guix packages))
> (specifications->manifest
>  `(;; Utilities
>    "less"
>    "bash"
>    "make"
>    "openssh"
>    "guile"
>    "nano"
>    "glibc-locales"
>    "gcc-toolchain@7.4.0"
>    "gfortran-toolchain@7.4.0"
>    "python"
>    "openmpi"
>    "fftw"
>    "fftw-openmpi"
>    ,@(map (lambda (x) (package-name x)) %base-packages)))
>
> Simply installing openmpi (guix package -i openmpi) in my usual Guix profile 
> just works out of the box. So, there has to be some quirk where the openmpi 
> container installation is blind to some settings within the usual environment.

For the environment above, if the mpirun invocation is changed to provide the 
hostname:

mpirun --host $HOSTNAME:4 -np 4 ls

then ls is executed in four processes and the output is four times the contents 
of the current directory, as expected.

Of course, ls is not an MPI program. However, this elementary Fortran MPI 
code,

---
program testrun2
  use mpi
  implicit none
  integer :: ierr

  call mpi_init(ierr)
  call mpi_finalize(ierr)

end program testrun2
---

fails with runtime errors on any number of processes.


The compilation line was:
mpif90 test2.f90 -o testrun2

The mpirun command:
mpirun --host $HOSTNAME:4 -np 4 ./testrun2


Let me reiterate: in the normal user environment there is no need to declare 
the host and its maximum number of slots, and the runtime errors do not occur 
there.
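
(In other words, in the normal profile the plain invocation, with no --host 
option, is presumably enough, e.g.:

mpirun -np 4 ./testrun2

which matches the out-of-the-box behaviour described in my original message.)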

Could it be that the openmpi package needs a few other basic dependencies, not 
present in the package declaration, for the particular case of a single-node 
(normal PC) machine?

Also, I noted that gfortran/mpif90 ignores the "CPATH" and "LIBRARY_PATH" env 
variables. I had to specify these explicitly via -I and -L flags to the compiler.
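
For example, something along these lines (illustrative only; inside ‘guix 
environment’ the profile path is available as $GUIX_ENVIRONMENT):

mpif90 test2.f90 -o testrun2 -I"$GUIX_ENVIRONMENT/include" -L"$GUIX_ENVIRONMENT/lib"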