Hi,

[email protected] writes:

> I tested guix-publish and that had no issues.

You mean the first ‘wget -O …’ passes?

> Some checks I did yesterday with guix-daemon:
> - Shepherd is passing a blocking socket
> - The "fdSocket" in "acceptConnection" is always blocking.
> - the "remote" socket in "acceptConnection" is O_NONBLOCK on the first 
> connection only.

Looking at ‘accept4.c’ in libc, the only way ‘remote’ can be O_NONBLOCK
is if:

  1. ‘accept4’ is passed SOCK_NONBLOCK, but that’s not the case here
     (see ‘accept.c’);

  2. ‘__socket_accept’ returns an O_NONBLOCK socket, which would be a bug
     in the server, pflocal.

At first sight ‘S_io_set_all_openmodes’ in pflocal does the job and
‘S_socket_accept’ honors those flags.

> Adding the same check as for the fd 3 socket for O_NONBLOCK to the
> "connection" socket after accept to tests/systemd.sh passes on Linux
> but causes a failure on the Hurd.

So we have a reproducer.

Could you pass it on to bug-hurd? :-)  It may be easier if the whole
thing is in C.
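
Something along these lines might serve as a starting point; it is only
a sketch and I have not run it on the Hurd.  The function name
‘count_nonblocking_accepts’ and the socket path are made up for
illustration.  It also assumes the problem shows up with a plain
AF_UNIX listener created in the same process, which may not hold if the
issue depends on the socket being inherited from shepherd.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: listen on PATH with a blocking AF_UNIX socket,
   accept CONNECTIONS connections made by a forked child, and return how
   many of the accepted sockets have O_NONBLOCK set (-1 on error).  */
int
count_nonblocking_accepts (const char *path, int connections)
{
  struct sockaddr_un addr;
  int listener, count = 0;
  pid_t child;

  if (strlen (path) >= sizeof addr.sun_path)
    return -1;
  memset (&addr, 0, sizeof addr);
  addr.sun_family = AF_UNIX;
  strcpy (addr.sun_path, path);
  unlink (path);

  /* No SOCK_NONBLOCK here: the listening socket is blocking.  */
  listener = socket (AF_UNIX, SOCK_STREAM, 0);
  if (listener < 0)
    return -1;
  if (bind (listener, (struct sockaddr *) &addr, sizeof addr) < 0
      || listen (listener, 1) < 0)
    {
      close (listener);
      unlink (path);
      return -1;
    }

  child = fork ();
  if (child < 0)
    {
      close (listener);
      unlink (path);
      return -1;
    }
  if (child == 0)
    {
      /* Client: connect, then wait until the server closes its end.  */
      close (listener);
      for (int i = 0; i < connections; i++)
        {
          char c;
          int fd = socket (AF_UNIX, SOCK_STREAM, 0);
          if (fd < 0
              || connect (fd, (struct sockaddr *) &addr, sizeof addr) < 0)
            _exit (1);
          while (read (fd, &c, 1) > 0)
            ;
          close (fd);
        }
      _exit (0);
    }

  /* Server: plain accept, then inspect the flags of each new socket.  */
  for (int i = 0; i < connections; i++)
    {
      int remote = accept (listener, NULL, NULL);
      int flags = remote < 0 ? -1 : fcntl (remote, F_GETFL);
      if (remote >= 0)
        close (remote);
      if (flags < 0)
        {
          count = -1;
          break;
        }
      printf ("connection %d: O_NONBLOCK %s\n", i + 1,
              (flags & O_NONBLOCK) ? "set" : "clear");
      if (flags & O_NONBLOCK)
        count++;
    }

  close (listener);
  unlink (path);
  waitpid (child, NULL, 0);
  return count;
}
```

Calling it from a small ‘main’ with, say, two connections should report
zero on GNU/Linux; a non-zero result on the Hurd would point at
pflocal rather than shepherd.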

> I am unsure what to do about this because shepherd seems to do
> everything correctly. I saw that ci.g.g.o has started to build
> i586-gnu substitutes (in particular gcc-final) but if you are
> restarting the builders more aggressively now then each first build
> will fail because of this and idk if cuirass can reschedule builds on
> such failures.

Yeah, it’s not great.  Those will have to be restarted manually I’m
afraid, but most of the time anybody can click on the “Restart” button
in Cuirass.

> Maybe the easiest is to expose the #:lazy-start? option for now and
> disable it for guix-daemon in %base-services/hurd ?

Hmm maybe.  Let’s first figure out if this is a Hurd bug.

Thanks for investigating!

Ludo’.


