On Sat, 2026-01-24 at 17:33 +0100, Santiago Vila wrote:
> > > Can you tell if this parallel mode is the default, and if not, what
> > > other reasons for a race condition can be?
> > 
> > I am not aware of any "parallel mode" here. The postinst calls dracut
> > just with >&2 to redirect stdout to stderr.
> 
> I refer to this:
> 
>     if [[ $parallel != "yes" ]]; then
>         for i in *; do
>             [[ -f $i/modules.dep ]] || [[ -f $i/modules.dep.bin ]] || continue
>             "$dracut_cmd" --kver="$i" "${dracut_args[@]}"
>             _rc=$?
>             if [[ $_rc -gt 0 ]]; then
>                 printf "%s\n" "dracut[F]: image generation failed for kernel '$i'." >&2
>                 ((ret += _rc))
>             fi
>         done
>     else
>         for i in *; do
>             [[ -f $i/modules.dep ]] || [[ -f $i/modules.dep.bin ]] || continue
> -->         "$dracut_cmd" --kver="$i" "${dracut_args[@]}" &
>         done
> 
> This "&" at the end is what drew my attention and reminded me of
> the funny bug in "sumo" which I pointed out before.
> > 

Oh, that code path is only taken when dracut is called with
--regenerate-all (plus --parallel). That is not what happens in our case.

> > > I would be willing to build linux-signed-amd64 a lot of times using a
> > > modified dracut package with whatever debug changes we could add to
> > > see what's going on (even if they are simple "echo foo"), so I'm
> > > open for suggestions about those potential changes.
> > 
> > Thanks. Could you try to run a test with this patch applied to dracut:
> > https://github.com/dracut-ng/dracut-ng/pull/2109/changes
> > 
> > Plus add an "exit 1" after those 3 dinfo/dwarning calls to let dracut
> > fail, because I expect dracut to take the happy path.
> 
> This first modified version fails a lot less, maybe because adding those
> additional lines makes the race condition less likely.
> 
> But I can't see anything special in the logs (are they redirected somewhere?)
> 

Does this really fail? help_output=$(3cpio --help) will always read the
full output, so 3cpio should not fail with a broken pipe. I would
therefore expect this patch to work around the problem as a side effect.
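
A minimal sketch of the difference, using a hypothetical print_help
function as a stand-in for "3cpio --help" and assuming the unpatched
check pipes the help text straight into a grep that exits on the first
match:

```shell
#!/bin/bash
# print_help is a hypothetical stand-in for "3cpio --help". It writes more
# than a typical pipe buffer holds and stops with an error if a write fails.
print_help() {
    local i
    for ((i = 0; i < 5000; i++)); do
        echo "--option-$i  some description" 2>/dev/null || return 1
    done
}

# Variant A: grep -q exits on the first match and closes the pipe while the
# writer is still producing output. The writer is then either killed by
# SIGPIPE or sees a failed write, depending on how SIGPIPE is handled.
print_help | grep -q -e "--option-0 "
pipe_status=("${PIPESTATUS[@]}")

# Variant B: the command substitution reads everything the writer prints,
# so the writer never sees a closed pipe. The search happens afterwards.
help_output=$(print_help)
capture_status=$?
[[ $help_output == *"--option-0 "* ]]
search_status=$?

echo "pipe:    writer=${pipe_status[0]} grep=${pipe_status[1]}"
echo "capture: writer=$capture_status search=$search_status"
```

In variant A the writer status is typically non-zero (for example 141
when the writing shell is killed by SIGPIPE), while variant B should
report 0 for both the writer and the search.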

> > Can you test with just[*] the stderr redirection to /dev/null removed?
> 
> This second modified version fails again quite often, but at least
> we can see the error from 3cpio:
> 
> [...]
> I: /vmlinuz is now a symlink to boot/vmlinuz-6.18.5+deb14-cloud-amd64
> I: /initrd.img is now a symlink to boot/initrd.img-6.18.5+deb14-cloud-amd64
> /etc/kernel/postinst.d/dracut:
> dracut: Generating /boot/initrd.img-6.18.5+deb14-cloud-amd64
> 
> thread 'main' panicked at library/std/src/io/stdio.rs:1165:9:
> failed printing to stdout: Broken pipe (os error 32)
> note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
> /usr/bin/dracut: line 2772: cpio: command not found
> dracut[F]: Creation of /boot/initrd.img-6.18.5+deb14-cloud-amd64 failed
> [...]
> 
> 
> [*] Note that after cloning from salsa I still have to revert the
> change made by Bastian so that it depends again on 3cpio|cpio. The
> first time I forgot about such little detail and was suspiciously
> surprised that it did not fail at all...

Thanks. That confirms my suspicion. grep terminates and 3cpio cannot
write to the pipe any more. Instead of happily quitting, it throws an
error message and exits with code 1. I'll fix 3cpio to handle the broken
pipe case better.
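
What "better" could look like, sketched in shell rather than in 3cpio's
own code: the writer treats a closed pipe as a normal end of output.
print_help_tolerant is a hypothetical stand-in, and this is only an
illustration of the idea, not the actual 3cpio change:

```shell
#!/bin/bash
# Hypothetical writer that treats a vanished reader as a normal end of output.
print_help_tolerant() {
    local i
    for ((i = 0; i < 5000; i++)); do
        # A failed write means the reader is gone: stop quietly with success.
        echo "--option-$i  some description" 2>/dev/null || return 0
    done
}

# Ignore SIGPIPE in the writer's subshell so that a closed pipe shows up as
# a failed write (EPIPE) instead of killing the process.
(trap '' PIPE; print_help_tolerant) | grep -q -e "--option-0 "
statuses=("${PIPESTATUS[@]}")

echo "writer=${statuses[0]} grep=${statuses[1]}"
```

With this behaviour the caller sees a zero exit status from the writer
even though grep stopped reading early.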

-- 
Benjamin Drung
Debian & Ubuntu Developer
