Re: suspected bug in timeout command

Roberto A. Foglietta Mon, 14 Feb 2022 08:10:30 -0800

On Mon, 14 Feb 2022 at 17:00, Roberto A. Foglietta
<roberto.foglie...@gmail.com> wrote:
>
> On Sat, 12 Feb 2022 at 02:41, Raffaello D. Di Napoli
> <raf...@dinapo.li> wrote:
> >
> >
> > On 2/11/22 16:22, Rob Landley wrote:
> > > On 2/9/22 11:12 AM, Baruch Siach wrote:
> > >> Hi Sun,
> > >>
> > >> On Wed, Feb 09 2022, סאן עמר wrote:
> > >>> Hi, I've been using busybox for a while now (v1.29.2), and I had an
> > >>> issue with a SIGTERM sent randomly to a process of mine. I debugged it
> > >>> until I traced it to a timeout process that had earlier been assigned
> > >>> to another process with the same pid. (I'm using a lot of timeouts
> > >>> for a lot of jobs.)
> > >>> So I looked at the code in "timeout.c", where it sleeps for 1 second
> > >>> in each iteration and then checks the timeout status. I suspect that
> > >>> during this interval the process being monitored terminates and
> > >>> another one with the same pid is created, which results in an
> > >>> unwanted timeout.
> > >>>
> > >>> There is a comment in there saying that a sleep for "HUGE NUM" will
> > >>> probably result in this issue, but I can't see why it wouldn't also
> > >>> occur in the current case.
> > >>>
> > >>> There is no change in this behaviour in the latest master.
> > >>> I would appreciate any help, Sun.
> > >> Any reference to PID number is inherently racy.
> > > Not between parent and child.
> >
> > Except in BB’s timeout, the relationship is not parent/child :)
> >
> > Much to my surprise, I’ll say that. When I read the bug report the other
> > day, I thought to myself well, this one ought to be easy to fix. But no,
> > there’s no SIGCHLD to be handled, no relationship between processes to
> > be leveraged.
> >
> > I don’t think this bug can be fixed without a near-complete rewrite, or
> > without doing a lot of procfs digging to really validate the waited-on
> > process, since kill(pid, 0) only validates a pid, not a process.
>
> https://github.com/brgl/busybox/blob/master/miscutils/timeout.c
>
> This is the code under inspection:
>
>  grandchild:
>     /* Just sleep(HUGE_NUM); kill(parent) may kill wrong process! */
>     while (1) {
>         sleep(1);
>         if (--timeout <= 0)
>             break;
>         if (kill(parent, 0)) {
>             /* process is gone */
>             return EXIT_SUCCESS;
>         }
>     }
>     kill(parent, signo);
>     return EXIT_SUCCESS;
>
> After all, this can lead to a PID race only if the same pid is
> reused within a second, which means that 32768-N processes are created
> in less than a second, where N is the number of processes running in
> the system.
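
 For comparison, here is a rough sketch of the parent/child arrangement
mentioned earlier in the thread. It is not the BusyBox code and not a
proposed patch: run_with_timeout() and on_sigchld() are hypothetical
names made up for illustration. The idea is that the command runs as a
direct child, so its pid cannot be recycled until the parent reaps it
with waitpid(), and kill() can only ever reach that child.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* No-op handler (hypothetical): with a handler installed, a blocked
 * SIGCHLD is guaranteed to stay pending instead of being discarded. */
static void on_sigchld(int sig) { (void)sig; }

/*
 * Hypothetical helper: run argv[] as a direct child and send it `signo`
 * if it is still running after `seconds`. Returns the wait status, or -1.
 */
int run_with_timeout(char *const argv[], int seconds, int signo)
{
    struct sigaction sa, old_sa;
    sigset_t chld, old_mask;
    struct timespec ts = { .tv_sec = seconds, .tv_nsec = 0 };
    int status = -1, r;
    pid_t pid;

    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGCHLD, &sa, &old_sa);

    /* Block SIGCHLD before fork() so an early exit cannot be missed. */
    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);
    sigprocmask(SIG_BLOCK, &chld, &old_mask);

    pid = fork();
    if (pid == 0) {
        sigprocmask(SIG_SETMASK, &old_mask, NULL);
        execvp(argv[0], argv);
        _exit(127);             /* exec failed */
    }
    if (pid > 0) {
        /* Sleep until the child exits or the timeout expires.
         * Simplification: an EINTR restarts the full timeout. */
        do {
            r = sigtimedwait(&chld, NULL, &ts);
        } while (r < 0 && errno == EINTR);

        /* Timed out: the child is not reaped yet, so pid is still ours. */
        if (r < 0 && errno == EAGAIN)
            kill(pid, signo);

        while (waitpid(pid, &status, 0) < 0) {
            if (errno != EINTR) {
                status = -1;
                break;
            }
        }
    }
    sigaction(SIGCHLD, &old_sa, NULL);
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return status;
}
```

 A caller would pass a NULL-terminated argv, for example { "sleep",
"30", NULL } with seconds = 1 and signo = SIGTERM, and inspect the
result with WIFEXITED()/WIFSIGNALED(). The sketch assumes the caller has
no other children, since any SIGCHLD ends the wait.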


 The maximum number of pids can be raised considerably, up to
2^22 = 4,194,304 on a 64-bit platform (kernel.pid_max). This does not
resolve the issue, but it makes it far less probable.
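
 To put numbers on this, the limit can be read from
/proc/sys/kernel/pid_max on Linux. The snippet below is only an
illustration: read_pid_max() and forks_to_wrap() are hypothetical
helpers, and the "pid_max minus running processes" figure is the same
rough estimate used above, not an exact model of the kernel allocator.

```c
#include <stdio.h>

/* Hypothetical helper: parse the limit from `path`
 * (normally /proc/sys/kernel/pid_max). Returns -1 on failure. */
long read_pid_max(const char *path)
{
    FILE *f = fopen(path, "r");
    long value = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Rough estimate of how many new processes must be created before a
 * given pid can come around again: pid_max - N, with N running. */
long forks_to_wrap(long pid_max, long running)
{
    return pid_max - running;
}
```

 With the default limit of 32768 and, say, 500 running processes, the
estimate is 32268 forks; with the limit at 4194304 it is 4193804, all of
which would still have to happen within the one-second window.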

 However, if this bug shows up, it probably means that the system has
a lot of processes running, and a lot of processes being created and
destroyed, relative to the maximum PID available. Such a system might be
incorrectly configured for its typical usage, which is probably the
main reason why nobody has complained before.

 https://stackoverflow.com/questions/6294133/maximum-pid-in-linux

 Best regards,
-- 
Roberto A. Foglietta
+39.349.33.30.697
