Re: suspected bug in timeout command

2022-02-14 Thread Roberto A. Foglietta
On Sat, 12 Feb 2022, 14:13 David Laight wrote:

> From: Michael Conrad
> > Sent: 12 February 2022 12:59
> >
> > On 2/12/22 07:38, Michael Conrad wrote:
> > > Correctly using pidfd *still* requires that you be the parent process,
> > > else the child could get reaped and replaced before the pidfd is
> > > created.  As far as I can tell, the only purpose of pidfd is for
> > > waking on poll() instead of using signals, which is orthogonal to this
> > > problem.
>
> Even if the pidfd can't be passed down, it lets you verify the process
> information and then send the signal.
>
> > > I haven't looked at the source in busybox yet, but it boggles my mind
> > > that it wouldn't just be a simple fork+alarm+waitpid because that is
> > > literally the least code implementation, and race-free.
> > >
> > > -Mike C
> > >
> > Sorry for being lazy.  I looked at the source and this is the reason:
> >
> > /* We want to create a grandchild which will watch
> >   * and kill the grandparent. Other methods:
> >   * making parent watch child disrupts parent<->child link
> >   * (example: "tcpsvd 0.0.0.0 1234 timeout service_prog" -
> >   * it's better if service_prog is a child of tcpsvd!),
> >   * making child watch parent results in programs having
> >   * unexpected children.*/
> >
> > I don't follow this reasoning.  Is "disrupts the parent<->child link"
> > just about sending signals?  If the timeout app relays all signals from
> > itself to the child, what remaining problems would exist?
>
> In that case you can pass 'verification information' through to
> the grandchild.
>
> It could be an open fd of "/proc/self" - which allows the non-racy
> kill on recent kernels.
> But other information would allow the timing window to be minimised on
> older kernels.
>
> ISTR that on older kernels an open fd to "/proc/nn" always refers to the
> current process with pid nn. But the actual behaviour is worth checking.
> I think newer kernels will fail any reads after the process has exited.
>

Checking the procfs man page, there are some mount options that prevent
reading other processes' PID files in that filesystem.

I am not 100% sure whether an inherited file descriptor could still be
read when access is otherwise denied.

Mount options
    The proc filesystem supports the following mount options:

    hidepid=n (since Linux 3.3)
        This option controls who can access the information in
        /proc/[pid] directories. The argument, n, is one of the
        following values:
___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


busybox-armv8l doesn't work on Apple M1

2022-02-14 Thread Meng-Yuan Huang
Hello.

Some people said Apple M1 doesn't support arm32:
https://news.ycombinator.com/item?id=27277351
In contrast, Raspberry Pi 4 SoC supports both arm32 and arm64.

Because of this difference, the Apple M1 can't run this busybox armv8l binary
https://busybox.net/downloads/binaries/1.31.0-defconfig-multiarch-musl/busybox-armv8l
but the Raspberry Pi 4 can.

This is the error log of executing busybox-armv8l on Ubuntu 21 arm64 on Apple M1.
"
myh@ubuntu:~$ wget https://busybox.net/downloads/binaries/1.31.0-defconfig-multiarch-musl/busybox-armv8l && chmod 755 busybox-armv8l && ./busybox-armv8l
--2022-02-15 12:46:15--  https://busybox.net/downloads/binaries/1.31.0-defconfig-multiarch-musl/busybox-armv8l
Resolving busybox.net (busybox.net)... 140.211.167.122
Connecting to busybox.net (busybox.net)|140.211.167.122|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1148524 (1.1M)
Saving to: ‘busybox-armv8l’

busybox-armv8l  100%[==>]   1.09M  1.05MB/s    in 1.0s

2022-02-15 12:46:17 (1.05 MB/s) - ‘busybox-armv8l’ saved [1148524/1148524]

bash: ./busybox-armv8l: cannot execute binary file: Exec format error
myh@ubuntu:~$ file busybox-armv8l
busybox-armv8l: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, stripped
"
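
The "ELF 32-bit" part of the file output is what matters here. As a rough illustration only (this is not busybox code, and the helper name elf_class_bits is made up for this sketch), the same check can be done by reading the ELF identification bytes: the file starts with 0x7f 'E' 'L' 'F', and the next byte is 1 for a 32-bit object and 2 for a 64-bit one.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 32 or 64 for an ELF file, -1 otherwise. */
static int elf_class_bits(const char *path)
{
    unsigned char ident[5];
    FILE *f;
    size_t n;

    assert(path != NULL);
    f = fopen(path, "rb");
    if (!f)
        return -1;
    n = fread(ident, 1, sizeof(ident), f);
    fclose(f);

    /* ELF files start with the magic bytes 0x7f 'E' 'L' 'F'. */
    if (n != sizeof(ident) || memcmp(ident, "\x7f" "ELF", 4) != 0)
        return -1;

    /* Byte 4 (EI_CLASS): 1 means 32-bit, 2 means 64-bit. */
    if (ident[4] == 1)
        return 32;
    if (ident[4] == 2)
        return 64;
    return -1;
}
```

Calling elf_class_bits("busybox-armv8l") on the binary above would be expected to return 32, matching what file reports.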

Unfortunately, some downstream apps use busybox-armv8l in their arm64 apps.
For example, the distroless container image:
https://github.com/GoogleContainerTools/distroless/blob/6028308845393e394c650ce5c1332f8182451f17/busybox_archives.bzl#L19-L24
Thus, their apps work only on CPUs supporting both arm32 and arm64, not on
arm64-only CPUs such as the Apple M1.

Could you provide an arm64 busybox binary on busybox.net to help fix this
problem?

Regards,
Meng-Yuan Huang



RE: suspected bug in timeout command

2022-02-14 Thread David Laight
Yes it does.
But it is unlikely to happen in the short time between some kind of test
that the process is the right one, and sending the signal.
But hoping to see the process ‘gone’ on a 1 second poll is pretty useless.

So if the process’s ‘start time’ is found (before any of the forks) that can
be passed through to the grandchild and checked every second.
Then you are very unlikely to kill the wrong process.
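
A minimal sketch of that start-time check, assuming Linux procfs is mounted and readable. It is not taken from busybox, and the helper names are hypothetical. It reads field 22 (starttime) of /proc/<pid>/stat, which is the time the process started after boot, in clock ticks.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: store the start time of `pid` in *start.
 * Returns 0 on success, -1 if the stat file is missing or malformed. */
static int read_start_time(pid_t pid, unsigned long long *start)
{
    char path[64], buf[1024];
    char *p;
    FILE *f;
    size_t n;

    assert(start != NULL);
    snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* Field 2 (comm) may contain spaces or ')', so search from the end. */
    p = strrchr(buf, ')');
    if (!p)
        return -1;
    p++;

    /* Skip fields 3..21; field 22 is starttime. */
    for (int field = 3; field < 22; field++) {
        p += strspn(p, " ");
        p += strcspn(p, " ");
    }
    return sscanf(p, "%llu", start) == 1 ? 0 : -1;
}

/* Record our own start time, e.g. before any of the forks. */
static int read_own_start_time(unsigned long long *start)
{
    return read_start_time(getpid(), start);
}

/* Nonzero only if `pid` exists and still has the recorded start time. */
static int is_same_process(pid_t pid, unsigned long long expected)
{
    unsigned long long now;

    return read_start_time(pid, &now) == 0 && now == expected;
}
```

The watcher would call is_same_process() right before kill(). This only narrows the window; a small gap between the check and the signal remains.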

If you also pass through the open directory fd to “/proc/self” and use openat()
to open whichever /proc/xx/yy file is needed then I think with a later kernel
you’ll get an error if the process has died (and the pid reused) and you can
also use it to send the signal – avoiding the pid reuse race.

Just a SMOP.

David


If I'm not wrong, Linux uses a simple allocation method for PIDs where it just
sets the next pid to the last pid allocated + 1, then checks whether it is free
and, if not, keeps adding until it finds a free one. A process can be created a
few minutes before it exits, and meanwhile the PIDs keep advancing and wrap
around, so that the moment it is released is the exact same moment when
next_pid equals the pid of the process that just exited.



Re: suspected bug in timeout command

2022-02-14 Thread Roberto A. Foglietta
On Mon, 14 Feb 2022 at 17:21, סאן עמר wrote:
>
> If I'm not wrong, Linux uses a simple allocation method for PIDs where it just
> sets the next pid to the last pid allocated + 1, then checks whether it is free
> and, if not, keeps adding until it finds a free one. A process can be created a
> few minutes before it exits, and meanwhile the PIDs keep advancing and wrap
> around, so that the moment it is released is the exact same moment when
> next_pid equals the pid of the process that just exited.

While timeout is sleeping for 1 second, the supervised process dies and
another process starts with the same PID; then timeout wakes up and
checks the PID, and the race condition occurs if the new process lasts
longer than the remaining timeout. How probable is the PID race
condition? It depends on these factors:

- a long timeout is used
- a lot of processes are created
- a lot of processes are running
- a small maximum PID is used

 It could happen without anything else being odd, by design. That's true.
However, it is not avoidable unless timeout establishes a parent-child
relationship with the supervised process, as shown by Rob.


 Best regards,
--
Roberto A. Foglietta
+39.349.33.30.697


Re: suspected bug in timeout command

2022-02-14 Thread סאן עמר
If I'm not wrong, Linux uses a simple allocation method for PIDs where it just
sets the next pid to the last pid allocated + 1, then checks whether it is free
and, if not, keeps adding until it finds a free one. A process can be created a
few minutes before it exits, and meanwhile the PIDs keep advancing and wrap
around, so that the moment it is released is the exact same moment when
next_pid equals the pid of the process that just exited.

On Mon, 14 Feb 2022 at 18:01, Roberto A. Foglietta <roberto.foglie...@gmail.com> wrote:

> On Sat, 12 Feb 2022 at 02:41, Raffaello D. Di Napoli wrote:
> >
> >
> > On 2/11/22 16:22, Rob Landley wrote:
> > > On 2/9/22 11:12 AM, Baruch Siach wrote:
> > >> Hi Sun,
> > >>
> > >> On Wed, Feb 09 2022, סאן עמר wrote:
> > >>> Hi, I'm using busybox for a while now (v1.29.2). and I had an issue
> > >>> with a sigterm send randomly to a process of mine. I debugged it until
> > >>> I found
> > >>> it from the timeout process which was assigned before to another
> > >>> process with the same pid. (i'm using a lot of timeouts for a lot of
> > >>> jobs)
> > >>> so i looked at the code, "timeout.c" file where it sleep for 1 second
> > >>> in each iteration then check the timeout status. I suspect at this time
> > >>> the
> > >>> process timeout monitoring is terminated, but another one with the same
> > >>> pid is already created. which creates unwanted timeout.
> > >>>
> > >>> There is a comment in there about sleep for "HUGE NUM" will probably
> > >>> result in this issue, but I can't see why it won't occur also in the
> > >>> current
> > >>> case.
> > >>>
> > >>> there is no change of this behaviour in the latest master.
> > >>> i would appreciate any help, sun.
> > >> Any reference to PID number is inherently racy.
> > > Not between parent and child.
> >
> > Except in BB’s timeout, the relationship is not parent/child :)
> >
> > Much to my surprise, I’ll say that. When I read the bug report the other
> > day, I thought to myself well, this one ought to be easy to fix. But no,
> > there’s no SIGCHLD to be handled, no relationship between processes to
> > be leveraged.
> >
> > I don’t think this bug can be fixed without a near-complete rewrite, or
> > without doing a lot of procfs digging to really validate the waited-on
> > process, since kill(pid, 0) only validates a pid, not a process.
>
> https://github.com/brgl/busybox/blob/master/miscutils/timeout.c
>
> This is the code under inspection:
>
>  grandchild:
>     /* Just sleep(HUGE_NUM); kill(parent) may kill wrong process! */
>     while (1) {
>         sleep(1);
>         if (--timeout <= 0)
>             break;
>         if (kill(parent, 0)) {
>             /* process is gone */
>             return EXIT_SUCCESS;
>         }
>     }
>     kill(parent, signo);
>     return EXIT_SUCCESS;
>
> After all, it can lead to a PID race only if the same pid is reused
> within a second, which means that 32768-N processes are created in
> less than a second, where N is the number of running processes in the system.
>
>  Best regards,
> --
> Roberto A. Foglietta
> +39.349.33.30.697
>


Re: suspected bug in timeout command

2022-02-14 Thread Roberto A. Foglietta
On Mon, 14 Feb 2022 at 17:00, Roberto A. Foglietta wrote:
>
> On Sat, 12 Feb 2022 at 02:41, Raffaello D. Di Napoli wrote:
> >
> >
> > On 2/11/22 16:22, Rob Landley wrote:
> > > On 2/9/22 11:12 AM, Baruch Siach wrote:
> > >> Hi Sun,
> > >>
> > >> On Wed, Feb 09 2022, סאן עמר wrote:
> > >>> Hi, I'm using busybox for a while now (v1.29.2). and I had an issue 
> > >>> with a sigterm send randomly to a process of mine. I debugged it until 
> > >>> I found
> > >>> it from the timeout process which was assigned before to another 
> > >>> process with the same pid. (i'm using a lot of timeouts for a lot of 
> > >>> jobs)
> > >>> so i looked at the code, "timeout.c" file where it sleep for 1 second 
> > >>> in each iteration then check the timeout status. I suspect at this time 
> > >>> the
> > >>> process timeout monitoring is terminated, but another one with the same 
> > >>> pid is already created. which creates unwanted timeout.
> > >>>
> > >>> There is a comment in there about sleep for "HUGE NUM" will probably 
> > >>> result in this issue, but I can't see why it won't occur also in the 
> > >>> current
> > >>> case.
> > >>>
> > >>> there is no change of this behaviour in the latest master.
> > >>> i would appreciate any help, sun.
> > >> Any reference to PID number is inherently racy.
> > > Not between parent and child.
> >
> > Except in BB’s timeout, the relationship is not parent/child :)
> >
> > Much to my surprise, I’ll say that. When I read the bug report the other
> > day, I thought to myself well, this one ought to be easy to fix. But no,
> > there’s no SIGCHLD to be handled, no relationship between processes to
> > be leveraged.
> >
> > I don’t think this bug can be fixed without a near-complete rewrite, or
> > without doing a lot of procfs digging to really validate the waited-on
> > process, since kill(pid, 0) only validates a pid, not a process.
>
> https://github.com/brgl/busybox/blob/master/miscutils/timeout.c
>
> This is the code under inspection:
>
>  grandchild:
>     /* Just sleep(HUGE_NUM); kill(parent) may kill wrong process! */
>     while (1) {
>         sleep(1);
>         if (--timeout <= 0)
>             break;
>         if (kill(parent, 0)) {
>             /* process is gone */
>             return EXIT_SUCCESS;
>         }
>     }
>     kill(parent, signo);
>     return EXIT_SUCCESS;
>
> After all, it can lead to a PID race only if the same pid is reused
> within a second, which means that 32768-N processes are created in
> less than a second, where N is the number of running processes in the system.

 The maximum number of PIDs can be raised to 2^22 = 4,194,304 on a
64-bit platform. This does not resolve the issue, but it makes it
far less probable.
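
 As a rough sketch of the arithmetic (helper names are hypothetical, and the path is passed in by the caller; on Linux the current limit is normally readable from /proc/sys/kernel/pid_max), the "pid_max - N" estimate from my previous message can be computed like this:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read a pid_max style value from `path`.
 * Returns the value, or -1 if the file cannot be read or parsed. */
static long read_pid_max(const char *path)
{
    long value = -1;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* The estimate used in this thread: with `running` pids in use, about
 * pid_max - running new processes must be created before the
 * allocation counter has gone once around the whole pid space. */
static long pids_before_wrap(long pid_max, long running)
{
    return pid_max > running ? pid_max - running : 0;
}
```

 With 200 running processes this gives 32568 for the default limit of 32768 and 4194104 for 2^22, which is the difference in probability I mean.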

 However, if this bug shows up, it probably means that the system has
a lot of processes running and a lot of processes being created and
destroyed relative to the maximum PID available. Thus, the system might
be incorrectly configured for its typical usage, which is probably
the main reason why nobody complained before.

 https://stackoverflow.com/questions/6294133/maximum-pid-in-linux

 Best regards,
-- 
Roberto A. Foglietta
+39.349.33.30.697


Re: suspected bug in timeout command

2022-02-14 Thread Roberto A. Foglietta
On Sat, 12 Feb 2022 at 02:41, Raffaello D. Di Napoli wrote:
>
>
> On 2/11/22 16:22, Rob Landley wrote:
> > On 2/9/22 11:12 AM, Baruch Siach wrote:
> >> Hi Sun,
> >>
> >> On Wed, Feb 09 2022, סאן עמר wrote:
> >>> Hi, I'm using busybox for a while now (v1.29.2). and I had an issue with 
> >>> a sigterm send randomly to a process of mine. I debugged it until I found
> >>> it from the timeout process which was assigned before to another process 
> >>> with the same pid. (i'm using a lot of timeouts for a lot of jobs)
> >>> so i looked at the code, "timeout.c" file where it sleep for 1 second in 
> >>> each iteration then check the timeout status. I suspect at this time the
> >>> process timeout monitoring is terminated, but another one with the same 
> >>> pid is already created. which creates unwanted timeout.
> >>>
> >>> There is a comment in there about sleep for "HUGE NUM" will probably 
> >>> result in this issue, but I can't see why it won't occur also in the 
> >>> current
> >>> case.
> >>>
> >>> there is no change of this behaviour in the latest master.
> >>> i would appreciate any help, sun.
> >> Any reference to PID number is inherently racy.
> > Not between parent and child.
>
> Except in BB’s timeout, the relationship is not parent/child :)
>
> Much to my surprise, I’ll say that. When I read the bug report the other
> day, I thought to myself well, this one ought to be easy to fix. But no,
> there’s no SIGCHLD to be handled, no relationship between processes to
> be leveraged.
>
> I don’t think this bug can be fixed without a near-complete rewrite, or
> without doing a lot of procfs digging to really validate the waited-on
> process, since kill(pid, 0) only validates a pid, not a process.

https://github.com/brgl/busybox/blob/master/miscutils/timeout.c

This is the code under inspection:

 grandchild:
    /* Just sleep(HUGE_NUM); kill(parent) may kill wrong process! */
    while (1) {
        sleep(1);
        if (--timeout <= 0)
            break;
        if (kill(parent, 0)) {
            /* process is gone */
            return EXIT_SUCCESS;
        }
    }
    kill(parent, signo);
    return EXIT_SUCCESS;

After all, it can lead to a PID race only if the same pid is reused
within a second, which means that 32768-N processes are created in
less than a second, where N is the number of running processes in the system.

 Best regards,
-- 
Roberto A. Foglietta
+39.349.33.30.697


CONFIG_FEATURE_UNIX_LOCAL

2022-02-14 Thread Spare Project
I am using busybox and dropbear, trying to blag unix socket forwarding
without installing more software.

nc -l local:/tmp/socket doesn’t create any socket.

httpd -p local:/tmp/testing, on the other hand, does create a socket.

Is this just a bug in busybox, or am I doing something wrong? I will try to
have a look at the code itself, but I’m not even close to competent enough
to think I’ll get anywhere.


Re: suspected bug in timeout command

2022-02-14 Thread Laurent Bercot

/* We want to create a grandchild which will watch
 * and kill the grandparent. Other methods:
 * making parent watch child disrupts parent<->child link
 * (example: "tcpsvd 0.0.0.0 1234 timeout service_prog" -
 * it's better if service_prog is a child of tcpsvd!),
 * making child watch parent results in programs having
 * unexpected children.*/

I don't follow this reasoning.  Is "disrupts the parent<->child link" just
about sending signals?  If the timeout app relays all signals from itself to
the child, what remaining problems would exist?


 Yeah, that's clearly a misdesign.
 Keeping the same pid for the end of a Bernstein chain whenever possible
is a good idea, so the intention is good - but it's only possible when
the child is *entirely transparent* wrt control flow. It's a good
model, for instance, for ssh-agent, or for a data processor such as a
TLS tunnel (that's how s6-tlsd operates); but here, since the child
actually impacts control (it sends a signal to the parent on timeout)
it's just not applicable - and this thread illustrates exactly why.

 To really fix the bug, timeout should be rewritten, and run as the
parent, and possibly forward signals to the child.
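
 A minimal sketch of that parent-side shape, under stated assumptions: this is not busybox code and not a proposed patch, the name run_with_timeout is made up, signal forwarding is left out, and sigtimedwait() is used here where the thread mentioned alarm(). SIGCHLD is assumed not to be set to SIG_IGN in the caller. The point is only that kill() is issued while the child is still unreaped, so its pid cannot have been reused.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: run argv as a direct child and send it `signo`
 * if it is still running after `seconds`.
 * Returns the waitpid() status, or -1 if fork() or waitpid() failed. */
static int run_with_timeout(char *const argv[], unsigned seconds, int signo)
{
    sigset_t chld, old;
    struct timespec limit = { .tv_sec = (time_t)seconds, .tv_nsec = 0 };
    struct timespec zero = { 0, 0 };
    int status, r;
    pid_t pid, w;

    assert(argv != NULL && argv[0] != NULL);

    /* Block SIGCHLD so it stays pending until sigtimedwait() takes it. */
    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);
    sigprocmask(SIG_BLOCK, &chld, &old);

    pid = fork();
    if (pid == 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        execvp(argv[0], argv);
        _exit(127);                     /* exec failed */
    }
    if (pid < 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    /* Wait for the child to change state, or for the time limit. */
    do {
        r = sigtimedwait(&chld, NULL, &limit);
    } while (r < 0 && errno == EINTR);

    if (r < 0 && errno == EAGAIN) {
        /* Timed out. The child has not been reaped yet, so its pid
         * cannot have been handed to another process. */
        kill(pid, signo);
    }

    do {
        w = waitpid(pid, &status, 0);
    } while (w < 0 && errno == EINTR);

    sigtimedwait(&chld, NULL, &zero);   /* drop a leftover SIGCHLD, if any */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return w == pid ? status : -1;
}
```

 The cost is the one the original comment worried about: the program becomes a child of timeout rather than of timeout's caller, which is why forwarding signals to the child would be needed in a real implementation. A child that is stopped rather than terminated would also need extra handling that this sketch omits.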

--
 Laurent
