On Fri, Sep 06, 2024 at 10:19:56AM +0200, Martin Pieuchot wrote:
> On 26/07/24(Fri) 08:36, Claudio Jeker wrote:
> > On Thu, Jul 25, 2024 at 08:20:32PM +0200, Martin Pieuchot wrote:
> > > On 25/07/24(Thu) 17:33, Claudio Jeker wrote:
> > > > On Thu, Jul 25, 2024 at 05:15:32PM +0200, Martin Pieuchot wrote:
> > > > > On 25/07/24(Thu) 14:51, Claudio Jeker wrote:
> > > > > > On Thu, Jul 25, 2024 at 11:09:44AM +0200, Martin Pieuchot wrote:
> > > > > > [...]
> > > > > > > > Index: kern/kern_synch.c
> > > > > > > > ===================================================================
> > > > > > > > RCS file: /cvs/src/sys/kern/kern_synch.c,v
> > > > > > > > diff -u -p -r1.206 kern_synch.c
> > > > > > > > --- kern/kern_synch.c 23 Jul 2024 08:38:02 -0000 1.206
> > > > > > > > +++ kern/kern_synch.c 24 Jul 2024 14:14:06 -0000
> > > > > > > > @@ -399,15 +399,18 @@ sleep_finish(int timo, int do_sleep)
> > > > > > > > */
> > > > > > > > if (p->p_wchan == NULL)
> > > > > > > > do_sleep = 0;
> > > > > > > > + KASSERT((p->p_flag & P_SINGLESLEEP) == 0);
> > > > > > > > atomic_clearbits_int(&p->p_flag, P_WSLEEP);
> > > > > > > >
> > > > > > > > + /* If requested to stop always force a stop even if
> > > > > > > > do_sleep == 0 */
> > > > > > > > + if (p->p_stat == SSTOP)
> > > > > > > > + do_sleep = 1;
> > > > > > >
> > > > > > > This is also scary. The problem with the current scheme is that
> > > > > > > we don't
> > > > > > > know who changed `p_stat' and if we already did our context
> > > > > > > switch or
> > > > > > > not.
> > > > > >
> > > > > > It is scary indeed. I would prefer if sleep_signal_check() would not
> > > > > > randomly mi_switch() away behind our back. I first tried that but
> > > > > > it is
> > > > > > harder then you think.
> > > > >
> > > > > So maybe let's start by adding:
> > > > >
> > > > > KASSERT(!(p->p_stat == SSTOP && do_sleep == 0))
> > > > >
> > > > > And see if something blows.
> > > >
> > > > I can assure you single_thread_set() is able to put p->p_stat to SSTOP
> > > > between these lines:
> > > > if ((error = sleep_signal_check(p)) != 0) {
> > > > catch = 0;
> > > > do_sleep = 0;
> > > > }
> > > > }
> > > >
> > > > SCHED_LOCK();
> > > >
> > > > So it can happen but the window is reasonably small (mainly the call to
> > > > cursig() and some minimal other fluff) that the KASSERT will probably
> > > > never hit.
> > >
> > > This whole discussion makes me believe that SINGLE_SUSPEND is the
> > > incorrect solution for this and should die.
> > >
> > > Instead of trying to change the state of siblings in single_thread_set()
> > > and context switching in single_thread_check() all threads should stop
> > > inside cursig().
> > >
> > > I really appreciate all the efforts you've put into debugging this.
> > > However I cannot believe that adding more hacks and checking for p_stat
> > > is the way to go.
> >
> > It is not a hack. There will always be something like this unless you want
> > to revert back to a giant recursive scheduler lock that can block the full
> > sleep machinery from start to finish and accept the fact that OpenBSD will
> > never scale to more than 8CPUs.
> >
> > You have to accept that putting a thread to sleep is dirty business and
> > that sleep finish must check that the transaction was actually successful.
> >
> > I tried various approaches to fix this problem. All of them where
> > horrible. I spent a week on this bug and I would prefer to spend my time
> > on actually working on fixing SIGSTOP, dowait6() and finally replace the
> > sched_lock in sleep. All of this will happen in and around the same place
> > and maybe once we're done my "hack" is gone as well.
> >
> > I will cycle back into this mess at some point but I will not rewrite the
> > ptrace interaction now. I'm happy to drop this and just accept the fact
> > that debugging multithreaded applications on OpenBSD is simply broken.
>
> I believe you should commit this before release. Having functional MT
> debugging is more than valuable. So ok mpi@ for the diff.
>
> That said, I'm still concerned about the growing number of per-thread
> flags used by the sleep_setup/sleep_finish() machinery to work around
> under-defined SSLEEP & SSTOP transitions:
>
> - P_WSLEEP to work around setting SSLEEP to early
> - P_SINTR to indicate that SSLEEP isn't enough
> - P_SUSPSINGLE to indicate that a thread is parked and not SSTOP
> - P_SUSPSIG to indicate which thread handled the STOP signal
> - P_TIMEOUT to indicate the sleep has a timer
> - P_SINGLESLEEP to work around a race single_thread & signal APIs.
>
> I believe we can and should do better in the future.
>
Lets split this up into small pieces.
First lets add the P_TRACESINGLE bits.
--
:wq Claudio
Index: kern/kern_sig.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_sig.c,v
diff -u -p -r1.338 kern_sig.c
--- kern/kern_sig.c 10 Aug 2024 09:18:09 -0000 1.338
+++ kern/kern_sig.c 7 Sep 2024 12:35:49 -0000
@@ -840,7 +840,9 @@ trapsignal(struct proc *p, int signum, u
SCHED_UNLOCK();
signum = pr->ps_xsig;
- single_thread_clear(p, 0);
+ if ((p->p_flag & P_TRACESINGLE) == 0)
+ single_thread_clear(p, 0);
+ atomic_clearbits_int(&p->p_flag, P_TRACESINGLE);
/*
* If we are no longer being traced, or the parent
@@ -1359,7 +1361,9 @@ cursig(struct proc *p, struct sigctx *sc
atomic_clearbits_int(&pr->ps_siglist, mask);
}
- single_thread_clear(p, 0);
+ if ((p->p_flag & P_TRACESINGLE) == 0)
+ single_thread_clear(p, 0);
+ atomic_clearbits_int(&p->p_flag, P_TRACESINGLE);
/*
* If we are no longer being traced, or the parent
Index: kern/sys_process.c
===================================================================
RCS file: /cvs/src/sys/kern/sys_process.c,v
diff -u -p -r1.98 sys_process.c
--- kern/sys_process.c 3 Jun 2024 12:48:25 -0000 1.98
+++ kern/sys_process.c 7 Sep 2024 12:35:49 -0000
@@ -441,6 +441,13 @@ ptrace_ctrl(struct proc *p, int req, pid
if (pid < THREAD_PID_OFFSET && tr->ps_single)
t = tr->ps_single;
+ else if (t == tr->ps_single)
+ atomic_setbits_int(&t->p_flag, P_TRACESINGLE);
+ else {
+ error = EINVAL;
+ goto fail;
+ }
+
/* If the address parameter is not (int *)1, set the pc. */
if ((int *)addr != (int *)1)
Index: sys/proc.h
===================================================================
RCS file: /cvs/src/sys/sys/proc.h,v
diff -u -p -r1.371 proc.h
--- sys/proc.h 1 Sep 2024 03:09:00 -0000 1.371
+++ sys/proc.h 7 Sep 2024 12:35:49 -0000
@@ -436,6 +436,7 @@ struct proc {
#define P_SINTR 0x00000080 /* Sleep is interruptible. */
#define P_SYSTEM 0x00000200 /* No sigs, stats or swapping.
*/
#define P_TIMEOUT 0x00000400 /* Timing out during sleep. */
+#define P_TRACESINGLE 0x00001000 /* ptrace: keep single
threaded. */
#define P_WEXIT 0x00002000 /* Working on exiting. */
#define P_OWEUPC 0x00008000 /* Owe proc an addupc() at next
ast. */
#define P_SUSPSINGLE 0x00080000 /* Need to stop for single
threading. */
@@ -446,8 +447,8 @@ struct proc {
#define P_BITS \
("\20" "\01INKTR" "\02PROFPEND" "\03ALRMPEND" "\04SIGSUSPEND" \
"\05CANTSLEEP" "\06WSLEEP" "\010SINTR" "\012SYSTEM" "\013TIMEOUT" \
- "\016WEXIT" "\020OWEUPC" "\024SUSPSINGLE" "\033THREAD" \
- "\034SUSPSIG" "\037CPUPEG")
+ "\015TRACESINGLE" "\016WEXIT" "\020OWEUPC" "\024SUSPSINGLE" \
+ "\033THREAD" "\034SUSPSIG" "\037CPUPEG")
#define THREAD_PID_OFFSET 100000