> > i can reproduce it with this:
> > 
> > http://cm.bell-labs.com/sources/contrib/cinap_lenrek/traptest/
> > 
> > 8c test.c
> > 8a int80.s
> > 8l test.8 int80.8
> > ./8.out
> > 
> > 8.out 12490667: suicide: sys: trap: general protection violation pc=0x00001333
> 
> okay.  it seems pretty clear from the code that you're dead meat
> if you receive a note while you're in the note handler.  that is,
> up->notified = 1. 

No! Notes are buffered in the up->note[] array. If you are in the note handler,
another process *can* send you further (NUser) notes without doing any harm.

If we are in the note handler (up->notified == 1) and notify() gets called,
it does nothing and returns 0. See:

/sys/src/9/pc/trap.c: notify()
...
        if(n->flag!=NUser && (up->notified || up->notify==0)){
                if(n->flag == NDebug)
                        pprint("suicide: %s\n", n->msg);
                qunlock(&up->debug);
                pexit(n->msg, n->flag!=NDebug);
        }

        if(up->notified){
                qunlock(&up->debug);
                splhi();
                return 0;
        }
...

The problem arises when we get an NDebug note *after* an NUser note. After
notify() has popped the first NUser note and put the process into the user
handler, the NDebug note is next in line (up->note[0]). Any (indirect) call
to notify() will then kill us, because it thinks that we caused some
trap/fatal event (up->note[0].flag != NUser) while handling the last note
(up->notified == 1). But this was *not* the case here! We just trapped after
some other process put a note in our queue.

The notify() code for detecting a trap in the note handler is fine, I think.
What's wrong is that the trap got queued after the NUser note.
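
To make the sequence concrete, here is a small userspace model of it. This
is only a sketch, not kernel code: Proc and Note are cut down to the fields
that matter, the NNOTE and ERRMAX values are stand-ins, and model_postnote(),
model_notify() and the Idle/Delivered/Suicide result codes are names made up
for this example. It assumes a note handler is registered (notify != 0).

```c
#include <assert.h>
#include <string.h>

enum { NNOTE = 5, ERRMAX = 128 };   /* stand-in sizes for this model */
enum { NUser, NExit, NDebug };      /* note flags */
enum { Idle, Delivered, Suicide };  /* hypothetical result codes */

typedef struct Note {
	char msg[ERRMAX];
	int flag;
} Note;

typedef struct Proc {
	Note note[NNOTE];
	int nnote;
	int notified;   /* 1 while the user note handler runs */
	int notify;     /* nonzero if a handler is registered */
} Proc;

/* models the unpatched postnote(): new notes go to the tail */
int
model_postnote(Proc *p, const char *msg, int flag)
{
	assert(p->nnote >= 0 && p->nnote <= NNOTE);
	if(flag != NUser && (p->notify == 0 || p->notified))
		p->nnote = 0;
	if(p->nnote >= NNOTE)
		return 0;
	strncpy(p->note[p->nnote].msg, msg, ERRMAX-1);
	p->note[p->nnote].msg[ERRMAX-1] = '\0';
	p->note[p->nnote++].flag = flag;
	return 1;
}

/* models the two checks quoted from notify(), then pops note[0] */
int
model_notify(Proc *p)
{
	Note *n;

	if(p->nnote == 0)
		return Idle;
	n = &p->note[0];
	if(n->flag != NUser && (p->notified || p->notify == 0))
		return Suicide;         /* the kernel would pexit() here */
	if(p->notified)
		return Idle;            /* already in the handler: do nothing */
	p->notified = 1;
	p->nnote--;
	memmove(&p->note[0], &p->note[1], p->nnote * sizeof(Note));
	return Delivered;
}
```

Posting an NUser note and then an NDebug note while notified == 0 leaves the
queue as [NUser, NDebug]. The first model_notify() delivers the NUser note
and sets notified; the second one finds the NDebug note at note[0] and
reports Suicide, although the handler itself never trapped.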

> it looks pretty clear that this is intentional.
> i don't see why one couldn't get 3-4 note before the note handler
> is called, however.
> 
> given this, calling sleep() from the note handler is an especially
> bad idea.
> 
> however, on a multiprocessor (or if you get scheduled by a clock
> tick on a up), you're still vulnerable.  this is akin to hitting ^c
> twice quickly — and watching one's shell exit.
> 
> it would be good to track down what's really going on in your
> vm.  how many processors does plan 9 think it has?

just one :-)

> i did some looking to see if i could find any discussions on the
> implementation of notes and didn't find anything in my quick scan.
> it would be very interesting to have a little perspective from someone
> who was there.

I have done further experiments and changed postnote() in
/sys/src/9/port/proc.c from:
...
        if(flag != NUser && (p->notify == 0 || p->notified))
                p->nnote = 0;
...
to:
...
        if(flag != NUser)
                p->nnote = 0;
...
which lets the testcase run without any suicides.

What it does is ensure (in a harsh way, by discarding every note queued
so far) that the trap always ends up as the first entry in the up->note
array, not only when the destination process is currently inside the
note handler, and no matter what NUser notes we received before.

A trap caused by a note handler will still suicide the
process, which is correct.

This is just a hack. It would be better to keep the
other notes, move them one slot down, and then put the
new note into the first entry if its flag is != NUser.
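
A rough userspace sketch of that gentler variant, to show what I mean. None
of this is tested in the kernel: postnote_front() is a made-up name, Proc and
Note are reduced to the note queue, and the NNOTE and ERRMAX values are
stand-ins. Two things are my assumptions and not settled: when the queue is
full, the most recently queued note is dropped to make room for the trap,
and the existing "p->nnote = 0" reset is left out entirely.

```c
#include <assert.h>
#include <string.h>

enum { NNOTE = 5, ERRMAX = 128 };   /* stand-in sizes for this sketch */
enum { NUser, NExit, NDebug };      /* note flags */

typedef struct Note {
	char msg[ERRMAX];
	int flag;
} Note;

typedef struct Proc {
	Note note[NNOTE];
	int nnote;
} Proc;

/*
 * NUser notes are appended at the tail as before.
 * Any other note is put into note[0], after the notes
 * already queued have been moved one slot down.
 */
int
postnote_front(Proc *p, const char *msg, int flag)
{
	Note *n;

	assert(p->nnote >= 0 && p->nnote <= NNOTE);
	if(flag == NUser){
		if(p->nnote >= NNOTE)
			return 0;       /* queue full: NUser note is lost */
		n = &p->note[p->nnote];
	}else{
		/* assumption: a full queue gives up its newest entry */
		if(p->nnote >= NNOTE)
			p->nnote = NNOTE-1;
		memmove(&p->note[1], &p->note[0], p->nnote * sizeof(Note));
		n = &p->note[0];
	}
	strncpy(n->msg, msg, ERRMAX-1);
	n->msg[ERRMAX-1] = '\0';
	n->flag = flag;
	p->nnote++;
	return 1;
}
```

With this, two NUser notes followed by a trap end up queued as
[trap, user, user], so notify() sees the trap first and the NUser notes
are still delivered afterwards.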

What do you think?

> - erik

--
cinap
