>>> I think the bigger issues, as you've pointed out, are the cost of
>>> the additional spin lock and whether the additional state should be
>>> stored in-band (fewer cache lines) or out-of-band (less risk of
>>> breaking due to unpredictable application behavior).
>>
>> We don't need the spinlock if we clear the shadow byte after
>> setting the status to user.
>>
>> Worst case, userspace will set it back to kernel while the shadow
>> byte is not yet cleared, and the next producer will drop a packet.
>> But subsequent producers will make progress, so there is no deadlock
>> or corruption.
>
> I thought so too for a while, but after spending more time than I
> care to admit I realized the following sequence was occurring:
>
>            Core A                          Core B
>            ------                          ------
>   - Enter spin_lock
>   - Get tp_status of head (X)
>     tp_status == 0
>   - Check inuse
>     inuse == 0
>   - Allocate entry X
>     advance head (X+1)
>     set inuse=1
>   - Exit spin_lock
>
>     <very long delay>
>
>                                   <allocate N-1 entries
>                                    where N = size of ring>
>
>                                   - Enter spin_lock
>                                   - get tp_status of head (X+N)
>                                     tp_status == 0 (but slot
>                                     in use for X on core A)
>
>   - write tp_status of     <--- trouble!
>     X = TP_STATUS_USER     <--- trouble!
>   - write inuse=0          <--- trouble!
>
>                                   - Check inuse
>                                     inuse == 0
>                                   - Allocate entry X+N
>                                     advance head (X+N+1)
>                                     set inuse=1
>                                   - Exit spin_lock
>
> At this point Core A has just passed slot X to userspace with a
> packet, and Core B has just been assigned slot X+N (the same slot as
> X) for its new packet. Both cores A and B end up filling in that
> slot. Tracking this down was one of the reasons it took me a
> while to produce these updated diffs.
Is this not just an ordering issue? Since inuse is cleared after tp_status is set, the allocator has to test inuse first (and barriers are needed to keep the reads and writes from being reordered).
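
To make that concrete, here is a minimal userspace sketch of the
ordering I have in mind, with C11 atomics standing in for the kernel's
smp_store_release()/smp_load_acquire(). The slot layout, field names
and helpers are made up for illustration and are not taken from the
patch:

#include <stdatomic.h>
#include <stdint.h>

#define TP_STATUS_KERNEL 0
#define TP_STATUS_USER   1

/* Hypothetical per-slot state: tp_status lives in the shared ring
 * header, inuse is the out-of-band shadow byte from the patch. */
struct slot {
        _Atomic uint32_t tp_status;
        _Atomic uint8_t  inuse;     /* claimed by a producer */
};

/* Producer completion path (core A above): publish the frame, then
 * release the shadow byte. The release store orders the tp_status
 * write before the inuse clear as seen by other CPUs. */
static void slot_complete(struct slot *s)
{
        atomic_store_explicit(&s->tp_status, TP_STATUS_USER,
                              memory_order_relaxed);
        atomic_store_explicit(&s->inuse, 0, memory_order_release);
}

/* Allocation path (run under the ring spinlock): test inuse before
 * tp_status. The acquire load pairs with the release above, so if we
 * observe inuse == 0 we are also guaranteed to observe any tp_status
 * written by the completing core. */
static int slot_is_free(struct slot *s)
{
        if (atomic_load_explicit(&s->inuse, memory_order_acquire))
                return 0;   /* still owned by a producer */
        return atomic_load_explicit(&s->tp_status,
                                    memory_order_relaxed) == TP_STATUS_KERNEL;
}

With that order, core B either sees inuse == 1 and skips the slot, or
sees inuse == 0 and is then guaranteed to see tp_status ==
TP_STATUS_USER, so the slot is skipped either way until userspace
returns it. The interleaving in your trace can no longer hand the
same slot out twice.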