On 2013-02-18 5:12 PM, Zack Weinberg wrote:
To fix things, we need to change abstract behavior in two places. First,
when you event_del() an event while the event loop to which it belongs
is running, it does not actually get disassociated from the event loop
until the next time the loop is about to either block or exit (this is
called the "stabilization point").  It is not safe to deallocate the
event object before then.  Second, we consider each event object to
_own_ its context data (if any), and therefore to be responsible for
deallocating it at the appropriate time, via an application-provided
callback (henceforth the "deallocation hook").

Concretely, when you call event_free, it _may_ synchronously deallocate
the event and call your deallocation hook to free the context data, but
if the event loop is running, it will just event_del the event and
return, and the actual deallocation will be queued to occur when the
event loop reaches the stabilization point.  Similarly for
bufferevent_free and all other libevent objects that are wrapped around
an event.
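To make that concrete, here is a minimal self-contained sketch of the machinery I have in mind. None of the structure or function names below are real libevent internals; they are invented purely to illustrate the deferred-free mechanism and the deallocation hook:

/* Self-contained sketch of the deferred-free scheme described above.
 * None of these names are real libevent code; they are invented purely
 * to illustrate the mechanism. */
#include <stdlib.h>

struct myevent;
typedef void (*dealloc_hook)(struct myevent *ev, void *ctx);

struct myevent {
    void           *ctx;        /* application context, owned by the event */
    dealloc_hook    free_ctx;   /* application-provided deallocation hook  */
    struct myevent *next_dead;  /* link in the base's pending-free list    */
};

struct mybase {
    int             loop_running;
    struct myevent *pending_free;  /* events freed while the loop was running */
};

static void really_free(struct myevent *ev)
{
    if (ev->free_ctx)
        ev->free_ctx(ev, ev->ctx);  /* the event owns its context data */
    free(ev);
}

/* The analogue of event_free(): defer teardown if the loop is running.
 * (The real thing would also do the equivalent of event_del() here.) */
static void myevent_free(struct mybase *base, struct myevent *ev)
{
    if (!base->loop_running) {
        really_free(ev);            /* safe to deallocate synchronously */
    } else {
        ev->next_dead = base->pending_free;
        base->pending_free = ev;    /* queued until the stabilization point */
    }
}

/* Called by the loop just before it blocks or exits (the "stabilization
 * point"), when no handler can still be running. */
static void drain_pending_free(struct mybase *base)
{
    while (base->pending_free) {
        struct myevent *ev = base->pending_free;
        base->pending_free = ev->next_dead;
        really_free(ev);
    }
}

The important invariant is that drain_pending_free() only ever runs at a point where no handler can still be holding a pointer to one of the queued events.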

I have realized that this is only good enough for applications in which each thread has its own event_base. If more than one thread uses the same event_base, then we can have this race:

thread 1        thread 2
select          select
wakeup          .
evmap: A        .
handler(A)      .
.               wakeup
.               evmap: A
.               handler(A)
event_free(A)   .
return          .
deallocate A    .
select          .
.               A->thing [CRASH]

I see two possible fixes for this race. The first is to use a counting semaphore to detect when a thread is the _last_ one about to block. Continuing with the above example, that would look like this:

semaphore    thread 1        thread 2
0            select          select
1            wakeup, sem++   .
1            evmap: A        .
1            handler(A)      .
2            .               wakeup, sem++
2            .               evmap: A
2            .               handler(A)
2            event_free(A)   .
2            return          .
1            --sem > 0       .
1            select          .
1            .               A->thing
1            .               .
1            .               return
0            .               --sem == 0
0            .               deallocate A
0            .               select

I'm reasonably confident this is sound at a high level, but there may still be raceable gaps between exiting select and bumping the semaphore, or between dropping the semaphore and re-entering select. Perhaps worse, it depends on there being regular idle points at which all threads but one are blocked. Under load, it may well be that at least two threads are always active, in which case deallocations will be delayed indefinitely.
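For what it's worth, the bookkeeping I am picturing around each thread's pass through the loop is roughly the following. The names are again made up, and it deliberately does not try to close the select-to-increment gap I just mentioned; it only shows where the count would be bumped and where the drain would happen:

/* Sketch of the counting-semaphore idea.  All names are invented; assume
 * drain_pending_free() is the stabilization-point cleanup from the earlier
 * sketch.  This does NOT close the gap between returning from select and
 * taking the lock; that is exactly the part I am unsure about. */
#include <pthread.h>

struct mtbase {
    pthread_mutex_t lock;
    int             active_threads;  /* number of threads running handlers */
    /* ... evmap, pending-free list, etc. ... */
};

/* One thread's pass through the loop: block, run handlers, maybe clean up. */
static void loop_once(struct mtbase *base)
{
    /* ... block in select/epoll/kqueue until something is ready ... */

    pthread_mutex_lock(&base->lock);
    base->active_threads++;          /* "wakeup, sem++" in the timeline */
    pthread_mutex_unlock(&base->lock);

    /* ... look the ready events up in evmap and run their handlers ... */

    pthread_mutex_lock(&base->lock);
    if (--base->active_threads == 0) {
        /* Last active thread: no handler anywhere can still be holding a
         * pointer to a pending-free event, so it is safe to clean up now. */
        /* drain_pending_free(base); */
    }
    pthread_mutex_unlock(&base->lock);

    /* ... return to blocking in select ... */
}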

The other possibility is to give every event object a reference count (currently, as far as I can tell, bufferevents and evbuffers are refcounted but events themselves aren't). Having been event_add()ed counts as 1, and every thread which is running handlers for an event also counts as a reference. Then we have

A refcnt     thread 1        thread 2
1            select          select
1            wakeup          .
2            evmap: A        .
2            handler(A)      .
2            .               wakeup
3            .               evmap: A
3            .               handler(A)
2            event_free(A)   .
2            return          .
1            --ref(A) > 0    .
1            select          .
1            .               A->thing
1            .               .
1            .               return
0            .               --ref(A) == 0
0            .               deallocate A
0            .               select

This way, deallocations cannot get starved, and there's also no need for a pending-deallocations queue, *provided* that each thread is careful not to drop any reference count until it is completely done running handlers for that event. (The problem I alluded to in my original message, involving bufferevent use-after-free in a *single*-threaded application, appeared to be a case where the read callback got invoked immediately after the event callback had called bufferevent_free, probably because the kernel posted "data available" and "connection closed" simultaneously. I never did track that down with certainty, though.)
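A minimal sketch of that counting rule, assuming C11 atomics and (as before) names I have made up rather than anything that exists in the tree:

/* Sketch of per-event reference counting.  Invented names; assumes C11
 * atomics.  The deallocation hook plays the same role as before. */
#include <stdatomic.h>
#include <stdlib.h>

struct rcevent {
    atomic_int   refcnt;     /* 1 for being event_add()ed, plus 1 for each
                              * thread currently running its handlers      */
    void        *ctx;
    void       (*free_ctx)(void *ctx);   /* application deallocation hook */
};

/* Taken by evmap when it decides to run this event's handlers. */
static void rcevent_incref(struct rcevent *ev)
{
    atomic_fetch_add(&ev->refcnt, 1);
}

/* Dropped by event_free() (the "added" reference) and by each thread once
 * it is completely done running handlers for this event. */
static void rcevent_decref(struct rcevent *ev)
{
    if (atomic_fetch_sub(&ev->refcnt, 1) == 1) {
        /* That was the last reference: no thread can still reach the
         * event, so deallocate it and its context right here. */
        if (ev->free_ctx)
            ev->free_ctx(ev->ctx);
        free(ev);
    }
}

The nice property is that whichever thread happens to take the count to zero does the deallocation on the spot, so nothing ever has to wait for a global quiescent point.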

My worry with this one is that we might have unavoidable races inside evmap. I haven't really looked at that layer at all, but I'm imagining a situation where thread 2 gets a wakeup on a fd that is about to become stale, looks up the event object, and is about to incref it when thread 1 deallocates it.

zw