I'm working on revamping SipRefreshManager. The system of timers and
locks used in the code is obscure, and after working on it for a while,
I see that the problem to be solved is difficult. I think that I have
devised a locking strategy that works reliably, but I would like to
circulate it for discussion.

Problem: We have a set of objects, and the objects contain timers.
When a timer fires, the event routine may modify the object containing
the timer. In addition, methods called by other threads may modify
the objects, and may create and delete objects. The nastiest part of
the problem is the race between an externally-called method that wants
to delete an object and the event routine of a timer within the
object.

Limitation: We assume that a timer event routine does not affect
objects other than the one that contains it, and does not delete its
object.

Proposed solution:

The set is given a lock. Each object is given a lock.

If a thread possesses a pointer to an object, the thread must hold the
set's lock. If a thread wishes to access an object (with the
exception of stopping one of the object's timers), it must also hold
the object's lock.

If a thread wishes to create or delete an object, it must hold the
set's lock. Since this excludes other threads from possessing a
pointer to the object, the thread need not hold the object's lock if
there are no running timers in the object.

A timer (that is started) is treated as a separate thread. It need
not hold the set's lock, as the timer's existence implies that the
object exists. But in order to read or modify the object (including,
in particular, restarting itself or other timers), it must hold the
object's lock.

In order to prevent deadlocks, a thread may hold only one object lock
at a time, and if (as is usual) it also wants to hold the set's lock,
it must seize the set's lock first.

Each object contains a boolean member, SuppressTimerEventRoutine.
This value is normally false. If it is true, then the body of the
timer event routine for every timer in the object is skipped. A
consequence of this is that a synchronous stop() applied to a timer
while its object's SuppressTimerEventRoutine is true will not start
any timer, nor will it seize the object's lock, because any event
routine that the stop waits for skips its body.

Pattern for timer event routine:

    if not SuppressTimerEventRoutine
    then
        seize the object's lock
        perform the operations
        release the object's lock
    end if

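As a rough C++ sketch of the timer event routine pattern: the names RefreshState, refreshCount and onTimerFired are made up for illustration and are not taken from SipRefreshManager. The flag is declared std::atomic here only because the pattern reads it before the object's lock is held; that is an assumption of this sketch, not something the proposal specifies.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for one object in the set.
struct RefreshState {
    std::mutex lock;                                     // the object's lock
    std::atomic<bool> suppressTimerEventRoutine{false};  // read without the lock
    int refreshCount = 0;                                // example state guarded by `lock`
};

// Timer event routine. While suppression is set, the body is skipped
// entirely: no lock is seized and no timer could be restarted.
// Returns true if the body ran.
bool onTimerFired(RefreshState& obj) {
    if (obj.suppressTimerEventRoutine.load()) {
        return false;
    }
    std::lock_guard<std::mutex> objGuard(obj.lock);  // seize the object's lock
    ++obj.refreshCount;                              // perform the operations
    return true;
}   // objGuard releases the object's lock here
```

Calling onTimerFired() on a fresh object runs the body once; after the flag is set, further calls return false and leave refreshCount unchanged.
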
Pattern for an externally-called method that does not affect timers:

    seize the set's lock
    look up the object in question
    seize the object's lock
    perform the operations
    release the object's lock
    release the set's lock

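The pattern for a method that leaves timers alone could look roughly like this in C++. RefreshSet, RefreshState, the string key and the expirationSeconds field are all hypothetical; the point is only the order in which the two locks are taken and released.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical object type; only what the sketch needs.
struct RefreshState {
    std::mutex lock;               // the object's lock
    int expirationSeconds = 3600;  // example state guarded by `lock`
};

// Hypothetical container standing in for the set of objects.
class RefreshSet {
public:
    // Helper for the example: insert an object with default state.
    void add(const std::string& key) {
        std::lock_guard<std::mutex> setGuard(mSetLock);
        mObjects[key].reset(new RefreshState);
    }

    // Externally-called method that does not touch any timer.
    // Returns false if no object is registered under `key`.
    bool setExpiration(const std::string& key, int seconds) {
        std::lock_guard<std::mutex> setGuard(mSetLock);   // set's lock first
        auto it = mObjects.find(key);                     // look up the object
        if (it == mObjects.end()) {
            return false;
        }
        RefreshState& obj = *it->second;
        std::lock_guard<std::mutex> objGuard(obj.lock);   // then the object's lock
        obj.expirationSeconds = seconds;                  // perform the operations
        return true;
    }   // guards unlock in reverse order: object's lock, then set's lock

    // Read accessor using the same locking pattern; -1 means "not found".
    int expiration(const std::string& key) {
        std::lock_guard<std::mutex> setGuard(mSetLock);
        auto it = mObjects.find(key);
        if (it == mObjects.end()) {
            return -1;
        }
        std::lock_guard<std::mutex> objGuard(it->second->lock);
        return it->second->expirationSeconds;
    }

private:
    std::mutex mSetLock;                                            // the set's lock
    std::map<std::string, std::unique_ptr<RefreshState>> mObjects;  // the set
};
```

Using scoped guards means the locks are released in the reverse of the order in which they were seized, which matches the pattern without any explicit unlock calls.
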
Most externally-called methods will cause state changes in the object
for which timers should be stopped and started. In that case, a more
elaborate pattern is needed:

    seize the set's lock
    look up the object in question
    seize the object's lock
    set SuppressTimerEventRoutine
    release the object's lock
    stop timer(s) synchronously
    # At this point all the object's timers are stopped and
    # there are no queued event routine firings.
    seize the object's lock
    clear SuppressTimerEventRoutine
    perform the operations (which may include starting timers)
    release the object's lock
    release the set's lock

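A possible C++ rendering of the more elaborate pattern follows. StubTimer is only a placeholder for a real timer class (a real synchronous stop would block until any in-flight event routine has returned), and RefreshSet, RefreshState and restartWithExpiration are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <mutex>
#include <string>

// Placeholder timer, an assumption of this sketch. A real
// stopSynchronously() would wait for a running event routine to finish.
struct StubTimer {
    bool running = false;
    void start() { running = true; }
    void stopSynchronously() { running = false; }
};

struct RefreshState {
    std::mutex lock;                                     // the object's lock
    std::atomic<bool> suppressTimerEventRoutine{false};
    StubTimer refreshTimer;
    int expirationSeconds = 3600;                        // example state
};

struct RefreshSet {
    std::mutex lock;                                     // the set's lock
    std::map<std::string, RefreshState> objects;
};

// Externally-called method that changes state and restarts the timer.
bool restartWithExpiration(RefreshSet& set, const std::string& key, int seconds) {
    std::lock_guard<std::mutex> setGuard(set.lock);      // seize the set's lock
    auto it = set.objects.find(key);                     // look up the object
    if (it == set.objects.end()) {
        return false;
    }
    RefreshState& obj = it->second;
    {
        std::lock_guard<std::mutex> objGuard(obj.lock);
        obj.suppressTimerEventRoutine = true;
    }   // object's lock released so an in-flight event routine can finish
    obj.refreshTimer.stopSynchronously();
    // All timers are stopped and no event routine firings are queued.
    std::lock_guard<std::mutex> objGuard(obj.lock);
    obj.suppressTimerEventRoutine = false;
    obj.expirationSeconds = seconds;                     // perform the operations
    obj.refreshTimer.start();                            // may start timers again
    return true;
}
```

The inner braces matter: the object's lock has to be dropped before the synchronous stop, otherwise an event routine that is already waiting for that lock could never finish.
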
Pattern for an externally-called method to delete an object:

    seize the set's lock
    look up the object in question
    seize the object's lock
    set SuppressTimerEventRoutine
    release the object's lock
    stop timer(s) synchronously
    # At this point all the object's timers are stopped and
    # there are no queued event routine firings.
    remove the object from the set
    delete the object
    release the set's lock

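The deletion pattern might be sketched in C++ as below. As before, StubTimer is a placeholder for a real timer with a blocking stop, and the other names are invented for the example. Erasing the map entry both removes the object from the set and destroys it.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <mutex>
#include <string>

// Placeholder timer, an assumption of this sketch.
struct StubTimer {
    bool running = false;
    void start() { running = true; }
    void stopSynchronously() { running = false; }
};

struct RefreshState {
    std::mutex lock;                                     // the object's lock
    std::atomic<bool> suppressTimerEventRoutine{false};
    StubTimer refreshTimer;
};

struct RefreshSet {
    std::mutex lock;                                     // the set's lock
    std::map<std::string, RefreshState> objects;
};

// Externally-called method that deletes an object.
bool removeObject(RefreshSet& set, const std::string& key) {
    std::lock_guard<std::mutex> setGuard(set.lock);      // seize the set's lock
    auto it = set.objects.find(key);                     // look up the object
    if (it == set.objects.end()) {
        return false;
    }
    RefreshState& obj = it->second;
    {
        std::lock_guard<std::mutex> objGuard(obj.lock);
        obj.suppressTimerEventRoutine = true;
    }   // object's lock released before the synchronous stop
    obj.refreshTimer.stopSynchronously();
    // No timer is running and no firing is queued, and the set's lock keeps
    // every other thread from holding a pointer, so the object's lock is
    // no longer needed.
    set.objects.erase(it);   // remove from the set and delete the object
    return true;
}
```
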
Pattern for an externally-called method to create an object:

    seize the set's lock
    create the object
    seize the object's lock
    clear SuppressTimerEventRoutine
    perform the operations (which may include starting timers)
    release the object's lock
    release the set's lock

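For completeness, a C++ sketch of the creation pattern. The names are hypothetical, StubTimer is a placeholder for a real timer, and the duplicate-key check is an addition of this sketch rather than part of the pattern.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <mutex>
#include <string>

// Placeholder timer, an assumption of this sketch.
struct StubTimer {
    bool running = false;
    void start() { running = true; }
};

struct RefreshState {
    std::mutex lock;                                     // the object's lock
    std::atomic<bool> suppressTimerEventRoutine{false};
    StubTimer refreshTimer;
    int expirationSeconds = 0;                           // example state
};

struct RefreshSet {
    std::mutex lock;                                     // the set's lock
    std::map<std::string, RefreshState> objects;
};

// Externally-called method that creates an object and starts its timer.
bool createObject(RefreshSet& set, const std::string& key, int seconds) {
    std::lock_guard<std::mutex> setGuard(set.lock);      // seize the set's lock
    if (set.objects.count(key) != 0) {
        return false;                                    // sketch-only guard
    }
    RefreshState& obj = set.objects[key];                // create the object
    std::lock_guard<std::mutex> objGuard(obj.lock);      // seize the object's lock
    obj.suppressTimerEventRoutine = false;
    obj.expirationSeconds = seconds;                     // perform the operations
    obj.refreshTimer.start();                            // may start timers
    return true;
}   // object's lock is released first, then the set's lock
```
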
In SipRefreshManager, timer event routines or externally-called
methods may want to send SIP messages. Since SipUserAgent::send() can
take a long time, the routine must release all of its locks before it
sends any SIP messages. This introduces new race conditions, in
that a message may be sent after the object's state has changed in a
way that renders the message obsolete, but SIP already allows for
duplicated or delayed messages, so no harm can come from this.

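One way to arrange this in C++ is to queue outgoing messages while the lock is held and send them only after the lock is released. In the sketch below, the send callback stands in for a slow call such as SipUserAgent::send(); SendFn, RefreshState, cseq and the message text are all made up for illustration.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <vector>

struct RefreshState {
    std::mutex lock;   // the object's lock
    int cseq = 0;      // example state guarded by `lock`
};

// Hypothetical callback type standing in for a slow send operation.
using SendFn = std::function<void(const std::string&)>;

// Timer event routine that wants to send a message. The message is built
// under the object's lock but sent only after the lock is released.
void onTimerFired(RefreshState& obj, const SendFn& send) {
    std::vector<std::string> outbox;
    {
        std::lock_guard<std::mutex> objGuard(obj.lock);
        ++obj.cseq;
        outbox.push_back("refresh, CSeq " + std::to_string(obj.cseq));
    }   // object's lock released here
    for (const std::string& message : outbox) {
        send(message);   // slow call made with no locks held
    }
}
```

By the time send() runs, another thread may already have changed the object, which is exactly the race the paragraph above accepts.
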
Dale