Brian Pane wrote:

If we could change the apr_atomic_inc/dec functions to use

apr_uint32_t,

this part of the fdqueue code could become a lot simpler.

As far as I know, we can make it work with apr_uint32_t on most
platforms, as long as we declare any inline assembly blocks as
volatile (thanks to Sascha Schumann for fixing this recently in
apr_atomic_cas on Linux).
>
The one platform I'm not sure of is OS/390, due to the special
data types and API involved:
http://marc.theaimsgroup.com/?l=apr-dev&m=104129861312811

Don't let that stop you. I'll take care of OS/390. Right off the top of my head, I can't see how the cs_t could be anything other than a 32 bit integer, knowing how the cs instruction works.


(sorry for not replying to that msg earlier...I'm still backed up from the holidays.)

I still have one more change I'm hoping to make in the fdqueue
synchronization logic: move the queue manipulation code in
ap_queue_push/pop outside the mutex protected region by maintaining
a linked list of queue nodes with apr_atomic_casptr.

Sounds good, as long as the pops are single threaded somehow, or if

you have


some other way of getting around the A-B-A cas pop window.


I'm planning on spinning around the CAS; i.e., retry if the
CAS fails due to a collision with another thread.  The queue_info
synchronization, which uses that same technique, seems to be
working well in practice.

no...we're cool if the CAS fails; the problem is that sometimes it doesn't fail when you wish it would.


thread 1 prepares to pop A off the list and picks up its next pointer to B, then gets interrupted/swapped out/whatever.

other threads really pop A off the list, pop B off the list and free it, then push A back on the list with A's next pointer now pointing at C.

thread 1 wakes back up and uses CAS to compare for A and swap with B. The CAS succeeds on most architectures, because the head contains a pointer to A once again. This corrupts the head to point to B which has been freed. The problem is that A's next pointer had been updated, and a straightforward CAS on a single word can't tell the difference between a pointer to old A and a pointer to new improved A.

I did look at the queue_info pop and decided it must be single threaded for a given pool/qi combo, because it is driven as a pool cleanup and if that wasn't single threaded already, the apr pool destroy code would go nuts. I hope that assessment is correct.

A double word CAS can do it safely by incrementing a list version # concurrently with updating the list head pointer. That's possible on Pentiums with CMPXCHG8B and __cds on S/390. PowerPCs and others (Alpha??) don't need the double word version; they can use Load and Reserve followed by Store Conditional safely. But I'm not aware of such a mechanism on Sparc for example, and I don't know much about other CPU architectures.

Greg



Reply via email to