On 14 Jan 2013, at 17:47, Jilles Tjoelker wrote:

> The code which does that check is actually under contrib/gcc. Problem
> is, they designed __gthread_active_p() to distinguish threaded and
> unthreaded programming environments -- it must be known in advance and
> cannot be changed later. The code for the unthreaded environment then
> takes advantage of this by not even allocating memory for mutexes in
> some cases.

It's worth taking a step back and asking why this code exists at all.  The 
main reason is that acquiring a mutex used to be really expensive.  It still is 
on some fruit-flavoured operating systems, but elsewhere the uncontended case is 
a single atomic operation, and in single-threaded code the cache line will 
already be exclusively owned by the calling core.

I would much rather that we followed the example of Solaris and made the 
multithreaded case fast and the default than keep piling on hacks that allow 
code to shave off a few clock cycles in the single-threaded case.  In 
particular, the popularity of multicore systems means that it is increasingly 
rare for code to be both single-threaded and performance-critical, so this 
seems like misplaced optimisation.

I strongly suspect that making it possible to inline the uncontended lock case 
for a pthread mutex and eliminating all of the branches on __isthreaded would 
give us a net speedup in both single and multithreaded cases.
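
A minimal sketch of the idea, assuming C11 atomics.  The names sketch_mutex, 
sketch_lock, sketch_lock_slow and sketch_is_threaded are hypothetical and only 
illustrate the shape; this is not how libthr or pthread_mutex_lock() is 
actually implemented.  The uncontended acquire is one compare-and-swap that the 
compiler can inline at the call site, and only contention goes out of line:

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical lock type for illustration; not libthr's real layout. */
typedef struct {
    atomic_int state;               /* 0 = unlocked, 1 = locked */
} sketch_mutex;

/* Out-of-line slow path, reached only when the lock is contended. */
static void sketch_lock_slow(sketch_mutex *m)
{
    int expected = 0;

    while (!atomic_compare_exchange_weak_explicit(&m->state, &expected, 1,
        memory_order_acquire, memory_order_relaxed)) {
        expected = 0;               /* the failed CAS overwrote it */
        sched_yield();              /* a real lock would sleep in the kernel */
    }
}

/* Inlinable fast path: a single atomic operation when uncontended. */
static inline void sketch_lock(sketch_mutex *m)
{
    int expected = 0;

    if (!atomic_compare_exchange_strong_explicit(&m->state, &expected, 1,
        memory_order_acquire, memory_order_relaxed))
        sketch_lock_slow(m);
}

static inline void sketch_unlock(sketch_mutex *m)
{
    /* Debug check only: unlocking a mutex that is not held is a bug. */
    assert(atomic_load_explicit(&m->state, memory_order_relaxed) == 1);
    atomic_store_explicit(&m->state, 0, memory_order_release);
}

/*
 * For contrast, the pattern argued against above: every call site branches
 * on a global flag (a hypothetical stand-in for __isthreaded) and skips the
 * lock entirely when the process believes it is single-threaded.
 */
static int sketch_is_threaded;

static inline void sketch_lock_if_threaded(sketch_mutex *m)
{
    if (sketch_is_threaded)
        sketch_lock(m);
}
```

With the unconditional version, code loaded later (a threaded plugin, say) 
sees a lock that was really taken, whatever the process looked like at 
startup.  Whether the branch-free variant is a net win would need measuring.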

> This __gthread_active_p() thing is another barrier to bringing in a
> threaded plugin in an unthreaded application. Ports people spend a fair
> amount of time adding -pthread flags to things (such as perl) to work
> around this.


This and the similar checks in libc cause a lot of pain, and it seems that the 
correct fix is ensuring that the performance penalty for linking libthr is so 
small that there is no point in avoiding it.

David
