Geoff, we did not seem to be able to figure out what openssl Makefile actually builds crypto/threads/th-lock.c
In our particular case we explicitly included that file when building our server, but for “pure” OpenSSL, what make includes th-lock.c ? Igor. From: Geoffrey Thorpe [mailto:ge...@geoffthorpe.com] Sent: Monday, June 09, 2014 10:13 PM To: openssl-dev@openssl.org Cc: Levin, Igor; Salz, Rich Subject: Re: Locking inefficiency Hey Bodo, On Mon, Jun 9, 2014 at 3:15 PM, Bodo Moeller <bmoel...@acm.org<mailto:bmoel...@acm.org>> wrote: Geoffrey Thorpe <ge...@geoffthorpe.com<mailto:ge...@geoffthorpe.com>>: First, you're right, pthreads_locking_callback() is collapsing everything to a mutex. I was well aware of this and thought we did this for compatibility reasons (because I couldn't think of any other reasonable explanation, I guess). If actual read-write locks are just as portable, I think it's a no-brainer that we should switch to them. (After all, our code is already prepared for that, for applications that provide appropriate custom callbacks. It's just the default that falls behind.) Well, who knows what the legacy is behind some of the SSLeay-esque assumptions, but I dare say that any pthread implementation that didn't have rw-locks is likely to be relegated to history by now. In any case, I suspect we will soon require that "supported platforms" be those for which automated builds and tests are running regularly. Before the next stable branch is cut, in any case. So I'm going to propose that we initially put this patch into the development head only, and defer a decision on whether to cherry-pick it into stable branches until that testing is in place. The use of mutex instead of rwclock is not a functional bug, and for this to be a noticeable performance issue (ie. the kind of workload where this matters) is likely to be in situations where downstream patching of the openssl release/checkout is OK. Well, until such time as a cherry-pick-to-stable decision can be made with confidence, in any case. I certainly don't want us to replace the mutex-using code with something that uses mutex *or* rwlock depending on whether OPENSSL_PTHREADS_SUCK is defined or not. BTW, I mention this because NPTL headers apparently cage the rwlock definitions in some #ifdef-ery that I think we want to avoid in the mutex->rwlock changes in openssl. Rather than grappling with the will-some-platform-fail-in-some-subtle-way issues, I prefer that we rely on the short-term arrival of platform coverage/testing to detect the issue if there is one to cater for. What do you think? For future work, a lock-free approach (using thread-local storage?) certainly might make sense, but switching to read-write locks in the default callbacks should be a tiny change with significant benefits to multithreaded applications. Yeah, but a couple of things come to mind. (1) rwlocks (under optimised conditions anyway) seem to be essentially "lock free" in the fast-path case anyway, ie. for the read-lock/no-contention case, due to futex magic. That means no context-switch (to the kernel or otherwise) in that by-far-the-most-common case. So I think a change to rwlocks is likely to eliminate the observable syscall and contention overheads anyway. (2) that error code is pretty convoluted and is one of my top candidates for getting an axe taken to it, so fiddling with it to get the fast-path lockless (in spite of pthread optimisations) sounds like a bad use of time. Cheers, Geoff