Geoff,
we did not seem to be able to figure out what openssl Makefile actually builds
crypto/threads/th-lock.c
In our particular case we explicitly included that file when building our
server, but for “pure” OpenSSL, what make includes th-lock.c ?
Igor.
From: Geoffrey Thorpe [mailto:[email protected]]
Sent: Monday, June 09, 2014 10:13 PM
To: [email protected]
Cc: Levin, Igor; Salz, Rich
Subject: Re: Locking inefficiency
Hey Bodo,
On Mon, Jun 9, 2014 at 3:15 PM, Bodo Moeller
<[email protected]<mailto:[email protected]>> wrote:
Geoffrey Thorpe <[email protected]<mailto:[email protected]>>:
First, you're right, pthreads_locking_callback() is collapsing everything to a
mutex.
I was well aware of this and thought we did this for compatibility reasons
(because I couldn't think of any other reasonable explanation, I guess). If
actual read-write locks are just as portable, I think it's a no-brainer that we
should switch to them. (After all, our code is already prepared for that, for
applications that provide appropriate custom callbacks. It's just the default
that falls behind.)
Well, who knows what the legacy is behind some of the SSLeay-esque assumptions,
but I dare say that any pthread implementation that didn't have rw-locks is
likely to be relegated to history by now.
In any case, I suspect we will soon require that "supported platforms" be those
for which automated builds and tests are running regularly. Before the next
stable branch is cut, in any case.
So I'm going to propose that we initially put this patch into the development
head only, and defer a decision on whether to cherry-pick it into stable
branches until that testing is in place. The use of mutex instead of rwclock is
not a functional bug, and for this to be a noticeable performance issue (ie.
the kind of workload where this matters) is likely to be in situations where
downstream patching of the openssl release/checkout is OK. Well, until such
time as a cherry-pick-to-stable decision can be made with confidence, in any
case.
I certainly don't want us to replace the mutex-using code with something that
uses mutex *or* rwlock depending on whether OPENSSL_PTHREADS_SUCK is defined or
not.
BTW, I mention this because NPTL headers apparently cage the rwlock definitions
in some #ifdef-ery that I think we want to avoid in the mutex->rwlock changes
in openssl. Rather than grappling with the
will-some-platform-fail-in-some-subtle-way issues, I prefer that we rely on the
short-term arrival of platform coverage/testing to detect the issue if there is
one to cater for.
What do you think?
For future work, a lock-free approach (using thread-local storage?) certainly
might make sense, but switching to read-write locks in the default callbacks
should be a tiny change with significant benefits to multithreaded applications.
Yeah, but a couple of things come to mind.
(1) rwlocks (under optimised conditions anyway) seem to be essentially "lock
free" in the fast-path case anyway, ie. for the read-lock/no-contention case,
due to futex magic. That means no context-switch (to the kernel or otherwise)
in that by-far-the-most-common case. So I think a change to rwlocks is likely
to eliminate the observable syscall and contention overheads anyway. (2) that
error code is pretty convoluted and is one of my top candidates for getting an
axe taken to it, so fiddling with it to get the fast-path lockless (in spite of
pthread optimisations) sounds like a bad use of time.
Cheers,
Geoff