I got an email saying unified diff is preferred, so here's a resend
in that format.
Index: rthread_tls.c
===================================================================
RCS file: /cvs/src/lib/librthread/rthread_tls.c,v
retrieving revision 1.13
diff -u -r1.13 rthread_tls.c
--- rthread_tls.c 6 Nov 2011 11:48:59 -0000 1.13
+++ rthread_tls.c 30 Sep 2013 14:48:18 -0000
@@ -96,7 +96,12 @@
struct rthread_storage *rs;
pthread_t self;
- _spinlock(&rkeyslock);
+ /* No lock is needed if
+ (1) the key is not valid (undefined behavior)
+ (2) storage is already allocated for the key (the linked
+ list is only ever modified by adding to the head)
+ */
+
if (!rkeys[key].used) {
rs = NULL;
goto out;
@@ -109,17 +114,25 @@
break;
}
if (!rs) {
- rs = calloc(1, sizeof(*rs));
- if (!rs)
- goto out;
- rs->keyid = key;
- rs->data = NULL;
- rs->next = self->local_storage;
- self->local_storage = rs;
+ _spinlock(&rkeyslock);
+ /* Repeat search with lock held */
+ for (rs = self->local_storage; rs; rs = rs->next) {
+ if (rs->keyid == key)
+ break;
+ }
+ if (!rs) {
+ rs = calloc(1, sizeof(*rs));
+ if (rs) {
+ rs->keyid = key;
+ rs->data = NULL;
+ rs->next = self->local_storage;
+ self->local_storage = rs;
+ }
+ }
+ _spinunlock(&rkeyslock);
}
out:
- _spinunlock(&rkeyslock);
return (rs);
}
> Here is a diff relative of OpenBSD 5.3 to avoid taking a lock in
> pthread_getspecific in the common case where storage for the key is
> already allocated.
>
> I have only tested this patch for my use case where a single key is
> created once and used many times. In that case, the bottleneck I
> observed is gone. My parallel program sees linear speedup.
>
> This patch would be unsafe if elements were ever removed from the
> per-thread list of allocated storage. In the 5.3 implementation the
> list can only grow. Its size is bounded by PTHREAD_KEYS_MAX so I
> do not consider this a leak. Possibly it should be an array instead
> of a list.
>