Re: Bug that spans tomcat and tomcat-native

Rainer Jung Tue, 28 Jun 2016 03:35:06 -0700

Am 28.06.2016 um 11:15 schrieb Mark Thomas:

On 27/06/2016 23:54, Nate Clark wrote:

On Mon, Jun 27, 2016 at 5:19 PM, Mark Thomas <ma...@apache.org> wrote:

On 27/06/2016 19:59, Nate Clark wrote:

On Mon, Jun 27, 2016 at 2:13 PM, Rainer Jung <rainer.j...@kippdata.de> wrote:


<snip/>

I wouldn't care too much about portability right now. If you can prove
Mark's analysis by using gettid() on Linux and checking that this fixes the
performance issue, that would very useful feedback. If it doesn't help, we
can go down the path of using the hash statistics to get more info about
what's happening. If it helps, we can think further about how to reduce hash
collisions fo every platform.


I went down the hash statistics route any way. The short version is with
11 threads and 8 buckets every thread ends up in the same bucket. This
was on OSX. I'm pretty sure Linux will be the same.

I can try that, but unfortunately will not be able to run the test
right away. The machine I was using is a shared resource, which
somebody else needs for the next few days.


Next up is trying to figure out a patch to fix the bucket distribution.


Here is my proposed fix for OSX:

Index: src/ssl.c
===================================================================
--- src/ssl.c   (revision 1750259)
+++ src/ssl.c   (working copy)
@@ -420,6 +420,10 @@
     return psaptr->PSATOLD;
 #elif defined(WIN32)
     return (unsigned long)GetCurrentThreadId();
+#elif defined(DARWIN)
+    uint64_t tid;
+    pthread_threadid_np(NULL, &tid);
+    return (unsigned long)tid;
 #else
     return (unsigned long)(apr_os_thread_current());
 #endif


I want to do some similar testing for Linux before adding what I suspect
will be a very similar block using gettid().


We could either add something to configure.in. Untested:

Index: native/configure.in
===================================================================
--- native/configure.in (revision 1750462)
+++ native/configure.in (working copy)
@@ -218,6 +218,9 @@
     *-solaris2*)
         APR_ADDTO(TCNATIVE_LIBS, -lkstat)
         ;;
+    *linux*)
+        APR_ADDTO(CFLAGS, -DTCNATIVE_LINUX)
+        ;;
     *)
         ;;
 esac


and then use a

#ifdef TCNATIVE_LINUX

or we copy some other more direct linux check from e.g. APR:

#ifdef __linux__

The latter looks simpler, but the version above is based on all thelogic put into config.guess.

Finally we could check against SYS_gettid, either during an autodettectin configure or directly in the c source file. It seems gettid() must becalled as "syscall(SYS_gettid)" and was added in kernel 2.4.11 at theend of 2001.

Even if that does help the performance and makes it so the hash map
has better distribution. It won't help with long term bloat. If a
system is running long enough and threads are started and stopped the
hash map could grow to include up to almost max pid entries, 32768 by
default on Linux. With the size of the ERR_STATE struct that will be
roughly 12MB of memory, not horrible but not ideal. I still think
ERR_remove_thread_state() should probably be invoked prior to a thread
exiting. The only way I can think of doing that and keeping everything
in native and not call it for every request would be to call
pthread_cleanup_push(ERR_remove_thread_state, NULL) the first time a
thread is used to perform any ssl operation. That would then require
you to track per thread if the cleanup function had been registered.


That is definitely a secondary consideration at the moment.

Also, any solution which make the thread id random and not a bounded
set of values would cause the max memory foot print of the hash map to
be virtually unbounded and require that ERR_remove_thread_state be
invoked to clean up the error states.


The function that generated the 'random' ID from the thread ID would
have to be deterministic so the total number of values should remain
bounded. Shouldn't it?


If the error queue's are not being removed that would be a
requirement. If it is to be bound it can not use
apr_os_thread_current(), since that is a "random" memory address and
inherently unbounded. If you tried to reduce that value to a smaller
set you would then have to worry about collisions which is something
any algorithm would have to worry about. The code in err.c assumes
that the error state is only ever modified by one thread and doesn't
lock modifying the state just the map access. The easiest thing to
base the ID on, from a Linux perspective, would be the TID, which is
bounded and guaranteed to be unique for a running thread. I am not as
sure about other operating systems. It seems like on some versions of
OSx pthread_getthreadid_np(), might do what you want.


I do think OS specific code to get the real thread ID is the way to go
here until we can switch to OpenSSL 1.1.0 at which point the problem
should go away.

Agreed, but cleaning up at the end of a thread would be good as well,e.g. if the executor is configured to renew threads when a context isreloaded we can accumulate lots of old hash entries.


Rainer


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org

Re: Bug that spans tomcat and tomcat-native

Reply via email to