Please bear with me, as this whole Open Source concept is new to me.  I've
made several fixes over the past two days, and I've reported them all to
"you", but then I realized that I don't know who "you" is.  Is it the same
as "me"?  Am I supposed to somehow submit my fixes myself?  Who, if anyone,
verifies them for correctness?  How do I get the patches into the
distribution code?  How fast can this happen?  How, exactly, does all this
work?

Aside from all that, I have a much more important issue.  While it appears
that I've fixed the other crash problem I was having, now lh_retrieve
(crypto/lhash/lhash.c) is crashing.  It's not obvious to me from looking at
the code how this could be happening.  

Specifically, the call to lh_retrieve comes from ERR_get_state, and it all
looks thread-safe to me.  Apparently it's not, though.  Unfortunately, I
only discovered this crash last night, after about 122,000 calls across
122,000 threads, running 20 threads at a time.  A failure that rare is
pretty hard to catch as it happens.  The problem, of course, is that until
I fix this bug I don't have a product, and we were supposed to be shipping
a week ago.  Business as usual...
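
To make the failing scenario concrete, here is a minimal, self-contained
sketch of the test pattern described above: threads run in batches, and each
thread looks up and then removes its own entry in a shared table as it
exits.  None of this is OpenSSL code.  The table, state_get, state_remove,
worker and run_batches are hypothetical stand-ins I made up for
illustration, and the mutex only shows the kind of locking a shared lookup
needs in order to be safe under this load.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a shared table of per-thread state.
 * This is NOT OpenSSL's lhash; it only models one structure shared
 * by many threads. */
#define SLOTS 64

struct state {
    int used;             /* slot is claimed by some thread */
    unsigned long owner;  /* id of the owning thread */
    int err;              /* pretend per-thread error code */
};

static struct state table[SLOTS];
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static long cleanups_done;

/* Find the caller's slot, claiming a free one on first use. */
static struct state *state_get(unsigned long id)
{
    struct state *found = NULL, *free_slot = NULL;

    pthread_mutex_lock(&table_lock);
    for (int i = 0; i < SLOTS; i++) {
        if (table[i].used && table[i].owner == id) {
            found = &table[i];
            break;
        }
        if (!table[i].used && free_slot == NULL)
            free_slot = &table[i];
    }
    if (found == NULL && free_slot != NULL) {
        free_slot->used = 1;
        free_slot->owner = id;
        free_slot->err = 0;
        found = free_slot;
    }
    pthread_mutex_unlock(&table_lock);
    return found;  /* NULL only if the table is full */
}

/* Release the caller's slot and count the completed cleanup. */
static void state_remove(unsigned long id)
{
    pthread_mutex_lock(&table_lock);
    for (int i = 0; i < SLOTS; i++) {
        if (table[i].used && table[i].owner == id) {
            table[i].used = 0;
            cleanups_done++;
            break;
        }
    }
    pthread_mutex_unlock(&table_lock);
}

/* Thread body: look up own state, read it, then remove it on exit. */
static void *worker(void *arg)
{
    unsigned long id = *(unsigned long *)arg;
    struct state *s = state_get(id);

    if (s != NULL)
        (void)s->err;
    state_remove(id);
    return NULL;
}

/* Run `total` threads, at most `batch` at a time.  Returns the number
 * of threads that completed their cleanup, or -1 on a bad batch size. */
long run_batches(long total, int batch)
{
    pthread_t tid[SLOTS];
    unsigned long id[SLOTS];
    long started = 0;

    if (batch < 1 || batch > SLOTS)
        return -1;
    cleanups_done = 0;
    while (started < total) {
        int n = 0;
        while (n < batch && started < total) {
            id[n] = (unsigned long)(started + 1);  /* unique per thread */
            if (pthread_create(&tid[n], NULL, worker, &id[n]) != 0)
                break;
            n++;
            started++;
        }
        for (int i = 0; i < n; i++)
            pthread_join(tid[i], NULL);
        if (n == 0)
            break;  /* could not start any thread at all */
    }
    return cleanups_done;
}
```

Calling run_batches(122000, 20) would mirror the numbers above.  With the
lock and unlock calls removed, the same loop becomes a race on the shared
table, which is the sort of failure that may only show up after a very
large number of iterations.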

As before, this problem occurs in thread cleanup before exit, in the calling
sequence:

                unsigned long iErr = ERR_get_error ();  /* <- problem is in here */
                ERR_error_string (iErr, buf);
                ERR_reason_error_string (iErr);
                ERR_remove_state (0);
        
Here's a dbx stack trace.  Anyone have any ideas?

(In this case, the call to get_error_values is an optimization of the call
to ERR_get_error, above.)

(dbx) where
current thread: t@245478
  [1] lh_retrieve(0x56828, 0x6f77, 0xe27cc, 0x61737465, 0xedc0f88c, 0xe9df0), at 0x5520c
  [2] ERR_get_state(0xb09a8, 0xef5d9c08, 0x893, 0x0, 0x0, 0x3bee6), at 0x56994
  [3] get_error_values(0x1, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x56194
  [4] 0xef786ca8(0xe8fd0, 0xef5aaa1c, 0xef5a6ae8, 0x14, 0x7, 0x38e12c), at 0xef786ca7
=>[5] __cAppThread::~__cAppThread(this = 0xe8fd0), line 304 in "SockTest.cpp"
  [6] __SLIP.DELETER__B(0xe8fd0, 0x1, 0x64, 0x16760, 0xe1b70, 0xe8fd0), at 0x2b2b0
  [7] __cAppThread::ThreadMain(this = 0xe8fd0), line 329 in "SockTest.cpp"
  [8] ThreadRootStartingPoint(pThreadInstance = 0xe8fd0), line 74 in "cThread.cpp"

Thanks again for the help,

Bill Rebey
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]
