Hi guys,

this week-end, I experienced some bad issues with the server-integ tests, while trying to use the same LdapDecoder for both client and server. After 3 or 4 tests, I get blocked for 30 seconds, and then move to the next test, get blocked again, etc.

Of course, at the end, I got test failures, as expected.

I did some deep analysis on what was going on, added some traces, and that led me to a point where I get blocked while trying to access to any of the LdapContainer method.

Here is an exemple of the plcae I get blocked in :

public void decode( ByteBuffer stream, Asn1Container container ) throws DecoderException
    {
        boolean hasRemaining = stream.hasRemaining();
System.out.println( "<Has Remaining> : " + hasRemaining + "/" + stream.remaining() );

        // Increment the PDU size counter.
        int remaining = stream.remaining();
        System.out.println( "----- 0" );
        container.incrementDecodeBytes( remaining );

        System.out.println( "----- 1" );

and in the console, the "----- 0" get printed, not the "----- 1". The incrementDecodeBytes is a very simple method :

    public void incrementDecodeBytes( int nb )
    {
        System.out.println( "---in incrementDecodedBytes" );
        decodeBytes += nb;
    }

(note that I don't get either the "---in incrementDecodedBytes" message). It's not synchronized, not the decodeBytes is an AtomicInteger.

I did a threadDump, which is available here : http://pastebin.com/wPNPL1FY

It shows that 15 threads are blocked on monitors :

pool-24-thread-1 Task 5 51326976 5680140288 waiting on condition [152900000] pool-22-thread-1 Task 5 50612224 5684236288 waiting on condition [152ce8000] pool-14-thread-4 Task 5 56862720 5715816448 waiting on condition [154b06000] net.sf.ehcache.CacheManager@6f7cf6b6 Daemon 5 27879424 5657616384 in Object.wait() [151385000]
main    Task    5       50333696        4300214272      waiting on condition    
[1004fe000]
kdcReplayCache.data Daemon 5 50774016 5658677248 waiting on condition [151488000] groupCache.data Daemon 5 50778112 5660798976 waiting on condition [15168e000] changePwdReplayCache.data Daemon 5 27881472 5659738112 waiting on condition [15158b000] SynchWorkerThread Task 5 35258368 5652594688 in Object.wait() [150ebb000] Surefire fork timeout timer Task 5 26988544 5647831040 in Object.wait() [150a30000] Reference Handler Daemon 10 50362368 5634641920 in Object.wait() [14fd9c000]
Finalizer       Daemon  8       50364416        5635702784      in 
Object.wait()        [14fe9f000]
FelixStartLevel Daemon 5 35293184 5670658048 in Object.wait() [151ff5000] FelixPackageAdmin Daemon 5 33517568 5655359488 in Object.wait() [15115e000] FelixDispatchQueue Task 5 28618752 5661859840 in Object.wait() [151791000]


I have *no idea* why the test get blocked, I see no rational reason for it to be blocked. I'll add the thread-id n the logs to see which thread it is running on...

Can felix be an issue here ?

--
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Reply via email to