[ 
https://issues.apache.org/jira/browse/JCR-929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496810
 ] 

Ian Boston commented on JCR-929:
--------------------------------


When the MBean monitor code inspects waiting threads, it finds 2 threads 
that have gone into a permanent wait state.

The ClusterNode thread (used by the cluster node to replay journal entries) 
goes into a permanent wait state on a ReentrantLock object within the 
LockManagerImpl:

Thread ClusterNode-localhost2 waiting by [EMAIL PROTECTED] ::WAITING at EDU.oswego.cs.dl.util.concurrent.ReentrantLock.acquire(null:-1)
     at java.lang.Object.wait(Object.java:-2)
     at java.lang.Object.wait(Object.java:474)
     at EDU.oswego.cs.dl.util.concurrent.ReentrantLock.acquire(null:-1)
     at org.apache.jackrabbit.core.lock.LockManagerImpl.acquire(LockManagerImpl.java:599)
     at org.apache.jackrabbit.core.lock.LockManagerImpl.nodeAdded(LockManagerImpl.java:838)

The HTTP threads all go into a permanent wait state (when they access the 
JCR) in AbstractJournal.lockAndSync, on a WriterLock:

Thread http-8580-Processor23 waiting by [EMAIL PROTECTED] ::WAITING at EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$WriterLock.acquire(null:-1)
     at java.lang.Object.wait(Object.java:-2)
     at java.lang.Object.wait(Object.java:474)
     at EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$WriterLock.acquire(null:-1)
     at org.apache.jackrabbit.core.journal.AbstractJournal.lockAndSync(AbstractJournal.java:228)
     at org.apache.jackrabbit.core.journal.DefaultRecordProducer.append(DefaultRecordProducer.java:51)
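
For anyone wanting to collect similar reports, here is a minimal sketch of 
listing WAITING threads through the standard java.lang.management API 
(available since JDK 1.5). It is not the monitor code used above; the class 
name WaitingThreadReport, the method name and the stack depth of 5 are 
assumptions made for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Jackrabbit.
public class WaitingThreadReport {

    /** Returns one multi-line entry per live thread in the WAITING state. */
    public static List<String> describeWaitingThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<String>();
        // Ask for the top 5 frames of every live thread (depth is an assumption).
        ThreadInfo[] infos = mx.getThreadInfo(mx.getAllThreadIds(), 5);
        for (ThreadInfo info : infos) {
            // Entries are null for threads that died between the two calls.
            if (info == null || info.getThreadState() != Thread.State.WAITING) {
                continue;
            }
            StringBuilder sb = new StringBuilder();
            sb.append("Thread ").append(info.getThreadName())
              .append(" waiting on ").append(info.getLockName());
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("\n     at ").append(frame);
            }
            report.add(sb.toString());
        }
        return report;
    }

    public static void main(String[] args) {
        for (String entry : describeWaitingThreads()) {
            System.out.println(entry);
        }
    }
}
```

Taking two such snapshots some time apart and comparing them shows which 
threads never leave the WAITING state.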
 


LockManagerImpl.acquire contains a for(;;) loop that will loop forever if 
the lock is not acquired. I am putting some debug output in the catch block 
to see whether the loop is spinning or the wait is forever.
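
A rough sketch of the kind of retry loop and debug output meant here. This 
is not the actual LockManagerImpl source: the class name RetryingAcquire is 
hypothetical, and java.util.concurrent.locks.ReentrantLock stands in for 
the EDU.oswego ReentrantLock seen in the stack traces.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for an acquire-until-success loop.
public class RetryingAcquire {

    /**
     * Retries until the lock is held by the calling thread.
     * Returns how many times the wait was interrupted along the way.
     */
    public static int acquire(ReentrantLock lock) {
        int interrupts = 0;
        for (;;) {
            try {
                lock.lockInterruptibly();
                return interrupts;
            } catch (InterruptedException e) {
                // Debug output: each pass through here is one spin of the loop.
                interrupts++;
                System.err.println("acquire interrupted " + interrupts
                        + " time(s) in " + Thread.currentThread().getName());
            }
        }
    }
}
```

If the message is printed over and over, the loop is spinning. If it never 
appears while the thread stays in WAITING, the thread is blocked inside the 
lock's acquire call itself.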


I will also try to track down the objects being waited on, to see if they 
give any clues as to what is effectively deadlocking.


> Under Heavy load in a Cluster HTTP Threads Block and stall requests
> -------------------------------------------------------------------
>
>                 Key: JCR-929
>                 URL: https://issues.apache.org/jira/browse/JCR-929
>             Project: Jackrabbit
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.3
>         Environment: 2 Node Cluster, OSX, JDK 1.5 with DatabaseJournal, 
> DatabasePersistanceManager, all content in DB, using WebDAV to load
>            Reporter: Ian Boston
>
> Under heavy load, created by mounting both nodes in the cluster in OSX Finder 
> and then uploading large numbers of files (a few thousand) to each node at 
> the same time, eventually one of the nodes stops responding and the Finder 
> mount times out and disconnects.
> Once that happens, that node becomes unusable.
> Further mount attempts will prompt for a password, indicating HTTP is still 
> running, but will time out once the connection is authenticated.
> Access by a web browser will prompt for a password, connect and provide a 
> once-only listing of any collection in the workspace. If you try to refresh 
> that collection, the HTTP request hangs forever.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
