[ https://issues.apache.org/jira/browse/JCR-929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496810 ]
Ian Boston commented on JCR-929:
--------------------------------
When the MBean monitor code looks at waiting threads, two threads are found
to go into a permanent wait state.
The ClusterNode thread (used by the cluster node to replay journal entries)
goes into a permanent wait state on a ReentrantLock object within the
LockManagerImpl:
Thread ClusterNode-localhost2 waiting by [EMAIL PROTECTED] ::WAITING at EDU.oswego.cs.dl.util.concurrent.ReentrantLock.acquire(null:-1)
    at java.lang.Object.wait(Object.java:-2)
    at java.lang.Object.wait(Object.java:474)
    at EDU.oswego.cs.dl.util.concurrent.ReentrantLock.acquire(null:-1)
    at org.apache.jackrabbit.core.lock.LockManagerImpl.acquire(LockManagerImpl.java:599)
    at org.apache.jackrabbit.core.lock.LockManagerImpl.nodeAdded(LockManagerImpl.java:838)
The HTTP threads also all go into a permanent wait state (when they access the
JCR) in AbstractJournal.lockAndSync, on a WriterLock:
Thread http-8580-Processor23 waiting by [EMAIL PROTECTED] ::WAITING at EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$WriterLock.acquire(null:-1)
    at java.lang.Object.wait(Object.java:-2)
    at java.lang.Object.wait(Object.java:474)
    at EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$WriterLock.acquire(null:-1)
    at org.apache.jackrabbit.core.journal.AbstractJournal.lockAndSync(AbstractJournal.java:228)
    at org.apache.jackrabbit.core.journal.DefaultRecordProducer.append(DefaultRecordProducer.java:51)
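For anyone wanting to reproduce this kind of listing, here is a minimal sketch of how waiting threads can be enumerated with the java.lang.management API available in JDK 1.5. It is not the monitor code used for the traces in this comment. The class name WaitingThreadMonitor and its method are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Jackrabbit): lists threads in the WAITING state. */
final class WaitingThreadMonitor {

    /** Returns the names of all live threads currently in Thread.State.WAITING. */
    static List<String> waitingThreadNames() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> names = new ArrayList<String>();
        // Integer.MAX_VALUE asks for the full stack trace of each thread.
        ThreadInfo[] infos = mx.getThreadInfo(mx.getAllThreadIds(), Integer.MAX_VALUE);
        for (ThreadInfo info : infos) {
            // A thread may have exited between the two calls; its entry is then null.
            if (info != null && info.getThreadState() == Thread.State.WAITING) {
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : waitingThreadNames()) {
            System.out.println("Thread " + name + " ::WAITING");
        }
    }
}
```

A thread that shows up in this list on every poll over a long period is a candidate for a permanent wait. A single snapshot cannot distinguish that from a normal idle wait.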
LockManagerImpl.acquire contains a for(;;) loop that will loop forever if
the lock is not acquired. I am putting some debug output in the catch block to
see whether the loop is spinning or the wait itself never returns.
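As an illustration of that debugging approach, here is a minimal sketch of a retrying acquire loop with a counter in the catch block. It is not the actual LockManagerImpl code. It assumes the loop retries on InterruptedException, and it substitutes java.util.concurrent.locks.ReentrantLock for the EDU.oswego ReentrantLock seen in the trace. The class name RetryingAcquire is hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for an acquire loop; not the Jackrabbit implementation. */
final class RetryingAcquire {

    // Assumption: java.util.concurrent lock used in place of the EDU.oswego one.
    private final ReentrantLock lock = new ReentrantLock();

    /**
     * Loops until the lock is acquired and returns how many attempts
     * were interrupted along the way.
     */
    int acquire() {
        int interrupted = 0;
        for (;;) {
            try {
                lock.lockInterruptibly();
                return interrupted;
            } catch (InterruptedException e) {
                // Debug output: repeated lines here mean the loop is spinning.
                interrupted++;
                System.err.println("acquire interrupted, retry #" + interrupted
                        + " on " + Thread.currentThread().getName());
            }
        }
    }

    void release() {
        lock.unlock();
    }
}
```

If the debug line is printed repeatedly, the loop is spinning. If nothing is printed while the thread stays in WAITING, the thread is parked inside the lock and the lock is never being released.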
I will also try to track down the objects being waited on to see if they give
any clues as to what is effectively deadlocking.
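One possible way to identify the waited-on objects is ThreadInfo.getLockName() and getLockOwnerName() from java.lang.management (JDK 1.5). The sketch below is only an assumption about how that could be done, not something already tried on this issue. The class name WaitTargetReport is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper (not part of Jackrabbit): reports what a thread waits on. */
final class WaitTargetReport {

    /**
     * Returns a description of the object the thread is blocked or waiting on,
     * or null if the thread is not waiting on anything (or no longer alive).
     */
    static String describe(Thread thread) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo info = mx.getThreadInfo(thread.getId());
        if (info == null || info.getLockName() == null) {
            return null;
        }
        // The owner is null when no thread currently holds the object's monitor,
        // which is the usual case for a thread parked in Object.wait().
        String owner = info.getLockOwnerName();
        return info.getThreadName() + " waits on " + info.getLockName()
                + " owned by " + (owner == null ? "<nobody>" : owner);
    }
}
```

Two threads reporting each other as the owner of the object they wait on would indicate a classic deadlock. With wait/notify based locks such as the EDU.oswego ones, the owner is often not reported, so the lock name alone may be the only clue.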
> Under Heavy load in a Cluster HTTP Threads Block and stall requests
> -------------------------------------------------------------------
>
> Key: JCR-929
> URL: https://issues.apache.org/jira/browse/JCR-929
> Project: Jackrabbit
> Issue Type: Bug
> Components: core
> Affects Versions: 1.3
> Environment: 2 Node Cluster, OSX, JDK 1.5 with DatabaseJournal,
> DatabasePersistenceManager, all content in DB, using WebDAV to load
> Reporter: Ian Boston
>
> Under heavy load created by mounting both nodes in the cluster in OSX Finder
> and then uploading large numbers of files to each node at the same time (a
> few thousand), eventually one of the nodes stops responding and the Finder
> mount times out and disconnects.
> Once that happens that node becomes unusable.
> Further mount attempts will prompt for a password, indicating HTTP is still
> running, but will time out once the connection is authenticated.
> Access from a web browser will prompt for a password, connect, and provide a
> one-time listing of any collection in the workspace. If you try to refresh
> that collection, the HTTP request hangs forever.