Dominique,
I have tried Ctrl+\, kill -3, and a JMX deadlock monitor class that
looks for deadlocked threads every 500ms, but none of them shows any,
which makes me think that it's not a deadlock. JProfiler also has a
deadlock detector (which might be more reliable than my code :) ) and
so far it hasn't found any either, so perhaps it's not a deadlock. I'm
not certain what else it could be, since there is no CPU activity.
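For reference, a minimal sketch of the kind of JMX deadlock poll described above. This is not the actual monitor class; the class name DeadlockPoller and its methods are made up for illustration, and it assumes Java 6 or later (on Java 5 only findMonitorDeadlockedThreads() is available). It reports only lock cycles, so a thread parked in Object.wait() or stuck in a socket read would not show up here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not the monitor class used in this thread.
final class DeadlockPoller {
    private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

    /** Returns a description of deadlocked threads, or "" if none are found. */
    public static String check() {
        long[] ids = THREADS.findDeadlockedThreads(); // null when no deadlock exists
        if (ids == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : THREADS.getThreadInfo(ids, Integer.MAX_VALUE)) {
            if (info != null) { // a thread may have died since the check
                sb.append(info);
            }
        }
        return sb.toString();
    }

    /** Starts a daemon thread that calls check() every intervalMs milliseconds. */
    public static Thread startDaemon(final long intervalMs) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    String report = check();
                    if (!report.isEmpty()) {
                        System.err.println("DEADLOCK:\n" + report);
                    }
                    Thread.sleep(intervalMs);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop polling
            }
        }, "deadlock-poller");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

Calling DeadlockPoller.startDaemon(500) once at startup would match the 500ms interval mentioned above.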
I'm going to go back to basics, instrument the code and produce some big
log files. (Just hoping that doesn't generate a workaround!)
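The instrumentation could be as simple as the sketch below, which logs on entry and warns on a slow exit, so that a call which never returns shows up as an entry with no matching exit. The class name TimedSection, the label strings and the threshold are all assumptions for illustration; nothing here is Jackrabbit API.

```java
import java.util.function.Supplier;
import java.util.logging.Logger;

// Hypothetical timing wrapper for suspect code sections.
final class TimedSection {
    private static final Logger LOG = Logger.getLogger(TimedSection.class.getName());

    /** Runs body, logging entry at FINE and a warning if it takes warnMillis or longer. */
    public static <T> T run(String label, long warnMillis, Supplier<T> body) {
        String thread = Thread.currentThread().getName();
        LOG.fine("enter " + label + " on " + thread);
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
            if (elapsedMs >= warnMillis) {
                LOG.warning(label + " took " + elapsedMs + " ms on " + thread);
            }
        }
    }
}
```

Since extra logging changes timing, it is worth keeping the wrapped sections few and small.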
Ian
Dominique Pfister wrote:
Hi Ian,
have you been able to generate a thread dump of the stalled node, at
the moment it doesn't appear to respond any more? That might help...
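If signalling the JVM is awkward, a dump can also be taken from inside the process. A rough sketch, assuming it is acceptable to add a small helper class to the webapp (the name ThreadDumper is made up here):

```java
import java.util.Map;

// Hypothetical helper that formats the stack of every live thread.
final class ThreadDumper {
    /** Returns name, state and stack frames for each live thread. */
    public static String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=")
              .append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}
```

Threads in WAITING or TIMED_WAITING state near the persistence or journal code would be the first thing to look at in the output.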
Kind regards
Dominique
On 5/15/07, Ian Boston <[EMAIL PROTECTED]> wrote:
Hi,
I've been doing some testing of a 2-node Jackrabbit cluster using 1.3
(with the JCR-915 patch), but I am getting some strange behaviour.
I use OSX Finder to mount a DAV service from each node and then upload
lots of files to each DAV mount at the same time. All goes OK for the
first few thousand files, and then one of the nodes stops responding to
that session. The other node continues and finishes.
Eventually OSX disconnects the stalled node.
When I try the port of the apparently stalled cluster node, it still
responds, but with some strange behaviour.
A remount attempt responds with a 401 and forces basic login, but stalls
after that point. (the URL is to the base of a workspace)
If I open Firefox and access the DAV servlet, I can navigate
down the directory tree, but if I try to refresh any JCR folder or JCR
file that I have already visited (since the cluster node has been up),
Firefox spins forever.
I have put a deadlock detector class into both nodes (a Java class that
looks for deadlocks through JMX) but it doesn't detect anything.
I have also used JProfiler connected to one node, but it never detects a
deadlock.
I have tried all of this in single-node mode, with no Journal or
ClusterNode, and have not been able to re-create the problem (yet).
The one thing that I have seen in JProfiler is threads blocked waiting
for an ItemState monitor inside Jackrabbit, but never for more than
500ms.
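The same kind of observation can be made without a profiler by asking the JVM which threads are currently BLOCKED on a monitor and who holds it. A minimal sketch, with the class name BlockedThreadReport invented for illustration:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper listing threads that are waiting to enter a monitor.
final class BlockedThreadReport {
    /** Returns one line per BLOCKED thread: its name, the lock and the lock owner. */
    public static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> out = new ArrayList<>();
        // maxDepth 0: lock details only, no stack frames
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds(), 0)) {
            if (info == null) {
                continue; // thread ended between the two calls
            }
            if (info.getThreadState() == Thread.State.BLOCKED) {
                out.add(info.getThreadName() + " blocked on " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return out;
    }
}
```

Sampling this repeatedly while the node is stalled would show whether the same monitor and owner keep appearing.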
I am using the standard DatabaseJournal and the
SimpleDbPersistenceManager; however, I see the same thing happening with
the FileJournal.
Any ideas? I might put some very simple debug logging in near the monitor
that was blocking for 500ms.
I did search JIRA but couldn't find anything that was a close match.
Ian