On May 31, 2012, at 3:39 PM, Greg Farnum wrote:
>> 
>> Nevermind to my last comment. Hmm, I've seen this, but very rarely.
> Noah, do you have any leads on this? Do you think it's a bug in your Java 
> code or in the C/++ libraries?

I _think_ this is because the JVM uses its own threading library, and Ceph 
assumes pthreads and pthread-compatible mutexes--is that assumption about Ceph 
correct? That would explain why the error looks like Mutex::lock(bool) being 
referenced for context during the segfault. To verify this, all that is needed 
is to add some synchronization to the Java code.
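
As a rough sketch of the kind of synchronization I mean (NativeCallGate and 
its methods are made-up names for illustration, not part of the Java patches 
or of Ceph), every call into the native library could go through one 
process-wide lock, so that only one Java thread is inside the C/C++ code at a 
time. If the segfault disappears with something like this in place, that 
supports the threading theory:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: funnels every "native" call through a single lock.
class NativeCallGate {
    private static final Object LOCK = new Object();

    // Runs body while holding the global lock. In a real wrapper the body
    // would be the JNI call; here it is any Callable.
    static <T> T call(Callable<T> body) throws Exception {
        synchronized (LOCK) {
            return body.call();
        }
    }

    // Demo: starts several workers that all make gated calls and returns the
    // largest number of workers ever observed inside the gate at once.
    static int maxConcurrency(int threads, int callsPerThread) {
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    call(() -> {
                        int now = inside.incrementAndGet();
                        max.accumulateAndGet(now, Math::max);
                        inside.decrementAndGet();
                        return null;
                    });
                }
                return null;
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return max.get();
    }
}
```

This is deliberately coarse: it serializes everything, so it is only meant as 
a diagnostic, not as the final locking scheme.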

There are only two segfaults that I've ever encountered, one in which the C 
wrappers are used with an unmounted client, and the error Nam is seeing 
(although they could be related). I will re-submit an updated patch for the 
former, which should rule that out as the culprit.
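
For the unmounted-client case, the general idea is to refuse the call on the 
Java side before it ever reaches the C wrappers. The sketch below only 
illustrates that idea and is not necessarily what the updated patch does; 
GuardedClient, its methods, and the placeholder return value are all invented 
for the example:

```java
// Hypothetical wrapper: tracks mount state and rejects calls made while
// unmounted instead of passing them to native code.
class GuardedClient {
    private boolean mounted = false;

    // Placeholder for the native mount call.
    synchronized void mount() {
        mounted = true;
    }

    // Placeholder for the native unmount call.
    synchronized void unmount() {
        mounted = false;
    }

    // Placeholder for a native operation; fails fast when not mounted.
    synchronized String stat(String path) {
        if (!mounted) {
            throw new IllegalStateException("client is not mounted");
        }
        return "stat:" + path; // stand-in for the real native result
    }
}
```

With a guard like this, misuse shows up as a Java exception with a stack 
trace rather than as a segfault inside the JVM.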

Nam: where are you grabbing the Java patches from? I'll push some updates.


The only other scenario that comes to mind is related to signaling:

The RADOS Java wrappers suffered from an interaction between the JVM and RADOS 
client signal handlers, in which either the JVM or RADOS would replace the 
handlers for the other (not sure which order). Anyway, the solution was to link 
in the JVM's libjsig.so signal-chaining library. This might be the same thing we 
are seeing here, but I'm betting it is the first theory I mentioned.
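
For anyone who wants to test the signal theory without relinking anything, 
libjsig.so can also be preloaded through LD_PRELOAD when the JVM is started. 
Below is a small sketch that prepares such a launch from Java. The class and 
method names are made up, and the library path is an assumption that has to 
be adjusted to wherever the local JDK keeps libjsig.so:

```java
// Hypothetical helper: builds a child-process launch with libjsig.so
// preloaded so that JVM and native signal handlers are chained.
class SignalChaining {
    // libjsigPath is supplied by the caller; nothing here locates the file.
    static ProcessBuilder withPreload(String libjsigPath, String... command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        String previous = pb.environment().get("LD_PRELOAD");
        // Put libjsig first and keep anything that was already preloaded.
        String value = (previous == null || previous.isEmpty())
                ? libjsigPath
                : libjsigPath + ":" + previous;
        pb.environment().put("LD_PRELOAD", value);
        return pb;
    }
}
```

Calling start() on the returned ProcessBuilder launches the command with the 
modified environment. Setting LD_PRELOAD in the shell before running java 
has the same effect.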

- Noah