I'm working on integrating a fairly large legacy system with an EJB server. I've been using Inline::Java 0.50. Lately I've been seeing a lot of JNI failures under load.

I've been investigating some mysterious malloc failures that occurred in the JVM during hostname lookups. I found this bug report:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6369541

And a libc-related bug filed at Red Hat (we're using Red Hat Enterprise Linux 4):
http://sources.redhat.com/bugzilla/show_bug.cgi?id=156

My symptoms are pretty similar. I have a Perl process with a JVM running in JNI mode. Under load, I see random SEGVs. The HotSpot error log shows failures in the libc name resolution libraries. So far I've tried the latest JVM from Sun. I've yet to test this with JRockit. Has anyone else experienced this problem? Is the only path forward really making name resolution synchronised (as discussed in some forums linked to from the bug report on sun.com)?
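
For reference, "synchronised" name resolution would amount to something like the sketch below: every lookup goes through a single lock so that only one thread is inside the resolver at a time. The class name SerializedResolver and its resolve method are made up for illustration; only InetAddress.getByName is a real JDK call. Note that this only covers lookups the application routes through the wrapper. Lookups made internally by the JVM or by other libraries would bypass it, so this may not be a complete workaround.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper (not part of the JDK or Inline::Java): funnels all
// hostname lookups made through it behind one process-wide lock.
final class SerializedResolver {
    // Single lock shared by every caller of resolve().
    private static final Object LOCK = new Object();

    private SerializedResolver() {
        // Static helper only; no instances.
    }

    // Resolve a host name or literal IP address while holding the lock,
    // so that at most one thread at a time calls into the resolver
    // via this wrapper.
    static InetAddress resolve(String host) throws UnknownHostException {
        synchronized (LOCK) {
            return InetAddress.getByName(host);
        }
    }
}
```

Application code would then call SerializedResolver.resolve(host) wherever it currently calls InetAddress.getByName(host) directly.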


On a possibly related note, the failure below is happening on the new build box. What exactly is the shared_fork test trying to do? From a quick look at the code, it instantiates an object, forks off a bunch of processes, and modifies the integer in the instantiated object. It doesn't call wait or waitpid; it just does a sleep(1). Is this just a simple race condition? It used to pass, but that was on a single-CPU machine (the build box has 4 CPUs). I'm still digging to see what's going on in the test classes, but I'm not sure what the point of the test is. I really think the processes are completing out of order.


t/08_study.............ok
t/09_usages............ok
t/10_1_shared_alone....ok
t/10_2_shared_start....ok
t/10_3_shared_use......ok
t/10_4_shared_stop.....ok
t/10_5_shared_fork.....ok 5/8# Test 6 got: "14" (t/10_5_shared_fork.t at line 54)
t/10_5_shared_fork.....NOK 6
#  t/10_5_shared_fork.t line 54 is: ok($t10::t10::i, $sum) ;
t/10_5_shared_fork.....FAILED test 6
        Failed 1/8 tests, 87.50% okay
t/10_6_shared_sim......ok
t/11_exceptions........ok
t/12_1_callbacks.......ok
t/13_end...............ok
Failed Test          Stat Wstat Total Fail  Failed  List of Failed
------------------------------------------------------------------------
t/10_5_shared_fork.t                8    1  12.50%  6
Failed 1/18 test scripts, 94.44% okay. 1/314 subtests failed, 99.68% okay.

Any thoughts or feedback would be appreciated.


--
J.


