On Sat, 23 Aug 2008, Felix Schwarz wrote:

>> Your function has some contradictions:
>> - initVM() must be called from the main thread
>> - attachCurrentThread() makes sense only when called from threads other than the main one

> The main thread is the thread which creates the other threads, right?

The main thread is the thread where your program starts. It's the first thread, the thread that creates the second thread, the thread that runs your program's C entry point, main(). But any thread can create more threads, not just the main thread.
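
To illustrate the point about thread creation, here is a minimal sketch using only Python's standard threading module. It assumes Python 3.4 or later for threading.main_thread(), and the function names (describe, worker, run) are made up for this example. It records, for each thread, whether it is the main thread, and has a worker thread start a further thread of its own.

```python
import threading


def describe(results, label):
    # Record whether the thread running this call is the main thread.
    results[label] = threading.current_thread() is threading.main_thread()


def worker(results):
    describe(results, "worker")
    # A non-main thread is free to start more threads itself.
    child = threading.Thread(target=describe, args=(results, "grandchild"))
    child.start()
    child.join()


def run():
    results = {}
    describe(results, "caller")
    t = threading.Thread(target=worker, args=(results,))
    t.start()
    t.join()
    return results


if __name__ == "__main__":
    print(run())
```

When run as a script, only the "caller" entry is expected to be True, since that call happens on the thread that started the program.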

> And after initVM is called, lucene.getVMEnv() always returns a valid environment (for all threads), right?

There are two environments at work here. The one returned by initVM() or getVMEnv() is of Python type JCCEnv. There is only one C++ instance of this env per process. It's a singleton global variable set in jcc.cpp by initVM().

The other type of env is the Java VM's env and there is one per thread. Calling jccenv.attachCurrentThread() gets the Java VM env for your thread put into the thread-local storage of the C++ instance of jccenv. This per-thread Java VM env is used in each and every Java Native Interface (JNI) call.

Why all these details? You're asking if getVMEnv() always returns a valid env. Well, I'm quite sure the JCC env is valid all along, but I cannot make any guarantees about the Java VM env contained in its thread-local storage. It should be valid as long as your thread is valid and as long as attachCurrentThread() was called before any other Java calls were made from that thread.
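
The relationship between the two environments can be pictured with a toy model in pure Python. This is only a sketch of the idea, not JCC's actual code: ModelEnv, attach_current_thread() and vm_env() are hypothetical names, and a string stands in for the per-thread Java VM env. One shared object holds a thread-local slot, and a thread only finds something in its slot after it has attached.

```python
import threading


class ModelEnv:
    """Toy stand-in for the single process-wide env (NOT the real JCCEnv)."""

    def __init__(self):
        # One slot per thread, playing the role of the thread-local storage.
        self._tls = threading.local()

    def attach_current_thread(self):
        # Hypothetical: store a per-thread value for the calling thread only.
        self._tls.vm_env = "vm-env-for-%s" % threading.current_thread().name

    def vm_env(self):
        # None means the calling thread never attached.
        return getattr(self._tls, "vm_env", None)


ENV = ModelEnv()  # singleton shared by all threads


def probe(attach, out, key):
    if attach:
        ENV.attach_current_thread()
    out[key] = ENV.vm_env()


def demo():
    out = {}
    threads = [
        threading.Thread(target=probe, args=(True, out, "attached"), name="t-attached"),
        threading.Thread(target=probe, args=(False, out, "unattached"), name="t-unattached"),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


if __name__ == "__main__":
    print(demo())
```

Both threads see the same ENV object, but only the thread that attached gets a per-thread value back; the other one sees None.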

> In this case, my function works because I call it once before starting the server (which creates the worker threads) and all workers just do attachCurrentThread.
>
> But thanks for the hint, I will rework it to make my code more obvious.
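
One way to make that pattern explicit is sketched below. It assumes only the calls named in this thread (initVM() once in the main thread, then getVMEnv().attachCurrentThread() in each worker). The helper names java_worker and start_workers are hypothetical, and the env getter is passed in as a parameter so the sketch does not depend on how PyLucene is installed.

```python
import threading


def java_worker(get_env, task):
    """Wrap task so that the thread running it attaches to the JVM first."""
    def run(*args):
        # Must happen in the worker thread itself, before any Java call.
        get_env().attachCurrentThread()
        return task(*args)
    return run


def start_workers(get_env, task, count):
    """Start count threads, each calling task(i) after attaching."""
    threads = [
        threading.Thread(target=java_worker(get_env, task), args=(i,))
        for i in range(count)
    ]
    for t in threads:
        t.start()
    return threads


# With PyLucene installed, usage would look roughly like this (the arguments
# to initVM() depend on the PyLucene version and are left out here):
#
#   import lucene
#   lucene.initVM(...)   # once, in the main thread
#   for t in start_workers(lucene.getVMEnv, handle_request, 4):
#       t.join()
#
# handle_request is a placeholder for whatever the workers actually do.
```

Keeping initVM() out of the helper makes it obvious that it is a one-time, main-thread call, while the attach step travels with each worker.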

Since it's so easy for you to reproduce the bug, it'd be interesting to know what the values of the obj and id parameters are in JCCEnv::deleteGlobalRef() at the time of the crash. The latest JCC in trunk [1] has better support for compiling for debugging when using --debug. I'd first get that JCC from trunk built with debug enabled too (adding -O0 and -g to CFLAGS for your platform and invoking setup.py with --debug). If id is not NULL, then what is the value of iter->second.global (see JCCEnv.cpp)? (I guess id is not NULL because the complaint is about a call to DeleteGlobalRef().)

And last but not least, what version of gcc are you using? What compile flags did you use when you built JCC itself? (Yes, Python's distutils sets most of these; what are they?) Does switching from -O3 to -O2 or -O0 (no optimizations) make the problem go away? (Whenever you switch compile flags, you need to rebuild JCC, since the compile flags JCC was compiled with are reused by JCC itself; see the config.py file created by JCC's setup.py at JCC build time.)
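
As a quick way to see which flags Python itself hands to extension builds, the following sketch reads them from the standard sysconfig module. The function name build_flags is made up for this example. It assumes a Python recent enough to have sysconfig; on older versions such as 2.4, distutils.sysconfig.get_config_var() plays the same role. Some variables may come back as None, notably on Windows.

```python
import sysconfig


def build_flags(names=("CC", "OPT", "CFLAGS", "LDSHARED")):
    """Return the build-time settings Python recorded for extension modules."""
    # get_config_var() returns None for variables unknown on this platform.
    return {name: sysconfig.get_config_var(name) for name in names}


if __name__ == "__main__":
    for name, value in build_flags().items():
        print("%s = %s" % (name, value))
```

Look for -O3, -O2 or -O0 in the OPT and CFLAGS values. These are Python's defaults only; any flags added by JCC's own setup.py would not show up here.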

In an earlier message you said:

> I'm using Python 2.4 on CentOS 5.2 (i386 and x86_64) with OpenJDK
> (java-1.6.0-openjdk-1.6.0.0-0.20.b11.el5). However I can reproduce the bug
> with Python 2.5 on Fedora 9 (x86_64) and Python 2.4 on Windows (i386).

Have you tried using a different JDK/JRE on CentOS 5.2 (such as Sun's 1.5 or 1.6)? Is the error you're getting on Windows the same? Which JRE are you using on Windows, OpenJDK 6 as well? If you've hit a JRE bug here, switching JREs may 'solve' the problem.

If, after you've tried all of the above, the problem is still neither resolved nor worked around because isolating it is too difficult, and you can give me ssh access to your machine so that I can reproduce and debug the problem remotely from a command line, I'd be willing to spend some time investigating it.

Andi..

[1] http://svn.osafoundation.org/pylucene/trunk/jcc/jcc