Nice to see that your comments do come from some understanding of the issues. There have been a number of times in the past when people have gone off saying things about multiple interpreters without really knowing what they were talking about, just echoing what someone else had said. Some of the things being said were often just wrong, though. It just gets annoying. :-(

Anyway, a few comments below with pointers to some documentation on various issues, plus details of other issues I know of.

On Feb 3, 6:38 pm, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > If you are going to make a comment such as 'multi-interpreter feature
> > doesn't really work' you really should substantiate it by pointing to
> > where it is documented what the problems are or enumerate yourself
> > exactly what the issues are. There is already enough FUD being spread
> > around about the ability to run multiple sub interpreters in an
> > embedded Python application, so adding more doesn't help.
>
> I don't think the limitations have been documented in a systematic
> manner. Some of the problems I know of are:
> - objects can easily get shared across interpreters, and often are.
>   This is particularly true for static variables that extensions keep,
>   and for static type objects.

Yep, but that is basically a problem with how people write C extension modules, i.e., they don't write them with multiple interpreters in mind.

Until the code was fixed recently in trunk, one high profile module which had this sort of problem was psycopg2. I am not sure whether there has been an official release yet which includes the fix. From memory, the problem they had was that a static variable was caching a reference to the type object for Decimal from the interpreter which first loaded and initialised the module. That type object was then used to create instances of the Decimal type which were passed to other interpreters. These Decimal instances would then fail isinstance() checks within those other interpreters.

Some details about this are in section 'Multiple Python Sub Interpreters' of:

http://code.google.com/p/modwsgi/wiki/ApplicationIssues

That section of the documentation also highlights some of the other errors that can arise where file objects in particular are somehow shared between interpreters, plus issues when unmarshalling data.
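To make the Decimal caching problem above a bit more concrete, here is a rough pure-Python simulation. It does not create real sub interpreters (that can only be done through the C API); instead a hypothetical make_interpreter_decimal() builds a fresh class per call to stand in for each interpreter having its own Decimal type object, and a module-level global stands in for the C static variable. All the names are made up for illustration and are not taken from psycopg2.

```python
# Simulation only: each call returns a distinct class object, standing in
# for the separate Decimal type that each sub interpreter would have.
def make_interpreter_decimal():
    class Decimal:
        def __init__(self, value):
            self.value = value
    return Decimal

# Plays the role of a C static variable: process-wide and filled only once.
_cached_decimal_type = None

def adapt_number(value, interpreter_decimal):
    """Mimic an extension that caches the first Decimal type it is given."""
    global _cached_decimal_type
    if _cached_decimal_type is None:
        _cached_decimal_type = interpreter_decimal
    return _cached_decimal_type(value)

decimal_a = make_interpreter_decimal()   # type object in "interpreter A"
decimal_b = make_interpreter_decimal()   # type object in "interpreter B"

from_a = adapt_number("1.5", decimal_a)  # first caller populates the cache
from_b = adapt_number("2.5", decimal_b)  # still built from A's cached type

print(isinstance(from_a, decimal_a))     # check made inside "interpreter A"
print(isinstance(from_b, decimal_b))     # check made inside "interpreter B"
```

The first check succeeds and the second fails, because the value handed to "interpreter B" is an instance of the type cached from "interpreter A". Looking the type up per interpreter rather than caching it process-wide avoids this.
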

You might also read section 'Application Environment Variables' of that document. This talks about the problem of leakage of environment variables between sub interpreters. There probably isn't much that one can do about it, as one needs to push changes to os.environ into the C environment variables so that various system library calls will get them, but it is still quite annoying that variables set in one interpreter then show up in interpreters created after that point. It means that keeping environment variable changes local to a single sub interpreter is impossible.

> - Py_EndInterpreter doesn't guarantee that all objects are released,
>   and may leak. This is the problem that the OP seems to have.
>   All it does is to clear modules, sys, builtins, and a few other
>   things; it is then up to reference counting and the cycle GC
>   whether this releases all memory or not.

There is another problem with deleting interpreters and then creating new ones. This is where a C extension module doesn't hold its own reference (i.e. doesn't increment the reference count) on Python objects it creates and stores in static variables. When the interpreter is destroyed and the objects that can be destroyed are destroyed, it may destroy these objects which are still referenced by the static variables. When a subsequent interpreter is created which tries to use the same C extension module, that static variable now contains a dangling, invalid pointer to unused or reused memory.

PEP 3121 could help with this by making it more obvious what requirements exist on C extension modules to cope with such issues.

I don't know whether it is a fundamental problem with the tool or with how people use it, but Pyrex generated code also seems to do this. This was showing up in PyProtocols in particular when attempts were made to recycle interpreters within the lifetime of a process. Other packages having the problem were psycopg2 again, lxml and possibly the subversion bindings.

Some details on this can be found in section 'Reloading Python Interpreters' of that document.
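As a loose analogy for the static variable problem described above, the following Python sketch uses a weak reference to play the role of a C static pointer stored without its own reference count. The names (init_extension, TypeLikeObject, the dict standing in for interpreter state) are hypothetical, and it relies on CPython's reference counting to free the object promptly. In real C code the stale pointer would not conveniently turn into None; it would point at freed or reused memory.

```python
import gc
import weakref

class TypeLikeObject:
    """Hypothetical stand-in for an object an extension creates at init time."""

# Plays the role of a C static pointer that was stored without Py_INCREF.
_static_cache = None

def init_extension(interpreter_state):
    """Mimic module initialisation that reuses a process-wide static."""
    global _static_cache
    if _static_cache is None:
        obj = TypeLikeObject()
        # The "interpreter" holds the only strong reference to the object.
        interpreter_state["objects"] = [obj]
        # "Borrowed" reference: it does not keep the object alive.
        _static_cache = weakref.ref(obj)
    return _static_cache()

first_interpreter = {}
alive_before = init_extension(first_interpreter) is not None

# Destroying the first "interpreter" releases the objects it owned.
first_interpreter.clear()
gc.collect()

# A later "interpreter" initialises the same extension and gets the stale static.
second_interpreter = {}
stale = init_extension(second_interpreter)

print(alive_before, stale)
```

After the first "interpreter" is torn down, the static no longer refers to a live object, which is the situation a recycled sub interpreter runs into with such extension modules.
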

> - the mechanism of PEP 311 doesn't work for multiple interpreters.

Yep, and since SWIG defaults to using it, it means that SWIG generated code can't be used in anything but the main interpreter. The subversion bindings seem to possibly have a lot of issues related to this as well.

Some details on this can be found in section 'Python Simplified GIL State API' of that document.

> > Oh, it would also be nice to know exactly what embedded systems you
> > have developed which make use of multiple sub interpreters so we can
> > gauge with what standing you have to make such a comment.
>
> I have never used that feature myself. However, I wrote PEP 3121
> to overcome some of its limitations.

As well as the above, there are a number of other issues. The ones I can remember right now are as follows.

First is that one can't use different versions of a C extension module in different sub interpreters. This is because the first one loaded effectively gets priority. I am not even sure you get an error when another interpreter tries to load a different version; it just assumes the one already loaded is okay. This can mean one may get a set of Python wrappers which doesn't match the C extension module. In other words, C extension modules are global to the process and not local to a sub interpreter. I know I have talked about this one numerous times, but can't seem to see where I cover it in the documentation I pointed at. I'll have to make sure I add it if it isn't there.

Second issue is that when you call Py_EndInterpreter, it doesn't do some of the stuff that would be done if it was the main interpreter. The two main culprits are that it doesn't try to stop non daemonised Python threads and it doesn't call functions registered with the atexit module. One might argue that it shouldn't be calling atexit registered functions as the process isn't being shut down, but in Python such functions are really called at the point the main interpreter is being destroyed, not the process.
As such, it may be appropriate for such registered functions to be called for a specific sub interpreter as well, but obviously only for callbacks registered in that sub interpreter.

One of the reasons for calling atexit registered functions for a sub interpreter is to terminate daemonised threads. If one isn't able to kill off daemonised threads created within a sub interpreter, then they can keep running while and after the sub interpreter is destroyed. This could result in just a Python exception occurring in that thread and causing it to exit, but it can also crash the process.

To ensure proper cleanup of sub interpreters when they are destroyed, and to allow hosted applications to do whatever they want to do on exit, I have found it necessary to do these two things explicitly, when possibly the Python internals should provide a means, even if optional, of doing it.

Anyway, have a read through that document as you might find a few interesting things in there about the current problems. Some stuff isn't necessarily documented, as the code for the package this relates to just works around the issues so that everything works as one would expect. For example, the atexit registered functions being called for sub interpreters.

In general, what I have found is that as long as you are aware of the limitations, multiple interpreters are still usable. The one thing I would avoid is trying to recycle sub interpreters. Once they are created, the only safe thing to do is to destroy them on process exit and no sooner. Otherwise you get the issues that the OP is seeing, but also some of the issues I describe above.

Hope you find this and the referenced document interesting. :-)

Graham
-- 
http://mail.python.org/mailman/listinfo/python-list