Hi all,

I have just pushed a fix to make joblib.Parallel work on the development version of Python, which adds more flexibility to the way multiprocessing spawns its worker processes. See:

http://docs.python.org/dev/library/multiprocessing.html#contexts-and-start-methods
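For readers who have not tried the new API yet, here is a minimal sketch of what the contexts and start methods linked above look like from user code. It assumes Python 3.4 or later and uses only the standard library. The helper names `pick_start_method` and `parallel_abs` are made up for this illustration; this is not the joblib fix itself.

```python
import multiprocessing as mp


def pick_start_method(preferred=("forkserver", "spawn")):
    """Return the first preferred start method available on this platform.

    Hypothetical helper, for illustration only.
    """
    available = mp.get_all_start_methods()
    for method in preferred:
        if method in available:
            return method
    raise ValueError("none of %r is available" % (preferred,))


def parallel_abs(values, method):
    """Apply abs() to each value in a small pool created from a context."""
    # get_context() returns an object bound to one start method and
    # leaves the process-wide default untouched.
    ctx = mp.get_context(method)
    with ctx.Pool(processes=2) as pool:
        return pool.map(abs, values)


if __name__ == "__main__":
    method = pick_start_method()
    print(method, parallel_abs([-1, -2, 3], method))
```

Asking for a context rather than calling `multiprocessing.set_start_method` keeps the choice local to the caller, which seems the safer option for library code that should not change global state.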
I think this fix, along with the memory mapping support, the threading backend and the other fixes, deserves to be tested by a broader community of users, in particular as part of the scikit-learn project. Hence I would like to do an alpha release of joblib as soon as possible and embed it in the master branch of scikit-learn prior to the 0.15 release scheduled for January 2014.

The detailed list of changes since 0.7 is available here:

https://github.com/joblib/joblib/blob/master/CHANGES.rst

Gilles (from scikit-learn) confirmed to me that the threading backend is working as expected to parallelize the fit of large forests of randomized trees. In my own experience, this completely fixes the memory copy issue and further removes some of the pickling overhead caused by the communication between the parent process and its child workers.

Please let me know quickly if you have any objection to this plan.

Regards,

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
