2012/9/22 Christian Jauvin <[email protected]>:
> Hi,
>
> I have been doing multiple experiments using a RandomForestClassifier
> (trained with the parallel code option) recently, without encountering
> any particular problem. However as soon as I began using a much bigger
> dataset (with the exact same code), I got this threading error:
>
> Exception in thread Thread-2:
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
>     self.run()
>   File "/usr/lib/python2.7/threading.py", line 504, in run
>     self.__target(*self.__args, **self.__kwargs)
>   File "/usr/lib/python2.7/multiprocessing/pool.py", line 319, in _handle_tasks
>     put(task)
> SystemError: NULL result without error in PyObject_Call
>
> I can provide additional details of course, but first maybe there is
> something in particular I should be aware of, about size or memory
> limit of the underlying objects in question?
>

This may well be a memory error: the current implementation manages
memory very poorly.

You can try replacing the joblib folder in the sklearn source tree with
the "pickling-pool" branch of my repo:

https://github.com/joblib/joblib/pull/44

That should help a lot. You can also memmap your original dataset,
as explained in the following doc, to reduce memory usage even
further:

https://github.com/ogrisel/joblib/blob/pickling-pool/doc/parallel_numpy.rst
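
As a rough illustration of memmapping a dataset, here is a minimal
sketch using plain numpy. The helper name to_memmap, the file name and
the toy array are made up for the example; whether the estimator then
uses the memmap without copying it depends on the scikit-learn and
joblib versions, so treat that part as an assumption to verify:

```python
import os
import tempfile

import numpy as np


def to_memmap(array, folder, name="X_train.npy"):
    """Save `array` to disk and reopen it as a read-only memory map.

    Hypothetical helper for illustration, not part of joblib or sklearn.
    """
    path = os.path.join(folder, name)
    np.save(path, array)
    # mmap_mode="r" returns a numpy.memmap backed by the file on disk,
    # so the data is paged in on demand instead of loaded up front.
    return np.load(path, mmap_mode="r")


# Toy stand-in for a large training set
rng = np.random.RandomState(0)
X = rng.rand(1000, 20)

folder = tempfile.mkdtemp()
X_mm = to_memmap(X, folder)
print(type(X_mm).__name__, X_mm.shape)
```

The memmapped array can then be passed around in place of the in-memory
one; the data stays in a single file on disk rather than being
duplicated for each worker, provided nothing downstream forces a copy.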

You might also want to set the TMP environment variable to a folder on
a big partition.
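
If you prefer to set TMP from inside the script rather than in the
shell, something like the sketch below should work. It assumes the
temporary files are created through Python's standard tempfile module
(I am stating that as an assumption here, not as a guarantee about the
branch), and the function name use_tmp_folder is made up for the
example:

```python
import os
import tempfile


def use_tmp_folder(folder):
    """Point TMP at `folder` and return the directory tempfile now reports."""
    # The tempfile module checks TMPDIR, then TEMP, then TMP,
    # so clear the first two to let TMP take effect.
    for name in ("TMPDIR", "TEMP"):
        os.environ.pop(name, None)
    os.environ["TMP"] = folder
    # Drop the cached value so the next lookup reads the environment again.
    tempfile.tempdir = None
    return tempfile.gettempdir()


# Stand-in for a folder on a big partition
big_folder = tempfile.mkdtemp()
print(use_tmp_folder(big_folder))
```

This has to run before any worker processes are started, so that they
inherit the modified environment.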

I am very interested in any feedback while using this branch.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
