On Fri, Jan 06, 2012 at 06:36:33PM +0200, Vlad Niculae wrote:
> [Parallel(n_jobs=4)]: Done 1 out of 10 |elapsed: 0.2s remaining: 1.4s
> I think at least "Done job x of y" should be printed, I don't see why it
> should be more difficult in the no-multiprocessing case. :)

It's funny, I've been working today on a pull request in joblib that
deals with this part of the code.

The reason is simple: what is given to the parallel processing code is
an iterator, not a list. In other words, each item is generated as we
go. Thus when you are doing something like this:

    Parallel(n_jobs=1)(delayed(process)(X[fold], y[fold])
                       for fold in folds)

no list of temporaries holding the data of each fold is created. This is
fairly important for memory reasons. The Parallel object could of course
turn the iterator it is given into a list to be able to measure its
length, but that would defeat the purpose.

In the multi-processing context, this is a bit different, as the
Parallel object is dispatching folds to different processes, and thus
consuming the iterator. By default, it greedily dispatches everything,
so it consumes the whole iterator and knows its length. The reason it is
done greedily is that there is a delay in the dispatch, so dispatching
everything up front fills the queue quickly.

There is an option (heavily used in the scikit) called 'pre_dispatch'
that makes the dispatching happen on the fly, as the queue empties. The
reason is that the greedy strategy will blow the memory if there are
many folds. In this case, the display is no longer as pleasant. You can
see such a display when using the GridSearch with many folds.

I hope this answers your question :)

Gael
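
A minimal sketch of the iterator point made in the message, written in plain Python rather than with joblib. The names `make_folds` and `run_sequential` are hypothetical and are not part of joblib; the loop only imitates what a sequential consumer could do, under the assumption that tasks arrive as a generator.

```python
def make_folds(n_folds):
    """Hypothetical stand-in for per-fold data: yields one fold at a time."""
    for i in range(n_folds):
        # Each fold is only built when the consumer asks for it.
        yield list(range(i, i + 3))


def run_sequential(tasks):
    """Consume tasks one by one; the total is unknown until exhaustion."""
    messages = []
    for done, task in enumerate(tasks, start=1):
        sum(task)  # stand-in for the real work on one fold
        # Only "x" can be reported: the number of tasks is not known yet.
        messages.append("Done job %d" % done)
    return messages


tasks = make_folds(4)

# A generator has no length, so "x out of y" cannot be computed up front.
try:
    len(tasks)
    has_len = True
except TypeError:
    has_len = False

messages = run_sequential(tasks)

# Materializing the generator gives a length, but it holds every fold
# in memory at once, which is what the lazy approach avoids.
materialized = list(make_folds(4))
total = len(materialized)
```

The sequential loop can count finished jobs but cannot say how many remain without first turning the generator into a list, which is the memory trade-off described in the message.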