Re: [Scikit-learn-general] greetings; more flexibility in trees

2013-06-08 Thread Ken Geis
On Jun 7, 2013, at 11:57 PM, Ken Geis wrote:
> On May 23, 2013, at 5:03 AM, Gilles Louppe wrote:
>>> So I'd like to contribute a simple MAE criterion that would be efficient
>>> for random splits (i.e. O(n) given a single batch update.) Is the direction
>>> forward for something like this to h
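A rough illustration of the idea in the quoted text, not the criterion actually proposed for scikit-learn: with a random split, only one threshold is evaluated per feature, so the MAE of both children can be computed in a single batch pass instead of being updated incrementally across sorted positions. The names mae_impurity and random_split_mae are hypothetical. This sketch uses statistics.median, which sorts and is therefore O(n log n); a selection-based median would be needed to reach the O(n) mentioned in the message.

```python
import random
from statistics import median


def mae_impurity(values):
    """Mean absolute deviation of values around their median (0.0 if empty)."""
    if not values:
        return 0.0
    m = median(values)
    return sum(abs(v - m) for v in values) / len(values)


def random_split_mae(x, y, rng):
    """Draw one random threshold on feature x and score it.

    Returns (threshold, size-weighted MAE of the two children).
    Hypothetical helper, written for illustration only.
    """
    threshold = rng.uniform(min(x), max(x))
    # Single batch pass: assign every sample to one side of the threshold.
    left = [yi for xi, yi in zip(x, y) if xi <= threshold]
    right = [yi for xi, yi in zip(x, y) if xi > threshold]
    n = len(y)
    weighted = (len(left) * mae_impurity(left)
                + len(right) * mae_impurity(right)) / n
    return threshold, weighted


if __name__ == "__main__":
    rng = random.Random(0)
    x = [1.0, 2.0, 3.0, 4.0]
    y = [0.0, 0.0, 10.0, 10.0]
    print(random_split_mae(x, y, rng))
```

The weighted child MAE can never exceed the parent's MAE, because each child is measured around its own median.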

Re: [Scikit-learn-general] Re-cycling pipeline stages in GridSearchCV?

2013-06-08 Thread Joel Nothman
On Fri, Jun 7, 2013 at 11:59 PM, Gael Varoquaux <gael.varoqu...@normalesup.org> wrote:
> > Memoization and parallelization don't play along nicely.
>
> Yes, I am strongly thinking of adding optional memoization directly to
> joblib.Parallel. It is often a fairly natural place to put a memoizatio

Re: [Scikit-learn-general] Re-cycling pipeline stages in GridSearchCV?

2013-06-08 Thread Gael Varoquaux
> I don't see how that helps Pipeline; perhaps expand your idea a bit...?

It doesn't. At all. I think that pipeline can be improved by memoizing the
transforms, or the transformer's fit.

G

Re: [Scikit-learn-general] How to present parameter search results

2013-06-08 Thread Joel Nothman
But where it is the case, an index into the results (so that you can use np.asarray(results)[grid.build_index()] in the desired manner) is possible. https://github.com/scikit-learn/scikit-learn/pull/1842 On the other hand, as long as you can get an array of parameter values for each parameter name
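One possible reading of "an array of parameter values for each parameter name", sketched in plain Python rather than with the API from the linked pull request (which is not reproduced here). The helpers expand_grid, param_columns and select are hypothetical, and the scores are arbitrary placeholder numbers, not real results.

```python
from itertools import product


def expand_grid(param_grid):
    """List every parameter combination, iterating names in sorted order."""
    names = sorted(param_grid)
    return [dict(zip(names, combo))
            for combo in product(*(param_grid[n] for n in names))]


def param_columns(points):
    """One list of values per parameter name, aligned with the results."""
    names = sorted(points[0])
    return {name: [p[name] for p in points] for name in names}


def select(points, scores, **fixed):
    """Scores of the grid points whose parameters match all fixed values."""
    return [s for p, s in zip(points, scores)
            if all(p[k] == v for k, v in fixed.items())]


if __name__ == "__main__":
    grid = {"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1]}
    points = expand_grid(grid)
    scores = [0.5, 0.6, 0.7, 0.8, 0.65, 0.75]  # placeholder values
    print(param_columns(points)["C"])
    print(select(points, scores, gamma=0.1))
```

Because the per-parameter lists line up with the list of scores, slicing the results by one parameter is a simple filter and needs no knowledge of the grid's shape.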

Re: [Scikit-learn-general] How to present parameter search results

2013-06-08 Thread Joel Nothman
Thanks, Olivier. Those are some interesting use-cases:

> A- Fault tolerance and handling missing results caused by evaluation errors

I don't think this affects the output format, except where we can actually get partial results for a fold, or if we want to report successful folds and ignore other

Re: [Scikit-learn-general] How to present parameter search results

2013-06-08 Thread Joel Nothman
On Sun, Jun 9, 2013 at 12:38 PM, Joel Nothman wrote:
> This may be getting into crazy land, and certainly close to reimplementing
> Pandas for the 2d case, or recarrays with benefits, but: imagine we had a
> SearchResult object with:
> * attributes like fold_test_score, fold_train_score, fold_tr