On 06/17/2013 09:59 AM, Olivier Grisel wrote:
> 2013/6/17 Andreas Mueller <[email protected]>:
>> On 06/14/2013 02:32 AM, Lars Buitinck wrote:
>>> 2013/6/14 Mathieu Blondel <[email protected]>:
>>>> We could return a tuple X, y, metadata.
>>> That would save a lot of newb confusion, I think. The question "how do
>>> I create a Bunch" comes up regularly -- people seem to think it's our
>>> preferred dataset format, even though it's just meant for the
>>> examples.
>>>
>> With a long enough deprecation cycle, I'm +1.
> I am not sure I am +1 on this one as I am not sure that all datasets
> will be always represented as a X, y pair.
>
> For instance for learning to rank problems it's more a triple like
> (data, target, query_id).
>
So what is the problem with just returning this? (Also, what is query_id?)
I think the code would look more natural with X and y. Often you have
something like

    digits = load_digits()
    X = StandardScaler().fit_transform(digits.data)
    svc = SVC()
    svc.fit(X, digits.target)

I think that would look nicer with

    X, y, _ = load_digits()
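
To make the tuple idea concrete, here is a rough sketch of how an (X, y, metadata) return value could be derived from a Bunch-like object. Everything in it is an assumption for illustration: the `Bunch` class is a minimal stand-in written here rather than the scikit-learn one, `as_tuple` is a hypothetical helper that does not exist in the library, and the data is a toy example. It runs without scikit-learn installed.

```python
# Hypothetical sketch: neither this `Bunch` nor `as_tuple` is scikit-learn API.
class Bunch(dict):
    """Minimal stand-in: a dict whose keys are also readable as attributes."""

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)


def as_tuple(bunch):
    """Split a Bunch into (X, y, metadata); metadata holds every other field."""
    metadata = {k: v for k, v in bunch.items() if k not in ("data", "target")}
    return bunch["data"], bunch["target"], metadata


# Toy data standing in for what a loader such as load_digits() returns.
digits = Bunch(data=[[0.0, 1.0], [1.0, 0.0]], target=[0, 1], DESCR="toy digits")
X, y, meta = as_tuple(digits)

# A learning-to-rank dataset could keep its extra query_id field in metadata.
ranking = Bunch(data=[[0.2], [0.7]], target=[1, 0], query_id=[7, 7])
X_r, y_r, meta_r = as_tuple(ranking)
```

With this shape the common case still unpacks as X, y, _ and a dataset with extra per-sample fields does not need a different return type, though whether that is the right trade-off is exactly what is being discussed here.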
