Re: [Scikit-learn-general] Causes for one class dominating?

2012-01-29 Thread Michael Waskom
Sorry, probably not clear from that snippet, but the labels vector corresponds to run (and is the id I'm using for the leave-one-label-out CV strategy that's giving me problems). My (perhaps naive) assumption would be that the dataset should be distributed more or less evenly across these splits,
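
One way to check the assumption that classes are spread evenly across the run-based splits is to tabulate class counts per run. The sketch below is illustrative only: the function name `class_counts_per_run` and the synthetic `y` and `runs` arrays are assumptions, not taken from the thread.

```python
import numpy as np


def class_counts_per_run(y, runs):
    """Return (run_ids, classes, table) where table[i, j] counts
    samples of classes[j] that fall in run_ids[i]."""
    classes = np.unique(y)
    run_ids = np.unique(runs)
    table = np.zeros((run_ids.size, classes.size), dtype=int)
    for i, r in enumerate(run_ids):
        for j, c in enumerate(classes):
            table[i, j] = np.sum((runs == r) & (y == c))
    return run_ids, classes, table


# Synthetic stand-in data (hypothetical): 3 classes, 4 runs, 24 samples.
y = np.tile([0, 1, 2], 8)
runs = np.repeat([1, 2, 3, 4], 6)

run_ids, classes, table = class_counts_per_run(y, runs)
print(table)  # one row per run, one column per class
```

If any row of the table is strongly unbalanced, the corresponding held-out fold trains and tests on different class proportions, which is one possible reason for a classifier favoring a single class.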

Re: [Scikit-learn-general] Causes for one class dominating?

2012-01-29 Thread bthirion
It looks like you fit the PCA on class-specific data. You cannot expect that this will yield a meaningful organization when pooling across folds. You probably want to train the PCA on the whole dataset, or did I miss something? Bertrand On 01/29/2012 10:38 PM, Michael Waskom wrote: > Aha, thi
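
A minimal sketch of the suggestion above, assuming the goal is a visualization rather than a classification step: fit a single PCA on the pooled data, then compare classes in that shared space. The data here are synthetic and the variable names are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in data (hypothetical): 60 samples, 10 features, 3 classes.
rng = np.random.RandomState(0)
X = rng.randn(60, 10)
y = np.repeat([0, 1, 2], 20)
X[y == 1, 0] += 3.0  # shift one class so the projection has some structure

# Fit ONE PCA on all samples, so every class is projected onto the same axes.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

# Summarize where each class lands in the shared 2D space.
for c in np.unique(y):
    centroid = X_2d[y == c].mean(axis=0)
    print("class", c, "centroid", centroid)
```

Because all points share one set of components, distances between class centroids are comparable, which is not the case when a separate PCA is fit per class.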

Re: [Scikit-learn-general] Causes for one class dominating?

2012-01-29 Thread Michael Waskom
Aha, this does indeed suggest something strange: http://web.mit.edu/mwaskom/www/pca.png I'm going to dig into this some more, but I don't really have any strong intuitions to guide me here so if anything pops out at you from that do feel free to speak up :) Michael On Sun, Jan 29, 2012 at 1:14

Re: [Scikit-learn-general] Causes for one class dominating?

2012-01-29 Thread Alexandre Gramfort
hum... final suggestion: I would try to visualize a 2D or 3D PCA to see if it can give me some intuition on what's happening. Alex On Sun, Jan 29, 2012 at 9:58 PM, Michael Waskom wrote: > Hi Alex, > > See my response to Yarick for some results from a binary > classification.  I reran both the t

Re: [Scikit-learn-general] Causes for one class dominating?

2012-01-29 Thread Michael Waskom
Hi Alex, See my response to Yarick for some results from a binary classification. I reran both the three-way and binary classification with SVC, though, with similar results: cv = LeaveOneLabelOut(bin_labels) pipe = Pipeline([("scale", Scaler()), ("classify", SVC(kernel="linear"))]) print cross_
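
For readers on a current scikit-learn release, the setup described above can be sketched roughly as follows. This is an approximation under stated assumptions, not the original code: `LeaveOneLabelOut` and `Scaler` were later renamed `LeaveOneGroupOut` and `StandardScaler`, and the data and the `runs` array below are synthetic placeholders.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (hypothetical): 48 samples, 2 classes, 4 runs.
rng = np.random.RandomState(0)
X = rng.randn(48, 20)
y = np.tile([0, 1], 24)
runs = np.repeat([1, 2, 3, 4], 12)

# Scaling lives inside the pipeline, so it is re-fit on each training fold.
pipe = Pipeline([("scale", StandardScaler()),
                 ("classify", SVC(kernel="linear"))])

# One fold per run: train on three runs, test on the held-out one.
scores = cross_val_score(pipe, X, y, groups=runs, cv=LeaveOneGroupOut())
print(scores)
```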

Re: [Scikit-learn-general] Causes for one class dominating?

2012-01-29 Thread Alexandre Gramfort
ok some more suggestions: - do you observe the same behavior with SVC which uses a different multiclass strategy? - what do you see when you inspect results obtained with binary predictions (keeping 2 classes at a time)? Alex On Sun, Jan 29, 2012 at 4:59 PM, Michael Waskom wrote: > Hi Alex, >
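
The second suggestion, keeping two classes at a time, can be sketched as below. The helper name `pairwise_scores` and the synthetic data are assumptions for illustration; the modern `LeaveOneGroupOut` splitter stands in for the leave-one-label-out scheme used in the thread.

```python
from itertools import combinations

import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def pairwise_scores(X, y, runs):
    """Mean leave-one-run-out accuracy for every pair of classes."""
    results = {}
    for a, b in combinations(np.unique(y), 2):
        mask = np.isin(y, [a, b])  # keep only samples from the two classes
        pipe = make_pipeline(StandardScaler(), SVC(kernel="linear"))
        scores = cross_val_score(pipe, X[mask], y[mask],
                                 groups=runs[mask], cv=LeaveOneGroupOut())
        results[(int(a), int(b))] = scores.mean()
    return results


# Synthetic stand-in data (hypothetical): 48 samples, 3 classes, 4 runs.
rng = np.random.RandomState(0)
X = rng.randn(48, 20)
y = np.tile([0, 1, 2], 16)
runs = np.repeat([1, 2, 3, 4], 12)

results = pairwise_scores(X, y, runs)
for pair, acc in sorted(results.items()):
    print(pair, acc)
```

Comparing the three pairwise accuracies can show whether one class is confusable with both others or whether the problem is specific to a single pair.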

Re: [Scikit-learn-general] Causes for one class dominating?

2012-01-29 Thread Michael Waskom
Hi Alex, No, each subject has four runs so I'm doing leave-one-run-out cross-validation in the original case. I'm estimating separate models within each subject (as is common in fMRI) so all my example code here would be from within a for subject in subjects: loop, but this pattern of weirdness is

Re: [Scikit-learn-general] Causes for one class dominating?

2012-01-29 Thread Alexandre Gramfort
Hi, just a thought. You seem to be doing inter-subject prediction. In this case a 5-fold split mixes subjects. A hint is that you may have a subject effect that acts as a confound. Again, just a thought, having read the email quickly. Alex On Sun, Jan 29, 2012 at 5:39 AM, Michael Waskom wrote: > Hi Yarick, t