It doesn't look like anyone's replied to this yet, so here's my two cents.

I think of this sort of situation as a case of imbalance: the classes don't have equal numbers of examples in each training/testing set (aka chunk). This comes up in all sorts of situations, such as when the trials included depend upon participant behavior (e.g. only correctly-performed trials).

There isn't a universally appropriate strategy to regain balance, but either the chunks or the examples will need to be changed.

As an example of changing the chunks: in one dataset we wanted to do leave-one-run-out cross-validation, but the imbalance was too great (e.g. some runs had very few examples), so we combined runs and did leave-three-runs-out cross-validation instead. We combined temporally adjacent runs (e.g. 1-3, 4-6, 7-9) to make sure we didn't somehow inflate the accuracy. Depending on the design, you could potentially partition on something other than the runs to give more flexibility.

If the imbalance is not too great (e.g. 10 examples of one class and 12 of the other), my usual practice is to subset the larger class at random, then repeat the whole analysis a few times, leaving out different examples each time.
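
To make those two ideas concrete, here is a rough sketch in plain NumPy (not PyMVPA code, and not the code we actually used). The function names merge_adjacent_runs and balanced_subsets, the toy run and label vectors, and the choice of three repeats are all made up for illustration. It assumes the labels passed to balanced_subsets belong to a single training (or testing) set.

```python
import numpy as np


def merge_adjacent_runs(runs, group_size=3):
    """Relabel runs so that temporally adjacent runs share one chunk.

    With group_size=3, runs 1-3 become chunk 0, runs 4-6 chunk 1, and so on.
    """
    runs = np.asarray(runs)
    unique_runs = np.unique(runs)              # sorted run labels
    rank = np.searchsorted(unique_runs, runs)  # 0-based position of each run
    return rank // group_size


def balanced_subsets(labels, n_repeats=5, seed=0):
    """Return n_repeats index arrays, each with equal examples per class.

    Every class is randomly cut down to the size of the smallest class;
    each repeat draws a different random subset.
    """
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(labels, return_counts=True)
    n_keep = counts.min()
    subsets = []
    for _ in range(n_repeats):
        keep = [
            rng.choice(np.flatnonzero(labels == c), size=n_keep, replace=False)
            for c in classes
        ]
        subsets.append(np.sort(np.concatenate(keep)))
    return subsets


# Hypothetical toy data: 9 runs with 2 trials each.
runs = np.repeat(np.arange(1, 10), 2)
chunks = merge_adjacent_runs(runs, group_size=3)

# Hypothetical training set with 10 examples of "a" and 12 of "b".
labels = np.array(["a"] * 10 + ["b"] * 12)
subsets = balanced_subsets(labels, n_repeats=3)
```

The idea would be to run the classification once per entry in subsets and average the resulting accuracies, so that no single random draw of omitted examples drives the result.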

By changing the examples I mean strategies like averaging across examples within a run (or fitting parameter estimate images), so that instead of classifying with individual trials you have a fixed number of summary images (e.g. beta weights, averages) per person. In my experience this can really help performance, even though the number of samples is greatly reduced.
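
The averaging version might look something like the sketch below, again in plain NumPy rather than PyMVPA. The function name average_within_run and the tiny 6-trial, 2-voxel dataset are invented for illustration. Note that averaging cannot create an example for a class that is missing from a run entirely, so a run like that would still need to be merged with another or dropped.

```python
import numpy as np


def average_within_run(data, labels, runs):
    """Average all examples of the same class within each run.

    data   : (n_examples, n_features) array
    labels : class label per example
    runs   : run label per example
    Returns one averaged example per (run, class) pair that actually occurs.
    """
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    runs = np.asarray(runs)
    avg_data, avg_labels, avg_runs = [], [], []
    for run in np.unique(runs):
        for cls in np.unique(labels):
            mask = (runs == run) & (labels == cls)
            if not mask.any():
                continue  # class absent from this run: nothing to average
            avg_data.append(data[mask].mean(axis=0))
            avg_labels.append(cls)
            avg_runs.append(run)
    return np.vstack(avg_data), np.array(avg_labels), np.array(avg_runs)


# Hypothetical toy data: 6 trials, 2 voxels, 2 runs, unequal class counts.
data = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
labels = np.array(["a", "a", "b", "a", "b", "b"])
runs = np.array([1, 1, 1, 2, 2, 2])

avg, avg_labels, avg_runs = average_within_run(data, labels, runs)
```

Here each run ends up with exactly one "a" and one "b" example, so the chunks are balanced even though the trial counts were not.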

good luck,
Jo



On 8/2/2013 6:49 AM, Jan Derrfuss wrote:
Hello,

I would like to run an exploratory analysis where the presence of
samples in the chunks was not under experimental control. As a result,
there are chunks where only one of the two target classes I'm decoding
is present in the chunk the classifier is tested on (there is also a
single chunk in one subject where neither of the two classes is
present). I'm running a searchlight analysis with 6-fold
cross-validation, use a linear SVM, and compute the mean true positive
rate.

Is there a preferred way to deal with such a situation?

Jan

PS. The temperature here in the Lower Rhine region is currently 33 °C
(91 °F). Wherever you are, I hope it's colder there! :-)

_______________________________________________
Pkg-ExpPsy-PyMVPA mailing list
[email protected]
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa

--
Joset A. Etzel, Ph.D.
Research Analyst
Cognitive Control & Psychopathology Lab
Washington University in St. Louis
http://mvpa.blogspot.com/
