Re: [pymvpa] No samples of a class in a chunk

J.A. Etzel Tue, 06 Aug 2013 14:05:07 -0700

It doesn't look like anyone's replied to this yet, so here's my two cents.

I think of this sort of situation as a case of imbalance - there aren't equal numbers of examples of each class in each training/testing set (aka chunk). This happens in all sorts of situations, such as when which trials are included depends upon participant behavior (e.g. correctly-performed trials).
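
As a first step it can help to tabulate how many examples of each class fall in each chunk. The following is only a minimal sketch in plain Python, not PyMVPA code; the function names and the `all_chunks` argument are hypothetical, and `labels` and `chunks` are assumed to be parallel lists with one entry per example.

```python
from collections import Counter


def class_counts_per_chunk(labels, chunks):
    """Return {chunk: Counter({class: n_examples})} for parallel lists."""
    counts = {}
    for lab, ch in zip(labels, chunks):
        counts.setdefault(ch, Counter())[lab] += 1
    return counts


def chunks_missing_a_class(labels, chunks, all_chunks=None):
    """List the chunks that lack at least one of the classes seen in `labels`.

    `all_chunks` (hypothetical argument) lets you name chunks that contain
    no examples at all, since those never appear in `chunks`.
    """
    classes = set(labels)
    counts = class_counts_per_chunk(labels, chunks)
    candidates = set(chunks) if all_chunks is None else set(all_chunks)
    return sorted(ch for ch in candidates
                  if set(counts.get(ch, ())) != classes)


if __name__ == "__main__":
    labels = ["a", "b", "a", "a"]
    chunks = [0, 0, 1, 1]
    print(class_counts_per_chunk(labels, chunks))
    print(chunks_missing_a_class(labels, chunks, all_chunks=[0, 1, 2]))
```

Any chunk reported by the second function cannot serve as a balanced test set as it stands.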

There isn't a universally appropriate strategy to regain balance, but either the chunks or the examples will need to be changed.

For example, in one dataset we wanted to do leave-one-run-out cross-validation, but the imbalance was too great (e.g. some runs with very few examples), so we combined runs, for leave-three-runs-out cross-validation. We combined temporally adjacent runs (e.g. 1-3, 4-6, 7-9) to make sure we didn't somehow inflate the accuracy. Depending on the design, you could potentially partition on something other than the runs to give more flexibility. If the imbalance is not too great (e.g. 10 of one class and 12 of the other), my usual practice is to subset the larger class at random, repeating the whole thing a few times (leaving out different examples).
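
The two ideas in this paragraph (merging adjacent runs into larger chunks, and randomly subsetting the larger class) could be sketched roughly as follows. This is plain Python rather than PyMVPA's API, the function names are hypothetical, and it assumes runs are numbered from 1 and that `labels` and `chunks` are parallel lists.

```python
import random
from collections import defaultdict


def merge_adjacent_runs(runs, group_size=3):
    """Map 1-based run numbers to merged chunk ids: 1-3 -> 0, 4-6 -> 1, ..."""
    return [(r - 1) // group_size for r in runs]


def subsample_to_balance(labels, chunks, rng):
    """Return sorted example indices with equal class counts in every chunk.

    Within each chunk, every class is cut down at random to the size of the
    smallest class. A chunk that lacks a class entirely is dropped.
    """
    classes = sorted(set(labels))
    by_chunk = defaultdict(lambda: defaultdict(list))
    for i, (lab, ch) in enumerate(zip(labels, chunks)):
        by_chunk[ch][lab].append(i)
    keep = []
    for by_class in by_chunk.values():
        n = min(len(by_class[c]) for c in classes)
        for c in classes:
            keep.extend(rng.sample(by_class[c], n))
    return sorted(keep)


if __name__ == "__main__":
    runs = [1, 1, 2, 3, 3, 4, 5, 5, 6, 6]
    labels = ["a", "b", "a", "a", "b", "b", "a", "a", "a", "b"]
    chunks = merge_adjacent_runs(runs)
    # Repeat with different seeds so that different examples are left out.
    for seed in range(3):
        print(subsample_to_balance(labels, chunks, random.Random(seed)))
```

Each repetition gives a different balanced subset; the cross-validation would be run once per subset and the results averaged.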

By changing the examples I mean strategies like averaging across examples within a run (or fitting parameter estimate images), so that instead of classifying with individual trials you have a fixed number of summary images (e.g. beta weights, averages) per person. In my experience this can really help performance, even though the number of samples is greatly reduced.
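
The averaging variant could look something like the sketch below, which collapses all trials sharing a run and class into one mean example. It is plain Python with a hypothetical function name, not PyMVPA's API, and it assumes each sample is a list of feature values (e.g. voxels) of equal length.

```python
from collections import defaultdict


def average_within_run(samples, labels, runs):
    """Average feature vectors over all trials sharing a (run, class) pair.

    Returns a list of (run, class, mean_vector) tuples sorted by run and
    class, so each run contributes at most one example per class.
    """
    groups = defaultdict(list)
    for x, lab, run in zip(samples, labels, runs):
        groups[(run, lab)].append(x)
    averaged = []
    for (run, lab), rows in sorted(groups.items()):
        # zip(*rows) walks the trials feature by feature.
        mean = [sum(col) / len(rows) for col in zip(*rows)]
        averaged.append((run, lab, mean))
    return averaged


if __name__ == "__main__":
    samples = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    labels = ["a", "a", "b"]
    runs = [1, 1, 1]
    print(average_within_run(samples, labels, runs))
```

Note that averaging balances the counts only for runs in which both classes occur; a run with no examples of a class still yields no summary example for it.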


good luck,
Jo



On 8/2/2013 6:49 AM, Jan Derrfuss wrote:

Hello,

I would like to run an exploratory analysis where the presence of
samples in the chunks was not under experimental control. As a result,
there are chunks where only one of the two target classes I'm decoding
is present in the chunk the classifier is tested on (there is also a
single chunk in one subject where neither of the two classes is
present). I'm running a searchlight analysis with 6-fold
cross-validation, use a linear SVM, and compute the mean true positive
rate.

Is there a preferred way to deal with such a situation?

Jan

PS. The temperature here in the Lower Rhine region is currently 33 °C
(91 °F). Wherever you are, I hope it's colder there! :-)

_______________________________________________
Pkg-ExpPsy-PyMVPA mailing list
[email protected]
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa


--
Joset A. Etzel, Ph.D.
Research Analyst
Cognitive Control & Psychopathology Lab
Washington University in St. Louis
http://mvpa.blogspot.com/
