Dear Michael,

Thanks for the clarification and the modification of the code - it works now!

Best,
Meng

> Date: Mon, 30 Jun 2014 21:33:42 +0200
> From: [email protected]
> To: [email protected]
> Subject: Re: [pymvpa] same sensitivity values for all dataset splits from
> cross validation using tutorial script snippet
>
> Hi,
>
> I am sorry you had to go through this...
>
> On Mon, Jun 30, 2014 at 02:18:07PM +0100, Meng Liang wrote:
> > I am trying to obtain the sensitivity values for all splits of the
> > dataset during leave-one-out cross-validation (classification using
> > SVM). I found in the tutorial "Classification Model Parameters –
> > Sensitivity Analysis" (
> > http://www.pymvpa.org/tutorial_sensitivity.html ) that
> > RepeatedMeasure(sensana, NFoldPartitioner()) should give the
> > sensitivity values for each fold. Here is the code snippet I used in
> > my script, slightly adapted from the tutorial:
> >
> >   clf = LinearNuSVMC()
> >   cv = CrossValidation(clf, NFoldPartitioner(), enable_ca=['stats'])
> >   sensana = clf.get_sensitivity_analyzer()
> >   cv_sensana = RepeatedMeasure(sensana, NFoldPartitioner())
> >   error = cv(ds)
> >   sensmap_cv = cv_sensana(ds)
> >
> > 'print sensmap_cv.shape' gave me: (14L, 87L).
> >
> > I have 14 subjects and I am using leave-one-subject-out
> > cross-validation, and there are 87 features. So the data structure
> > seems correct. However, when I look at the values of this 14x87 array,
> > all the rows in the array contain exactly the same values (i.e., the
> > first row looks the same as all the other rows).
>
> I am afraid you found a bug in the documentation (more specifically a
> bit of code that has not been properly adjusted when we switched from
> dataset splitters to dataset partitioners -- just mentioning it for
> those who have been around for that long...).
>
> The reason for the behavior you observe is that, in contrast to what is
> advertised in the tutorial, RepeatedMeasure does not split any dataset.
> It does what it says on the label: it repeats a measure, for whatever
> datasets come out of the provided generator -- in your case
> NFoldPartitioner. However, partitioners only add a sample attribute to a
> dataset that indicates the current partitioning scheme -- they do not
> split a dataset -- hence you are actually computing sensitivities,
> repeatedly, from the identical dataset.
>
> If you want to compute the sensitivities on the respective training
> samples of each data fold (which I think you do) you need to change that
> line to:
>
>   cv_sensana = RepeatedMeasure(sensana,
>                                ChainNode((NFoldPartitioner(),
>                                           Splitter('partitions',
>                                                    attr_values=(1,)))))
>
> This change amends the partitioner with a splitter that actually takes
> out the training samples of each fold and feeds them into the
> sensitivity measure.
>
> > A related question about normalizing the sensitivity values: in the
> > "Closing Words" of the tutorial on the same webpage, it says: "It
> > should also be noted that sensitivities can not be directly compared
> > to each other, even if they stem from the same algorithm and are just
> > computed on different dataset splits. In an analysis one would have to
> > normalize them first." My question is: if we cannot compare the
> > sensitivity values from different data splits without normalizing them
> > first, why can we average them or take the maximum value across data
> > splits without applying any normalization (the example script snippets
> > in the tutorial seem to do so)? I would imagine that the average or
> > the max value would also be affected by the scale of the data.
>
> Yes, you are right: they could be normalized even more. The dataset in
> the tutorial, however, is a single subject and it was z-scored upfront,
> so it is not that bad...
>
> Sorry for the bug. I filed a bug report and we'll fix it ASAP.
>
> Michael
>
> --
> J.-Prof. Dr. Michael Hanke
> Psychoinformatik Labor, Institut für Psychologie II
> Otto-von-Guericke-Universität Magdeburg, Universitätsplatz 2, Geb.24
> Tel.: +49(0)391-67-18481 Fax: +49(0)391-67-11947 GPG: 4096R/7FFB9E9B
>
> _______________________________________________
> Pkg-ExpPsy-PyMVPA mailing list
> [email protected]
> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa
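
The difference Michael describes, between a partitioner that only labels samples and a splitter that actually selects them, can be illustrated without PyMVPA. The following is a minimal plain-Python sketch, not PyMVPA code: the helper names (partition_only, partition_and_split, mean_measure) are hypothetical, the "measure" is just a mean standing in for a sensitivity analyzer, and the label 2 for held-out samples is an assumption chosen to pair with the attr_values=(1,) selection of training samples in the reply.

```python
# Plain-Python illustration; all helpers here are hypothetical, not PyMVPA API.

def partition_only(samples, chunks):
    """Yield the FULL dataset once per fold; only the labels change.

    Label 1 marks training samples, label 2 the held-out chunk (assumed).
    """
    for held_out in sorted(set(chunks)):
        labels = [2 if c == held_out else 1 for c in chunks]
        yield samples, labels  # no sample is removed


def partition_and_split(samples, chunks, keep=1):
    """Yield, per fold, only the samples whose partition label equals `keep`."""
    for full, labels in partition_only(samples, chunks):
        yield [s for s, p in zip(full, labels) if p == keep]


def mean_measure(samples):
    """Stand-in for a sensitivity measure: the mean of the samples it sees."""
    return sum(samples) / len(samples)


samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
chunks = [0, 0, 1, 1, 2, 2]  # e.g. one chunk per subject

# The measure ignores the labels, so every fold yields the same value.
unsplit = [mean_measure(s) for s, _ in partition_only(samples, chunks)]

# After splitting, each fold sees a different training subset.
split = [mean_measure(s) for s in partition_and_split(samples, chunks)]

print(unsplit)  # [3.5, 3.5, 3.5]
print(split)    # [4.5, 3.5, 2.5]
```

The identical values in the first list correspond to the identical rows reported in the original question; the second list shows the per-fold variation that appears once the training samples are actually taken out.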
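
On the normalization question raised in the thread: the thread does not say which normalization to use, so the sketch below assumes L2 (unit-length) scaling of each fold's sensitivity map before averaging. It is plain Python with hypothetical helper names (l2_normalize, average_maps) and made-up numbers, meant only to show why an unnormalized average can be dominated by the fold with the largest scale.

```python
import math

# Hypothetical helpers for illustration; L2 scaling is an assumed choice.

def l2_normalize(row):
    """Scale a sensitivity map to unit Euclidean length (unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in row))
    return [v / norm for v in row] if norm > 0 else list(row)


def average_maps(maps):
    """Feature-wise mean across folds."""
    n = len(maps)
    return [sum(col) / n for col in zip(*maps)]


# Two folds with the same pattern across features; fold 2 is scaled 10x.
fold_maps = [[3.0, 4.0], [30.0, 40.0]]

raw_mean = average_maps(fold_maps)
norm_mean = average_maps([l2_normalize(m) for m in fold_maps])

print(raw_mean)   # dominated by the larger-scale fold
print(norm_mean)  # both folds contribute equally
```

Whether L2 scaling, L1 scaling, or some other scheme is appropriate depends on the analysis; as noted in the reply, z-scoring a single-subject dataset upfront already reduces the scale differences between folds.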

