sorry for being silent

On Tue, 03 Jan 2012, Mike E. Klein wrote:

> (1) I haven't done a permutation test. By "chance distribution" I just
> meant the bulk of the data points using my real-label-coded data. While
> I'm obviously hoping for a histogram that contains a positive skew, at
> worst I'd expect a normal distribution centered around chance. Once I get
> this error figured out, I will do some permutation testing as well, but at
> the moment it doesn't seem necessary. (In other words, with real data or
> fake data, I can't see why I'd ever see a negative skew unless I'm doing
> something else wrong.)
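For concreteness, here is a minimal sketch of the label-permutation test mentioned above: shuffle the labels many times, re-estimate accuracy each time, and compare the real-label accuracy against that null distribution. The data, the trivial nearest-class-mean classifier, and all names are made up for illustration, not PyMVPA API:

```python
import numpy as np

rng = np.random.RandomState(42)

def nearest_mean_accuracy(X, y):
    """Leave-one-out accuracy of a trivial nearest-class-mean classifier."""
    correct = 0
    for i in range(len(y)):
        train = np.ones(len(y), dtype=bool)
        train[i] = False
        # class means estimated without the left-out sample
        means = {c: X[train & (y == c)].mean(axis=0)
                 for c in np.unique(y[train])}
        pred = min(means, key=lambda c: np.linalg.norm(X[i] - means[c]))
        correct += int(pred == y[i])
    return correct / float(len(y))

# toy data with NO real signal, so the "true" accuracy is chance (0.5)
X = rng.randn(40, 10)
y = np.repeat([0, 1], 20)

real_acc = nearest_mean_accuracy(X, y)

# null distribution: accuracy under randomly permuted labels
null = np.array([nearest_mean_accuracy(X, rng.permutation(y))
                 for _ in range(200)])

# permutation p-value with the +1 correction
p_value = (np.sum(null >= real_acc) + 1.0) / (len(null) + 1.0)
```

With dependent samples or leave-one-out estimators like this one, the null histogram itself can sit below the nominal chance level, which is part of why permuting labels (rather than assuming a distribution centered at chance) is the safer check.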
I had a similar feeling -- performance distributions should be pretty much a mixture of two: a chance distribution (centered at the chance level for that task) and some "interesting" one in the right tail, e.g. as we have shown in a toy example in
http://www.pymvpa.org/examples/curvefitting.html#searchlight-accuracy-distributions

Indeed that is most often the case, BUT, as you have mentioned -- not always. Sometimes the "negative preference" becomes too prominent, giving the histogram a peak below chance. As you have discussed, the reasons could be various, but I think it might also be due to the same underlying fact -- the samples are not independent! In this case the "samples" are searchlights, which are clearly not independent: not only because of the overlaps among them, but, more importantly here, because they explore "homogeneous" regions which could be of different sizes.

What could this lead to? Suppose there is indeed no signal, those "macro" regions are "independent", and the mean accuracies across the regions (not the searchlights) do follow the chance distribution. When we then run a searchlight, because of the inherent correlations within regions, regions of larger size will have a heavier impact on the resultant histogram of performances. So if the signal in a region, by chance, falls below the chance level, the searchlights covering that region will most probably show similar below-chance performance, and depending on the region size this could produce a stronger below-chance "peak" in the histogram of overall performances. In turn it might also amplify those confounds you were talking about, leading to anti-learner effects.
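The region-size argument above can be illustrated with a toy simulation: each "homogeneous" region gets one accuracy drawn around chance, searchlights within a region inherit it plus small noise, and one large region is deliberately set below chance to show the effect. All numbers here are invented for illustration:

```python
import numpy as np

rng = np.random.RandomState(0)
chance = 0.5

# four small regions near chance, plus one large region that -- by bad
# luck -- landed below chance; region means still average out to ~chance
region_sizes = [5, 5, 5, 5, 300]
region_acc   = [0.51, 0.49, 0.52, 0.50, 0.45]

# searchlight accuracies: region accuracy plus small within-region noise,
# so searchlights in one region are strongly correlated with each other
searchlight_acc = np.concatenate([
    rng.normal(acc, 0.01, size=n)
    for acc, n in zip(region_acc, region_sizes)
])

# mean over REGIONS is ~chance, but the searchlight histogram is dominated
# by the single large region, so its bulk sits below chance (~0.45)
print(np.mean(region_acc))
print(np.median(searchlight_acc))
```

Plotting `searchlight_acc` as a histogram would show the below-chance peak directly: 300 of the 320 "searchlights" cluster around 0.45 even though, at the region level, nothing is systematically below chance.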
Overall, as in good old GLM, it becomes difficult to state anything at the per-subject level without looking at a summary of searchlights across subjects, where such unfortunate effects seem (at least in my cases) to become less prominent (of course, they still hinder the "2nd level" statistics).

--
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic

_______________________________________________
Pkg-ExpPsy-PyMVPA mailing list
Pkg-ExpPsy-PyMVPA@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa