Hello, I cannot help you with inference at the subject level, but we do have a nice example for group-level inference. It is what we used in the Pandora data paper http://f1000research.com/articles/4-174/v1 (review pending) to produce Figure 4 and Table 3. Code to replicate the analysis is available at https://github.com/psychoinformatics-de/paper-f1000_pandora_data
Best wishes,
Richard

On Tue, Aug 11, 2015 at 11:13 PM, Yaroslav Halchenko <deb...@onerussian.com> wrote:
>
> On Tue, 11 Aug 2015, Roni Maimon wrote:
>
> > Hi all,
>
> Hi Roni,
>
> > I started by implementing the inference at the subject level (attaching
> > the code). Is this how I'm supposed to evaluate the p values of the
> > classifications for a single subject? What is the difference between
> > adding the null_dist at the searchlight level and at the
> > cross-validation level?
>
> If you add it at the CV level (which is also legitimate) you would need
> some way to "collect" all of the stats across all the searchlights, e.g.
> replace the output of CV accuracy/error with the p value, or produce two
> samples (if you also care to see the effects, i.e. CV error/accuracy),
> one of which would be the accuracy and the other the p value. It might
> also be "noisier", since different searchlights might then permute
> differently, and I think it will take a bit longer.
>
> So the cleaner way is at the searchlight level, where the entire
> searchlight is then estimated with the same permutation at a time.
>
> > My code:
> > clf = LinearCSVMC()
> > splt = NFoldPartitioner(attr='chunks')
> > repeater = Repeater(count=100)
> > permutator = AttributePermutator('targets', limit={'partitions': 1},
> >                                  count=1)
> > null_cv = CrossValidation(clf,
> >                           ChainNode([splt, permutator],
> >                                     space=splt.get_space()),
> >                           postproc=mean_sample())
> > null_sl = sphere_searchlight(null_cv, radius=3, space='voxel_indices',
> >                              enable_ca=['roi_sizes'])
> > distr_est = MCNullDist(repeater, tail='left', measure=null_sl,
> >                        enable_ca=['dist_samples'])
>
> I see -- you are trying to maintain the same assignment in the testing
> dataset... IMHO (especially since you seem to just use betas from a GLM,
> so I guess one beta per run) it is not necessary.
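As an aside for readers of the archive: what a within-chunk label permutation amounts to can be sketched in plain Python. The helper below is purely illustrative (it is not part of the PyMVPA API), but it captures the idea behind permuting targets while respecting chunk boundaries:

```python
import random

def permute_within_chunks(targets, chunks, rng=random):
    """Shuffle target labels independently within each chunk,
    keeping the chunk structure itself intact. Illustrative sketch
    only -- in PyMVPA this is what a permutator limited to chunks
    does conceptually."""
    permuted = list(targets)
    for chunk in set(chunks):
        idx = [i for i, c in enumerate(chunks) if c == chunk]
        labels = [targets[i] for i in idx]
        rng.shuffle(labels)
        for i, lab in zip(idx, labels):
            permuted[i] = lab
    return permuted

targets = ['A', 'B', 'A', 'B', 'A', 'B']
chunks = [1, 1, 1, 2, 2, 2]
null_targets = permute_within_chunks(targets, chunks)
# Each chunk keeps its own multiset of labels; only their order changes.
```

Because labels never cross chunk boundaries, each permuted dataset stays balanced per run, which is the property the within-chunk restriction is there to preserve.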
> Just make a straightforward permutator across all chunks (but permuting
> within each chunk) at the level of the searchlight, without trying to do
> that fancy dance (I know -- the one our tutorial teaches atm):
>
> permutator = AttributePermutator('targets', count=100, limit='chunks')
> distr_est = MCNullDist(permutator, tail='left',
>                        enable_ca=['dist_samples'])
>
> > cv = CrossValidation(clf, splt, enable_ca=['stats'],
> >                      postproc=mean_sample())
> > sl = sphere_searchlight(cv, radius=3, space='voxel_indices',
> >                         null_dist=distr_est,
> >                         enable_ca=['roi_sizes'])
> > ds = glm_dataset.copy(deep=False,
> >                       sa=['targets', 'chunks'],
> >                       fa=['voxel_indices'],
> >                       a=['mapper'])
> > sl_map = sl(ds)
> > p_values = distr_est.cdf(sl_map.samples)  # IS THIS THE RIGHT WAY??
>
> Just access sl.ca.null_prob for the p value. You could also use
> sl.ca.null_t (if you enable it for the searchlight) to get a
> corresponding t... actually z-score (it is not a t-score, since the
> assumed distribution is normal), which would be easier to visualize and
> comprehend (2.3 should correspond to p=0.01 in a one-sided test ;-)).
>
> > Is there a way to make sure the permutations are exhaustive?
>
> If they were exhaustive -- you might need more data ;) Especially with
> the searchlight: 100 permutations would give you a p no smaller than
> ~0.01 (well, 1/101 to be precise). With any correction for multiple
> comparisons (and you have many searchlights) you would need "stronger"
> p's, thus more permutations.
>
> > In order to make an inference at the group level I understand I can
> > use GroupClusterThreshold.
>
> Correct.
>
> > Does anyone have a code sample for that? Do I use the MCNullDists
> > created at the subject level?
>
> Unfortunately we don't have a full nice example yet, and Michael, whom
> we could blame (d'oh -- thank!)
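The point about 100 permutations capping the attainable p value follows directly from the empirical p formula p = (1 + #{null >= observed}) / (1 + N). A minimal plain-Python sketch (not PyMVPA code; the function name is illustrative):

```python
def empirical_p(observed, null_samples):
    """One-sided empirical p value: fraction of null samples at least
    as extreme as the observed statistic, with the +1 correction that
    keeps p from ever reaching exactly zero."""
    n_extreme = sum(1 for s in null_samples if s >= observed)
    return (1.0 + n_extreme) / (1.0 + len(null_samples))

null = [0.5 + 0.001 * i for i in range(100)]  # 100 permutation results
p = empirical_p(10.0, null)
# Even an arbitrarily large observed effect yields p = 1/101, never smaller.
```

So with many searchlights and a multiple-comparisons correction on top, the permutation count has to grow accordingly, which is exactly the advice in the reply above.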
> for this functionality, is on vacation (CCing, though, the very original
> author -- Richard, please correct me if I am wrong anywhere). BUT your
> code above, with my addition (note me enabling 'dist_samples'), is
> pretty much ready:
>
> 1. Just run your searchlights per subject, e.g. 20-50 times, collect
>    distr_est.ca.dist_samples for each subject, and place them all into
>    a single dataset where each permutation will be a "sample", while
>    you set sa.chunks to group subjects (i.e. all permutations from
>    subject 1 should have chunks == 1). Call it e.g. perms.
>
> 2. Create a bootstrapping of those permutation results, e.g.:
>
>    clthr = gct.GroupClusterThreshold()  # defaults might be adequate ;)
>                                         # otherwise adjust
>
> 3. Train it on your collection of permutations:
>
>    clthr.train(perms)
>
>    which will do all the bootstrapping (would take a while).
>
> 4. Estimate significance / threshold your original map (mean of errors
>    across subjects, without permutations):
>
>    res = clthr(mean_map)
>
> 5. Look into res.a for all kinds of stats (number of clusters, their
>    locations, significance, etc.), and then
>
>    res.fa.clusters_fwe_thresh
>
>    will contain the actual map of cluster indices which passed FWE
>    thresholding.
>
> Hope this helps!
> --
> Yaroslav O. Halchenko, Ph.D.
> http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
> Research Scientist, Psychological and Brain Sciences Dept.
> Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
> Phone: +1 (603) 646-9834  Fax: +1 (603) 646-1419
> WWW: http://www.linkedin.com/in/yarik
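For readers who want intuition for steps 2-4: the training step rests on bootstrapping group-average null maps. The toy sketch below is NOT the actual GroupClusterThreshold implementation (the real PyMVPA class additionally builds a cluster-size null distribution and applies FWE correction); it only illustrates the core resampling idea for a single feature, with an illustrative function name:

```python
import random

def bootstrap_group_threshold(perms_by_subject, n_boot=1000, alpha=0.05,
                              rng=random):
    """Toy sketch of the bootstrapping idea behind group-level cluster
    thresholding. perms_by_subject maps subject -> list of null statistic
    values from that subject's permutations. Each bootstrap sample draws
    one permutation result per subject and averages them, mimicking a
    group-mean map under the null hypothesis."""
    subjects = list(perms_by_subject)
    means = []
    for _ in range(n_boot):
        draw = [rng.choice(perms_by_subject[s]) for s in subjects]
        means.append(sum(draw) / float(len(draw)))
    means.sort()
    # Threshold such that roughly a fraction alpha of null group means
    # exceed it (the (1 - alpha) quantile of the bootstrap distribution).
    return means[int((1 - alpha) * n_boot) - 1]

# Hypothetical per-subject null distributions (e.g. chance-level errors):
perms = {subj: [random.gauss(0.5, 0.05) for _ in range(50)]
         for subj in range(20)}
threshold = bootstrap_group_threshold(perms)
```

A real group map value more extreme than this bootstrap threshold would then be a candidate for cluster formation; the FWE step on cluster sizes is what GroupClusterThreshold adds on top.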
_______________________________________________
Pkg-ExpPsy-PyMVPA mailing list
Pkg-ExpPsy-PyMVPA@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa