Hi Everyone: Thanks for your responses! I figured I should follow up and let you know my progress.
First, a bit more description of the data: the 20 samples will definitely grow as we get more participants. Essentially, I'm trying to predict individual differences in various psychological measures (e.g., working memory capacity, personality traits, etc.) based on functional activity during a task. This is similar to classifying whether or not people have some neurological disease from looking at their structural MRIs.

The reason I have so many features is that I have three conditions in the task and ran an FIR with 8 time points for each condition, leaving me with 3*8*nvoxels features per sample (in my case, where the data have been upsampled, that is about 3 million features). To tame the data and try to prevent overfitting, I wanted to reduce the number of features. Running an SVD on all of the data, with that many features and only 20 samples, didn't converge, so I ended up splitting the features into smaller chunks, running an SVD on each chunk, and concatenating the results. After this step I had about 4K features, down from around 3 million.

I then used a GLMNET_R sparse regression classifier to predict the psychological measures one at a time. This was able to replicate previously published results. For example, I can predict someone's neuroticism level with R = .6 (it turns out it's pretty much all amygdala activity for that one, which is already in the literature). On all my other measures I'm doing quite poorly, however.

What I really wanted to do was run either a partial least squares regression or a canonical correlation analysis on these data, predicting all psychological measures at once to see interactions between them, but all my attempts at that (with R and Matlab, because I don't have Python code for either) have not worked at all. I got the PLS regression to run (via RPy2), but it didn't even replicate my other analysis and failed to fit anything. So, I'm still plugging away, but if anyone has any thoughts or ideas, I'd love to hear them.
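For anyone following along, the chunked SVD step described above could be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the code actually used: the function name `chunked_svd_reduce`, the chunk size, the number of components kept per chunk, and the random stand-in data are all hypothetical.

```python
import numpy as np


def chunked_svd_reduce(X, chunk_size, n_components):
    """Reduce features by running a thin SVD on each block of columns.

    X is an (n_samples, n_features) array. The return value is an
    (n_samples, n_reduced) array of concatenated per-chunk SVD scores.
    All parameter names and defaults here are illustrative assumptions.
    """
    n_samples, n_features = X.shape
    reduced = []
    for start in range(0, n_features, chunk_size):
        chunk = X[:, start:start + chunk_size]
        # Center each feature so the SVD captures variance, not the mean.
        chunk = chunk - chunk.mean(axis=0)
        # Thin SVD: with few samples, U is only n_samples x n_samples.
        U, s, Vt = np.linalg.svd(chunk, full_matrices=False)
        k = min(n_components, len(s))
        # Scores: projection of the samples onto the top-k components.
        reduced.append(U[:, :k] * s[:k])
    return np.hstack(reduced)


# Stand-in data: 20 samples, 3 conditions * 8 FIR time points * n_voxels.
rng = np.random.default_rng(0)
n_voxels = 500  # tiny placeholder; the real data have far more voxels
X = rng.standard_normal((20, 3 * 8 * n_voxels))

X_red = chunked_svd_reduce(X, chunk_size=1000, n_components=10)
print(X.shape, "->", X_red.shape)
```

One caveat with a sketch like this: if the reduced features feed a cross-validated prediction, the SVD would need to be fit on the training samples only and then applied to the held-out ones, otherwise the test data leak into the feature reduction.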
Best,
Per

On Thu, Apr 7, 2011 at 12:03 PM, Yaroslav Halchenko <deb...@onerussian.com> wrote:
> yeap! and that is why I was also skeptical about results based on
> 10-20 samples ;)
>
> On Thu, 07 Apr 2011, Emanuele Olivetti wrote:
>
>> Cute indeed :-). Figure 1 is pretty scary, especially if we replace
>> "number of classifiers" with "number of features". Of course the
>> assumption is that classifiers (or features) are independent. But still...
>
>> Best,
>
>> E.
> --
> =------------------------------------------------------------------=
> Keep in touch www.onerussian.com
> Yaroslav Halchenko www.ohloh.net/accounts/yarikoptic
>
> _______________________________________________
> Pkg-ExpPsy-PyMVPA mailing list
> Pkg-ExpPsy-PyMVPA@lists.alioth.debian.org
> http://lists.alioth.debian.org/mailman/listinfo/pkg-exppsy-pymvpa