Hi, have you tried the vstack function implemented in PyMVPA? http://www.pymvpa.org/generated/mvpa2.base.dataset.vstack.html#mvpa2.base.dataset.vstack
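Something along these lines should do it (a rough sketch, assuming ds_1 and ds_2 are your two already-processed session datasets with identical feature spaces; the chunk offset is only needed if both sessions number their runs 0..3):

    from mvpa2.base.dataset import vstack
    from mvpa2.generators.partition import NFoldPartitioner

    # if both sessions label their runs 0..3, shift the second session's
    # chunks so the 8 combined runs stay distinguishable (skip if they
    # already differ)
    ds_2.sa.chunks = ds_2.sa.chunks + ds_1.sa.chunks.max() + 1

    # stack along the samples axis: samples, targets, chunks and the other
    # sample attributes are concatenated; the feature axes must match
    combined = vstack((ds_1, ds_2))

    # leave-two-runs-out partitioning over the 8 combined runs
    partitioner = NFoldPartitioner(cvtype=2)

Depending on your PyMVPA version you may get a complaint about the two datasets' dataset-level attributes (the .a collection) differing; if so, check the vstack docs linked above for how attribute merging is handled in your release.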
Roberto

On 1 October 2014 05:15, Shane Hoversten <[email protected]> wrote:
> Hi -
>
> I have some code that creates a dataset from a bunch of ornate processing
> (figuring out which volumes to censor based on subject performance, subject
> motion, other scanning params; creating aggregate event types for certain
> events; etc.). Dataset creation has been done, to this point, per-session:
> subjects were scanned twice, each session is processed separately, and all
> that ornate stuff will be different from session to session.
>
> Now I want to aggregate these two sessions and do MVPA things after
> throwing all the data into the hopper. Specifically, instead of using
> NFoldPartitioner on a day's worth of runs (there are 4 runs per day) and
> leaving one run out, I want to run it on the combined runs from both days
> (8 runs) and leave two out.
>
> PyMVPA is awesome, so it would be clear how to do this if all the data
> were aggregated together; but as I mentioned, to get the dataset into the
> right format I have to do a bunch of processing, and it would be a pain to
> combine all the various files to make this into one single aggregate set.
> What I'd rather do is just glue together two separate datasets, which have
> already been processed in the ways they require, such that the new dataset
> just has the samples, targets, and associated attributes from the second
> session's dataset glued onto the first session's dataset.
>
> The ds.samples variable reports as being a numpy.ndarray, so I figured I
> could just stuff them together with array operations, for instance:
>
> combined_ds = np.append(ds_1.samples, ds_2.samples)
>
> and so on for the targets, sample attributes, etc. But nope, this gets
> screwed up immediately:
>
> In [57]: ds_1 = m.MVPAMaster("tp101", 1, "dc", "new_temporal_tp101_day1.nii")
>
> In [58]: len(ds_1.ds)
> Out[58]: 290
>
> In [57]: ds_2 = m.MVPAMaster("tp101", 2, "dc", "new_temporal_tp101_day2.nii")
>
> In [58]: len(ds_2.ds)
> Out[58]: 290
>
> In [64]: combined = np.append(ds_1.ds.samples, ds_2.ds.samples)
>
> In [65]: len(combined)
> Out[65]: 16960360
>
> I'm thinking this is a mapper unrolling everything behind the scenes,
> maybe? I could beat my head against this for a while, but I figured first
> I'd ask and see if there's a straightforward method of extending a dataset
> in this fashion?
>
> Thanks,
>
> Shane
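P.S. The 16960360 isn't a mapper unrolling things behind the scenes: np.append without an axis argument flattens both arrays before joining them, and 16960360 works out to 580 samples times roughly 29242 features (that feature count is just inferred from the numbers above). Even passing axis=0 would only give you a bare ndarray with the targets and chunks left behind, which is why the PyMVPA vstack helper is the way to go. A toy illustration:

    import numpy as np

    # toy arrays with the per-session sample count from the thread;
    # 29242 features is an inferred, illustrative value
    a = np.zeros((290, 29242))
    b = np.zeros((290, 29242))

    np.append(a, b).shape          # (16960360,)  flattened, as observed
    np.append(a, b, axis=0).shape  # (580, 29242) stacked, but no attributes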
_______________________________________________ Pkg-ExpPsy-PyMVPA mailing list [email protected] http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa

