Nick, I think I have in the region of 250,000 voxels! All I did was extract the voxels with a brain mask; I didn't realise this would be so memory-hungry. So is ridge regression O(n^2 * m) in memory for n features and m subjects?
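Judging by the traceback quoted further down, the short answer is that this RidgeReg implementation is quadratic in the number of features and independent of the number of subjects: with no explicit penalty it builds the regularisation term as a dense nfeatures x nfeatures identity matrix before anything is trained. A quick back-of-the-envelope check (a sketch; the 217105 figure is taken from the ds.summary() below):

    n_features = 217105  # from ds.summary() below: 133x217105@float32

    # RidgeReg._train() with lm=None executes
    #     Lambda = .05 * data.nfeatures * np.eye(data.nfeatures)
    # (see the traceback), and np.eye() allocates a dense float64 array
    # of shape (n_features, n_features) up front.
    penalty_bytes = n_features ** 2 * 8  # 8 bytes per float64 entry
    print('penalty matrix alone: ~%.0f GiB' % (penalty_bytes / 2.0 ** 30))
    # -> ~351 GiB, far beyond 32 GiB of RAM, no matter how few
    #    subjects (samples) end up in the training half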
Michael, I just use the following:

    ds = Dataset(newMetaData['maskedData'])
    ds.sa['diagnoses'] = newMetaData['diagnosesT1']

and then ds.summary() gives:

    Dataset: 133x217105@float32, <sa: diagnoses,runtype>, <a: mapper>
    stats: mean=-2.9664e-06 std=0.966266 var=0.93367 min=-5.76981 max=11.4395
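For what it's worth, a self-contained version of that assembly might look like the sketch below. The 'runtype' assignment is hypothetical: ds.summary() shows the attribute exists, and HalfPartitioner(attr='runtype') in the quoted message depends on it, but the thread never shows how it is set, so the key and values here are purely illustrative, as are the tiny stand-in arrays.

    import numpy as np
    from mvpa2.datasets.base import Dataset

    # stand-in for the thread's newMetaData dict -- really 133 subjects
    # x 217105 masked voxels; tiny random data here just so this runs
    newMetaData = {
        'maskedData': np.random.randn(4, 10).astype('float32'),
        'diagnosesT1': ['ctrl', 'ctrl', 'pat', 'pat'],
    }

    ds = Dataset(newMetaData['maskedData'])
    ds.sa['diagnoses'] = newMetaData['diagnosesT1']
    # HalfPartitioner(attr='runtype') needs one label per sample;
    # this assignment is illustrative only
    ds.sa['runtype'] = ['half1', 'half1', 'half2', 'half2']

    print(ds.summary())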
On Mon, Nov 24, 2014 at 5:14 PM, Michael Hanke <[email protected]> wrote:

> Hi,
>
> could you please show the code that assembles the dataset? Also, the
> output of
>
>     print ds.summary()
>
> could help to find the problem.
>
> Michael
>
> On Mon, Nov 24, 2014 at 4:35 PM, Thomas Nickson <[email protected]> wrote:
>
>> Hey All,
>>
>> I'm using some structural data, 133 subjects totalling about 350 MB, on a
>> machine with 32 GB of RAM. I'm trying to set up a basic algorithm to make
>> sure that everything works. I have chosen ridge regression, I'm attempting
>> to use a HalfPartitioner, and I'm running all of the code from an IPython
>> notebook.
>>
>>     from mvpa2.clfs.ridge import RidgeReg
>>
>>     a = HalfPartitioner(attr='runtype')
>>     clf = RidgeReg()
>>     cv = CrossValidation(clf, a)
>>     cv_results = cv(ds)
>>
>> However, I get the following memory error, even when I use a very minimal
>> subset such as 2 subjects:
>>
>>     ---------------------------------------------------------------------------
>>     MemoryError                               Traceback (most recent call last)
>>     <ipython-input-205-fafb290e8469> in <module>()
>>     ----> 1 cv_results = cv(ds)
>>
>>     /home/orkney_01/tnickson/Programming/pyVirtualEnv/lib/python2.6/site-packages/mvpa2/base/learner.pyc in __call__(self, ds)
>>         257                     "used and auto training is disabled."
>>         258                     % str(self))
>>     --> 259         return super(Learner, self).__call__(ds)
>>         260
>>         261
>>
>>     /home/orkney_01/tnickson/Programming/pyVirtualEnv/lib/python2.6/site-packages/mvpa2/base/node.pyc in __call__(self, ds)
>>         119
>>         120         self._precall(ds)
>>     --> 121         result = self._call(ds)
>>         122         result = self._postcall(ds, result)
>>         123
>>
>>     /home/orkney_01/tnickson/Programming/pyVirtualEnv/lib/python2.6/site-packages/mvpa2/measures/base.pyc in _call(self, ds)
>>         495         # always untrain to wipe out previous stats
>>         496         self.untrain()
>>     --> 497         return super(CrossValidation, self)._call(ds)
>>         498
>>         499
>>
>>     /home/orkney_01/tnickson/Programming/pyVirtualEnv/lib/python2.6/site-packages/mvpa2/measures/base.pyc in _call(self, ds)
>>         324                 ca.datasets.append(sds)
>>         325             # run the beast
>>     --> 326             result = node(sds)
>>         327             # callback
>>         328             if not self._callback is None:
>>
>>     /home/orkney_01/tnickson/Programming/pyVirtualEnv/lib/python2.6/site-packages/mvpa2/base/learner.pyc in __call__(self, ds)
>>         257                     "used and auto training is disabled."
>>         258                     % str(self))
>>     --> 259         return super(Learner, self).__call__(ds)
>>         260
>>         261
>>
>>     /home/orkney_01/tnickson/Programming/pyVirtualEnv/lib/python2.6/site-packages/mvpa2/base/node.pyc in __call__(self, ds)
>>         119
>>         120         self._precall(ds)
>>     --> 121         result = self._call(ds)
>>         122         result = self._postcall(ds, result)
>>         123
>>
>>     /home/orkney_01/tnickson/Programming/pyVirtualEnv/lib/python2.6/site-packages/mvpa2/measures/base.pyc in _call(self, ds)
>>         598                            for i in dstrain.get_attr(splitter.get_space())[0].unique])
>>         599         # ask splitter for first part
>>     --> 600         measure.train(dstrain)
>>         601         # cleanup to free memory
>>         602         del dstrain
>>
>>     /home/orkney_01/tnickson/Programming/pyVirtualEnv/lib/python2.6/site-packages/mvpa2/base/learner.pyc in train(self, ds)
>>         130         # things might have happened during pretraining
>>         131         if ds.nfeatures > 0:
>>     --> 132             result = self._train(ds)
>>         133         else:
>>         134             warning("Trying to train on dataset with no features present")
>>
>>     /home/orkney_01/tnickson/Programming/pyVirtualEnv/lib/python2.6/site-packages/mvpa2/clfs/ridge.pyc in _train(self, data)
>>          76         if self.__lm is None:
>>          77             # Not specified, so calculate based on .05*nfeatures
>>     ---> 78             Lambda = .05*data.nfeatures*np.eye(data.nfeatures)
>>          79         else:
>>          80             # use the provided penalty
>>
>>     /usr/lib64/python2.6/site-packages/numpy/lib/twodim_base.pyc in eye(N, M, k, dtype)
>>         208     if M is None:
>>         209         M = N
>>     --> 210     m = zeros((N, M), dtype=dtype)
>>         211     if k >= M:
>>         212         return m
>>
>>     MemoryError:
>>
>> Surely this can't really be because I'm out of memory, can it? The 32 GB
>> machine should be more than able to load a small portion of the 350 MB
>> dataset. Is there something I'm missing, or have I configured something
>> wrongly?
>>
>> Also, I'm not really sure how the HalfPartitioner works. Does it just
>> generate the complement of the sets I determine, or can it split the
>> dataset in two for me based on the groups? If that's possible, how can
>> I do that?
>>
>> Thanks,
>>
>> Tom
>
> --
> Michael Hanke
> http://mih.voxindeserto.de
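As for getting ridge regression to run at all on ~217k voxels: one possible workaround, not from this thread, is to wrap scikit-learn's Ridge estimator in PyMVPA's SKLLearnerAdapter. scikit-learn's solvers do not materialise an nfeatures x nfeatures matrix; with many more features than samples they can work from the much smaller samples x samples Gram matrix instead. This is a sketch assuming a PyMVPA version that ships SKLLearnerAdapter, an installed scikit-learn, and the `ds` assembled above; the alpha value is illustrative. On the last question: HalfPartitioner itself splits the dataset into two partitions based on the unique values of the given sample attribute, so CrossValidation trains on one half and tests on the other in turn.

    from mvpa2.clfs.skl.base import SKLLearnerAdapter
    from mvpa2.generators.partition import HalfPartitioner
    from mvpa2.measures.base import CrossValidation
    from sklearn.linear_model import Ridge

    # wrap the sklearn estimator so PyMVPA can train/test it like any
    # native learner; alpha plays the role of RidgeReg's lm penalty
    clf = SKLLearnerAdapter(Ridge(alpha=1.0))

    # HalfPartitioner assigns samples to two halves according to the
    # unique values of 'runtype'; CrossValidation then trains on one
    # half and tests on the other, and vice versa
    cv = CrossValidation(clf, HalfPartitioner(attr='runtype'))
    cv_results = cv(ds)  # ds assembled as in the messages above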

