Dear Matteo and Nick,
thank you for your responses.
I would like to take this occasion to ask some follow-up questions, because I am struggling to make pymvpa2 computations faster and more efficient.

I often find myself giving up on a particular analysis, because it would take far more time than I can afford (weeks, months!). This happens particularly with searchlight permutation testing (gnbsearchlight is much faster, but does not support pprocess), and with nested cross-validation. As for the latter, for example, I recently wanted to run nested cross-validation on a sample of 18 patients and 18 controls (one image per subject), training the classifiers to discriminate patients from controls in a leave-one-pair-out partitioning scheme. This yields 18*18=324 folds. For a small ROI of 36 voxels, cycling over approximately 40 different classifiers takes about 2 hours per fold on a decent PowerEdge T430 Dell server with 128 GB RAM. This means approximately 27 days for all 324 folds! The same server is equipped with 32 CPUs, so with full parallelization the same analysis could be completed in less than one day. This is the reason for my interest in, and questions about, parallelization.
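Since the 324 folds are fully independent of one another, they are in principle an "embarrassingly parallel" workload even without support inside pymvpa2 itself. A minimal sketch of the idea, using only the Python standard library: `evaluate_fold` is a hypothetical placeholder for whatever trains and tests the ~40 classifiers on one held-out patient/control pair, and the fold list reproduces the 18*18 leave-one-pair-out scheme described above.

```python
# Sketch only: distribute independent leave-one-pair-out folds over CPUs.
# evaluate_fold is a hypothetical stand-in for the real per-fold analysis.
from itertools import product
from multiprocessing import Pool

def evaluate_fold(fold):
    # Placeholder: in the real analysis, train ~40 classifiers on all
    # subjects except this (patient, control) pair, test on the pair,
    # and return the resulting accuracies.
    patient, control = fold
    return (patient, control, 0.5)  # dummy accuracy

if __name__ == "__main__":
    # Leave-one-pair-out: 18 patients x 18 controls = 324 independent folds.
    folds = list(product(range(18), range(18)))
    # Use as many worker processes as CPUs are available (e.g. 32 on the
    # server mentioned above); 4 is used here only as a modest example.
    with Pool(processes=4) as pool:
        results = pool.map(evaluate_fold, folds)
    print(len(results))  # 324
```

With 32 workers and ~2 hours per fold, wall-clock time would drop roughly by a factor of 32, which matches the "less than one day" estimate above.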

Is there anything that you experts do in such situations to speed up or make the computation more efficient?

Thank you again and best wishes,
Marco


On 10/11/2017 10:07, Nick Oosterhof wrote:

There have been some plans / minor attempts to use parallelisation more
widely, but as far as I know we only support pprocess, and only for (1)
searchlight; (2) surface-based voxel selection; and (3) hyperalignment. I
do remember that parallelisation of other functions was challenging due to
issues with getting the conditional attributes set right, but that was a
long time ago.

On 09/11/2017 18:35, Matteo Visconti di Oleggio Castello wrote:

Hi Marco,
AFAIK, there is no support for parallelization at the level of
cross-validation. Usually, for a small ROI (such as a searchlight) and with
standard CV schemes, the process is quite fast, and the bottleneck is
really the number of searchlights to be computed (for which parallelization
exists).

In my experience, we tend to parallelize at the level of individual
participants; for example, we might set up a searchlight analysis with
however many n_procs are available, and then submit one such job for every
participant to a cluster (using either torque or condor).
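The per-participant strategy above can be sketched on a single multi-core machine as well, without a cluster scheduler. This is only an illustration: `run_subject` is a hypothetical placeholder for a per-subject searchlight script, and the subject IDs are made up; on a cluster, the body of that function would instead be a standalone script submitted once per subject via torque or condor.

```python
# Sketch only: one (hypothetical) searchlight job per participant,
# run in parallel on a single machine with the standard library.
from multiprocessing import Pool

SUBJECTS = ["sub%02d" % i for i in range(1, 19)]  # hypothetical IDs

def run_subject(subject):
    # Placeholder for a per-subject searchlight analysis. On a cluster,
    # this would be a standalone script submitted as one job per subject.
    return subject, "done"

if __name__ == "__main__":
    with Pool(processes=8) as pool:
        status = dict(pool.map(run_subject, SUBJECTS))
```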

HTH,
Matteo

On 09/11/2017 10:08, marco tettamanti wrote:
Dear all,
forgive me if this has already been asked in the past, but I was wondering
whether there has been any development in the meantime.

Is there any chance that one can generally apply parallel computing (multiple
CPUs or clusters) with pymvpa2, beyond what is already implemented for
searchlight (pprocess)? That is, also for general cross-validation, nested
cross-validation, permutation testing, RFE, etc.?

Has anyone had successful experience with parallelization schemes such as
ipyparallel, condor, or others?

Thank you and best wishes!
Marco




_______________________________________________
Pkg-ExpPsy-PyMVPA mailing list
Pkg-ExpPsy-PyMVPA@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa