On 13/02/15 13:40, Mark Wynter wrote:
Hi Moritz

With the second approach (the code I shared in my post), I have 3500 discrete 
jobs, and I set the number of batches equal to the number of CPUs. Each batch 
job is despatched to a CPU, where it pulls from a queue of job IDs that 
are processed serially within that batch job. The thinking behind this 
approach was to allocate jobs across the available CPUs as separate batch processes.

The other, preferred approach is to launch one batch job and let GNU 
parallel draw down from the list of 3500 jobs, assigning jobs to worker 
functions as CPUs become available. I’ve had much success with this code 
pattern when parallelising PostGIS queries.
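
For readers comparing the two scheduling strategies described above, here is a minimal Python sketch. It is not the code from the original post. The names `split_static`, `run_dynamic` and `worker` are hypothetical, and it assumes each `worker(job_id)` call mostly waits on an external process, so threads are sufficient for dispatching.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import os


def split_static(job_ids, n_batches):
    """Pre-assign jobs to n_batches round-robin (one batch per CPU)."""
    return [job_ids[i::n_batches] for i in range(n_batches)]


def run_dynamic(job_ids, worker, max_workers=None):
    """Run worker(job_id) for every job, handing out one job at a time.

    A thread picks up the next pending job as soon as it finishes its
    previous one, so a slow job does not hold up a pre-assigned batch.
    `worker` is a hypothetical callable supplied by the caller.
    """
    max_workers = max_workers or os.cpu_count() or 1
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, job_id): job_id for job_id in job_ids}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results
```

Static batches are simple to reason about but can leave CPUs idle when job durations vary; dynamic dispatch evens that out. Neither helps if the jobs contend for a shared resource.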

As you have suspected, I get no benefit from additional CPUs.

Are you sure the problem is CPU-bound?
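
One rough way to check is to compare the CPU time a single job consumes with its wall-clock time. The sketch below is only an illustration: `cpu_fraction` is a hypothetical helper, the command passed to it is a placeholder rather than a real GRASS invocation, and it assumes a Unix-like system where `os.times()` reports child process times.

```python
import os
import subprocess
import sys
import time


def cpu_fraction(cmd):
    """Return (wall_seconds, cpu_seconds, cpu_seconds / wall_seconds) for cmd."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    wall = time.perf_counter() - start
    after = os.times()
    # CPU time (user + system) consumed by child processes during the run.
    cpu = ((after.children_user - before.children_user)
           + (after.children_system - before.children_system))
    ratio = cpu / wall if wall > 0 else 0.0
    return wall, cpu, ratio


if __name__ == "__main__":
    # Placeholder job: a child process that only sleeps, so it uses little CPU.
    print(cpu_fraction([sys.executable, "-c", "import time; time.sleep(0.3)"]))
```

A ratio close to 1 for a single-threaded job suggests it is CPU-bound; a much lower ratio suggests it spends its time waiting on disk, a database, or a lock, in which case extra CPUs would not be expected to help.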


Unfortunately I don’t have time on my side, and parallelisation is critical. A 
fallback is to spin up a cluster of 16 x 2-CPU machines, pre-allocate 
job IDs to machines, and then write the results back to the master node - but 
this is not ideal and is a pathway I am reluctant to go down.

Do you know anyone who may have attempted to parallelise v.net?

No. Personally I don't have any experience with this.
Are you speaking specifically about v.net.distance here?


I guess the most important question right now is - is it possible to do poor 
man’s parallelisation with v.net? Anyone?

The one who knows the insides of these modules best is Markus Metz.

Moritz
_______________________________________________
grass-user mailing list
grass-user@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-user
