I’ve encountered a bottleneck somewhere with v.net when scaling out with GNU Parallel, and I’m not sure whether it’s an underlying issue with v.net or with the way I’m calling the batch jobs.
I’ve got 32 CPUs and commensurate RAM. What I’m observing is the CPU utilisation of each v.net.distance process dropping off as the number of running jobs increases. I’ve tried launching a single batch job with a single mapset, as well as multiple batch jobs, each with its own mapset (and database). I’ve tried both PostgreSQL and SQLite backends. Same issue.

The script at the bottom shows the approach of launching multiple batch jobs, each with its own mapset. Executing a single batch job and then launching parallel within the batch script is much cleaner code, but the results are no different.

I feel I’m so close, yet so far, at such a critical stage of project delivery. Hope someone can help.

Kind regards
Mark

RESULTS

ONE JOB
TOTAL SCRIPT TIME: 70

  PID USER      PR  NI   VIRT   RES   SHR S  %CPU  %MEM     TIME+ COMMAND
31313 root      20   0  28876  4080  1284 S  76.5   0.0   0:20.25 sqlite
31293 root      20   0   276m  134m  8320 S  68.5   0.2   0:20.22 v.net.distance

—————————

TWO JOBS
TOTAL SCRIPT TIME: 96

  PID USER      PR  NI   VIRT   RES   SHR S  %CPU  %MEM     TIME+ COMMAND
21391 root      20   0  28876  4080  1284 R  53.0   0.0   0:01.90 sqlite
21392 root      20   0  28876  4080  1284 R  52.6   0.0   0:01.86 sqlite
21380 root      20   0   276m  128m  8320 R  49.3   0.2   0:04.02 v.net.distance
21381 root      20   0   276m  128m  8320 S  48.3   0.2   0:03.97 v.net.distance

—————————

FOUR JOBS
TOTAL SCRIPT TIME: 187

  PID USER      PR  NI   VIRT   RES   SHR S  %CPU  %MEM     TIME+ COMMAND
 6953 mark      20   0   180m  100m  9520 S  63.6   0.2   1:47.39 x2goagent
23025 root      20   0  28876  4080  1284 S  21.5   0.0   0:02.03 sqlite
23026 root      20   0  28876  4080  1284 R  19.9   0.0   0:02.08 sqlite
23027 root      20   0  28876  4080  1284 S  19.5   0.0   0:01.87 sqlite
23028 root      20   0  28876  4080  1284 S  19.5   0.0   0:01.84 sqlite
23014 root      20   0   276m  128m  8320 R  18.5   0.2   0:04.06 v.net.distance
23012 root      20   0   276m  128m  8320 R  17.5   0.2   0:03.91 v.net.distance
23011 root      20   0   276m  128m  8320 S  16.9   0.2   0:04.13 v.net.distance
23015 root      20   0   276m  128m  8320 R  16.9   0.2   0:03.80 v.net.distance

—————————

EIGHT JOBS
TOTAL SCRIPT TIME: 373

  PID USER      PR  NI   VIRT   RES   SHR S  %CPU  %MEM     TIME+ COMMAND
27157 root      20   0  28876  4088  1284 S  19.5   0.0   0:42.39 sqlite
27162 root      20   0  28876  4088  1284 R  16.9   0.0   0:40.60 sqlite
 6953 mark      20   0   181m  101m  9520 S  16.5   0.2   2:18.86 x2goagent
27154 root      20   0  28876  4088  1284 S  16.5   0.0   0:39.38 sqlite
27153 root      20   0  28876  4088  1284 S  16.2   0.0   0:35.60 sqlite
27156 root      20   0  28876  4088  1284 R  16.2   0.0   0:38.18 sqlite
27161 root      20   0  28876  4088  1284 S  15.9   0.0   0:40.96 sqlite
27155 root      20   0  28876  4088  1284 S  15.6   0.0   0:38.41 sqlite
27104 root      20   0   284m  139m  8332 S  14.9   0.2   0:39.94 v.net.distance
27158 root      20   0  28876  4088  1284 R  14.6   0.0   0:37.49 sqlite
27095 root      20   0   284m  138m  8332 S  14.2   0.2   0:34.48 v.net.distance
27099 root      20   0   284m  138m  8332 S  14.2   0.2   0:38.27 v.net.distance
27101 root      20   0   284m  139m  8332 R  14.2   0.2   0:38.80 v.net.distance
27105 root      20   0   284m  139m  8332 R  14.2   0.2   0:37.95 v.net.distance
27093 root      20   0   284m  138m  8332 R  13.9   0.2   0:32.64 v.net.distance
27102 root      20   0   284m  140m  8332 R  13.6   0.2   0:40.90 v.net.distance
27094 root      20   0   284m  138m  8332 R  13.2   0.2   0:35.78 v.net.distance

—————————

################################################
############ WORKER FUNCTION #############
################################################

# CREATE MAPSETS AND BASH SCRIPTS FOR EACH CPU
fn_worker (){

#######################
# copy mapset
#######################
cp -R /var/tmp/jtw/PERMANENT /var/tmp/jtw/batch_"$1"

#######################
# generate batch_job file
# ($1 is expanded now, everything else when the job file runs)
#######################
echo -e '#!/bin/bash
dbsettings="/mnt/data/common/repos/cf_private/settings/current.sh"
source $dbsettings
cpu='$1'
jid=`psql -d $dbname -U $username -A -t -c "SELECT min(jid) FROM jtw.nsw_tz_joblist WHERE processed = false and cpu = '$1';"`
o_tz11=`psql -d $dbname -U $username -A -t -c "SELECT o_tz11 FROM jtw.nsw_tz_joblist WHERE jid = $jid;"`
o_cat=`psql -d $dbname -U $username -A -t -c "SELECT o_tz11 FROM jtw.nsw_tz_joblist WHERE jid = $jid;"`
d_cat=`psql -d $dbname -U $username -A -t -c "SELECT d_tz11 FROM jtw.nsw_tz_joblist WHERE jid = $jid;"`
layername="temp_"$jid
v.net.distance --overwrite in=nsw_road_network_final_connected@batch_'$1' out=$layername from_layer=2 to_layer=2 from_cats=$d_cat to_cats=$o_cat arc_column=fwdcost arc_backward_column=bwdcost
v.out.ogr --o input=$layername output=/var/tmp/$layername type=line
ogr2ogr -overwrite -f "PostgreSQL" PG:"host=localhost dbname=$dbname user=$username password=$password" /var/tmp/$layername/$layername.shp -nln jtw.$layername -s_srs EPSG:3577 -t_srs EPSG:3577 -a_srs EPSG:3577 -nlt LINESTRING
psql -d $dbname -U $username -c "INSERT INTO jtw.nsw_tz_journey_paths
WITH s AS (
  SELECT a.cat, a.tcat, b.tz_code11 as o_tz11, c.tz_code11 as d_tz11, d.lid, d.wkb_geometry, e.employed_persons
  FROM jtw.$layername a, grass.nsw_tz_centroids_nodes b, grass.nsw_tz_centroids_nodes c, jtw.nsw_road_network_final_net_att d, jtw.nsw_tz_volumes e
  WHERE a.tcat = b.cat AND a.cat = c.cat
  AND ST_Equals(a.wkb_geometry, d.wkb_geometry)
  AND d.type <> '\'service_line\''
  AND b.tz_code11 = e.o_tz11 AND c.tz_code11 = e.d_tz11 AND e.mode9 = 4)
SELECT NEXTVAL('\'jtw.nsw_tz_journey_paths_jid_seq\''), o_tz11, d_tz11, lid, wkb_geometry, employed_persons FROM s;
UPDATE jtw.nsw_tz_joblist SET processed = true WHERE jid = $jid;"
#end of job file' > /var/tmp/jtw/jobs/batch_$1.sh
#######################
chmod u+x /var/tmp/jtw/jobs/batch_$1.sh
}
export -f fn_worker

# remove previous mapsets before writing new files
rm -rf /var/tmp/jtw/batch*
rm -rf /var/tmp/jtw/jobs/batch*

#execute in parallel
seq 1 4 | parallel fn_worker {1}
wait
#######################

################################################
####### JOB SCHEDULER ########
################################################

#\\\\\\\\\\\\\\\\\\\\\\\\\
START_T1=$(date +%s)
#\\\\\\\\\\\\\\\\\\\\\\\\\

# run one GRASS batch job per mapset
fn_worker (){
export GRASS_BATCH_JOB=/var/tmp/jtw/jobs/batch_$1.sh
grass70 /var/tmp/jtw/batch_$1
unset GRASS_BATCH_JOB
}
export -f fn_worker
seq 1 4 | parallel fn_worker {1}
wait

#\\\\\\\\\\\\\\\\\\\\\\\\\
END_T1=$(date +%s)
#\\\\\\\\\\\\\\\\\\\\\\\\\
TOTAL_DIFF=$(( $END_T1 - $START_T1 ))
echo "TOTAL SCRIPT TIME: $TOTAL_DIFF"
#\\\\\\\\\\\\\\\\\\\\\\\\\
################################################

>>
>> The slow rate of writing out the v.net.allpairs results from
>> PostgreSQL was due to the sheer volume of line strings, as the number
>> of pairs increased (n^2). Simple math said stop. I’ve since
>> changed my approach and am using v.net.distance in a novel way where
>> the to_cat is the origin, and the from_cat is a string of
>> destinations - this is an equivalent way of generating multiple
>> v.net.path runs in a single operation. Moreover, I’m feeding each
>> origin - destination collection into GNU Parallel as a separate job,
>> so it rips through the data at scale!
>
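
To make the quoted approach concrete, here is a minimal sketch of how origin - destination pairs might be grouped into one v.net.distance call per origin. The helper names group_destinations and build_command, the temp_<origin> output name and the sample category numbers are made up for illustration; only the v.net.distance options are taken from the batch script above. The sketch just prints the commands and does not call GRASS.

```bash
#!/bin/bash
# Assumption: the network map name is the one used in the batch script.
NETWORK="${NETWORK:-nsw_road_network_final_connected}"

# Hypothetical helper: read "origin destination" pairs (one per line) on
# stdin and print one line per origin: "<origin> <dest1,dest2,...>".
# Origins are printed in the order they first appear.
group_destinations() {
    awk 'NF < 2 { next }
         !($1 in dests) { order[++n] = $1; dests[$1] = $2; next }
         { dests[$1] = dests[$1] "," $2 }
         END { for (i = 1; i <= n; i++) print order[i], dests[order[i]] }'
}

# Hypothetical helper: print (not run) the v.net.distance call for one origin.
# $1 = origin category (to_cats), $2 = comma-separated destinations (from_cats).
build_command() {
    printf 'v.net.distance --overwrite in=%s out=temp_%s from_layer=2 to_layer=2 from_cats=%s to_cats=%s arc_column=fwdcost arc_backward_column=bwdcost\n' \
        "$NETWORK" "$1" "$2" "$1"
}

# Demo with made-up category numbers: two origins, three pairs in total.
printf '101 201\n101 202\n102 201\n' |
    group_destinations |
    while read -r origin destinations; do
        build_command "$origin" "$destinations"
    done
```

Each printed line would still have to run inside its own GRASS session and mapset, as in the job scheduler above. The grouped lines could presumably also be fed straight to GNU Parallel with --colsep ' ', so that {1} is the origin and {2} the destination list, but I have not tested that variant.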
_______________________________________________
grass-user mailing list
grass-user@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-user