jjacob7734 opened a new pull request #50: SDAP-151 Determine parallelism automatically for Spark analytics
URL: https://github.com/apache/incubator-sdap-nexus/pull/50
 
 
   The built-in NEXUS analytics timeSeriesSpark, timeAvgMapSpark, corrMapSpark,
and climMapSpark previously obtained their desired parallelism from a job request
parameter such as "spark=mesos,16,32".  If that parameter was omitted, they
defaulted to "spark=local,1,1", which runs on a single core.  The algorithms now
determine the appropriate level of parallelism automatically from the job's
Spark cluster configuration.  The "spark" job parameter is no longer supported.
A new optional job parameter, "nparts", can be used to explicitly set the number
of data partitions (e.g., "nparts=16").
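The selection logic described above can be sketched roughly as follows. This is a minimal illustration, not the actual patch: the function name is hypothetical, and in the real code the default would come from the Spark cluster configuration (e.g., `SparkContext.defaultParallelism` in PySpark) rather than being passed in directly.

```python
def determine_num_partitions(default_parallelism, nparts=None):
    """Choose the number of data partitions for a Spark analytic.

    If the job request included an explicit "nparts" parameter, honor it;
    otherwise fall back to the parallelism reported by the Spark cluster
    configuration (hypothetical sketch -- names are illustrative only).
    """
    if nparts is not None:
        # Explicit override from the "nparts" job parameter, e.g. nparts=16
        return max(1, int(nparts))
    # No override: use the cluster's configured parallelism
    return max(1, int(default_parallelism))
```

For example, a job submitted with "nparts=16" would partition the data 16 ways regardless of cluster size, while a job with no "nparts" parameter would use whatever parallelism the cluster reports.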

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
