skorper opened a new pull request, #262:
URL: https://github.com/apache/incubator-sdap-nexus/pull/262

   _Note: This is branched off of SDAP-467. We should merge these branches in 
the order SDAP-455 -> SDAP-467 -> SDAP-473._
   
   Implemented basic job prioritization in SDAP for matchup jobs. 
   
   - Updated scheduling between pools to `FAIR`. Default in Spark is `FIFO`.
   
   > Under fair sharing, Spark assigns tasks between jobs in a “round robin” 
fashion, so that all jobs get a roughly equal share of cluster resources 
[source](https://spark.apache.org/docs/latest/job-scheduling.html#scheduling-within-an-application)
   
   - Created a new `scheduler.xml` file, which specifies the two new pools. 
For now, very simple "large" and "small" pools are used. In the future, we 
should consider adding more pools.
   - If the number of primary tiles exceeds a threshold, the job uses the 
"large" pool. The "large" pool is given much lower priority in terms of 
resources than the "small" pool. The threshold is currently 4000, which is 
somewhat arbitrary. This will also perform very differently on sat-to-sat vs 
sat-to-insitu vs anything with L2 tiles. 
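
As a rough illustration of the three changes listed above, here is a minimal Python sketch. It is not the code in this PR. The pool weights, the helper names, and the `scheduler.xml` path passed in are assumptions made for the example; only the pool names, the `FAIR` mode, and the 4000-tile threshold come from the description. The XML layout follows Spark's documented fair scheduler allocation file format.

```python
import xml.etree.ElementTree as ET

# Threshold quoted in the PR description (described there as somewhat arbitrary).
LARGE_JOB_TILE_THRESHOLD = 4000

# ASSUMPTION: illustrative weights only. The PR states that "large" gets much
# lower priority than "small" but does not give the actual values.
POOL_WEIGHTS = {"small": 10, "large": 1}


def build_allocations_xml(pool_weights):
    """Return a Spark fair-scheduler allocation file as an XML string."""
    root = ET.Element("allocations")
    for name, weight in pool_weights.items():
        pool = ET.SubElement(root, "pool", {"name": name})
        ET.SubElement(pool, "schedulingMode").text = "FAIR"
        ET.SubElement(pool, "weight").text = str(weight)
    return ET.tostring(root, encoding="unicode")


def spark_scheduler_settings(allocation_file):
    """Spark properties that turn on FAIR scheduling and point at the pool file."""
    return [
        ("spark.scheduler.mode", "FAIR"),
        ("spark.scheduler.allocation.file", allocation_file),
    ]


def choose_pool(num_primary_tiles, threshold=LARGE_JOB_TILE_THRESHOLD):
    """Pick a pool name: jobs with more primary tiles than the threshold go to 'large'."""
    return "large" if num_primary_tiles > threshold else "small"
```

With PySpark, the settings list could be passed to `SparkConf().setAll(...)`, and the chosen pool applied to the submitting thread with `sc.setLocalProperty("spark.scheduler.pool", choose_pool(n))` before the job starts. How SDAP wires this in is not shown here.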
   
   
   ---
   
   ### Testing
   
   This was tested by submitting a very large job to my local SDAP, then 
submitting a very small job to the same instance. 
   
   Large job:
   
   ```bash
   
/match_spark?primary=VIIRS_NPP-2018_Heatwave&secondary=ASCATB-L2-Coastal&startTime=2018-08-01T00%3A00%3A00Z&endTime=2018-08-02T23%3A59%3A59Z&b=-122%2C26%2C-114%2C34&platforms=0&depthMin=-1&depthMax=1&tt=43200&rt=18750&matchOnce=true&resultSizeLimit=500
   ```
   
   Small job:
   
   ```bash
   
/match_spark?matchOnce=true&secondary=MUR25-JPL-L4-GLOB-v04.2&primary=ASCATB-L2-Coastal&startTime=2018-08-01T00%3A00%3A00Z&endTime=2018-08-01T23%3A59%3A59Z&b=-122%2C26%2C-114%2C34&platforms=30&depthMin=-1&depthMax=1&tt=43200&rt=18750&resultSizeLimit=500
   ```
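
For anyone reproducing the test from a script, the small-job request above can be assembled from plain parameters. This is only a convenience sketch: `matchup_path` is a hypothetical helper, not part of SDAP, and the host to send the request to is left out because it depends on the local deployment.

```python
from urllib.parse import urlencode


def matchup_path(params):
    """Build the /match_spark request path with a percent-encoded query string."""
    return "/match_spark?" + urlencode(params)


# Parameters copied from the small job in this PR description.
small_job = {
    "matchOnce": "true",
    "secondary": "MUR25-JPL-L4-GLOB-v04.2",
    "primary": "ASCATB-L2-Coastal",
    "startTime": "2018-08-01T00:00:00Z",
    "endTime": "2018-08-01T23:59:59Z",
    "b": "-122,26,-114,34",
    "platforms": "30",
    "depthMin": "-1",
    "depthMax": "1",
    "tt": "43200",
    "rt": "18750",
    "resultSizeLimit": "500",
}

print(matchup_path(small_job))
```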
   
   When run alone, the small job takes roughly 30 to 60 seconds to complete. 
When run after the large job is submitted, the small job takes about 3 minutes 
to complete. The fact that it completes at all indicates that some level of 
resource sharing is happening in the system.
   
   I ran the above test again on the SDAP-467 (pagination) branch, which does 
not include this change. There, the small job (submitted second) has been 
running for 10 minutes and still has not finished. 

