Davis Shepherd created SPARK-20483:
--------------------------------------

             Summary: Mesos Coarse mode may starve other Mesos frameworks if max cores is not a multiple of executor cores
                 Key: SPARK-20483
                 URL: https://issues.apache.org/jira/browse/SPARK-20483
             Project: Spark
          Issue Type: Bug
          Components: Mesos
    Affects Versions: 2.1.0
            Reporter: Davis Shepherd
            Priority: Minor
If `spark.cores.max = 10` and `spark.executor.cores = 4`, for example, 2 executors will get launched, so `totalCoresAcquired = 8`. No future Mesos offer will ever have tasks launched against it, because `sc.conf.getInt("spark.executor.cores", ...) + totalCoresAcquired <= maxCores` will always evaluate to false. However, in `handleMatchedOffers` we check `totalCoresAcquired >= maxCores` to decide whether to decline the offer "for a configurable amount of time to avoid starving other frameworks", and in the above scenario this also always evaluates to false. This leaves the framework in limbo: it will never launch any new executors, yet it only declines offers for the Mesos default of 5 seconds, thus starving other frameworks of offers.

Relates to: SPARK-12554

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
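The gap between the two checks can be sketched as follows. This is a minimal standalone reproduction of the arithmetic described above, with simplified names (`canLaunch`, `declinesForLong` are hypothetical; the actual MesosCoarseGrainedSchedulerBackend code is more involved):

```scala
// Sketch of the two conditions from the report, not the real scheduler code.
object StarvationSketch {
  val maxCores = 10      // spark.cores.max
  val executorCores = 4  // spark.executor.cores

  // Launch check: only launch if another full executor still fits under maxCores.
  def canLaunch(totalCoresAcquired: Int): Boolean =
    executorCores + totalCoresAcquired <= maxCores

  // Long-decline check in handleMatchedOffers: decline offers for a
  // configurable period only once maxCores has been fully reached.
  def declinesForLong(totalCoresAcquired: Int): Boolean =
    totalCoresAcquired >= maxCores

  def main(args: Array[String]): Unit = {
    // With executorCores = 4 and maxCores = 10, acquisition stops at 8 cores.
    val acquired = (maxCores / executorCores) * executorCores
    println(canLaunch(acquired))       // false: 4 + 8 > 10, no more executors
    println(declinesForLong(acquired)) // false: 8 < 10, never declines for long
  }
}
```

With `acquired = 8`, both predicates are false at once, which is exactly the limbo state: the framework neither uses further offers nor releases them for the long, configurable period.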