[GitHub] spark pull request #16905: [SPARK-19567][CORE][SCHEDULER] Support some Sched...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16905 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #16905: [SPARK-19567][CORE][SCHEDULER] Support some Sched...
Github user kayousterhout commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16905#discussion_r106244124

    --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSchedulerImplSuite.scala ---
    @@ -73,17 +73,15 @@ class TaskSchedulerImplSuite extends SparkFunSuite with LocalSparkContext with B
         }
       }
    -  def setupScheduler(confs: (String, String)*): TaskSchedulerImpl = {
    +  private def setupScheduler(confs: (String, String)*): TaskSchedulerImpl = {
    --- End diff --

one quick comment here if Imran didn't already merge: can you un-do these changes? It's not useful / necessary to make test classes private (they're already hidden by the build), and this change will make git blames more confusing in the future.
[GitHub] spark pull request #16905: [SPARK-19567][CORE][SCHEDULER] Support some Sched...
Github user erenavsarogullari commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16905#discussion_r101176246

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/Pool.scala ---
    @@ -37,25 +37,22 @@ private[spark] class Pool(
       val schedulableQueue = new ConcurrentLinkedQueue[Schedulable]
       val schedulableNameToSchedulable = new ConcurrentHashMap[String, Schedulable]
    -  var weight = initWeight
    -  var minShare = initMinShare
    +  val weight = initWeight
    --- End diff --

`Pool` extends `Schedulable` trait and needs to override `weight`. It is also used for `taskToWeightRatio` calculation at `FairSchedulingAlgorithm` level.
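To illustrate the point about `weight`, here is a minimal sketch of a `val` implementing an abstract member of a trait and then feeding a ratio calculation. The names `SketchSchedulable`, `SketchPool` and `SketchFairRatio` are hypothetical, simplified stand-ins, not Spark's actual classes.

```scala
// Hypothetical, simplified stand-ins for illustration; not Spark's actual classes.
trait SketchSchedulable {
  def weight: Int
  def minShare: Int
  def runningTasks: Int
}

class SketchPool(initWeight: Int, initMinShare: Int) extends SketchSchedulable {
  // A `val` may implement an abstract `def` declared in the trait,
  // so the member cannot be dropped even though it only copies a constructor argument.
  val weight: Int = initWeight
  val minShare: Int = initMinShare
  // Mutable because the number of running tasks changes over time.
  var runningTasks: Int = 0
}

object SketchFairRatio {
  // Ratio of running tasks to weight; a fair comparison would favor the lower ratio.
  def taskToWeightRatio(s: SketchSchedulable): Double =
    s.runningTasks.toDouble / s.weight.toDouble
}
```

Because the ratio is computed through the trait, any implementation has to expose `weight` in some form, whether as a `val` or a `var`.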
[GitHub] spark pull request #16905: [SPARK-19567][CORE][SCHEDULER] Support some Sched...
Github user erenavsarogullari commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16905#discussion_r101173634

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
    @@ -130,15 +130,17 @@ private[spark] class TaskSchedulerImpl private[scheduler](
       val mapOutputTracker = SparkEnv.get.mapOutputTracker
    -  var schedulableBuilder: SchedulableBuilder = null
    +  private val SCHEDULER_MODE_PROPERTY = "spark.scheduler.mode"
    +  private var schedulableBuilder: SchedulableBuilder = null
       var rootPool: Pool = null
       // default scheduler is FIFO
    -  private val schedulingModeConf = conf.get("spark.scheduler.mode", "FIFO")
    +  private val schedulingModeConf = conf.get(SCHEDULER_MODE_PROPERTY, SchedulingMode.FIFO.toString)
       val schedulingMode: SchedulingMode = try {
         SchedulingMode.withName(schedulingModeConf.toUpperCase)
       } catch {
         case e: java.util.NoSuchElementException =>
    -      throw new SparkException(s"Unrecognized spark.scheduler.mode: $schedulingModeConf")
    +      throw new SparkException(s"Unrecognized $SCHEDULER_MODE_PROPERTY: $schedulingModeConf. " +
    +        s"Supported modes: ${SchedulingMode.FAIR} or ${SchedulingMode.FIFO}.")
    --- End diff --

Yep, `TaskSetManager` also uses `NONE` to override `schedulingMode` (from the parent `Schedulable` trait). However, it does not actually use `schedulingMode`. I think if `NONE` is removed, then `FIFO` will be used as the default value (e.g. TaskManager), right?
[GitHub] spark pull request #16905: [SPARK-19567][CORE][SCHEDULER] Support some Sched...
Github user erenavsarogullari commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16905#discussion_r101164858

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
    @@ -130,15 +130,17 @@ private[spark] class TaskSchedulerImpl private[scheduler](
       val mapOutputTracker = SparkEnv.get.mapOutputTracker
    -  var schedulableBuilder: SchedulableBuilder = null
    +  private val SCHEDULER_MODE_PROPERTY = "spark.scheduler.mode"
    +  private var schedulableBuilder: SchedulableBuilder = null
       var rootPool: Pool = null
       // default scheduler is FIFO
    -  private val schedulingModeConf = conf.get("spark.scheduler.mode", "FIFO")
    +  private val schedulingModeConf = conf.get(SCHEDULER_MODE_PROPERTY, SchedulingMode.FIFO.toString)
       val schedulingMode: SchedulingMode = try {
         SchedulingMode.withName(schedulingModeConf.toUpperCase)
       } catch {
         case e: java.util.NoSuchElementException =>
    -      throw new SparkException(s"Unrecognized spark.scheduler.mode: $schedulingModeConf")
    +      throw new SparkException(s"Unrecognized $SCHEDULER_MODE_PROPERTY: $schedulingModeConf. " +
    +        s"Supported modes: ${SchedulingMode.FAIR} or ${SchedulingMode.FIFO}.")
    --- End diff --

The possible values of `SchedulingMode` are `FIFO`, `FAIR` and `NONE`, but `NONE` is an _unsupported_ value. I agree with deriving the message from the possible values. It can be achieved by adding the following logic to the `SchedulingMode` object and using it in the required places (twice here and once in `Pool`):

    def getSupportedValuesAsString(): String = values.filter(_ != NONE).mkString(", ")

    SchedulingMode.getSupportedValuesAsString() // returns FIFO, FAIR

WDYT?
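The proposed helper can be tried in isolation with a small stand-in enumeration. This is only a sketch: `SketchSchedulingMode` is a hypothetical name, and the declaration order `FAIR, FIFO, NONE` is an assumption made for the example.

```scala
// Hypothetical stand-in for the enumeration discussed above; not Spark's source.
object SketchSchedulingMode extends Enumeration {
  type SketchSchedulingMode = Value
  // Assumed declaration order; `values` iterates in this order (by id).
  val FAIR, FIFO, NONE = Value

  // Lists every mode except NONE, so messages never need to be updated by hand.
  def getSupportedValuesAsString(): String =
    values.filter(_ != NONE).mkString(", ")
}
```

Note that `Enumeration.values` is ordered by id, so the resulting string follows the declaration order of the values rather than any order written in a message.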
[GitHub] spark pull request #16905: [SPARK-19567][CORE][SCHEDULER] Support some Sched...
Github user markhamstra commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16905#discussion_r100836419

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
    @@ -130,15 +130,17 @@ private[spark] class TaskSchedulerImpl private[scheduler](
       val mapOutputTracker = SparkEnv.get.mapOutputTracker
    -  var schedulableBuilder: SchedulableBuilder = null
    +  private val SCHEDULER_MODE_PROPERTY = "spark.scheduler.mode"
    +  private var schedulableBuilder: SchedulableBuilder = null
       var rootPool: Pool = null
       // default scheduler is FIFO
    -  private val schedulingModeConf = conf.get("spark.scheduler.mode", "FIFO")
    +  private val schedulingModeConf = conf.get(SCHEDULER_MODE_PROPERTY, SchedulingMode.FIFO.toString)
       val schedulingMode: SchedulingMode = try {
         SchedulingMode.withName(schedulingModeConf.toUpperCase)
       } catch {
         case e: java.util.NoSuchElementException =>
    -      throw new SparkException(s"Unrecognized spark.scheduler.mode: $schedulingModeConf")
    +      throw new SparkException(s"Unrecognized $SCHEDULER_MODE_PROPERTY: $schedulingModeConf. " +
    +        s"Supported modes: ${SchedulingMode.FAIR} or ${SchedulingMode.FIFO}.")
    --- End diff --

We're not likely to add or remove SchedulingModes with any frequency, if at all, so this isn't likely to cause much opportunity for error -- but I agree with the principle that extracting from `.values` is a better approach.
[GitHub] spark pull request #16905: [SPARK-19567][CORE][SCHEDULER] Support some Sched...
Github user markhamstra commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16905#discussion_r100834970

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/Pool.scala ---
    @@ -37,25 +37,22 @@ private[spark] class Pool(
       val schedulableQueue = new ConcurrentLinkedQueue[Schedulable]
       val schedulableNameToSchedulable = new ConcurrentHashMap[String, Schedulable]
    -  var weight = initWeight
    -  var minShare = initMinShare
    +  val weight = initWeight
    +  val minShare = initMinShare
       var runningTasks = 0
    -  var priority = 0
    +  val priority = 0
       // A pool's stage id is used to break the tie in scheduling.
       var stageId = -1
    -  var name = poolName
    +  val name = poolName
       var parent: Pool = null
    -  var taskSetSchedulingAlgorithm: SchedulingAlgorithm = {
    +  private val taskSetSchedulingAlgorithm: SchedulingAlgorithm = {
         schedulingMode match {
    -      case SchedulingMode.FAIR =>
    -        new FairSchedulingAlgorithm()
    -      case SchedulingMode.FIFO =>
    -        new FIFOSchedulingAlgorithm()
    -      case _ =>
    -        val msg = "Unsupported scheduling mode: $schedulingMode. Use FAIR or FIFO instead."
    -        throw new IllegalArgumentException(msg)
    +      case SchedulingMode.FAIR => new FairSchedulingAlgorithm()
    +      case SchedulingMode.FIFO => new FIFOSchedulingAlgorithm()
    +      case _ => throw new IllegalArgumentException("Unsupported scheduling mode: " +
    +        s"$schedulingMode. Supported modes: ${SchedulingMode.FAIR} or ${SchedulingMode.FIFO}.")
    --- End diff --

I'd really rather not see this kind of change. Other than the missing string-interpolation `s` in `msg`, the prior code was at least as good as (and arguably better than) the new style, and making such an inconsequential style change just adds complication to future investigations of the git history.
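The effect of the missing `s` prefix mentioned above can be seen in a few lines. This is a standalone sketch: `InterpolationSketch` and the sample value `"NONE"` are made up for the illustration.

```scala
// Standalone illustration; the object name and sample value are hypothetical.
object InterpolationSketch {
  val schedulingMode = "NONE"

  // No `s` prefix: this is an ordinary string literal, so `$schedulingMode`
  // is kept verbatim instead of being replaced by the variable's value.
  val plain = "Unsupported scheduling mode: $schedulingMode. Use FAIR or FIFO instead."

  // With the `s` prefix the variable is substituted into the message.
  val interpolated = s"Unsupported scheduling mode: $schedulingMode. Use FAIR or FIFO instead."
}
```

The un-prefixed form still compiles, which is why the slip is easy to miss in review.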
[GitHub] spark pull request #16905: [SPARK-19567][CORE][SCHEDULER] Support some Sched...
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16905#discussion_r100771635

    --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSchedulerImplSuite.scala ---
    @@ -41,6 +41,8 @@ class FakeSchedulerBackend extends SchedulerBackend {
     class TaskSchedulerImplSuite extends SparkFunSuite with LocalSparkContext with BeforeAndAfterEach
         with Logging with MockitoSugar {
    +  val SCHEDULER_MODE_PROPERTY = "spark.scheduler.mode"
    --- End diff --

This is duplicated
[GitHub] spark pull request #16905: [SPARK-19567][CORE][SCHEDULER] Support some Sched...
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16905#discussion_r100771562

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/Pool.scala ---
    @@ -37,25 +37,22 @@ private[spark] class Pool(
       val schedulableQueue = new ConcurrentLinkedQueue[Schedulable]
       val schedulableNameToSchedulable = new ConcurrentHashMap[String, Schedulable]
    -  var weight = initWeight
    -  var minShare = initMinShare
    +  val weight = initWeight
    --- End diff --

These appear to be pretty much redundant then? if they're just set once to another existing variable's value?
[GitHub] spark pull request #16905: [SPARK-19567][CORE][SCHEDULER] Support some Sched...
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16905#discussion_r100771342

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
    @@ -130,15 +130,17 @@ private[spark] class TaskSchedulerImpl private[scheduler](
       val mapOutputTracker = SparkEnv.get.mapOutputTracker
    -  var schedulableBuilder: SchedulableBuilder = null
    +  private val SCHEDULER_MODE_PROPERTY = "spark.scheduler.mode"
    +  private var schedulableBuilder: SchedulableBuilder = null
       var rootPool: Pool = null
       // default scheduler is FIFO
    -  private val schedulingModeConf = conf.get("spark.scheduler.mode", "FIFO")
    +  private val schedulingModeConf = conf.get(SCHEDULER_MODE_PROPERTY, SchedulingMode.FIFO.toString)
       val schedulingMode: SchedulingMode = try {
         SchedulingMode.withName(schedulingModeConf.toUpperCase)
       } catch {
         case e: java.util.NoSuchElementException =>
    -      throw new SparkException(s"Unrecognized spark.scheduler.mode: $schedulingModeConf")
    +      throw new SparkException(s"Unrecognized $SCHEDULER_MODE_PROPERTY: $schedulingModeConf. " +
    +        s"Supported modes: ${SchedulingMode.FAIR} or ${SchedulingMode.FIFO}.")
    --- End diff --

Maintaining these messages with the various options is error-prone. Can you just use its `.values` and print that?
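One way to read this suggestion is sketched below: parse the configured string with `withName` and build the error message from `.values` rather than from hard-coded names. `ModeSketch` and `ModeParserSketch` are hypothetical stand-ins, and `IllegalArgumentException` is used only to keep the sketch free of Spark dependencies.

```scala
// Hypothetical stand-in enumeration; the declaration order is an assumption.
object ModeSketch extends Enumeration {
  val FAIR, FIFO, NONE = Value
}

object ModeParserSketch {
  // Parses a configured mode, reporting all known values on failure.
  def parse(raw: String): ModeSketch.Value =
    try {
      // `withName` throws NoSuchElementException for unknown names.
      ModeSketch.withName(raw.toUpperCase)
    } catch {
      case _: NoSuchElementException =>
        // The message is derived from `.values`, so it stays in sync with the enumeration.
        throw new IllegalArgumentException(
          s"Unrecognized spark.scheduler.mode: $raw. " +
            s"Supported modes: ${ModeSketch.values.mkString(", ")}.")
    }
}
```

Printing `.values` unfiltered also lists `NONE`, so a real change would have to decide whether that value belongs in a user-facing message.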
[GitHub] spark pull request #16905: [SPARK-19567][CORE][SCHEDULER] Support some Sched...
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16905#discussion_r100770989

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
    @@ -130,15 +130,17 @@ private[spark] class TaskSchedulerImpl private[scheduler](
       val mapOutputTracker = SparkEnv.get.mapOutputTracker
    -  var schedulableBuilder: SchedulableBuilder = null
    +  private val SCHEDULER_MODE_PROPERTY = "spark.scheduler.mode"
    --- End diff --

This should be in a companion object, but really, maybe not worth it.
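For reference, the companion-object placement mentioned here could look roughly like the following. `SchedulerSketch` is a hypothetical class, and a plain `Map` stands in for the configuration object to keep the example self-contained.

```scala
// Hypothetical class for illustration; a Map stands in for the real configuration type.
class SchedulerSketch(conf: Map[String, String]) {
  import SchedulerSketch._

  // Falls back to FIFO when the property is not set.
  val schedulingModeConf: String = conf.getOrElse(SCHEDULER_MODE_PROPERTY, "FIFO")
}

object SchedulerSketch {
  // Defined once per class rather than once per instance.
  val SCHEDULER_MODE_PROPERTY = "spark.scheduler.mode"
}
```

A constant in the companion object is allocated once and, if its visibility allows, can be referenced from other code instead of repeating the string literal.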