[GitHub] spark pull request #11746: [SPARK-13602][CORE] Add shutdown hook to DriverRu...
Github user zsxwing commented on a diff in the pull request:
https://github.com/apache/spark/pull/11746#discussion_r118757391

--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
@@ -53,9 +53,11 @@ private[deploy] class DriverRunner(

   @volatile private var killed = false

   // Populated once finished
-  private[worker] var finalState: Option[DriverState] = None
-  private[worker] var finalException: Option[Exception] = None
-  private var finalExitCode: Option[Int] = None
+  @volatile private[worker] var finalState: Option[DriverState] = None
+  @volatile private[worker] var finalException: Option[Exception] = None
+
+  // Timeout to wait for when trying to terminate a driver.
+  private val DRIVER_TERMINATE_TIMEOUT_MS = 10 * 1000
--- End diff --

> Can't this just be a property added to SparkConf?

It will be a whole-cluster conf.

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
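[Editor's note] The exchange above debates moving the hard-coded 10-second timeout into configuration. A minimal, self-contained sketch of that idea follows, using a plain `Map` in place of `SparkConf` and a hypothetical key name `spark.worker.driverTerminateTimeout` (both are assumptions for illustration, not the API this PR settled on):

```scala
// Hypothetical sketch: look up the terminate timeout in a conf map, falling
// back to the 10s default that the diff hard-codes. The key name is an
// assumption for illustration only.
object DriverTimeoutConf {
  private val DefaultTimeoutMs = 10 * 1000L

  def terminateTimeoutMs(conf: Map[String, String]): Long =
    conf.get("spark.worker.driverTerminateTimeout").map(_.toLong)
      .getOrElse(DefaultTimeoutMs)

  def main(args: Array[String]): Unit = {
    // Unset: falls back to the 10s default.
    assert(terminateTimeoutMs(Map.empty) == 10000L)
    // A cluster-wide override wins.
    assert(terminateTimeoutMs(
      Map("spark.worker.driverTerminateTimeout" -> "5000")) == 5000L)
    println("ok")
  }
}
```

As zsxwing notes, a worker-side property like this would apply to every driver on the cluster rather than per application.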
[GitHub] spark pull request #11746: [SPARK-13602][CORE] Add shutdown hook to DriverRu...
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/11746#discussion_r118750666

--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
@@ -53,9 +53,11 @@ private[deploy] class DriverRunner(

   @volatile private var killed = false

   // Populated once finished
-  private[worker] var finalState: Option[DriverState] = None
-  private[worker] var finalException: Option[Exception] = None
-  private var finalExitCode: Option[Int] = None
+  @volatile private[worker] var finalState: Option[DriverState] = None
+  @volatile private[worker] var finalException: Option[Exception] = None
+
+  // Timeout to wait for when trying to terminate a driver.
+  private val DRIVER_TERMINATE_TIMEOUT_MS = 10 * 1000
--- End diff --

Can't this just be a property added to `SparkConf`? Btw, this timeout is also hard-coded in `ExecutorRunner`.
[GitHub] spark pull request #11746: [SPARK-13602][CORE] Add shutdown hook to DriverRu...
Github user zsxwing commented on a diff in the pull request:
https://github.com/apache/spark/pull/11746#discussion_r118544204

--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
@@ -53,9 +53,11 @@ private[deploy] class DriverRunner(

   @volatile private var killed = false

   // Populated once finished
-  private[worker] var finalState: Option[DriverState] = None
-  private[worker] var finalException: Option[Exception] = None
-  private var finalExitCode: Option[Int] = None
+  @volatile private[worker] var finalState: Option[DriverState] = None
+  @volatile private[worker] var finalException: Option[Exception] = None
+
+  // Timeout to wait for when trying to terminate a driver.
+  private val DRIVER_TERMINATE_TIMEOUT_MS = 10 * 1000
--- End diff --

@cloud-fan Makes sense. However, it requires designing an approach to set configurations for launching the driver JVM.
[GitHub] spark pull request #11746: [SPARK-13602][CORE] Add shutdown hook to DriverRu...
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/11746#discussion_r118542465

--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
@@ -53,9 +53,11 @@ private[deploy] class DriverRunner(

   @volatile private var killed = false

   // Populated once finished
-  private[worker] var finalState: Option[DriverState] = None
-  private[worker] var finalException: Option[Exception] = None
-  private var finalExitCode: Option[Int] = None
+  @volatile private[worker] var finalState: Option[DriverState] = None
+  @volatile private[worker] var finalException: Option[Exception] = None
+
+  // Timeout to wait for when trying to terminate a driver.
+  private val DRIVER_TERMINATE_TIMEOUT_MS = 10 * 1000
--- End diff --

I think we should make this configurable for each application.
[GitHub] spark pull request #11746: [SPARK-13602][CORE] Add shutdown hook to DriverRu...
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/11746
[GitHub] spark pull request #11746: [SPARK-13602][CORE] Add shutdown hook to DriverRu...
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/11746#discussion_r73771605

--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
@@ -78,49 +80,53 @@ private[deploy] class DriverRunner(
   private[worker] def start() = {
     new Thread("DriverRunner for " + driverId) {
       override def run() {
+        var shutdownHook: AnyRef = null
         try {
-          val driverDir = createWorkingDirectory()
-          val localJarFilename = downloadUserJar(driverDir)
-
-          def substituteVariables(argument: String): String = argument match {
-            case "{{WORKER_URL}}" => workerUrl
-            case "{{USER_JAR}}" => localJarFilename
-            case other => other
+          shutdownHook = ShutdownHookManager.addShutdownHook { () =>
+            logInfo(s"Worker shutting down, killing driver $driverId")
+            kill()
           }

-          // TODO: If we add ability to submit multiple jars they should also be added here
-          val builder = CommandUtils.buildProcessBuilder(driverDesc.command, securityManager,
-            driverDesc.mem, sparkHome.getAbsolutePath, substituteVariables)
-          launchDriver(builder, driverDir, driverDesc.supervise)
-        }
-        catch {
-          case e: Exception => finalException = Some(e)
-        }
+          // prepare driver jars and launch driver
+          val exitCode = prepareAndLaunchDriver()
--- End diff --

yup, sounds good!
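[Editor's note] The diff under discussion registers a shutdown hook so the worker kills its driver process on exit, and removes the hook once the driver finishes normally. A minimal sketch of that pattern, using plain `java.lang` APIs in place of Spark's `ShutdownHookManager` (an assumption for illustration):

```scala
// Sketch: run a child process, kill it from a JVM shutdown hook if the JVM
// exits first, and deregister the hook once the child exits on its own.
object ShutdownHookSketch {
  def runWithKillOnShutdown(builder: ProcessBuilder): Int = {
    val process = builder.start()
    // Mirrors: ShutdownHookManager.addShutdownHook { () => kill() }
    val hook = new Thread(() => process.destroy())
    Runtime.getRuntime.addShutdownHook(hook)
    try {
      process.waitFor() // block until the child exits; its exit code
    } finally {
      // Mirrors the `finally { ShutdownHookManager.removeShutdownHook(...) }`
      // branch, so a finished driver leaves no stale hook behind.
      Runtime.getRuntime.removeShutdownHook(hook)
    }
  }

  def main(args: Array[String]): Unit = {
    // Uses the Unix `true` binary as a stand-in driver command.
    val exit = runWithKillOnShutdown(new ProcessBuilder("true"))
    assert(exit == 0)
    println("ok")
  }
}
```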
[GitHub] spark pull request #11746: [SPARK-13602][CORE] Add shutdown hook to DriverRu...
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/11746#discussion_r73771392

--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
@@ -53,9 +53,11 @@ private[deploy] class DriverRunner(

   @volatile private var killed = false

   // Populated once finished
-  private[worker] var finalState: Option[DriverState] = None
-  private[worker] var finalException: Option[Exception] = None
-  private var finalExitCode: Option[Int] = None
+  @volatile private[worker] var finalState: Option[DriverState] = None
--- End diff --

It's actually called by `WorkerPage.scala` which uses the `Option` to show that the driver is "running".
[GitHub] spark pull request #11746: [SPARK-13602][CORE] Add shutdown hook to DriverRu...
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/11746#discussion_r73770621

--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
@@ -168,7 +173,24 @@ private[deploy] class DriverRunner(
     localJarFilename
   }

-  private def launchDriver(builder: ProcessBuilder, baseDir: File, supervise: Boolean) {
+  private[worker] def prepareAndLaunchDriver(): Int = {
+    val driverDir = createWorkingDirectory()
+    val localJarFilename = downloadUserJar(driverDir)
+
+    def substituteVariables(argument: String): String = argument match {
+      case "{{WORKER_URL}}" => workerUrl
+      case "{{USER_JAR}}" => localJarFilename
+      case other => other
+    }
+
+    // TODO: If we add ability to submit multiple jars they should also be added here
+    val builder = CommandUtils.buildProcessBuilder(driverDesc.command, securityManager,
+      driverDesc.mem, sparkHome.getAbsolutePath, substituteVariables)
+
+    launchDriver(builder, driverDir, driverDesc.supervise)
+  }
+
+  private def launchDriver(builder: ProcessBuilder, baseDir: File, supervise: Boolean): Int = {
--- End diff --

Similar. `runDriver` instead of `launchDriver`?
[GitHub] spark pull request #11746: [SPARK-13602][CORE] Add shutdown hook to DriverRu...
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/11746#discussion_r73770529

--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
@@ -78,49 +80,53 @@ private[deploy] class DriverRunner(
   private[worker] def start() = {
     new Thread("DriverRunner for " + driverId) {
       override def run() {
+        var shutdownHook: AnyRef = null
         try {
-          val driverDir = createWorkingDirectory()
-          val localJarFilename = downloadUserJar(driverDir)
-
-          def substituteVariables(argument: String): String = argument match {
-            case "{{WORKER_URL}}" => workerUrl
-            case "{{USER_JAR}}" => localJarFilename
-            case other => other
+          shutdownHook = ShutdownHookManager.addShutdownHook { () =>
+            logInfo(s"Worker shutting down, killing driver $driverId")
+            kill()
          }

-          // TODO: If we add ability to submit multiple jars they should also be added here
-          val builder = CommandUtils.buildProcessBuilder(driverDesc.command, securityManager,
-            driverDesc.mem, sparkHome.getAbsolutePath, substituteVariables)
-          launchDriver(builder, driverDir, driverDesc.supervise)
-        }
-        catch {
-          case e: Exception => finalException = Some(e)
-        }
+          // prepare driver jars and launch driver
+          val exitCode = prepareAndLaunchDriver()
--- End diff --

nit: it's a little weird for a method that returns an exit code to be called "Launch". Maybe `prepareAndRunDriver`?
[GitHub] spark pull request #11746: [SPARK-13602][CORE] Add shutdown hook to DriverRu...
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/11746#discussion_r73769695

--- Diff: core/src/test/scala/org/apache/spark/deploy/worker/DriverRunnerTest.scala ---
@@ -45,6 +52,18 @@ class DriverRunnerTest extends SparkFunSuite {
     (processBuilder, process)
   }

+  private def createTestableDriverRunner(processBuilder: ProcessBuilderLike,
--- End diff --

nit: wrong alignment for multi-line params. See https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide#SparkCodeStyleGuide-Indentation
[GitHub] spark pull request #11746: [SPARK-13602][CORE] Add shutdown hook to DriverRu...
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/11746#discussion_r73769000

--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
@@ -53,9 +53,11 @@ private[deploy] class DriverRunner(

   @volatile private var killed = false

   // Populated once finished
-  private[worker] var finalState: Option[DriverState] = None
-  private[worker] var finalException: Option[Exception] = None
-  private var finalExitCode: Option[Int] = None
+  @volatile private[worker] var finalState: Option[DriverState] = None
--- End diff --

Does this need to be an option? You call `.get` on it unconditionally.
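[Editor's note] The `@volatile` annotations being reviewed matter because `finalState` and `finalException` are written by the `DriverRunner` thread but read from other threads (the worker's RPC handling, and `WorkerPage` per the reply above). A minimal sketch of the same cross-thread pattern, with `String` standing in for `DriverState`:

```scala
// Sketch: a field written by a runner thread and read elsewhere; @volatile
// guarantees the reading thread sees the latest write without locking.
object VolatileStateSketch {
  // None means "still running", as WorkerPage interprets it in the thread above.
  @volatile private var finalState: Option[String] = None

  def main(args: Array[String]): Unit = {
    assert(finalState.isEmpty) // driver not finished yet
    val runner = new Thread(() => { finalState = Some("FINISHED") })
    runner.start()
    runner.join() // wait for the runner; @volatile also covers reads made before join
    assert(finalState.contains("FINISHED"))
    println("ok")
  }
}
```

Keeping the `Option` (rather than a plain field read with `.get`) is what lets a concurrent reader distinguish "not finished yet" from a final state, which is the trade-off vanzin's question probes.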
[GitHub] spark pull request #11746: [SPARK-13602][CORE] Add shutdown hook to DriverRu...
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/11746#discussion_r73768915

--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
@@ -78,49 +80,53 @@ private[deploy] class DriverRunner(
   private[worker] def start() = {
     new Thread("DriverRunner for " + driverId) {
       override def run() {
+        var shutdownHook: AnyRef = null
         try {
-          val driverDir = createWorkingDirectory()
-          val localJarFilename = downloadUserJar(driverDir)
-
-          def substituteVariables(argument: String): String = argument match {
-            case "{{WORKER_URL}}" => workerUrl
-            case "{{USER_JAR}}" => localJarFilename
-            case other => other
+          shutdownHook = ShutdownHookManager.addShutdownHook { () =>
+            logInfo(s"Worker shutting down, killing driver $driverId")
+            kill()
           }

-          // TODO: If we add ability to submit multiple jars they should also be added here
-          val builder = CommandUtils.buildProcessBuilder(driverDesc.command, securityManager,
-            driverDesc.mem, sparkHome.getAbsolutePath, substituteVariables)
-          launchDriver(builder, driverDir, driverDesc.supervise)
-        }
-        catch {
-          case e: Exception => finalException = Some(e)
-        }
+          // prepare driver jars and launch driver
+          val exitCode = prepareAndLaunchDriver()

-        val state =
-          if (killed) {
-            DriverState.KILLED
-          } else if (finalException.isDefined) {
-            DriverState.ERROR
+          // set final state depending on if forcibly killed and process exit code
+          finalState = if (exitCode == 0) {
+            Some(DriverState.FINISHED)
+          } else if (killed) {
+            Some(DriverState.KILLED)
           } else {
-            finalExitCode match {
-              case Some(0) => DriverState.FINISHED
-              case _ => DriverState.FAILED
-            }
+            Some(DriverState.FAILED)
           }
+        }
+        catch {
+          case e: Exception =>
+            kill()
+            finalState = Some(DriverState.ERROR)
+            finalException = Some(e)
+        }
+        finally {
+          if (shutdownHook != null) ShutdownHookManager.removeShutdownHook(shutdownHook)
+        }

-        finalState = Some(state)
-
-        worker.send(DriverStateChanged(driverId, state, finalException))
+        // notify worker of final driver state, possible exception
+        worker.send(DriverStateChanged(driverId, finalState.get, finalException))
       }
     }.start()
   }

   /** Terminate this driver (or prevent it from ever starting if not yet started) */
-  private[worker] def kill() {
+  private[worker] def kill(): Unit = {
+    logInfo("Killing driver process!")
+    killed = true
     synchronized {
-      process.foreach(_.destroy())
-      killed = true
+      process.foreach(p => {
--- End diff --

nit: `.foreach { p =>`
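[Editor's note] The `kill()` body under review destroys the driver process and (per the `DRIVER_TERMINATE_TIMEOUT_MS` constant discussed earlier in the thread) waits a bounded time for it to die. A sketch of that termination pattern with plain `java.lang.Process` (Java 8+ `waitFor(timeout, unit)` and `destroyForcibly()`); the escalation to a forced kill is an assumption for illustration, not necessarily what the PR does:

```scala
import java.util.concurrent.TimeUnit

// Sketch: ask a process to exit, wait up to timeoutMs, then force-kill it.
object TerminateWithTimeout {
  def terminate(process: Process, timeoutMs: Long): Int = {
    process.destroy() // polite request (SIGTERM on Unix)
    if (!process.waitFor(timeoutMs, TimeUnit.MILLISECONDS)) {
      process.destroyForcibly() // escalate (SIGKILL on Unix)
      process.waitFor()
    }
    process.exitValue()
  }

  def main(args: Array[String]): Unit = {
    // A long-running stand-in for the driver process.
    val p = new ProcessBuilder("sleep", "60").start()
    terminate(p, 10 * 1000) // same 10s budget as DRIVER_TERMINATE_TIMEOUT_MS
    assert(!p.isAlive)
    println("ok")
  }
}
```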
[GitHub] spark pull request #11746: [SPARK-13602][CORE] Add shutdown hook to DriverRu...
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/11746#discussion_r73768879

--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
@@ -78,49 +80,53 @@ private[deploy] class DriverRunner(
   private[worker] def start() = {
     new Thread("DriverRunner for " + driverId) {
       override def run() {
+        var shutdownHook: AnyRef = null
         try {
-          val driverDir = createWorkingDirectory()
-          val localJarFilename = downloadUserJar(driverDir)
-
-          def substituteVariables(argument: String): String = argument match {
-            case "{{WORKER_URL}}" => workerUrl
-            case "{{USER_JAR}}" => localJarFilename
-            case other => other
+          shutdownHook = ShutdownHookManager.addShutdownHook { () =>
+            logInfo(s"Worker shutting down, killing driver $driverId")
+            kill()
           }

-          // TODO: If we add ability to submit multiple jars they should also be added here
-          val builder = CommandUtils.buildProcessBuilder(driverDesc.command, securityManager,
-            driverDesc.mem, sparkHome.getAbsolutePath, substituteVariables)
-          launchDriver(builder, driverDir, driverDesc.supervise)
-        }
-        catch {
-          case e: Exception => finalException = Some(e)
-        }
+          // prepare driver jars and launch driver
+          val exitCode = prepareAndLaunchDriver()

-        val state =
-          if (killed) {
-            DriverState.KILLED
-          } else if (finalException.isDefined) {
-            DriverState.ERROR
+          // set final state depending on if forcibly killed and process exit code
+          finalState = if (exitCode == 0) {
+            Some(DriverState.FINISHED)
+          } else if (killed) {
+            Some(DriverState.KILLED)
           } else {
-            finalExitCode match {
-              case Some(0) => DriverState.FINISHED
-              case _ => DriverState.FAILED
-            }
+            Some(DriverState.FAILED)
           }
+        }
+        catch {
+          case e: Exception =>
+            kill()
+            finalState = Some(DriverState.ERROR)
+            finalException = Some(e)
+        }
+        finally {
+          if (shutdownHook != null) ShutdownHookManager.removeShutdownHook(shutdownHook)
--- End diff --

nit: use long form
```
if (foo) {
  do something
}
```
[GitHub] spark pull request #11746: [SPARK-13602][CORE] Add shutdown hook to DriverRu...
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/11746#discussion_r73768813

--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
@@ -78,49 +80,53 @@ private[deploy] class DriverRunner(
   private[worker] def start() = {
     new Thread("DriverRunner for " + driverId) {
       override def run() {
+        var shutdownHook: AnyRef = null
         try {
-          val driverDir = createWorkingDirectory()
-          val localJarFilename = downloadUserJar(driverDir)
-
-          def substituteVariables(argument: String): String = argument match {
-            case "{{WORKER_URL}}" => workerUrl
-            case "{{USER_JAR}}" => localJarFilename
-            case other => other
+          shutdownHook = ShutdownHookManager.addShutdownHook { () =>
+            logInfo(s"Worker shutting down, killing driver $driverId")
+            kill()
           }

-          // TODO: If we add ability to submit multiple jars they should also be added here
-          val builder = CommandUtils.buildProcessBuilder(driverDesc.command, securityManager,
-            driverDesc.mem, sparkHome.getAbsolutePath, substituteVariables)
-          launchDriver(builder, driverDir, driverDesc.supervise)
-        }
-        catch {
-          case e: Exception => finalException = Some(e)
-        }
+          // prepare driver jars and launch driver
+          val exitCode = prepareAndLaunchDriver()

-        val state =
-          if (killed) {
-            DriverState.KILLED
-          } else if (finalException.isDefined) {
-            DriverState.ERROR
+          // set final state depending on if forcibly killed and process exit code
+          finalState = if (exitCode == 0) {
+            Some(DriverState.FINISHED)
+          } else if (killed) {
+            Some(DriverState.KILLED)
           } else {
-            finalExitCode match {
-              case Some(0) => DriverState.FINISHED
-              case _ => DriverState.FAILED
-            }
+            Some(DriverState.FAILED)
           }
+        }
+        catch {
--- End diff --

nit: move to previous line
[GitHub] spark pull request #11746: [SPARK-13602][CORE] Add shutdown hook to DriverRu...
Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/11746#discussion_r73768836

--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala ---
@@ -78,49 +80,53 @@ private[deploy] class DriverRunner(
   private[worker] def start() = {
     new Thread("DriverRunner for " + driverId) {
       override def run() {
+        var shutdownHook: AnyRef = null
         try {
-          val driverDir = createWorkingDirectory()
-          val localJarFilename = downloadUserJar(driverDir)
-
-          def substituteVariables(argument: String): String = argument match {
-            case "{{WORKER_URL}}" => workerUrl
-            case "{{USER_JAR}}" => localJarFilename
-            case other => other
+          shutdownHook = ShutdownHookManager.addShutdownHook { () =>
+            logInfo(s"Worker shutting down, killing driver $driverId")
+            kill()
           }

-          // TODO: If we add ability to submit multiple jars they should also be added here
-          val builder = CommandUtils.buildProcessBuilder(driverDesc.command, securityManager,
-            driverDesc.mem, sparkHome.getAbsolutePath, substituteVariables)
-          launchDriver(builder, driverDir, driverDesc.supervise)
-        }
-        catch {
-          case e: Exception => finalException = Some(e)
-        }
+          // prepare driver jars and launch driver
+          val exitCode = prepareAndLaunchDriver()

-        val state =
-          if (killed) {
-            DriverState.KILLED
-          } else if (finalException.isDefined) {
-            DriverState.ERROR
+          // set final state depending on if forcibly killed and process exit code
+          finalState = if (exitCode == 0) {
+            Some(DriverState.FINISHED)
+          } else if (killed) {
+            Some(DriverState.KILLED)
           } else {
-            finalExitCode match {
-              case Some(0) => DriverState.FINISHED
-              case _ => DriverState.FAILED
-            }
+            Some(DriverState.FAILED)
           }
+        }
+        catch {
+          case e: Exception =>
+            kill()
+            finalState = Some(DriverState.ERROR)
+            finalException = Some(e)
+        }
+        finally {
--- End diff --

nit: move to previous line