[GitHub] spark pull request #13077: [SPARK-10748] [Mesos] Log error instead of crashi...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13077

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user devaraj-kavali commented on a diff in the pull request: https://github.com/apache/spark/pull/13077#discussion_r94205390

--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---

```diff
@@ -559,15 +560,29 @@ private[spark] class MesosClusterScheduler(
       } else {
         val offer = offerOption.get
         val queuedTasks = tasks.getOrElseUpdate(offer.offerId, new ArrayBuffer[TaskInfo])
-        val task = createTaskInfo(submission, offer)
-        queuedTasks += task
-        logTrace(s"Using offer ${offer.offerId.getValue} to launch driver " +
-          submission.submissionId)
-        val newState = new MesosClusterSubmissionState(submission, task.getTaskId, offer.slaveId,
-          None, new Date(), None, getDriverFrameworkID(submission))
-        launchedDrivers(submission.submissionId) = newState
-        launchedDriversState.persist(submission.submissionId, newState)
-        afterLaunchCallback(submission.submissionId)
+        breakable {
```

--- End diff ---

Here it needs to continue in the for loop from the catch block with the next set of drivers. It cannot return from the exception handler, since it still needs to launch the other candidates. I can consider the other suggestion, i.e. moving the following code into the try clause. I will update the PR by moving the code into the try block. Please let me know if it doesn't make sense.
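The approach devaraj-kavali describes (keeping each candidate's launch steps inside the try block, so a failure skips only that driver while the for loop moves on) can be sketched in isolation. Everything below is illustrative: `Submission`, `buildCommand`, and `launchAll` are stand-ins, not Spark's actual types or methods.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a queued driver submission.
case class Submission(id: String, cmd: String)

// Stand-in for buildDriverCommand: fails for a malformed submission.
def buildCommand(s: Submission): String =
  if (s.cmd.isEmpty) throw new IllegalArgumentException(s"no command for ${s.id}")
  else s.cmd

def launchAll(subs: Seq[Submission]): Seq[String] = {
  val launched = ArrayBuffer[String]()
  for (s <- subs) {
    try {
      val cmd = buildCommand(s)  // may throw
      // post-build bookkeeping also lives inside the try,
      // so a failure above skips it for this driver only
      launched += s.id
    } catch {
      case e: Exception =>
        Console.err.println(s"Failed to launch driver ${s.id}: ${e.getMessage}")
        // fall through: the for loop continues with the next candidate
    }
  }
  launched.toSeq
}
```

With this shape, one bad submission is logged and skipped rather than aborting the whole scheduling pass, which is the behavior the PR is after.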
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/13077#discussion_r94185533

--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---

```diff
@@ -559,15 +560,29 @@ private[spark] class MesosClusterScheduler(
       } else {
         val offer = offerOption.get
         val queuedTasks = tasks.getOrElseUpdate(offer.offerId, new ArrayBuffer[TaskInfo])
-        val task = createTaskInfo(submission, offer)
-        queuedTasks += task
-        logTrace(s"Using offer ${offer.offerId.getValue} to launch driver " +
-          submission.submissionId)
-        val newState = new MesosClusterSubmissionState(submission, task.getTaskId, offer.slaveId,
-          None, new Date(), None, getDriverFrameworkID(submission))
-        launchedDrivers(submission.submissionId) = newState
-        launchedDriversState.persist(submission.submissionId, newState)
-        afterLaunchCallback(submission.submissionId)
+        breakable {
```

--- End diff ---

More of a style thing, but why use `breakable`? Can you just return from the exception case? Or move the code following the catch into the try clause?
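For readers unfamiliar with `scala.util.control.Breaks` (the construct srowen is questioning), here is a minimal self-contained sketch of its two idioms: placed inside the loop body it behaves like `continue`, and wrapped around the whole loop it behaves like `break`. The demo values are illustrative only and have nothing to do with Spark's code.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.control.Breaks

// Idiom 1: breakable *inside* the loop body acts like "continue" --
// break() aborts only the current iteration's block.
val continueDemo: List[Int] = {
  val b = new Breaks
  val acc = ArrayBuffer[Int]()
  for (i <- 1 to 5) b.breakable {
    if (i % 2 == 0) b.break()  // skip even numbers, loop keeps going
    acc += i
  }
  acc.toList
}
// continueDemo == List(1, 3, 5)

// Idiom 2: breakable *around* the loop acts like "break" --
// break() abandons the entire loop.
val breakDemo: List[Int] = {
  val b = new Breaks
  val acc = ArrayBuffer[Int]()
  b.breakable {
    for (i <- 1 to 5) {
      if (i == 3) b.break()
      acc += i
    }
  }
  acc.toList
}
// breakDemo == List(1, 2)
```

Under the hood, `break()` throws a control exception that the matching `breakable` catches, which is why many reviewers prefer an early `return` or restructured try scope over `breakable` when either suffices.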
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/13077#discussion_r72010691

--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---

```diff
@@ -502,41 +502,53 @@ private[spark] class MesosClusterScheduler(
       } else {
         val offer = offerOption.get
         val taskId = TaskID.newBuilder().setValue(submission.submissionId).build()
-        val (remainingResources, cpuResourcesToUse) =
-          partitionResources(offer.resources, "cpus", driverCpu)
-        val (finalResources, memResourcesToUse) =
-          partitionResources(remainingResources.asJava, "mem", driverMem)
-        val commandInfo = buildDriverCommand(submission)
-        val appName = submission.schedulerProperties("spark.app.name")
-        val taskInfo = TaskInfo.newBuilder()
-          .setTaskId(taskId)
-          .setName(s"Driver for $appName")
-          .setSlaveId(offer.slaveId)
-          .setCommand(commandInfo)
-          .addAllResources(cpuResourcesToUse.asJava)
-          .addAllResources(memResourcesToUse.asJava)
-        offer.resources = finalResources.asJava
-        submission.schedulerProperties.get("spark.mesos.executor.docker.image").foreach { image =>
-          val container = taskInfo.getContainerBuilder()
-          val volumes = submission.schedulerProperties
-            .get("spark.mesos.executor.docker.volumes")
-            .map(MesosSchedulerBackendUtil.parseVolumesSpec)
-          val portmaps = submission.schedulerProperties
-            .get("spark.mesos.executor.docker.portmaps")
-            .map(MesosSchedulerBackendUtil.parsePortMappingsSpec)
-          MesosSchedulerBackendUtil.addDockerInfo(
-            container, image, volumes = volumes, portmaps = portmaps)
-          taskInfo.setContainer(container.build())
+        var commandInfo: CommandInfo = null
+        try {
+          commandInfo = buildDriverCommand(submission)
+        } catch {
+          case e: SparkException =>
+            afterLaunchCallback(submission.submissionId)
+            finishedDrivers += new MesosClusterSubmissionState(submission, taskId,
+              SlaveID.newBuilder().setValue("").build(), None, null, None)
+            logError(s"Failed to launch the driver with id: ${submission.submissionId}, " +
+              s"cpu: $driverCpu, mem: $driverMem, reason: ${e.getMessage}")
+        }
+        if (commandInfo != null) {
```

--- End diff ---

Rather than make a big long if statement, is it simpler and clearer to just return from the method in the catch block? Then you can even write:

```
val commandInfo = try {
  buildDriverCommand(submission)
} ...
```
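The pattern srowen sketches (binding the result of a try expression directly and returning early from the catch case) can be shown in a self-contained form. The names below (`buildDriverCommand`, `launchOne`) are simplified stand-ins, not Spark's real signatures; in Scala, `try`/`catch` is an expression, so its value can initialize a `val` with no intervening `null`.

```scala
// Stand-in for the real buildDriverCommand, which throws on a bad submission.
def buildDriverCommand(id: String): String =
  if (id.isEmpty) throw new RuntimeException("no command to build") else s"launch $id"

def launchOne(id: String): Option[String] = {
  val commandInfo: String =
    try {
      buildDriverCommand(id)
    } catch {
      case e: Exception =>
        Console.err.println(s"Failed to launch driver $id: ${e.getMessage}")
        return None  // early return replaces the var + null check + long if block
    }
  // From here on, commandInfo is guaranteed valid: no null, no guard needed.
  Some(commandInfo)
}
```

Compared to the `var commandInfo: CommandInfo = null` version in the diff, this keeps `commandInfo` immutable and removes the trailing `if (commandInfo != null)` wrapper, at the cost of an early `return` that only works when the failure handling really should end the method.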