[GitHub] spark pull request #19428: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-04 Thread susanxhuynh
GitHub user susanxhuynh opened a pull request:

https://github.com/apache/spark/pull/19428

[SPARK-22131][MESOS] Mesos driver secrets

## What changes were proposed in this pull request?

The driver launches executors that have access to environment-based or file-based secrets.

Most of the changes are a refactor of the `dispatcher` secrets support -
moving it to a common location that can be used by both the dispatcher and
the driver. The same goes for the unit tests.

## How was this patch tested?

Unit tests.
Tested in DC/OS.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mesosphere/spark sh-mesos-driver-secret

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19428.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19428


commit 4b7ae0a8e802f8ef6a159f3ce95b3203352b548f
Author: Susan X. Huynh 
Date:   2017-10-04T11:30:31Z

[SPARK-22131] Mesos driver secrets. The driver launches executors that have 
access to env or file-based secrets.




---




[GitHub] spark pull request #19428: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-04 Thread susanxhuynh
Github user susanxhuynh closed the pull request at:

https://github.com/apache/spark/pull/19428


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-05 Thread susanxhuynh
GitHub user susanxhuynh opened a pull request:

https://github.com/apache/spark/pull/19437

[SPARK-22131][MESOS] Mesos driver secrets

## Background

In #18837, @ArtRand added Mesos secrets support to the dispatcher. **This
PR adds the same secrets support to the driver.** This means that if the
secret configs are set, the driver will launch executors that have access to
either environment-based or file-based secrets.

One use case for this is to support TLS in the driver <=> executor 
communication.

## What changes were proposed in this pull request?

Most of the changes are a refactor of the dispatcher secrets support
(#18837) - moving it to a common location that can be used by both the
dispatcher and the driver. The same goes for the unit tests.

## How was this patch tested?

There are four config combinations: [env-based or file-based] x [value or
reference secret]. For each combination (see the sketch after this list):
- Added a unit test.
- Tested in DC/OS.
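
For concreteness, a minimal sketch of the four combinations expressed as `SparkConf` settings. The key names follow the docs changes in this PR (executor-side shown); the secret names, values, and paths here are made up:

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()

// 1. Env-based reference secret: resolved from a secret store by name.
conf.set("spark.mesos.executor.secret.names", "/path/to/secret")
conf.set("spark.mesos.executor.secret.envkeys", "SECRET_ENV_KEY")

// 2. Env-based value secret (value and reference secrets are mutually
// exclusive, per the SparkException in the shared helper):
// conf.set("spark.mesos.executor.secret.values", "password")
// conf.set("spark.mesos.executor.secret.envkeys", "PASSWORD")

// 3. File-based reference secret, written to a container-relative path:
// conf.set("spark.mesos.executor.secret.names", "/path/to/secret")
// conf.set("spark.mesos.executor.secret.filenames", "topsecret")

// 4. File-based value secret:
// conf.set("spark.mesos.executor.secret.values", "password")
// conf.set("spark.mesos.executor.secret.filenames", "topsecret")
```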


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mesosphere/spark sh-mesos-driver-secret

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19437.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19437


commit b289bcc95f0b67cda94ddf416fc9a15e5d1855b4
Author: Susan X. Huynh 
Date:   2017-10-04T11:30:31Z

[SPARK-22131] Mesos driver secrets. The driver launches executors that have 
access to env or file-based secrets.

commit 6f062c00f6382d266619b4a56a753ec27d1db10b
Author: Susan X. Huynh 
Date:   2017-10-05T12:07:20Z

[SPARK-22131] Updated docs




---




[GitHub] spark issue #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-05 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/19437
  
@ArtRand @skonto Please review. Tests passed.


---




[GitHub] spark pull request #19374: [SPARK-22145][MESOS] fix supervise with checkpoin...

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19374#discussion_r143361887
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---
@@ -276,8 +276,8 @@ private[spark] class MesosClusterScheduler(
   private def recoverState(): Unit = {
     stateLock.synchronized {
       launchedDriversState.fetchAll[MesosClusterSubmissionState]().foreach { state =>
-        launchedDrivers(state.taskId.getValue) = state
-        pendingRecover(state.taskId.getValue) = state.slaveId
+        launchedDrivers(state.driverDescription.submissionId) = state
+        pendingRecover(state.taskId.toString) = state.slaveId
--- End diff --

Why did this change? I think the original `getValue` is the standard way to 
get the value of this TaskID proto. 
http://mesos.apache.org/api/latest/java/org/apache/mesos/Protos.TaskID.html


---




[GitHub] spark pull request #19374: [SPARK-22145][MESOS] fix supervise with checkpoin...

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19374#discussion_r143487275
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---
@@ -276,8 +276,8 @@ private[spark] class MesosClusterScheduler(
   private def recoverState(): Unit = {
     stateLock.synchronized {
       launchedDriversState.fetchAll[MesosClusterSubmissionState]().foreach { state =>
-        launchedDrivers(state.taskId.getValue) = state
-        pendingRecover(state.taskId.getValue) = state.slaveId
+        launchedDrivers(state.driverDescription.submissionId) = state
--- End diff --

Maybe modify the comments up in L.138-150 ^^ to clarify which data 
structures are keyed by submission ID vs. task ID:
- Keyed by submission ID: `launchedDrivers`, `queuedDrivers`, 
`pendingRetryDrivers`, `finishedDrivers`
- Keyed by task ID: `pendingRecover`
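
For illustration, the suggested clarification might read like this (a sketch; only the collection names come from the list above):

```scala
// Comment sketch for MesosClusterScheduler's driver-tracking state:
//
// Keyed by submission ID: launchedDrivers, queuedDrivers,
//   pendingRetryDrivers, finishedDrivers.
// Keyed by task ID: pendingRecover. A task ID may differ from its
//   submission ID once a driver has been relaunched.
```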


---




[GitHub] spark pull request #19374: [SPARK-22145][MESOS] fix supervise with checkpoin...

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19374#discussion_r143484031
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---
@@ -804,45 +814,52 @@ private[spark] class MesosClusterScheduler(
     logInfo(s"Received status update: taskId=${taskId}" +
       s" state=${status.getState}" +
       s" message=${status.getMessage}" +
-      s" reason=${status.getReason}");
+      s" reason=${status.getReason}")
 
     stateLock.synchronized {
-      if (launchedDrivers.contains(taskId)) {
+      val subId = getSumbmissionIdFromTaskId(taskId)
+      if (launchedDrivers.contains(subId)) {
         if (status.getReason == Reason.REASON_RECONCILIATION &&
           !pendingRecover.contains(taskId)) {
           // Task has already received update and no longer requires reconciliation.
           return
         }
-        val state = launchedDrivers(taskId)
+        val state = launchedDrivers(subId)
         // Check if the driver is supervise enabled and can be relaunched.
         if (state.driverDescription.supervise && shouldRelaunch(status.getState)) {
-          removeFromLaunchedDrivers(taskId)
+          removeFromLaunchedDrivers(subId)
           state.finishDate = Some(new Date())
           val retryState: Option[MesosClusterRetryState] = state.driverDescription.retryState
           val (retries, waitTimeSec) = retryState
             .map { rs => (rs.retries + 1, Math.min(maxRetryWaitTime, rs.waitTime * 2)) }
             .getOrElse{ (1, 1) }
           val nextRetry = new Date(new Date().getTime + waitTimeSec * 1000L)
-
           val newDriverDescription = state.driverDescription.copy(
             retryState = Some(new MesosClusterRetryState(status, retries, nextRetry, waitTimeSec)))
-          addDriverToPending(newDriverDescription, taskId);
+          addDriverToPending(newDriverDescription, newDriverDescription.submissionId)
         } else if (TaskState.isFinished(mesosToTaskState(status.getState))) {
-          removeFromLaunchedDrivers(taskId)
-          state.finishDate = Some(new Date())
-          if (finishedDrivers.size >= retainedDrivers) {
-            val toRemove = math.max(retainedDrivers / 10, 1)
-            finishedDrivers.trimStart(toRemove)
-          }
-          finishedDrivers += state
+          retireDriver(subId, state, status)
         }
         state.mesosTaskStatus = Option(status)
       } else {
-        logError(s"Unable to find driver $taskId in status update")
+        logError(s"Unable to find driver with $taskId in status update")
       }
     }
   }
 
+  private def retireDriver(
+      submissionId: String,
+      state: MesosClusterSubmissionState,
+      status: TaskStatus) = {
--- End diff --

This parameter, `status`, is not used.


---




[GitHub] spark pull request #19374: [SPARK-22145][MESOS] fix supervise with checkpoin...

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19374#discussion_r143344688
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---
@@ -374,6 +375,15 @@ private[spark] class MesosClusterScheduler(
     s"${frameworkId}-${desc.submissionId}${retries}"
   }
 
+  private def getDriverTaskId(desc: MesosDriverDescription): String = {
+    val sId = desc.submissionId
+    desc.retryState.map(state => sId + s"-retry-${state.retries.toString}").getOrElse(sId)
+  }
+
+  private def getSumbmissionIdFromTaskId(taskId: String): String = {
--- End diff --

typo: "Submission"
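
For reference, the renamed pair might look like this (a sketch inside MesosClusterScheduler; the `-retry-N` suffix format comes from `getDriverTaskId` in the quoted diff, while the split-based parsing is an assumption):

```scala
private def getDriverTaskId(desc: MesosDriverDescription): String = {
  val sId = desc.submissionId
  // Retries get a "-retry-N" suffix so each attempt has a unique task ID.
  desc.retryState.map(state => s"$sId-retry-${state.retries}").getOrElse(sId)
}

// Renamed from getSumbmissionIdFromTaskId; recovers the submission ID by
// stripping any "-retry-N" suffix (assumed inverse of getDriverTaskId).
private def getSubmissionIdFromTaskId(taskId: String): String = {
  taskId.split("-retry-").head
}
```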


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/19374
  
@skonto One more question: in your screenshot of the History Server, I 
noticed the "Completed" time is 1969-12-31 for all the drivers (the original 
one, retry-1, and retry-2). Is that to be expected? (1969-12-31 is what an 
unset, zero-valued timestamp renders as in US time zones.)


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r143549493
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -170,9 +174,122 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
       containerInfo.addNetworkInfos(info)
     }
 
+    getSecretVolume(conf, secretConfig).foreach { volume =>
+      if (volume.getSource.getSecret.getReference.isInitialized) {
+        logInfo(s"Setting reference secret ${volume.getSource.getSecret.getReference.getName}" +
+          s"on file ${volume.getContainerPath}")
+      } else {
+        logInfo(s"Setting secret on file name=${volume.getContainerPath}")
+      }
+      containerInfo.addVolumes(volume)
+    }
+
     containerInfo
   }
 
+  def addSecretEnvVar(
+      envBuilder: Environment.Builder,
+      conf: SparkConf,
+      secretConfig: MesosSecretConfig): Unit = {
+    getSecretEnvVar(conf, secretConfig).foreach { variable =>
+      if (variable.getSecret.getReference.isInitialized) {
+        logInfo(s"Setting reference secret ${variable.getSecret.getReference.getName}" +
+          s"on file ${variable.getName}")
+      } else {
+        logInfo(s"Setting secret on environment variable name=${variable.getName}")
+      }
+      envBuilder.addVariables(variable)
+    }
+  }
+
+  private def getSecrets(conf: SparkConf, secretConfig: MesosSecretConfig):
+  Seq[Secret] = {
+    def createValueSecret(data: String): Secret = {
+      Secret.newBuilder()
+        .setType(Secret.Type.VALUE)
+        .setValue(Secret.Value.newBuilder().setData(ByteString.copyFrom(data.getBytes)))
+        .build()
+    }
+
+    def createReferenceSecret(name: String): Secret = {
+      Secret.newBuilder()
+        .setReference(Secret.Reference.newBuilder().setName(name))
+        .setType(Secret.Type.REFERENCE)
+        .build()
+    }
+
+    val referenceSecrets: Seq[Secret] =
+      conf.get(secretConfig.SECRET_NAME).getOrElse(Nil).map(s => createReferenceSecret(s))
+
+    val valueSecrets: Seq[Secret] = {
+      conf.get(secretConfig.SECRET_VALUE).getOrElse(Nil).map(s => createValueSecret(s))
+    }
+
+    if (valueSecrets.nonEmpty && referenceSecrets.nonEmpty) {
+      throw new SparkException("Cannot specify VALUE type secrets and REFERENCE types ones")
+    }
+
+    if (referenceSecrets.nonEmpty) referenceSecrets else valueSecrets
+  }
+
+  private def illegalSecretInput(dest: Seq[String], s: Seq[Secret]): Boolean = {
--- End diff --

Done


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r143540778
  
--- Diff: resource-managers/mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/Utils.scala ---
@@ -105,4 +108,108 @@ object Utils {
   def createTaskId(taskId: String): TaskID = {
     TaskID.newBuilder().setValue(taskId).build()
   }
+
+  def configEnvBasedRefSecrets(secretConfig: MesosSecretConfig): Map[String, String] = {
+    val secretName = "/path/to/secret,/anothersecret"
+    val envKey = "SECRET_ENV_KEY,PASSWORD"
+    Map(
+      secretConfig.SECRET_NAME.key -> secretName,
+      secretConfig.SECRET_ENVKEY.key -> envKey
+    )
+  }
+
+  def verifyEnvBasedRefSecrets(launchedTasks: List[TaskInfo]): Unit = {
+    val envVars = launchedTasks.head
+      .getCommand
+      .getEnvironment
+      .getVariablesList
+      .asScala
+    assert(envVars
+      .filter(!_.getName.startsWith("SPARK_")).length == 2)  // user-defined secret env vars
--- End diff --

Thanks, I'll fix it.


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r143543546
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/config.scala ---
@@ -21,6 +21,39 @@ import java.util.concurrent.TimeUnit
 
 import org.apache.spark.internal.config.ConfigBuilder
 
+private[spark] class MesosSecretConfig(taskType: String) {
+  private[spark] val SECRET_NAME =
--- End diff --

I see what you mean. I'll change them.


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r143546897
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -17,10 +17,14 @@
 
 package org.apache.spark.scheduler.cluster.mesos
 
-import org.apache.mesos.Protos.{ContainerInfo, Image, NetworkInfo, Parameter, Volume}
+import org.apache.mesos.Protos._
--- End diff --

Hmm, IntelliJ added this automatically - I'll fix it.


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r143548722
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -122,7 +126,7 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
     .toList
   }
 
-  def containerInfo(conf: SparkConf): ContainerInfo.Builder = {
+  def containerInfo(conf: SparkConf, secretConfig: MesosSecretConfig): ContainerInfo.Builder = {
--- End diff --

OK, I'll use `buildContainerInfo`.


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r143571471
  
--- Diff: resource-managers/mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/Utils.scala ---
@@ -105,4 +108,108 @@ object Utils {
   def createTaskId(taskId: String): TaskID = {
     TaskID.newBuilder().setValue(taskId).build()
   }
+
+  def configEnvBasedRefSecrets(secretConfig: MesosSecretConfig): Map[String, String] = {
+    val secretName = "/path/to/secret,/anothersecret"
+    val envKey = "SECRET_ENV_KEY,PASSWORD"
+    Map(
+      secretConfig.SECRET_NAME.key -> secretName,
+      secretConfig.SECRET_ENVKEY.key -> envKey
+    )
+  }
+
+  def verifyEnvBasedRefSecrets(launchedTasks: List[TaskInfo]): Unit = {
+    val envVars = launchedTasks.head
+      .getCommand
+      .getEnvironment
+      .getVariablesList
+      .asScala
+    assert(envVars
+      .filter(!_.getName.startsWith("SPARK_")).length == 2)  // user-defined secret env vars
+    val variableOne = envVars.filter(_.getName == "SECRET_ENV_KEY").head
+    assert(variableOne.getSecret.isInitialized)
+    assert(variableOne.getSecret.getType == Secret.Type.REFERENCE)
+    assert(variableOne.getSecret.getReference.getName == "/path/to/secret")
+    assert(variableOne.getType == Environment.Variable.Type.SECRET)
+    val variableTwo = envVars.filter(_.getName == "PASSWORD").head
+    assert(variableTwo.getSecret.isInitialized)
+    assert(variableTwo.getSecret.getType == Secret.Type.REFERENCE)
+    assert(variableTwo.getSecret.getReference.getName == "/anothersecret")
+    assert(variableTwo.getType == Environment.Variable.Type.SECRET)
+  }
+
+  def configEnvBasedValueSecrets(secretConfig: MesosSecretConfig): Map[String, String] = {
+    val secretValues = "user,password"
+    val envKeys = "USER,PASSWORD"
+    Map(
+      secretConfig.SECRET_VALUE.key -> secretValues,
+      secretConfig.SECRET_ENVKEY.key -> envKeys
+    )
+  }
+
+  def verifyEnvBasedValueSecrets(launchedTasks: List[TaskInfo]): Unit = {
+    val envVars = launchedTasks.head
+      .getCommand
+      .getEnvironment
+      .getVariablesList
+      .asScala
+    assert(envVars
+      .filter(!_.getName.startsWith("SPARK_")).length == 2)  // user-defined secret env vars
--- End diff --

Done


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r143565549
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -170,9 +174,122 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
       containerInfo.addNetworkInfos(info)
     }
 
+    getSecretVolume(conf, secretConfig).foreach { volume =>
+      if (volume.getSource.getSecret.getReference.isInitialized) {
+        logInfo(s"Setting reference secret ${volume.getSource.getSecret.getReference.getName}" +
+          s"on file ${volume.getContainerPath}")
+      } else {
+        logInfo(s"Setting secret on file name=${volume.getContainerPath}")
+      }
+      containerInfo.addVolumes(volume)
+    }
+
     containerInfo
   }
 
+  def addSecretEnvVar(
+      envBuilder: Environment.Builder,
+      conf: SparkConf,
+      secretConfig: MesosSecretConfig): Unit = {
+    getSecretEnvVar(conf, secretConfig).foreach { variable =>
+      if (variable.getSecret.getReference.isInitialized) {
+        logInfo(s"Setting reference secret ${variable.getSecret.getReference.getName}" +
+          s"on file ${variable.getName}")
+      } else {
+        logInfo(s"Setting secret on environment variable name=${variable.getName}")
+      }
+      envBuilder.addVariables(variable)
+    }
+  }
+
+  private def getSecrets(conf: SparkConf, secretConfig: MesosSecretConfig):
+  Seq[Secret] = {
+    def createValueSecret(data: String): Secret = {
+      Secret.newBuilder()
+        .setType(Secret.Type.VALUE)
+        .setValue(Secret.Value.newBuilder().setData(ByteString.copyFrom(data.getBytes)))
+        .build()
+    }
+
+    def createReferenceSecret(name: String): Secret = {
+      Secret.newBuilder()
+        .setReference(Secret.Reference.newBuilder().setName(name))
+        .setType(Secret.Type.REFERENCE)
+        .build()
+    }
+
+    val referenceSecrets: Seq[Secret] =
+      conf.get(secretConfig.SECRET_NAME).getOrElse(Nil).map(s => createReferenceSecret(s))
+
+    val valueSecrets: Seq[Secret] = {
+      conf.get(secretConfig.SECRET_VALUE).getOrElse(Nil).map(s => createValueSecret(s))
+    }
+
+    if (valueSecrets.nonEmpty && referenceSecrets.nonEmpty) {
+      throw new SparkException("Cannot specify VALUE type secrets and REFERENCE types ones")
+    }
+
+    if (referenceSecrets.nonEmpty) referenceSecrets else valueSecrets
+  }
+
+  private def illegalSecretInput(dest: Seq[String], s: Seq[Secret]): Boolean = {
+    if (dest.isEmpty) {  // no destination set (ie not using secrets of this type
--- End diff --

Good point. If they specify paths but no secrets, it should throw an 
exception.
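
A condensed sketch of the tightened check (hypothetical restructuring, equivalent in effect to the revised `illegalSecretInput` that appears later in this thread):

```scala
private def illegalSecretInput(dest: Seq[String], secrets: Seq[Secret]): Boolean = {
  // Destinations (file paths or env keys) must correspond one-to-one with
  // secrets; this also rejects paths specified with no secrets at all,
  // since then dest.length != 0 == secrets.length.
  dest.nonEmpty && dest.length != secrets.length
}
```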


---




[GitHub] spark issue #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-11 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/19437
  
@skonto I haven't tested with TLS; I'll work on that in the next couple of 
days.


---




[GitHub] spark issue #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-11 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/19437
  
@vanzin Would you mind reviewing this PR? It's a follow-up to @ArtRand's 
secrets PR (#18837).


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-10-17 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/19374
  
LGTM


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-17 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r145268821
  
--- Diff: docs/running-on-mesos.md ---
@@ -522,6 +522,43 @@ See the [configuration page](configuration.html) for information on Spark config
 
 
 
+  spark.mesos.executor.secret.envkeys
+  (none)
+  
+    A comma-separated list that, if set, the contents of the secret referenced
--- End diff --

I'll add some examples. I didn't find a good general example; I added a 
separate example to each config option.


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-17 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r145282300
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosFineGrainedSchedulerBackend.scala ---
@@ -159,7 +160,8 @@ private[spark] class MesosFineGrainedSchedulerBackend(
       .setCommand(command)
       .setData(ByteString.copyFrom(createExecArg()))
 
-    executorInfo.setContainer(MesosSchedulerBackendUtil.containerInfo(sc.conf))
+    executorInfo.setContainer(
--- End diff --

Okay, I've reverted this change in fine-grained mode.


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-17 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r145286357
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -170,9 +175,119 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
       containerInfo.addNetworkInfos(info)
     }
 
+    getSecretVolume(conf, secretConfig).foreach { volume =>
+      if (volume.getSource.getSecret.getReference.isInitialized) {
+        logInfo(s"Setting reference secret ${volume.getSource.getSecret.getReference.getName}" +
--- End diff --

Fixed.


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-17 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r145290625
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -170,9 +175,119 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
       containerInfo.addNetworkInfos(info)
     }
 
+    getSecretVolume(conf, secretConfig).foreach { volume =>
+      if (volume.getSource.getSecret.getReference.isInitialized) {
+        logInfo(s"Setting reference secret ${volume.getSource.getSecret.getReference.getName}" +
+          s"on file ${volume.getContainerPath}")
+      } else {
+        logInfo(s"Setting secret on file name=${volume.getContainerPath}")
+      }
+      containerInfo.addVolumes(volume)
+    }
+
     containerInfo
   }
 
+  def addSecretEnvVar(
+      envBuilder: Environment.Builder,
+      conf: SparkConf,
+      secretConfig: MesosSecretConfig): Unit = {
+    getSecretEnvVar(conf, secretConfig).foreach { variable =>
+      if (variable.getSecret.getReference.isInitialized) {
+        logInfo(s"Setting reference secret ${variable.getSecret.getReference.getName}" +
--- End diff --

Fixed


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-17 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r145290603
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -170,9 +175,119 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
       containerInfo.addNetworkInfos(info)
     }
 
+    getSecretVolume(conf, secretConfig).foreach { volume =>
+      if (volume.getSource.getSecret.getReference.isInitialized) {
+        logInfo(s"Setting reference secret ${volume.getSource.getSecret.getReference.getName}" +
+          s"on file ${volume.getContainerPath}")
+      } else {
+        logInfo(s"Setting secret on file name=${volume.getContainerPath}")
+      }
+      containerInfo.addVolumes(volume)
+    }
+
     containerInfo
   }
 
+  def addSecretEnvVar(
+      envBuilder: Environment.Builder,
+      conf: SparkConf,
+      secretConfig: MesosSecretConfig): Unit = {
+    getSecretEnvVar(conf, secretConfig).foreach { variable =>
+      if (variable.getSecret.getReference.isInitialized) {
+        logInfo(s"Setting reference secret ${variable.getSecret.getReference.getName}" +
+          s"on file ${variable.getName}")
+      } else {
+        logInfo(s"Setting secret on environment variable name=${variable.getName}")
+      }
+      envBuilder.addVariables(variable)
+    }
+  }
+
+  private def getSecrets(conf: SparkConf, secretConfig: MesosSecretConfig):
+  Seq[Secret] = {
--- End diff --

Fixed


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-17 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r145282997
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -122,7 +126,8 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
     .toList
   }
 
-  def containerInfo(conf: SparkConf): ContainerInfo.Builder = {
+  def buildContainerInfo(conf: SparkConf, secretConfig: MesosSecretConfig):
--- End diff --

I've removed that second parameter and moved the secret stuff out of this 
method, since it's called by Fine-Grained mode, and we don't want to touch 
Fine-Grained mode.


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-17 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r145287767
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -170,9 +175,119 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
       containerInfo.addNetworkInfos(info)
     }
 
+    getSecretVolume(conf, secretConfig).foreach { volume =>
+      if (volume.getSource.getSecret.getReference.isInitialized) {
+        logInfo(s"Setting reference secret ${volume.getSource.getSecret.getReference.getName}" +
+          s"on file ${volume.getContainerPath}")
+      } else {
+        logInfo(s"Setting secret on file name=${volume.getContainerPath}")
+      }
+      containerInfo.addVolumes(volume)
+    }
+
     containerInfo
   }
 
+  def addSecretEnvVar(
--- End diff --

I've removed this method. To be more consistent, I've moved this code back 
into MesosClusterScheduler. There's a little duplication, because 
MesosCoarseGrainedSchedulerBackend now has a similar code snippet, but it does 
avoid the mutation.
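
Concretely, the caller-side pattern (as in the MesosClusterScheduler diff later in this thread; the builder name here is illustrative) looks roughly like:

```scala
// Each scheduler owns its Environment.Builder and adds the returned
// Variables itself, so the shared helper no longer mutates a builder
// passed across module boundaries.
val envBuilder = Environment.newBuilder()
MesosSchedulerBackendUtil.getSecretEnvVar(desc.conf, config.driverSecretConfig)
  .foreach { variable => envBuilder.addVariables(variable) }
```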


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-18 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r145437968
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -173,6 +178,90 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
     containerInfo
   }
 
+  private def getSecrets(conf: SparkConf, secretConfig: MesosSecretConfig): Seq[Secret] = {
+    def createValueSecret(data: String): Secret = {
+      Secret.newBuilder()
+        .setType(Secret.Type.VALUE)
+        .setValue(Secret.Value.newBuilder().setData(ByteString.copyFrom(data.getBytes)))
+        .build()
+    }
+
+    def createReferenceSecret(name: String): Secret = {
+      Secret.newBuilder()
+        .setReference(Secret.Reference.newBuilder().setName(name))
+        .setType(Secret.Type.REFERENCE)
+        .build()
+    }
+
+    val referenceSecrets: Seq[Secret] =
+      conf.get(secretConfig.SECRET_NAMES).getOrElse(Nil).map(s => createReferenceSecret(s))
+
+    val valueSecrets: Seq[Secret] = {
+      conf.get(secretConfig.SECRET_VALUES).getOrElse(Nil).map(s => createValueSecret(s))
+    }
+
+    if (valueSecrets.nonEmpty && referenceSecrets.nonEmpty) {
+      throw new SparkException("Cannot specify VALUE type secrets and REFERENCE types ones")
+    }
+
+    if (referenceSecrets.nonEmpty) referenceSecrets else valueSecrets
+  }
+
+  private def illegalSecretInput(dest: Seq[String], secrets: Seq[Secret]): Boolean = {
+    if (dest.nonEmpty) {
+      // make sure there is a one-to-one correspondence between destinations and secrets
+      if (dest.length != secrets.length) {
+        return true
+      }
+    }
+    false
+  }
+
+  def getSecretVolume(conf: SparkConf, secretConfig: MesosSecretConfig): List[Volume] = {
+    val secrets = getSecrets(conf, secretConfig)
+    val secretPaths: Seq[String] =
+      conf.get(secretConfig.SECRET_FILENAMES).getOrElse(Nil)
+
+    if (illegalSecretInput(secretPaths, secrets)) {
+      throw new SparkException(
+        s"Need to give equal numbers of secrets and file paths for file-based " +
+          s"reference secrets got secrets $secrets, and paths $secretPaths")
+    }
+
+    secrets.zip(secretPaths).map {
+      case (s, p) =>
+        val source = Volume.Source.newBuilder()
+          .setType(Volume.Source.Type.SECRET)
+          .setSecret(s)
+        Volume.newBuilder()
+          .setContainerPath(p)
+          .setSource(source)
+          .setMode(Volume.Mode.RO)
+          .build
+    }.toList
+  }
+
+  def getSecretEnvVar(conf: SparkConf, secretConfig: MesosSecretConfig):
+  List[Variable] = {
--- End diff --

Fixed


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-18 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r145437791
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -122,7 +126,8 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
     .toList
   }
 
-  def containerInfo(conf: SparkConf): ContainerInfo.Builder = {
+  def buildContainerInfo(conf: SparkConf):
+  ContainerInfo.Builder = {
--- End diff --

Fixed


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-18 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r145437403
  
--- Diff: docs/running-on-mesos.md ---
@@ -501,23 +503,74 @@ See the [configuration page](configuration.html) for information on Spark config
     spark.mesos.driver.secret.names or spark.mesos.driver.secret.values will be
     written to the provided file. Paths are relative to the container's work
     directory.  Absolute paths must already exist.  Consult the Mesos Secret
-    protobuf for more information.
+    protobuf for more information. Example:
--- End diff --

Mentioned custom module and added link to Mesos documentation.


---




[GitHub] spark issue #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-18 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/19437
  
@srowen Would you like to help review? Adding Mesos secrets support in 
driver for executor tasks.


---




[GitHub] spark pull request #19390: [SPARK-18935][MESOS] Fix dynamic reservations on ...

2017-10-18 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19390#discussion_r145530482
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala ---
@@ -446,6 +456,9 @@ private[spark] class MesosCoarseGrainedSchedulerBackend(
         totalGpusAcquired += taskGPUs
         gpusByTaskId(taskId) = taskGPUs
       }
+    } else {
+      logDebug(s"Cannot launch a task for offer with id: $offerId on slave with id: $slaveId." +
--- End diff --

Add a space at the end of this sentence (before the next sentence).


---




[GitHub] spark issue #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-20 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/19437
  
@vanzin Ping, would you mind reviewing this PR?


---




[GitHub] spark issue #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-23 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/19437
  
@srowen Ping, would you like to help review?


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-25 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r146701899
  
--- Diff: docs/running-on-mesos.md ---
@@ -501,23 +503,78 @@ See the [configuration page](configuration.html) for information on Spark config
     spark.mesos.driver.secret.names or spark.mesos.driver.secret.values will be
     written to the provided file. Paths are relative to the container's work
     directory.  Absolute paths must already exist.  Consult the Mesos Secret
-    protobuf for more information.
+    protobuf for more information. Note: File-based secrets require a custom
+    <a href="http://mesos.apache.org/documentation/latest/secrets/">SecretResolver</a>
+    module. Example:
+
+    filename1,filename2
   
 
 
   spark.mesos.driver.secret.names
   (none)
   
     A comma-separated list of secret references. Consult the Mesos Secret
-    protobuf for more information.
+    protobuf for more information. Example:
+
+    secretname1,secretname2
   
 
 
   spark.mesos.driver.secret.values
   (none)
   
     A comma-separated list of secret values. Consult the Mesos Secret
-    protobuf for more information.
+    protobuf for more information. Example:
--- End diff --

I have grouped some of these properties together to make the description 
clearer. Please let me know what you think.


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-25 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r146702275
  
--- Diff: docs/running-on-mesos.md ---
@@ -501,23 +503,78 @@ See the [configuration page](configuration.html) for information on Spark config
     spark.mesos.driver.secret.names or spark.mesos.driver.secret.values will be
     written to the provided file. Paths are relative to the container's work
     directory.  Absolute paths must already exist.  Consult the Mesos Secret
-    protobuf for more information.
+    protobuf for more information. Note: File-based secrets require a custom
+    <a href="http://mesos.apache.org/documentation/latest/secrets/">SecretResolver</a>
+    module. Example:
+
+    filename1,filename2
   
 
 
   spark.mesos.driver.secret.names
   (none)
   
     A comma-separated list of secret references. Consult the Mesos Secret
-    protobuf for more information.
+    protobuf for more information. Example:
+
+    secretname1,secretname2
   
 
 
   spark.mesos.driver.secret.values
   (none)
   
     A comma-separated list of secret values. Consult the Mesos Secret
-    protobuf for more information.
+    protobuf for more information. Example:
+
+    secretvalue1,secretvalue2
+  
+
+
+
+  spark.mesos.executor.secret.envkeys
+  (none)
+  
+    A comma-separated list that, if set, the contents of the secret referenced
--- End diff --

Let me try to improve the descriptions of all these secret configs.


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-25 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r146712893
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -173,6 +178,90 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
     containerInfo
   }
 
+  private def getSecrets(conf: SparkConf, secretConfig: MesosSecretConfig): Seq[Secret] = {
+    def createValueSecret(data: String): Secret = {
+      Secret.newBuilder()
+        .setType(Secret.Type.VALUE)
+        .setValue(Secret.Value.newBuilder().setData(ByteString.copyFrom(data.getBytes)))
+        .build()
+    }
+
+    def createReferenceSecret(name: String): Secret = {
+      Secret.newBuilder()
+        .setReference(Secret.Reference.newBuilder().setName(name))
+        .setType(Secret.Type.REFERENCE)
+        .build()
+    }
+
+    val referenceSecrets: Seq[Secret] =
+      conf.get(secretConfig.SECRET_NAMES).getOrElse(Nil).map(s => createReferenceSecret(s))
+
+    val valueSecrets: Seq[Secret] = {
+      conf.get(secretConfig.SECRET_VALUES).getOrElse(Nil).map(s => createValueSecret(s))
+    }
+
+    if (valueSecrets.nonEmpty && referenceSecrets.nonEmpty) {
+      throw new SparkException("Cannot specify VALUE type secrets and REFERENCE types ones")
+    }
+
+    if (referenceSecrets.nonEmpty) referenceSecrets else valueSecrets
+  }
+
+  private def illegalSecretInput(dest: Seq[String], secrets: Seq[Secret]): Boolean = {
+    if (dest.nonEmpty) {
+      // make sure there is a one-to-one correspondence between destinations and secrets
+      if (dest.length != secrets.length) {
+        return true
+      }
+    }
+    false
+  }
+
+  def getSecretVolume(conf: SparkConf, secretConfig: MesosSecretConfig): List[Volume] = {
+    val secrets = getSecrets(conf, secretConfig)
+    val secretPaths: Seq[String] =
+      conf.get(secretConfig.SECRET_FILENAMES).getOrElse(Nil)
+
+    if (illegalSecretInput(secretPaths, secrets)) {
+      throw new SparkException(
+        s"Need to give equal numbers of secrets and file paths for file-based " +
+          s"reference secrets got secrets $secrets, and paths $secretPaths")
+    }
+
+    secrets.zip(secretPaths).map {
+      case (s, p) =>
+        val source = Volume.Source.newBuilder()
+          .setType(Volume.Source.Type.SECRET)
+          .setSecret(s)
+        Volume.newBuilder()
+          .setContainerPath(p)
+          .setSource(source)
+          .setMode(Volume.Mode.RO)
+          .build
+    }.toList
+  }
+
+  def getSecretEnvVar(conf: SparkConf, secretConfig: MesosSecretConfig):
+    List[Variable] = {
+    val secrets = getSecrets(conf, secretConfig)
+    val secretEnvKeys = conf.get(secretConfig.SECRET_ENVKEYS).getOrElse(Nil)
+    if (illegalSecretInput(secretEnvKeys, secrets)) {
+      throw new SparkException(
+        s"Need to give equal numbers of secrets and environment keys " +
+          s"for environment-based reference secrets got secrets $secrets, " +
+          s"and keys $secretEnvKeys")
+    }
+
+    secrets.zip(secretEnvKeys).map {
+      case (s, k) =>
--- End diff --

ok


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-25 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r146704074
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/config.scala ---
@@ -21,6 +21,39 @@ import java.util.concurrent.TimeUnit
 
 import org.apache.spark.internal.config.ConfigBuilder
 
+private[spark] class MesosSecretConfig(taskType: String) {
--- End diff --

Done. I've moved this `MesosSecretConfig` class inside the `config` object, 
and made the constructor private to that object.
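
The resulting shape might be roughly as follows (a sketch: the nesting and constructor visibility come from the comment above; the config entry and the two instances are illustrative):

```scala
package object config {

  private[spark] class MesosSecretConfig private[config] (taskType: String) {
    // Each task type ("driver" or "executor") gets its own namespaced keys.
    private[spark] val SECRET_NAMES =
      ConfigBuilder(s"spark.mesos.$taskType.secret.names")
        .doc("A comma-separated list of secret reference names.")
        .stringConf
        .toSequence
        .createOptional

    // SECRET_VALUES, SECRET_ENVKEYS, SECRET_FILENAMES would follow the same pattern.
  }

  private[spark] val driverSecretConfig = new MesosSecretConfig("driver")
  private[spark] val executorSecretConfig = new MesosSecretConfig("executor")
}
```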


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-25 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r146712778
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -173,6 +178,90 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
     containerInfo
   }
 
+  private def getSecrets(conf: SparkConf, secretConfig: MesosSecretConfig): Seq[Secret] = {
+    def createValueSecret(data: String): Secret = {
+      Secret.newBuilder()
+        .setType(Secret.Type.VALUE)
+        .setValue(Secret.Value.newBuilder().setData(ByteString.copyFrom(data.getBytes)))
+        .build()
+    }
+
+    def createReferenceSecret(name: String): Secret = {
+      Secret.newBuilder()
+        .setReference(Secret.Reference.newBuilder().setName(name))
+        .setType(Secret.Type.REFERENCE)
+        .build()
+    }
+
+    val referenceSecrets: Seq[Secret] =
+      conf.get(secretConfig.SECRET_NAMES).getOrElse(Nil).map(s => createReferenceSecret(s))
+
+    val valueSecrets: Seq[Secret] = {
+      conf.get(secretConfig.SECRET_VALUES).getOrElse(Nil).map(s => createValueSecret(s))
+    }
+
+    if (valueSecrets.nonEmpty && referenceSecrets.nonEmpty) {
+      throw new SparkException("Cannot specify VALUE type secrets and REFERENCE types ones")
+    }
+
+    if (referenceSecrets.nonEmpty) referenceSecrets else valueSecrets
+  }
+
+  private def illegalSecretInput(dest: Seq[String], secrets: Seq[Secret]): Boolean = {
+    if (dest.nonEmpty) {
+      // make sure there is a one-to-one correspondence between destinations and secrets
+      if (dest.length != secrets.length) {
+        return true
+      }
+    }
+    false
+  }
+
+  def getSecretVolume(conf: SparkConf, secretConfig: MesosSecretConfig): List[Volume] = {
+    val secrets = getSecrets(conf, secretConfig)
+    val secretPaths: Seq[String] =
+      conf.get(secretConfig.SECRET_FILENAMES).getOrElse(Nil)
+
+    if (illegalSecretInput(secretPaths, secrets)) {
+      throw new SparkException(
+        s"Need to give equal numbers of secrets and file paths for file-based " +
+          s"reference secrets got secrets $secrets, and paths $secretPaths")
+    }
+
+    secrets.zip(secretPaths).map {
+      case (s, p) =>
--- End diff --

ok


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-25 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r146712170
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---
@@ -394,9 +393,10 @@ private[spark] class MesosClusterScheduler(
     }
 
     // add secret environment variables
-    getSecretEnvVar(desc).foreach { variable =>
+    MesosSchedulerBackendUtil.getSecretEnvVar(desc.conf, config.driverSecretConfig)
+      .foreach { variable =>
       if (variable.getSecret.getReference.isInitialized) {
--- End diff --

Fixed.


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-25 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r146712534
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -122,7 +126,8 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
     .toList
   }
 
-  def containerInfo(conf: SparkConf): ContainerInfo.Builder = {
+  def buildContainerInfo(conf: SparkConf):
+    ContainerInfo.Builder = {
--- End diff --

ok


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-25 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r146712337
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---
@@ -440,6 +420,23 @@ private[spark] class MesosClusterScheduler(
       CommandInfo.URI.newBuilder().setValue(uri.trim()).setCache(useFetchCache).build())
   }
 
+  private def getContainerInfo(desc: MesosDriverDescription): ContainerInfo.Builder = {
+    val containerInfo = MesosSchedulerBackendUtil.buildContainerInfo(desc.conf)
+
+    MesosSchedulerBackendUtil.getSecretVolume(desc.conf, config.driverSecretConfig)
+      .foreach { volume =>
+      if (volume.getSource.getSecret.getReference.isInitialized) {
--- End diff --

Fixed.


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-25 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r146702357
  
--- Diff: docs/running-on-mesos.md ---
@@ -501,23 +503,78 @@ See the [configuration page](configuration.html) for information on Spark config
     spark.mesos.driver.secret.names or spark.mesos.driver.secret.values will be
     written to the provided file. Paths are relative to the container's work
     directory.  Absolute paths must already exist.  Consult the Mesos Secret
-    protobuf for more information.
+    protobuf for more information. Note: File-based secrets require a custom
+    <a href="http://mesos.apache.org/documentation/latest/secrets/">SecretResolver</a>
+    module. Example:
+
+    filename1,filename2
   
 
 
   spark.mesos.driver.secret.names
   (none)
   
     A comma-separated list of secret references. Consult the Mesos Secret
-    protobuf for more information.
+    protobuf for more information. Example:
+
+    secretname1,secretname2
   
 
 
   spark.mesos.driver.secret.values
   (none)
   
     A comma-separated list of secret values. Consult the Mesos Secret
-    protobuf for more information.
+    protobuf for more information. Example:
+
+    secretvalue1,secretvalue2
+  
+
+
+
+  spark.mesos.executor.secret.envkeys
+  (none)
+  
+    A comma-separated list that, if set, the contents of the secret referenced
+    by spark.mesos.executor.secret.names or spark.mesos.executor.secret.values will be
+    set to the provided environment variable in the executor's process. Example:
+
+    ENVKEY1,ENVKEY2
+  
+  
+  
+    spark.mesos.executor.secret.filenames
--- End diff --

Fixed.


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-25 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r146702191
  
--- Diff: docs/running-on-mesos.md ---
@@ -501,23 +503,78 @@ See the [configuration page](configuration.html) for information on Spark config
     spark.mesos.driver.secret.names or spark.mesos.driver.secret.values will be
     written to the provided file. Paths are relative to the container's work
     directory.  Absolute paths must already exist.  Consult the Mesos Secret
-    protobuf for more information.
+    protobuf for more information. Note: File-based secrets require a custom
+    <a href="http://mesos.apache.org/documentation/latest/secrets/">SecretResolver</a>
+    module. Example:
+
+    filename1,filename2
   
 
 
   spark.mesos.driver.secret.names
   (none)
   
     A comma-separated list of secret references. Consult the Mesos Secret
-    protobuf for more information.
+    protobuf for more information. Example:
+
+    secretname1,secretname2
   
 
 
   spark.mesos.driver.secret.values
   (none)
   
     A comma-separated list of secret values. Consult the Mesos Secret
-    protobuf for more information.
+    protobuf for more information. Example:
+
+    secretvalue1,secretvalue2
+  
+
+
+
+  spark.mesos.executor.secret.envkeys
+  (none)
+  
+    A comma-separated list that, if set, the contents of the secret referenced
+    by spark.mesos.executor.secret.names or spark.mesos.executor.secret.values will be
+    set to the provided environment variable in the executor's process. Example:
+
+    ENVKEY1,ENVKEY2
+  
+  
--- End diff --

Fixed.


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-26 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r147159788
  
--- Diff: docs/running-on-mesos.md ---
@@ -485,39 +485,87 @@ See the [configuration page](configuration.html) for information on Spark config
 
 
 
-  spark.mesos.driver.secret.envkeys
-  (none)
   
-    A comma-separated list that, if set, the contents of the secret referenced
-    by spark.mesos.driver.secret.names or spark.mesos.driver.secret.values will be
-    set to the provided environment variable in the driver's process.
+    spark.mesos.driver.secret.values,
--- End diff --

It looks okay in a browser. This column takes up about 25% of the page width. Here's a screenshot:
![documentation screenshot](https://user-images.githubusercontent.com/22622418/32058932-76e5e456-ba1f-11e7-999e-4302d49e636c.png)



---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-26 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r147161356
  
--- Diff: docs/running-on-mesos.md ---
@@ -485,39 +485,87 @@ See the [configuration page](configuration.html) for information on Spark config
 
 
 
-  spark.mesos.driver.secret.envkeys
-  (none)
   
-    A comma-separated list that, if set, the contents of the secret referenced
-    by spark.mesos.driver.secret.names or spark.mesos.driver.secret.values will be
-    set to the provided environment variable in the driver's process.
+    spark.mesos.driver.secret.values,
+    spark.mesos.driver.secret.names,
+    spark.mesos.executor.secret.values,
+    spark.mesos.executor.secret.names,
   
-  
-  
-    spark.mesos.driver.secret.filenames
   (none)
   
-    A comma-separated list that, if set, the contents of the secret referenced by
-    spark.mesos.driver.secret.names or spark.mesos.driver.secret.values will be
-    written to the provided file. Paths are relative to the container's work
-    directory.  Absolute paths must already exist.  Consult the Mesos Secret
-    protobuf for more information.
+    A secret is specified by its contents and destination. These properties
+    specify a secret's contents. To specify a secret's destination, see the cell below.
+
+    You can specify a secret's contents either (1) by value or (2) by reference.
--- End diff --

Good catch. I added `` paragraph tags to separate paragraphs in this 
table cell. (Jekyll doesn't need them in normal text body, only inside a table 
cell.)


---




[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-26 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r147165077
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -173,6 +177,88 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
     containerInfo
   }
 
+  private def getSecrets(conf: SparkConf, secretConfig: MesosSecretConfig): Seq[Secret] = {
+    def createValueSecret(data: String): Secret = {
+      Secret.newBuilder()
+        .setType(Secret.Type.VALUE)
+        .setValue(Secret.Value.newBuilder().setData(ByteString.copyFrom(data.getBytes)))
+        .build()
+    }
+
+    def createReferenceSecret(name: String): Secret = {
+      Secret.newBuilder()
+        .setReference(Secret.Reference.newBuilder().setName(name))
+        .setType(Secret.Type.REFERENCE)
+        .build()
+    }
+
+    val referenceSecrets: Seq[Secret] =
+      conf.get(secretConfig.SECRET_NAMES).getOrElse(Nil).map(s => createReferenceSecret(s))
--- End diff --

Fixed.





[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-26 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r147162791
  
--- Diff: docs/running-on-mesos.md ---
@@ -485,39 +485,87 @@ See the [configuration page](configuration.html) for 
information on Spark config
 
 
 
-  spark.mesos.driver.secret.envkeys
-  (none)
   
-A comma-separated list that, if set, the contents of the secret 
referenced
-by spark.mesos.driver.secret.names or spark.mesos.driver.secret.values 
will be
-set to the provided environment variable in the driver's process.
+spark.mesos.driver.secret.values,
+spark.mesos.driver.secret.names,
+spark.mesos.executor.secret.values,
+spark.mesos.executor.secret.names,
   
-  
-  
-spark.mesos.driver.secret.filenames
   (none)
   
-A comma-separated list that, if set, the contents of the secret 
referenced by
-spark.mesos.driver.secret.names or spark.mesos.driver.secret.values 
will be
-written to the provided file. Paths are relative to the container's 
work
-directory.  Absolute paths must already exist.  Consult the Mesos 
Secret
-protobuf for more information.
+A secret is specified by its contents and destination. These properties
+specify a secret's contents. To specify a secret's destination, see 
the cell below.
+
+You can specify a secret's contents either (1) by value or (2) by 
reference.
+(1) To specify a secret by value, set the
+spark.mesos.[driver|executor].secret.values
+property, to make the secret available in the driver or executors.
+For example, to make a secret password "guessme" available to the 
driver process, set:
+
+spark.mesos.driver.secret.values=guessme
+
+(2) To specify a secret that has been placed in a secret store
--- End diff --

I added a note about this. The secret store has to be provided by the user. 
You can configure Mesos to integrate with a secret store via a custom Mesos 
module.
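As a concrete illustration of the two styles described in the docs above (the 
reference path below is hypothetical and depends on how the user's secret 
store names its secrets):

    # contents passed by value:
    spark.mesos.driver.secret.values=guessme
    # or, a name resolved against the cluster's secret store:
    spark.mesos.driver.secret.names=/mysecret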





[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-26 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19437#discussion_r147165029
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala
 ---
@@ -173,6 +177,88 @@ private[mesos] object MesosSchedulerBackendUtil 
extends Logging {
 containerInfo
   }
 
+  private def getSecrets(conf: SparkConf, secretConfig: 
MesosSecretConfig): Seq[Secret] = {
+def createValueSecret(data: String): Secret = {
+  Secret.newBuilder()
+.setType(Secret.Type.VALUE)
+
.setValue(Secret.Value.newBuilder().setData(ByteString.copyFrom(data.getBytes)))
+.build()
+}
+
+def createReferenceSecret(name: String): Secret = {
+  Secret.newBuilder()
+.setReference(Secret.Reference.newBuilder().setName(name))
+.setType(Secret.Type.REFERENCE)
+.build()
+}
+
+val referenceSecrets: Seq[Secret] =
+  conf.get(secretConfig.SECRET_NAMES).getOrElse(Nil).map(s => 
createReferenceSecret(s))
+
+val valueSecrets: Seq[Secret] = {
+  conf.get(secretConfig.SECRET_VALUES).getOrElse(Nil).map(s => 
createValueSecret(s))
+}
+
+if (valueSecrets.nonEmpty && referenceSecrets.nonEmpty) {
+  throw new SparkException("Cannot specify VALUE type secrets and 
REFERENCE types ones")
--- End diff --

Fixed.





[GitHub] spark pull request #19390: [SPARK-18935][MESOS] Fix dynamic reservations on ...

2017-10-26 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19390#discussion_r147261741
  
--- Diff: dev/deps/spark-deps-hadoop-2.6 ---
@@ -138,7 +138,7 @@ lz4-java-1.4.0.jar
 machinist_2.11-0.6.1.jar
 macro-compat_2.11-1.1.1.jar
 mail-1.4.7.jar
-mesos-1.3.0-shaded-protobuf.jar
+mesos-1.4.0-shaded-protobuf.jar
--- End diff --

@skonto Was there a specific reason for upgrading to 1.4?





[GitHub] spark issue #19510: [SPARK-22292][Mesos] Added spark.mem.max support for Mes...

2017-10-26 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/19510
  
@windkit Trying to understand the need for this config ... could you 
accomplish the same thing by setting spark.cores.max, spark.executor.cores, and 
spark.executor.memory? Could you give an example scenario where this is needed?





[GitHub] spark pull request #19510: [SPARK-22292][Mesos] Added spark.mem.max support ...

2017-10-26 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19510#discussion_r147277408
  
--- Diff: docs/running-on-mesos.md ---
@@ -613,6 +621,39 @@ See the [configuration page](configuration.html) for 
information on Spark config
 driver disconnects, the master immediately tears down the framework.
   
 
+
+  spark.mesos.rejectOfferDuration
+  120s
+  
+Time to consider unused resources refused, serves as a fallback of
+`spark.mesos.rejectOfferDurationForUnmetConstraints`,
+`spark.mesos.rejectOfferDurationForReachedMaxCores`,
+`spark.mesos.rejectOfferDurationForReachedMaxMem`
+  
+
+
+  spark.mesos.rejectOfferDurationForUnmetConstraints
+  park.mesos.rejectOfferDuration
+  
+Time to consider unused resources refused with unmet constraints
+  
+
+
+  spark.mesos.rejectOfferDurationForReachedMaxCores
+  park.mesos.rejectOfferDuration
+  
+Time to consider unused resources refused when maximum number of cores
+spark.cores.max is reached
--- End diff --

My suggestion: "Duration for which unused resources are considered 
declined, when maximum number of cores spark.cores.max has been reached."
@ArtRand Is this the documentation you had in mind in 
https://issues.apache.org/jira/browse/SPARK-22133 ? Is this enough information 
for a non-Mesos expert to set this?
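To illustrate the fallback semantics with hypothetical values: if only 
spark.mesos.rejectOfferDuration=240s is set, offers are declined for 240s in 
all three cases; additionally setting 
spark.mesos.rejectOfferDurationForReachedMaxCores=60s would then override the 
fallback for the max-cores case only.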





[GitHub] spark pull request #19510: [SPARK-22292][Mesos] Added spark.mem.max support ...

2017-10-26 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19510#discussion_r147265373
  
--- Diff: docs/running-on-mesos.md ---
@@ -613,6 +621,39 @@ See the [configuration page](configuration.html) for 
information on Spark config
 driver disconnects, the master immediately tears down the framework.
   
 
+
+  spark.mesos.rejectOfferDuration
+  120s
+  
+Time to consider unused resources refused, serves as a fallback of
+`spark.mesos.rejectOfferDurationForUnmetConstraints`,
+`spark.mesos.rejectOfferDurationForReachedMaxCores`,
+`spark.mesos.rejectOfferDurationForReachedMaxMem`
+  
+
+
+  spark.mesos.rejectOfferDurationForUnmetConstraints
+  park.mesos.rejectOfferDuration
+  
+Time to consider unused resources refused with unmet constraints
+  
+
+
+  spark.mesos.rejectOfferDurationForReachedMaxCores
+  park.mesos.rejectOfferDuration
+  
+Time to consider unused resources refused when maximum number of cores
+spark.cores.max is reached
+  
+
+
+  spark.mesos.rejectOfferDurationForReachedMaxMem
+  park.mesos.rejectOfferDuration
--- End diff --

Typo: 'park'





[GitHub] spark pull request #19510: [SPARK-22292][Mesos] Added spark.mem.max support ...

2017-10-26 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19510#discussion_r147265329
  
--- Diff: docs/running-on-mesos.md ---
@@ -613,6 +621,39 @@ See the [configuration page](configuration.html) for 
information on Spark config
 driver disconnects, the master immediately tears down the framework.
   
 
+
+  spark.mesos.rejectOfferDuration
+  120s
+  
+Time to consider unused resources refused, serves as a fallback of
+`spark.mesos.rejectOfferDurationForUnmetConstraints`,
+`spark.mesos.rejectOfferDurationForReachedMaxCores`,
+`spark.mesos.rejectOfferDurationForReachedMaxMem`
+  
+
+
+  spark.mesos.rejectOfferDurationForUnmetConstraints
+  park.mesos.rejectOfferDuration
+  
+Time to consider unused resources refused with unmet constraints
+  
+
+
+  spark.mesos.rejectOfferDurationForReachedMaxCores
+  park.mesos.rejectOfferDuration
--- End diff --

Typo: 'park'





[GitHub] spark pull request #19510: [SPARK-22292][Mesos] Added spark.mem.max support ...

2017-10-26 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19510#discussion_r147265261
  
--- Diff: docs/running-on-mesos.md ---
@@ -613,6 +621,39 @@ See the [configuration page](configuration.html) for 
information on Spark config
 driver disconnects, the master immediately tears down the framework.
   
 
+
+  spark.mesos.rejectOfferDuration
+  120s
+  
+Time to consider unused resources refused, serves as a fallback of
+`spark.mesos.rejectOfferDurationForUnmetConstraints`,
+`spark.mesos.rejectOfferDurationForReachedMaxCores`,
+`spark.mesos.rejectOfferDurationForReachedMaxMem`
+  
+
+
+  spark.mesos.rejectOfferDurationForUnmetConstraints
+  park.mesos.rejectOfferDuration
--- End diff --

Typo: 'park'





[GitHub] spark pull request #19510: [SPARK-22292][Mesos] Added spark.mem.max support ...

2017-10-26 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19510#discussion_r147274681
  
--- Diff: docs/running-on-mesos.md ---
@@ -344,6 +345,13 @@ See the [configuration page](configuration.html) for 
information on Spark config
   
 
 
+  spark.mem.max
+  (none)
+  
+Maximum amount of memory Spark accepts from Mesos launching executor.
--- End diff --

Maybe add "across the cluster (not from each machine)". And mention that 
there is no maximum if this property is not set.





[GitHub] spark pull request #19390: [SPARK-18935][MESOS] Fix dynamic reservations on ...

2017-10-27 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19390#discussion_r147429031
  
--- Diff: dev/deps/spark-deps-hadoop-2.6 ---
@@ -138,7 +138,7 @@ lz4-java-1.4.0.jar
 machinist_2.11-0.6.1.jar
 macro-compat_2.11-1.1.1.jar
 mail-1.4.7.jar
-mesos-1.3.0-shaded-protobuf.jar
+mesos-1.4.0-shaded-protobuf.jar
--- End diff --

OK, makes sense.





[GitHub] spark pull request #19543: [SPARK-19606][MESOS] Support constraints in spark...

2017-11-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19543#discussion_r150101873
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala
 ---
@@ -556,7 +556,8 @@ private[spark] class MesosClusterScheduler(
 
   private class ResourceOffer(
   val offer: Offer,
-  var remainingResources: JList[Resource]) {
+  var remainingResources: JList[Resource],
+  var attributes: JList[Attribute]) {
 override def toString(): String = {
   s"Offer id: ${offer.getId}, resources: ${remainingResources}"
--- End diff --

Include the `attributes` in this string?
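A minimal sketch of the suggested change (assuming the attribute list's 
default toString is acceptable in log output):

    override def toString(): String = {
      s"Offer id: ${offer.getId}, resources: ${remainingResources}, attributes: ${attributes}"
    }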





[GitHub] spark pull request #19543: [SPARK-19606][MESOS] Support constraints in spark...

2017-11-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19543#discussion_r150097549
  
--- Diff: docs/running-on-mesos.md ---
@@ -458,6 +461,14 @@ See the [configuration page](configuration.html) for 
information on Spark config
   
 
 
+  spark.mesos.driver.constraints
+  (none)
+  
+Same as spark.mesos.constraints except applied to drivers 
when launched through the dispatcher. By default,
+all resource offers will be accepted.
--- End diff --

Maybe clarify in the description for `spark.mesos.constraints` in the above 
cell that it only applies to executors.





[GitHub] spark pull request #19543: [SPARK-19606][MESOS] Support constraints in spark...

2017-11-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19543#discussion_r150100634
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala
 ---
@@ -601,10 +602,14 @@ private[spark] class MesosClusterScheduler(
 for (submission <- candidates) {
   val driverCpu = submission.cores
   val driverMem = submission.mem
-  logTrace(s"Finding offer to launch driver with cpu: $driverCpu, mem: 
$driverMem")
+  val driverConstraints =
+
parseConstraintString(submission.conf.get("spark.mesos.driver.constraints", ""))
--- End diff --

New config properties would ideally be defined in the `config` object: 
https://github.com/apache/spark/blob/master/resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/config.scala#L24
 and referenced by the variable name.
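A sketch of what that could look like in config.scala, following the existing 
ConfigBuilder pattern there (the doc string is illustrative):

    private[spark] val DRIVER_CONSTRAINTS =
      ConfigBuilder("spark.mesos.driver.constraints")
        .doc("Attribute-based constraints on mesos resource offers for drivers, " +
          "in the same format as spark.mesos.constraints.")
        .stringConf
        .createWithDefault("")

with the call site above then becoming 
parseConstraintString(submission.conf.get(config.DRIVER_CONSTRAINTS)).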





[GitHub] spark issue #18910: [SPARK-21694][MESOS] Support Mesos CNI network labels

2017-08-23 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/18910
  
@srowen Thanks for reviewing. I've removed those spaces; please take a look.





[GitHub] spark pull request #19272: [Spark-21842][Mesos] Support Kerberos ticket rene...

2017-09-19 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19272#discussion_r139779444
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -159,6 +159,13 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, val rpcEnv: Rp
 scheduler.getExecutorsAliveOnHost(host).foreach { exec =>
   killExecutors(exec.toSeq, replace = true, force = true)
 }
+
+  case UpdateDelegationTokens(tokens) =>
+logDebug("Asking each executor to update HDFS delegation tokens")
+for ((x, executorData) <- executorDataMap) {
--- End diff --

`(_, executorData)` would be more Scala-like.
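That is, roughly (the loop body is cut off in the diff above, so it is elided 
here):

    for ((_, executorData) <- executorDataMap) {
      // ... forward UpdateDelegationTokens(tokens) to the executor's endpoint ...
    }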





[GitHub] spark pull request #19272: [Spark-21842][Mesos] Support Kerberos ticket rene...

2017-09-19 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19272#discussion_r139772286
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCredentialRenewer.scala
 ---
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster.mesos
+
+import java.security.PrivilegedExceptionAction
+import java.util.concurrent.{Executors, TimeUnit}
+
+import scala.collection.JavaConverters._
+import scala.util.Try
+
+import org.apache.hadoop.security.UserGroupInformation
+
+import org.apache.spark.SparkConf
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.security.HadoopDelegationTokenManager
+import org.apache.spark.internal.Logging
+import org.apache.spark.rpc.RpcEndpointRef
+import 
org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.UpdateDelegationTokens
+import org.apache.spark.util.ThreadUtils
+
+
+class MesosCredentialRenewer(
+conf: SparkConf,
+tokenManager: HadoopDelegationTokenManager,
+nextRenewal: Long,
+de: RpcEndpointRef) extends Logging {
+  private val credentialRenewerThread =
+Executors.newSingleThreadScheduledExecutor(
+  ThreadUtils.namedThreadFactory("Credential Refresh Thread"))
+
+  @volatile private var timeOfNextRenewal = nextRenewal
+
+  private val principal = conf.get("spark.yarn.principal")
+
+  private val (secretFile, mode) = getSecretFile(conf)
+
+  private def getSecretFile(conf: SparkConf): (String, String) = {
+val keytab64 = conf.get("spark.yarn.keytab", null)
+val tgt64 = System.getenv("KRB5CCNAME")
+require(keytab64 != null || tgt64 != null, "keytab or tgt required")
+require(keytab64 == null || tgt64 == null, "keytab and tgt cannot be 
used at the same time")
+val mode = if (keytab64 != null) "keytab" else "tgt"
+val secretFile = if (keytab64 != null) keytab64 else tgt64
+logInfo(s"Logging in as $principal with mode $mode to retrieve HDFS 
delegation tokens")
+logDebug(s"secretFile is $secretFile")
+(secretFile, mode)
+  }
+
+  def scheduleTokenRenewal(): Unit = {
+def scheduleRenewal(runnable: Runnable): Unit = {
+  val remainingTime = timeOfNextRenewal - System.currentTimeMillis()
+  if (remainingTime <= 0) {
+logInfo("Credentials have expired, creating new ones now.")
+runnable.run()
+  } else {
+logInfo(s"Scheduling login from keytab in $remainingTime millis.")
+credentialRenewerThread.schedule(runnable, remainingTime, 
TimeUnit.MILLISECONDS)
+  }
+}
+
+val credentialRenewerRunnable =
+  new Runnable {
+override def run(): Unit = {
+  try {
+val creds = getRenewedDelegationTokens(conf)
+broadcastDelegationTokens(creds)
+  } catch {
+case e: Exception =>
+  // Log the error and try to write new tokens back in an hour
--- End diff --

Comment says "an hour" but code has 20 seconds.





[GitHub] spark pull request #19272: [Spark-21842][Mesos] Support Kerberos ticket rene...

2017-09-19 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19272#discussion_r139726573
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCredentialRenewer.scala
 ---
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster.mesos
+
+import java.security.PrivilegedExceptionAction
+import java.util.concurrent.{Executors, TimeUnit}
+
+import scala.collection.JavaConverters._
+import scala.util.Try
+
+import org.apache.hadoop.security.UserGroupInformation
+
+import org.apache.spark.SparkConf
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.security.HadoopDelegationTokenManager
+import org.apache.spark.internal.Logging
+import org.apache.spark.rpc.RpcEndpointRef
+import 
org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.UpdateDelegationTokens
+import org.apache.spark.util.ThreadUtils
+
+
+class MesosCredentialRenewer(
+conf: SparkConf,
+tokenManager: HadoopDelegationTokenManager,
+nextRenewal: Long,
+de: RpcEndpointRef) extends Logging {
+  private val credentialRenewerThread =
+Executors.newSingleThreadScheduledExecutor(
+  ThreadUtils.namedThreadFactory("Credential Refresh Thread"))
+
+  @volatile private var timeOfNextRenewal = nextRenewal
+
+  private val principal = conf.get("spark.yarn.principal")
+
+  private val (secretFile, mode) = getSecretFile(conf)
+
+  private def getSecretFile(conf: SparkConf): (String, String) = {
+val keytab64 = conf.get("spark.yarn.keytab", null)
+val tgt64 = System.getenv("KRB5CCNAME")
+require(keytab64 != null || tgt64 != null, "keytab or tgt required")
+require(keytab64 == null || tgt64 == null, "keytab and tgt cannot be 
used at the same time")
+val mode = if (keytab64 != null) "keytab" else "tgt"
+val secretFile = if (keytab64 != null) keytab64 else tgt64
+logInfo(s"Logging in as $principal with mode $mode to retrieve HDFS 
delegation tokens")
+logDebug(s"secretFile is $secretFile")
+(secretFile, mode)
+  }
+
+  def scheduleTokenRenewal(): Unit = {
+def scheduleRenewal(runnable: Runnable): Unit = {
+  val remainingTime = timeOfNextRenewal - System.currentTimeMillis()
+  if (remainingTime <= 0) {
+logInfo("Credentials have expired, creating new ones now.")
+runnable.run()
+  } else {
+logInfo(s"Scheduling login from keytab in $remainingTime millis.")
+credentialRenewerThread.schedule(runnable, remainingTime, 
TimeUnit.MILLISECONDS)
+  }
+}
+
+val credentialRenewerRunnable =
+  new Runnable {
+override def run(): Unit = {
+  try {
+val creds = getRenewedDelegationTokens(conf)
+broadcastDelegationTokens(creds)
+  } catch {
+case e: Exception =>
+  // Log the error and try to write new tokens back in an hour
+  logWarning("Couldn't broadcast tokens, trying agin in 20 
seconds", e)
--- End diff --

(sp) "ag**a**in"





[GitHub] spark pull request #19272: [Spark-21842][Mesos] Support Kerberos ticket rene...

2017-09-20 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19272#discussion_r140117055
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala
 ---
@@ -198,16 +198,19 @@ private[spark] class 
MesosCoarseGrainedSchedulerBackend(
   sc.conf.getOption("spark.mesos.driver.frameworkId").map(_ + suffix)
 )
 
-if (principal != null) {
+// check that the credentials are defined, even though it's likely 
that auth would have failed
+// already if you've made it this far
+if (principal != null && hadoopDelegationCreds.isDefined) {
   logDebug(s"Principal found ($principal) starting token renewer")
   val credentialRenewerThread = new Thread {
 setName("MesosCredentialRenewer")
 override def run(): Unit = {
+  val dummy: Option[Array[Byte]] = None
--- End diff --

What is this for?





[GitHub] spark pull request #19272: [Spark-21842][Mesos] Support Kerberos ticket rene...

2017-09-20 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19272#discussion_r140117253
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCredentialRenewer.scala
 ---
@@ -63,7 +63,8 @@ class MesosCredentialRenewer(
 
   def scheduleTokenRenewal(): Unit = {
 def scheduleRenewal(runnable: Runnable): Unit = {
-  val remainingTime = timeOfNextRenewal - System.currentTimeMillis()
+  // val remainingTime = timeOfNextRenewal - System.currentTimeMillis()
+  val remainingTime = 5000
--- End diff --

Why 5000?





[GitHub] spark pull request #20167: [SPARK-16501] [MESOS] Allow providing Mesos princ...

2018-02-05 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/20167#discussion_r166086691
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala
 ---
@@ -71,40 +74,64 @@ trait MesosSchedulerUtils extends Logging {
   failoverTimeout: Option[Double] = None,
   frameworkId: Option[String] = None): SchedulerDriver = {
 val fwInfoBuilder = 
FrameworkInfo.newBuilder().setUser(sparkUser).setName(appName)
-val credBuilder = Credential.newBuilder()
+
fwInfoBuilder.setHostname(Option(conf.getenv("SPARK_PUBLIC_DNS")).getOrElse(
+  conf.get(DRIVER_HOST_ADDRESS)))
 webuiUrl.foreach { url => fwInfoBuilder.setWebuiUrl(url) }
 checkpoint.foreach { checkpoint => 
fwInfoBuilder.setCheckpoint(checkpoint) }
 failoverTimeout.foreach { timeout => 
fwInfoBuilder.setFailoverTimeout(timeout) }
 frameworkId.foreach { id =>
   fwInfoBuilder.setId(FrameworkID.newBuilder().setValue(id).build())
 }
-
fwInfoBuilder.setHostname(Option(conf.getenv("SPARK_PUBLIC_DNS")).getOrElse(
-  conf.get(DRIVER_HOST_ADDRESS)))
-conf.getOption("spark.mesos.principal").foreach { principal =>
-  fwInfoBuilder.setPrincipal(principal)
-  credBuilder.setPrincipal(principal)
-}
-conf.getOption("spark.mesos.secret").foreach { secret =>
-  credBuilder.setSecret(secret)
-}
-if (credBuilder.hasSecret && !fwInfoBuilder.hasPrincipal) {
-  throw new SparkException(
-"spark.mesos.principal must be configured when spark.mesos.secret 
is set")
-}
+
 conf.getOption("spark.mesos.role").foreach { role =>
   fwInfoBuilder.setRole(role)
 }
 val maxGpus = conf.getInt("spark.mesos.gpus.max", 0)
 if (maxGpus > 0) {
   
fwInfoBuilder.addCapabilities(Capability.newBuilder().setType(Capability.Type.GPU_RESOURCES))
 }
+val credBuilder = buildCredentials(conf, fwInfoBuilder)
 if (credBuilder.hasPrincipal) {
   new MesosSchedulerDriver(
 scheduler, fwInfoBuilder.build(), masterUrl, credBuilder.build())
 } else {
   new MesosSchedulerDriver(scheduler, fwInfoBuilder.build(), masterUrl)
 }
   }
+  
+  def buildCredentials(
+  conf: SparkConf, 
+  fwInfoBuilder: Protos.FrameworkInfo.Builder): 
Protos.Credential.Builder = {
+val credBuilder = Credential.newBuilder()
+conf.getOption("spark.mesos.principal")
+  .orElse(Option(conf.getenv("SPARK_MESOS_PRINCIPAL")))
--- End diff --

I think it's fine to set that env var, SPARK_MESOS_PRINCIPAL.
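For example (hypothetical principal name), the value could then be supplied 
at launch time through the environment rather than a config property:

    SPARK_MESOS_PRINCIPAL=spark-framework ./bin/spark-submit ...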





[GitHub] spark pull request #20640: [SPARK-19755][Mesos] Blacklist is always active f...

2018-02-21 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/20640#discussion_r169676540
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala
 ---
@@ -648,15 +645,6 @@ private[spark] class 
MesosCoarseGrainedSchedulerBackend(
   totalGpusAcquired -= gpus
   gpusByTaskId -= taskId
 }
-// If it was a failure, mark the slave as failed for blacklisting 
purposes
-if (TaskState.isFailed(state)) {
-  slave.taskFailures += 1
-
-  if (slave.taskFailures >= MAX_SLAVE_FAILURES) {
-logInfo(s"Blacklisting Mesos slave $slaveId due to too many 
failures; " +
--- End diff --

@IgorBerman Yes, in the default case, it would be nice to have this 
information about a task failing, especially if it fails repeatedly.





[GitHub] spark issue #20640: [SPARK-19755][Mesos] Blacklist is always active for Meso...

2018-02-22 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/20640
  
@skonto We should not remove the logging. The logging 
[here](https://github.com/apache/spark/blob/f41c0a93fd3913ad93e55ddbfd875229872ecc97/core/src/main/scala/org/apache/spark/scheduler/BlacklistTracker.scala#L194)
 is only available if blacklisting is enabled, but by default blacklisting is 
disabled. The BlacklistTracker object [is not 
created](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala#L748)
 if blacklisting is disabled. But, we might still want to see the log of 
executor failure in this case.
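(For reference, blacklisting is gated by the spark.blacklist.enabled 
property, which is false by default.)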

In the case where an executor fails before entering Spark code (for 
example, if the Mesos agent fails to create the sandbox), would it be detected?





[GitHub] spark issue #20641: [SPARK-23464][MESOS] Fix mesos cluster scheduler options...

2018-02-24 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/20641
  
Thanks for the PR! It seems that the previous attempt to fix this 
(SPARK-18114) was wrong -- I'm not sure why we didn't catch the problem before, 
maybe lack of testing? @krcz My suggestion for this patch is to add a test, in 
order to prevent another regression in the future. I've written a unit test for 
this -- you could do something similar:  
https://github.com/mesosphere/spark/commit/4812ba3d10264f6d22ec654fa16b5810d70c27a9
 I will also do more testing with my own integration tests. cc @skonto 





[GitHub] spark pull request #20641: [SPARK-23464][MESOS] Fix mesos cluster scheduler ...

2018-03-05 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/20641#discussion_r17399
  
--- Diff: 
resource-managers/mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterSchedulerSuite.scala
 ---
@@ -199,6 +199,38 @@ class MesosClusterSchedulerSuite extends SparkFunSuite 
with LocalSparkContext wi
 })
   }
 
+  test("properly wraps and escapes parameters passed to driver command") {
+setScheduler()
+
+val mem = 1000
+val cpu = 1
+
+val response = scheduler.submitDriver(
+  new MesosDriverDescription("d1", "jar", mem, cpu, true,
+command,
+Map("spark.mesos.executor.home" -> "test",
+  "spark.app.name" -> "test",
+  // no special characters, wrap only
+  "spark.driver.extraJavaOptions" ->
+"-XX+PrintGC -Dparam1=val1 -Dparam2=val2",
+  // special characters, to be escaped
+  "spark.executor.extraJavaOptions" ->
+"""-Dparam1="value 1" -Dparam2=value\ 2 -Dpath=$PATH"""),
--- End diff --

I'm not sure the second param (param2) would be parsed correctly if you 
actually ran the command. Doesn't there need to be quotes around the space? 
Have you tested it and checked if the executor gets the correct value for 
param2?





[GitHub] spark issue #18705: [SPARK-21502][Mesos] fix --supervise for mesos in cluste...

2017-07-21 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/18705
  
@skonto LGTM





[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...

2017-08-04 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/18837#discussion_r131429404
  
--- Diff: docs/running-on-mesos.md ---
@@ -479,6 +479,35 @@ See the [configuration page](configuration.html) for 
information on Spark config
 
 
 
+  spark.mesos.driver.secret.envkey
+  (none)
+  
+If set, the contents of the secret referenced by
+spark.mesos.driver.secret.name will be written to the provided
+environment variable in the driver's process.
+  
+  
+  
+spark.mesos.driver.secret.filename
+  (none)
+  
+If set, the contents of the secret referenced by
+spark.mesos.driver.secret.name will be written to the provided
+file.  Relative paths are relative to the container's work
+directory.  Absolute paths must already exist.  Consult the Mesos 
Secret
+protobuf for more information.
+  
+
+
+  spark.mesos.driver.secret.name
--- End diff --

I think this refers to the `message Secret` here 
https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L2335 It 
says a secret can have one of two types: reference or value. Presumably the 
type here is `REFERENCE` and this is the name  of the reference. And, in 
practice this could be the path in a secret store.
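For example (purely illustrative path), a reference-type secret might then be 
configured as spark.mesos.driver.secret.name=/sparkjob/tls-key, with the name 
resolved by whatever secret store the cluster integrates with.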





[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...

2017-08-04 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/18837#discussion_r131431005
  
--- Diff: docs/running-on-mesos.md ---
@@ -479,6 +479,35 @@ See the [configuration page](configuration.html) for 
information on Spark config
 
 
 
+  spark.mesos.driver.secret.envkey
+  (none)
+  
+If set, the contents of the secret referenced by
+spark.mesos.driver.secret.name will be written to the provided
+environment variable in the driver's process.
+  
+  
+  
+spark.mesos.driver.secret.filename
+  (none)
+  
+If set, the contents of the secret referenced by
+spark.mesos.driver.secret.name will be written to the provided
+file.  Relative paths are relative to the container's work
+directory.  Absolute paths must already exist.  Consult the Mesos 
Secret
+protobuf for more information.
+  
+
+
+  spark.mesos.driver.secret.name
--- End diff --

Actually, the `Secret` protobuf message is new in Mesos 1.3. What are the 
implications for compatibility with older (<1.3) Mesos clusters? Also, is there 
anything in Mesos 1.4 that this relies on?





[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...

2017-08-07 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/18837#discussion_r131780454
  
--- Diff: docs/running-on-mesos.md ---
@@ -479,6 +479,35 @@ See the [configuration page](configuration.html) for 
information on Spark config
 
 
 
+  spark.mesos.driver.secret.envkey
+  (none)
+  
+If set, the contents of the secret referenced by
+spark.mesos.driver.secret.name will be written to the provided
+environment variable in the driver's process.
+  
+  
+  
+spark.mesos.driver.secret.filename
+  (none)
+  
+If set, the contents of the secret referenced by
+spark.mesos.driver.secret.name will be written to the provided
+file.  Relative paths are relative to the container's work
+directory.  Absolute paths must already exist.  Consult the Mesos 
Secret
+protobuf for more information.
+  
+
+
+  spark.mesos.driver.secret.name
--- End diff --

@ArtRand What would the `path` mean for TYPE=VALUE? Would it be a path to a 
local file containing the value of the secret?





[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...

2017-08-07 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/18837#discussion_r131792566
  
--- Diff: docs/running-on-mesos.md ---
@@ -479,6 +479,35 @@ See the [configuration page](configuration.html) for 
information on Spark config
 
 
 
+  spark.mesos.driver.secret.envkey
+  (none)
+  
+If set, the contents of the secret referenced by
+spark.mesos.driver.secret.name will be written to the provided
+environment variable in the driver's process.
+  
+  
+  
+spark.mesos.driver.secret.filename
+  (none)
+  
+If set, the contents of the secret referenced by
+spark.mesos.driver.secret.name will be written to the provided
+file.  Relative paths are relative to the container's work
+directory.  Absolute paths must already exist.  Consult the Mesos 
Secret
+protobuf for more information.
+  
+
+
+  spark.mesos.driver.secret.name
--- End diff --

@ArtRand Looking at the protobuf definition, 
https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L2335, I 
see that unlike a `Reference`, a `message Value` does not have a name: it has a 
field called `data`. So, I was wondering how the user would pass in that data.





[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...

2017-08-08 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/18837#discussion_r132010482
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/config.scala
 ---
@@ -58,12 +58,39 @@ package object config {
 
   private [spark] val DRIVER_LABELS =
 ConfigBuilder("spark.mesos.driver.labels")
-  .doc("Mesos labels to add to the driver.  Labels are free-form 
key-value pairs.  Key-value " +
+  .doc("Mesos labels to add to the driver.  Labels are free-form 
key-value pairs.  Key-value" +
 "pairs should be separated by a colon, and commas used to list 
more than one." +
 "Ex. key:value,key2:value2")
   .stringConf
   .createOptional
 
+  private[spark] val SECRET_NAME =
+ConfigBuilder("spark.mesos.driver.secret.name")
+  .doc("A comma-seperated list of secret references. Consult the Mesos 
Secret protobuf for " +
--- End diff --

Sp: "sep**a**rated". Check for same typo below also.





[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...

2017-08-08 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/18837#discussion_r132012345
  
--- Diff: docs/running-on-mesos.md ---
@@ -479,6 +479,35 @@ See the [configuration page](configuration.html) for 
information on Spark config
 
 
 
+  spark.mesos.driver.secret.envkey
+  (none)
+  
+If set, the contents of the secret referenced by
+spark.mesos.driver.secret.name will be written to the provided
+environment variable in the driver's process.
+  
--- End diff --

Maybe update to mention this can be a comma-separated list of env vars. 
Same for filename below.
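For example (hypothetical secret paths), two secrets delivered to two 
environment variables:

    spark.mesos.driver.secret.name=/db/password,/db/username
    spark.mesos.driver.secret.envkey=DB_PASSWORD,DB_USERNAME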





[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...

2017-08-08 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/18837#discussion_r132009351
  
--- Diff: resource-managers/mesos/pom.xml ---
@@ -29,7 +29,7 @@
   Spark Project Mesos
   
 mesos
-1.0.0
+1.3.0
 shaded-protobuf
--- End diff --

https://spark.apache.org/docs/latest/running-on-mesos.html#installing-mesos 
mentions that Spark is designed for use with Mesos 1.0.0+. Is that still true? 
Do these changes affect users who have clusters that are < 1.3.0? Also, how 
much of this change requires 1.4?





[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...

2017-08-08 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/18837#discussion_r132040995
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala
 ---
@@ -529,18 +570,120 @@ private[spark] class MesosClusterScheduler(
 
 val appName = desc.conf.get("spark.app.name")
 
+val driverLabels = 
MesosProtoUtils.mesosLabels(desc.conf.get(config.DRIVER_LABELS)
+  .getOrElse(""))
+
 TaskInfo.newBuilder()
   .setTaskId(taskId)
   .setName(s"Driver for ${appName}")
   .setSlaveId(offer.offer.getSlaveId)
   .setCommand(buildDriverCommand(desc))
+  .setContainer(getContainerInfo(desc))
   .addAllResources(cpuResourcesToUse.asJava)
   .addAllResources(memResourcesToUse.asJava)
-  
.setLabels(MesosProtoUtils.mesosLabels(desc.conf.get(config.DRIVER_LABELS).getOrElse("")))
-  .setContainer(MesosSchedulerBackendUtil.containerInfo(desc.conf))
+  .setLabels(driverLabels)
   .build
   }
 
+  private def getContainerInfo(desc: MesosDriverDescription): 
ContainerInfo.Builder = {
+val containerInfo = MesosSchedulerBackendUtil.containerInfo(desc.conf)
+
+getSecretVolume(desc).foreach { volume =>
+  logInfo(s"Setting secret 
name=${volume.getSource.getSecret.getReference.getName} " +
--- End diff --

Is this assuming the type is `Reference` when it could be either 
`Reference` or `Value`? Same question for the logInfo() for env vars above.
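A sketch of a type-safe version that branches on the secret's type before 
logging (accessor names here mirror the generated protobuf accessors used in 
the diff above):

    val secret = volume.getSource.getSecret
    val desc =
      if (secret.getType == Secret.Type.REFERENCE) s"reference name=${secret.getReference.getName}"
      else "value-type secret (contents not logged)"
    logInfo(s"Setting secret $desc for file-based secret volume")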





[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...

2017-08-08 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/18837#discussion_r132008642
  
--- Diff: docs/running-on-mesos.md ---
@@ -479,6 +479,35 @@ See the [configuration page](configuration.html) for 
information on Spark config
 
 
 
+  spark.mesos.driver.secret.envkey
+  (none)
+  
+If set, the contents of the secret referenced by
+spark.mesos.driver.secret.name will be written to the provided
+environment variable in the driver's process.
+  
+  
+  
+spark.mesos.driver.secret.filename
+  (none)
+  
+If set, the contents of the secret referenced by
+spark.mesos.driver.secret.name will be written to the provided
+file.  Relative paths are relative to the container's work
+directory.  Absolute paths must already exist.  Consult the Mesos 
Secret
+protobuf for more information.
+  
+
+
+  spark.mesos.driver.secret.name
+  (none)
+  
+Set the secret's reference name.  Consult the Mesos Secret
+protobuf for more information.
+  
+
+
--- End diff --

Document `spark.mesos.driver.secret.value` here?





[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...

2017-08-08 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/18837#discussion_r132009978
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/config.scala
 ---
@@ -58,12 +58,39 @@ package object config {
 
   private [spark] val DRIVER_LABELS =
 ConfigBuilder("spark.mesos.driver.labels")
-  .doc("Mesos labels to add to the driver.  Labels are free-form 
key-value pairs.  Key-value " +
+  .doc("Mesos labels to add to the driver.  Labels are free-form 
key-value pairs.  Key-value" +
--- End diff --

Missing a space after "Key-value"





[GitHub] spark pull request #18910: [SPARK-21694][MESOS] Support Mesos CNI network la...

2017-08-10 Thread susanxhuynh
GitHub user susanxhuynh opened a pull request:

https://github.com/apache/spark/pull/18910

[SPARK-21694][MESOS] Support Mesos CNI network labels

JIRA ticket: https://issues.apache.org/jira/browse/SPARK-21694

## What changes were proposed in this pull request?

Spark already supports launching containers attached to a given CNI network 
by specifying it via the config `spark.mesos.network.name`.

This PR adds support to pass in network labels to CNI plugins via a new 
config option `spark.mesos.network.labels`. These network labels are key-value 
pairs that are set in the `NetworkInfo` of both the driver and executor tasks. 
More details in the related Mesos documentation:  
http://mesos.apache.org/documentation/latest/cni/#mesos-meta-data-to-cni-plugins

## How was this patch tested?

Unit tests, for both driver and executor tasks.
Manual integration test to submit a job with the 
`spark.mesos.network.labels` option, hit the mesos/state.json endpoint, and 
check that the labels are set in the driver and executor tasks.

@ArtRand @skonto 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mesosphere/spark sh-mesos-cni-labels

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18910.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18910


commit 255e5195a414f0f9e609cb579a2e8fdfea200be1
Author: Susan X. Huynh 
Date:   2017-08-10T18:19:51Z

[SPARK-21694][MESOS] Allow the user to pass network labels to CNI plugins 
via







[GitHub] spark pull request #18910: [SPARK-21694][MESOS] Support Mesos CNI network la...

2017-08-11 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/18910#discussion_r132766509
  
--- Diff: 
resource-managers/mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackendSuite.scala
 ---
@@ -582,6 +583,10 @@ class MesosCoarseGrainedSchedulerBackendSuite extends 
SparkFunSuite
 val networkInfos = launchedTasks.head.getContainer.getNetworkInfosList
 assert(networkInfos.size == 1)
 assert(networkInfos.get(0).getName == "test-network-name")
+assert(networkInfos.get(0).getLabels.getLabels(0).getKey == "key1")
+assert(networkInfos.get(0).getLabels.getLabels(0).getValue == "val1")
+assert(networkInfos.get(0).getLabels.getLabels(1).getKey == "key2")
+assert(networkInfos.get(0).getLabels.getLabels(1).getValue == "val2")
--- End diff --

@ArtRand I believe this _is_ testing the worker tasks: (L.572) the 
`setBackend()` method creates a MesosCoarseGrainedSchedulerBackend, L.577-579 
creates the requirements for an Executor, (L. 580) backend.resourceOffers() 
looks for an offer that matches the requirements and launches an Executor task. 
There is a separate unit test above, in MesosClusterSchedulerSuite.scala, 
which checks for the labels in the driver itself.





[GitHub] spark issue #18910: [SPARK-21694][MESOS] Support Mesos CNI network labels

2017-08-11 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/18910
  
@skonto @ArtRand Thanks for the feedback. I have fixed the documentation 
and added NETWORK_NAME to the config object. Please let me know what you think.

@skonto I have not tested this particular change on a real CNI network. I 
think the only difference is that a Spark job runs on a different network, but 
there's no change in the Spark functionality.





[GitHub] spark issue #18910: [SPARK-21694][MESOS] Support Mesos CNI network labels

2017-08-21 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/18910
  
@vanzin Would you mind helping review this PR? Thanks.





[GitHub] spark issue #18910: [SPARK-21694][MESOS] Support Mesos CNI network labels

2017-08-23 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/18910
  
@srowen Sean, would you like to review this PR? Thanks.





[GitHub] spark pull request #18674: [SPARK-21456][MESOS] Make the driver failover_tim...

2017-07-18 Thread susanxhuynh
GitHub user susanxhuynh opened a pull request:

https://github.com/apache/spark/pull/18674

[SPARK-21456][MESOS] Make the driver failover_timeout configurable

## What changes were proposed in this pull request?

Current behavior: in Mesos cluster mode, the driver failover_timeout is set 
to zero. If the driver temporarily loses connectivity with the Mesos master, 
the framework will be torn down and all executors killed.

Proposed change: make the failover_timeout configurable via a new option, 
spark.mesos.driver.failoverTimeout. The default value is still zero.

Note: with non-zero failover_timeout, an explicit teardown is needed in 
some cases. This is captured in 
https://issues.apache.org/jira/browse/SPARK-21458
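
As a sketch of the intended usage (the 120-second value is arbitrary, for 
illustration only):

    import org.apache.spark.SparkConf

    // With a non-zero timeout, a master disconnection shorter than two minutes
    // no longer tears down the framework and its executors.
    val conf = new SparkConf()
      .set("spark.mesos.driver.failoverTimeout", "120.0") // seconds; the 0.0 default keeps the old behavior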

## How was this patch tested?

Added a unit test to make sure the config option is set while creating the 
scheduler driver.

Ran an integration test with mesosphere/spark showing that with a non-zero 
failover_timeout the Spark job finishes after a driver is disconnected from the 
master.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mesosphere/spark sh-mesos-failover-timeout

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18674.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18674


commit ae3e5bd0e0ab139f9d871d667c7fad7f2682285d
Author: Susan X. Huynh 
Date:   2017-07-18T18:07:53Z

Made the driver failover_timeout configurable as 
spark.mesos.driver.failoverTimeout. Added a unit test.







[GitHub] spark issue #17750: [SPARK-4899][MESOS] Support for Checkpointing on Coarse ...

2017-07-18 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/17750
  
@gkc2104 @lins05 I have created a separate PR for configuring the driver's 
failover_timeout: https://github.com/apache/spark/pull/18674 I would appreciate 
a review if you are interested.





[GitHub] spark pull request #18674: [SPARK-21456][MESOS] Make the driver failover_tim...

2017-07-19 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/18674#discussion_r128261909
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/config.scala ---
@@ -58,9 +58,16 @@ package object config {
 
   private [spark] val DRIVER_LABELS =
     ConfigBuilder("spark.mesos.driver.labels")
-      .doc("Mesos labels to add to the driver.  Labels are free-form key-value pairs.  Key-value" +
+      .doc("Mesos labels to add to the driver.  Labels are free-form key-value pairs.  Key-value " +
         "pairs should be separated by a colon, and commas used to list more than one." +
         "Ex. key:value,key2:value2")
       .stringConf
       .createOptional
+
+  private [spark] val DRIVER_FAILOVER_TIMEOUT =
+    ConfigBuilder("spark.mesos.driver.failoverTimeout")
+      .doc("Amount of time in seconds that the master will wait to hear " +
+        "from the driver, during a temporary disconnection, before tearing down all the " +
+        "executors.")
+      .doubleConf
+      .createWithDefault(0.0)
--- End diff --

No, zero means 0ns: if the driver is disconnected, the Mesos master will 
wait 0ns for the driver to reconnect, i.e. it will tear down the framework 
immediately. This is the current default value.
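
For context, a sketch (not the exact code in this PR) of where the value ends 
up: Mesos carries failover_timeout as a double, in seconds, on the 
FrameworkInfo the driver registers with.

    import org.apache.mesos.Protos.FrameworkInfo
    import org.apache.spark.SparkConf

    // 0.0 tells the master to tear down the framework as soon as the driver
    // disconnects; anything larger is a reconnection grace period.
    def frameworkInfo(conf: SparkConf): FrameworkInfo =
      FrameworkInfo.newBuilder()
        .setUser("")             // Mesos fills in the current user
        .setName("Spark driver") // illustrative name
        .setFailoverTimeout(conf.getDouble("spark.mesos.driver.failoverTimeout", 0.0))
        .build()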





[GitHub] spark issue #18674: [SPARK-21456][MESOS] Make the driver failover_timeout co...

2017-07-19 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/18674
  
@skonto Not much changes in the dispatcher; the main difference is in the 
executors.

First of all, if the user does not set this new config, it will default to 
the old behavior (failover_timeout=0), and everything remains the same. If the 
driver is temporarily disconnected from the Mesos master, the master tears down 
the framework immediately, killing all the executors.

If the user sets the new failover_timeout > 0, then if the driver becomes 
disconnected from the master, the master will wait failover_timeout seconds. It 
will not kill any executors during this time. If the driver reconnects within 
the timeout period, then the job continues running uninterrupted.





[GitHub] spark issue #18674: [SPARK-21456][MESOS] Make the driver failover_timeout co...

2017-07-19 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/18674
  
In both cases, the dispatcher will get a task status update on the driver 
when it fails or finishes (no change here).





[GitHub] spark issue #18674: [SPARK-21456][MESOS] Make the driver failover_timeout co...

2017-07-19 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/18674
  
@skonto So, my PR addresses just the issue of the driver temporarily losing 
connectivity with the master. My PR does not change the behavior when the 
driver itself fails: if the driver fails, all the executors fail also, and if 
the --supervise flag is set, a completely new framework is launched, with a 
new framework ID. I think the quote you found on mesos.apache.org refers to a 
different type of framework behavior: if the scheduler fails and restarts, 
the same framework can continue running (same executors). I do not think that 
is how --supervise is intended to work, but in any case, I was not trying to 
address that in my PR.





[GitHub] spark issue #18674: [SPARK-21456][MESOS] Make the driver failover_timeout co...

2017-07-19 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/18674
  
@skonto Do you have any other questions? Are there any changes you want me 
to make in this PR?





[GitHub] spark issue #18674: [SPARK-21456][MESOS] Make the driver failover_timeout co...

2017-07-19 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/18674
  
@vanzin Thanks for the review. I have made the changes you recommended 
(documenting the zero default value and using the config key).





[GitHub] spark pull request #19543: [SPARK-19606][MESOS] Support constraints in spark...

2017-11-12 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19543#discussion_r150419093
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/config.scala ---
@@ -122,4 +122,11 @@ package object config {
         "Example: key1:val1,key2:val2")
       .stringConf
       .createOptional
+
+  private[spark] val DRIVER_CONSTRAINTS =
+    ConfigBuilder("spark.mesos.driver.constraints")
+      .doc("Attribute based constraints on mesos resource offers. Applied by the dispatcher " +
+        "when launching drivers. Default is to accept all offers with sufficient resources.")
+      .stringConf
+      .createWithDefault("")
--- End diff --

@felixcheung I think it's okay. There's a utility function, 
`parseConstraintString`, which parses the input string into a Map. If the 
string is empty, it returns an empty Map:

https://github.com/apache/spark/blob/fc45c2c88a838b8f46659ebad2a8f3a9923bc95f/resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala#L321
This is also consistent with how the executor constraint string is parsed.
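
For readers skimming, a simplified sketch of the behavior being relied on 
(not the actual MesosSchedulerUtils code; the semicolon-separated 
attribute:value format follows the executor-side constraint docs):

    // "" -> empty Map (accept all offers); "os:centos7;zone:us-east-1" -> two constraints.
    def parseConstraintString(constraints: String): Map[String, Set[String]] =
      if (constraints.isEmpty) {
        Map.empty
      } else {
        constraints.split(";").map { pair =>
          val parts = pair.split(":")
          // A bare attribute (no value) just requires the attribute to be present.
          parts(0) -> parts.lift(1).map(_.split(",").toSet).getOrElse(Set.empty[String])
        }.toMap
      }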





[GitHub] spark issue #19543: [SPARK-19606][MESOS] Support constraints in spark-dispat...

2017-11-12 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/19543
  
LGTM





[GitHub] spark pull request #19793: [SPARK-22574] [Mesos] [Submit] Check submission r...

2017-11-27 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19793#discussion_r153352732
  
--- Diff: core/src/test/scala/org/apache/spark/deploy/rest/SubmitRestProtocolSuite.scala ---
@@ -86,6 +86,8 @@ class SubmitRestProtocolSuite extends SparkFunSuite {
     message.clientSparkVersion = "1.2.3"
     message.appResource = "honey-walnut-cherry.jar"
     message.mainClass = "org.apache.spark.examples.SparkPie"
+    message.appArgs = Array("hdfs://tmp/auth")
+    message.environmentVariables = Map("SPARK_HOME" -> "/test")
--- End diff --

There should be a check at the end of this test to make sure these fields 
show up in the `newMessage` object. (around L.125)
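
Something along these lines would do it (a sketch in the test's existing 
round-trip style; the exact assertion form is assumed):

    // Compare arrays as Seq so equality is structural rather than by reference.
    assert(newMessage.appArgs.toSeq === message.appArgs.toSeq)
    assert(newMessage.environmentVariables === message.environmentVariables)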





[GitHub] spark pull request #19793: [SPARK-22574] [Mesos] [Submit] Check submission r...

2017-11-27 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19793#discussion_r153352098
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/rest/SubmitRestProtocolRequest.scala ---
@@ -46,6 +46,8 @@ private[rest] class CreateSubmissionRequest extends SubmitRestProtocolRequest {
     super.doValidate()
     assert(sparkProperties != null, "No Spark properties set!")
     assertFieldIsSet(appResource, "appResource")
+    assertFieldIsSet(appArgs, "appArgs")
+    assertFieldIsSet(environmentVariables, "environmentVariables")
--- End diff --

What if there are no args or environment variables for a particular job? Is 
the caller expected to pass in an empty array?
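
That is, would a caller with no args be expected to set something like this 
(a sketch, not from the PR)?

    message.appArgs = Array.empty[String]
    message.environmentVariables = Map.empty[String, String]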





[GitHub] spark pull request #19793: [SPARK-22574] [Mesos] [Submit] Check submission r...

2017-11-28 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19793#discussion_r153530156
  
--- Diff: core/src/test/scala/org/apache/spark/deploy/rest/SubmitRestProtocolSuite.scala ---
@@ -86,6 +86,8 @@ class SubmitRestProtocolSuite extends SparkFunSuite {
     message.clientSparkVersion = "1.2.3"
     message.appResource = "honey-walnut-cherry.jar"
     message.mainClass = "org.apache.spark.examples.SparkPie"
+    message.appArgs = Array("hdfs://tmp/auth")
+    message.environmentVariables = Map("SPARK_HOME" -> "/test")
--- End diff --

@Gschiavon You're right, I didn't see that those variables were already set 
below. But I do have a question now about whether these fields are optional. 
In the original test (L.93), there's a comment about "optional fields", and 
the app args and env vars are tested there (L.105-106). Since this test was 
added with Standalone mode in mind, I wonder if these fields are considered 
optional there? There's no documentation, as you said, so we can't know for 
sure. I just want to make sure the extra validation doesn't break any 
existing applications (in Standalone cluster mode).





[GitHub] spark pull request #19793: [SPARK-22574] [Mesos] [Submit] Check submission r...

2017-11-29 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request:

https://github.com/apache/spark/pull/19793#discussion_r153824692
  
--- Diff: core/src/test/scala/org/apache/spark/deploy/rest/SubmitRestProtocolSuite.scala ---
@@ -86,6 +86,8 @@ class SubmitRestProtocolSuite extends SparkFunSuite {
     message.clientSparkVersion = "1.2.3"
     message.appResource = "honey-walnut-cherry.jar"
     message.mainClass = "org.apache.spark.examples.SparkPie"
+    message.appArgs = Array("hdfs://tmp/auth")
+    message.environmentVariables = Map("SPARK_HOME" -> "/test")
--- End diff --

OK, I was looking at RestSubmissionClient (used by both Mesos and Standalone) 
- the client always sets 'appArgs' and 'environmentVariables': 
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionClient.scala#L427
 So, it's only a bug when the user bypasses the RestSubmissionClient (for 
example, by POSTing to the server directly with 'curl'). That addresses my 
concern about Standalone mode. As to whether to add a similar check in the 
Standalone class, I don't have a strong opinion (fix it here or in another 
PR).
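
For anyone checking by hand, a sketch of the kind of hand-assembled request 
that would now be rejected (field names from CreateSubmissionRequest; the 
values are illustrative):

    // A request built outside RestSubmissionClient, e.g. raw JSON via curl,
    // that skips these fields now fails server-side validation.
    val message = new CreateSubmissionRequest
    message.clientSparkVersion = "2.2.0"
    message.appResource = "honey-walnut-cherry.jar"
    message.sparkProperties = Map("spark.app.name" -> "pie")
    // appArgs / environmentVariables never set => doValidate() now throws.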




