[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-10-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17357


---




[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-10-09 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request:

https://github.com/apache/spark/pull/17357#discussion_r143407147
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverWrapper.scala ---
@@ -23,14 +23,15 @@ import org.apache.commons.lang3.StringUtils
 
 import org.apache.spark.{SecurityManager, SparkConf}
 import org.apache.spark.deploy.{DependencyUtils, SparkHadoopUtil, SparkSubmit}
+import org.apache.spark.internal.Logging
 import org.apache.spark.rpc.RpcEnv
 import org.apache.spark.util.{ChildFirstURLClassLoader, MutableURLClassLoader, Utils}
 
 /**
  * Utility object for launching driver programs such that they share fate with the Worker process.
  * This is used in standalone cluster mode only.
  */
-object DriverWrapper {
+object DriverWrapper extends Logging {
--- End diff --

I cannot see why that would be a problem. Could you describe why you think it might be one?


---




[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-10-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17357#discussion_r143096475
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverWrapper.scala ---
@@ -23,14 +23,15 @@ import org.apache.commons.lang3.StringUtils
 
 import org.apache.spark.{SecurityManager, SparkConf}
 import org.apache.spark.deploy.{DependencyUtils, SparkHadoopUtil, SparkSubmit}
+import org.apache.spark.internal.Logging
 import org.apache.spark.rpc.RpcEnv
 import org.apache.spark.util.{ChildFirstURLClassLoader, MutableURLClassLoader, Utils}
 
 /**
  * Utility object for launching driver programs such that they share fate with the Worker process.
  * This is used in standalone cluster mode only.
  */
-object DriverWrapper {
+object DriverWrapper extends Logging {
--- End diff --

This `DriverWrapper` actually runs within the same JVM as the driver, and initializes the log4j instance earlier. Will this be a problem?
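
(For context, a simplified sketch of the lazy-initialization pattern that Logging mixins of this kind typically use; this is illustrative only, not the actual `org.apache.spark.internal.Logging` source. The backend is only touched on the first log call, so mixing the trait into `DriverWrapper` should not by itself move log4j initialization earlier.)

```scala
import org.slf4j.{Logger, LoggerFactory}

// Illustrative stand-in for a Logging mixin: the Logger is created lazily,
// so the logging backend (e.g. log4j behind an slf4j binding) is configured
// on first use, not when the trait is mixed in.
trait LazyLogging {
  @transient private var _log: Logger = null

  protected def log: Logger = {
    if (_log == null) {
      _log = LoggerFactory.getLogger(getClass.getName.stripSuffix("$"))
    }
    _log
  }

  protected def logInfo(msg: => String): Unit = {
    if (log.isInfoEnabled) log.info(msg)
  }
}

// Mixing in the trait is inert until the first logInfo call.
object DriverWrapperSketch extends LazyLogging {
  def main(args: Array[String]): Unit = {
    logInfo("first log call triggers backend initialization")
  }
}
```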


---




[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-08-28 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request:

https://github.com/apache/spark/pull/17357#discussion_r135699350
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/rest/StandaloneRestServer.scala ---
@@ -139,7 +139,9 @@ private[rest] class StandaloneSubmitRequestServlet(
 val driverExtraLibraryPath = sparkProperties.get("spark.driver.extraLibraryPath")
 val superviseDriver = sparkProperties.get("spark.driver.supervise")
 val appArgs = request.appArgs
-val environmentVariables = request.environmentVariables
+// Filter SPARK_LOCAL_(IP|HOSTNAME) environment variables from being set on the remote system.
+val environmentVariables =
+  request.environmentVariables.filterNot(x => x._1.matches("SPARK_LOCAL_(IP|HOSTNAME"))
--- End diff --

Thanks!


---



[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-08-28 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17357#discussion_r135627170
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/rest/StandaloneRestServer.scala ---
@@ -139,7 +139,9 @@ private[rest] class StandaloneSubmitRequestServlet(
 val driverExtraLibraryPath = sparkProperties.get("spark.driver.extraLibraryPath")
 val superviseDriver = sparkProperties.get("spark.driver.supervise")
 val appArgs = request.appArgs
-val environmentVariables = request.environmentVariables
+// Filter SPARK_LOCAL_(IP|HOSTNAME) environment variables from being set on the remote system.
+val environmentVariables =
+  request.environmentVariables.filterNot(x => x._1.matches("SPARK_LOCAL_(IP|HOSTNAME"))
--- End diff --

Should it be `"SPARK_LOCAL_(IP|HOSTNAME)"`?
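
(Worth noting: `String.matches` compiles its argument as a regex, so the unterminated group throws at runtime instead of merely mis-filtering. A quick sketch:)

```scala
import java.util.regex.Pattern

// String.matches compiles its argument, so the unclosed group throws rather
// than silently failing to match:
Pattern.compile("SPARK_LOCAL_(IP|HOSTNAME")
// => java.util.regex.PatternSyntaxException: Unclosed group near index ...

// With the closing paren the whole-string match behaves as intended:
"SPARK_LOCAL_IP".matches("SPARK_LOCAL_(IP|HOSTNAME)")    // true
"SPARK_LOCAL_DIRS".matches("SPARK_LOCAL_(IP|HOSTNAME)")  // false
```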


---



[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-08-28 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request:

https://github.com/apache/spark/pull/17357#discussion_r135485503
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/rest/StandaloneRestServer.scala ---
@@ -139,7 +139,9 @@ private[rest] class StandaloneSubmitRequestServlet(
 val driverExtraLibraryPath = sparkProperties.get("spark.driver.extraLibraryPath")
 val superviseDriver = sparkProperties.get("spark.driver.supervise")
 val appArgs = request.appArgs
-val environmentVariables = request.environmentVariables
+// Filter SPARK_LOCAL_(IP|HOSTNAME) environment variables from being set on the remote system.
+val environmentVariables =
+  request.environmentVariables.filterNot(x => x._1.matches("SPARK_LOCAL_(IP|HOSTNAME"))
--- End diff --

@jiangxb1987 Hi, I have changed it to filter only HOSTNAME and IP.


---



[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-08-24 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request:

https://github.com/apache/spark/pull/17357#discussion_r135187410
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/rest/StandaloneRestServer.scala ---
@@ -139,7 +139,9 @@ private[rest] class StandaloneSubmitRequestServlet(
 val driverExtraLibraryPath = sparkProperties.get("spark.driver.extraLibraryPath")
 val superviseDriver = sparkProperties.get("spark.driver.supervise")
 val appArgs = request.appArgs
-val environmentVariables = request.environmentVariables
+// Filter SPARK_LOCAL environment variables from being set on the remote system.
+val environmentVariables =
+  request.environmentVariables.filterNot(_._1.startsWith("SPARK_LOCAL"))
--- End diff --

Alright, I will check how it is used across the project.
Just noted: in `LocalDirsSuite`, the comments in `test("SPARK_LOCAL_DIRS override also affects driver")` seem to corroborate my intention here.



---



[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-08-24 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17357#discussion_r135032686
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/rest/StandaloneRestServer.scala ---
@@ -139,7 +139,9 @@ private[rest] class StandaloneSubmitRequestServlet(
 val driverExtraLibraryPath = sparkProperties.get("spark.driver.extraLibraryPath")
 val superviseDriver = sparkProperties.get("spark.driver.supervise")
 val appArgs = request.appArgs
-val environmentVariables = request.environmentVariables
+// Filter SPARK_LOCAL environment variables from being set on the remote system.
+val environmentVariables =
+  request.environmentVariables.filterNot(_._1.startsWith("SPARK_LOCAL"))
--- End diff --

Could you please check the code to ensure that `SPARK_LOCAL_DIRS` has not 
been used as a global config?
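
(For reference, a rough sketch of the per-node lookup in question. This only approximates the standalone path of the resolution in `org.apache.spark.util.Utils` and omits the YARN/Mesos cases, so the exact precedence shown here is an assumption to verify, not the authoritative logic:)

```scala
import org.apache.spark.SparkConf

// Approximate per-node resolution of local directories: the environment of
// the node running the process wins over per-application configuration.
def localDirs(conf: SparkConf): Seq[String] = {
  Option(conf.getenv("SPARK_LOCAL_DIRS"))            // node-local environment
    .orElse(conf.getOption("spark.local.dir"))       // per-application config
    .getOrElse(System.getProperty("java.io.tmpdir")) // last-resort default
    .split(",").map(_.trim).toSeq
}
```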


---



[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-08-24 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request:

https://github.com/apache/spark/pull/17357#discussion_r134986472
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverWrapper.scala ---
@@ -38,8 +39,10 @@ object DriverWrapper {
 */
   case workerUrl :: userJar :: mainClass :: extraArgs =>
 val conf = new SparkConf()
-val rpcEnv = RpcEnv.create("Driver",
-  Utils.localHostName(), 0, conf, new SecurityManager(conf))
+val host: String = Utils.localHostName()
--- End diff --

Please correct me if I am wrong. AFAIU, `spark.driver.host` may not be specified on each node of the cluster, and its value might be global (i.e. cluster-wide). Since the driver is launched on a random node during a failover, its value cannot be pre-assigned and has to be picked up from that node's local environment.

About `spark.driver.port` I am not sure, but it might work even if it is a global, cluster-wide constant.
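
(If that reading is right, the safe pattern here is to bind to the node-local hostname and at most honor a configured port. A hedged sketch of what such a block could look like; the `spark.driver.port` fallback is an assumption for illustration, not necessarily what this patch merges:)

```scala
import org.apache.spark.{SecurityManager, SparkConf}
import org.apache.spark.rpc.RpcEnv
import org.apache.spark.util.Utils

val conf = new SparkConf()
// Resolve the bind address on the node the driver actually landed on,
// rather than from a SPARK_LOCAL_* value forwarded by the submitter.
val host: String = Utils.localHostName()
// A cluster-wide spark.driver.port might still be usable; fall back to an
// ephemeral port (0) when it is not set.
val port: Int = conf.getInt("spark.driver.port", 0)
val rpcEnv = RpcEnv.create("Driver", host, port, conf, new SecurityManager(conf))
```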


---



[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-08-24 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request:

https://github.com/apache/spark/pull/17357#discussion_r134985214
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/rest/StandaloneRestServer.scala ---
@@ -139,7 +139,9 @@ private[rest] class StandaloneSubmitRequestServlet(
 val driverExtraLibraryPath = sparkProperties.get("spark.driver.extraLibraryPath")
 val superviseDriver = sparkProperties.get("spark.driver.supervise")
 val appArgs = request.appArgs
-val environmentVariables = request.environmentVariables
+// Filter SPARK_LOCAL environment variables from being set on the remote system.
+val environmentVariables =
+  request.environmentVariables.filterNot(_._1.startsWith("SPARK_LOCAL"))
--- End diff --

You are right, but shouldn't all SPARK_LOCAL* properties be picked up from the local environment of the node where the driver is going to be started?

Not filtering them would mean that these local properties are common to all nodes.

For this particular bug, though, `SPARK_LOCAL_DIRS` is not required to be filtered.
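
(To make the trade-off concrete, a small sketch with hypothetical values: the prefix filter also removes `SPARK_LOCAL_DIRS`, while the narrower regex keeps it.)

```scala
val env = Map(
  "SPARK_LOCAL_IP" -> "10.0.0.1",
  "SPARK_LOCAL_HOSTNAME" -> "node-1",
  "SPARK_LOCAL_DIRS" -> "/data/spark",
  "SPARK_HOME" -> "/opt/spark"
)

// Broad prefix filter: drops all three SPARK_LOCAL_* entries.
val broad = env.filterNot(_._1.startsWith("SPARK_LOCAL"))
// broad.keySet == Set("SPARK_HOME")

// Narrow filter: drops only IP and HOSTNAME, keeps SPARK_LOCAL_DIRS.
val narrow = env.filterNot(_._1.matches("SPARK_LOCAL_(IP|HOSTNAME)"))
// narrow.keySet == Set("SPARK_LOCAL_DIRS", "SPARK_HOME")
```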


---



[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-08-06 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17357#discussion_r131582810
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/rest/StandaloneRestServer.scala ---
@@ -139,7 +139,9 @@ private[rest] class StandaloneSubmitRequestServlet(
 val driverExtraLibraryPath = sparkProperties.get("spark.driver.extraLibraryPath")
 val superviseDriver = sparkProperties.get("spark.driver.supervise")
 val appArgs = request.appArgs
-val environmentVariables = request.environmentVariables
+// Filter SPARK_LOCAL environment variables from being set on the remote system.
+val environmentVariables =
+  request.environmentVariables.filterNot(_._1.startsWith("SPARK_LOCAL"))
--- End diff --

I guess the driver might not use `SPARK_LOCAL_DIRS`. But yes, we may only need to filter out `SPARK_LOCAL_IP` and `SPARK_LOCAL_HOSTNAME`.


---



[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-08-06 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17357#discussion_r131544143
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/rest/StandaloneRestServer.scala ---
@@ -139,7 +139,9 @@ private[rest] class StandaloneSubmitRequestServlet(
 val driverExtraLibraryPath = sparkProperties.get("spark.driver.extraLibraryPath")
 val superviseDriver = sparkProperties.get("spark.driver.supervise")
 val appArgs = request.appArgs
-val environmentVariables = request.environmentVariables
+// Filter SPARK_LOCAL environment variables from being set on the remote system.
+val environmentVariables =
+  request.environmentVariables.filterNot(_._1.startsWith("SPARK_LOCAL"))
--- End diff --

I think this would also affect `SPARK_LOCAL_DIRS`; is that expected?


---



[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-08-06 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17357#discussion_r131537192
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverWrapper.scala ---
@@ -38,8 +39,10 @@ object DriverWrapper {
 */
   case workerUrl :: userJar :: mainClass :: extraArgs =>
 val conf = new SparkConf()
-val rpcEnv = RpcEnv.create("Driver",
-  Utils.localHostName(), 0, conf, new SecurityManager(conf))
+val host: String = Utils.localHostName()
--- End diff --

I think it is reasonable to use `Utils.localHostName()`. But what is 
`spark.driver.host` for?


---