[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

2017-05-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17821


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

2017-05-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/17821#discussion_r114446352
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
@@ -266,7 +289,8 @@ private[deploy] class Worker(
 if (registerMasterFutures != null) {
   registerMasterFutures.foreach(_.cancel(true))
 }
-val masterAddress = masterRef.address
+val masterAddress =
+  if (preferConfiguredMasterAddress) masterAddressToConnect.get else masterRef.address
--- End diff --

Right now `masterRef` and `masterAddressToConnect` are set at the same 
time, so `masterAddressToConnect` can only be empty here if a future change 
breaks something. It's better to fail than to hide that broken change.
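
The trade-off discussed here can be sketched as follows. This is a minimal illustration, not the PR's code: `RpcAddress` is a simplified stand-in for Spark's class, and `failFast` and `withFallback` are hypothetical helper names.

```scala
// Simplified stand-in for Spark's RpcAddress (assumption, for illustration only).
final case class RpcAddress(host: String, port: Int)

object AddressChoice {
  // Fail-fast variant: Option.get throws NoSuchElementException when the
  // configured address is missing, so a broken invariant surfaces at once.
  def failFast(
      prefer: Boolean,
      configured: Option[RpcAddress],
      fromRef: RpcAddress): RpcAddress =
    if (prefer) configured.get else fromRef

  // Fallback variant: silently uses the address from the master ref when
  // the configured address is missing.
  def withFallback(
      prefer: Boolean,
      configured: Option[RpcAddress],
      fromRef: RpcAddress): RpcAddress =
    configured match {
      case Some(address) if prefer => address
      case _ => fromRef
    }
}
```

Both variants return the same address whenever the configured address is present; they differ only in how a missing address is handled.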





[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

2017-05-02 Thread sameeragarwal
Github user sameeragarwal commented on a diff in the pull request:

https://github.com/apache/spark/pull/17821#discussion_r11771
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala ---
@@ -80,8 +91,16 @@ private[deploy] object DeployMessages {
 
   sealed trait RegisterWorkerResponse
 
-  case class RegisteredWorker(master: RpcEndpointRef, masterWebUiUrl: String) extends DeployMessage
-    with RegisterWorkerResponse
+  /**
+   * @param master the master ref
+   * @param masterWebUiUrl the master Web UI address
+   * @param masterAddress the master address used by the worker to connect. It should be
+   *  [[RegisterWorker.masterAddress]].
+   */
+  case class RegisteredWorker(
+      master: RpcEndpointRef,
+      masterWebUiUrl: String,
+      masterAddress: RpcAddress) extends DeployMessage with RegisterWorkerResponse
--- End diff --

Alright, that sounds good.





[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

2017-05-02 Thread sameeragarwal
Github user sameeragarwal commented on a diff in the pull request:

https://github.com/apache/spark/pull/17821#discussion_r114445132
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
@@ -266,7 +289,8 @@ private[deploy] class Worker(
 if (registerMasterFutures != null) {
   registerMasterFutures.foreach(_.cancel(true))
 }
-val masterAddress = masterRef.address
+val masterAddress =
+  if (preferConfiguredMasterAddress) masterAddressToConnect.get else masterRef.address
--- End diff --

Perhaps it isn't an issue, but do you think we should fall back to 
`masterRef.address` in case `masterAddressToConnect` isn't set (instead of 
throwing a generic Scala exception)? Something along the lines of:

```scala
val masterAddress = masterAddressToConnect match {
  case Some(master) if preferConfiguredMasterAddress => master
  case _ => masterRef.address
}
```





[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

2017-05-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/17821#discussion_r114419583
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
@@ -266,7 +282,7 @@ private[deploy] class Worker(
 if (registerMasterFutures != null) {
   registerMasterFutures.foreach(_.cancel(true))
 }
-val masterAddress = masterRef.address
+val masterAddress = masterAddressToConnect.get
--- End diff --

Added a new conf





[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

2017-05-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/17821#discussion_r114419550
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala ---
@@ -80,8 +91,16 @@ private[deploy] object DeployMessages {
 
   sealed trait RegisterWorkerResponse
 
-  case class RegisteredWorker(master: RpcEndpointRef, masterWebUiUrl: String) extends DeployMessage
-    with RegisterWorkerResponse
+  /**
+   * @param master the master ref
+   * @param masterWebUiUrl the master Web UI address
+   * @param masterAddress the master address used by the worker to connect. It should be
+   *  [[RegisterWorker.masterAddress]].
+   */
+  case class RegisteredWorker(
+      master: RpcEndpointRef,
+      masterWebUiUrl: String,
+      masterAddress: RpcAddress) extends DeployMessage with RegisterWorkerResponse
--- End diff --

I checked the current code. Unfortunately, we cannot remove this extra 
field: `master.address` and `masterAddress` are different.





[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

2017-05-01 Thread sameeragarwal
Github user sameeragarwal commented on a diff in the pull request:

https://github.com/apache/spark/pull/17821#discussion_r114205566
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala ---
@@ -80,8 +91,16 @@ private[deploy] object DeployMessages {
 
   sealed trait RegisterWorkerResponse
 
-  case class RegisteredWorker(master: RpcEndpointRef, masterWebUiUrl: String) extends DeployMessage
-    with RegisterWorkerResponse
+  /**
+   * @param master the master ref
+   * @param masterWebUiUrl the master Web UI address
+   * @param masterAddress the master address used by the worker to connect. It should be
+   *  [[RegisterWorker.masterAddress]].
+   */
+  case class RegisteredWorker(
+      master: RpcEndpointRef,
+      masterWebUiUrl: String,
+      masterAddress: RpcAddress) extends DeployMessage with RegisterWorkerResponse
--- End diff --

Can we avoid adding an extra field here? Perhaps just put the 
`masterAddress` in the `master` field.





[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

2017-05-01 Thread sameeragarwal
Github user sameeragarwal commented on a diff in the pull request:

https://github.com/apache/spark/pull/17821#discussion_r114206001
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
@@ -266,7 +282,7 @@ private[deploy] class Worker(
 if (registerMasterFutures != null) {
   registerMasterFutures.foreach(_.cancel(true))
 }
-val masterAddress = masterRef.address
+val masterAddress = masterAddressToConnect.get
--- End diff --

How about we protect this change with a conf (with a default that still 
uses `masterRef`)? If we can merge `master` and `masterAddress` as I suggested 
above, we can just add a conf on the master, and the worker code can be largely 
unaffected.
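
A conf-guarded choice of this kind could look roughly like the sketch below. Everything in it is an assumption made for illustration: a plain `Map` stands in for the Spark configuration, `RpcAddress` is a simplified stand-in, and the key name is hypothetical, since this thread does not state the actual conf name.

```scala
// Simplified stand-in for Spark's RpcAddress (assumption, for illustration only).
final case class RpcAddress(host: String, port: Int)

object ConfGuard {
  // Hypothetical key name; the thread does not say what the conf is called.
  val PreferConfiguredKey = "spark.worker.preferConfiguredMasterAddress"

  // Defaults to false when the key is absent, which keeps the old behavior.
  def preferConfigured(conf: Map[String, String]): Boolean =
    conf.get(PreferConfiguredKey).exists(_.toBoolean)

  // Picks the address the worker should use when reconnecting.
  def reconnectAddress(
      conf: Map[String, String],
      configured: RpcAddress,
      fromMasterRef: RpcAddress): RpcAddress =
    if (preferConfigured(conf)) configured else fromMasterRef
}
```

With an empty configuration the address from the master ref is still used, so existing deployments would be unaffected unless they opt in.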





[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

2017-05-01 Thread zsxwing
GitHub user zsxwing opened a pull request:

https://github.com/apache/spark/pull/17821

[SPARK-20529][Core]Allow worker and master work with a proxy server

## What changes were proposed in this pull request?

In the current code, when a worker connects to the master, the master sends its 
address to the worker. The worker saves this address and uses it to 
reconnect in case of failure. However, this address is sometimes wrong: 
if there is a proxy between the master and the worker, the address the master 
sends is not the address of the proxy.

In this PR, the worker sends the master address it used to connect to the 
master, and the master simply replies with that same address. The worker then 
uses this address to reconnect in case of failure. In other words, the worker 
uses the master address configured on the worker side, if possible, rather than 
the master address set on the master side.
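
The echo step described above can be sketched as follows. This is a simplified model under stated assumptions, not the PR's code: the message types are cut down to the fields that matter here, `Master.handle` is a hypothetical helper, and the host names and ports are made up.

```scala
// Simplified stand-in for Spark's RpcAddress (assumption, for illustration only).
final case class RpcAddress(host: String, port: Int)

// Worker -> master: carries the address the worker actually dialed,
// which is the proxy's address when a proxy sits in between.
final case class RegisterWorker(workerId: String, masterAddress: RpcAddress)

// Master -> worker: echoes that address back unchanged.
final case class RegisteredWorker(masterWebUiUrl: String, masterAddress: RpcAddress)

object Master {
  // The master's own address, which may be unreachable from behind a proxy.
  val selfAddress: RpcAddress = RpcAddress("10.0.0.5", 7077)

  // Hypothetical handler: reply with the address the worker sent,
  // not with selfAddress.
  def handle(msg: RegisterWorker): RegisteredWorker =
    RegisteredWorker("http://10.0.0.5:8080", msg.masterAddress)
}

object Demo {
  def main(args: Array[String]): Unit = {
    val proxy = RpcAddress("proxy.example.com", 7077)
    val reply = Master.handle(RegisterWorker("worker-1", proxy))
    // The worker would keep reply.masterAddress for later reconnects.
    println(reply.masterAddress)
  }
}
```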

There is still one potential issue, though. When a master is restarted or 
takes over leadership, the worker will use the address sent from the master to 
connect. If there is still a proxy between the master and the worker, that 
address may be wrong. However, the worker alone has no way to detect this.

## How was this patch tested?

The newly added unit test.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zsxwing/spark SPARK-20529

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17821.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17821


commit 8ded9b197cc7ef3cdb32858da385cf9f900deb7d
Author: Shixiong Zhu 
Date:   2017-04-28T23:04:48Z

Fix SPARK-20529



