[GitHub] spark issue #19396: [SPARK-22172][CORE] Worker hangs when the external shuff...

2017-11-01 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/19396
  
Sorry I didn't notice it, will double-check next time.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19396: [SPARK-22172][CORE] Worker hangs when the external shuff...

2017-11-01 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/19396
  
We should update the PR description too, but it's too late now...


---




[GitHub] spark issue #19396: [SPARK-22172][CORE] Worker hangs when the external shuff...

2017-11-01 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/19396
  
OK, let me merge to master branch.


---




[GitHub] spark issue #19396: [SPARK-22172][CORE] Worker hangs when the external shuff...

2017-10-31 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/19396
  
I'm OK with the current changes.


---




[GitHub] spark issue #19396: [SPARK-22172][CORE] Worker hangs when the external shuff...

2017-10-31 Thread jiangxb1987
Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/19396
  
The change itself looks good to me, WDYT @jerryshao @cloud-fan ?


---




[GitHub] spark issue #19396: [SPARK-22172][CORE] Worker hangs when the external shuff...

2017-10-31 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/19396
  
@jiangxb1987 Thanks for the comment. I have updated the change so that it throws 
an exception and the worker exits.


---




[GitHub] spark issue #19396: [SPARK-22172][CORE] Worker hangs when the external shuff...

2017-10-19 Thread jiangxb1987
Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/19396
  
IMO we should throw a new exception in order to fail fast; running with an 
ESS that you can't connect to may cause some weird issues.
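
The fail-fast idea in this comment can be illustrated with a minimal sketch. It is not the code from the PR: the class `FailFastWorker` and the helper `startShuffleService` are hypothetical names, and a plain `ServerSocket` stands in for the external shuffle service. The sketch assumes the failure in question is a port that cannot be bound, and shows the bind error being rethrown so the caller can stop instead of continuing without the service.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical stand-in for a worker hosting a shuffle service; not Spark's actual classes.
class FailFastWorker {

    // Tries to bind the service port. Any bind failure (for example a
    // java.net.BindException when the port is already in use) is rethrown
    // as an unchecked exception instead of being logged and ignored.
    static ServerSocket startShuffleService(int port) {
        try {
            return new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
        } catch (IOException e) {
            throw new IllegalStateException(
                "Failed to start external shuffle service on port " + port, e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Occupy an ephemeral port first so that the second bind fails.
        try (ServerSocket occupied =
                 new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            int port = occupied.getLocalPort();
            try {
                startShuffleService(port).close();
                System.out.println("Shuffle service started");
            } catch (IllegalStateException e) {
                // A real worker would terminate with a non-zero exit status here.
                System.err.println("Worker exiting: " + e.getMessage());
            }
        }
    }
}
```

The alternative discussed in the thread, logging the error and keeping the process alive as the NM does, would correspond to swallowing the exception inside `startShuffleService`.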


---




[GitHub] spark issue #19396: [SPARK-22172][CORE] Worker hangs when the external shuff...

2017-10-16 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/19396
  
Sorry for the late response. I understand your purpose now. I think such a 
behavior discrepancy is not a big problem.

I guess the reason the NM keeps running after the exception is that the NM 
serves not only Spark but also MR/TEZ, so a failure of the Spark external 
shuffle service should not affect MR.

Based on your comment above, I don't have a strong preference either way; I 
think both are OK. Maybe you can ping others to get their feedback.


---




[GitHub] spark issue #19396: [SPARK-22172][CORE] Worker hangs when the external shuff...

2017-10-13 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/19396
  
@jerryshao Please let me know if you are not convinced by the above comment; 
I can change the PR to make the Worker go down on external shuffle service 
start failure.


---




[GitHub] spark issue #19396: [SPARK-22172][CORE] Worker hangs when the external shuff...

2017-09-30 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/19396
  
Thanks @jerryshao for the comment.

> IMO I think it might be better to throw an exception instead of not 
> starting shuffle service. Since user want to use external shuffle explicitly, 
> letting user to know the issues and fix the issue would be better.

I considered this before creating the PR, but the Node Manager continues to run 
when the Spark shuffle service gets a BindException, so I thought to keep the 
behavior in line with the NM Spark shuffle service.

Please let me know if you think the Worker should go down on external 
shuffle service start failure, and I will update this PR.


---




[GitHub] spark issue #19396: [SPARK-22172][CORE] Worker hangs when the external shuff...

2017-09-29 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/19396
  
IMO it might be better to throw an exception instead of shifting to 
another shuffle. Since the user explicitly wants to use external shuffle, 
letting the user know about the issue and fix it would be better.

Besides, this will lead to a problem if the cluster manager changes to not 
start the shuffle service while the Spark application still assumes external 
shuffle is on and tries to connect to the external shuffle service.


---




[GitHub] spark issue #19396: [SPARK-22172][CORE] Worker hangs when the external shuff...

2017-09-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19396
  
Can one of the admins verify this patch?


---
