Re: How to control Spark Executors from getting Lost when using YARN client mode?

2015-08-04 Thread Jeff Zhang
Please check the NodeManager logs to see why the container was killed.
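
If it helps, aggregated container logs can usually be pulled with the YARN CLI once the application has finished. A minimal sketch (the application ID below is a placeholder; use the one from the ResourceManager UI or the spark-submit console output, and note the grep terms are only a heuristic for spotting kill reasons):

```shell
# Placeholder application ID -- substitute your own from the
# ResourceManager UI or the spark-submit console output.
APP_ID=application_1438600000000_0001

# Fetch the aggregated container logs and look for kill reasons,
# e.g. "Container killed by YARN for exceeding memory limits".
yarn logs -applicationId "$APP_ID" | grep -i -E 'kill|memory|error'
```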

-- 
Best Regards

Jeff Zhang


Re: How to control Spark Executors from getting Lost when using YARN client mode?

2015-08-03 Thread Umesh Kacha
Hi all, any help will be much appreciated. My Spark job runs fine, but partway
through it starts losing executors because of a MetadataFetchFailedException
saying the shuffle is not found at its location, since the executor is lost.


Re: How to control Spark Executors from getting Lost when using YARN client mode?

2015-07-31 Thread Umesh Kacha
Hi, thanks for the response. It looks like the YARN container is getting
killed, but I don't know why; I see the shuffle MetadataFetchFailedException
mentioned in the following SO link. I have enough memory: 8 nodes with 8 cores
and 30 GB of memory each. And because of this exception YARN kills the
container running the executor. How can it overrun memory? I tried giving each
executor 25 GB; still it is not sufficient and it fails. Please guide me, I
don't understand what is going on. I am using Spark 1.4.0, with
spark.shuffle.memoryFraction set to 0.0 and spark.storage.memoryFraction set
to 0.5. I have almost all the recommended properties in place, like the Kryo
serializer; I have set the akka frame size to 500 and akka threads to 20. I
don't know, I am trapped; it's been two days that I have been trying to
recover from this issue.

http://stackoverflow.com/questions/29850784/what-are-the-likely-causes-of-org-apache-spark-shuffle-metadatafetchfailedexcept
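
For reference, the settings described above would typically be passed on the command line roughly like this (a sketch only: the class name and jar path are taken from the earlier message, and the full property names are assumed to be Spark 1.4's spark.shuffle.memoryFraction and spark.storage.memoryFraction):

```shell
./spark-submit --class com.xyz.MySpark \
  --master yarn-client \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.shuffle.memoryFraction=0.0 \
  --conf spark.storage.memoryFraction=0.5 \
  --conf spark.akka.frameSize=500 \
  --conf spark.akka.threads=20 \
  /home/myuser/myspark-1.0.jar
```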






How to control Spark Executors from getting Lost when using YARN client mode?

2015-07-30 Thread unk1102
Hi, I have one Spark job which runs fine locally with less data, but when I
schedule it on YARN I keep getting the following ERROR; slowly all executors
get removed from the UI and my job fails:

15/07/30 10:18:13 ERROR cluster.YarnScheduler: Lost executor 8 on
myhost1.com: remote Rpc client disassociated
15/07/30 10:18:13 ERROR cluster.YarnScheduler: Lost executor 6 on
myhost2.com: remote Rpc client disassociated
I use the following command to submit the Spark job in yarn-client mode:

 ./spark-submit --class com.xyz.MySpark --conf
spark.executor.extraJavaOptions=-XX:MaxPermSize=512M --driver-java-options
-XX:MaxPermSize=512m --driver-memory 3g --master yarn-client
--executor-memory 2G --executor-cores 8 --num-executors 12 
/home/myuser/myspark-1.0.jar

I don't know what the problem is; please guide me. I am new to Spark. Thanks
in advance.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-control-Spark-Executors-from-getting-Lost-when-using-YARN-client-mode-tp24084.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: How to control Spark Executors from getting Lost when using YARN client mode?

2015-07-30 Thread Ashwin Giridharan
What is your cluster configuration (size and resources)?

If you do not have enough resources, then your executor will not run.
Moreover, allocating 8 cores to a single executor is too much.

If you have a cluster with four nodes running NodeManagers, each equipped
with 4 cores and 8 GB of memory, then an optimal configuration would be:

--num-executors 8 --executor-cores 2 --executor-memory 2G
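
The arithmetic behind a sizing like this can be sketched as follows. This is a rough model, not from the thread: it assumes about 1 GB per node reserved for the OS/NodeManager, and YARN's default executor memory overhead of max(384 MB, 10% of executor memory); the 2-cores-per-executor choice is an input, not a derived value.

```python
# Rough YARN executor sizing sketch for the 4-node example above.
def size_executors(nodes, cores_per_node, mem_per_node_gb,
                   executor_cores=2, system_reserve_gb=1.0):
    executors_per_node = cores_per_node // executor_cores
    num_executors = nodes * executors_per_node
    usable_gb = mem_per_node_gb - system_reserve_gb
    # Each executor gets an equal share of the usable memory; solve
    # executor_memory + overhead <= share, with
    # overhead = max(0.384 GB, 10% of executor memory).
    share_gb = usable_gb / executors_per_node
    executor_mem_gb = min(share_gb - 0.384, share_gb / 1.10)
    return num_executors, executor_cores, executor_mem_gb

num, cores, mem = size_executors(4, 4, 8)
print(num, cores, round(mem, 2))  # 8 executors, 2 cores each
```

The model gives an upper bound of roughly 3 GB per executor here; the 2G suggested above simply leaves extra headroom, which is a reasonable conservative choice.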

Thanks,
Ashwin





-- 
Thanks & Regards,
Ashwin Giridharan