Re: How to control Spark Executors from getting Lost when using YARN client mode?
Please check the NodeManager logs to see why the container was killed.

On Mon, Aug 3, 2015 at 11:59 PM, Umesh Kacha wrote:
> Hi all, any help will be much appreciated. My Spark job runs fine, but in
> the middle it starts losing executors with a MetadataFetchFailedException
> saying the shuffle output was not found at the expected location, since
> the executor is lost.

--
Best Regards

Jeff Zhang
Re: How to control Spark Executors from getting Lost when using YARN client mode?
Hi all, any help will be much appreciated. My Spark job runs fine, but in the middle it starts losing executors with a MetadataFetchFailedException saying the shuffle output was not found at the expected location, since the executor is lost.

On Jul 31, 2015 11:41 PM, "Umesh Kacha" wrote:
> Hi, thanks for the response. It looks like the YARN container is getting
> killed, but I don't know why; I see the shuffle
> MetadataFetchFailedException mentioned in the SO link.
Re: How to control Spark Executors from getting Lost when using YARN client mode?
Hi, thanks for the response. It looks like the YARN container is getting killed, but I don't know why; I see the shuffle MetadataFetchFailedException mentioned in the SO link below. I have enough memory: 8 nodes with 8 cores and 30 GB of memory each. Because of this MetadataFetchFailedException, YARN kills the container running the executor. How can it overrun memory? I tried to give each executor 25 GB and still it is not sufficient and it fails. Please guide me; I don't understand what is going on. I am using Spark 1.4.0 with spark.shuffle.memoryFraction set to 0.0 and spark.storage.memoryFraction set to 0.5. I have almost all the optimal properties set, like the Kryo serializer, a 500 MB Akka frame size, and 20 Akka threads. I don't know; I am trapped. It's been two days I have been trying to recover from this issue.

http://stackoverflow.com/questions/29850784/what-are-the-likely-causes-of-org-apache-spark-shuffle-metadatafetchfailedexcept

On Thu, Jul 30, 2015 at 9:56 PM, Ashwin Giridharan wrote:
> What is your cluster configuration (size and resources)?
>
> If you do not have enough resources, then your executors will not run.
> Moreover, allocating 8 cores to an executor is too much.
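A rough sketch of why even a large --executor-memory can still get a container killed: on YARN, Spark requests a container sized at executor memory plus spark.yarn.executor.memoryOverhead, which in Spark 1.x defaults (as I recall the docs) to roughly 10% of executor memory with a 384 MB floor. If that total exceeds what the NodeManager may allocate, YARN kills the container. The numbers below are illustrative, not taken from this thread:

```shell
# Assumed Spark 1.x heuristic: overhead = max(384 MB, 10% of executor memory).
executor_mem_mb=25600                        # --executor-memory 25g
overhead_mb=$(( executor_mem_mb / 10 ))      # 10% heuristic -> 2560
if [ "$overhead_mb" -lt 384 ]; then overhead_mb=384; fi
container_mb=$(( executor_mem_mb + overhead_mb ))
# The requested container must fit under yarn.nodemanager.resource.memory-mb.
echo "requested container: ${container_mb} MB"
```

Raising spark.yarn.executor.memoryOverhead explicitly is often the fix when containers are killed for exceeding physical memory limits.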
Re: How to control Spark Executors from getting Lost when using YARN client mode?
What is your cluster configuration (size and resources)?

If you do not have enough resources, then your executors will not run. Moreover, allocating 8 cores to an executor is too much.

If you have a cluster with four nodes running NodeManagers, each equipped with 4 cores and 8 GB of memory, then an optimal configuration would be:

--num-executors 8 --executor-cores 2 --executor-memory 2G

Thanks,
Ashwin

On Thu, Jul 30, 2015 at 12:08 PM, unk1102 wrote:
> Hi, I have a Spark job which runs fine locally with less data, but when I
> schedule it on YARN I keep getting the following ERROR and slowly all
> executors get removed from the UI and my job fails.

--
Thanks & Regards,
Ashwin Giridharan
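The sizing above follows a common heuristic: keep executors small (a few cores each), reserve some memory per node for the OS, Hadoop daemons, and per-container overhead, then divide what remains among the executor slots. A sketch of that arithmetic, using the four-node example figures (these numbers come from the example, not from the original poster's cluster):

```shell
nodes=4; cores_per_node=4; mem_per_node_gb=8   # example cluster from above
executor_cores=2                               # small executors: less GC pressure, better parallelism
executors_per_node=$(( cores_per_node / executor_cores ))            # 2 per node
num_executors=$(( nodes * executors_per_node ))                      # 8 total
budget_gb=$(( (mem_per_node_gb - 2) / executors_per_node ))          # ~2 GB reserved per node for OS/daemons
executor_mem_gb=$(( budget_gb - 1 ))                                 # leave ~1 GB headroom for container overhead
echo "--num-executors ${num_executors} --executor-cores ${executor_cores} --executor-memory ${executor_mem_gb}G"
```

The exact reservations (2 GB per node, 1 GB headroom) are assumptions for illustration; the point is that the sum of all containers on a node must stay below what the NodeManager can hand out.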
Re: How to control Spark Executors from getting Lost when using YARN client mode?
See this past thread on the topic:

http://search-hadoop.com/m/q3RTt0NZXV1cC6q02

On Thu, Jul 30, 2015 at 9:08 AM, unk1102 wrote:
> Hi, I have a Spark job which runs fine locally with less data, but when I
> schedule it on YARN I keep getting the following ERROR and slowly all
> executors get removed from the UI and my job fails.
How to control Spark Executors from getting Lost when using YARN client mode?
Hi, I have one Spark job which runs fine locally with less data, but when I schedule it on YARN to execute I keep getting the following ERROR, and slowly all executors get removed from the UI and my job fails:

15/07/30 10:18:13 ERROR cluster.YarnScheduler: Lost executor 8 on myhost1.com: remote Rpc client disassociated
15/07/30 10:18:13 ERROR cluster.YarnScheduler: Lost executor 6 on myhost2.com: remote Rpc client disassociated

I use the following command to submit the Spark job in yarn-client mode:

./spark-submit --class com.xyz.MySpark --conf "spark.executor.extraJavaOptions=-XX:MaxPermSize=512M" --driver-java-options -XX:MaxPermSize=512m --driver-memory 3g --master yarn-client --executor-memory 2G --executor-cores 8 --num-executors 12 /home/myuser/myspark-1.0.jar

I don't know what the problem is; please guide me. I am new to Spark. Thanks in advance.

--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-control-Spark-Executors-from-getting-Lost-when-using-YARN-client-mode-tp24084.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
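For reference, the resource footprint of the spark-submit command above can be tallied the way YARN sees it. The overhead figure here assumes the Spark 1.x default heuristic (max of 384 MB and 10% of executor memory); the takeaway is that 12 executors at 8 cores each request 96 cores, which can easily oversubscribe a cluster and lead to containers being killed:

```shell
num_executors=12; executor_cores=8; executor_mem_mb=2048   # values from the spark-submit above
total_cores=$(( num_executors * executor_cores ))          # 96 cores requested
overhead_mb=$(( executor_mem_mb / 10 ))                    # assumed 10% heuristic -> 204
if [ "$overhead_mb" -lt 384 ]; then overhead_mb=384; fi    # assumed 384 MB floor applies
container_mb=$(( executor_mem_mb + overhead_mb ))          # 2432 MB per container
total_mem_mb=$(( num_executors * container_mb ))           # 29184 MB across the cluster
echo "requested: ${total_cores} cores, ${total_mem_mb} MB"
```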