Were all of the NodeManager services restarted after the change to yarn-site.xml?
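
For reference, the shuffle aux-service is only loaded when the NodeManager
starts, so the yarn-site.xml change and the shuffle jar have to be in place
on every node before the restart. Assuming a standard tarball install with
HADOOP_HOME pointing at the 2.7.1 directory, restarting the NodeManager on
each worker is something like:

  $HADOOP_HOME/sbin/yarn-daemon.sh stop nodemanager
  $HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager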

On Thu, Mar 3, 2016 at 6:00 AM, Jeff Zhang <zjf...@gmail.com> wrote:

> The executor may have failed to start. You need to check the executor
> logs; if there is no executor log, then check the NodeManager log.
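
(If YARN log aggregation is enabled, something like the following should pull
all of the container logs for the attempt, including any executor output; the
application id is taken from the container id in the stderr quoted below:

  yarn logs -applicationId application_1456905762620_0002

Without aggregation, the NodeManager's own log is usually
$HADOOP_HOME/logs/yarn-<user>-nodemanager-<host>.log on each worker.)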
>
> On Wed, Mar 2, 2016 at 4:26 PM, Xiaoye Sun <sunxiaoy...@gmail.com> wrote:
>
>> Hi all,
>>
>> I am very new to Spark and YARN.
>>
>> I am running the BroadcastTest example application using Spark 1.6.0 and
>> Hadoop/YARN 2.7.1 in a 5-node cluster.
>>
>> I set up my configuration files according to
>> https://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation
>>
>> 1. I copied
>> ./spark-1.6.0/network/yarn/target/scala-2.10/spark-1.6.0-yarn-shuffle.jar
>> to /hadoop-2.7.1/share/hadoop/yarn/lib/
>> 2. yarn-site.xml is like this
>> http://www.owlnet.rice.edu/~xs6/yarn-site.xml
>> 3. spark-defaults.conf is like this
>> http://www.owlnet.rice.edu/~xs6/spark-defaults.conf
>> 4. spark-env.sh is like this http://www.owlnet.rice.edu/~xs6/spark-env.sh
>> 5. the command I use to submit the Spark application is: ./bin/spark-submit
>> --class org.apache.spark.examples.BroadcastTest --master yarn --deploy-mode
>> cluster ./examples/target/spark-examples_2.10-1.6.0.jar 1 10000000 Http
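
For comparison with the linked files from steps 2 and 3: per the Spark 1.6
dynamic-allocation docs, those two files need at least the following (only
the shuffle/dynamic-allocation settings are sketched here, keeping
mapreduce_shuffle if it was already listed):

yarn-site.xml on every NodeManager:

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle,spark_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
    <value>org.apache.spark.network.yarn.YarnShuffleService</value>
  </property>

spark-defaults.conf:

  spark.shuffle.service.enabled    true
  spark.dynamicAllocation.enabled  true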
>>
>> However, the job is stuck in the RUNNING state, and by looking at the log I
>> found that the executors are being failed/cancelled repeatedly...
>> Here is the log output http://www.owlnet.rice.edu/~xs6/stderr
>> It shows something like
>>
>> 16/03/02 02:07:35 WARN yarn.YarnAllocator: Container marked as failed: 
>> container_1456905762620_0002_01_000002 on host: bold-x.rice.edu. Exit 
>> status: 1. Diagnostics: Exception from container-launch.
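
(With the default yarn.nodemanager.log-dirs, the launch output of that failed
container should be on bold-x.rice.edu under something like

  $HADOOP_HOME/logs/userlogs/application_1456905762620_0002/container_1456905762620_0002_01_000002/

and the stderr/stdout files there usually show why the container exited with
status 1.)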
>>
>>
>> Does anybody know what the problem is here?
>> Best,
>> Xiaoye
>>
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>
