Re: Spark Streaming failing on YARN Cluster
Yes, when I checked the YARN logs for that particular failed app_id, I got the following error:

ERROR yarn.ApplicationMaster: SparkContext did not initialize after waiting for 100000 ms. Please check earlier log output for errors. Failing the application.

To fix this error, I needed to change the SparkContext setup and set the master to yarn-cluster ( setMaster("yarn-cluster") ). It's working fine in cluster mode now. Thanks, everyone.

*Thanks*, https://in.linkedin.com/in/ramkumarcs31

On Fri, Aug 21, 2015 at 6:41 AM, Jeff Zhang zjf...@gmail.com wrote:

The AM fails to launch; could you check the yarn app logs? You can use the command yarn logs -applicationId <your_app_id> to get the yarn app logs.
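For readers hitting the same timeout, a minimal sketch of what the fix described above might look like in a PySpark streaming job. The app name, batch interval, and input source are hypothetical; the thread only confirms that setMaster("yarn-cluster") was added to the configuration.

from pyspark import SparkConf, SparkContext
from pyspark.streaming import StreamingContext

# Set the master explicitly in the configuration, as described above.
conf = SparkConf().setAppName("StreamingApp").setMaster("yarn-cluster")
sc = SparkContext(conf=conf)

# One-minute batches, matching the "store results every minute" behavior
# mentioned later in the thread; the interval is illustrative.
ssc = StreamingContext(sc, 60)

# Hypothetical input source; the thread does not say what the job reads.
lines = ssc.textFileStream("hdfs:///tmp/streaming-input")
lines.count().pprint()

ssc.start()
ssc.awaitTermination()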
Re: Spark Streaming failing on YARN Cluster
I'm getting a Spark exception. Please look at this log trace ( http://pastebin.com/xL9jaRUa ).

*Thanks*, https://in.linkedin.com/in/ramkumarcs31

On Wed, Aug 19, 2015 at 10:20 PM, Hari Shreedharan hshreedha...@cloudera.com wrote:

It looks like you are having issues with the files getting distributed to the cluster. What is the exception you are getting now?

-- Thanks, Hari
Re: Spark Streaming failing on YARN Cluster
Thanks a lot for your suggestion. I have modified HADOOP_CONF_DIR in spark-env.sh so that core-site.xml is under HADOOP_CONF_DIR, and I can now see logs like the ones you showed above. The job now runs for 3 minutes and stores results every minute. After some time, there is an exception. How do I fix this exception, and can you please explain where it's going wrong?

Log link: http://pastebin.com/xL9jaRUa

*Thanks*, https://in.linkedin.com/in/ramkumarcs31

On Wed, Aug 19, 2015 at 1:54 PM, Jeff Zhang zjf...@gmail.com wrote:

HADOOP_CONF_DIR is the environment variable pointing to the hadoop conf directory. I'm not sure how CDH organizes that; make sure core-site.xml is under HADOOP_CONF_DIR.

-- Best Regards Jeff Zhang
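For anyone following along, the change being described is a one-line addition to conf/spark-env.sh. The path below is the usual CDH client-configuration directory and is an assumption; point it at whichever directory actually contains your cluster's core-site.xml and yarn-site.xml:

# In conf/spark-env.sh of the separately installed Spark 1.4.1
export HADOOP_CONF_DIR=/etc/hadoop/conf

With that set, spark-submit picks up fs.defaultFS from core-site.xml and stages local resources to HDFS instead of skipping the copy.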
Re: Spark Streaming failing on YARN Cluster
We are using Cloudera 5.3.1. Since it is one of the earlier versions of CDH, it doesn't support the latest version of Spark, so I installed spark-1.4.1 separately on my machine. I am not able to do spark-submit in cluster mode. How do I put core-site.xml under the classpath? It would be very helpful if you could explain in detail how to solve this issue.

*Thanks*, https://in.linkedin.com/in/ramkumarcs31

On Fri, Aug 14, 2015 at 8:25 AM, Jeff Zhang zjf...@gmail.com wrote:

1. 15/08/12 13:24:49 INFO Client: Source and destination file systems are the same. Not copying file:/home/hdfs/spark-1.4.1/assembly/target/scala-2.10/spark-assembly-1.4.1-hadoop2.5.0-cdh5.3.5.jar
2. 15/08/12 13:24:49 INFO Client: Source and destination file systems are the same. Not copying file:/home/hdfs/spark-1.4.1/external/kafka-assembly/target/spark-streaming-kafka-assembly_2.10-1.4.1.jar
3. 15/08/12 13:24:49 INFO Client: Source and destination file systems are the same. Not copying file:/home/hdfs/spark-1.4.1/python/lib/pyspark.zip
4. 15/08/12 13:24:49 INFO Client: Source and destination file systems are the same. Not copying file:/home/hdfs/spark-1.4.1/python/lib/py4j-0.8.2.1-src.zip
5. 15/08/12 13:24:49 INFO Client: Source and destination file systems are the same. Not copying file:/home/hdfs/spark-1.4.1/examples/src/main/python/streaming/kyt.py

diagnostics: Application application_1437639737006_3808 failed 2 times due to AM Container for appattempt_1437639737006_3808_02 exited with exitCode: -1000 due to: File file:/home/hdfs/spark-1.4.1/python/lib/pyspark.zip does not exist. Failing this attempt. Failing the application.

The machine you run Spark on is the client machine, while the YARN AM runs on another machine, and the YARN AM complains that the files are not found, as your logs show. From the logs, it seems these files were not copied to HDFS as local resources. I suspect you didn't put core-site.xml under your classpath, so Spark cannot detect your remote file system and won't copy the files to HDFS as local resources. Usually in yarn-cluster mode, you should be able to see logs like the following:

15/08/14 10:48:49 INFO yarn.Client: Preparing resources for our AM container
15/08/14 10:48:49 INFO yarn.Client: Uploading resource file:/Users/abc/github/spark/assembly/target/scala-2.10/spark-assembly-1.5.0-SNAPSHOT-hadoop2.6.0.jar -> hdfs://0.0.0.0:9000/user/abc/.sparkStaging/application_1439432662178_0019/spark-assembly-1.5.0-SNAPSHOT-hadoop2.6.0.jar
15/08/14 10:48:50 INFO yarn.Client: Uploading resource file:/Users/abc/github/spark/spark.py -> hdfs://0.0.0.0:9000/user/abc/.sparkStaging/application_1439432662178_0019/spark.py
15/08/14 10:48:50 INFO yarn.Client: Uploading resource file:/Users/abc/github/spark/python/lib/pyspark.zip -> hdfs://0.0.0.0:9000/user/abc/.sparkStaging/application_1439432662178_0019/pyspark.zip

-- Best Regards Jeff Zhang
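One quick way to confirm Jeff's diagnosis is to check which file system Spark's Hadoop configuration resolves to: if core-site.xml is not on the classpath, fs.defaultFS falls back to the local file system, which produces exactly the "Source and destination file systems are the same" lines above. A minimal sketch; note that sc._jsc is PySpark's internal handle to the JavaSparkContext, so treat this as a debugging aid rather than a public API:

from pyspark import SparkContext

# Run this on the client machine with the same environment used for spark-submit.
sc = SparkContext(appName="check-default-fs")

# Prints file:/// when core-site.xml is not being picked up, and
# hdfs://<namenode>:<port> when it is.
print(sc._jsc.hadoopConfiguration().get("fs.defaultFS"))

sc.stop()

Once staging works, the uploaded resources can also be confirmed with hdfs dfs -ls /user/<user>/.sparkStaging/<application_id>, matching the "Uploading resource" lines in Jeff's example.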
Spark Streaming failing on YARN Cluster
Hi, I have a cluster with 1 master and 2 slaves. I'm running a Spark Streaming job on the master and I want to utilize all the nodes in my cluster. I had specified some parameters, like driver memory and executor memory, in my code. When I give --deploy-mode cluster --master yarn-cluster in my spark-submit, it gives the following error. Log link: http://pastebin.com/kfyVWDGR How do I fix this issue? Please tell me if I'm doing something wrong.

*Thanks*, Ramkumar V
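A submission along the lines described above would look something like the following; the script name and memory sizes are placeholders:

spark-submit --master yarn-cluster --deploy-mode cluster --driver-memory 2g --executor-memory 2g your_streaming_job.py

One caveat worth noting: in yarn-cluster mode, driver memory has to be passed to spark-submit (or set in spark-defaults.conf) rather than set inside the application code, because the container that runs the driver is requested before any user code executes.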
Re: Spark Streaming failing on YARN Cluster
Yes, this file is available at this path on the same machine where I'm running Spark. I later copied the spark-1.4.1 folder to all the other machines in my cluster, but I'm still facing the same issue.

*Thanks*, https://in.linkedin.com/in/ramkumarcs31

On Thu, Aug 13, 2015 at 1:17 PM, Akhil Das ak...@sigmoidanalytics.com wrote:

Just make sure this file is available: appattempt_1437639737006_3808_02 exited with exitCode: -1000 due to: File *file:/home/hdfs/spark-1.4.1/python/lib/pyspark.zip* does not exist

Thanks Best Regards