Re: Spark Streaming failing on YARN Cluster

2015-08-19 Thread Ramkumar V
Usually in yarn-cluster mode, you should be able to see logs like the following: 15/08/14 10:48:49 INFO yarn.Client: Preparing resources for our AM container 15/08/14 10:48:49 INFO yarn.Client: Uploading resource file:/Users/abc/github/spark/assembly/target/scala-2.10/spark-assembly-1.5.0

Re: Spark Streaming failing on YARN Cluster

2015-08-19 Thread Ramkumar V
file system and won't copy the files to HDFS as local resources. Usually in yarn-cluster mode, you should be able to see logs like the following: 15/08/14 10:48:49 INFO yarn.Client: Preparing resources for our AM container 15/08/14 10:48:49 INFO yarn.Client: Uploading resource file:/Users/abc

Re: issue Running Spark Job on Yarn Cluster

2015-08-18 Thread MooseSpark
Please check logs in your hadoop yarn cluster, there you would get precise error or exception. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/issue-Running-Spark-Job-on-Yarn-Cluster-tp21779p24308.html Sent from the Apache Spark User List mailing list

Re: issue Running Spark Job on Yarn Cluster

2015-08-17 Thread poolis
Did you resolve this issue?

Spark Streaming failing on YARN Cluster

2015-08-13 Thread Ramkumar V
Hi, I have a cluster of 1 master and 2 slaves. I'm running Spark Streaming on the master and I want to utilize all nodes in my cluster. I had specified some parameters like driver memory and executor memory in my code. When I give --deploy-mode cluster --master yarn-cluster in my spark-submit

Re: Spark Streaming failing on YARN Cluster

2015-08-13 Thread Ramkumar V
...@gmail.com wrote: Hi, I have a cluster of 1 master and 2 slaves. I'm running Spark Streaming on the master and I want to utilize all nodes in my cluster. I had specified some parameters like driver memory and executor memory in my code. When I give --deploy-mode cluster --master yarn

org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit

2015-08-03 Thread Rajeshkumar J
Hi Everyone, I have been using Apache Spark for 2 weeks and as of now I am querying Hive tables using the Spark Java API. It works fine in Hadoop single-node mode, but when I tried the same code on a Hadoop multi-node cluster it throws org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't

Fwd: org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit

2015-08-03 Thread Rajeshkumar J
Hi Everyone, I have been using Apache Spark for 2 weeks and as of now I am querying Hive tables using the Spark Java API. It works fine in Hadoop single-node mode, but when I tried the same code on a Hadoop multi-node cluster it throws org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't

No event logs in yarn-cluster mode

2015-08-01 Thread Akmal Abbasov
Hi, I am trying to configure a history server for applications. When I run locally (./run-example SparkPi), the event logs are created and I can start the history server. But when I try ./spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster file:///opt/hadoop
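The thread above hinges on where the event-log settings live: in yarn-cluster mode the driver runs inside the cluster, so history-server settings must travel with the submit command rather than sit in a local config. A minimal sketch of such an invocation, assuming a hypothetical HDFS log directory (the path is not from the thread):

```python
# Sketch: pass event-log settings on the spark-submit command line so the
# cluster-side driver sees them. The HDFS path is an illustrative assumption.
def submit_with_event_logs(app_class, app_jar, log_dir="hdfs:///spark-logs"):
    return [
        "./bin/spark-submit",
        "--class", app_class,
        "--master", "yarn-cluster",
        "--conf", "spark.eventLog.enabled=true",
        "--conf", "spark.eventLog.dir=" + log_dir,
        app_jar,
    ]

cmd = submit_with_event_logs("org.apache.spark.examples.SparkPi",
                             "file:///opt/spark/examples.jar")
```

The history server is then pointed at the same directory (spark.history.fs.logDirectory) so it can find the events the cluster-side driver wrote.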

Re: No event logs in yarn-cluster mode

2015-08-01 Thread Marcelo Vanzin
On Sat, Aug 1, 2015 at 9:25 AM, Akmal Abbasov akmal.abba...@icloud.com wrote: When I run locally (./run-example SparkPi), the event logs are created, and I can start the history server. But when I try ./spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster

Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-28 Thread Elkhan Dadashov
can fail but still successfully clean up their resources and give them back to the Yarn cluster. Because of this, there's a difference between your code throwing an exception in an executor/driver and the Yarn application failing. Generally you'll see a yarn application fail when there's a memory

Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-28 Thread Marcelo Vanzin
their resources and give them back to the Yarn cluster. Because of this, there's a difference between your code throwing an exception in an executor/driver and the Yarn application failing. Generally you'll see a yarn application fail when there's a memory problem (too much memory being allocated

Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-28 Thread Marcelo Vanzin
state that job has failed to expected error in python script. More details: Scenario While running Spark Word count python example on *Yarn cluster mode*, if I make intentional error in wordcount.py by changing this line (I'm using Spark 1.4.1, but this problem exists in Spark 1.4.0

Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-28 Thread Elkhan Dadashov
: Elkhan, What does the ResourceManager say about the final status of the job? Spark jobs that run as Yarn applications can fail but still successfully clean up their resources and give them back to the Yarn cluster. Because of this, there's a difference between your code throwing an exception

Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-28 Thread Elkhan Dadashov
: Elkhan, What does the ResourceManager say about the final status of the job? Spark jobs that run as Yarn applications can fail but still successfully clean up their resources and give them back to the Yarn cluster. Because of this, there's a difference between your code throwing an exception

Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-28 Thread Corey Nolet
enabled on your cluster, the yarn log command should give you any exceptions that were thrown in the driver / executors when you are running in yarn-cluster mode. If you were running in yarn-client mode, you'd see the errors that caused a job to fail in your local log (errors that would cause a job

Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-28 Thread Elkhan Dadashov
I run Spark in yarn-cluster mode, and yes, log aggregation is enabled. In the Yarn aggregated logs I can see the job status correctly. The issue is that the Yarn Client logs (which are written to stdout in the terminal) state that the job has succeeded even though it has failed. As the user is not testing if Yarn RM
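As the replies in this thread note, the YARN ResourceManager, not the client terminal, holds the authoritative outcome. A sketch of reading it from the RM REST API (GET /ws/v1/cluster/apps/&lt;appId&gt;), using a canned response since no live cluster is assumed; the application id is made up:

```python
import json

# Canned response in the shape the ResourceManager REST API returns; the
# id is illustrative. "finalStatus" is the field to trust, not the
# client-side terminal output.
sample = json.loads("""
{"app": {"id": "application_1438000000000_0001",
         "state": "FINISHED",
         "finalStatus": "FAILED"}}
""")

def final_status(rm_response):
    # A FINISHED state paired with finalStatus FAILED is exactly the
    # mismatch the thread describes: the YARN application terminated and
    # cleaned up normally, but the job itself failed.
    return rm_response["app"]["finalStatus"]

status = final_status(sample)
```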

Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-27 Thread Elkhan Dadashov
example with an intentional mistake in *Yarn cluster mode*, the Spark terminal states the final status as SUCCEEDED, but the log files state the correct result, indicating that the job failed. Why do the terminal log output and the application log output contradict each other? If I run the same job in *local mode* then the terminal

Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-27 Thread Corey Nolet
Elkhan, What does the ResourceManager say about the final status of the job? Spark jobs that run as Yarn applications can fail but still successfully clean up their resources and give them back to the Yarn cluster. Because of this, there's a difference between your code throwing an exception

Re: problems running Spark on a firewalled remote YARN cluster via SOCKS proxy

2015-07-24 Thread Rok Roskar
it looks like they are actually not the same (since in that case it sounds like a standalone cluster is being used).

[ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-23 Thread Elkhan Dadashov
Hi all, While running the Spark word count python example with an intentional mistake in *Yarn cluster mode*, the Spark terminal states the final status as SUCCEEDED, but the log files state the correct result, indicating that the job failed. Why do the terminal log output and the application log output contradict each other

Re: problems running Spark on a firewalled remote YARN cluster via SOCKS proxy

2015-07-23 Thread Akhil Das

problems running Spark on a firewalled remote YARN cluster via SOCKS proxy

2015-07-22 Thread rok
it looks like they are actually not the same (since in that case it sounds like a standalone cluster is being used).

Re: spark.executor.memory and spark.driver.memory have no effect in yarn-cluster mode (1.4.x)?

2015-07-22 Thread Michael Misiewicz
the configuration values spark.executor.memory and spark.driver.memory don't have any effect, when they are set from within my application (i.e. prior to creating a SparkContext). In yarn-cluster mode, only the values specified on the command line via spark-submit for driver and executor memory

spark.executor.memory and spark.driver.memory have no effect in yarn-cluster mode (1.4.x)?

2015-07-22 Thread Michael Misiewicz
. prior to creating a SparkContext). In yarn-cluster mode, only the values specified on the command line via spark-submit for driver and executor memory are respected, and if not, it appears spark falls back to defaults. For example, Correct behavior noted in Driver's logs on YARN when --executor
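The observation above follows from when the driver JVM is created: in yarn-cluster mode YARN launches it before any user code runs, so memory settings made on the SparkContext come too late. A hedged sketch of the working alternative, passing the sizes (illustrative values) on the command line:

```python
def submit_with_memory(app_jar, driver_mem="4g", executor_mem="2g"):
    # Memory flags must be known to spark-submit itself, since the driver
    # JVM heap is fixed before the application's own configuration runs.
    return [
        "./bin/spark-submit",
        "--master", "yarn-cluster",
        "--driver-memory", driver_mem,
        "--executor-memory", executor_mem,
        app_jar,
    ]

cmd = submit_with_memory("my-app.jar")
```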

Has anyone run Python Spark application on Yarn-cluster mode ? (which has 3rd party Python modules (i.e., numpy) to be shipped with)

2015-07-17 Thread Elkhan Dadashov
Hi all, After the SPARK-5479 https://issues.apache.org/jira/browse/SPARK-5479 fix (thanks to Marcelo Vanzin), pyspark now correctly adds several python files (or a zip folder with __init__.py) to PYTHONPATH in yarn-cluster mode. But adding a python module as a zip folder still fails

Why does the SparkSubmit process take so much virtual memory in yarn-cluster mode?

2015-07-14 Thread Elkhan Dadashov
More particular example: I run pi.py Spark Python example in *yarn-cluster* mode (--master) through SparkLauncher in Java. While the program is running, these are the stats of how much memory each process takes: SparkSubmit process : 11.266 *gigabyte* Virtual Memory ApplicationMaster process

Re: Why does the SparkSubmit process take so much virtual memory in yarn-cluster mode?

2015-07-14 Thread Elkhan Dadashov
ApplicationMaster process: 2303480 *byte *Virtual Memory That SparkSubmit number looks very suspicious. In yarn-cluster mode, SparkSubmit doesn't do much and should not use a lot of memory. You could set SPARK_PRINT_LAUNCH_CMD=1 before launching the app to see the exact java command line being used

Re: Why does the SparkSubmit process take so much virtual memory in yarn-cluster mode?

2015-07-14 Thread Marcelo Vanzin
On Tue, Jul 14, 2015 at 3:42 PM, Elkhan Dadashov elkhan8...@gmail.com wrote: I looked into Virtual memory usage (jmap+jvisualvm) does not show that 11.5 g Virtual Memory usage - it is much less. I get 11.5 g Virtual memory usage using top -p pid command for SparkSubmit process. If you're

Re: Why does the SparkSubmit process take so much virtual memory in yarn-cluster mode?

2015-07-14 Thread Marcelo Vanzin
That SparkSubmit number looks very suspicious. In yarn-cluster mode, SparkSubmit doesn't do much and should not use a lot of memory. You could set SPARK_PRINT_LAUNCH_CMD=1 before launching the app to see the exact java command line being used, and see whether it has any suspicious configuration. You could also
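Marcelo's SPARK_PRINT_LAUNCH_CMD suggestion is an environment variable, so the only change needed is to the environment handed to spark-submit. A small sketch (the submit command itself is elided):

```python
import os

# Copy the current environment and add the debug flag; spark-submit run
# with this environment prints the exact java command line it builds
# before starting the JVM.
env = dict(os.environ)
env["SPARK_PRINT_LAUNCH_CMD"] = "1"

# e.g. subprocess.Popen(["./bin/spark-submit", ...], env=env)
```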

Re: Pyspark not working on yarn-cluster mode

2015-07-10 Thread Elkhan Dadashov
Yes, you can launch (from Java code) pyspark scripts with yarn-cluster mode without using the spark-submit script. Check SparkLauncher code in this link https://github.com/apache/spark/tree/master/launcher/src/main/java/org/apache/spark/launcher . SparkLauncher is not dependent on Spark core jars
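SparkLauncher, referenced above, is a Java API; as a rough stand-in, here is a Python sketch that shells out to spark-submit in the same way (the script path is illustrative). The dry_run switch exists only so the sketch runs without a cluster:

```python
import subprocess

def launch(script="hdfs:///pi.py", dry_run=True):
    # Build the same kind of yarn-cluster submission SparkLauncher would;
    # with dry_run the command list is returned instead of executed.
    cmd = ["./bin/spark-submit", "--master", "yarn-cluster", script]
    if dry_run:
        return cmd
    return subprocess.Popen(cmd)

cmd = launch()
```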

Re: Pyspark not working on yarn-cluster mode

2015-07-10 Thread Sandy Ryza
To add to this, conceptually, it makes no sense to launch something in yarn-cluster mode by creating a SparkContext on the client - the whole point of yarn-cluster mode is that the SparkContext runs on the cluster, not on the client. On Thu, Jul 9, 2015 at 2:35 PM, Marcelo Vanzin van

Pyspark not working on yarn-cluster mode

2015-07-09 Thread jegordon
Hi to all, Is there any way to run pyspark scripts in yarn-cluster mode without using the spark-submit script? I need it this way because I will integrate this code into a Django web app. When I try to run any script in yarn-cluster mode I get the following error

Re: Pyspark not working on yarn-cluster mode

2015-07-09 Thread Marcelo Vanzin
You cannot run Spark in cluster mode by instantiating a SparkContext like that. You have to launch it with the spark-submit command line script. On Thu, Jul 9, 2015 at 2:23 PM, jegordon jgordo...@gmail.com wrote: Hi to all, Is there any way to run pyspark scripts with yarn-cluster mode

Re: Spark 1.4.0, Secure YARN Cluster, Application Master throws 500 connection refused (Resolved)

2015-06-25 Thread Nachiketa
Setting the yarn.resourcemanager.webapp.address.rm1 and yarn.resourcemanager.webapp.address.rm2 in yarn-site.xml seems to have resolved the issue. Appreciate any comments about the regression from 1.3.1 ? Thanks. Regards, Nachiketa On Fri, Jun 26, 2015 at 1:28 AM, Nachiketa

Re: Has anyone run Python Spark application on Yarn-cluster mode ? (which has 3rd party Python modules to be shipped with)

2015-06-25 Thread Marcelo Vanzin
of these commands succeed: ./bin/spark-submit --master yarn-cluster --verbose hdfs:///pi.py ./bin/spark-submit --master yarn-cluster --deploy-mode cluster --verbose examples/src/main/python/pi.py But in this particular example with 3rd party numpy module: ./bin/spark-submit --verbose --master yarn

Re: Has anyone run Python Spark application on Yarn-cluster mode ? (which has 3rd party Python modules to be shipped with)

2015-06-25 Thread Elkhan Dadashov
i try to include 3rd party dependency from local computer with --py-files (in Spark 1.4) Both of these commands succeed: ./bin/spark-submit --master yarn-cluster --verbose hdfs:///pi.py ./bin/spark-submit --master yarn-cluster --deploy-mode cluster --verbose examples/src/main/python/pi.py

Re: Has anyone run Python Spark application on Yarn-cluster mode ? (which has 3rd party Python modules to be shipped with)

2015-06-25 Thread Naveen Madhire
-submit --verbose --master yarn-cluster --py-files mypython/libs/numpy-1.9.2.zip --deploy-mode cluster mypython/scripts/kmeans.py /kmeans_data.txt 5 1.0 - numpy-1.9.2.zip - is downloaded numpy package - kmeans.py is default example which comes with Spark 1.4 - kmeans_data.txt - is default data

Re: Spark 1.4.0, Secure YARN Cluster, Application Master throws 500 connection refused

2015-06-25 Thread Nachiketa
A few other observations. 1. Spark 1.3.1 (custom built against HDP 2.2) was running fine against the same cluster and same hadoop configuration (hence seems like regression). 2. HA is enabled for YARN RM and HDFS (not sure if this would impact anything but wanted to share anyway). 3. Found this

Re: How to run kmeans.py Spark example in yarn-cluster ?

2015-06-25 Thread Elkhan Dadashov
Hi all, Does Spark 1.4 support Python applications on Yarn-cluster? (--master yarn-cluster) Does Spark 1.4 support Python applications with deploy-mode cluster? (--deploy-mode cluster) How can we ship 3rd party Python dependencies with a Python Spark job? (running on Yarn

Has anyone run Python Spark application on Yarn-cluster mode ? (which has 3rd party Python modules to be shipped with)

2015-06-25 Thread Elkhan Dadashov
In addition to previous emails, when I try to execute this command from the command line: ./bin/spark-submit --verbose --master yarn-cluster --py-files mypython/libs/numpy-1.9.2.zip --deploy-mode cluster mypython/scripts/kmeans.py /kmeans_data.txt 5 1.0 - numpy-1.9.2.zip - is the downloaded numpy
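Laid out argument by argument, the failing command from this thread looks like this. The thread only reports that it fails; a plausible reason is that numpy contains compiled extensions that do not survive being shipped as a plain zip, but that diagnosis is mine, not the thread's:

```python
# The thread's command, as a list: numpy is shipped as a plain zip via
# --py-files to a yarn-cluster deployment.
cmd = [
    "./bin/spark-submit",
    "--verbose",
    "--master", "yarn-cluster",
    "--py-files", "mypython/libs/numpy-1.9.2.zip",
    "--deploy-mode", "cluster",
    "mypython/scripts/kmeans.py",
    "/kmeans_data.txt", "5", "1.0",
]
```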

Spark 1.4.0, Secure YARN Cluster, Application Master throws 500 connection refused

2015-06-25 Thread Nachiketa
Spark 1.4.0 - Custom built from source against Hortonworks HDP 2.2 (hadoop 2.6.0+) HDP 2.2 Cluster (Secure, kerberos) spark-shell (--master yarn-client) launches fine and the prompt shows up. Clicking on the Application Master url on the YARN RM UI, throws 500 connect error. The same build works

Re: Has anyone run Python Spark application on Yarn-cluster mode ? (which has 3rd party Python modules to be shipped with)

2015-06-25 Thread Marcelo Vanzin
That sounds like SPARK-5479 which is not in 1.4... On Thu, Jun 25, 2015 at 12:17 PM, Elkhan Dadashov elkhan8...@gmail.com wrote: In addition to previous emails, when i try to execute this command from command line: ./bin/spark-submit --verbose --master yarn-cluster --py-files mypython/libs

How to run kmeans.py Spark example in yarn-cluster ?

2015-06-24 Thread Elkhan Dadashov
Hi all, I'm trying to run the kmeans.py Spark example in Yarn cluster mode. I'm using Spark 1.4.0 and passing numpy-1.9.2.zip with the --py-files flag. Here is the command I'm trying to execute, but it fails: ./bin/spark-submit --master yarn-cluster --verbose --py-files mypython/libs/numpy-1.9.2

Re: GSSException when submitting Spark job in yarn-cluster mode with HiveContext APIs on Kerberos cluster

2015-06-22 Thread Olivier Girardot
Hi, I can't get this to work using CDH 5.4 and Spark 1.4.0 in yarn-cluster mode. @andrew did you manage to get it to work with the latest version? On Tue, 21 Apr 2015 at 00:02, Andrew Lee alee...@hotmail.com wrote: Hi Marcelo, Exactly what I need to track, thanks for the JIRA pointer. Date

Re: Spark Streaming yarn-cluster Mode Off-heap Memory Is Constantly Growing

2015-06-18 Thread Ji ZHANG
with spark. Thanks Best Regards On Wed, May 27, 2015 at 11:51 AM, Ji ZHANG zhangj...@gmail.com wrote: Hi, I'm using Spark Streaming 1.3 on CDH5.1 with yarn-cluster mode. I find out that YARN is killing the driver and executor process because of excessive use of memory. Here's something I

Re: Spark Streaming yarn-cluster Mode Off-heap Memory Is Constantly Growing

2015-06-18 Thread Tathagata Das
/docs/1.3.1/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers that comes up with spark. Thanks Best Regards On Wed, May 27, 2015 at 11:51 AM, Ji ZHANG zhangj...@gmail.com wrote: Hi, I'm using Spark Streaming 1.3 on CDH5.1 with yarn-cluster mode. I find out

Re: Is there programmatic way running Spark job on Yarn cluster without using spark-submit script ?

2015-06-18 Thread Corey Nolet
This is not an independent programmatic way of running a Spark job on a Yarn cluster. The example I created simply demonstrates how to wire up the classpath so that spark-submit can be called programmatically. For my use case, I wanted to hold open a connection so I could send tasks to the executors

Is there programmatic way running Spark job on Yarn cluster without using spark-submit script ?

2015-06-17 Thread Elkhan Dadashov
Hi all, Is there any way of running a Spark job programmatically on a Yarn cluster without using the spark-submit script? I cannot include Spark jars in my Java application (due to dependency conflicts and other reasons), so I'll be shipping the Spark assembly uber jar (spark-assembly-1.3.1-hadoop2.3.0.jar

Re: Is there programmatic way running Spark job on Yarn cluster without using spark-submit script ?

2015-06-17 Thread Elkhan Dadashov
This is not an independent programmatic way of running a Spark job on a Yarn cluster. That example demonstrates running in *Yarn-client* mode, and it will also depend on Jetty. Users writing Spark programs do not want to depend on that. I found this SparkLauncher class introduced in Spark 1.4 version

Re: Is there programmatic way running Spark job on Yarn cluster without using spark-submit script ?

2015-06-17 Thread Guru Medasani
programmatic way of running a Spark job on a Yarn cluster. That example demonstrates running in Yarn-client mode, and it will also depend on Jetty. Users writing Spark programs do not want to depend on that. I found this SparkLauncher class introduced in Spark 1.4 version (https://github.com/apache

Re: Is there programmatic way running Spark job on Yarn cluster without using spark-submit script ?

2015-06-17 Thread Corey Nolet
cluster without using the spark-submit script? I cannot include Spark jars in my Java application (due to dependency conflicts and other reasons), so I'll be shipping the Spark assembly uber jar (spark-assembly-1.3.1-hadoop2.3.0.jar) to the Yarn cluster, and then execute the job (Python or Java) in Yarn-cluster

Kerberos authentication exception when spark access hbase with yarn-cluster mode on a kerberos yarn Cluster

2015-06-17 Thread 马元文
Hi all, I have a question about Spark accessing HBase in yarn-cluster mode on a kerberized Yarn cluster. Is distributing the keytab to each NodeManager the only way to enable Spark to access HBase? It seems that Spark doesn't provide a delegation token like an MR job does, am I right

Re: Is it possible to see Spark jobs on MapReduce job history ? (running Spark on YARN cluster)

2015-06-12 Thread Steve Loughran
...@gmail.com wrote: Hi all, I wonder if anyone has used MapReduce Job History to show Spark jobs. I can see my Spark jobs (Spark running on a Yarn cluster) on the Resource Manager (RM). I start the Spark History server, and then through Spark's web-based user interface I can monitor

Is it possible to see Spark jobs on MapReduce job history ? (running Spark on YARN cluster)

2015-06-11 Thread Elkhan Dadashov
Hi all, I wonder if anyone has used MapReduce Job History to show Spark jobs. I can see my Spark jobs (Spark running on a Yarn cluster) on the Resource Manager (RM). I start the Spark History server, and then through Spark's web-based user interface I can monitor the cluster (and track cluster

Problem with pyspark on Docker talking to YARN cluster

2015-06-10 Thread Ashwin Shankar
All, I was wondering if any of you have solved this problem: I have pyspark (ipython mode) running on Docker talking to a yarn cluster (AM/executors are NOT running on Docker). When I start pyspark in the Docker container, it binds to port *49460*. Once the app is submitted to YARN, the app (AM

Re: Spark Streaming yarn-cluster Mode Off-heap Memory Is Constantly Growing

2015-06-04 Thread Ji ZHANG
on CDH5.1 with yarn-cluster mode. I find out that YARN is killing the driver and executor process because of excessive use of memory. Here's something I tried: 1. Xmx is set to 512M and the GC looks fine (one ygc per 10s), so the extra memory is not used by heap. 2. I set the two

Re: Spark Streaming yarn-cluster Mode Off-heap Memory Is Constantly Growing

2015-06-02 Thread Ji ZHANG
On Wed, May 27, 2015 at 11:51 AM, Ji ZHANG zhangj...@gmail.com wrote: Hi, I'm using Spark Streaming 1.3 on CDH5.1 with yarn-cluster mode. I find out that YARN is killing the driver and executor process because of excessive use of memory. Here's something I tried: 1. Xmx is set to 512M

Re: Spark Streaming yarn-cluster Mode Off-heap Memory Is Constantly Growing

2015-06-02 Thread Tathagata Das
/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers that comes up with spark. Thanks Best Regards On Wed, May 27, 2015 at 11:51 AM, Ji ZHANG zhangj...@gmail.com wrote: Hi, I'm using Spark Streaming 1.3 on CDH5.1 with yarn-cluster mode. I find out that YARN

Re: Spark Streaming yarn-cluster Mode Off-heap Memory Is Constantly Growing

2015-06-02 Thread Ji ZHANG
:51 AM, Ji ZHANG zhangj...@gmail.com wrote: Hi, I'm using Spark Streaming 1.3 on CDH5.1 with yarn-cluster mode. I find out that YARN is killing the driver and executor process because of excessive use of memory. Here's something I tried: 1. Xmx is set to 512M and the GC looks fine (one

Re: yarn-cluster spark-submit process not dying

2015-05-28 Thread Corey Nolet
...@cloudera.com wrote: Hi Corey, As of this PR https://github.com/apache/spark/pull/5297/files, this can be controlled with spark.yarn.submit.waitAppCompletion. -Sandy On Thu, May 28, 2015 at 11:48 AM, Corey Nolet cjno...@gmail.com wrote: I am submitting jobs to my yarn cluster via the yarn

yarn-cluster spark-submit process not dying

2015-05-28 Thread Corey Nolet
I am submitting jobs to my yarn cluster via the yarn-cluster mode and I'm noticing the jvm that fires up to allocate the resources, etc... is not going away after the application master and executors have been allocated. Instead, it just sits there printing 1 second status updates to the console

Re: yarn-cluster spark-submit process not dying

2015-05-28 Thread Sandy Ryza
Hi Corey, As of this PR https://github.com/apache/spark/pull/5297/files, this can be controlled with spark.yarn.submit.waitAppCompletion. -Sandy On Thu, May 28, 2015 at 11:48 AM, Corey Nolet cjno...@gmail.com wrote: I am submitting jobs to my yarn cluster via the yarn-cluster mode and I'm
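The property Sandy names turns the blocking client into fire-and-forget. A sketch of how it would appear on the command line (the jar name is illustrative):

```python
def fire_and_forget(app_jar):
    # With spark.yarn.submit.waitAppCompletion=false the submitting process
    # exits once YARN accepts the application, instead of sitting there
    # polling its state every second until completion.
    return [
        "./bin/spark-submit",
        "--master", "yarn-cluster",
        "--conf", "spark.yarn.submit.waitAppCompletion=false",
        app_jar,
    ]

cmd = fire_and_forget("my-app.jar")
```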

Re: Spark Streaming yarn-cluster Mode Off-heap Memory Is Constantly Growing

2015-05-28 Thread Akhil Das
...@gmail.com wrote: Hi, I'm using Spark Streaming 1.3 on CDH5.1 with yarn-cluster mode. I find out that YARN is killing the driver and executor process because of excessive use of memory. Here's something I tried: 1. Xmx is set to 512M and the GC looks fine (one ygc per 10s), so the extra memory

Re: Spark Streaming yarn-cluster Mode Off-heap Memory Is Constantly Growing

2015-05-28 Thread Ji ZHANG
with spark. Thanks Best Regards On Wed, May 27, 2015 at 11:51 AM, Ji ZHANG zhangj...@gmail.com wrote: Hi, I'm using Spark Streaming 1.3 on CDH5.1 with yarn-cluster mode. I find out that YARN is killing the driver and executor process because of excessive use of memory. Here's something I

Re: Spark Streaming yarn-cluster Mode Off-heap Memory Is Constantly Growing

2015-05-28 Thread Akhil Das
, Ji ZHANG zhangj...@gmail.com wrote: Hi, I'm using Spark Streaming 1.3 on CDH5.1 with yarn-cluster mode. I find out that YARN is killing the driver and executor process because of excessive use of memory. Here's something I tried: 1. Xmx is set to 512M and the GC looks fine (one ygc per 10s

Re: Spark Streaming yarn-cluster Mode Off-heap Memory Is Constantly Growing

2015-05-28 Thread Ji ZHANG
with yarn-cluster mode. I find out that YARN is killing the driver and executor process because of excessive use of memory. Here's something I tried: 1. Xmx is set to 512M and the GC looks fine (one ygc per 10s), so the extra memory is not used by heap. 2. I set the two memoryOverhead params

Re: Spark Streaming yarn-cluster Mode Off-heap Memory Is Constantly Growing

2015-05-27 Thread Akhil Das
https://spark.apache.org/docs/1.3.1/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers that comes up with spark. Thanks Best Regards On Wed, May 27, 2015 at 11:51 AM, Ji ZHANG zhangj...@gmail.com wrote: Hi, I'm using Spark Streaming 1.3 on CDH5.1 with yarn-cluster mode. I

Re: Spark Streaming yarn-cluster Mode Off-heap Memory Is Constantly Growing

2015-05-27 Thread Ji ZHANG
with spark. Thanks Best Regards On Wed, May 27, 2015 at 11:51 AM, Ji ZHANG zhangj...@gmail.com wrote: Hi, I'm using Spark Streaming 1.3 on CDH5.1 with yarn-cluster mode. I find out that YARN is killing the driver and executor process because of excessive use of memory. Here's something I tried

Re: Spark Streaming yarn-cluster Mode Off-heap Memory Is Constantly Growing

2015-05-27 Thread Ji ZHANG
with spark. Thanks Best Regards On Wed, May 27, 2015 at 11:51 AM, Ji ZHANG zhangj...@gmail.com wrote: Hi, I'm using Spark Streaming 1.3 on CDH5.1 with yarn-cluster mode. I find out that YARN is killing the driver and executor process because of excessive use of memory. Here's something I

Re: --jars works in yarn-client but not yarn-cluster mode, why?

2015-05-20 Thread Marcelo Vanzin
Hello, Sorry for the delay. The issue you're running into is because most HBase classes are in the system class path, while jars added with --jars are only visible to the application class loader created by Spark. So classes in the system class path cannot see them. You can work around this by
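Marcelo's snippet is cut off before the workaround itself. One commonly used approach for this symptom, offered here as an assumption rather than a quote from the thread, is to put the jar on the driver and executor extra class paths in addition to --jars, so classes loaded from the system class path can resolve it (the jar path is taken from elsewhere in the thread):

```python
# Assumed workaround (not a quote from the truncated reply): add the jar to
# the extra class paths as well as --jars, so system-classpath classes
# (e.g. HBase's) can see it.
jar = "/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar"
cmd = [
    "./bin/spark-submit",
    "--master", "yarn-cluster",
    "--conf", "spark.driver.extraClassPath=" + jar,
    "--conf", "spark.executor.extraClassPath=" + jar,
    "--jars", jar,
    "my-app.jar",
]
```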

Re: --jars works in yarn-client but not yarn-cluster mode, why?

2015-05-20 Thread Fengyun RAO
Thank you so much, Marcelo! It WORKS! 2015-05-21 2:05 GMT+08:00 Marcelo Vanzin van...@cloudera.com: Hello, Sorry for the delay. The issue you're running into is because most HBase classes are in the system class path, while jars added with --jars are only visible to the application class

Re: --jars works in yarn-client but not yarn-cluster mode, why?

2015-05-19 Thread Fengyun RAO
Thanks, Marcelo! Below is the full log, SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in

Re: --jars works in yarn-client but not yarn-cluster mode, why?

2015-05-14 Thread Fengyun RAO
Thanks, Wilfred. In our program, the htrace-core-3.1.0-incubating.jar dependency is only required in the executor, not in the driver, while in both yarn-client and yarn-cluster the executor runs on the cluster. And clearly in yarn-cluster mode, the jar IS in spark.yarn.secondary.jars

Re: How to get applicationId for yarn mode(both yarn-client and yarn-cluster mode)

2015-05-13 Thread thanhtien522
by the specific AppName? Thanks!

Re: --jars works in yarn-client but not yarn-cluster mode, why?

2015-05-13 Thread Fengyun RAO
I look into the Environment in both modes. yarn-client: spark.jars local:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar,file:/home/xxx/my-app.jar yarn-cluster: spark.yarn.secondary.jars local:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar I

--jars works in yarn-client but not yarn-cluster mode, why?

2015-05-13 Thread Fengyun RAO
/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar my-app.jar /input /output However, if we change yarn-client to yarn-cluster, it throws a ClassNotFoundException (actually the class exists in htrace-core-3.1.0-incubating.jar): Caused by: java.lang.NoClassDefFoundError: org/apache

spark yarn-cluster job failing in batch processing

2015-04-23 Thread sachin Singh
Hi All, I am trying to execute batch processing in yarn-cluster mode, i.e. I have many SQL insert queries; based on the argument provided it will fetch the queries, create a context and schema RDD, and insert into Hive tables. Please note: in standalone mode it's working, and in cluster mode working

RE: GSSException when submitting Spark job in yarn-cluster mode with HiveContext APIs on Kerberos cluster

2015-04-20 Thread Andrew Lee
Hi Marcelo, Exactly what I need to track, thanks for the JIRA pointer. Date: Mon, 20 Apr 2015 14:03:55 -0700 Subject: Re: GSSException when submitting Spark job in yarn-cluster mode with HiveContext APIs on Kerberos cluster From: van...@cloudera.com To: alee...@hotmail.com CC: user

Re: Spark sql failed in yarn-cluster mode when connecting to non-default hive database

2015-04-13 Thread sachin Singh
: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-sql-failed-in-yarn-cluster-mode-when-connecting-to-non-default-hive-database-tp11811p22486.html Sent from the Apache Spark User List mailing list archive at Nabble.com

ExceptionDriver-Memory while running Spark job on Yarn-cluster

2015-04-13 Thread sachin Singh
Hi , When I am submitting spark job as --master yarn-cluster with below command/options getting driver memory error- spark-submit --jars ./libs/mysql-connector-java-5.1.17.jar,./libs/log4j-1.2.17.jar --files datasource.properties,log4j.properties --master yarn-cluster --num-executors 1 --driver

Re: ExceptionDriver-Memory while running Spark job on Yarn-cluster

2015-04-13 Thread ๏̯͡๏
Try this ./bin/spark-submit -v --master yarn-cluster --jars ./libs/mysql-connector-java-5.1.17.jar,./libs/log4j-1.2.17.jar --files datasource.properties,log4j.properties --num-executors 1 --driver-memory 4g *--driver-java-options -XX:MaxPermSize=1G* --executor-memory 2g --executor-cores 1

need info on Spark submit on yarn-cluster mode

2015-04-08 Thread sachin Singh
Hi, I observed that we have installed only one cluster, and when submitting the job as yarn-cluster I get the error below. Is the cause that the installation is only one cluster? Please correct me; if this is not the cause, then why am I not able to run in cluster mode? The spark-submit command is: spark-submit

Re: need info on Spark submit on yarn-cluster mode

2015-04-08 Thread Steve Loughran
like 10 minutes or longer: <property><name>yarn.nodemanager.delete.debug-delay-sec</name><value>600</value></property> On 8 Apr 2015, at 07:24, sachin Singh sachin.sha...@gmail.com wrote: Hi , I observed that we have installed only one cluster, and submiting job as yarn-cluster then getting
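Besides delaying local-directory cleanup on the NodeManagers as above, a complementary way to inspect a failed yarn-cluster run — assuming YARN log aggregation is enabled — is to pull the aggregated container logs after the application finishes. The application id below is illustrative, taken from an error message elsewhere in this digest.

```shell
# Sketch only: fetch aggregated driver/executor logs for a finished
# yarn-cluster application (requires yarn.log-aggregation-enable=true).
# The application id is illustrative.
yarn logs -applicationId application_1427124496008_0028 | less
```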

Re: What is best way to run spark job in yarn-cluster mode from java program(servlet container) and NOT using spark-submit command.

2015-03-26 Thread Noorul Islam K M
Sandy Ryza sandy.r...@cloudera.com writes: Creating a SparkContext and setting master as yarn-cluster unfortunately will not work. SPARK-4924 added APIs for doing this in Spark, but won't be included until 1.4. -Sandy Did you look into something like [1]? With that you can make REST API

Re: What is best way to run spark job in yarn-cluster mode from java program(servlet container) and NOT using spark-submit command.

2015-03-26 Thread Sandy Ryza
Creating a SparkContext and setting master as yarn-cluster unfortunately will not work. SPARK-4924 added APIs for doing this in Spark, but won't be included until 1.4. -Sandy On Tue, Mar 17, 2015 at 3:19 AM, Akhil Das ak...@sigmoidanalytics.com wrote: Create SparkContext set master as yarn

issue while submitting Spark Job as --master yarn-cluster

2015-03-25 Thread sachin Singh
.. Failing the application. APPID=application_1427124496008_0028 -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/issue-while-submitting-Spark-Job-as-master-yarn-cluster-tp0.html Sent from the Apache Spark User List mailing list archive at Nabble.com

Re: issue while submitting Spark Job as --master yarn-cluster

2015-03-25 Thread Sandy Ryza
exited with a non-zero exit code 13. Failing this attempt. Failing the application. APPID=application_1427124496008_0028 -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/issue-while-submitting-Spark-Job-as-master-yarn-cluster-tp0.html Sent from

Re: issue while submitting Spark Job as --master yarn-cluster

2015-03-25 Thread Xi Shen
=application_1427124496008_0028 -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/issue-while-submitting-Spark-Job-as-master-yarn-cluster-tp0.html Sent from the Apache Spark User List mailing list archive at Nabble.com

Re: issue while submitting Spark Job as --master yarn-cluster

2015-03-25 Thread Sachin Singh
-Spark-Job-as-master-yarn-cluster-tp0.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h

Re: Why doesn't the --conf parameter work in yarn-cluster mode (but works in yarn-client and local)?

2015-03-24 Thread Emre Sevinc
Hello Sandy, Thank you for your explanation. Then I would at least expect that to be consistent across local, yarn-client, and yarn-cluster modes. (And not lead to the case where it somehow works in two of them, and not for the third). Kind regards, Emre Sevinç http://www.bigindustries.be

Re: Why doesn't the --conf parameter work in yarn-cluster mode (but works in yarn-client and local)?

2015-03-24 Thread Sandy Ryza
but not in yarn-cluster mode). I'm surprised why I can't use it on the cluster while I can use it while local development and testing. Kind regards, Emre Sevinç http://www.bigindustries.be/ On Mon, Mar 23, 2015 at 6:15 PM, Sandy Ryza sandy.r...@cloudera.com wrote: Hi Emre, The --conf property

Why doesn't the --conf parameter work in yarn-cluster mode (but works in yarn-client and local)?

2015-03-23 Thread Emre Sevinc
same program in *yarn-cluster* mode. Why can't I retrieve the value of key given as --conf key=value when I submit my Spark application in *yarn-cluster* mode? Any ideas and/or workarounds? -- Emre Sevinç http://www.bigindustries.be/

Re: Why doesn't the --conf parameter work in yarn-cluster mode (but works in yarn-client and local)?

2015-03-23 Thread Sandy Ryza
Hi Emre, The --conf property is meant to work with yarn-cluster mode. System.getProperty(key) isn't guaranteed, but new SparkConf().get(key) should. Does it not? -Sandy On Mon, Mar 23, 2015 at 8:39 AM, Emre Sevinc emre.sev...@gmail.com wrote: Hello, According to Spark Documentation
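A hedged sketch of the pattern Sandy describes, with an illustrative class name, jar, and key: spark-submit only forwards `--conf` keys that start with `spark.` to the driver's SparkConf (other keys are ignored with a warning), which only becomes visible once the driver runs remotely in yarn-cluster mode.

```shell
# Sketch only: pass a custom property through to the driver in
# yarn-cluster mode. Key, class, and jar names are illustrative.
spark-submit \
  --master yarn-cluster \
  --conf spark.myapp.greeting=hello \
  --class com.example.MyApp \
  my-app.jar

# In the driver, read it back via the SparkConf, not System.getProperty:
#   new SparkConf().get("spark.myapp.greeting")
```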

Re: What is best way to run spark job in yarn-cluster mode from java program(servlet container) and NOT using spark-submit command.

2015-03-17 Thread Akhil Das
Create SparkContext set master as yarn-cluster then run it as a standalone program? Thanks Best Regards On Tue, Mar 17, 2015 at 1:27 AM, rrussell25 rrussel...@gmail.com wrote: Hi, were you ever able to determine a satisfactory approach for this problem? I have a similar situation and would

Re: What is best way to run spark job in yarn-cluster mode from java program(servlet container) and NOT using spark-submit command.

2015-03-16 Thread rrussell25
.nabble.com/What-is-best-way-to-run-spark-job-in-yarn-cluster-mode-from-java-program-servlet-container-and-NOT-u-tp21817p22086.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user

Re: Issue with yarn cluster - hangs in accepted state.

2015-03-15 Thread abhi
Thanks, It worked. -Abhi On Tue, Mar 3, 2015 at 5:15 PM, Tobias Pfeiffer t...@preferred.jp wrote: Hi, On Wed, Mar 4, 2015 at 6:20 AM, Zhan Zhang zzh...@hortonworks.com wrote: Do you have enough resource in your cluster? You can check your resource manager to see the usage. Yep, I can

Re: Using 1.3.0 client jars with 1.2.1 assembly in yarn-cluster mode

2015-03-08 Thread Akhil Das
Mostly, when you use different versions of jars, it will throw up incompatible version errors. Thanks Best Regards On Fri, Mar 6, 2015 at 7:38 PM, Zsolt Tóth toth.zsolt@gmail.com wrote: Hi, I submit spark jobs in yarn-cluster mode remotely from java code by calling

Using 1.3.0 client jars with 1.2.1 assembly in yarn-cluster mode

2015-03-06 Thread Zsolt Tóth
Hi, I submit spark jobs in yarn-cluster mode remotely from java code by calling Client.submitApplication(). For some reason I want to use 1.3.0 jars on the client side (e.g spark-yarn_2.10-1.3.0.jar) but I have spark-assembly-1.2.1* on the cluster. The problem is that the ApplicationMaster can't

Nullpointer Exception on broadcast variables (YARN Cluster mode)

2015-03-05 Thread samriddhac
Hi All, I have a simple Spark application where I am trying to broadcast a String-type variable on a YARN cluster. But every time I try to access the broadcasted variable's value, I am getting null within the Task. It will be really helpful if you guys can suggest what I am doing wrong here
