Re: Silly question about Yarn client vs Yarn cluster modes...

2016-06-22 Thread Michael Segel
The only documentation on this… in terms of direction … (that I could find) If your client is not close to the cluster (e.g. your PC) then you definitely want to go cluster to improve performance. If your client is close to the cluster (e.g. an edge node) then you could go either client or
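
For readers skimming this thread: the two modes differ in where the driver runs, and the choice is made at submit time. A minimal sketch, assuming a hypothetical application class `com.example.MyApp` and jar `my-app.jar` (neither appears in the thread). The helper only prints the command so it can be inspected before it is run:

```shell
#!/bin/sh
# Print (not run) a spark-submit command line for the given deploy mode.
# com.example.MyApp and my-app.jar are hypothetical placeholders.
submit_cmd() {
    # "client":  the driver runs in the JVM that calls spark-submit
    # "cluster": the driver runs inside a YARN container on the cluster
    mode="$1"
    printf 'spark-submit --master yarn --deploy-mode %s --class com.example.MyApp my-app.jar\n' "$mode"
}

submit_cmd client    # e.g. submitting from an edge node close to the cluster
submit_cmd cluster   # e.g. submitting from a PC far from the cluster
```

On Spark 1.x the same choice could also be written as `--master yarn-client` or `--master yarn-cluster`, which is the form most messages in this archive use.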

Re: Silly question about Yarn client vs Yarn cluster modes...

2016-06-22 Thread Marcelo Vanzin
On Wed, Jun 22, 2016 at 1:32 PM, Mich Talebzadeh wrote: > Does it also depend on the number of Spark nodes involved in choosing which > way to go? Not really. -- Marcelo - To unsubscribe, e-mail:

Re: Silly question about Yarn client vs Yarn cluster modes...

2016-06-22 Thread Mich Talebzadeh
Thanks Marcelo, Sounds like cluster mode is more resilient than the client-mode. Does it also depend on the number of Spark nodes involved in choosing which way to go? Cheers Dr Mich Talebzadeh

Re: Silly question about Yarn client vs Yarn cluster modes...

2016-06-22 Thread Marcelo Vanzin
Trying to keep the answer short and simple... On Wed, Jun 22, 2016 at 1:19 PM, Michael Segel wrote: > But this gets to the question… what are the real differences between client > and cluster modes? > What are the pros/cons and use cases where one has advantages over

Re: Silly question about Yarn client vs Yarn cluster modes...

2016-06-22 Thread Michael Segel
rprise > practice :) > > The question on YARN client versus YARN cluster mode. I am not sure how much > in real life it is going to make an impact if I choose one over the other? > > These days I tell developers that it is perfectly valid to use Spark local > mode to their d

Re: Silly question about Yarn client vs Yarn cluster modes...

2016-06-22 Thread Mich Talebzadeh
This is exactly the sort of topic that distinguishes lab work from enterprise practice :) The question on YARN client versus YARN cluster mode. I am not sure how much in real life it is going to make an impact if I choose one over the other? These days I tell developers that it is perfectly valid

Re: Silly question about Yarn client vs Yarn cluster modes...

2016-06-22 Thread Michael Segel
JDBC reliability problem? Ok… a bit more explanation… Usually when you have to go back to a legacy system, it's because the data set is usually metadata and is relatively small. It's not the sort of data that gets ingested into a data lake unless you’re also ingesting the metadata and are

Re: Silly question about Yarn client vs Yarn cluster modes...

2016-06-22 Thread Mich Talebzadeh
Thanks Mike for clarification. I think there is another option to get data out of RDBMS through some form of SELECT ALL COLUMNS TAB SEPARATED OR OTHER and put them in a flat file or files. scp that file from the RDBMS directory to a private directory on HDFS system and push it into HDFS. That

Re: Silly question about Yarn client vs Yarn cluster modes...

2016-06-22 Thread Michael Segel
Hi, Just to clear a few things up… First I know it's hard to describe some problems because they deal with client confidential information. (Also some basic ‘dead hooker’ thought problems to work through before facing them at a client.) The questions I pose here are very general and deal

Re: Silly question about Yarn client vs Yarn cluster modes...

2016-06-22 Thread Mich Talebzadeh
If you are going to get data out of an RDBMS like Oracle then the correct procedure is: 1. Use Hive on Spark execution engine. That improves Hive performance 2. You can use JDBC through Spark itself. No issue there. It will use JDBC provided by HiveContext 3. JDBC is fine. Every

Re: Silly question about Yarn client vs Yarn cluster modes...

2016-06-21 Thread Jörn Franke
I would import data via sqoop and put it on HDFS. It has some mechanisms to handle the lack of reliability of JDBC. Then you can process the data via Spark. You could also use the JDBC RDD but I do not recommend using it, because you do not want to pull data all the time out of the database when
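
The import step suggested above might look roughly like the following. This is a sketch under stated assumptions: the JDBC URL, user, password file, table name and target directory are all hypothetical, and the command is printed rather than executed so the values can be checked first.

```shell
#!/bin/sh
# Hypothetical connection details; replace with real values.
JDBC_URL='jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL'
TABLE='LOOKUP_TABLE'
TARGET_DIR="/data/staging/$TABLE"

# Build the Sqoop import command; it is printed here rather than executed.
sqoop_cmd="sqoop import --connect $JDBC_URL --username etl_user --password-file /user/etl_user/.dbpass --table $TABLE --target-dir $TARGET_DIR --num-mappers 4"
echo "$sqoop_cmd"
```

Once the import has landed, the Spark job can read the files under the target directory from HDFS instead of going back to the database on every run.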

Re: Silly question about Yarn client vs Yarn cluster modes...

2016-06-21 Thread ayan guha
I may be wrong here, but beeline is basically a client library. So you "connect" to STS and/or HS2 using beeline. Spark connecting to JDBC is a different discussion and in no way related to beeline. When you read data from a DB (Oracle, DB2 etc) then you do not use beeline, but use a JDBC connection to

Re: Silly question about Yarn client vs Yarn cluster modes...

2016-06-21 Thread Michael Segel
Sorry, I think you misunderstood. Spark can read from JDBC sources so to say using beeline as a way to access data is not a spark application isn’t really true. Would you say the same if you were pulling data in to spark from Oracle or DB2? There are a couple of different design patterns and

Re: Silly question about Yarn client vs Yarn cluster modes...

2016-06-21 Thread ayan guha
1. Yes, in the sense you control number of executors from spark application config. 2. Any IO will be done from executors (never ever on driver, unless you explicitly call collect()). For example, connection to a DB happens one for each worker (and used by local executors). Also, if you run a

Silly question about Yarn client vs Yarn cluster modes...

2016-06-21 Thread Michael Segel
Ok, it's at the end of the day and I’m trying to make sure I understand the locale of where things are running. I have an application where I have to query a bunch of sources, creating some RDDs and then I need to join off the RDDs and some other lookup tables. Yarn has two modes… client and

Re: In yarn-cluster mode, provide system prop to the client jvm

2016-06-16 Thread Jacek Laskowski
/jaceklaskowski On Thu, Jun 16, 2016 at 1:02 PM, Ellis, Tom (Financial Markets IT) <tom.el...@lloydsbanking.com.invalid> wrote: > Hi, > > > > I was wondering if it was possible to submit a java system property to the > JVM that does the submission of a yarn-cluster a

In yarn-cluster mode, provide system prop to the client jvm

2016-06-16 Thread Ellis, Tom (Financial Markets IT)
Hi, I was wondering if it was possible to submit a java system property to the JVM that does the submission of a yarn-cluster application, for instance, -Dlog4j.configuration. I believe it will default to using the SPARK_CONF_DIR's log4j.properties, is it possible to override this, as I do

Re: Spark job is failing with kerberos error while creating hive context in yarn-cluster mode (through spark-submit)

2016-05-23 Thread Chandraprakash Bhagtani
Thanks, It worked !!! On Tue, May 24, 2016 at 1:14 AM, Marcelo Vanzin wrote: > On Mon, May 23, 2016 at 4:41 AM, Chandraprakash Bhagtani > wrote: > > I am passing hive-site.xml through --files option. > > You need hive-site.xml in Spark's classpath

Re: Spark job is failing with kerberos error while creating hive context in yarn-cluster mode (through spark-submit)

2016-05-23 Thread Marcelo Vanzin
On Mon, May 23, 2016 at 4:41 AM, Chandraprakash Bhagtani wrote: > I am passing hive-site.xml through --files option. You need hive-site.xml in Spark's classpath too. The easiest way is to copy or symlink hive-site.xml into your Spark conf directory. -- Marcelo
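
The symlink approach described above can be sketched as follows. The default paths (`/etc/hive/conf`, `/opt/spark/conf`) are assumptions that vary by distribution, and `link_hive_site` is a hypothetical helper name, not part of Spark or Hive:

```shell
#!/bin/sh
# Assumed locations; override via the environment for a real installation.
HIVE_CONF_DIR="${HIVE_CONF_DIR:-/etc/hive/conf}"
SPARK_CONF_DIR="${SPARK_CONF_DIR:-${SPARK_HOME:-/opt/spark}/conf}"

# Symlink hive-site.xml from the Hive conf directory ($1) into the
# Spark conf directory ($2), so that Spark picks it up from its classpath.
link_hive_site() {
    if [ -f "$1/hive-site.xml" ] && [ -d "$2" ]; then
        ln -sf "$1/hive-site.xml" "$2/hive-site.xml"
    else
        echo "hive-site.xml or Spark conf directory not found" >&2
        return 1
    fi
}

link_hive_site "$HIVE_CONF_DIR" "$SPARK_CONF_DIR" || echo "nothing linked; check the paths above"
```

A symlink rather than a copy keeps the Spark side in step with later changes to the Hive configuration.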

Re: Spark job is failing with kerberos error while creating hive context in yarn-cluster mode (through spark-submit)

2016-05-23 Thread Chandraprakash Bhagtani
> > > > Hi, > > > > My Spark job is failing with kerberos issues while creating hive context > in yarn-cluster mode. However it is running with yarn-client mode. My spark > version is 1.6.1 > > > > I am passing hive-site.xml through --files option. > >

Re: Spark job is failing with kerberos error while creating hive context in yarn-cluster mode (through spark-submit)

2016-05-23 Thread Doug Balog
issues while creating hive context in > yarn-cluster mode. However it is running with yarn-client mode. My spark > version is 1.6.1 > > I am passing hive-site.xml through --files option. > > I tried searching online and found that the same issue is fixed with the > follow

Re: Spark job is failing with kerberos error while creating hive context in yarn-cluster mode (through spark-submit)

2016-05-23 Thread Ted Yu
Can you describe the kerberos issues in more detail ? Which release of YARN are you using ? Cheers On Mon, May 23, 2016 at 4:41 AM, Chandraprakash Bhagtani < cpbhagt...@gmail.com> wrote: > Hi, > > My Spark job is failing with kerberos issues while creating hive context > in

Spark job is failing with kerberos error while creating hive context in yarn-cluster mode (through spark-submit)

2016-05-23 Thread Chandraprakash Bhagtani
Hi, My Spark job is failing with kerberos issues while creating hive context in yarn-cluster mode. However it is running with yarn-client mode. My spark version is 1.6.1 I am passing hive-site.xml through --files option. I tried searching online and found that the same issue is fixed

Re: duplicate jar problem in yarn-cluster mode

2016-05-17 Thread Saisai Shao
de with yarn-cluster > > --master > yarn-cluster > --name > Spark-FileCopy > --class > my.example.SparkFileCopy > --properties-file > spark-defaults.conf > --queue > saleyq > --executor-memory > 1G > --driver-memory > 1G > --conf > spark.john.snow.is.back

duplicate jar problem in yarn-cluster mode

2016-05-17 Thread satish saley
Hello, I am executing a simple code with yarn-cluster --master yarn-cluster --name Spark-FileCopy --class my.example.SparkFileCopy --properties-file spark-defaults.conf --queue saleyq --executor-memory 1G --driver-memory 1G --conf spark.john.snow.is.back=true --jars hdfs://myclusternn.com:8020

Re: yarn-cluster mode error

2016-05-17 Thread Sandeep Nemuri
Can you post the complete stack trace? On Tue, May 17, 2016 at 7:00 PM, <spark@yahoo.com.invalid> wrote: > Hi, > > i am getting error below while running application on yarn-cluster mode. > > *ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM* > >

yarn-cluster mode error

2016-05-17 Thread spark.raj
Hi, I am getting the error below while running an application in yarn-cluster mode. ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM Can anyone suggest why I am getting this error message? Thanks Raj

Submitting Job to YARN-Cluster using Spark Job Server

2016-05-12 Thread ashesh_28
Hi Guys, Has any of you tried this mechanism before? I am able to run it locally and get the output, but how do I submit the job to the Yarn-Cluster using Spark-JobServer? Any documentation? Regards Ashesh -- View this message in context: http://apache-spark-user-list.1001560

Re: yarn-cluster

2016-05-04 Thread nsalian
.nabble.com/yarn-cluster-tp26846p26882.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org

Re: yarn-cluster

2016-05-03 Thread nsalian
YARN settings to help this work appropriately and get the number of containers that the application needs. - Neelesh S. Salian Cloudera

Re: (YARN CLUSTER MODE) Where to find logs within Spark RDD processing function ?

2016-04-29 Thread nguyen duc tuan
>> --conf >> >> "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:///local/file/log4j.properties" >> >> FYI >> >> On Fri, Apr 29, 2016 at 6:03 AM, dev loper <spark...@gmail.com> wrote: >> >>> Hi Spark Team,
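
In context, the `--conf` setting quoted above would sit in a full submit command along these lines. A sketch only: the class and jar names are hypothetical, the driver-side setting is added on the assumption that driver logging should use the same file, and the command is printed rather than run.

```shell
#!/bin/sh
# Path taken from the thread; with a file: URI the file has to exist
# at this location on every node that runs an executor.
LOG4J_URI='file:///local/file/log4j.properties'

# com.example.MyApp and my-app.jar are hypothetical placeholders.
cmd="spark-submit --master yarn --deploy-mode cluster \
--conf spark.executor.extraJavaOptions=-Dlog4j.configuration=$LOG4J_URI \
--conf spark.driver.extraJavaOptions=-Dlog4j.configuration=$LOG4J_URI \
--class com.example.MyApp my-app.jar"
echo "$cmd"
```

Log statements inside RDD functions run on the executors, so their output goes to the executor container logs on the worker nodes, not to the console of the machine that called spark-submit.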

Re: (YARN CLUSTER MODE) Where to find logs within Spark RDD processing function ?

2016-04-29 Thread dev loper
Hi Ted & Nguyen, @Ted , I was under the belief that the log4j.properties file would be taken from the application classpath if a file path is not specified. Please correct me if I am wrong. I tried your approach as well, but I still couldn't find the logs. @nguyen I am running it on a Yarn clu

Re: (YARN CLUSTER MODE) Where to find logs within Spark RDD processing function ?

2016-04-29 Thread nguyen duc tuan
here in the executor machine) or see through webui 2016-04-29 20:03 GMT+07:00 dev loper <spark...@gmail.com>: > Hi Spark Team, > > I have asked the same question on stack overflow , no luck yet. > > > http://stackoverflow.com/questions/36923949/where-to-find-logs-withi

(YARN CLUSTER MODE) Where to find logs within Spark RDD processing function ?

2016-04-29 Thread dev loper
Hi Spark Team, I have asked the same question on stack overflow , no luck yet. http://stackoverflow.com/questions/36923949/where-to-find-logs-within-spark-rdd-processing-function-yarn-cluster-mode?noredirect=1#comment61419406_36923949 I am running my Spark Application on Yarn Cluster

Re: Problem with pyspark on Docker talking to YARN cluster

2016-04-06 Thread John Omernik
s host > network stack accessible to all > containers in it, which could lead to security issues. > > 3. use yarn-cluster mode > > Pyspark interactive shell(ipython) doesn't have cluster mode. SPARK-5162 > <https://issues.apache.org/jira/browse/SPARK-5162> is for spar

Re: Issues facing while Running Spark Streaming Job in YARN cluster mode

2016-03-22 Thread Saisai Shao
015.sp...@gmail.com> wrote: > Hi , > > I am able to run spark streaming job in local mode, when i try to run the > same job in my YARN cluster, it's throwing errors. > > Any help is appreciated in this regard > > Here are my Exception logs: > > Exception 1: >

Issues facing while Running Spark Streaming Job in YARN cluster mode

2016-03-22 Thread Soni spark
Hi , I am able to run spark streaming job in local mode, when i try to run the same job in my YARN cluster, it's throwing errors. Any help is appreciated in this regard Here are my Exception logs: Exception 1: java.net.SocketTimeoutException: 48 millis timeout while waiting for channel

Re: how to set log level of spark executor on YARN(using yarn-cluster mode)

2016-03-15 Thread jkukul
Hi Eric (or rather: anyone who's experiencing a similar situation), I think your problem was that the --files parameter was provided after the application jar. Your command should have looked like this instead: ./bin/spark-submit --class edu.bjut.spark.SparkPageRank --master yarn-cluster
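
The ordering rule behind this advice is that spark-submit stops reading its own options at the application jar and passes everything after it to the application. The sketch below only illustrates that split; it is not how spark-submit is implemented. `classify_args` is a hypothetical helper, `app.jar` a hypothetical jar name, and the helper assumes no option value before the application jar itself ends in `.jar`:

```shell
#!/bin/sh
# Illustrative only: label each argument by who would consume it,
# splitting at the first argument that ends in .jar.
classify_args() {
    seen_jar=no
    for arg in "$@"; do
        if [ "$seen_jar" = yes ]; then
            echo "application: $arg"
        else
            echo "spark-submit: $arg"
            case "$arg" in *.jar) seen_jar=yes ;; esac
        fi
    done
}

# --files placed after the jar ends up as an application argument:
classify_args --master yarn-cluster app.jar --files log4j.properties
```

Moving `--files log4j.properties` in front of `app.jar` puts both tokens on the spark-submit side of the split.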

Re: Spark jobs run extremely slow on yarn cluster compared to standalone spark

2016-02-14 Thread Yuval.Itzchakov

Spark jobs run extremely slow on yarn cluster compared to standalone spark

2016-02-12 Thread pdesai

HADOOP_HOME are not set when try to run spark application in yarn cluster mode

2016-02-09 Thread Rachana Srivastava
I am trying to run an application in yarn cluster mode. Spark-Submit with Yarn Cluster Here are setting of the shell script: spark-submit --class "com.Myclass" \ --num-executors 2 \ --executor-cores 2 \ --master yarn \ --supervise \ --deploy-mode cluster \ ../target/ \ My application

RE: HADOOP_HOME are not set when try to run spark application in yarn cluster mode

2016-02-09 Thread Diwakar Dhanuskodi
, rachana.srivast...@markmonitor.com, user@spark.apache.org Cc: Subject: RE: HADOOP_HOME are not set when try to run spark application in yarn cluster mode Thanks so much Diwakar. spark-submit --class "com.MyClass" \ --files=/usr/lib/hadoop/etc/hadoop/core-site.xml,/usr/lib/hadoop/

RE: HADOOP_HOME are not set when try to run spark application in yarn cluster mode

2016-02-09 Thread Diwakar Dhanuskodi
P_HOME are not set when try to run spark application in yarn cluster mode I am trying to run an application in yarn cluster mode. Spark-Submit with Yarn Cluster Here are setting of the shell script: spark-submit --class "com.Myclass" \ --num-executors 2 \ --executor-cores 2 \ --master ya

Re: HBase 0.98.0 with Spark 1.5.3 issue in yarn-cluster mode

2016-01-22 Thread Ted Yu
there is problem, please pastebin the stack trace. >>>>> >>>>> Thanks >>>>> >>>>> On Wed, Jan 20, 2016 at 5:41 PM, Ajinkya Kale <kaleajin...@gmail.com> >>>>> wrote: >>>>> >>>>>>

Re: HBase 0.98.0 with Spark 1.5.3 issue in yarn-cluster mode

2016-01-22 Thread Ajinkya Kale
e. >>>>>> >>>>>> On Wed, Jan 20, 2016 at 6:14 PM Ted Yu <yuzhih...@gmail.com> wrote: >>>>>> >>>>>>> 0.98.0 didn't have fix from HBASE-8 >>>>>>> >>>>>>> Please u

Re: HBase 0.98.0 with Spark 1.5.3 issue in yarn-cluster mode

2016-01-22 Thread Ajinkya Kale
is problem, please pastebin the stack trace. >>>> >>>> Thanks >>>> >>>> On Wed, Jan 20, 2016 at 5:41 PM, Ajinkya Kale <kaleajin...@gmail.com> >>>> wrote: >>>> >>>>> >>>>> I have posted t

Re: HBase 0.98.0 with Spark 1.5.3 issue in yarn-cluster mode

2016-01-22 Thread Ajinkya Kale
>>>> >>>>>> If still there is problem, please pastebin the stack trace. >>>>>> >>>>>> Thanks >>>>>> >>>>>> On Wed, Jan 20, 2016 at 5:41 PM, Ajinkya Kale <kaleajin...@gmail.com>

Re: spark job submisson on yarn-cluster mode failing

2016-01-21 Thread Soni spark
Hi, I am facing below error msg now. please help me. 2016-01-21 16:06:14,123 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /xxx.xx.xx.xx:50010 for block, add to deadNodes and continue. java.nio.channels.ClosedByInterruptException java.nio.channels.ClosedByInterruptException at

Re: spark job submisson on yarn-cluster mode failing

2016-01-21 Thread Akhil Das
Can you look in the executor logs and see why the sparkcontext is being shutdown? Similar discussion happened here previously. http://apache-spark-user-list.1001560.n3.nabble.com/RECEIVED-SIGNAL-15-SIGTERM-td23668.html Thanks Best Regards On Thu, Jan 21, 2016 at 5:11 PM, Soni spark

Re: spark job submisson on yarn-cluster mode failing

2016-01-21 Thread Ted Yu
Please also check AppMaster log. Thanks > On Jan 21, 2016, at 3:51 AM, Akhil Das wrote: > > Can you look in the executor logs and see why the sparkcontext is being > shutdown? Similar discussion happened here previously. >

spark job submisson on yarn-cluster mode failing

2016-01-21 Thread Soni spark
Hi Friends, My Spark job runs successfully in local mode but fails in cluster mode. Below is the error message I am getting. Can anyone help me? 16/01/21 16:38:07 INFO twitter4j.TwitterStreamImpl: Establishing connection. 16/01/21 16:38:07 INFO twitter.TwitterReceiver: Twitter receiver

Re: spark job submisson on yarn-cluster mode failing

2016-01-21 Thread Ted Yu
Exception below is at WARN level. Can you check hdfs healthiness ? Which hadoop version are you using ? There should be other fatal error if your job failed. Cheers On Thu, Jan 21, 2016 at 4:50 AM, Soni spark wrote: > Hi, > > I am facing below error msg now. please

Re: HBase 0.98.0 with Spark 1.5.3 issue in yarn-cluster mode

2016-01-20 Thread Ajinkya Kale
>>> On Wed, Jan 20, 2016 at 5:41 PM, Ajinkya Kale <kaleajin...@gmail.com> >>> wrote: >>> >>>> >>>> I have posted this on hbase user list but i thought makes more sense on >>>> spark user list. >>>> I am able to read th

HBase 0.98.0 with Spark 1.5.3 issue in yarn-cluster mode

2016-01-20 Thread Ajinkya Kale
I have posted this on hbase user list but I thought it makes more sense on spark user list. I am able to read the table in yarn-client mode from spark-shell but I have exhausted all online forums for options to get it working in the yarn-cluster mode through spark-submit. I am using this code

Re: HBase 0.98.0 with Spark 1.5.3 issue in yarn-cluster mode

2016-01-20 Thread Ted Yu
ut i thought makes more sense on > spark user list. > I am able to read the table in yarn-client mode from spark-shell but I > have exhausted all online forums for options to get it working in the > yarn-cluster mode through spark-submit. > > I am using this code-example > http://www

Re: HBase 0.98.0 with Spark 1.5.3 issue in yarn-cluster mode

2016-01-20 Thread Ajinkya Kale
>> spark user list. >> I am able to read the table in yarn-client mode from spark-shell but I >> have exhausted all online forums for options to get it working in the >> yarn-cluster mode through spark-submit. >> >> I am using this code-example >> http:

Re: HBase 0.98.0 with Spark 1.5.3 issue in yarn-cluster mode

2016-01-20 Thread Ted Yu
>> I am able to read the table in yarn-client mode from spark-shell but I >>> have exhausted all online forums for options to get it working in the >>> yarn-cluster mode through spark-submit. >>> >>> I am using this code-example >>> http://www.vidyasource.com/blog

Re: OOM on yarn-cluster mode

2016-01-19 Thread Saisai Shao
ouble when uploading spark jobs in yarn-cluster mode. While > the job works and completes in yarn-client mode, I hit the following error > when using spark-submit in yarn-cluster (simplified): > > 16/01/19 21:43:31 INFO hive.metastore: Connected to metastore. > 16/01/19 21:43:32 WARN util

OOM on yarn-cluster mode

2016-01-19 Thread Julio Antonio Soto
Hi, I'm having trouble when uploading spark jobs in yarn-cluster mode. While the job works and completes in yarn-client mode, I hit the following error when using spark-submit in yarn-cluster (simplified): 16/01/19 21:43:31 INFO hive.metastore: Connected to metastore. 16/01/19 21:43:32 WARN

Re: OOM on yarn-cluster mode

2016-01-19 Thread Julio Antonio Soto de Vicente
, >> >> I'm having trouble when uploading spark jobs in yarn-cluster mode. While the >> job works and completes in yarn-client mode, I hit the following error when >> using spark-submit in yarn-cluster (simplified): >> 16/01/19 21:43:31 INFO hive.metastore: Connected to

RE: Spark App -Yarn-Cluster-Mode ===> Hadoop_conf_**.zip file.

2016-01-18 Thread Siddharth Ubale
To: Siddharth Ubale <siddharth.ub...@syncoms.com> Cc: user@spark.apache.org Subject: Re: Spark App -Yarn-Cluster-Mode ===> Hadoop_conf_**.zip file. Interesting. Which hbase / Phoenix releases are you using ? The following method has been removed from Put: public Put setWriteToWAL(bool

Re: Spark App -Yarn-Cluster-Mode ===> Hadoop_conf_**.zip file.

2016-01-15 Thread Ted Yu
i, Jan 15, 2016 at 5:58 AM, Siddharth Ubale < siddharth.ub...@syncoms.com> wrote: > Hi, > > > > I am trying to run a Spark streaming application in yarn-cluster mode. > However I am facing an issue where the job ends asking for a particular > Hadoop_conf_**.zip file in hdf

Spark App -Yarn-Cluster-Mode ===> Hadoop_conf_**.zip file.

2016-01-15 Thread Siddharth Ubale
Hi, I am trying to run a Spark streaming application in yarn-cluster mode. However I am facing an issue where the job ends asking for a particular Hadoop_conf_**.zip file in hdfs location. Can anyone guide me with this? The application works fine in local mode only it stops abruptly for want

Re: Spark App -Yarn-Cluster-Mode ===> Hadoop_conf_**.zip file.

2016-01-15 Thread Ted Yu
at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > > at java.lang.reflect.Method.invoke(Method.java:497) > > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483) >

RE: Spark App -Yarn-Cluster-Mode ===> Hadoop_conf_**.zip file.

2016-01-15 Thread Siddharth Ubale
.ub...@syncoms.com> Cc: user@spark.apache.org Subject: Re: Spark App -Yarn-Cluster-Mode ===> Hadoop_conf_**.zip file. bq. check application tracking page: http://slave1:8088/proxy/application_1452763526769_0011/ Then, ...

Spark 1.5.2 streaming driver in YARN cluster mode on Hadoop 2.6 (on EMR 4.2) restarts after stop

2016-01-14 Thread Roberto Coluccio
Hi there, I'm facing a weird issue when upgrading from Spark 1.4.1 streaming driver on EMR 3.9 (hence Hadoop 2.4.0) to Spark 1.5.2 on EMR 4.2 (hence Hadoop 2.6.0). Basically, the very same driver which used to terminate after a timeout as expected, now does not. In particular, as long as the

Re: Unable to read JSON input in Spark (YARN Cluster)

2016-01-02 Thread Vijay Gharge
Hi Few suggestions: 1. Try storage mode as "memory and disk" both. >> to verify heap memory error 2. Try to copy and read json source file from local filesystem (i.e. Without hdfs) >> to verify minimum working code 3. Looks like some library issue which is causing lzo related error. On Saturday

Unable to read JSON input in Spark (YARN Cluster)

2016-01-01 Thread ๏̯͡๏
Version: Spark 1.5.2 *Spark built with Hive* git clone git://github.com/apache/spark.git ./make-distribution.sh --tgz -Phadoop-2.4 -Pyarn -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver *Input:* -sh-4.1$ hadoop fs -du -h /user/dvasthimal/poc_success_spark/data/input 2.5 G

Re: Spark 1.6 - YARN Cluster Mode

2015-12-21 Thread Akhil Das
​ Thanks Best Regards On Fri, Dec 18, 2015 at 1:33 AM, syepes <sye...@gmail.com> wrote: > Hello, > > This week I have been testing 1.6 (#d509194b) in our HDP 2.3 platform and > its been working pretty ok, at the exception of the YARN cluster deployment > mode. > Note that

Spark 1.6 - YARN Cluster Mode

2015-12-17 Thread syepes
Hello, This week I have been testing 1.6 (#d509194b) in our HDP 2.3 platform and it's been working pretty OK, with the exception of the YARN cluster deployment mode. Note that with 1.5 using the same "spark-props.conf" and "spark-env.sh" config files the cluster mode works as e

Re: Autoscaling of Spark YARN cluster

2015-12-14 Thread Deepak Sharma
e custom scaling metrics which you could use to > query the # of applications queued, # of resources available values and > add nodes when required. > > Cheers! > > On Mon, Dec 14, 2015 at 8:57 AM, Mingyu Kim <m...@palantir.com> wrote: > >> Hi all, >> >> H

Re: Autoscaling of Spark YARN cluster

2015-12-14 Thread cs user
required. Cheers! On Mon, Dec 14, 2015 at 8:57 AM, Mingyu Kim <m...@palantir.com> wrote: > Hi all, > > Has anyone tried out autoscaling Spark YARN cluster on a public cloud > (e.g. EC2) based on workload? To be clear, I’m interested in scaling the > cluster itself up and down b

Re: Autoscaling of Spark YARN cluster

2015-12-14 Thread Mingyu Kim
t;user@spark.apache.org" <user@spark.apache.org> Subject: Re: Autoscaling of Spark YARN cluster An approach I can think of is using Ambari Metrics Service(AMS) Using these metrics , you can decide upon if the cluster is low in resources. If yes, call the Ambari management API to add the n

Autoscaling of Spark YARN cluster

2015-12-14 Thread Mingyu Kim
Hi all, Has anyone tried out autoscaling a Spark YARN cluster on a public cloud (e.g. EC2) based on workload? To be clear, I'm interested in scaling the cluster itself up and down by adding and removing YARN nodes based on the cluster resource utilization (e.g. # of applications queued

Re: Question about yarn-cluster mode and spark.driver.allowMultipleContexts

2015-12-02 Thread Marcelo Vanzin
On Tue, Dec 1, 2015 at 9:43 PM, Anfernee Xu wrote: > But I have a single server(JVM) that is creating SparkContext, are you > saying Spark supports multiple SparkContext in the same JVM? Could you > please clarify on this? I'm confused. Nothing you said so far requires

Question about yarn-cluster mode and spark.driver.allowMultipleContexts

2015-12-01 Thread Anfernee Xu
Hi, I have a doubt regarding yarn-cluster mode and spark.driver.allowMultipleContexts for the use cases below. I have a long running backend server where I will create a short-lived Spark job in response to each user request, based on the fact that by default multiple Spark Context cannot be created

Re: Question about yarn-cluster mode and spark.driver.allowMultipleContexts

2015-12-01 Thread Ted Yu
owMultipleContexts", "true") .set("spark.driver.allowMultipleContexts", "true")) ./core/src/test/scala/org/apache/spark/SparkContextSuite.scala FYI On Tue, Dec 1, 2015 at 3:32 PM, Anfernee Xu <anfernee...@gmail.com> wrote: > Hi, > > I have a do

Re: Question about yarn-cluster mode and spark.driver.allowMultipleContexts

2015-12-01 Thread Ted Yu
Looks like #2 is better choice. On Tue, Dec 1, 2015 at 4:51 PM, Anfernee Xu <anfernee...@gmail.com> wrote: > Thanks Ted, so 1) is off from the table, can I go with 2), yarn-cluster > mode? As the driver is running as a Yarn container, it's should be OK for > my usercase, isn't i

Re: Question about yarn-cluster mode and spark.driver.allowMultipleContexts

2015-12-01 Thread Josh Rosen
w SparkConf().set("spark.driver.allowMultipleContexts", >>> "false") >>> .set("spark.driver.allowMultipleContexts", "true") >>> .set("spark.driver.allowMultipleContexts", "true")) >>> ./core/src/t

Re: Question about yarn-cluster mode and spark.driver.allowMultipleContexts

2015-12-01 Thread Anfernee Xu
Thanks Ted, so 1) is off the table, can I go with 2), yarn-cluster mode? As the driver is running as a Yarn container, it should be OK for my use case, isn't it? Anfernee On Tue, Dec 1, 2015 at 4:48 PM, Ted Yu <yuzhih...@gmail.com> wrote: > For #1, looks like the config is use

Re: Question about yarn-cluster mode and spark.driver.allowMultipleContexts

2015-12-01 Thread Marcelo Vanzin
JVM, looks like I have > 2 choices > > 2) run my jobs in yarn-cluster mode instead yarn-client There's nothing in yarn-client mode that prevents you from doing what you describe. If you write some server for users to submit jobs to, it should work whether you start the context in yarn-cli

Re: Question about yarn-cluster mode and spark.driver.allowMultipleContexts

2015-12-01 Thread Anfernee Xu
in the same JVM, looks like I > have > > 2 choices > > > > 2) run my jobs in yarn-cluster mode instead yarn-client > > There's nothing in yarn-client mode that prevents you from doing what > you describe. If you write some server for users to submit jobs to, it > shoul

how to get the tracking URL with ip address instead of hostname in yarn-cluster model

2015-11-15 Thread wangpan

Issue of jar dependency in yarn-cluster mode

2015-10-16 Thread Rex Xiong
-submit --master yarn-cluster --jars hdfs://../snakeyaml-1.10.jar .. java.lang.NoSuchMethodError: org.yaml.snakeyaml.Yaml.<init>(Lorg/yaml/snakeyaml/constructor/BaseConstructor;)V I check one executor container folder, snakeyaml-1.10.jar has been successfully downloaded, and in spark driver

HTTP 500 if try to access Spark UI in yarn-cluster (only)

2015-10-16 Thread Sebastian YEPES FERNANDEZ
Hello, I am wondering if anyone else is also facing this issue: https://issues.apache.org/jira/browse/SPARK-11147

Re: Issue of jar dependency in yarn-cluster mode

2015-10-16 Thread Rex Xiong
orks fine: > spark-submit --master local --jars d:\snakeyaml-1.10.jar ... > > But when I try to run it in yarn, I have issue, it seems spark executor > cannot find the jar file: > spark-submit --master yarn-cluster --jars hdfs://../snakeyaml-1.10.jar > ..

Re: yarn-cluster mode throwing NullPointerException

2015-10-12 Thread Venkatakrishnan Sowrirajan
Hi Rachana, Are you by any chance saying something like this in your code? "sparkConf.setMaster("yarn-cluster");" SparkContext is not supported with yarn-cluster mode. I think you are hitting this bug -- > https://issues.apache.org/jira/browse/SPARK-7504. This g

yarn-cluster mode throwing NullPointerException

2015-10-11 Thread Rachana Srivastava
I am trying to submit a job using yarn-cluster mode using spark-submit command. My code works fine when I use yarn-client mode. Cloudera Version: CDH-5.4.7-1.cdh5.4.7.p0.3 Command Submitted: spark-submit --class "com.markmonitor.antifraud.ce.KafkaURLStreaming" \ --driver-ja

Jar is cached in yarn-cluster mode?

2015-10-09 Thread Rex Xiong
I use "spark-submit -master yarn-cluster hdfs://.../a.jar .." to submit my app to yarn. Then I updated this a.jar in HDFS and ran the command again, but I found that a line of log output that had been removed still exists in "yarn logs ". Is there a cache mechanism I need to disable? Thanks

Best practices for scheduling Spark jobs on "shared" YARN cluster using Autosys

2015-09-25 Thread unk1102
-for-scheduling-Spark-jobs-on-shared-YARN-cluster-using-Autosys-tp24820.html

Re: Zeppelin on Yarn : org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submi

2015-09-19 Thread Ewan Leith
: org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit. It works using yarn-client but I want to run it on the cluster. Is there any way to do so? best, /Shahab On Fri, Sep 18

Zeppelin on Yarn : org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.

2015-09-18 Thread shahab
Hi, I probably have a wrong Zeppelin configuration, because I get the following error when I execute Spark statements in Zeppelin: org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use

Re: Zeppelin on Yarn : org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submi

2015-09-18 Thread Aniket Bhatnagar
I don't think yarn-cluster mode is currently supported. You may want to ask the Zeppelin community for confirmation, though. On Fri, Sep 18, 2015, 5:41 PM shahab <shahab.mok...@gmail.com> wrote: > It works using yarn-client but I want to run it on the cluster. Is > there any

Re: Zeppelin on Yarn : org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submi

2015-09-18 Thread shahab
b.mok...@gmail.com> wrote: > >> Hi, >> >> Probably I have wrong zeppelin configuration, because I get the >> following error when I execute spark statements in Zeppelin: >> >> org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't >>

Re: Zeppelin on Yarn : org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submi

2015-09-18 Thread Aniket Bhatnagar
Can you try yarn-client mode? On Fri, Sep 18, 2015, 3:38 PM shahab <shahab.mok...@gmail.com> wrote: > Hi, > > Probably I have wrong zeppelin configuration, because I get the following > error when I execute spark statements in Zeppelin: > > org.apache.spark.SparkExceptio

Re: spark not launching in yarn-cluster mode

2015-08-25 Thread Yanbo Liang
spark-shell and spark-sql cannot be deployed in yarn-cluster mode, because the spark-shell and spark-sql scripts need to run on your local machine rather than in a container of the YARN cluster. 2015-08-25 16:19 GMT+08:00 Jeetendra Gangele gangele...@gmail.com: Hi All i am trying to launch
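As a rough illustration of the rule in the reply above, the following Python sketch picks a master string for a given launcher script. The function name pick_master and the prefer_cluster flag are assumptions made for this example; only spark-shell and spark-sql are named in the reply.

```python
# Interactive launchers named in the reply above; they must run on the
# local machine, so they cannot use yarn-cluster mode.
INTERACTIVE_TOOLS = {"spark-shell", "spark-sql"}


def pick_master(tool: str, prefer_cluster: bool = True) -> str:
    """Return a YARN master string for the given launcher (hypothetical helper)."""
    if tool in INTERACTIVE_TOOLS:
        # The driver has to stay on the client side for interactive use.
        return "yarn-client"
    return "yarn-cluster" if prefer_cluster else "yarn-client"
```

For example, pick_master("spark-shell") yields yarn-client regardless of the preference, while a batch job submitted through spark-submit can use either mode.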

Re: spark not launching in yarn-cluster mode

2015-08-25 Thread Jeetendra Gangele
) at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.stopExecutors(CoarseGrainedSchedulerBackend.scala:257) On 25 August 2015 at 14:26, Yanbo Liang yblia...@gmail.com wrote: spark-shell and spark-sql can not be deployed with yarn-cluster mode, because you need to make spark-shell or spark-sql scripts

Re: Spark Streaming failing on YARN Cluster

2015-08-25 Thread Ramkumar V
the 'SparkContext', set the Master on yarn cluster ( SetMaster(yarn-cluster) ). It's working fine in cluster mode. Thanks, everyone. *Thanks*, https://in.linkedin.com/in/ramkumarcs31 On Fri, Aug 21, 2015 at 6:41 AM, Jeff Zhang zjf...@gmail.com wrote: AM fails to launch, could you check the yarn app logs

Re: Spark Streaming failing on YARN Cluster

2015-08-19 Thread Ramkumar V
resources. I suspect that you didn't put core-site.xml on your classpath, so Spark cannot detect your remote file system and won't copy the files to HDFS as local resources. Usually in yarn-cluster mode, you should be able to see logs like the following. 15/08/14 10:48:49 INFO

Re: issue Running Spark Job on Yarn Cluster

2015-08-19 Thread stark_summer
Please look more closely at the Hadoop logs, e.g. with yarn logs -applicationId xxx, and attach more logs to this topic. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/issue-Running-Spark-Job-on-Yarn-Cluster-tp21779p24350.html
