Re: Spark yarn cluster

2020-07-11 Thread Diwakar Dhanuskodi
Thanks Martin. I was not clear in my initial question. Thanks for understanding and clarifying. The idea, as you said, is to explore the possibility of using YARN for cluster scheduling with Spark being used without HDFS. Thanks again for the clarification. On Sat, Jul 11, 2020 at 1:27 PM Juan

Re: Spark yarn cluster

2020-07-11 Thread Juan Martín Guillén
Hi Diwakar, A YARN cluster not having Hadoop is kind of a fuzzy concept. You may well want to have Hadoop but not need MapReduce, and use Spark instead; that is the main reason to use Spark in a Hadoop cluster anyway. On the other hand it is highly probable you may want to use

Spark yarn cluster

2020-07-11 Thread Diwakar Dhanuskodi
Hi, Could it be possible to set up Spark within a YARN cluster which may not have Hadoop? Thanks.

Re: SPark - YARN Cluster Mode

2017-02-27 Thread ayan guha
Hi, Thanks a lot, I used a properties file to resolve the issue. I think the documentation should mention it, though. On Tue, 28 Feb 2017 at 5:05 am, Marcelo Vanzin wrote: > > none of my Config settings > Is it none of the configs or just the queue? You can't set the YARN queue

Re: SPark - YARN Cluster Mode

2017-02-27 Thread Marcelo Vanzin
> none of my Config settings
Is it none of the configs or just the queue? You can't set the YARN queue in cluster mode through code; it has to be set on the command line. It's a chicken & egg problem (in cluster mode, the YARN app is created before your code runs). --properties-file works the
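For illustration, here is a minimal sketch (not from the thread) of the submit-time approach Marcelo describes. The queue name and script name are just the examples used elsewhere in this thread, and the exact submit command is an assumption.

    # Submitted roughly as:
    #   spark-submit --master yarn --deploy-mode cluster \
    #     --conf spark.yarn.queue=root.Applications ayan_test.py
    # or with spark.yarn.queue placed in a file passed via --properties-file.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("Spark Ingestion")   # note: no spark.yarn.queue here
    sc = SparkContext(conf=conf)

    # The queue chosen at submit time is visible to the running application.
    print(sc.getConf().get("spark.yarn.queue", "default"))
    sc.stop()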

Re: SPark - YARN Cluster Mode

2017-02-26 Thread ayan guha
Also, I wanted to add that if I specify the conf on the command line, it seems to work. For example, if I use spark-submit --master yarn --deploy-mode cluster --conf spark.yarn.queue=root.Application ayan_test.py 10 then it goes to the correct queue. Any help would be great. Best, Ayan On

SPark - YARN Cluster Mode

2017-02-26 Thread ayan guha
Hi, I am facing an issue with cluster mode with PySpark. Here is my code:
conf = SparkConf()
conf.setAppName("Spark Ingestion")
conf.set("spark.yarn.queue", "root.Applications")
conf.set("spark.executor.instances", "50")

Re: Spark Yarn Cluster with Reference File

2016-09-23 Thread ayan guha
You may try copying the file to the same location on all nodes and reading it from that place. On 24 Sep 2016 00:20, "ABHISHEK" wrote: > I have tried with the hdfs /tmp location but it didn't work. Same error. > On 23 Sep 2016 19:37, "Aditya"

Re: Spark Yarn Cluster with Reference File

2016-09-23 Thread ABHISHEK
I have tried with the hdfs /tmp location but it didn't work. Same error. On 23 Sep 2016 19:37, "Aditya" wrote: > Hi Abhishek, > Try the spark-submit below: > spark-submit --master yarn --deploy-mode cluster --files hdfs://abc.com:8020/tmp/abc.drl --class

Re: Spark Yarn Cluster with Reference File

2016-09-23 Thread Aditya
Hi Abhishek, Try the spark-submit below: spark-submit --master yarn --deploy-mode cluster --files hdfs://abc.com:8020/tmp/abc.drl --class com.abc.StartMain abc-0.0.1-SNAPSHOT-jar-with-dependencies.jar abc.drl On Friday 23
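As a rough sketch (not from the thread) of how the application side usually picks the file up when it is shipped this way: with --files, YARN localizes abc.drl into the working directory of the driver and executor containers, so the code opens it by bare name rather than by the hdfs:// URI it was shipped from. The app name below is made up.

    # Assumes submission roughly as in Aditya's command above:
    #   spark-submit --master yarn --deploy-mode cluster \
    #     --files hdfs://abc.com:8020/tmp/abc.drl ... abc.drl
    from pyspark import SparkContext

    sc = SparkContext(appName="drl-reader")

    # The localized copy sits in the container's working directory,
    # so the bare file name is enough.
    with open("abc.drl") as f:
        rules = f.read()

    print(len(rules))
    sc.stop()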

Re: Spark Yarn Cluster with Reference File

2016-09-23 Thread ABHISHEK
Thanks for your response, Aditya and Steve. Steve: I have tried specifying both /tmp/filename in HDFS and a local path but it didn't work. You may be right that the Kie session is configured to access files from a local path. I have attached the code here for your reference and if you find something wrong,

Re: Spark Yarn Cluster with Reference File

2016-09-23 Thread Steve Loughran
On 23 Sep 2016, at 08:33, ABHISHEK wrote: at java.lang.Thread.run(Thread.java:745) Caused by: java.io.FileNotFoundException: hdfs:/abc.com:8020/user/abhietc/abc.drl (No such file or directory)

Re: Spark Yarn Cluster with Reference File

2016-09-23 Thread Aditya
Hi Abhishek, From your spark-submit it seems you're passing the file as a parameter to the driver program. So now it depends what exactly you are doing with that parameter. Using the --files option it will be available to all the worker nodes, but if in your code you are referencing it using the

Spark Yarn Cluster with Reference File

2016-09-23 Thread ABHISHEK
Hello there, I have a Spark application which refers to an external file ‘abc.drl’ containing unstructured data. The application is able to find this reference file if I run the app in local mode, but in YARN cluster mode it is not able to find the file at the specified path. I tried with both local

Re: Error trying to connect to Hive from Spark (Yarn-Cluster Mode)

2016-09-17 Thread Mich Talebzadeh
> Is your Hive Thrift Server up and running on port jdbc:hive2://10001? > Do the following >

RE: Error trying to connect to Hive from Spark (Yarn-Cluster Mode)

2016-09-17 Thread anupama . gangadhar
Is your Hive Thrift Server up and running on port jdbc:hive2://10001? Do the following: netstat -alnp | grep 10001 and see whether it is actually running. HTH Dr Mich Talebzadeh LinkedIn https://www.linkedin.com/profile/view

RE: Error trying to connect to Hive from Spark (Yarn-Cluster Mode)

2016-09-17 Thread anupama . gangadhar
In the cluster, transport mode is http and ssl is disabled. Thanks Anupama

Re: Error trying to connect to Hive from Spark (Yarn-Cluster Mode)

2016-09-16 Thread Deepak Sharma
Hi Anupama, To me it looks like an issue with the SPN with which you are trying to connect to hive2, i.e. hive@hostname. Are you able to connect to Hive from spark-shell? Try getting the ticket using any other user keytab, but not the hadoop services keytab, and then try running the spark-submit. Thanks

Re: Error trying to connect to Hive from Spark (Yarn-Cluster Mode)

2016-09-16 Thread Mich Talebzadeh
Is your Hive Thrift Server up and running on port jdbc:hive2://10001? Do the following: netstat -alnp | grep 10001 and see whether it is actually running. HTH Dr Mich Talebzadeh LinkedIn https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
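The same check can also be done from Python if netstat isn't handy; a small sketch, where the host name and port are placeholders (HiveServer2 commonly listens on 10000 for binary transport and 10001 for HTTP transport):

    import socket

    def port_open(host, port, timeout=3):
        # Returns True if something is listening on host:port.
        try:
            socket.create_connection((host, port), timeout).close()
            return True
        except (socket.error, OSError):
            return False

    print(port_open("hiveserver2.example.com", 10001))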

Error trying to connect to Hive from Spark (Yarn-Cluster Mode)

2016-09-16 Thread anupama . gangadhar
Hi, I am trying to connect to Hive from a Spark application in a Kerberized cluster and get the following exception. Spark version is 1.4.1 and Hive is 1.2.1. Outside of Spark the connection goes through fine. Am I missing any configuration parameters? java.sql.SQLException: Could not open
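One hedged alternative to opening a JDBC connection to HiveServer2 from inside the job is to let Spark talk to Hive through HiveContext (the Spark 1.4-era API), assuming hive-site.xml is on the classpath and the job is submitted with a user principal and keytab. The database, table, and keytab path below are placeholders, not from the thread.

    # Assumed submission, using spark-submit's Kerberos options:
    #   spark-submit --master yarn --deploy-mode cluster \
    #     --principal <user>@<REALM> --keytab /path/to/user.keytab app.py
    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(appName="hive-from-yarn-cluster")
    sqlContext = HiveContext(sc)

    # Runs against the Hive metastore/warehouse directly, no hive2 JDBC URL needed.
    sqlContext.sql("SELECT COUNT(*) FROM some_db.some_table").show()
    sc.stop()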

Re: Unable to read JSON input in Spark (YARN Cluster)

2016-01-02 Thread Vijay Gharge
Hi, a few suggestions:
1. Try the storage mode as "memory and disk" both. >> to verify the heap memory error
2. Try to copy and read the JSON source file from the local filesystem (i.e. without HDFS). >> to verify a minimum working code path
3. Looks like some library issue which is causing the lzo-related error.
On Saturday
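A rough sketch (Spark 1.5-era API, not the original poster's code) of suggestion 1: read the JSON input and persist it with both memory and disk, so that a pure executor-heap problem shows up differently than an in-memory-only cache would. The input path is the one mentioned later in this thread.

    from pyspark import SparkContext, StorageLevel
    from pyspark.sql import SQLContext

    sc = SparkContext(appName="json-read-check")
    sqlContext = SQLContext(sc)

    df = sqlContext.read.json("/user/dvasthimal/poc_success_spark/data/input")
    df.persist(StorageLevel.MEMORY_AND_DISK)   # spill to disk instead of caching only in heap

    print(df.count())
    sc.stop()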

Unable to read JSON input in Spark (YARN Cluster)

2016-01-01 Thread ๏̯͡๏
Version: Spark 1.5.2
*Spark built with Hive*
git clone git://github.com/apache/spark.git
./make-distribution.sh --tgz -Phadoop-2.4 -Pyarn -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver
*Input:*
-sh-4.1$ hadoop fs -du -h /user/dvasthimal/poc_success_spark/data/input 2.5 G

Re: Autoscaling of Spark YARN cluster

2015-12-14 Thread Deepak Sharma
...custom scaling metrics which you could use to query the # of applications queued, # of resources available values and add nodes when required. Cheers! On Mon, Dec 14, 2015 at 8:57 AM, Mingyu Kim <m...@palantir.com> wrote: > Hi all, > H
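For readers wanting something concrete to hang the idea on, here is a hedged sketch (not tied to CloudWatch or Ambari specifically) that polls the YARN ResourceManager REST API for queued applications and free memory. The RM host, port, and thresholds are made-up placeholders, while /ws/v1/cluster/metrics and its field names are the standard YARN endpoint.

    import json
    import urllib2   # Python 2 stdlib, matching the Spark 1.x era of this thread

    RM_METRICS_URL = "http://resourcemanager.example.com:8088/ws/v1/cluster/metrics"

    def should_scale_out(max_pending_apps=2, min_available_mb=8192):
        # clusterMetrics includes appsPending and availableMB, among others.
        metrics = json.load(urllib2.urlopen(RM_METRICS_URL))["clusterMetrics"]
        return (metrics["appsPending"] > max_pending_apps
                or metrics["availableMB"] < min_available_mb)

    if should_scale_out():
        # Hook in whatever actually adds nodes: CloudWatch/EC2 scaling, the Ambari API, etc.
        print("cluster looks saturated - request extra YARN NodeManagers")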

Re: Autoscaling of Spark YARN cluster

2015-12-14 Thread cs user
...required. Cheers! On Mon, Dec 14, 2015 at 8:57 AM, Mingyu Kim <m...@palantir.com> wrote: > Hi all, > Has anyone tried out autoscaling Spark YARN cluster on a public cloud (e.g. EC2) based on workload? To be clear, I'm interested in scaling the cluster itself up and down b

Re: Autoscaling of Spark YARN cluster

2015-12-14 Thread Mingyu Kim
An approach I can think of is using Ambari Metrics Service (AMS). Using these metrics, you can decide upon if the cluster is low in resources. If yes, call the Ambari management API to add the n

Autoscaling of Spark YARN cluster

2015-12-14 Thread Mingyu Kim
Hi all, Has anyone tried out autoscaling Spark YARN cluster on a public cloud (e.g. EC2) based on workload? To be clear, I'm interested in scaling the cluster itself up and down by adding and removing YARN nodes based on the cluster resource utilization (e.g. # of applications queued

spark yarn-cluster job failing in batch processing

2015-04-23 Thread sachin Singh
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-yarn-cluster-job-failing-in-batch-processing-tp22626.html

Spark yarn cluster Application Master not running yarn container

2014-11-25 Thread firemonk9
Thanks for the help. View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-yarn-cluster-Application-Master-not-running-yarn-container-tp19761.html