Re: how can i use spark with yarn cluster in java

2023-09-06 Thread Mich Talebzadeh
…destruction. On Wed, 6 Sept 2023 at 18:48, BCMS wrote: > i want to use yarn cluster with my current code. if i use > conf.set("spark.master","local[*]") inplace of > conf.set("spark.master","yarn"), everything is very well. but i try to use > yarn

how can i use spark with yarn cluster in java

2023-09-06 Thread BCMS
I want to use a YARN cluster with my current code. If I use conf.set("spark.master","local[*]") in place of conf.set("spark.master","yarn"), everything works well, but when I try to use yarn in setMaster, my code gives the error below. ``` pac
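
A minimal sketch of the submit side (class name and paths are hypothetical): in YARN mode the master is usually passed to spark-submit rather than hardcoded via conf.set, and the client needs HADOOP_CONF_DIR pointing at the cluster configuration — a bare conf.set("spark.master","yarn") with no Hadoop config visible is a common cause of errors like the one above.

```shell
# Hypothetical sketch: submit the same Java app to YARN instead of local[*].
# Spark resolves the ResourceManager address from the files under
# HADOOP_CONF_DIR, so it must point at the cluster's config directory.
export HADOOP_CONF_DIR=/etc/hadoop/conf    # assumed location

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.MyApp \
  myapp-1.0.jar
```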

Re: read dataset from only one node in YARN cluster

2023-08-18 Thread Mich Talebzadeh
author will in no case be liable for any monetary damages arising from such loss, damage or destruction. On Fri, 18 Aug 2023 at 17:17, marc nicole wrote: > Hi, > > Spark 3.2, Hadoop 3.2, using YARN cluster mode, if one wants to read a > dataset that is found in one n

read dataset from only one node in YARN cluster

2023-08-18 Thread marc nicole
Hi, Spark 3.2, Hadoop 3.2, using YARN cluster mode, if one wants to read a dataset that is found in one node of the cluster and not in the others, how to tell Spark that? I expect through DataframeReader and using path like *IP:port/pathOnLocalNode* PS: loading the dataset in HDFS

Re: Spark yarn cluster

2020-07-11 Thread Diwakar Dhanuskodi
Martín Guillén < juanmartinguil...@yahoo.com.ar> wrote: > Hi Diwakar, > > A Yarn cluster not having Hadoop is kind of a fuzzy concept. > > Definitely you may want to have Hadoop and don't need to use MapReduce and > use Spark instead. That is the main reason to use S

Re: Spark yarn cluster

2020-07-11 Thread Juan Martín Guillén
Hi Diwakar, A Yarn cluster not having Hadoop is kind of a fuzzy concept. Definitely you may want to have Hadoop and don't need to use MapReduce and use Spark instead. That is the main reason to use Spark in a Hadoop cluster anyway. On the other hand it is highly probable you may want to use

Spark yarn cluster

2020-07-11 Thread Diwakar Dhanuskodi
Hi, Would it be possible to set up Spark within a YARN cluster that may not have Hadoop? Thanks.

Re: Spark Cluster over yarn cluster monitoring

2019-10-29 Thread Chetan Khatri
Thanks Jörn On Sun, Oct 27, 2019 at 8:01 AM Jörn Franke wrote: > Use yarn queues: > > > https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html > > Am 27.10.2019 um 06:41 schrieb Chetan Khatri >: > >  > Could someone please help me to understand better.. > > On

Re: Spark Cluster over yarn cluster monitoring

2019-10-27 Thread Jörn Franke
Use yarn queues: https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html > Am 27.10.2019 um 06:41 schrieb Chetan Khatri : > >  > Could someone please help me to understand better.. > >> On Thu, Oct 17, 2019 at 7:41 PM Chetan Khatri >> wrote: >> Hi Users, >>

Re: Spark Cluster over yarn cluster monitoring

2019-10-26 Thread Chetan Khatri
Could someone please help me to understand better.. On Thu, Oct 17, 2019 at 7:41 PM Chetan Khatri wrote: > Hi Users, > > I do submit *X* number of jobs with Airflow to Yarn as a part of workflow > for *Y *customer. I could potentially run workflow for customer *Z *but I > need to check that how

Spark Cluster over yarn cluster monitoring

2019-10-17 Thread Chetan Khatri
Hi Users, I do submit *X* number of jobs with Airflow to Yarn as a part of workflow for *Y *customer. I could potentially run workflow for customer *Z *but I need to check that how much resources are available over the cluster so jobs for next customer should start. Could you please tell what is
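
The resource check described above can be done against the ResourceManager's REST API, which reports cluster-wide free memory and vcores. A hedged sketch — the host, port, and grep-based field extraction are assumptions, not anything from the thread:

```shell
# Extract the "availableMB" field from a /ws/v1/cluster/metrics JSON payload.
available_mb() {
  grep -o '"availableMB":[0-9]*' | cut -d: -f2
}

# In production this would poll the ResourceManager, e.g.:
#   curl -s http://resourcemanager:8088/ws/v1/cluster/metrics | available_mb
# Demonstrated here on a sample payload:
echo '{"clusterMetrics":{"appsRunning":3,"availableMB":40960}}' | available_mb
# → 40960
```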

Is there a difference between --proxy-user or HADOOP_USER_NAME in a non-Kerberized YARN cluster?

2019-05-16 Thread Jeff Evans
Let's suppose we're dealing with a non-secured (i.e. not Kerberized) YARN cluster. When I invoke spark-submit, is there a practical difference between specifying --proxy-user=foo (supposing impersonation is properly set up) or setting the environment variable HADOOP_USER_NAME=foo? Thanks for any
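
Both options can be sketched side by side (jar name is a placeholder). As far as I can tell, the main practical difference on a simple-auth cluster is that --proxy-user requires server-side impersonation rules while HADOOP_USER_NAME is simply trusted:

```shell
# Two ways to run the job as user "foo" on a non-Kerberized cluster.

# (1) proxy user: the submitting user impersonates "foo"; requires
#     hadoop.proxyuser.<submitter>.* rules in core-site.xml.
spark-submit --master yarn --deploy-mode cluster --proxy-user foo app.jar

# (2) asserted identity: with simple auth, Hadoop trusts the client-side
#     user name; no extra server-side configuration is needed.
HADOOP_USER_NAME=foo spark-submit --master yarn --deploy-mode cluster app.jar
```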

Re: How to clean up logs-dirs and local-dirs of running spark streaming in yarn cluster mode

2018-12-26 Thread shyla deshpande
to any documentation if available. Thanks >> >> On Tue, Dec 18, 2018 at 11:10 AM shyla deshpande < >> deshpandesh...@gmail.com> wrote: >> >>> Is there a way to do this without stopping the streaming application in >>> yarn cluster mode? >>> >

Re: How to clean up logs-dirs and local-dirs of running spark streaming in yarn cluster mode

2018-12-25 Thread Fawze Abujaber
…a way to do this without stopping the streaming application in >> yarn cluster mode? >> >> On Mon, Dec 17, 2018 at 4:42 PM shyla deshpande >> wrote: >> >>> I get the ERROR >>> 1/1 local-dirs are bad: /mnt/yarn; 1/1 log-dirs are bad: >>> /var/l

Re: How to clean up logs-dirs and local-dirs of running spark streaming in yarn cluster mode

2018-12-25 Thread shyla deshpande
Please point me to any documentation if available. Thanks On Tue, Dec 18, 2018 at 11:10 AM shyla deshpande wrote: > Is there a way to do this without stopping the streaming application in > yarn cluster mode? > > On Mon, Dec 17, 2018 at 4:42 PM shyla deshpande > wrote: >

Re: How to clean up logs-dirs and local-dirs of running spark streaming in yarn cluster mode

2018-12-18 Thread shyla deshpande
Is there a way to do this without stopping the streaming application in yarn cluster mode? On Mon, Dec 17, 2018 at 4:42 PM shyla deshpande wrote: > I get the ERROR > 1/1 local-dirs are bad: /mnt/yarn; 1/1 log-dirs are bad: > /var/log/hadoop-yarn/containers > > Is there a

How to clean up logs-dirs and local-dirs of running spark streaming in yarn cluster mode

2018-12-17 Thread shyla deshpande
I get the ERROR 1/1 local-dirs are bad: /mnt/yarn; 1/1 log-dirs are bad: /var/log/hadoop-yarn/containers Is there a way to clean up these directories while the spark streaming application is running? Thanks

Restarting a failed Spark streaming job running on top of a yarn cluster

2018-10-03 Thread jcgarciam
Hi Folks, We have a few Spark streaming jobs running on a YARN cluster, and from time to time a job needs to be restarted (it was killed for external or other reasons). Once we submit the new job we are faced with the following exception: ERROR spark.SparkContext: Failed to add /mnt/data1

Re: issue Running Spark Job on Yarn Cluster

2018-09-16 Thread sivasonai
Come across such issue in our project and got it resolved by clearing the space under hdfs directory - "/user/spark". Please check if you have enough space/privileges for this hdfs directory - "/user/spark" -- Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

Re: Spark application complete it's job successfully on Yarn cluster but yarn register it as failed

2018-06-20 Thread Sonal Goyal
Have you checked the logs - they probably should have some more details. On Wed 20 Jun, 2018, 2:51 PM Soheil Pourbafrani, wrote: > Hi, > > I run a Spark application on Yarn cluster and it complete the process > successfully, but at the end Yarn print in the console: > &g

Spark application complete it's job successfully on Yarn cluster but yarn register it as failed

2018-06-20 Thread Soheil Pourbafrani
Hi, I run a Spark application on a YARN cluster and it completes the process successfully, but at the end YARN prints in the console: client token: N/A diagnostics: Application application_1529485137783_0004 failed 4 times due to AM Container for appattempt_1529485137783_0004_04 exited

Re: Does Spark shows logical or physical plan when executing job on the yarn cluster

2018-05-20 Thread Ajay
we can see spark logical or physical > plan while running spark job on the yarn cluster( Eg: like number of > stages) > > > Thanks in advance. > > > Thanks, > Giri >

Does Spark shows logical or physical plan when executing job on the yarn cluster

2018-05-20 Thread giri ar
Hi, Good Day. Could you please let me know whether we can see spark logical or physical plan while running spark job on the yarn cluster( Eg: like number of stages) Thanks in advance. Thanks, Giri

Exception thrown in awaitResult during application launch in yarn cluster

2018-05-18 Thread Shiyuan
Hi Spark-users, I am using pyspark on a yarn cluster. One of my spark application launch failed. Only the driver container had started before it failed on the ACCEPTED state. The error message is very short and I cannot make sense of it. The error message is attached below. Any possible causes

Re: Error submitting Spark Job in yarn-cluster mode on EMR

2018-05-08 Thread Marco Mistroni
…program that works fine in the local mode. But I am having > issues when I try to run the program in yarn-cluster mode. I know usually > no such method happens when compile and run version mismatch but I made > sure > I took the same version. > > 205 [main] INFO org.spark

Error submitting Spark Job in yarn-cluster mode on EMR

2018-05-08 Thread SparkUser6
I have a simple program that works fine in local mode. But I am having issues when I try to run the program in yarn-cluster mode. I know "no such method" usually happens when the compile and runtime versions mismatch, but I made sure I took the same version. 205 [main] INFO

Re: run spark job in yarn cluster mode as specified user

2018-01-22 Thread sd wang
Thanks! I finally make this work, except parameter LinuxContainerExecutor and cache directory permissions, the following parameter also need to be updated to specified user. yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user Thanks. 2018-01-22 22:44 GMT+08:00 Margusja

Re: run spark job in yarn cluster mode as specified user

2018-01-22 Thread Margusja
Hi org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor requires user in each node and right permissions set in necessary directories. Br Margus > On 22 Jan 2018, at 13:41, sd wang wrote: > >

Re: run spark job in yarn cluster mode as specified user

2018-01-22 Thread Jörn Franke
Configure Kerberos > On 22. Jan 2018, at 08:28, sd wang <pingwang1...@gmail.com> wrote: > > Hi Advisers, > When submit spark job in yarn cluster mode, the job will be executed by > "yarn" user. Any parameters can change the user? I tried setting > HADOOP_

Re: run spark job in yarn cluster mode as specified user

2018-01-22 Thread sd wang
user who executes script. > > Br > Margus > > > > On 22 Jan 2018, at 09:28, sd wang <pingwang1...@gmail.com> wrote: > > Hi Advisers, > When submit spark job in yarn cluster mode, the job will be executed by > "yarn" user. Any parameters can change the

Re: run spark job in yarn cluster mode as specified user

2018-01-21 Thread Margusja
Margus > On 22 Jan 2018, at 09:28, sd wang <pingwang1...@gmail.com> wrote: > > Hi Advisers, > When submit spark job in yarn cluster mode, the job will be executed by > "yarn" user. Any parameters can change the user? I tried setting > HADOOP_USER_NAME but it

run spark job in yarn cluster mode as specified user

2018-01-21 Thread sd wang
Hi Advisers, When submitting a Spark job in yarn cluster mode, the job is executed by the "yarn" user. Can any parameter change the user? I tried setting HADOOP_USER_NAME but it did not work. I'm using Spark 2.2. Thanks for any help!
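
The later replies in this thread converge on the LinuxContainerExecutor settings; a hedged sketch of both halves (user name is illustrative):

```shell
# On a non-Kerberized cluster the asserted identity can be overridden per
# submission:
HADOOP_USER_NAME=etluser spark-submit --master yarn --deploy-mode cluster app.jar

# To make YARN actually launch containers as that OS user, each NodeManager's
# yarn-site.xml needs (per the replies in this thread):
#   yarn.nodemanager.container-executor.class =
#     org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
#   yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user = etluser
# and the user must exist on every node with the right directory permissions.
```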

Re: Spark application on yarn cluster clarification

2018-01-18 Thread Fawze Abujaber
Hi Soheil, ResourceManager and NodeManager are enough; of course you need the roles of DataNode and NameNode to be able to access the data. On Thu, 18 Jan 2018 at 10:12 Soheil Pourbafrani <soheil.i...@gmail.com> wrote: > I am setting up a Yarn cluster to run Spark applications on that

Spark application on yarn cluster clarification

2018-01-18 Thread Soheil Pourbafrani
I am setting up a YARN cluster to run Spark applications, but I'm a bit confused! Say I have a 4-node YARN cluster including one ResourceManager and 3 NodeManagers, and Spark is installed on all 4 nodes. Now my question is: when I want to submit a Spark application to the yarn cluster

Re: update LD_LIBRARY_PATH when running apache job in a YARN cluster

2018-01-17 Thread Keith Chapman
Ballesteros < manuel...@garvan.org.au> wrote: > Dear Spark community, > > > > I have a spark running in a yarn cluster and I am getting some error when > trying to run my python application. > > > > /home/mansop/virtenv/bin/python2.7: error while loading sha

update LD_LIBRARY_PATH when running apache job in a YARN cluster

2018-01-17 Thread Manuel Sopena Ballesteros
Dear Spark community, I have a spark running in a yarn cluster and I am getting some error when trying to run my python application. /home/mansop/virtenv/bin/python2.7: error while loading shared libraries: libpython2.7.so.1.0: cannot open shared object file: No such file or directory
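
A hedged sketch of how container environments are usually set for this kind of shared-library error (the path is taken from the error message above; the flags assume Spark on YARN):

```shell
# Point both the executors and the cluster-mode driver (ApplicationMaster)
# at the virtualenv's libpython location.
spark-submit --master yarn --deploy-mode cluster \
  --conf spark.executorEnv.LD_LIBRARY_PATH=/home/mansop/virtenv/lib \
  --conf spark.yarn.appMasterEnv.LD_LIBRARY_PATH=/home/mansop/virtenv/lib \
  my_app.py
```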

Re: Unable to run Spark Jobs in yarn cluster mode

2017-10-10 Thread mailfordebu
…On Tue, Oct 10, 2017 at 7:02 AM, Debabrata Ghosh <mailford...@gmail.com> >> wrote: >> Hi All, >> I am constantly hitting an error : "ApplicationMaster: >> SparkContext did not initialize after waiting for 100 ms" while running my >>

Re: Unable to run Spark Jobs in yarn cluster mode

2017-10-10 Thread Vadim Semenov
ApplicationMaster: > SparkContext did not initialize after waiting for 100 ms" while running my > Spark code in yarn cluster mode. > > Here is the command what I am using :* spark-submit --master yarn > --deploy-mode cluster spark_code.py* > > Pleas

Unable to run Spark Jobs in yarn cluster mode

2017-10-10 Thread Debabrata Ghosh
Hi All, I am constantly hitting an error: "ApplicationMaster: SparkContext did not initialize after waiting for 100 ms" while running my Spark code in yarn cluster mode. Here is the command I am using: *spark-submit --master yarn --deploy-mode cluster spa
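
This error usually means no SparkContext was created within spark.yarn.am.waitTime (100s by default), e.g. because the script does heavy work before building the context. A hedged sketch of the two usual remedies (file name taken from the command above):

```shell
# Either create the SparkContext first thing in spark_code.py, or give the
# ApplicationMaster longer to wait for it:
spark-submit --master yarn --deploy-mode cluster \
  --conf spark.yarn.am.waitTime=300s \
  spark_code.py
```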

Re: Multiple vcores per container when running Spark applications in Yarn cluster mode

2017-09-11 Thread Gourav Sengupta
…Thanks >> Jerry >> >> On Sat, Sep 9, 2017 at 6:54 AM, Xiaoye Sun <sunxiaoy...@gmail.com> wrote: >> >>> Hi, >>> >>> I am using Spark 1.6.1 and Yarn 2.7.4. >>> I want to submit a Spark application to a Yarn cluster. Ho

Re: Multiple vcores per container when running Spark applications in Yarn cluster mode

2017-09-11 Thread Xiaoye Sun
>> >> I am wondering that is it possible to assign multiple vcores to a >> container when a Spark job is submitted to a Yarn cluster in yarn-cluster >> mode. >> >> Thanks! >> Best, >> Xiaoye >> > >

Re: Multiple vcores per container when running Spark applications in Yarn cluster mode

2017-09-10 Thread Saisai Shao
…Sep 9, 2017 at 6:54 AM, Xiaoye Sun <sunxiaoy...@gmail.com> wrote: > Hi, > > I am using Spark 1.6.1 and Yarn 2.7.4. > I want to submit a Spark application to a Yarn cluster. However, I found > that the number of vcores assigned to a container/executor is always 1, > even if I

Multiple vcores per container when running Spark applications in Yarn cluster mode

2017-09-08 Thread Xiaoye Sun
Hi, I am using Spark 1.6.1 and Yarn 2.7.4. I want to submit a Spark application to a Yarn cluster. However, I found that the number of vcores assigned to a container/executor is always 1, even if I set spark.executor.cores=2. I also found the number of tasks an executor runs concurrently is 2. So
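
One likely explanation, offered here as an assumption rather than the thread's confirmed resolution: with the CapacityScheduler's DefaultResourceCalculator, YARN schedules and reports on memory only, so every container shows 1 vcore regardless of spark.executor.cores. Sketch:

```shell
# capacity-scheduler.xml on the ResourceManager — make vcores count:
#   yarn.scheduler.capacity.resource-calculator =
#     org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
# then submit with the desired core count:
spark-submit --master yarn --deploy-mode cluster --executor-cores 2 app.jar
```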

Re: How to configure spark on Yarn cluster

2017-07-28 Thread yohann jardin
…user@spark.apache.org Subject: Re: How to configure spark on Yarn cluster Not sure that we are OK on one thing: Yarn limitations are for the sum of all nodes, while you only specify the memory for a single node through Spark. By the way, the memory displayed in the

Re: How to configure spark on Yarn cluster

2017-07-28 Thread jeff saremi
…Re: How to configure spark on Yarn cluster Not sure that we are OK on one thing: Yarn limitations are for the sum of all nodes, while you only specify the memory for a single node through Spark. By the way, the memory displayed in the UI is only a part of the total memory allocation:

Re: How to configure spark on Yarn cluster

2017-07-28 Thread yohann jardin
…user@spark.apache.org Subject: Re: How to configure spark on Yarn cluster Check the executor page of the Spark UI, to check if your storage level is limiting. Also, instead of starting with 100 TB of data, sample it, make it work, and grow it little by little until you reached 100 TB.

Re: How to configure spark on Yarn cluster

2017-07-28 Thread jeff saremi
From: yohann jardin <yohannjar...@hotmail.com> Sent: Thursday, July 27, 2017 11:15:39 PM To: jeff saremi; user@spark.apache.org Subject: Re: How to configure spark on Yarn cluster Check the executor page of the Spark UI, to check if your storage level is limiting. Also, i

Re: How to configure spark on Yarn cluster

2017-07-28 Thread yohann jardin
Check the executor page of the Spark UI, to check if your storage level is limiting. Also, instead of starting with 100 TB of data, sample it, make it work, and grow it little by little until you reached 100 TB. This will validate the workflow and let you see how much data is shuffled, etc.

How to configure spark on Yarn cluster

2017-07-28 Thread jeff saremi
I have the simplest job which i'm running against 100TB of data. The job keeps failing with ExecutorLostFailure's on containers killed by Yarn for exceeding memory limits I have varied the executor-memory from 32GB to 96GB, the spark.yarn.executor.memoryOverhead from 8192 to 36000 and similar
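
The knobs being varied can be sketched as one submission (numbers illustrative); raising the shuffle partition count is often as effective as raising memory, since smaller partitions lower each task's peak footprint:

```shell
spark-submit --master yarn --deploy-mode cluster \
  --executor-memory 32g \
  --conf spark.yarn.executor.memoryOverhead=8192 \
  --conf spark.sql.shuffle.partitions=4000 \
  app.jar
```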

Spark don't run all code when is submit to yarn-cluster mode.

2017-06-15 Thread Cosmin Posteuca
, TimeUnit.SECONDS), Duration(5, TimeUnit.SECONDS), task) My problem is when i run the code with yarn-cluster, the SparkSession it's immediately closed after init, and the SparkSession don't wait after the tasks that will be scheduled later. But in yarn-client everything it's ok. SparkSession it's

One question / kerberos, yarn-cluster -> connection to hbase

2017-05-25 Thread sudhir37
Facing one issue with Kerberos enabled Hadoop/CDH cluster. We are trying to run a streaming job on yarn-cluster, which interacts with Kafka (direct stream), and hbase. Somehow, we are not able to connect to hbase in the cluster mode. We use keytab to login to hbase. This is what we do: spark

Re: One question / kerberos, yarn-cluster -> connection to hbase

2017-05-24 Thread Michael Gummelt
<sud...@infoobjects.com> wrote: > Facing one issue with Kerberos enabled Hadoop/CDH cluster. > > > > We are trying to run a streaming job on yarn-cluster, which interacts with > Kafka (direct stream), and hbase. > > > > Somehow, we are not able to connect

One question / kerberos, yarn-cluster -> connection to hbase

2017-05-24 Thread Sudhir Jangir
Facing one issue with Kerberos enabled Hadoop/CDH cluster. We are trying to run a streaming job on yarn-cluster, which interacts with Kafka (direct stream), and hbase. Somehow, we are not able to connect to hbase in the cluster mode. We use keytab to login to hbase. This is what we

spark on yarn cluster model can't use saveAsTable ?

2017-05-15 Thread lk_spark
hi, all: I have a test under Spark 2.1.0 which reads txt files as a DataFrame and saves to Hive. When I submit the app jar with yarn client mode it works well, but if I submit with cluster mode, it will not create the table or write data, and I didn't find any error log... can anybody

Re: Monitoring ongoing Spark Job when run in Yarn Cluster mode

2017-03-13 Thread Marcelo Vanzin
It's linked from the YARN RM's Web UI (see the "Application Master" link for the running application). On Mon, Mar 13, 2017 at 6:53 AM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote: > Hi, > > Is there a way to monitor an ongoing Spark Job when running in Yarn Cl

Re: Monitoring ongoing Spark Job when run in Yarn Cluster mode

2017-03-13 Thread Nirav Patel
mde...@gmail.com> wrote: > Hi, > > Is there a way to monitor an ongoing Spark Job when running in Yarn > Cluster mode ? > > In my understanding in Yarn Cluster mode Spark Monitoring UI for the > ongoing job would not be available in 4040 port. So is there an alternativ

Monitoring ongoing Spark Job when run in Yarn Cluster mode

2017-03-13 Thread Sourav Mazumder
Hi, Is there a way to monitor an ongoing Spark Job when running in Yarn Cluster mode ? In my understanding in Yarn Cluster mode Spark Monitoring UI for the ongoing job would not be available in 4040 port. So is there an alternative ? Regards, Sourav
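
A sketch of the alternative discussed in the replies (application id and host are illustrative): the driver UI still exists in yarn-cluster mode, but it lives on the ApplicationMaster's node and is reached through the ResourceManager's proxy rather than port 4040 on the client.

```shell
yarn application -list        # find the running application's id
# then open the proxied UI:
#   http://<resourcemanager>:8088/proxy/application_1489000000000_0001/
```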

Re: SPark - YARN Cluster Mode

2017-02-27 Thread ayan guha
Hi Thanks a lot, i used property file to resolve the issue. I think documentation should mention it though. On Tue, 28 Feb 2017 at 5:05 am, Marcelo Vanzin wrote: > > none of my Config settings > > Is it none of the configs or just the queue? You can't set the YARN > queue

Re: SPark - YARN Cluster Mode

2017-02-27 Thread Marcelo Vanzin
> none of my Config settings Is it none of the configs or just the queue? You can't set the YARN queue in cluster mode through code, it has to be set in the command line. It's a chicken & egg problem (in cluster mode, the YARN app is created before your code runs). --property-file works the

Re: SPark - YARN Cluster Mode

2017-02-26 Thread ayan guha
Also, I wanted to add if I specify the conf in the command line, it seems to be working. For example, if I use spark-submit --master yarn --deploy-mode cluster --conf spark.yarn.queue=root.Application ayan_test.py 10 Then it is going to correct queue. Any help would be great Best Ayan On

SPark - YARN Cluster Mode

2017-02-26 Thread ayan guha
Hi I am facing an issue with Cluster Mode, with pyspark Here is my code: conf = SparkConf() conf.setAppName("Spark Ingestion") conf.set("spark.yarn.queue","root.Applications") conf.set("spark.executor.instances","50")

Re: Delegation Token renewal in yarn-cluster

2016-11-04 Thread Marcelo Vanzin
On Fri, Nov 4, 2016 at 1:57 AM, Zsolt Tóth wrote: > This was what confused me in the first place. Why does Spark ask for new > tokens based on the renew-interval instead of the max-lifetime? It could be just a harmless bug, since tokens have a "getMaxDate()" method

Re: Delegation Token renewal in yarn-cluster

2016-11-04 Thread Steve Loughran
On 4 Nov 2016, at 01:37, Marcelo Vanzin > wrote: On Thu, Nov 3, 2016 at 3:47 PM, Zsolt Tóth > wrote: What is the purpose of the delegation token renewal (the one that is done automatically

Re: Delegation Token renewal in yarn-cluster

2016-11-04 Thread Zsolt Tóth
I checked the logs of my tests, and found that the Spark schedules the token refresh based on the renew-interval property, not the max-lifetime. The settings in my tests: dfs.namenode.delegation.key.update-interval=52 dfs.namenode.delegation.token.max-lifetime=102

Re: Delegation Token renewal in yarn-cluster

2016-11-03 Thread Marcelo Vanzin
On Thu, Nov 3, 2016 at 3:47 PM, Zsolt Tóth wrote: > What is the purpose of the delegation token renewal (the one that is done > automatically by Hadoop libraries, after 1 day by default)? It seems that it > always happens (every day) until the token expires, no matter

Re: Delegation Token renewal in yarn-cluster

2016-11-03 Thread Zsolt Tóth
Thank you for the clarification Marcelo, makes sense. I'm thinking about 2 questions here, somewhat unrelated to the original problem. What is the purpose of the delegation token renewal (the one that is done automatically by Hadoop libraries, after 1 day by default)? It seems that it always

Re: Delegation Token renewal in yarn-cluster

2016-11-03 Thread Marcelo Vanzin
I think you're a little confused about what "renewal" means here, and this might be the fault of the documentation (I haven't read it in a while). The existing delegation tokens will always be "renewed", in the sense that Spark (actually Hadoop code invisible to Spark) will talk to the NN to

Re: Delegation Token renewal in yarn-cluster

2016-11-03 Thread Zsolt Tóth
Yes, I did change dfs.namenode.delegation.key.update-interval and dfs.namenode.delegation.token.renew-interval to 15 min, the max-lifetime to 30min. In this case the application (without Spark having the keytab) did not fail after 15 min, only after 30 min. Is it possible that the resource manager

Re: Delegation Token renewal in yarn-cluster

2016-11-03 Thread Marcelo Vanzin
Sounds like your test was set up incorrectly. The default TTL for tokens is 7 days. Did you change that in the HDFS config? The issue definitely exists and people definitely have run into it. So if you're not hitting it, it's most definitely an issue with your test configuration. On Thu, Nov 3,

Re: Delegation Token renewal in yarn-cluster

2016-11-03 Thread Zsolt Tóth
Any ideas about this one? Am I missing something here? 2016-11-03 15:22 GMT+01:00 Zsolt Tóth : > Hi, > > I ran some tests regarding Spark's Delegation Token renewal mechanism. As > I see, the concept here is simple: if I give my keytab file and client > principal to

Delegation Token renewal in yarn-cluster

2016-11-03 Thread Zsolt Tóth
Hi, I ran some tests regarding Spark's Delegation Token renewal mechanism. As I see, the concept here is simple: if I give my keytab file and client principal to Spark, it starts a token renewal thread, and renews the namenode delegation tokens after some time. This works fine. Then I tried to
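
The keytab-based setup described can be sketched as follows (principal and path are placeholders):

```shell
# Handing Spark the keytab lets it log in again and obtain fresh delegation
# tokens, so the job can outlive the tokens' max lifetime.
spark-submit --master yarn --deploy-mode cluster \
  --principal etl-user@EXAMPLE.COM \
  --keytab /etc/security/keytabs/etl-user.keytab \
  app.jar
```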

Re: Run spark-shell inside Docker container against remote YARN cluster

2016-10-27 Thread Marco Mistroni
ild docker image for spark? I want to build docker image with spark inside but configured against remote YARN cluster. I have already created image with spark 1.6.2 inside. But when I run spark-shell --master yarn --deploy-mode client --driver-memory 32G --executor-memory 32G --executor-cores 8 ins

Run spark-shell inside Docker container against remote YARN cluster

2016-10-27 Thread ponkin
Hi, May be someone already had experience to build docker image for spark? I want to build docker image with spark inside but configured against remote YARN cluster. I have already created image with spark 1.6.2 inside. But when I run spark-shell --master yarn --deploy-mode client --driver-memory

Re: Getting the IP address of Spark Driver in yarn-cluster mode

2016-10-25 Thread Masood Krohy
…Steve Loughran <ste...@hortonworks.com> To: Masood Krohy <masood.kr...@intact.net> Cc: "user@spark.apache.org" <user@spark.apache.org> Date: 2016-10-24 17:09 Subject: Re: Getting the IP address of Spark Driver in yarn-cluster mode On 24 Oct 2016, at 19:34, Masood Krohy <mas

Re: Getting the IP address of Spark Driver in yarn-cluster mode

2016-10-24 Thread Steve Loughran
On 24 Oct 2016, at 19:34, Masood Krohy <masood.kr...@intact.net<mailto:masood.kr...@intact.net>> wrote: Hi everyone, Is there a way to set the IP address/hostname that the Spark Driver is going to be running on when launching a program through spark-submit in yarn-cluster mode (P

Getting the IP address of Spark Driver in yarn-cluster mode

2016-10-24 Thread Masood Krohy
Hi everyone, Is there a way to set the IP address/hostname that the Spark Driver is going to be running on when launching a program through spark-submit in yarn-cluster mode (PySpark 1.6.0)? I do not see an option for this. If not, is there a way to get this IP address after the Spark app has

Re: Pyspark not working on yarn-cluster mode

2016-09-27 Thread ofer
I advice you to use livy for this purpose. Livy works well with yarn and it will decouple spark from your web app. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Pyspark-not-working-on-yarn-cluster-mode-tp23755p27799.html Sent from the Apache Spark User

Re: Spark Yarn Cluster with Reference File

2016-09-23 Thread ayan guha
You may try copying the file to same location on all nodes and try to read from that place On 24 Sep 2016 00:20, "ABHISHEK" wrote: > I have tried with hdfs/tmp location but it didn't work. Same error. > > On 23 Sep 2016 19:37, "Aditya"

Re: Spark Yarn Cluster with Reference File

2016-09-23 Thread ABHISHEK
I have tried with hdfs/tmp location but it didn't work. Same error. On 23 Sep 2016 19:37, "Aditya" wrote: > Hi Abhishek, > > Try below spark submit. > spark-submit --master yarn --deploy-mode cluster --files hdfs:// > abc.com:8020/tmp/abc.drl --class

Re: Spark Yarn Cluster with Reference File

2016-09-23 Thread Aditya
Hi Abhishek, Try below spark submit. spark-submit --master yarn --deploy-mode cluster --files hdfs://abc.com:8020/tmp/abc.drl --class com.abc.StartMain abc-0.0.1-SNAPSHOT-jar-with-dependencies.jar abc.drl On Friday 23

Re: Spark Yarn Cluster with Reference File

2016-09-23 Thread ABHISHEK
Thanks for your response Aditya and Steve. Steve: I have tried specifying both /tmp/filename in hdfs and a local path but it didn't work. You may be right that the Kie session is configured to access files from a local path. I have attached the code here for your reference, and if you find something wrong,

Re: Spark Yarn Cluster with Reference File

2016-09-23 Thread Steve Loughran
On 23 Sep 2016, at 08:33, ABHISHEK > wrote: at java.lang.Thread.run(Thread.java:745) Caused by: java.io.FileNotFoundException: hdfs:/abc.com:8020/user/abhietc/abc.drl (No such file or directory)

Re: Spark Yarn Cluster with Reference File

2016-09-23 Thread Aditya
Hi Abhishek, From your spark-submit it seems your passing the file as a parameter to the driver program. So now it depends what exactly you are doing with that parameter. Using --files option it will be available to all the worker nodes but if in your code if you are referencing using the

Spark Yarn Cluster with Reference File

2016-09-23 Thread ABHISHEK
Hello there, I have a Spark application which refers to an external file ‘abc.drl’ containing unstructured data. The application is able to find this reference file if I run the app in local mode, but in YARN with cluster mode, it is not able to find the file at the specified path. I tried with both local

Re: Error trying to connect to Hive from Spark (Yarn-Cluster Mode)

2016-09-17 Thread Mich Talebzadeh
6 AM > *To:* Gangadhar, Anupama (623) > *Cc:* user @spark > *Subject:* Re: Error trying to connect to Hive from Spark (Yarn-Cluster > Mode) > > > > Is your Hive Thrift Server up and running on port > jdbc:hive2://10001? > > > > Do the following >

RE: Error trying to connect to Hive from Spark (Yarn-Cluster Mode)

2016-09-17 Thread anupama . gangadhar
…trying to connect to Hive from Spark (Yarn-Cluster Mode) Is your Hive Thrift Server up and running on port jdbc:hive2://10001? Do the following netstat -alnp |grep 10001 and see whether it is actually running HTH Dr Mich Talebzadeh LinkedIn https://www.linkedin.com/profile/view

RE: Error trying to connect to Hive from Spark (Yarn-Cluster Mode)

2016-09-17 Thread anupama . gangadhar
? In the cluster, transport mode is http and ssl is disabled. Thanks Anupama From: Deepak Sharma [mailto:deepakmc...@gmail.com] Sent: Saturday, September 17, 2016 8:35 AM To: Gangadhar, Anupama (623) Cc: spark users Subject: Re: Error trying to connect to Hive from Spark (Yarn-Cluster Mode) Hi

Re: Error trying to connect to Hive from Spark (Yarn-Cluster Mode)

2016-09-16 Thread Deepak Sharma
Hi Anupama To me it looks like issue with the SPN with which you are trying to connect to hive2 , i.e. hive@hostname. Are you able to connect to hive from spark-shell? Try getting the tkt using any other user keytab but not hadoop services keytab and then try running the spark submit. Thanks

Re: Error trying to connect to Hive from Spark (Yarn-Cluster Mode)

2016-09-16 Thread Mich Talebzadeh
Is your Hive Thrift Server up and running on port jdbc:hive2://10001? Do the following netstat -alnp |grep 10001 and see whether it is actually running HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Error trying to connect to Hive from Spark (Yarn-Cluster Mode)

2016-09-16 Thread anupama . gangadhar
Hi, I am trying to connect to Hive from a Spark application in a Kerberized cluster and get the following exception. Spark version is 1.4.1 and Hive is 1.2.1. Outside of Spark the connection goes through fine. Am I missing any configuration parameters? java.sql.SQLException: Could not open

Re: Spark's Logistic Regression runs unstable on Yarn cluster

2016-08-16 Thread Yanbo Liang
…to do some > classification on an AWS EMR Yarn cluster. > > The cluster consists of 10 m3.xlarge nodes and is set up as follows: > spark.driver.memory 10g, spark.driver.cores 3 , spark.executor.memory 10g, > spark.executor-cores 4. > > I enabled yarn's dynamic allocation abilities. >

Spark's Logistic Regression runs unstable on Yarn cluster

2016-08-12 Thread olivierjeunen
I'm using pyspark ML's logistic regression implementation to do some classification on an AWS EMR Yarn cluster. The cluster consists of 10 m3.xlarge nodes and is set up as follows: spark.driver.memory 10g, spark.driver.cores 3 , spark.executor.memory 10g, spark.executor-cores 4. I enabled

Re: Running spark Java on yarn cluster

2016-08-10 Thread atulp
…corresponds to an application, and an application can be used in interactive mode too. So I was thinking of creating a server which will have a Spark context pointing to the YARN cluster and using this context to run multiple queries over a period. Can you suggest a way to achieve the requirement of interactive

Running spark Java on yarn cluster

2016-08-10 Thread atulp
Hi Team, I am new to Spark and writing my first program. I have written a sample program with the Spark master as local. To execute Spark over local YARN, what should the value of the spark.master property be? Can I point to a remote YARN cluster? I would like to execute this as a Java application

Re: Streaming from Kinesis is not getting data in Yarn cluster

2016-07-15 Thread Yash Sharma
…"dharmendra" <d17...@gmail.com> wrote: > I have created small spark streaming program to fetch data from Kinesis and > put some data in database. > When i ran it in spark standalone cluster using master as local[*] it is > working fine but when i tried to run in yarn cluster with ma

Streaming from Kinesis is not getting data in Yarn cluster

2016-07-15 Thread dharmendra
I have created a small Spark streaming program to fetch data from Kinesis and put some data in a database. When I ran it in a Spark standalone cluster using master as local[*] it worked fine, but when I tried to run it in a YARN cluster with master as "yarn", the application doesn't receive any data.

Spark streaming graceful shutdown when running on yarn-cluster deploy-mode

2016-07-12 Thread Guy Harmach
Hi, I'm a newbie to Spark, starting to work with Spark 1.5 using the Java API (about to upgrade to 1.6 soon). I am deploying a Spark streaming application using spark-submit with yarn-cluster mode. What is the recommended way of performing a graceful shutdown of the Spark job? Already tried
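
One commonly suggested pattern, offered here as a hedged sketch rather than the thread's confirmed answer: enable Spark's graceful-stop shutdown hook and stop the application through YARN, which sends SIGTERM before SIGKILL so the hook can drain in-flight batches.

```shell
spark-submit --master yarn --deploy-mode cluster \
  --conf spark.streaming.stopGracefullyOnShutdown=true \
  streaming-app.jar

# later, stop via YARN (application id illustrative):
yarn application -kill application_1468000000000_0042
```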

Re: Unresponsive Spark Streaming UI in YARN cluster mode - 1.5.2

2016-07-08 Thread Shixiong(Ryan) Zhu
a Spark Streaming job in > YARN Cluster mode consuming from a high volume Kafka topic. When we try to > access the Spark Streaming UI on the application master, it is > unresponsive/hangs or sometimes comes back with connection refused. > > > > It seems this UI is resident on the d

Unresponsive Spark Streaming UI in YARN cluster mode - 1.5.2

2016-07-08 Thread Ellis, Tom (Financial Markets IT)
Hi There, We're currently using HDP 2.3.4, Spark 1.5.2 with a Spark Streaming job in YARN Cluster mode consuming from a high volume Kafka topic. When we try to access the Spark Streaming UI on the application master, it is unresponsive/hangs or sometimes comes back with connection refused
