http://apache-spark-user-list.1001560.n3.nabble.com/issue-Running-Spark-Job-on-Yarn-Cluster-tp21697p21909.html
My question is: is it possible to achieve the same monitoring UI experience with a YARN cluster, like viewing workers and running/completed job stages in the Web UI? Currently, if we go to our YARN Resource Manager UI, we are able to see the Spark jobs and their logs, but it is not as rich as the Spark standalone master UI.
Yes, I do see files; actually I missed copying the other settings:
spark.master spark://skarri-lt05.redmond.corp.microsoft.com:7077
spark.eventLog.enabled true
spark.rdd.compress true
spark.storage.memoryFraction 1
spark.core.connection.ack.wait.timeout 6000
On Wed, Mar 4, 2015 at 10:08 AM, Srini Karri skarri@gmail.com wrote:
spark.executor.extraClassPath D:\\Apache\\spark-1.2.1-bin-hadoop2\\spark-1.2.1-bin-hadoop2.4\\bin\\classes
spark.eventLog.dir D:/Apache/spark-1.2.1-bin-hadoop2/spark-1.2.1-bin-hadoop2.4/bin/tmp/spark-events
Hi Marcelo,
I found the problem from this link:
http://mail-archives.apache.org/mod_mbox/spark-user/201409.mbox/%3cCAL+LEBfzzjugOoB2iFFdz_=9TQsH=DaiKY=cvydfydg3ac5...@mail.gmail.com%3e
The problem is that the application I am running is not generating the APPLICATION_COMPLETE file. If I add this file
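For context: in Spark 1.x the history server only lists applications whose event log directory contains the APPLICATION_COMPLETE marker, which (as far as I can tell) is written when the application shuts down cleanly via sc.stop(). A minimal spark-defaults.conf sketch wiring the event log to the history server (paths here are illustrative; spark.history.fs.logDirectory must point at the same location as spark.eventLog.dir):

```properties
# Written by each application as it runs
spark.eventLog.enabled          true
spark.eventLog.dir              file:/tmp/spark-events

# Read by the history server (started with sbin/start-history-server.sh)
spark.history.fs.logDirectory   file:/tmp/spark-events
```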
I am trying to run the below Java class on a YARN cluster, but it hangs in the ACCEPTED state. I don't see any error. Below is the class and command. Any help is appreciated.
Thanks,
Abhi
bin/spark-submit --class com.mycompany.app.SimpleApp --master yarn-cluster /home/hduser/my-app-1.0.jar
Here is the context: I have tried a standalone Spark installation on Windows; I am able to submit the logs and see the history of events. My question is: is it possible to achieve the same monitoring UI experience with a YARN cluster, like viewing workers and running/completed job stages in the Web UI? Currently
Do you have enough resources in your cluster? You can check your resource manager to see the usage.
Thanks.
Zhan Zhang
On Mar 3, 2015, at 8:51 AM, abhi abhishek...@gmail.com wrote:
I am trying to run the below Java class on a YARN cluster, but it hangs in ACCEPTED
Hi,
On Wed, Mar 4, 2015 at 6:20 AM, Zhan Zhang zzh...@hortonworks.com wrote:
Do you have enough resources in your cluster? You can check your resource manager to see the usage.
Yep, I can confirm that this is a very annoying issue. If there is not enough memory or VCPUs available, your app will simply stay in the ACCEPTED state
Cluster, like viewing workers and running/completed job stages in the Web UI. Currently, if we go to our YARN Resource Manager UI, we are able to see the Spark jobs and their logs, but it is not as rich as the Spark standalone master UI. Is this a limitation of the Hadoop YARN cluster, or is there any way we can
a situation where I am running a web application in Jetty using Spring Boot. My web application receives a REST web service request, and based on that it needs to trigger a Spark calculation job on the YARN cluster. Since my job can take longer to run and can access data from HDFS, I want to run the Spark job in yarn
Is this the full stack trace?
On Wed, Feb 18, 2015 at 2:39 AM, sachin Singh sachin.sha...@gmail.com wrote:
Hi,
I want to run my Spark job in Hadoop YARN cluster mode.
I am using the below command:
spark-submit --master yarn-cluster --driver-memory 1g --executor-memory 1g --executor-cores 1
Yes.
On 19 Feb 2015 23:40, Harshvardhan Chauhan ha...@gumgum.com wrote:
Is this the full stack trace?
On Wed, Feb 18, 2015 at 2:39 AM, sachin Singh sachin.sha...@gmail.com wrote:
Hi,
I want to run my Spark job in Hadoop YARN cluster mode.
I am using the below command:
spark-submit --master
You'll need to look at your application's logs. You can use yarn logs
--applicationId [id] to see them.
On Wed, Feb 18, 2015 at 2:39 AM, sachin Singh sachin.sha...@gmail.com wrote:
Hi,
I want to run my Spark job in Hadoop YARN cluster mode.
I am using the below command:
spark-submit --master
Hi,
I want to run my Spark job in Hadoop YARN cluster mode.
I am using the below command:
spark-submit --master yarn-cluster --driver-memory 1g --executor-memory 1g --executor-cores 1 --class com.dc.analysis.jobs.AggregationJob sparkanalitic.jar param1 param2 param3
I am getting the error as under
started in yarn-cluster mode.
./bin/spark-submit --verbose --queue research --driver-java-options -XX:MaxPermSize=8192M --files /etc/hive/hive-site.xml --driver-class-path /etc/hive/hive-site.xml --master yarn --deploy-mode cluster
The problem here is that --files only looks for the local
yarn-client, even with both a SparkContext and StreamingContext. It looks
to me that in yarn-cluster mode it's grabbing resources for the
StreamingContext but not for the SparkContext.
Any ideas?
Jon
15/02/10 12:06:16 INFO MemoryStore: MemoryStore started with capacity
1177.8 MB.
15/02/10
toth.zsolt@gmail.com:
Hi,
I'm using Spark in yarn-cluster mode and submit the jobs programmatically from the client in Java. I ran into a few issues when I tried to set the resource allocation properties.
1. It looks like setting spark.executor.memory, spark.executor.cores
One more question: is there a reason why Spark throws an error when requesting too much memory, instead of capping it to the maximum value (as YARN would do by default)?
Thanks!
2015-02-10 17:32 GMT+01:00 Zsolt Tóth toth.zsolt@gmail.com:
Hi,
I'm using Spark in yarn-cluster mode and submit
Spark doesn't get beyond that point in the code.
Also, this job (sbStreamingTv) does work successfully using yarn-client, even with both a SparkContext and a StreamingContext. It looks to me that in yarn-cluster mode it's grabbing resources for the StreamingContext but not for the SparkContext.
Any
Hi,
I'm using Spark in yarn-cluster mode and submit the jobs programmatically from the client in Java. I ran into a few issues when I tried to set the resource allocation properties.
1. It looks like setting spark.executor.memory, spark.executor.cores and spark.executor.instances have no effect
it:
val badIPs = fromFile(edgeDir + "badfullIPs.csv")
val badIPsLines = badIPs.getLines
val badIpSet = badIPsLines.toSet
val badIPsBC = sc.broadcast(badIpSet)
badIPs.close
How can I accomplish this in yarn-cluster mode?
Jon
15/02/10 12:06:16 INFO ConnectionManager: Bound socket to port 30129
with id
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-broadcast-a-variable-read-from-a-file-in-yarn-cluster-mode-tp21524.html
Through the Hive CLI, I don't see any problem, but for Spark in yarn-cluster mode I am not able to switch to a database other than the default one; in yarn-client mode, it works fine.
Thanks!
Jenny
On Tue, Aug 12, 2014 at 12:53 PM, Yin Huai huaiyin@gmail.com wrote
Sent from the Apache Spark User List mailing list archive at Nabble.com
to the standalone mode.
However, compared to the standalone mode, Spark on YARN runs very slowly.
I am running it as:
$SPARK_HOME/bin/spark-submit --class EDDApp --master yarn-cluster --num-executors 10 --executor-memory 14g target/scala-2.10/edd-application_2.10-1.0.jar hdfs://hm41:9000/user/hduser
with more memory?
For your setup the memory calculation is: executorMemoryGB * 1.07 = 14GB, so executorMemoryGB = 14GB / 1.07 ~ 13GB.
Your command args should be something like: --master yarn-cluster --num-executors 5 --executor-cores 7 --executor-memory 13g
As for the UI, where did you see 7.2GB? Can you send a screen
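The container arithmetic above can be checked with a short script. This is a sketch assuming YARN-side Spark adds roughly 7% executor memory overhead to each container request, with a 384MB floor (the defaults discussed in this era of Spark); the function name is illustrative:

```python
def max_executor_mem_gb(container_limit_gb: float, overhead_factor: float = 0.07) -> int:
    """Largest whole-GB executor memory whose container footprint
    (executor memory plus YARN overhead) still fits the container limit."""
    def overhead(mem_gb: float) -> float:
        # Overhead is ~7% of executor memory, with a 384MB floor (assumption).
        return max(mem_gb * overhead_factor, 0.384)

    mem = int(container_limit_gb)
    while mem > 0 and mem + overhead(mem) > container_limit_gb:
        mem -= 1
    return mem

print(max_executor_mem_gb(14))  # prints 13: 13GB + ~0.91GB overhead fits in 14GB
```

This is why asking for --executor-memory 14g fails on a 14GB container limit: the overhead pushes the request over the limit, while 13g fits.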
1) Parameters like --num-executors should come before the jar. That is, you want something like:
$SPARK_HOME/bin/spark-submit --num-executors 3 --driver-memory 6g --executor-memory 7g \
  --master yarn-cluster --class EDDApp target/scala-2.10/eddjar \
  outputPath
That is, *your* parameters come after the jar.
Is YARN_CONF_DIR set?
--- Original Message ---
From: Aniket Bhatnagar aniket.bhatna...@gmail.com
Sent: February 4, 2015 6:16 AM
To: kundan kumar iitr.kun...@gmail.com, spark users
user@spark.apache.org
Subject: Re: Spark Job running on localhost on yarn cluster
Have you set master in SparkConf
Hi,
I am trying to execute my code on a YARN cluster.
The command which I am using is:
$SPARK_HOME/bin/spark-submit --class EDDApp target/scala-2.10/edd-application_2.10-1.0.jar --master yarn-cluster --num-executors 3 --driver-memory 6g --executor-memory 7g outputPath
But, I can see
Hi all,
I'm able to submit Spark jobs through spark-jobserver, but this allows using Spark only in yarn-client mode. I want to use Spark also in yarn-cluster mode, but jobserver does not allow it, as stated in the README file: https://github.com/spark-jobserver/spark-jobserver.
Could you tell
Hello,
We have Hadoop 2.6.0 and YARN set up on EC2, and are trying to get Spark 1.1.1 running on the YARN cluster.
I have of course googled around and found that this problem is solved for most people after removing the line including 127.0.1.1 from /etc/hosts. This hasn't seemed to solve it for me. Anyone have an idea where else 127.0.1.1 might be hiding in some conf
("javax.jdo.option.ConnectionPassword", "xxx")
hiveContext.setConf("javax.jdo.option.ConnectionURL", "jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true")
From: huaiyin@gmail.com
Date: Wed, 13 Aug 2014 16:56:13 -0400
Subject: Re: Spark sql failed in yarn-cluster mode when
with -Phadoop-provided, and other common libraries that
are required.
From: alee...@hotmail.com
To: user@spark.apache.org
CC: lian.cs@gmail.com; linlin200...@gmail.com; huaiyin@gmail.com
Subject: RE: Spark sql failed in yarn-cluster mode when connecting to
non-default hive database
Date
Hello folks,
I'm trying to deploy a Spark driver on Amazon EMR in yarn-cluster mode
expecting to be able to access the Spark UI from the spark-master-ip:4040
address (default port). The problem here is that the Spark UI port is
always defined randomly at runtime, although I also tried to specify
That's not true in yarn-cluster mode, where the driver runs in a
container that YARN creates, which may not be on the machine that runs
spark-submit.
As far as I know, however, you can't control where YARN allocates
that, and shouldn't need to.
You can probably query YARN to find where it did
Hi all,
In yarn-cluster mode, can we let the driver run on a specific machine that we choose in the cluster? Or even on a machine not in the cluster?
Hi,
I have a simple app where I am trying to create a table. I am able to create the table when running the app in yarn-client mode, but not in yarn-cluster mode.
Is this a known issue? Has this already been fixed?
Please note that I am using spark-1.1 over hadoop-2.4.0.
App:
-
import
Thanks Sandy, passing --name works fine :)
Tomer
On Fri, Dec 12, 2014 at 9:35 AM, Sandy Ryza sandy.r...@cloudera.com wrote:
Hi Tomer,
In yarn-cluster mode, the application has already been submitted to YARN
by the time the SparkContext is created, so it's too late to set the app
name
I got this error when I clicked Track URL: ApplicationMaster when I ran a Spark job in YARN cluster mode. I found this JIRA: https://issues.apache.org/jira/browse/YARN-800, but I could not get this problem fixed. I'm running CDH 5.1.0 with both HDFS and RM HA enabled. Does anybody have a similar
Hi,
I'm trying to set a custom spark app name when running a java spark app in
yarn-cluster mode.
SparkConf sparkConf = new SparkConf();
sparkConf.setMaster(System.getProperty("spark.master"));
sparkConf.setAppName("myCustomName");
sparkConf.set("spark.logConf", "true");
JavaSparkContext sc
On Thu, Dec 11, 2014 at 8:27 PM, Tomer Benyamini tomer@gmail.com
wrote:
Hi,
I'm trying to set a custom spark app name when running a java spark app in
yarn-cluster mode.
SparkConf sparkConf = new SparkConf();
sparkConf.setMaster(System.getProperty("spark.master"
Hi Tomer,
In yarn-cluster mode, the application has already been submitted to YARN by
the time the SparkContext is created, so it's too late to set the app name
there. I believe giving it with the --name property to spark-submit should
work.
-Sandy
On Thu, Dec 11, 2014 at 10:28 AM, Tomer
Hi, all:
According to https://github.com/apache/spark/pull/2732, when a Spark job fails or exits nonzero in yarn-cluster mode, spark-submit should get the corresponding return code of the Spark job. But I tried this on a spark-1.1.1 YARN cluster, and spark-submit returns zero anyway.
Here is my spark
I tried in Spark client mode: spark-submit can get the correct return code from the Spark job. But in yarn-cluster mode, it failed.
From: lin_q...@outlook.com
To: u...@spark.incubator.apache.org
Subject: Issue on [SPARK-3877][YARN]: Return code of the spark-submit in
yarn-cluster mode
Date: Fri, 5
spark-submit cannot get the second return code 100. What's the difference between these two `exit` calls? I was so confused.
From: lin_q...@outlook.com
To: u...@spark.incubator.apache.org
Subject: RE: Issue on [SPARK-3877][YARN]: Return code of the spark-submit in
yarn-cluster mode
Date: Fri, 5 Dec 2014 17
--
From: lin_q...@outlook.com
To: u...@spark.incubator.apache.org
Subject: RE: Issue on [SPARK-3877][YARN]: Return code of the spark-submit
in yarn-cluster mode
Date: Fri, 5 Dec 2014 17:11:39 +0800
I tried in spark client mode, spark-submit can get the correct return code
, spark-submit returned 1 for both cases. That's expected. In yarn-cluster mode, the driver runs in the ApplicationMaster. The exit code of the driver is also the exit code of the ApplicationMaster. However, for now, Spark cannot get the exit code of the ApplicationMaster from YARN, because YARN does
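The client-mode half of this behavior (the launcher observing the driver's exit status directly) can be illustrated outside Spark with a plain subprocess; this is a sketch of the mechanism, not Spark itself. In client mode the driver is a local child process whose exit code the launcher can read; in yarn-cluster mode the driver runs remotely inside the ApplicationMaster, so no such direct channel exists:

```python
import subprocess
import sys

# A child process that exits with a nonzero status, standing in for a failing driver.
child = subprocess.run([sys.executable, "-c", "import sys; sys.exit(100)"])

# A local launcher sees the child's exit code directly; a yarn-cluster launcher
# only sees what YARN reports about the ApplicationMaster.
print(child.returncode)  # prints 100
```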
I am running a 3-node (32 cores, 60GB each) YARN cluster for Spark jobs.
1) Below are my YARN memory settings:
yarn.nodemanager.resource.memory-mb = 52224
yarn.scheduler.minimum-allocation-mb = 40960
yarn.scheduler.maximum-allocation-mb = 52224
Apache Spark Memory Settings
export SPARK_EXECUTOR_MEMORY
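One thing worth noticing about the YARN settings above: a container can never be smaller than the minimum allocation, so a 40960MB minimum on a 52224MB node leaves room for only one container per node. A quick check (property values copied from the settings listed above):

```python
node_mb = 52224       # yarn.nodemanager.resource.memory-mb
min_alloc_mb = 40960  # yarn.scheduler.minimum-allocation-mb

# Every container occupies at least the minimum allocation, so this ratio
# bounds how many containers (executors) one node can host.
containers_per_node = node_mb // min_alloc_mb
print(containers_per_node)  # prints 1
```

With a smaller minimum allocation, the same node could host several smaller executors instead of a single large one.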
Hi,
I want to submit my Spark program from my machine to a YARN cluster in yarn-client mode.
How do I specify all the required details through the Spark submitter?
Please provide me some details.
-Naveen.
You can export the hadoop configurations dir (export HADOOP_CONF_DIR=XXX) in
the environment and then submit it like:
./bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn-cluster \ # can also be `yarn-client` for client mode
--executor-memory 20G \
--num
, 2014 4:08 PM
To: Naveen Kumar Pokala
Cc: user@spark.apache.org
Subject: Re: Submit Spark driver on Yarn Cluster in client mode
You can export the hadoop configurations dir (export HADOOP_CONF_DIR=XXX) in
the environment and then submit it like:
./bin/spark-submit
*From:* Akhil Das [mailto:ak...@sigmoidanalytics.com]
*Sent:* Monday, November 24, 2014 4:08 PM
*To:* Naveen Kumar Pokala
*Cc:* user@spark.apache.org
*Subject:* Re: Submit Spark driver on Yarn Cluster in client mode
You can export the hadoop configurations dir (export
Is there any way to get the yarn application_id inside the program?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-get-applicationId-for-yarn-mode-both-yarn-client-and-yarn-cluster-mode-tp19462.html
)
val ipLines = badIpSource.getLines()
val set = new HashSet[String]()
val badIpSet = set ++ ipLines
badIpSource.close()
def isGoodIp(ip: String): Boolean = !badIpSet.contains(ip)
But when I try this using --master yarn-cluster I get Exception in thread "Thread-4"
If the file is not present on each node, it may not find it.
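The pattern being attempted above (read a small bad-IP file once, build a set, filter against it) looks like this in plain Python, independent of Spark. The point of the reply stands: in yarn-cluster mode the file must travel with the job (e.g. via spark-submit --files) rather than be read from a client-local path, since the driver may run on any node. All names here are illustrative:

```python
import os
import tempfile

# Stand-in for the small "bad IPs" side file; on a real cluster this file
# would need to be shipped with the job, not read from the client machine.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as f:
    f.write("10.0.0.1\n10.0.0.2\n")
    path = f.name

# Build the lookup set once, then close the file; only the set is kept.
with open(path) as src:
    bad_ip_set = {line.strip() for line in src if line.strip()}

def is_good_ip(ip: str) -> bool:
    return ip not in bad_ip_set

print(is_good_ip("10.0.0.1"), is_good_ip("8.8.8.8"))  # prints: False True
os.remove(path)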
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Building-a-hash-table-from-a-csv-file-using-yarn-cluster-and-giving-it-to-each-executor-tp18850p18877.html
Thank you
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-0-0-on-yarn-cluster-problem-tp7560p17175.html
copy of log4j.properties like:
log4j.rootCategory=DEBUG, console
2. upload when using the spark-submit script:
./bin/spark-submit --class edu.bjut.spark.SparkPageRank --master yarn-cluster --num-executors 5 --executor-memory 2g --executor-cores 1 /data/hadoopspark/MySparkTest.jar
Can anybody confirm if it is a bug, or a (configuration?) problem on my side?
Thanks,
Christophe.
On 10/10/2014 18:24, Christophe Préaud wrote:
Hi,
After updating from spark-1.0.0 to spark-1.1.0, my spark applications failed
most of the time (but not always) in yarn-cluster mode (but not in yarn-client
Regards,
Vishnu
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/system-out-println-with-master-yarn-cluster-tp16370p16473.html
Hi,
I want to check the DEBUG log of the Spark executor on YARN (using yarn-cluster mode), but:
1. yarn daemonlog setlevel DEBUG YarnChild.class
2. set log4j.properties in the spark/conf folder on the client node
Neither of the above works.
So how can I set the log level of the Spark executor on a YARN container
, Oct 15, 2014 at 5:58 PM, eric wong win19...@gmail.com wrote:
Hi,
I want to check the DEBUG log of the Spark executor on YARN (using yarn-cluster mode), but:
1. yarn daemonlog setlevel DEBUG YarnChild.class
2. set log4j.properties in the spark/conf folder on the client node
Neither of the above works.
So
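A commonly suggested recipe for this (a sketch, not verified against every Spark version) is to ship a custom log4j.properties with the job and point the executor JVMs at it, rather than editing spark/conf on the client, which only affects locally launched JVMs:

```properties
# log4j.properties shipped alongside the application
log4j.rootCategory=DEBUG, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

submitted with something like: spark-submit --files log4j.properties --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" ..., so the file is placed in each container's working directory where the executor JVM can pick it up.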
spark applications failed
most of the time (but not always) in yarn-cluster mode (but not in yarn-client
mode).
Here is my configuration:
* spark-1.1.0
* hadoop-2.2.0
And the hadoop.tmp.dir definition in the hadoop core-site.xml file (each
directory is on its own partition, on different
When I execute the following in yarn-client mode, it works fine and gives the result properly, but when I try to run it in yarn-cluster mode I get an error:
spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client /home/rug885/spark/examples/lib/spark-examples_2.10-1.0.0
anything,
-Andrew
2014-10-08 12:00 GMT-07:00 jamborta jambo...@gmail.com:
Hi all,
I have a setup that works fine in yarn-client mode, but when I change that
to yarn-cluster, the executors don't get created, apart from the driver (it
seems that it does not even appear in yarn's resource manager
That is, the executors cannot connect back to the driver (in my case I am not sure if they are even started). I could not find a way to debug this, as the log files don't have any errors in them.
thanks
thanks
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/executors-not-created-yarn
Hi,
I've been using pyspark with my YARN cluster with success. The work I'm
doing involves using the RDD's pipe command to send data through a binary
I've made. I can do this easily in pyspark like so (assuming 'sc' is
already defined):
sc.addFile("./dumb_prog")
t = sc.parallelize(range(10
(MongoDB, algebird
and so on)?
Thanks in advance
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/SPARK-1-1-0-on-yarn-cluster-and-external-JARs-tp15136.html
spark jar in a HDFS folder and set up the variable
SPARK_JAR.
What is the best way to do that for other external jars (MongoDB, algebird
and so on)?
Thanks in advance
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/SPARK-1-1-0-on-yarn-cluster
.nabble.com/SPARK-1-1-0-on-yarn-cluster-and-external-JARs-tp15136.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Error-launching-spark-application-from-Windows-to-Linux-YARN-Cluster-Could-not-find-or-load-main-clar-tp14888.html
lines of short strings (about 10 characters each line) from a YARN cluster with 400 nodes:
14/08/22 11:43:27 WARN scheduler.TaskSetManager: Lost task 205.0 in stage 0.0 (TID 1228, aaa.xxx.com): FetchFailed(BlockManagerId(220, aaa.xxx.com, 37899, 0), shuffleId=0, mapId=420, reduceId=205)
14/08/22
I saw your post. What are the operations you did? Are you trying to collect data from the driver? Did you try the Akka configurations?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/FetchFailed-when-collect-at-YARN-cluster-tp12670p12703.html
Hi,
I am having this FetchFailed issue when the driver is about to collect about
2.5M lines of short strings (about 10 characters each line) from a YARN
cluster with 400 nodes:
14/08/22 11:43:27 WARN scheduler.TaskSetManager: Lost task 205.0 in stage 0.0 (TID 1228, aaa.xxx.com): FetchFailed
)
at org.apache.spark.Logging$class.logInfo(Logging.scala:58)
However, when I removed --deploy-mode cluster, the exception disappeared.
I think with --deploy-mode cluster it runs in YARN cluster mode; if not, the default is YARN client mode.
But why does YARN cluster mode get the exception?
Thanks
--
cente...@gmail.com|齐忠
I think the problem is that when you are using yarn-cluster mode, because
the Spark driver runs inside the application master, the hive-conf is not
accessible by the driver. Can you try to set those confs by using
hiveContext.set(...)? Or, maybe you can copy hive-site.xml to spark/conf in
the node
...@gmail.com
wrote:
you can reproduce this issue with the following steps (assuming you have
Yarn cluster + Hive 12):
1) using hive shell, create a database, e.g: create database ttt
2) write a simple spark sql program
import org.apache.spark.{SparkConf, SparkContext}
import
Hi Yin,
hive-site.xml was copied to spark/conf and is the same as the one under $HIVE_HOME/conf.
Through the Hive CLI, I don't see any problem, but for Spark in yarn-cluster mode I am not able to switch to a database other than the default one; in yarn-client mode, it works fine.
Thanks!
Jenny
Since you were using hql(...), it's probably not related to the JDBC driver. But I failed to reproduce this issue locally with a single-node pseudo-distributed YARN cluster. Would you mind elaborating on the steps to reproduce this bug? Thanks
On Sun, Aug 10, 2014 at 9:36 PM, Cheng Lian
Hi Jenny, does this issue only happen when running Spark SQL with YARN in
your environment?
On Sat, Aug 9, 2014 at 3:56 AM, Jenny Zhao linlin200...@gmail.com wrote:
Hi,
I am able to run my hql query on yarn cluster mode when connecting to the
default hive metastore defined in hive-site.xml
Hi,
I am able to run my HQL query in yarn-cluster mode when connecting to the default Hive metastore defined in hive-site.xml.
However, if I want to switch to a different database, like:
hql("use other-database")
it only works in yarn-client mode, but fails in yarn-cluster mode
It's really a good question! I'm also working on it.
On Wed, Jul 30, 2014 at 11:45 AM, adu dujinh...@hzduozhun.com wrote:
Hi all,
I want to run a job on two specific nodes in the cluster. How do I configure YARN? Does a YARN queue help?
Thanks