Hi folks,
I'm looking to deploy Spark on YARN and I have read through the docs
(https://spark.apache.org/docs/latest/running-on-yarn.html). One question that
I still have is whether there is an alternate means of including your own app
jars, as opposed to the process in the “Adding Other Jars” section.
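For what it's worth, one possible alternative is to register extra application
jars programmatically on the SparkConf rather than via the docs' --jars flow.
This is only a minimal, illustrative sketch (the app name and jar paths below
are placeholders, not taken from the thread):

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example: list the extra app jars on the SparkConf so they are
// shipped to the executors; the paths here are placeholders.
val conf = new SparkConf()
  .setAppName("my-yarn-app")
  .setJars(Seq("/path/to/dep-one.jar", "/path/to/dep-two.jar"))
val sc = new SparkContext(conf)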
Hi Team,
Sharing one article which summarizes the resource allocation configurations
for Spark on Yarn:
Resource allocation configurations for Spark on Yarn
http://www.openkb.info/2015/06/resource-allocation-configurations-for.html
--
Thanks,
www.openkb.info
(Open KnowledgeBase for Hadoop
Hi all,
I wonder if anyone has used MapReduce Job History to show Spark jobs.
I can see my Spark jobs (Spark running on a YARN cluster) on the Resource Manager
(RM).
I start the Spark History Server, and then through Spark's web-based user
interface I can monitor the cluster (and track cluster
Hi,
I set spark.shuffle.io.preferDirectBufs to false in SparkConf and this
setting can be seen in the web UI's Environment tab. But it still eats memory,
i.e., -Xmx is set to 512M but RES grows to 1.5G in half a day.
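For context, a minimal sketch of how such a setting is applied on the
SparkConf (illustrative only; the actual job code is not shown in this thread):

import org.apache.spark.SparkConf

// Assumed configuration: ask the shuffle Netty transport to prefer heap
// buffers over direct (off-heap) buffers.
val conf = new SparkConf()
  .setAppName("memory-debug-job")
  .set("spark.shuffle.io.preferDirectBufs", "false")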
On Wed, Jun 3, 2015 at 12:02 PM, Shixiong Zhu zsxw...@gmail.com wrote:
Could you
Just testing with Spark 1.3, it looks like it sets the proxy correctly to
be the YARN RM host (0101)
15/06/03 10:34:19 INFO yarn.ApplicationMaster: Registered signal handlers
for [TERM, HUP, INT]
15/06/03 10:34:20 INFO yarn.ApplicationMaster: ApplicationAttemptId:
That code hasn't changed at all between 1.3 and 1.4; it also has been
working fine for me.
Are you sure you're using exactly the same Hadoop libraries (since you're
building with -Phadoop-provided) and Hadoop configuration in both cases?
On Tue, Jun 2, 2015 at 5:29 PM, Night Wolf
Hi,
Thanks for your information. I'll give Spark 1.4 a try when it's released.
On Wed, Jun 3, 2015 at 11:31 AM, Tathagata Das t...@databricks.com wrote:
Could you try it out with Spark 1.4 RC3?
Also pinging Cloudera folks; they may be aware of something.
BTW, the way I have debugged memory
Could you try it out with Spark 1.4 RC3?
Also pinging Cloudera folks; they may be aware of something.
BTW, the way I have debugged memory leaks in the past is as follows.
Run with a small driver memory, say 1 GB. Periodically (maybe a script),
take snapshots of histogram and also do memory
Hi,
Thanks for your reply. Here's the top 30 entries of the jmap -histo:live result:
 num     #instances         #bytes  class name
-----------------------------------------------
   1:         40802      145083848  [B
   2:         99264       12716112  methodKlass
   3:         99264       12291480
Thanks Marcelo - looks like it was my fault. It seems when we deployed the new
version of Spark it was picking up the wrong yarn-site.xml and setting the
wrong proxy host. All good now!
On Wed, Jun 3, 2015 at 11:01 AM, Marcelo Vanzin van...@cloudera.com wrote:
That code hasn't changed at all between
Hi Zhang,
Could you paste your code in a gist? Not sure what you are doing inside the
code to fill up memory.
Thanks
Best Regards
On Thu, May 28, 2015 at 10:08 AM, Ji ZHANG zhangj...@gmail.com wrote:
Hi,
Yes, I'm using createStream, but the storageLevel param is by default
Hi,
I wrote a simple test job; it only does very basic operations. For example:
val lines = KafkaUtils.createStream(ssc, zkQuorum, group, Map(topic ->
1)).map(_._2)
val logs = lines.flatMap { line =>
  try {
    Some(parse(line).extract[Impression])
  } catch {
    case _:
Can you replace your counting part with this?
logs.filter(_.s_id > 0).foreachRDD(rdd => logger.info(rdd.count()))
Thanks
Best Regards
On Thu, May 28, 2015 at 1:02 PM, Ji ZHANG zhangj...@gmail.com wrote:
Hi,
I wrote a simple test job, it only does very basic operations. for example:
Hi,
Unfortunately, they're still growing, both driver and executors.
I run the same job in local mode, and everything is fine.
On Thu, May 28, 2015 at 5:26 PM, Akhil Das ak...@sigmoidanalytics.com
wrote:
Can you replace your counting part with this?
logs.filter(_.s_id > 0).foreachRDD(rdd =>
After submitting the job, if you do a ps aux | grep spark-submit then you
can see all the JVM params. Are you using the high-level consumer
(receiver-based) for receiving data from Kafka? In that case, if your throughput
is high and the processing delay exceeds the batch interval, then you will hit this
Hi,
Yes, I'm using createStream, but the storageLevel param is by default
MEMORY_AND_DISK_SER_2. Besides, the driver's memory is also growing. I
don't think Kafka messages will be cached in the driver.
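For illustration only (the real job code is quoted elsewhere in this thread,
and the variable names below are assumed from it), the storage level can also
be passed to createStream explicitly instead of relying on the default:

import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.kafka.KafkaUtils

// Hypothetical variant of the receiver with an explicit, single-replica
// serialized storage level instead of the MEMORY_AND_DISK_SER_2 default;
// ssc, zkQuorum, group, and topic are assumed to be defined as in the job.
val lines = KafkaUtils.createStream(
  ssc, zkQuorum, group, Map(topic -> 1),
  StorageLevel.MEMORY_AND_DISK_SER).map(_._2)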
On Thu, May 28, 2015 at 12:24 AM, Akhil Das ak...@sigmoidanalytics.com
wrote:
Are you using the
Hi Akhil,
Thanks for your reply. According to the Streaming tab of the Web UI, the
Processing Time is around 400ms, and there's no Scheduling Delay, so I
suppose it's not the Kafka messages that eat up the off-heap memory. Or
maybe it is, but how can I tell?
I googled about how to check the off-heap
to do that under yarn?
I've managed to get it mostly there by including all spark, yarn, hadoop,
and hdfs config files in my SparkConf (somewhat indirectly, and that is a
bit of a shorthand), but while the job shows up now under YARN, and has
its own application web UI page, it's not showing up
the stage completes, because it
might get used again later by another stage (e.g., if the stage is retried).
On Tue, May 12, 2015 at 6:50 PM, Ashwin Shankar ashwinshanka...@gmail.com
wrote:
Hi,
In spark on yarn and when running spark_shuffle as auxiliary service on
node manager, does map spills of a stage
Hi,
In Spark on YARN, when running spark_shuffle as an auxiliary service on the
node manager, do the map spills of a stage get cleaned up once the next
stage completes, OR
are they preserved till the app completes (i.e., waiting for all the stages to
complete)?
--
Thanks,
Ashwin
link:
http://mbonaci.github.io/mbo-spark/
You don't need to install Spark on every node. Just install it on one node,
or you can install it on a remote system and make a Spark cluster.
Thanks
Madhvi
On Thursday 30 April 2015 09:31 AM, xiaohe lan wrote:
Hi experts,
I see Spark on YARN has yarn-client and yarn-cluster modes. I also have a
5-node Hadoop cluster (Hadoop 2.4). How do I install Spark if I want to try
the Spark on YARN mode?
Do I need to install Spark on each node of the Hadoop cluster?
Thanks,
Xiaohe
On 27 Apr 2015, at 07:51, ÐΞ€ρ@Ҝ (๏̯͡๏)
deepuj...@gmail.com wrote:
Spark 1.3
1. View stderr/stdout from the executor from the Web UI: when the job is running I
figured out the executor I am supposed to see, and those two links show 4
special characters in the browser.
2. Tail the YARN logs:
/apache/hadoop/bin/yarn logs -applicationId
application_1429087638744_151059 |
1) Application container logs from the Web RM UI never load in the browser. I
eventually have to kill the browser.
2) /apache/hadoop/bin/yarn logs -applicationId
application_1429087638744_151059
| less emits logs only after the application has completed.
Are there no better ways to see the logs as they
You can check container logs from the RM web UI or, when log aggregation is
enabled, with the yarn command. There are other, but less convenient, options.
On Mon, Apr 27, 2015 at 8:53 AM ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
Spark 1.3
1. View stderr/stdout from executor from Web UI: when the job
On top of what's been said...
On Wed, Apr 22, 2015 at 10:48 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
1) I can go to the Spark UI and see the status of the app but cannot see the
logs as the job progresses. How can I see the logs of executors as they
progress?
Spark 1.3 should have links to the
, Ted Yu yuzhih...@gmail.com wrote:
For step 2, you can pipe application log to a file instead of
copy-pasting.
Cheers
On Apr 22, 2015, at 10:48 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
I submit a spark app to YARN and i get these messages
15/04/22 22:45:04 INFO yarn.Client
On Fri, Apr 24, 2015 at 11:31 AM, Marcelo Vanzin van...@cloudera.com
wrote:
Spark 1.3 should have links to the executor logs in the UI while the
application is running. Not yet in the history server, though.
You're absolutely correct -- didn't notice it until now. This is a great
addition!
For step 2, you can pipe application log to a file instead of copy-pasting.
Cheers
On Apr 22, 2015, at 10:48 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
I submit a spark app to YARN and i get these messages
15/04/22 22:45:04 INFO yarn.Client: Application report
I submit a Spark app to YARN and I get these messages
15/04/22 22:45:04 INFO yarn.Client: Application report for
application_1429087638744_101363 (state: RUNNING)
15/04/22 22:45:04 INFO yarn.Client: Application report for
application_1429087638744_101363 (state: RUNNING).
...
1) I can go
...@sigmoidanalytics.com]
Sent: Monday, April 20, 2015 2:56 PM
To: roy
Cc: user@spark.apache.org
Subject: Re: shuffle.FetchFailedException in spark on YARN job
Which version of Spark are you using? Did you try using
spark.shuffle.blockTransferService=nio
Thanks
Best Regards
On Sat, Apr 18, 2015 at 11:14 PM
spark.yarn.driver.memoryOverhead=1000 --conf spark.akka.frameSize=1
--executor-cores 3
any idea why it's failing on shuffle fetch?
thanks
in YARN client mode.
And I want to run the AM on the same node as the RM in order to use the node
which would otherwise run the AM.
How can I get the AM to run on the same node as the RM?
On Tue, Mar 10, 2015 at 3:49 PM, Sean Owen so...@cloudera.com wrote:
In YARN cluster mode, there is no Spark master, since YARN
of the cluster.
1. Is this the correct way of configuration? What is the architecture of
Spark on YARN?
2. Is there a way in which I can run the Spark master, YARN application master
and resource manager on a single node? (So that I can use the three other nodes
for the computation.)
Thanks
Harika
In YARN cluster mode, there is no Spark master, since YARN is your
resource manager. Yes, you could force your AM somehow to run on the
same node as the RM, but why -- what do you think is faster about that?
On Tue, Mar 10, 2015 at 10:06 AM, Harika matha.har...@gmail.com wrote:
Hi all,
I have Spark
Hi Harika,
Did you get any solution for this?
I want to use YARN, but the spark-ec2 script does not support it.
Thanks
-Roni
]$ ./sbin/stop-all.sh
ephemeral-hdfs]$ ./sbin/start-all.sh
HTH
Deb
On Wed, Feb 25, 2015 at 11:46 PM, Harika matha.har...@gmail.com wrote:
Hi,
I want to setup a Spark cluster with YARN dependency on Amazon EC2. I was
reading this https://spark.apache.org/docs/1.2.0/running-on-yarn.html
Yes, spark.yarn.historyServer.address is used to access the Spark history
server from YARN; it is not needed if you use only the YARN history server.
It may be possible to have both history servers running, but I have not tried
that yet.
Besides, as far as I have understood, yarn and spark
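As an aside, a minimal sketch of that wiring (the host and port below are
placeholders; this is typically put in spark-defaults.conf rather than set in
code):

import org.apache.spark.SparkConf

// Hypothetical: tell YARN where the Spark history server lives so the RM UI
// can link finished applications to it.
val conf = new SparkConf()
  .set("spark.yarn.historyServer.address", "history-host:18080")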
You can see this information in the yarn web UI using the configuration I
provided in my former mail (click on the application id, then on logs; you will
then be automatically redirected to the yarn history server UI).
On 24/02/2015 19:49, Colin Kincaid Williams wrote:
So back to my original
Hi,
I want to set up a Spark cluster with a YARN dependency on Amazon EC2. I was
reading the https://spark.apache.org/docs/1.2.0/running-on-yarn.html
document, and I understand that Hadoop has to be set up for running Spark with
YARN. My questions:
1. Do we have to set up a Hadoop cluster on EC2
Hi Colin,
Here is how I have configured my hadoop cluster to have yarn logs available
through both the yarn CLI and the _yarn_ history server (with gzip compression
and 10-day retention):
1. Add the following properties to the yarn-site.xml on each node manager and
on the resource manager:
Looks like in my tired state, I didn't mention spark the whole time.
However, it might be implied by the application log above. Spark log
aggregation appears to be working, since I can run the yarn command above.
I do have yarn logging setup for the yarn history server. I was trying to
use the
the spark history server and the yarn history server are totally
independent. Spark knows nothing about yarn logs, and vice versa, so
unfortunately there isn't any way to get all the info in one place.
On Tue, Feb 24, 2015 at 12:36 PM, Colin Kincaid Williams disc...@uw.edu
wrote:
Looks like in
So back to my original question.
I can see the spark logs using the example above:
yarn logs -applicationId application_1424740955620_0009
This shows YARN log aggregation working. I can see the stdout and stderr
in that container information above. Then how can I get this
information in a
Hi,
I have been trying to get my yarn logs to display in the spark
history-server or yarn history-server. I can see the log information
yarn logs -applicationId application_1424740955620_0009
15/02/23 22:15:14 INFO client.ConfiguredRMFailoverProxyProvider: Failing
over to
Hi,
I'm running Spark on YARN from an edge node, and the tasks run on the Data
Nodes. My job fails with the "Too many open files" error once it gets to
groupByKey(). Alternatively, I can make it fail immediately if I repartition
the data when I create the RDD.
Where do I need to make sure that ulimit -n is high
The version I'm using was already pre-built for Hadoop 2.3.
I have a simple Spark app that I run with spark-submit on YARN. It runs
fine and shows up with finalStatus=SUCCEEDED in the resource manager logs.
However, in the nodemanager logs I see this:
2015-01-31 18:30:48,195 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
Memory usage
clue there ?
Can you pastebin part of the RM log around the time your job ran?
What Hadoop version are you using?
Thanks
On Sat, Jan 31, 2015 at 11:24 AM, Koert Kuipers ko...@tresata.com wrote:
i have a simple spark app that i run with spark-submit on yarn. it runs
fine and shows up
Don't know how to solve it.
Then your Spark is not built for YARN. Try to build with
sbt/sbt -Dhadoop.version=2.3.0 -Pyarn assembly
and on yarn, without any issues. It looks like I may be missing a
configuration step somewhere. Any thoughts on what may be causing this?
NR
Thanks, Siddardha. I did but got the same error. Kerberos is enabled on my
cluster and I may be missing a configuration step somewhere.
Update: I deployed a standalone Spark on localhost, then set the master as
spark://localhost:7077, and it hit the same issue.
I don't know how to solve it.
Hi, I am trying to run Spark on a YARN cluster by setting the master as yarn-client
in Java code. It works fine with a count task but does not work with other commands.
It threw a ClassCastException:
java.lang.ClassCastException: cannot assign instance of
java.lang.invoke.SerializedLambda to field
`--jars` accepts a comma-separated list of jars. See the usage about
`--jars`
--jars JARS Comma-separated list of local jars to include on the driver and
executor classpaths.
Best Regards,
Shixiong Zhu
2015-01-08 19:23 GMT+08:00 Guillermo Ortiz konstt2...@gmail.com:
I'm trying to execute
I'm trying to execute Spark from a Hadoop cluster. I have created this
script to try it:
#!/bin/bash
export HADOOP_CONF_DIR=/etc/hadoop/conf
SPARK_CLASSPATH=
for lib in `ls /user/local/etc/lib/*.jar`
do
SPARK_CLASSPATH=$SPARK_CLASSPATH:$lib
done
thanks!
2015-01-08 12:59 GMT+01:00 Shixiong Zhu zsxw...@gmail.com:
`--jars` accepts a comma-separated list of jars. See the usage about
`--jars`
--jars JARS Comma-separated list of local jars to include on the driver and
executor classpaths.
Best Regards,
Shixiong Zhu
2015-01-08
libraries are Java-only (the Scala version appended there is just for
helping the build scripts).
But it does look weird, so it would be nice to fix it.
On Wed, Jan 7, 2015 at 12:25 AM, Aniket Bhatnagar
aniket.bhatna...@gmail.com wrote:
It seems that spark-network-yarn compiled for scala 2.11
It seems that spark-network-yarn compiled for Scala 2.11 depends on
spark-network-shuffle compiled for Scala 2.10. This causes cross-version
dependency conflicts in sbt. Seems like a publishing error?
http://www.uploady.com/#!/download/6Yn95UZA0DR/3taAJFjCJjrsSXOR
wrote:
It seems that spark-network-yarn compiled for scala 2.11 depends on
spark-network-shuffle compiled for scala 2.10. This causes cross version
dependencies conflicts in sbt. Seems like a publishing error?
http://www.uploady.com/#!/download/6Yn95UZA0DR/3taAJFjCJjrsSXOR
--
Marcelo
Thanks Ted. Adding a dependency on spark-network-yarn would allow resolution
of YarnShuffleService, which, the docs suggest, runs on the Node Manager
and perhaps isn't useful while submitting jobs programmatically. I think
what I need is a dependency on the spark-yarn module so that classes like
Hi all
I just realized that the spark-yarn artifact hasn't been published for the 1.2.0
release. Any particular reason for that? I was using it in my yet another
spark-job-server project to submit jobs to a YARN cluster through
convenient REST APIs (with some success). The job server was creating
See this thread:
http://search-hadoop.com/m/JW1q5vd61V1/Spark-yarn+1.2.0subj=Re+spark+yarn_2+10+1+2+0+artifacts
Cheers
On Dec 28, 2014, at 11:13 PM, Aniket Bhatnagar aniket.bhatna...@gmail.com
wrote:
Hi all
I just realized that spark-yarn artifact hasn't been published for 1.2.0
release
section of the Running on YARN docs.
On Fri, Dec 19, 2014 at 12:37 AM, WangTaoTheTonic
barneystin...@aliyun.com wrote:
Hi guys,
I recently ran spark on yarn and found spark didn't set any log4j properties
file in configuration or code. And the log4j logs was writing into stderr
file under
Hi guys,
I recently ran Spark on YARN and found Spark didn't set any log4j properties
file in configuration or code. And the log4j logs were being written into the stderr
file under ${yarn.nodemanager.log-dirs}/application_${appid}.
I want to know which side (Spark or Hadoop) controls the appender? Have
Hello,
I'm experiencing an issue where yarn is scheduling two executors (the default)
regardless of what I enter as num-executors when submitting an application.
Background: I'm running Spark with Yarn on Amazon EMR. My cluster has two core
nodes and three task nodes. All five nodes
Hi All,
I have Spark on YARN and there are multiple Spark jobs on the cluster.
Sometimes some jobs are not getting enough resources even when there are
enough free resources available on the cluster, even when I use the settings below:
--num-workers 75 \
--worker-cores 16
Jobs stick with the resources they get
Thanks Sandy!
On Mon, Dec 8, 2014 at 23:15 Sandy Ryza sandy.r...@cloudera.com wrote:
Another thing to be aware of is that YARN will round up containers to the
nearest increment of yarn.scheduler.minimum-allocation-mb, which defaults
to 1024.
-Sandy
On Sat, Dec 6, 2014 at 3:48 PM, Denny Lee
Hi yuemeng,
Are you possibly running the Capacity Scheduler with the default resource
calculator?
-Sandy
On Sat, Dec 6, 2014 at 7:29 PM, yuemeng1 yueme...@huawei.com wrote:
Hi, all
When i running an app with this cmd: ./bin/spark-sql --master
yarn-client --num-executors 2 --executor
Another thing to be aware of is that YARN will round up containers to the
nearest increment of yarn.scheduler.minimum-allocation-mb, which defaults
to 1024.
-Sandy
On Sat, Dec 6, 2014 at 3:48 PM, Denny Lee denny.g@gmail.com wrote:
Got it - thanks!
On Sat, Dec 6, 2014 at 14:56 Arun Ahuja
This is perhaps more of a YARN question than a Spark question, but I was
just curious how memory is allocated in YARN via the various
configurations. For example, if I spin up my cluster with 4GB with a
different number of executors as noted below:
4GB executor-memory x 10 executors = 46GB
Hi Denny,
This is due to the spark.yarn.memoryOverhead parameter; depending on what
version of Spark you are on, the default may differ, but it should
be the larger of 1024mb per executor or .07 * executorMemory.
When you set executor memory, the YARN resource request is executorMemory +
Got it - thanks!
On Sat, Dec 6, 2014 at 14:56 Arun Ahuja aahuj...@gmail.com wrote:
Hi Denny,
This is due the spark.yarn.memoryOverhead parameter, depending on what
version of Spark you are on the default of this may differ, but it should
be the larger of 1024mb per executor or .07 *
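Putting the two rules quoted in this thread together (illustrative only, and
taking the figures above at face value rather than as authoritative defaults),
a 4 GB executor request would be sized roughly like this:

executorMemory    = 4096 MB  (--executor-memory 4g)
memoryOverhead    = max(1024 MB, 0.07 * 4096 MB ~= 287 MB) = 1024 MB
container request = 4096 MB + 1024 MB = 5120 MB

YARN then rounds that up to the next multiple of
yarn.scheduler.minimum-allocation-mb, so each executor container can end up
noticeably larger than the 4 GB that was asked for.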
Hi, all
When I run an app with this cmd: ./bin/spark-sql --master
yarn-client --num-executors 2 --executor-cores 3, I noticed that the YARN
resource manager UI shows `vcores used` in the cluster metrics as 3. It
seems `vcores used` shows the wrong number (shouldn't it be 7?). Or am I missing something
Hi, all:
According to https://github.com/apache/spark/pull/2732, when a Spark job fails
or exits nonzero in yarn-cluster mode, spark-submit will get the
corresponding return code of the Spark job. But I tried it on a Spark 1.1.1 YARN
cluster, and spark-submit returns zero anyway.
Here is my spark
I tried in Spark client mode; spark-submit can get the correct return code from
the Spark job. But in yarn-cluster mode, it failed.
From: lin_q...@outlook.com
To: u...@spark.incubator.apache.org
Subject: Issue on [SPARK-3877][YARN]: Return code of the spark-submit in
yarn-cluster mode
Date: Fri, 5
-submit cannot
get the second return code 100. What's the difference between these two
`exit`? I was so confused.
From: lin_q...@outlook.com
To: u...@spark.incubator.apache.org
Subject: RE: Issue on [SPARK-3877][YARN]: Return code of the spark-submit in
yarn-cluster mode
Date: Fri, 5 Dec 2014 17
Hi,
In Spark on YARN, the AM (driver) will ask the RM for resources. Once
the resources are allocated by the RM, the AM will start the executors
through the NM. This is my understanding.
But, according to the Spark documentation (1), the
`spark.yarn.applicationMaster.waitTries` property
Owen so...@cloudera.com:
My guess is you're asking for all cores of all machines but the driver
needs at least one core, so one executor is unable to find a machine to fit
on.
On Nov 18, 2014 7:04 PM, Alan Prando a...@scanboo.com.br wrote:
Hi Folks!
I'm running Spark on a YARN cluster installed with Cloudera Manager Express.
The cluster has 1