, Arun Ahuja aahuj...@gmail.com wrote:
Yes, I imagine it's the driver's classpath - I'm pulling those
screenshots straight from the Spark UI environment page. Is there
somewhere else to grab the executor class path?
Also, the warning is only printing once, so it's also not clear whether
it references the assembly you built
locally and from which you're launching the driver.
I think we're concerned with the executors and what they have on the
classpath. I suspect there is still a problem somewhere in there.
On Mon, Jul 20, 2015 at 4:59 PM, Arun Ahuja aahuj...@gmail.com wrote:
Ryza sandy.r...@cloudera.com wrote:
Can you try setting the spark.yarn.jar property to make sure it points to
the jar you're thinking of?
-Sandy
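A sketch of what setting that property can look like (illustrative only: the HDFS path, assembly name, and job class here are placeholders, not from the thread):

```shell
# Point spark.yarn.jar at the exact assembly you built, so the
# executors use that jar rather than whatever is on the cluster.
spark-submit \
  --master yarn \
  --conf spark.yarn.jar=hdfs:///user/me/spark-assembly-1.5.0-SNAPSHOT-hadoop2.6.0.jar \
  --class com.example.MyJob \
  my-job.jar
```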
On Fri, Jul 17, 2015 at 11:32 AM, Arun Ahuja aahuj...@gmail.com wrote:
Yes, it's a YARN cluster and using spark-submit to run. I have
SPARK_HOME
this
assembly you built for your job -- like, it's actually the
assembly the executors are using.
On Tue, Jul 7, 2015 at 8:47 PM, Arun Ahuja aahuj...@gmail.com wrote:
Is there more documentation on what is needed to set up BLAS/LAPACK native
support with Spark?
I’ve built spark
you are using this assembly across your cluster.
On Fri, Jul 17, 2015 at 6:26 PM, Arun Ahuja aahuj...@gmail.com wrote:
Hi Sean,
Thanks for the reply! I did double-check that the jar is one I think I am
running:
[image: Inline image 2]
jar tf
/hpc/users/ahujaa01/src/spark/assembly
I’ve built Spark with the -Pnetlib-lgpl flag and see that the netlib
classes are in the assembly jar.
jar tvf spark-assembly-1.5.0-SNAPSHOT-hadoop2.6.0.jar | grep netlib | grep Native
6625 Tue Jul 07
Sorry, I can't help with this issue, but if you are interested in a simple
way to launch a Spark cluster on Amazon, Spark is now offered as an
application in Amazon EMR. With this you can have a full cluster with a
few clicks:
https://aws.amazon.com/blogs/aws/new-apache-spark-on-amazon-emr/
-
Hi Denny,
This is due to the spark.yarn.memoryOverhead parameter. Depending on what
version of Spark you are on the default of this may differ, but it should
be the larger of 1024 MB per executor or 0.07 * executorMemory.
When you set executor memory, the yarn resource request is executorMemory +
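That arithmetic can be sketched as follows (a sketch only, assuming the larger-of-1024 MB-or-7% rule described above; the 8192 MB executor size is a made-up example):

```shell
# Sketch of the YARN container request for an 8192 MB executor,
# using max(1024 MB, 7% of executorMemory) as the overhead.
EXECUTOR_MEMORY_MB=8192
OVERHEAD_MB=$(( EXECUTOR_MEMORY_MB * 7 / 100 ))     # 573 MB, below the floor
if [ "$OVERHEAD_MB" -lt 1024 ]; then OVERHEAD_MB=1024; fi
echo $(( EXECUTOR_MEMORY_MB + OVERHEAD_MB ))        # prints 9216
```

So the container YARN actually allocates is noticeably larger than the --executor-memory value alone.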
Hey Sandy,
What are those sleeps for and do they still exist? We have seen about a
1min to 1:30 executor startup time, which is a large chunk for jobs that
run in ~10min.
Thanks,
Arun
On Fri, Dec 5, 2014 at 3:20 PM, Sandy Ryza sandy.r...@cloudera.com wrote:
Hi Denny,
Those sleeps were only
, myself included. I
think Patrick's recent work on the build scripts for 1.2.0 will make
delivering nightly builds to a public maven repo easier.
On Tue, Nov 18, 2014 at 10:22 AM, Arun Ahuja aahuj...@gmail.com wrote:
Of course we can run this as well to get the latest, but the build is
fairly
Great - posted here https://issues.apache.org/jira/browse/SPARK-4542
On Fri, Nov 21, 2014 at 1:03 PM, Andrew Ash and...@andrewash.com wrote:
Yes you should file a Jira and echo it out here so others can follow and
comment on it. Thanks Arun!
On Fri, Nov 21, 2014 at 12:02 PM, Arun Ahuja
Are nightly releases posted anywhere? There are quite a few vital bugfixes
and performance improvements being committed to Spark, and using the latest
commits is useful (or even necessary for some jobs).
Is there a place to post them? It doesn't seem like it would be difficult to
run make-dist nightly
Of course we can run this as well to get the latest, but the build is
fairly long and this seems like a resource many would need.
On Tue, Nov 18, 2014 at 10:21 AM, Arun Ahuja aahuj...@gmail.com wrote:
Are nightly releases posted anywhere? There are quite a few vital
bugfixes and performance
If you are using spark-submit with --master yarn you can also pass as a
flag --executor-memory
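For example (a sketch; the memory value, class, and jar names are placeholders), the flag is passed at submit time, so it applies to newly submitted applications rather than to an already-running one:

```shell
# Request 8g per executor for this application at submit time.
spark-submit \
  --master yarn \
  --executor-memory 8g \
  --class com.example.MyJob \
  my-job.jar
```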
On Mon, Nov 10, 2014 at 8:58 AM, Mudassar Sarwar
mudassar.sar...@northbaysolutions.net wrote:
Hi,
How can we increase the executor memory of a running spark cluster on YARN?
We want to increase
We are running our applications through YARN and are only sometimes seeing
them in the History Server. Most do not seem to have the
APPLICATION_COMPLETE file. Specifically, any job that ends because of yarn
application -kill does not show up. For other ones, what would be a reason
for them not
We have used the strategy that you suggested, Andrew - using many workers
per machine and keeping the heaps small (< 20 GB).
Using a large heap resulted in workers hanging or not responding (leading
to timeouts). The same dataset/job for us will fail (most often due to
akka disassociated or fetch
We are also seeing this PARSING_ERROR(2) error due to
Caused by: java.io.IOException: failed to uncompress the chunk:
PARSING_ERROR(2)
at
org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:362)
at
Has anyone else seen this error in task deserialization? The task is
processing a small amount of data and doesn't seem to have much data
attached to the closure. I've only seen this with Spark 1.1
Job aborted due to stage failure: Task 975 in stage 8.0 failed 4
times, most recent failure: Lost
, which is a bit
interesting since the error message shows that the same stage has failed
multiple times. Are you able to consistently reproduce the bug across
multiple invocations at the same place?
On Fri, Sep 26, 2014 at 6:11 AM, Arun Ahuja aahuj...@gmail.com wrote:
Has anyone else seen
What is the proper way to specify java options for the Spark executors
using spark-submit? We had done this previously using
export SPARK_JAVA_OPTS='..
previously, for example to attach a debugger to each executor or add
-verbose:gc
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps
On spark-submit I
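With spark-submit, executor JVM flags like these are typically passed through the spark.executor.extraJavaOptions property rather than SPARK_JAVA_OPTS (a sketch: the GC flags mirror the ones above, while the class and jar names are placeholders):

```shell
# Pass JVM options to every executor (and, separately, to the driver).
spark-submit \
  --master yarn \
  --conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  --conf "spark.driver.extraJavaOptions=-verbose:gc" \
  --class com.example.MyJob \
  my-job.jar
```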
Since upgrading to Spark 1.1 we have been seeing the following error in the
logs:
14/09/23 02:14:42 ERROR executor.Executor: Exception in task 1087.0 in
stage 0.0 (TID 607)
java.io.IOException: unexpected exception type
at
I have a general question on when persisting will be beneficial and when it
won't:
I have a task that runs as follow
keyedRecordPieces = records.flatMap(record => Seq((key, recordPieces)))
partitioned = keyedRecordPieces.partitionBy(KeyPartitioner)
partitioned.mapPartitions(doComputation).save()
to persist RDDs and one allows you to
specify storage level.
Thanks,
Liquan
On Tue, Sep 23, 2014 at 2:08 PM, Arun Ahuja aahuj...@gmail.com wrote:
I have a general question on when persisting will be beneficial and when
it won't:
I have a task that runs as follow
keyedRecordPieces
Is there more information on what the Input column on the Spark UI means?
How is this computed? I am processing a fairly small (but zipped) file
and see the value as
[image: Inline image 1]
This does not seem correct?
Thanks,
Arun
We see this all the time as well; I don't believe there is much of a
relationship between the Spark job status and what YARN shows as the
status.
On Mon, Aug 11, 2014 at 3:17 PM, Shay Rojansky r...@roji.org wrote:
Spark 1.0.2, Python, Cloudera 5.1 (Hadoop 2.3.0)
It seems that Python jobs
Is there more documentation on using spark-submit with Yarn? Trying to
launch a simple job does not seem to work.
My run command is as follows:
/opt/cloudera/parcels/CDH/bin/spark-submit \
--master yarn \
--deploy-mode client \
--executor-memory 10g \
--driver-memory 10g \
, Marcelo Vanzin van...@cloudera.com wrote:
On Tue, Aug 19, 2014 at 2:34 PM, Arun Ahuja aahuj...@gmail.com wrote:
/opt/cloudera/parcels/CDH/bin/spark-submit \
--master yarn \
--deploy-mode client \
This should be enough.
But when I view the job 4040 page, SparkUI, there is a single
Hi all,
I'm running a job that seems to continually fail with the following
exception:
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at
I was actually able to get this to work. I was NOT setting the classpath
properly originally.
Simply running
java -cp /etc/hadoop/conf/:<yarn, hadoop jars> com.domain.JobClass
and setting yarn-client as the spark master worked for me. Originally I
had not put the configuration on the classpath.
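A sketch of such an invocation, assuming the `hadoop classpath` command is available on the machine (the jar name is a placeholder; com.domain.JobClass is the class named above):

```shell
# Put the Hadoop/YARN configuration and jars on the JVM classpath
# before launching a yarn-client driver directly with java.
java -cp "/etc/hadoop/conf:$(hadoop classpath):my-job.jar" com.domain.JobClass
```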
Hi Matei,
Unfortunately, I don't have more detailed information, but we have seen the
loss of workers in standalone mode as well. If a job is killed through
CTRL-C we will often see in the Spark Master page the number of workers and
cores decrease. They are still alive and well in the Cloudera
http://spark.apache.org/docs/0.9.1/configuration.html
2014-05-20 11:30 GMT-07:00 Arun Ahuja aahuj...@gmail.com:
I was actually able to get this to work. I was NOT setting the classpath
properly originally.
Simply running
java -cp /etc/hadoop/conf/:yarn
I am encountering the same thing. Basic yarn apps work as does the SparkPi
example, but my custom application gives this result. I am using
compute-classpath to create the proper classpath for my application, same
with SparkPi - was there a resolution to this issue?
Thanks,
Arun
On Wed, Feb