I dropped down to 0.5 but still OOM'd, so I sent it all the way down to 0.1 and
didn't get an OOM. I could tune this some more to find where the cliff is,
but this is a one-off job so now that it's completed I don't want to spend
any more time tuning it.
Is there a reason that this value couldn't be
I am getting the following error when trying to access my data using hdfs://
... Not sure how to fix this one.
java.io.IOException: Call to server1/10.85.85.17:9000 failed on local
exception: java.io.EOFException
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107)
I successfully built with Maven on the command line and from IntelliJ.
I also see that error which only started yesterday, and think it is due to
a commit that has been reverted.
The first of two warnings is fixed in a PR I submitted yesterday and is
ignorable.
The second warning is really a
Hi Everyone,
Maybe it's a good time to reevaluate off-heap storage for RDDs with a
custom allocator?
On a few occasions recently I had to lower both
spark.storage.memoryFraction and spark.shuffle.memoryFraction
spark.shuffle.spill helps a bit with large scale reduces
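For reference, lowering those fractions looks roughly like this in a driver program (a sketch only; the values below are illustrative, not recommendations, and in 0.9-era Spark these can be passed as Java system properties before the context is created):

```scala
// Sketch: tuning memory fractions via system properties (pre-SparkConf style).
// Defaults are roughly 0.6 for storage and 0.3 for shuffle; values here are
// illustrative placeholders, not tuned recommendations.
System.setProperty("spark.storage.memoryFraction", "0.4")
System.setProperty("spark.shuffle.memoryFraction", "0.2")
// Allow shuffle data to spill to disk instead of staying in memory:
System.setProperty("spark.shuffle.spill", "true")

import org.apache.spark.SparkContext
val sc = new SparkContext("local[2]", "tuning-sketch")
```

The same properties can also be passed on the command line via `-Dspark.storage.memoryFraction=...` in SPARK_JAVA_OPTS.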
Also it could be you're
Hi Mohan,
could you please tell me the Hadoop version and the Spark version
you are working with?
On Mon, Feb 10, 2014 at 3:37 PM, Amit Behera amit.bd...@gmail.com wrote:
Please go to hadoop configuration directory and open core-site.xml and
check the IP and port for HDFS,
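The relevant entry is the `fs.default.name` property; a typical core-site.xml stanza looks roughly like this (host and port are taken from the error message above and are placeholders for your own setup):

```xml
<!-- core-site.xml: the HDFS URI the NameNode actually listens on.
     The host/port below mirror the server1:9000 seen in the stack trace;
     they must match what you pass in hdfs://... from Spark. -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://server1:9000</value>
</property>
```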
Hi Team,
I was trying to run a standalone app on a Spark cluster.
When I executed it, I got a Java heap space error.
I have two workers with 4G RAM each.
The error is pasted at http://pastebin.com/FCFj01UX
I have set SPARK_WORKER_MEMORY and SPARK_DAEMON_MEMORY to 4g too.
Any clue?
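For what it's worth, those worker settings live in conf/spark-env.sh, and note that the heap an application's executors actually get is controlled separately by spark.executor.memory (a sketch; values are illustrative):

```shell
# conf/spark-env.sh (sketch; values illustrative)
export SPARK_WORKER_MEMORY=4g   # total memory a worker can hand out to executors
export SPARK_DAEMON_MEMORY=1g   # heap for the master/worker daemons themselves

# The executor heap is a per-application setting, e.g. passed as a
# system property when launching in 0.9-era Spark:
#   -Dspark.executor.memory=2g
```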
Hi All,
I have a setup which consists of 8 small machines (1 core, 8G RAM)
and 1 large machine (8 cores, 100G RAM). Is there a way to get
Spark to run multiple executors on the large machine, and a single
executor on each of the small machines?
Alternatively, is it possible to
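One approach in standalone mode (a sketch, assuming the standalone deploy mode; SPARK_WORKER_INSTANCES is a documented spark-env.sh variable there) is to run several worker processes on the large machine, sizing each like one of the small nodes:

```shell
# conf/spark-env.sh on the large (8-core, 100G) machine only -- sketch,
# values illustrative. Each worker then looks like one small node.
export SPARK_WORKER_INSTANCES=8   # run 8 worker processes on this host
export SPARK_WORKER_CORES=1       # each worker offers 1 core...
export SPARK_WORKER_MEMORY=12g    # ...and ~12g, roughly matching the small machines
```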
I am trying to run a simple Spark Streaming job: counting words from HDFS. I
cannot even compile the Scala source. I get the following error:
error: value awaitTermination is not a member of
org.apache.spark.streaming.StreamingContext
ssc.awaitTermination()
This is the code from the .scala
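For comparison, a minimal HDFS word count against the 0.9 streaming API looks roughly like this (a sketch; the master, input path, and batch interval are placeholders, and the final call is the one that fails to compile against pre-0.9 jars):

```scala
// Sketch against the Spark 0.9 streaming API; path and interval are placeholders.
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._ // implicits for reduceByKey

object HdfsWordCount {
  def main(args: Array[String]) {
    val ssc = new StreamingContext("local[2]", "HdfsWordCount", Seconds(10))
    // Watch an HDFS directory for new files (placeholder path):
    val lines = ssc.textFileStream("hdfs://namenode:9000/user/someone/input")
    val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
    counts.print()
    ssc.start()
    ssc.awaitTermination() // added in 0.9; compiling against 0.8 fails on this line
  }
}
```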
Which kind of evaluation do you prefer? Do you mean this kind:
https://github.com/apache/incubator-spark/pull/407
I am interested in such evaluation because I want to optimize the
performance of distributed machine learning algorithms, both in the
framework execution and machine learning
I'm running 0.9.0 and attempting to try the example described here:
https://spark.incubator.apache.org/docs/0.9.0/streaming-programming-guide.html#a-quick-example
./bin/run-example
org.apache.spark.streaming.examples.JavaNetworkWordCount local[2]
localhost
But I get the error:
Is the assembly jar actually missing, or is the script somehow looking in
the wrong place? Check
/home/centrifuge/spark-0.9.0-incubating-bin-hadoop1/examples/target/scala-2.10.
If the examples-assembly jar is not there, then either `./sbt/sbt
assembly` or `./sbt/sbt examples/assembly` should
awaitTermination() was added in Spark 0.9. Are you trying to run the
HdfsWordCount example, maybe in your own separate project? Make sure you
are compiling with Spark 0.9 and not anything older.
TD
On Mon, Feb 10, 2014 at 6:50 AM, Kal El pinu.datri...@yahoo.com wrote:
I am trying to run a
The examples-assembly jar is not there.
I ran 'sbt/sbt examples assembly' and that didn't change anything.
Note that the shell script is complaining about "binary operator
expected", but I'm not positive whether that's a cause or a result of some
other problem.
Here's what's in the directory:
ls
My understanding of off-heap storage was that you'd still need to get those
JVM objects on-heap in order to actually use them with map, filter, etc.
Would we be trading CPU time to get memory efficiency if we went down the
off-heap storage route? I'm not sure what discussions have already
Blow away the examples/target directory, then do `./sbt/sbt
examples/assembly` -- you've got two assembly jars where there should be
only one in target/scala-2.10:
spark-examples-assembly-0.9.0-incubating.jar
On Mon, Feb 10, 2014 at 10:06 AM, David Swearingen
dswearin...@centrifugesystems.com
Hi all,
I am curious how fault tolerance is achieved in Spark. Well, more like: what do
I need to do to make sure my aggregations, which come from streams, are fault
tolerant and saved into Cassandra? I will have nodes die and would not like to
count tuples multiple times.
For example, in
When I try to run Spark in local mode on my Mac, it gets stuck trying to fetch
my application jar.
My command line is:
SPARK_CLASSPATH=~/depot/Engineering/kenji/dadlog-filter/target/scala-2.10/dadlog-filter-assembly-1.0.jar
~/opt/spark/bin/spark-class {myMainClass} local s3n://{myInputFile}
Thanks Josh. The driver JVM is hitting OutOfMemoryErrors, but the python
process is taking even more memory. I added more details to the JIRA.
On Fri, Feb 7, 2014 at 11:45 AM, Josh Rosen rosenvi...@gmail.com wrote:
I've opened an issue for this on JIRA:
Hadoop version : 2.0.0-cdh4.4.0
Spark: 0.9
thanks
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/EOF-Exception-when-trying-to-access-hdfs-tp1347p1370.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
I've been seeing the 'akka.actor.ActorNotFound' error as well when working
with 0.9 on Mesos. What fixes are working for people?
Kyle
On Fri, Feb 7, 2014 at 1:01 AM, Francesco Bongiovanni bongiova...@gmail.com
wrote:
So, it's still not working, but playing around with some conf variables, I have a
Trying to test out and use graphx from the spark-shell in local mode, but
graphx appears not to be visible from the shell.
Downloaded spark 0.9.0 and built using maven
mvn -Dhadoop.version=2.2.0-cdh5.0.0-beta-1
-Dyarn.version=2.2.0-cdh5.0.0-beta-1 -DskipTests -Pyarn install
bin/spark-shell
I also did a maven clean package with the same options.
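For reference, the check being attempted in the shell is roughly this (a sketch; in 0.9 graphx ships inside the main assembly, and the commented example names are illustrative):

```scala
// In bin/spark-shell: graphx classes should resolve if the assembly includes them.
import org.apache.spark.graphx._

// Illustrative smoke test -- build a trivial graph from an edge-tuple RDD:
// val graph = Graph.fromEdgeTuples(sc.parallelize(Seq((1L, 2L))), defaultValue = 1)
// graph.vertices.count()
```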
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/graphx-missing-from-spark-shell-tp1372p1373.html
https://github.com/apache/incubator-spark/pull/527
On Mon, Feb 10, 2014 at 4:11 PM, Eric Kimbrel
eric.kimb...@soteradefense.com wrote:
I also did a maven clean package with the same options.