It is strange that there are always two tasks slower than the others, and the corresponding partitions' data are larger, no matter how many partitions I use. The per-executor metrics look like this (truncated):

Executor ID: 1, Address: slave129.vsvs.com:56691, Task Time: 16 s, Shuffle Read Size / Records: 99.5 MB / ...

I'd appreciate any help.
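One way to check whether a few hot keys are behind the skew (my own suggestion, not something from a reply; the data and names below are placeholders) is to count records per partition and per key before the slow stage:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.SparkContext._   // pair-RDD implicits on older Spark

    object SkewCheckSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("SkewCheckSketch"))

        // Placeholder data; in the real job this would be the pair RDD feeding the slow stage.
        val keyedRdd = sc.parallelize(Seq("a" -> 1, "a" -> 1, "a" -> 1, "b" -> 1, "c" -> 1), 3)

        // Records per partition: a large imbalance here matches the skew in the stage UI.
        keyedRdd
          .mapPartitionsWithIndex((idx, it) => Iterator((idx, it.size)))
          .collect()
          .foreach { case (idx, n) => println(s"partition $idx: $n records") }

        // Hottest keys: one or two dominant keys would explain two persistently slow tasks.
        keyedRdd
          .map { case (k, _) => (k, 1L) }
          .reduceByKey(_ + _)
          .sortBy(_._2, ascending = false)
          .take(10)
          .foreach(println)

        sc.stop()
      }
    }

If a couple of keys dominate, simply adding more partitions will not help those two tasks; the hot keys themselves have to be spread out (for example by salting the key).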
My program runs for 500 iterations, but it usually fails at about 150 iterations. It's hard to explain the details of my program, but I think the program itself is OK, since it sometimes runs successfully. I just want to know in which situations this exception (com.esotericsoftware.kryo.KryoException: java.util.ConcurrentModificationException) can happen.
The detailed error information is:
? and what is the cluster setup that you are using? Given the logs, it looks like the master is dead for some reason.
Thanks
Best Regards
On Sun, Oct 19, 2014 at 2:48 PM, randylu <randylu26@...> wrote:
In addition, the driver receives several DisassociatedEvent messages.
The cluster also runs other applications every hour as normal, so the master is always running. No matter how many cores I use or how much input data I feed it (as long as it is big enough), the application just fails about 1.1 hours later.
My application implements LDA (a topic model, trained with Gibbs sampling); it's hard for me to explain LDA here, so please search for it if needed.
I did increase spark.akka.frameSize to 1 GB (even 5 GB), both in the master/workers' spark-defaults.conf and in SparkConf, but it has no effect at all.
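For reference, a minimal sketch of setting it programmatically (the app name and the rest of my configuration are not shown above; note that spark.akka.frameSize is given in MB on Spark 1.x, so 1 GB is "1024"):

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch only: raising the Akka frame size in SparkConf (value is in MB).
    val conf = new SparkConf()
      .setAppName("lda-gibbs")                 // hypothetical app name
      .set("spark.akka.frameSize", "1024")     // 1 GB expressed in MB
    val sc = new SparkContext(conf)

Settings passed this way only take effect if they are applied before the SparkContext is created.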
I'm
it?
Best,
randylu
Dear all,
In my test program there are 3 partitions for each RDD. The iteration procedure is as follows:

var rdd_0 = ... // init
for (...) {
  rdd_1 = rdd_0.reduceByKey(...).partitionBy(p)   // calculate rdd_1 from rdd_0
  rdd_0 = rdd_0.partitionBy(p).join(rdd_1)...     //
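For context, here is a minimal self-contained sketch of this kind of iterative reduceByKey/partitionBy/join loop (the key/value types, the update rule and the data are my own placeholders, not the original program):

    import org.apache.spark.{SparkConf, SparkContext, HashPartitioner}
    import org.apache.spark.SparkContext._   // pair-RDD implicits on older Spark

    object IterativeJoinSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("IterativeJoinSketch"))
        val p = new HashPartitioner(3)                      // 3 partitions, as in the post

        // Hypothetical initial state: (key, value) pairs.
        var rdd0 = sc.parallelize(Seq((1, 1.0), (2, 2.0), (3, 3.0))).partitionBy(p)

        for (i <- 0 until 10) {
          // Aggregate by key; keeping the same partitioner lets the join avoid an extra shuffle.
          val rdd1 = rdd0.reduceByKey(_ + _).partitionBy(p)
          // Join back and compute the next state (the real update rule isn't shown in the post).
          rdd0 = rdd0.join(rdd1).mapValues { case (v, agg) => v + agg }
        }
        println(rdd0.collect().mkString(", "))
        sc.stop()
      }
    }

Note that each iteration extends the lineage of rdd_0, which is also relevant to the StackOverflowError discussion further down.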
My code is as follows:

documents.flatMap { case words => words.map(w => (w, 1)) }.reduceByKey(_ + _).collect()

In the driver's log, reduceByKey() is finished, but collect() seems to run forever and never finishes. In addition, there are about 200,000,000 words that need to be collected. Is it
Thanks rxin, I still have a doubt about collect().
The number of words before reduceByKey() is about 200 million, and after reduceByKey() it decreases to 18 million. The driver's memory is initialized to 15 GB, and when I print runtime.freeMemory() before reduceByKey(), it shows 13 GB of free memory.
If memory were not enough, an OutOfMemoryError should occur, but there is nothing in the driver's log.
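A rough back-of-envelope check (my own estimate, not from the thread): 18 million collected (word, count) pairs, at something like 100 bytes each once String, Tuple2 and boxing overhead are counted, come to roughly 18,000,000 x 100 B, i.e. about 1.8 GB, so raw heap space alone would not obviously explain the hang with 13 GB free.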
hi, TD. Thanks very much! I got it.
Hi TD, I also fell into the trap of long lineage, and your suggestions do work well. But I don't understand why a long lineage can cause a StackOverflowError, and where it takes effect.
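In case it helps others: a minimal sketch (my own illustration, not code from this thread) of how the lineage grows in an iterative job and how periodic checkpointing truncates it. Each iteration adds one more RDD to the lineage, and walking a very deep DAG recursively during job submission/serialization is what can overflow the stack:

    import org.apache.spark.{SparkConf, SparkContext}

    object LongLineageSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("LongLineageSketch"))
        sc.setCheckpointDir("/tmp/spark-checkpoints")   // hypothetical directory

        var rdd = sc.parallelize(1 to 1000000)
        for (i <- 1 to 1000) {
          rdd = rdd.map(_ + 1)        // every iteration adds one more RDD to the lineage
          if (i % 50 == 0) {
            rdd.checkpoint()          // cut the lineage so the DAG stays shallow
            rdd.count()               // force materialization so the checkpoint actually runs
          }
        }
        println(rdd.count())
        sc.stop()
      }
    }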
rdd.coalesce() will do the trick:

rdd.coalesce(1, true).saveAsTextFile(save_path)
My program runs in standalone mode; the command line is like this:
/opt/spark-1.0.0/bin/spark-submit \
--verbose \
--class $class_name --master spark://master:7077 \
--driver-memory 15G \
--driver-cores 2 \
--deploy-mode cluster \
Hi Andrew Ash, thanks for your reply.
In fact, I have already used unpersist(), but it doesn't take effect.
One reason I chose version 1.0.0 is precisely that it provides the unpersist() interface.
My code is just like the following:

var rdd1 = ...
var rdd2 = ...
var kv = ...
for (i <- 0 until n) {
  var kvGlobal = sc.broadcast(kv)   // broadcast kv
  rdd1 = rdd2.map {
    case t => doSomething(t, kvGlobal.value)
  }
  var tmp =
Even when rdd1 is cached, it has no effect:

var rdd1 = ...
var rdd2 = ...
var kv = ...
for (i <- 0 until n) {
  var kvGlobal = sc.broadcast(kv)   // broadcast kv
  rdd1 = rdd2.map {
    case t => doSomething(t, kvGlobal.value)
  }.cache()
  var tmp =
But when I put the broadcast variable outside the for loop, it works well (leaving aside the memory issue you pointed out):

var rdd1 = ...
var rdd2 = ...
var kv = ...
var kvGlobal = sc.broadcast(kv)   // broadcast kv
for (i <- 0 until n) {
  rdd1 =
I am running Spark 1.0.0, the newest under-development version.
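For reference, a self-contained sketch of the per-iteration broadcast pattern being discussed (kv, doSomething and the data are my own placeholders, not the original code). Re-broadcasting every iteration and explicitly unpersisting the previous broadcast is one common way to get updated values to the workers without piling up old blocks:

    import org.apache.spark.{SparkConf, SparkContext}

    object BroadcastLoopSketch {
      // Hypothetical per-record update that uses the broadcast map.
      def doSomething(t: (Int, Double), kv: Map[Int, Double]): (Int, Double) =
        (t._1, t._2 + kv.getOrElse(t._1, 0.0))

      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("BroadcastLoopSketch"))
        val rdd2 = sc.parallelize(Seq((1, 1.0), (2, 2.0), (3, 3.0))).cache()
        var kv = Map(1 -> 0.1, 2 -> 0.2, 3 -> 0.3)

        for (i <- 0 until 10) {
          val kvGlobal = sc.broadcast(kv)                  // broadcast the current kv
          val rdd1 = rdd2.map(t => doSomething(t, kvGlobal.value))
          kv = rdd1.collect().toMap                        // pull the updated values back to the driver
          kvGlobal.unpersist(blocking = true)              // drop the previous broadcast's blocks
        }
        println(kv)
        sc.stop()
      }
    }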
I found that reading the small broadcast variable always took about 10 s, not 5 s or any other value.
Is there some property/conf (whose default is 10) that controls this timeout?
In my code there are two broadcast variables. Sometimes reading the small one took more time than reading the big one, which is very strange!
The log on a slave node is as follows:

Block broadcast_2 stored as values to memory (estimated size 4.0 KB, free 17.2 GB)
Reading broadcast variable 2 took 9.998537123 s

In addition, reading the big broadcast variable always took about 2 s.
I got it, thanks very much :)
My code is like this:

rdd2 = rdd1.filter(_._2.length > 1)
rdd2.collect()

It works well, but if I use a variable num instead of 1:

var num = 1
rdd2 = rdd1.filter(_._2.length > num)
rdd2.collect()

it fails at rdd2.collect(). So strange!
14/04/23 17:17:40 INFO DAGScheduler: Failed to run collect at SparkListDocByTopic.scala:407
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
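The replies aren't quoted here, but one common cause of a filter failing only when it captures a variable (a guess on my part, not confirmed by the visible messages) is that num is a field of an enclosing class: referencing it inside the closure captures the whole instance, and if that class is not serializable the job fails as soon as the action ships the tasks. A minimal sketch of that failure mode and the usual local-copy workaround:

    import org.apache.spark.SparkContext

    // Hypothetical enclosing class; note it is NOT Serializable.
    class DocFilter(sc: SparkContext) {
      var num = 1

      def failing(): Array[(Long, Array[Int])] = {
        val rdd1 = sc.parallelize(Seq((1L, Array(1, 2)), (2L, Array(3))))
        // num is really this.num, so the closure drags in the whole DocFilter instance.
        rdd1.filter(_._2.length > num).collect()
      }

      def working(): Array[(Long, Array[Int])] = {
        val threshold = num                    // copy the field into a local val first
        val rdd1 = sc.parallelize(Seq((1L, Array(1, 2)), (2L, Array(3))))
        rdd1.filter(_._2.length > threshold).collect()
      }
    }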
@Cheng Lian-2, Sourav Chandra, thanks very much.
You are right! The situation is just like what you said. So nice!
I just call saveAsTextFile() twice. 'doc_topic_dist' is of type RDD[(Long, Array[Int])]; each element is a pair of (doc, topic_arr), and for the same doc the two output files contain different topic_arr values.

...
doc_topic_dist.coalesce(1, true).saveAsTextFile(save_path)

It's OK when I call doc_topic_dist.cache() first.
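A minimal sketch of that behaviour (the data and the randomness are my own placeholders; in the LDA case the nondeterminism presumably comes from the Gibbs sampling step): without cache(), every action recomputes the lineage, so any random choice is made again and the two output files can differ; caching materializes the result once, so both saves see the same data:

    import scala.util.Random
    import org.apache.spark.{SparkConf, SparkContext}

    object DoubleSaveSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("DoubleSaveSketch"))

        // Hypothetical stand-in for doc_topic_dist: a nondeterministic topic assignment per doc.
        val docTopicDist = sc.parallelize(1L to 100L)
          .map(doc => (doc, Array.fill(3)(Random.nextInt(10))))
          .cache()                              // without this, each save recomputes and re-samples

        def dump(path: String): Unit =
          docTopicDist
            .map { case (doc, topics) => doc + "\t" + topics.mkString(" ") }
            .coalesce(1, true)
            .saveAsTextFile(path)

        dump("/tmp/out_a")                      // hypothetical output paths
        dump("/tmp/out_b")                      // same values as out_a (order aside) thanks to the cache()
        sc.stop()
      }
    }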