computing time.
Code I use to train the model:

int MAX_BINS = 16;
int NUM_CLASSES = 0;
double MIN_INFO_GAIN = 0.0;
int MAX_MEMORY_IN_MB = 256;
double SUBSAMPLING_RATE = 1.0;
boolean USE_NODEID_CACHE = true;
int CHECKPOINT_INTERVAL = 10;
int RANDOM_SEED = 12345;
int NODE_SIZE = 5;
int maxDepth = 30;
int numTrees = 50;

Strategy strategy = new Strategy(Algo.Regression(), Variance.instance(),
    maxDepth, NUM_CLASSES, MAX_BINS, QuantileStrategy.Sort(),
    new scala.collection.immutable.HashMap<>(), NODE_SIZE, MIN_INFO_GAIN,
    MAX_MEMORY_IN_MB, SUBSAMPLING_RATE, USE_NODEID_CACHE, CHECKPOINT_INTERVAL);
RandomForestModel model = RandomForest.trainRegressor(labeledPoints.rdd(),
    strategy, numTrees, "auto", RANDOM_SEED);

Any advice would be highly appreciated.

The exception (~3000 lines long):
java.lang.StackOverflowError
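An editor's note, not from the original thread: with maxDepth = 30 each tree can hold deeply nested Node objects, and both tree construction and model (de)serialization recurse over that structure, which can exhaust the default JVM thread stack long before memory runs out. A hedged sketch of one common mitigation in Scala: raise the thread stack size (the -Xss value is illustrative; lowering maxDepth is the other obvious lever).

```scala
import org.apache.spark.SparkConf

// Sketch under assumptions, not a verified fix for this exact report:
// give executor threads a larger stack so deep recursion over the
// tree/RDD object graph can complete.
val conf = new SparkConf()
  .setAppName("random-forest-training")
  .set("spark.executor.extraJavaOptions", "-Xss64m") // illustrative size

// The driver JVM's stack size must be set at launch time instead, e.g.:
//   spark-submit --driver-java-options "-Xss64m" ...
```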
---

I'm running SQL queries (sqlContext.sql()) on Parquet tables and facing a
problem with table caching (sqlContext.cacheTable()), using the spark-shell of
Spark 1.5.1.

After I run sqlContext.cacheTable(table), the sqlContext.sql(query) takes
longer the first time (well, for the lazy execution reason) but it finishes
and returns results. However, the weird thing is that after I run the same
query again, I get the error: "java.lang.StackOverflowError".

I Googled it but didn't find the error appearing with table caching and
querying.

Any hint is appreciated.
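For concreteness, a minimal Scala sketch of the reported workflow (the table name, path, and query are illustrative; the poster's actual query is not shown), as run in the Spark 1.5.x spark-shell where sqlContext is predefined:

```scala
// spark-shell, Spark 1.5.x: `sqlContext` is created by the shell.
val df = sqlContext.read.parquet("/path/to/table.parquet") // illustrative path
df.registerTempTable("events")

sqlContext.cacheTable("events")

// First run: slow, because caching is lazy and this query materializes it.
sqlContext.sql("SELECT id, COUNT(*) FROM events GROUP BY id").show()

// Second run of the same query: this is where the poster reports
// java.lang.StackOverflowError.
sqlContext.sql("SELECT id, COUNT(*) FROM events GROUP BY id").show()
```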
---

What code is triggering the stack overflow?

On Mon, Feb 29, 2016 at 11:13 PM, Vinti Maheshwari wrote:
> Hi All,
>
> I am getting the below error in a spark-streaming application; I am using
> Kafka for the input stream. When I was doing it with a socket, it was
> working fine. But when I […]
>
> […] = {
>   uA ++= u
> })
> var uRDD = sparkContext.parallelize(uA.value)
>
> It's failing on a large dataset with the following error:
>
> java.io.IOException: java.lang.StackOverflowError
>     at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1140)
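An editor's aside, heavily hedged: the fragment suggests per-batch results are merged into a driver-side accumulator (uA) and then re-parallelized, so the object graph serialized with each job keeps growing across batches. For long-running streaming jobs the standard lineage-truncation remedy is checkpointing; a minimal sketch (the batch interval and path are illustrative, and `sparkContext` is the poster's variable):

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch: enable checkpointing so state/lineage that grows across batches
// is periodically persisted and the in-memory object graph stays shallow.
val ssc = new StreamingContext(sparkContext, Seconds(10))
ssc.checkpoint("hdfs:///tmp/streaming-checkpoints") // illustrative path
```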
---

java.lang.StackOverflowError
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
I tested the examples according to the docs in the Spark SQL programming
guide, but the java.lang.StackOverflowError occurred every time I called
sqlContext.sql(...). Meanwhile, it worked fine in a HiveContext. The Hadoop
version is 2.2.0, and the Spark version is 1.1.0, built with YARN and Hive. I
would […]
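An editor's observation, hedged: the Parsers.scala frames above come from Scala's parser combinators, which Spark 1.1's plain SQLContext used for SQL parsing, so the overflow disappearing under a HiveContext (a different parser) is consistent with the report. A minimal sketch of that workaround (the query is illustrative):

```scala
// Spark 1.1.x sketch: HiveContext parses queries with HiveQL rather than the
// parser-combinator SQL parser seen in the stack frames above.
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
hiveContext.sql("SELECT key, value FROM src LIMIT 10").collect()
```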
---

The long lineage causes a long/deep Java object tree (the DAG of RDD
objects), which needs to be serialized as part of task creation. When
serializing, the whole object DAG needs to be traversed, leading to the
StackOverflowError.
TD
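To make TD's point concrete, a sketch (an editor's illustration, not code from the thread): an iterative job that keeps re-assigning an RDD adds one node to the lineage per iteration, and periodic checkpointing truncates the object graph that task serialization must walk.

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("lineage-demo"))
sc.setCheckpointDir("/tmp/checkpoints") // must be set before checkpoint()

var rdd = sc.parallelize(1 to 1000000)
for (i <- 1 to 1000) {
  rdd = rdd.map(_ + 1) // each iteration deepens the lineage by one RDD
  if (i % 50 == 0) {
    rdd.checkpoint() // marks the RDD; lineage is cut once it materializes
    rdd.count()      // an action forces the checkpoint to actually happen
  }
}
```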
On Mon, Aug 11, 2014 at 7:14 PM, randylu randyl...@gmail.com wrote:

Hi, TD. Thanks very much! I got it.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/java-lang-StackOverflowError-when-calling-count-tp5649p11980.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
---

Hi, TD. I also fell into the trap of long lineage, and your suggestions do
work well. But I don't understand why the long lineage can cause a stack
overflow, and where it takes effect?
---

File /python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py, line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o9564.saveAsTextFile.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task
serialization failed: java.lang.StackOverflowError
java.io.Bits.putInt(Bits.java:93)
java.io.ObjectOutputStream$BlockDataOutputStream.writeInt(ObjectOutputStream.java
---

Responses inline.

On Wed, Jul 23, 2014 at 4:13 AM, lalit1303 la...@sigmoidanalytics.com wrote:

Hi,
Thanks TD for your reply. I am still not able to resolve the problem for my
use case.
I have, let's say, 1000 different RDDs, and I am applying a transformation
function on each RDD, and I want the output of all RDDs combined into a
single output RDD. For this, I am doing the following:
*Loop […]
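The loop body is truncated above, but per the description it combines ~1000 transformed RDDs, and chaining result = result.union(next) a thousand times nests a thousand RDD objects into the lineage. A hedged sketch of the usual reshaping (names, element type, and the `_ + 1` transformation are placeholders): build the union in one call, which creates a single flat UnionRDD.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Sketch with assumptions: `rdds` stands in for the ~1000 input RDDs and
// `_ + 1` for the poster's per-RDD transformation function.
def combineAll(sc: SparkContext, rdds: Seq[RDD[Int]]): RDD[Int] = {
  val transformed = rdds.map(_.map(_ + 1))
  // One flat UnionRDD keeps the serialized object graph shallow, unlike
  // 1000 nested pairwise union() calls.
  sc.union(transformed)
}
```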
---

CODE: print round, round, rdd__new.count()

File /home1/ghyan/Software/spark-0.9.0-incubating-bin-hadoop2/python/pyspark/rdd.py, line 542, in count
    return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum()
14/05/12 16:20:28 INFO TaskSetManager: Loss was due to java.lang.StackOverflowError
14/05/12 16:20:28 INFO TaskSetManager: Loss was due to java.lang.StackOverflowError [duplicate 1]
14/05/12 16:20:28 ERROR TaskSetManager: Task 8419.0:0 failed 1 times; aborting job