Please try decreasing spark.serializer.objectStreamReset for your job. The
default value is 100.
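
For example, a minimal way to set it when creating the context (the value
10 below is just an illustrative starting point, not a tuned
recommendation):

    from pyspark import SparkConf, SparkContext

    # Reset the Java serializer's ObjectOutputStream more often than the
    # default of every 100 objects, so the serializer's reference cache
    # is flushed more frequently.
    conf = SparkConf().set("spark.serializer.objectStreamReset", "10")
    sc = SparkContext(conf=conf)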

I have logged SPARK-10787 to track an improvement.

Cheers

On Wed, Sep 23, 2015 at 6:59 PM, jluan <jaylu...@gmail.com> wrote:

> I have been stuck on this problem for the last few days:
>
> I am attempting to run a random forest from MLlib. It gets through most of
> the training, but breaks during a mapPartitions operation. The following
> stack trace is shown:
>
> : An error occurred while calling o94.trainRandomForestModel.
> : java.lang.OutOfMemoryError
>         at java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
>         at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117)
>         at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>         at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
>         at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1877)
>         at java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode(ObjectOutputStream.java:1786)
>         at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1189)
>         at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
>         at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:44)
>         at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:84)
>         at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:301)
>         at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:294)
>         at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:122)
>         at org.apache.spark.SparkContext.clean(SparkContext.scala:2021)
>         at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:703)
>         at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:702)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
>         at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
>         at org.apache.spark.rdd.RDD.mapPartitions(RDD.scala:702)
>         at org.apache.spark.mllib.tree.DecisionTree$.findBestSplits(DecisionTree.scala:625)
>         at org.apache.spark.mllib.tree.RandomForest.run(RandomForest.scala:235)
>         at org.apache.spark.mllib.tree.RandomForest$.trainClassifier(RandomForest.scala:291)
>         at org.apache.spark.mllib.api.python.PythonMLLibAPI.trainRandomForestModel(PythonMLLibAPI.scala:742)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
>         at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
>         at py4j.Gateway.invoke(Gateway.java:259)
>         at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
>         at py4j.commands.CallCommand.execute(CallCommand.java:79)
>         at py4j.GatewayConnection.run(GatewayConnection.java:207)
>         at java.lang.Thread.run(Thread.java:745)
>
> It seems to me that it's trying to serialize the mapPartitions closure but
> runs out of space doing so. However, I don't understand how it could run
> out of space when I gave the driver ~190GB for a file that's 45MB.
>
> I have a cluster set up on AWS where the master is an r3.8xlarge and the
> two workers are r3.4xlarge instances. I have the following configuration:
>
> spark version: 1.5.0
> -----------------------------------
> spark.executor.memory 32000m
> spark.driver.memory 230000m
> spark.driver.cores 10
> spark.executor.cores 5
> spark.executor.instances 17
> spark.driver.maxResultSize 0
> spark.storage.safetyFraction 1
> spark.storage.memoryFraction 0.9
> spark.storage.shuffleFraction 0.05
> spark.default.parallelism 128
>
> The master machine has approximately 240GB of RAM and each worker has
> about 120GB of RAM.
>
> I load in a relatively tiny RDD of MLlib LabeledPoint objects, each holding
> a sparse vector. This RDD has a total size of roughly 45MB. Each sparse
> vector has a length of ~15 million, of which only about 3000 entries are
> non-zero.
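>
> Roughly, the training code looks like this (assuming sc is an existing
> SparkContext; the data values, numTrees, and maxDepth below are
> illustrative placeholders rather than my exact settings):
>
>     from pyspark.mllib.linalg import SparseVector
>     from pyspark.mllib.regression import LabeledPoint
>     from pyspark.mllib.tree import RandomForest
>
>     # Each row is a ~15-million-dimensional sparse vector with only a few
>     # thousand non-zero entries (indices/values here are placeholders).
>     point = LabeledPoint(1.0, SparseVector(15000000, [3, 7, 4200], [1.0, 2.5, 0.3]))
>     data = sc.parallelize([point])
>
>     model = RandomForest.trainClassifier(
>         data, numClasses=2, categoricalFeaturesInfo={},
>         numTrees=100, featureSubsetStrategy="auto",
>         impurity="gini", maxDepth=5)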
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-ClosureCleaner-or-java-serializer-OOM-when-trying-to-grow-tp24796.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>
