Re: java.lang.OutOfMemoryError: Java heap space when running job via spark-submit

2014-10-09 Thread Jaonary Rabarisoa
In fact, with --driver-memory 2G I can get it working.
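
In full, the working invocation is the one from my original message below, with the driver-memory --conf swapped for the flag:

./spark-1.1.0-bin-hadoop2.4/bin/spark-submit \
 --class value.jobs.MyJob \
 --master local[4] \
 --conf spark.executor.memory=4g \
 --driver-memory 2g \
 target/scala-2.10/my-job_2.10-1.0.jar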

On Thu, Oct 9, 2014 at 6:20 PM, Xiangrui Meng  wrote:

> Please use --driver-memory 2g instead of --conf
> spark.driver.memory=2g. I'm not sure whether this is a bug. -Xiangrui


Re: java.lang.OutOfMemoryError: Java heap space when running job via spark-submit

2014-10-09 Thread Xiangrui Meng
Please use --driver-memory 2g instead of --conf
spark.driver.memory=2g. I'm not sure whether this is a bug. -Xiangrui
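
A likely mechanism, assuming this is the usual client-mode limitation: by the time --conf values (or SparkConf settings) are read, the driver JVM is already running, so spark.driver.memory can no longer resize its heap. --driver-memory, on the other hand, is consumed by the spark-submit launcher itself and becomes the driver JVM's -Xmx before startup, roughly as if the job were launched like this (illustrative sketch only, not the actual launcher command):

# Hypothetical: what --driver-memory 2g effectively arranges in local mode;
# <spark-assembly-jar> is a placeholder, and spark-submit really goes
# through its own main class rather than the user class directly.
java -Xmx2g -cp <spark-assembly-jar>:my-job_2.10-1.0.jar value.jobs.MyJob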





java.lang.OutOfMemoryError: Java heap space when running job via spark-submit

2014-10-09 Thread Jaonary Rabarisoa
Dear all,

I have a Spark job with the following configuration:

val conf = new SparkConf()
  .setAppName("My Job")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrator", "value.serializer.Registrator")
  .setMaster("local[4]")
  .set("spark.executor.memory", "4g")


which I can run manually with sbt run without any problem.
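
For context: with master local[4] everything runs inside a single JVM, so under sbt run the effective heap is whatever the sbt-launched JVM was given, not spark.executor.memory. A minimal build.sbt sketch of that kind of setup (assuming sbt 0.13 and a forked run; the actual build file is not shown here):

// build.sbt sketch: fork the `run` task and give the forked JVM a 4g heap.
// In local mode, spark.executor.memory does not resize an already-running JVM.
fork in run := true
javaOptions in run += "-Xmx4g"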

But when I try to run the same job with spark-submit

./spark-1.1.0-bin-hadoop2.4/bin/spark-submit \
 --class value.jobs.MyJob \
 --master local[4] \
 --conf spark.executor.memory=4g \
 --conf spark.driver.memory=2g \
 target/scala-2.10/my-job_2.10-1.0.jar


I get the following error:

Exception in thread "stdin writer for List(patch_matching_similarity)" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2271)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at com.esotericsoftware.kryo.io.Output.flush(Output.java:155)
at com.esotericsoftware.kryo.io.Output.require(Output.java:135)
at com.esotericsoftware.kryo.io.Output.writeString_slow(Output.java:420)
at com.esotericsoftware.kryo.io.Output.writeString(Output.java:326)
at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.write(DefaultSerializers.java:153)
at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.write(DefaultSerializers.java:146)
at com.esotericsoftware.kryo.Kryo.writeObjectOrNull(Kryo.java:549)
at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:570)
at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:568)
at org.apache.spark.serializer.KryoSerializationStream.writeObject(KryoSerializer.scala:119)
at org.apache.spark.serializer.SerializationStream.writeAll(Serializer.scala:110)
at org.apache.spark.storage.BlockManager.dataSerializeStream(BlockManager.scala:1047)
at org.apache.spark.storage.BlockManager.dataSerialize(BlockManager.scala:1056)
at org.apache.spark.storage.MemoryStore.putArray(MemoryStore.scala:93)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:745)
at org.apache.spark.storage.BlockManager.putArray(BlockManager.scala:625)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:167)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:70)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
at org.apache.spark.rdd.FilteredRDD.compute(FilteredRDD.scala:34)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
at org.apache.spark.rdd.CartesianRDD$$anonfun$compute$1.apply(CartesianRDD.scala:75)
at org.apache.spark.rdd.CartesianRDD$$anonfun$compute$1.apply(CartesianRDD.scala:74)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)


I don't understand why, since I set the same amount of memory in both
cases.
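
One way to compare the two runs directly is to log the JVM's max heap next to where the SparkConf is built; a minimal sketch (this is not part of the original job):

// Log the max heap of the JVM driving the job, to compare the
// `sbt run` and `spark-submit` cases side by side.
val maxHeapMb = Runtime.getRuntime.maxMemory / (1024L * 1024L)
println(s"Driver JVM max heap: $maxHeapMb MB")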

Any ideas would be helpful. I am using Spark 1.1.0.

Cheers,

Jao