Re: java.lang.OutOfMemoryError: Java heap space when running job via spark-submit
In fact, with --driver-memory 2G I can get it working.

On Thu, Oct 9, 2014 at 6:20 PM, Xiangrui Meng wrote:
> Please use --driver-memory 2g instead of --conf spark.driver.memory=2g.
> I'm not sure whether this is a bug. -Xiangrui
> [...]
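[Editor's illustration] Putting the thread's resolution together: the invocation below is the original spark-submit command with the driver heap moved from `--conf spark.driver.memory=2g` to the `--driver-memory` flag, which is what the poster confirms works. Paths, class, and jar names are taken verbatim from the original post; this is a sketch of the fix, not a tested command.

```shell
./spark-1.1.0-bin-hadoop2.4/bin/spark-submit \
  --class value.jobs.MyJob \
  --master local[4] \
  --driver-memory 2g \
  --conf spark.executor.memory=4g \
  target/scala-2.10/my-job_2.10-1.0.jar
```

Note that with `--master local[4]` the tasks run inside the driver JVM, so the driver heap is the one that matters for the OOM here.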
Re: java.lang.OutOfMemoryError: Java heap space when running job via spark-submit
Please use --driver-memory 2g instead of --conf spark.driver.memory=2g. I'm not sure whether this is a bug.

-Xiangrui

On Thu, Oct 9, 2014 at 9:00 AM, Jaonary Rabarisoa wrote:
> Dear all,
>
> I have a spark job with the following configuration [...]
java.lang.OutOfMemoryError: Java heap space when running job via spark-submit
Dear all,

I have a spark job with the following configuration

  val conf = new SparkConf()
    .setAppName("My Job")
    .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .set("spark.kryo.registrator", "value.serializer.Registrator")
    .setMaster("local[4]")
    .set("spark.executor.memory", "4g")

that I can run manually with "sbt run" without any problem.

But when I try to run the same job with spark-submit

  ./spark-1.1.0-bin-hadoop2.4/bin/spark-submit \
    --class value.jobs.MyJob \
    --master local[4] \
    --conf spark.executor.memory=4g \
    --conf spark.driver.memory=2g \
    target/scala-2.10/my-job_2.10-1.0.jar

I get the following error:

  Exception in thread "stdin writer for List(patch_matching_similarity)" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2271)
    at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
    at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
    at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
    at com.esotericsoftware.kryo.io.Output.flush(Output.java:155)
    at com.esotericsoftware.kryo.io.Output.require(Output.java:135)
    at com.esotericsoftware.kryo.io.Output.writeString_slow(Output.java:420)
    at com.esotericsoftware.kryo.io.Output.writeString(Output.java:326)
    at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.write(DefaultSerializers.java:153)
    at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.write(DefaultSerializers.java:146)
    at com.esotericsoftware.kryo.Kryo.writeObjectOrNull(Kryo.java:549)
    at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:570)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
    at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:568)
    at org.apache.spark.serializer.KryoSerializationStream.writeObject(KryoSerializer.scala:119)
    at org.apache.spark.serializer.SerializationStream.writeAll(Serializer.scala:110)
    at org.apache.spark.storage.BlockManager.dataSerializeStream(BlockManager.scala:1047)
    at org.apache.spark.storage.BlockManager.dataSerialize(BlockManager.scala:1056)
    at org.apache.spark.storage.MemoryStore.putArray(MemoryStore.scala:93)
    at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:745)
    at org.apache.spark.storage.BlockManager.putArray(BlockManager.scala:625)
    at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:167)
    at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:70)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
    at org.apache.spark.rdd.FilteredRDD.compute(FilteredRDD.scala:34)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.rdd.CartesianRDD$$anonfun$compute$1.apply(CartesianRDD.scala:75)
    at org.apache.spark.rdd.CartesianRDD$$anonfun$compute$1.apply(CartesianRDD.scala:74)
    at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)

I don't understand why, since I set the same amount of memory in the two cases.

Any ideas will be helpful. I use Spark 1.1.0.

Cheers,

Jao
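[Editor's illustration] The behavior reported in this thread is consistent with how any JVM works: the heap ceiling is fixed by -Xmx when the process is launched, so a setting that is only read after the driver JVM has started cannot enlarge the heap, while --driver-memory influences the launch itself. A minimal, Spark-free sketch (the class name `HeapCheck` is hypothetical, not from the thread):

```java
// HeapCheck.java -- illustrative only.
// Runtime.maxMemory() reports the heap ceiling (-Xmx) the JVM was
// launched with; setting a system property from inside the running
// process cannot raise that ceiling.
public class HeapCheck {

    // Maximum heap available to this JVM, in megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Has no effect on the already-fixed heap size.
        System.setProperty("spark.driver.memory", "2g");
        System.out.println("max heap: " + maxHeapMb() + " MB");
    }
}
```

Running this with `java -Xmx2g HeapCheck` versus plain `java HeapCheck` shows the reported ceiling tracking the launch flag, never the property set at runtime.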