Hi All,

 

I am a bit confused about spark.storage.memoryFraction. This setting controls the memory region reserved for RDD storage, but does "RDD" here mean only cached and persisted RDDs? If my program caches no RDDs at all (i.e., I never call .cache() or .persist() on any RDD), can I set spark.storage.memoryFraction to a very small number, or even zero?
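
To make the question concrete, here is roughly what I have in mind (the app name and the 0.05 value are just placeholders, not my real settings):

import org.apache.spark.{SparkConf, SparkContext}

// No .cache() or .persist() is called anywhere in the job,
// so is it safe to shrink the storage region like this?
val conf = new SparkConf()
  .setAppName("NoCacheJob")                     // placeholder name
  .set("spark.storage.memoryFraction", "0.05")  // placeholder value
val sc = new SparkContext(conf)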

 

I am writing a program that consumes a lot of memory (broadcast values, runtime data structures, etc.), but it has no cached RDDs. Should I just set spark.storage.memoryFraction to 0, and would that improve performance?
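
The shape of the program is roughly this, using the sc from the sketch above (loadTable() and the element type are placeholders for my own code):

// Driver side: a large lookup table is shipped to every executor via broadcast;
// nothing is ever cached or persisted.
val table: Array[Double] = loadTable()   // placeholder for my own loading code
val bcTable = sc.broadcast(table)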

 

I also have a separate issue with broadcast: when I try to read a broadcast value, I get an out-of-memory error. Which part of memory should I allocate more to (given that I can't increase the overall memory size)?
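
The error shows up when the executors first touch the broadcast value, roughly like this (the map function is only illustrative):

// Executor side: the first access to bcTable.value deserializes and unrolls the
// whole broadcast block in memory, and that is where the OOM below is thrown.
val result = sc.parallelize(1 to 1000000)
  .map(i => bcTable.value(i % bcTable.value.length))
  .count()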

 

java.lang.OutOfMemoryError: Java heap space
        at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$DoubleArraySerializer.read(DefaultArraySerializers.java:218)
        at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$DoubleArraySerializer.read(DefaultArraySerializers.java:200)
        at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:699)
        at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:611)
        at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
        at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
        at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
        at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
        at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
        at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:138)
        at org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:133)
        at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
        at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:248)
        at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:136)
        at org.apache.spark.storage.BlockManager.doGetLocal(BlockManager.scala:549)
        at org.apache.spark.storage.BlockManager.getLocal(BlockManager.scala:431)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:167)
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1152)
        at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
        at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
        at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
        at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)

 

 

Regards,

 

Shuai
