Spark memory distribution

2020-07-27 Thread dben
Hi,
I have a computation built on top of a big dynamic model that is constantly
changing / receiving online updates. Batch (stateless) mode would require
sending the heavy model to Spark for every run, so it seemed less appropriate
than stream mode. I was therefore able to run the computation in stream mode,
with the model living in Spark and receiving the live updates it needs.
However, the main motivation for using Spark was complexity reduction /
compute speedup through a distributed algorithm, and state management is
evidently challenging in terms of memory resources; I don't want it to
overwhelm or disrupt the computation.
My main question is: as effective as Spark may be computation-wise (using a
corresponding distributed algorithm), what degree of control does the user
have over its memory footprint?
For example, is splitting that memory between the workers feasible, or is all
memory eventually centralized on the Spark master (or the driver, depending on
deploy mode: client or cluster)?
I'm basically looking for a way to scale out memory-wise and not just
compute-wise. Would transforming the central data structure into RDDs (which
I'm already using for compute) also relieve/distribute the memory footprint?
For example, can I split my main in-memory data structure into sets of
partitions assigned to each worker?
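
Roughly, something like this sketch is what I have in mind (a plain Scala Map
stands in for the real model; all types and sizes here are hypothetical):

import org.apache.spark.{SparkConf, SparkContext, HashPartitioner}
import org.apache.spark.storage.StorageLevel

val sc = new SparkContext(new SparkConf().setAppName("distribute-model-sketch"))

// Placeholder for the central model structure currently held in one JVM.
val bigModel: Map[Long, Array[Double]] = Map(1L -> Array(0.1, 0.2), 2L -> Array(0.3, 0.4))

// Split it by key across the cluster, so each worker holds only its own partitions.
val distributedModel = sc.parallelize(bigModel.toSeq)
  .partitionBy(new HashPartitioner(sc.defaultParallelism))
  .persist(StorageLevel.MEMORY_AND_DISK)

println(distributedModel.count())
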
thanks a lot,
David






Re: Understanding Spark Memory distribution

2015-04-13 Thread Imran Rashid
Broadcast variables count towards spark.storage.memoryFraction, so they
use the same pool of memory as cached RDDs.

That being said, I'm really not sure why you are running into problems; it
seems like you have plenty of memory available.  Most likely it's got
nothing to do with broadcast variables or caching -- it's just whatever
logic you are applying in your transformations that is causing lots of GC
during the computation.  It's hard to say without knowing more details.

You could try increasing the timeout for the failed askWithReply by
increasing spark.akka.lookupTimeout (defaults to 30), but that would most
likely be treating a symptom, not the root cause.
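
For reference, a rough sketch of how those two knobs could be set on a Spark
1.x SparkConf (the numbers are only illustrative, not a recommendation):

import org.apache.spark.SparkConf

// Illustrative only, for Spark 1.x where these properties exist.
// With a ~26 GB executor heap, the storage pool is roughly
//   26 GB * spark.storage.memoryFraction (0.2) * spark.storage.safetyFraction (0.9) ~ 4.7 GB,
// close to the ~4.5 GB the block manager reports; broadcasts share that pool.
val conf = new SparkConf()
  .set("spark.storage.memoryFraction", "0.4")  // bigger pool for cached blocks and broadcasts
  .set("spark.akka.lookupTimeout", "120")      // seconds; raising it only masks the symptom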




Re: Understanding Spark Memory distribution

2015-03-30 Thread giive chen
Hi Ankur

If you are using standalone mode, your config is wrong. You should use export
SPARK_DAEMON_MEMORY=xxx in conf/spark-env.sh. At least it works on my
Spark 1.3.0 standalone-mode machine.

BTW, SPARK_DRIVER_MEMORY is used in YARN mode, and it looks like
standalone mode doesn't use this setting.

To debug this, please run "ps auxw | grep
org.apache.spark.deploy.master.[M]aster" on the master machine.
You can then see the -Xmx and -Xms options.
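
For example, a minimal conf/spark-env.sh sketch (the values are only examples,
not a recommendation for your cluster):

# conf/spark-env.sh
export SPARK_DAEMON_MEMORY=2g     # heap for the standalone Master/Worker daemons themselves
export SPARK_WORKER_MEMORY=28g    # memory a Worker is allowed to hand out to executors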

Wisely Chen







Re: Understanding Spark Memory distribution

2015-03-30 Thread Ankur Srivastava
Hi Wisely,

I am running Spark 1.2.1 and I have checked the process heap; it is
running with all the heap I am assigning, and as I mentioned earlier I
get the OOM on the workers, not on the driver or master.

Thanks
Ankur


Re: Understanding Spark Memory distribution

2015-03-29 Thread Ankur Srivastava
Hi Wisely,

I am running on Amazon EC2 instances, so I don't suspect the hardware.
Moreover, my other pipelines run successfully; only this one, which involves
broadcasting a large object, fails.

My spark-env.sh settings are:

SPARK_MASTER_IP=MASTER-IP

SPARK_LOCAL_IP=LOCAL-IP

SPARK_DRIVER_MEMORY=24g

SPARK_WORKER_MEMORY=28g

SPARK_EXECUTOR_MEMORY=26g

SPARK_WORKER_CORES=8

My spark-defaults.conf settings are:

spark.eventLog.enabled   true

spark.eventLog.dir   /srv/logs/

spark.serializer org.apache.spark.serializer.KryoSerializer

spark.kryo.registrator   com.test.utils.KryoSerializationRegistrator

spark.executor.extraJavaOptions  -verbose:gc -XX:+PrintGCDetails
-XX:+PrintGCTimeStamps -XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/srv/logs/ -XX:+UseG1GC

spark.shuffle.consolidateFiles   true

spark.shuffle.manager   sort

spark.shuffle.compress   true

spark.rdd.compress   true
Thanks
Ankur



Re: Understanding Spark Memory distribution

2015-03-28 Thread Wisely Chen
Hi Ankur

If your hardware is OK, it looks like a config problem. Can you show me
your spark-env.sh or JVM config?

Thanks

Wisely Chen



Re: Understanding Spark Memory distribution

2015-03-28 Thread Wisely Chen
Hi

In a broadcast, Spark collects the whole 3 GB object on the master (driver)
node and then broadcasts it to each slave. It is a very common situation that
the master node doesn't have enough memory for that.
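
As a rough sketch of where the copies live (the tiny lookup map here is only
a hypothetical stand-in for your ~3 GB object):

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("broadcast-sketch"))

// The full object is built on the driver first...
val bigLookup: Map[String, Double] = Map("a" -> 1.0, "b" -> 2.0)  // stands in for the ~3 GB object

// ...then one copy is shipped to and cached on every executor.
val bcLookup = sc.broadcast(bigLookup)

val keys = sc.parallelize(Seq("a", "b", "c"))
val enriched = keys.map(k => (k, bcLookup.value.getOrElse(k, 0.0)))  // executors read their local copy
println(enriched.collect().mkString(", "))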

What are your master node's settings?

Wisely Chen



Understanding Spark Memory distribution

2015-03-27 Thread Ankur Srivastava
Hi All,

I am running a Spark cluster on EC2 instances of type m3.2xlarge. I have
given 26 GB of memory and all 8 cores to my executors. I can see that in
the logs too:

15/03/27 21:31:06 INFO AppClient$ClientActor: Executor added:
app-20150327213106-/0 on worker-20150327212934-10.x.y.z-40128
(10.x.y.z:40128) with 8 cores

I am not caching any RDDs, so I have set spark.storage.memoryFraction to
0.2. On the Spark UI, under the Executors tab, Memory Used shows 0.0 / 4.5 GB.

I am now confused by these logs:

15/03/27 21:31:08 INFO BlockManagerMasterActor: Registering block manager
10.77.100.196:58407 with 4.5 GB RAM, BlockManagerId(4, 10.x.y.z, 58407)

I am broadcasting a large object of 3 GB, and after that, when I create an
RDD, the logs show this 4.5 GB of memory filling up and then I
get an OOM.

How can I make the block manager use more memory?

Is there any other fine tuning I need to do for broadcasting large objects?

And does a broadcast variable use the cache memory or the rest of the heap?


Thanks

Ankur


Re: Understanding Spark Memory distribution

2015-03-27 Thread Ankur Srivastava
I have increased spark.storage.memoryFraction to 0.4, but I still get
OOM errors on the Spark executor nodes:


15/03/27 23:19:51 INFO BlockManagerMaster: Updated info of block
broadcast_5_piece10

15/03/27 23:19:51 INFO TorrentBroadcast: Reading broadcast variable 5 took
2704 ms

15/03/27 23:19:52 INFO MemoryStore: ensureFreeSpace(672530208) called with
curMem=2484698683, maxMem=9631778734

15/03/27 23:19:52 INFO MemoryStore: Block broadcast_5 stored as values in
memory (estimated size 641.4 MB, free 6.0 GB)

15/03/27 23:34:02 WARN AkkaUtils: Error sending message in 1 attempts

java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]

at
scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)

at
scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)

at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)

at
scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)

at scala.concurrent.Await$.result(package.scala:107)

at
org.apache.spark.util.AkkaUtils$.askWithReply(AkkaUtils.scala:187)

at
org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:407)

15/03/27 23:34:02 ERROR Executor: Exception in task 7.0 in stage 2.0 (TID
4007)

java.lang.OutOfMemoryError: GC overhead limit exceeded

at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1986)

at
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)

at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)

Thanks

Ankur
