Hi,
I'm running a computation on top of a big dynamic model that is constantly receiving changes / online updates. I therefore thought that batch (stateless) mode, which requires shipping the heavy model to Spark on every run, would be less appropriate than stream mode.
Broadcast variables count towards spark.storage.memoryFraction, so they use
the same pool of memory as cached RDDs.
That being said, I'm really not sure why you are running into problems; it
seems like you have plenty of memory available. Most likely it's got
nothing to do with broadcast.
Hi Ankur
If you are using standalone mode, your config is wrong. You should use export
SPARK_DAEMON_MEMORY=xxx in conf/spark-env.sh. At least it works on my
Spark 1.3.0 standalone-mode machine.
BTW, SPARK_DRIVER_MEMORY is used in YARN mode, and it looks like
standalone mode doesn't use it.
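For standalone mode, a minimal conf/spark-env.sh along those lines (values are illustrative; SPARK_DAEMON_MEMORY sizes the master/worker daemon JVMs themselves, while SPARK_WORKER_MEMORY caps what a worker can hand out to executors):

```shell
# conf/spark-env.sh -- illustrative values
export SPARK_DAEMON_MEMORY=2g     # heap for the master/worker daemon JVMs
export SPARK_WORKER_MEMORY=26g    # total memory a worker may give to executors
```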
Hi Wisely,
I am running Spark 1.2.1 and I have checked the process heap: it is
running with all the heap that I am assigning, and as I mentioned earlier I
get OOM on the workers, not on the driver or master.
Thanks
Ankur
On Mon, Mar 30, 2015 at 9:24 AM, giive chen thegi...@gmail.com wrote:
Hi Wisely,
I am running on Amazon EC2 instances, so I can't blame the hardware.
Moreover, my other pipelines run successfully, except for this one, which
involves broadcasting a large object.
My spark-env.sh settings are:
SPARK_MASTER_IP=MASTER-IP
SPARK_LOCAL_IP=LOCAL-IP
SPARK_DRIVER_MEMORY=24g
Hi Ankur
If your hardware is OK, it looks like a config problem. Can you show me
your spark-env.sh or JVM config?
Thanks
Wisely Chen
2015-03-28 15:39 GMT+08:00 Ankur Srivastava ankur.srivast...@gmail.com:
Hi Wisely,
I have 26gb for driver and the master is running on m3.2xlarge
Hi
In a broadcast, Spark will collect the whole 3GB object on the master node and
broadcast it to each slave. It is a very common situation that the master node
doesn't have enough memory.
What are your master node settings?
Wisely Chen
Ankur Srivastava ankur.srivast...@gmail.com wrote on Saturday, March 28, 2015:
Hi All,
I am running a Spark cluster on EC2 instances of type m3.2xlarge. I have
given 26GB of memory, with all 8 cores, to my executors. I can see that in
the logs too:
15/03/27 21:31:06 INFO AppClient$ClientActor: Executor added:
app-20150327213106-/0 on
I have increased spark.storage.memoryFraction to 0.4, but I still get
OOM errors on the Spark executor nodes:
15/03/27 23:19:51 INFO BlockManagerMaster: Updated info of block
broadcast_5_piece10
15/03/27 23:19:51 INFO TorrentBroadcast: Reading broadcast variable 5 took
2704 ms
15/03/27 23:19:52
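(For context on the "piece" names in the log above: TorrentBroadcast serializes the broadcast value and splits it into fixed-size blocks, which executors then fetch from each other BitTorrent-style. A plain-Python sketch of just the splitting step; the function name is ours, and the 4 MB size mirrors the spark.broadcast.blockSize default, not Spark's actual internals.)

```python
# Sketch of how TorrentBroadcast divides a serialized object into
# blocks named broadcast_<id>_piece<n>. Illustrative only.

BLOCK_SIZE = 4 * 1024 * 1024  # spark.broadcast.blockSize default (4 MB)

def split_into_pieces(payload: bytes, block_size: int = BLOCK_SIZE):
    """Split a serialized object into fixed-size pieces."""
    return [payload[i:i + block_size]
            for i in range(0, len(payload), block_size)]

if __name__ == "__main__":
    data = b"x" * (10 * 1024 * 1024)  # a 10 MB serialized object
    pieces = split_into_pieces(data)
    print(len(pieces))  # 3 pieces: 4 MB + 4 MB + 2 MB
```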