Spark memory distribution

2020-07-27 Thread dben
Hi, I'm running a computation on top of a big dynamic model that is constantly receiving changes / online updates. I therefore thought that working in batch mode (stateless), which requires sending the heavy model to Spark, would be less appropriate than working in stream mode. Therefore, was able to have…

Re: Understanding Spark Memory distribution

2015-04-14 Thread Ankur Srivastava
Thank you Imran!! I was able to solve the issue by setting "spark.storage.blockManagerSlaveTimeoutMs=30". As I was seeing some block manager timeouts on the master, I updated this setting and it fixed the timeout issue as well as the OOM errors on the workers. I am not really sure how it fixed the OOM…
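The timeout setting mentioned above can be passed at submit time. A minimal sketch for a Spark 1.x job follows; the timeout value (300000 ms), the class name, and the jar name are illustrative assumptions, not values from the thread:

```shell
# Sketch: raising the block-manager heartbeat timeout at submit time.
# 300000 ms (5 min) is a hypothetical value chosen for illustration.
spark-submit \
  --conf spark.storage.blockManagerSlaveTimeoutMs=300000 \
  --class com.example.MyJob \
  my-job.jar
```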

Re: Understanding Spark Memory distribution

2015-04-13 Thread Imran Rashid
Broadcast variables count towards "spark.storage.memoryFraction", so they use the same "pool" of memory as cached RDDs. That being said, I'm really not sure why you are running into problems; it seems like you have plenty of memory available. Most likely it's got nothing to do with broadcast varia…
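Imran's point about the shared storage pool can be sketched with the legacy Spark 1.x memory knobs (these were superseded by the unified memory manager in Spark 1.6+); the fraction values here are illustrative assumptions:

```shell
# Spark 1.x legacy memory model: cached RDDs and broadcast blocks both
# draw from the same storage pool, roughly
#   storage pool ~= executor heap * spark.storage.memoryFraction
spark-submit \
  --executor-memory 26g \
  --conf spark.storage.memoryFraction=0.4 \
  --conf spark.shuffle.memoryFraction=0.2 \
  ...
```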

Re: Understanding Spark Memory distribution

2015-03-30 Thread Ankur Srivastava
Hi Wisely, I am running Spark 1.2.1 and I have checked the process heap; it is running with all the heap that I am assigning, and as I mentioned earlier I get OOM on the workers, not the driver or master. Thanks Ankur On Mon, Mar 30, 2015 at 9:24 AM, giive chen wrote: > Hi Ankur > > If you using…

Re: Understanding Spark Memory distribution

2015-03-30 Thread giive chen
Hi Ankur, If you are using standalone mode, your config is wrong. You should use "export SPARK_DAEMON_MEMORY=xxx" in conf/spark-env.sh. At least it works on my Spark 1.3.0 standalone-mode machine. BTW, SPARK_DRIVER_MEMORY is used in YARN mode, and it looks like standalone mode doesn't use this c…
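The spark-env.sh change suggested above might look like this for a Spark 1.x standalone deployment; the specific memory sizes are assumptions for illustration:

```shell
# conf/spark-env.sh -- sketch for Spark 1.x standalone mode.
export SPARK_DAEMON_MEMORY=2g    # heap for the master/worker daemon processes
export SPARK_WORKER_MEMORY=28g   # total memory a worker may grant to executors
```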

Re: Understanding Spark Memory distribution

2015-03-29 Thread Ankur Srivastava
Hi Wisely, I am running on Amazon EC2 instances so I cannot doubt the hardware. Moreover, my other pipelines run successfully except for this one, which involves broadcasting a large object. My spark-env.sh settings are: SPARK_MASTER_IP= SPARK_LOCAL_IP= SPARK_DRIVER_MEMORY=24g SPARK_WORKER_MEMORY=28g…

Re: Understanding Spark Memory distribution

2015-03-28 Thread Wisely Chen
Hi Ankur, If your hardware is OK, it looks like a config problem. Can you show me your spark-env.sh config or JVM config? Thanks Wisely Chen 2015-03-28 15:39 GMT+08:00 Ankur Srivastava: > Hi Wisely, > I have 26gb for the driver and the master is running on m3.2xlarge machines. > > I see OOM err…

Re: Understanding Spark Memory distribution

2015-03-28 Thread Ankur Srivastava
Hi Wisely, I have 26gb for the driver and the master is running on m3.2xlarge machines. I see OOM errors on the workers even though they are running with 26gb of memory. Thanks On Fri, Mar 27, 2015, 11:43 PM Wisely Chen wrote: > Hi > > In broadcast, spark will collect the whole 3gb object into master nod…

Re: Understanding Spark Memory distribution

2015-03-27 Thread Wisely Chen
Hi, In a broadcast, Spark will collect the whole 3gb object on the master node and broadcast it to each slave. It is a very common situation that the master node doesn't have enough memory. What are your master node settings? Wisely Chen Ankur Srivastava wrote on Saturday, March 28, 2015: > I have increased the "spark…
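Wisely's point is that the driver must hold the entire broadcast object (plus serialization overhead) before TorrentBroadcast chunks it out to the executors, so its heap needs to be sized well above the object. A hedged sketch, with sizes assumed from the figures in this thread:

```shell
# Sketch: sizing the driver for a ~3gb broadcast object. The driver keeps
# the full serialized object in memory while distributing its pieces.
spark-submit \
  --driver-memory 24g \
  --executor-memory 26g \
  ...
```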

Re: Understanding Spark Memory distribution

2015-03-27 Thread Ankur Srivastava
I have increased "spark.storage.memoryFraction" to 0.4 but I still get OOM errors on the Spark executor nodes: 15/03/27 23:19:51 INFO BlockManagerMaster: Updated info of block broadcast_5_piece10 15/03/27 23:19:51 INFO TorrentBroadcast: Reading broadcast variable 5 took 2704 ms 15/03/27 23:19:52…

Understanding Spark Memory distribution

2015-03-27 Thread Ankur Srivastava
Hi All, I am running a Spark cluster on EC2 instances of type m3.2xlarge. I have given 26gb of memory with all 8 cores to my executors. I can see that in the logs too: *15/03/27 21:31:06 INFO AppClient$ClientActor: Executor added: app-20150327213106-/0 on worker-20150327212934-10.x.y.z-40128…
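A standalone-mode worker setup matching the figures in this message might look as follows; this is a sketch assuming an m3.2xlarge (8 vCPUs, 30gb RAM) with headroom left for the OS and the worker daemon:

```shell
# conf/spark-env.sh -- hypothetical worker config for an m3.2xlarge node.
export SPARK_WORKER_CORES=8      # all 8 cores offered to executors
export SPARK_WORKER_MEMORY=28g   # leave ~2gb for OS + worker daemon
export SPARK_EXECUTOR_MEMORY=26g # per-executor heap, as in the logs above
```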