It's the executor memory (spark.executor.memory), which you can set when creating the Spark context. By default, a fraction of 0.6 (60%, not 0.6%) of the executor memory is used for storage (spark.storage.memoryFraction). To see any memory usage on the executors page, you need to cache (persist) an RDD; until then the storage figure stays at 0.0 B. As for the OOM exception, try increasing the level of parallelism (that is, the number of partitions, sized to your data) so that each task handles less data; that should help.
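To make the storage arithmetic concrete, here is a rough sketch in Python. It assumes the Spark 1.x legacy memory settings, where storage capacity is roughly the JVM heap times spark.storage.memoryFraction (default 0.6) times spark.storage.safetyFraction (default 0.9). The function names are hypothetical helpers for illustration, not Spark APIs, and it assumes the UI's "GB" is computed in units of 1024^3 bytes.

```python
# Sketch of the Spark 1.x storage-memory arithmetic (assumed defaults:
# spark.storage.memoryFraction = 0.6, spark.storage.safetyFraction = 0.9).
# These helpers are illustrative only; they are not part of Spark.

GIB = 1024 ** 3


def storage_memory_bytes(heap_bytes, memory_fraction=0.6, safety_fraction=0.9):
    """Approximate storage capacity for a given JVM heap size."""
    return heap_bytes * memory_fraction * safety_fraction


def heap_for_storage(storage_bytes, memory_fraction=0.6, safety_fraction=0.9):
    """Invert the formula: the heap size implied by a storage figure."""
    return storage_bytes / (memory_fraction * safety_fraction)


# The executors page showed 42.4 GB of storage capacity.
implied_heap_gib = heap_for_storage(42.4 * GIB) / GIB
print(round(implied_heap_gib, 1))  # 78.5
```

Under these assumptions, 42.4 GB of storage corresponds to a usable heap of about 78.5 GB. The heap size the JVM reports is usually somewhat below the configured maximum, so the configured executor memory is probably a little higher than that.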
Thanks
Best Regards

On Mon, Jan 19, 2015 at 11:36 AM, Alessandro Baretta <alexbare...@gmail.com> wrote:

> All,
>
> I'm getting out of memory exceptions in SparkSQL GROUP BY queries. I have
> plenty of RAM, so I should be able to brute-force my way through, but I
> can't quite figure out what memory option affects what process.
>
> My current memory configuration is the following:
> export SPARK_WORKER_MEMORY=83971m
> export SPARK_DAEMON_MEMORY=15744m
>
> What does each of these config options do exactly?
>
> Also, how come the executors page of the web UI shows no memory usage:
>
> 0.0 B / 42.4 GB
>
> And where does 42.4 GB come from?
>
> Alex