Hi there,

I set the executor memory to 8g but it didn't help.

On 18 March 2015 at 13:59, Cheng Lian <lian.cs....@gmail.com> wrote:
> You should probably increase executor memory by setting
> "spark.executor.memory".
>
> The full list of available configurations can be found here:
> http://spark.apache.org/docs/latest/configuration.html
>
> Cheng
>
>
> On 3/18/15 9:15 PM, Yiannis Gkoufas wrote:
>
>> Hi there,
>>
>> I was trying the new DataFrame API with some basic operations on a
>> Parquet dataset.
>> I have 7 nodes with 12 cores each and 8GB of RAM allocated to each
>> worker, in standalone cluster mode.
>> The code is the following:
>>
>> import org.apache.spark.sql.functions.sum
>>
>> val people = sqlContext.parquetFile("/data.parquet")
>> val res = people.groupBy("name", "date")
>>   .agg(sum("power"), sum("supply"))
>>   .take(10)
>> res.foreach(println)
>>
>> The dataset consists of 16 billion entries.
>> The error I get is: java.lang.OutOfMemoryError: GC overhead limit
>> exceeded
>>
>> My configuration is:
>>
>> spark.serializer org.apache.spark.serializer.KryoSerializer
>> spark.driver.memory 6g
>> spark.executor.extraJavaOptions -XX:+UseCompressedOops
>> spark.shuffle.manager sort
>>
>> Any idea how I can work around this?
>>
>> Thanks a lot
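For anyone hitting the same wall: one common reason an executor-memory change
"doesn't help" is that spark.executor.memory is only read when the application
starts, so it has to go into spark-defaults.conf, onto the spark-submit
command line, or onto the SparkConf before the SparkContext is created;
setting it afterwards has no effect. Below is a minimal sketch of the same job
with the setting applied up front, assuming Spark 1.3-era APIs
(sqlContext.parquetFile, org.apache.spark.sql.functions). The object and app
name PowerAggregation are illustrative, not from the thread; the 8g value
mirrors the setting tried above.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.sum

object PowerAggregation {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("PowerAggregation")  // illustrative name
      // Must be set before the SparkContext exists; changing it on a
      // running application has no effect.
      .set("spark.executor.memory", "8g")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")

    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Same aggregation as in the thread: group by (name, date) and sum
    // the power and supply columns, then pull 10 rows to the driver.
    val people = sqlContext.parquetFile("/data.parquet")
    val res = people
      .groupBy("name", "date")
      .agg(sum("power"), sum("supply"))
      .take(10)

    res.foreach(println)
    sc.stop()
  }
}

The equivalent at submit time is spark-submit --executor-memory 8g (or
--conf spark.executor.memory=8g). If the setting does take effect and the job
still GC-thrashes, one thing worth trying is raising spark.sql.shuffle.partitions
above its default of 200 so the groupBy's shuffle state is spread over more,
smaller tasks; whether that is enough for 16 billion rows on 8GB workers would
need to be verified.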