Lost tasks due to OutOfMemoryError (GC overhead limit exceeded)

2016-01-12 Thread Barak Yaish
ionRequired","true"); sparkConf.set("spark.kryoserializer.buffer.max.mb","512"); sparkConf.set("spark.default.parallelism","300"); sparkConf.set("spark.rpc.askTimeout","500"); I'm trying to load data from hdfs and running some sqls on it (m

Re: Lost tasks due to OutOfMemoryError (GC overhead limit exceeded)

2016-01-12 Thread Muthu Jayakumar
> …spark.rpc.askTimeout", "500"); I'm trying to load data from hdfs and running some SQLs on it (mostly groupBy) using DataFrames. The logs keep saying that tasks are lost due to OutOfMemoryError (GC overhead limit exceeded). Can you advise what the recommended settings (memory, cores, partitions, etc.) are for the given hardware? Thanks!
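
Only the quoted question survives in the archived reply, so no concrete recommendation appears here. The sketch below merely illustrates the knobs such a question usually probes for a shuffle-heavy groupBy on YARN; the values are placeholders, not advice taken from the thread.

    import org.apache.spark.SparkConf

    // Illustrative settings commonly tuned for "GC overhead limit exceeded" during a
    // DataFrame groupBy (Spark 1.x property names); the values are placeholders.
    val tunedConf = new SparkConf()
      .set("spark.executor.memory", "8g")                 // more heap per executor
      .set("spark.executor.cores", "4")                   // fewer concurrent tasks per JVM, less heap pressure
      .set("spark.yarn.executor.memoryOverhead", "1024")  // off-heap headroom when running on YARN
      .set("spark.sql.shuffle.partitions", "300")         // smaller shuffle partitions for the groupBy
      .set("spark.default.parallelism", "300")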