ionRequired","true");
sparkConf.set("spark.kryoserializer.buffer.max.mb","512");
sparkConf.set("spark.default.parallelism","300");
sparkConf.set("spark.rpc.askTimeout","500");
>
> I'm trying to load data from HDFS and run some SQL queries on it (mostly
> group-by) using DataFrames. The logs keep saying that tasks are lost due to
> OutOfMemoryError (GC overhead limit exceeded).
>
> Can you advise on the recommended settings (memory, cores,
> partitions, etc.) for the given hardware?
>
> Thanks!
>
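For a "GC overhead limit exceeded" OOM during group-by shuffles, the usual starting points are giving each executor more heap, raising the shuffle partition count so each task holds a smaller working set, and using Kryo serialization. A minimal sketch follows; the memory sizes, core counts, and partition counts are illustrative assumptions, not recommendations for any specific hardware:

```scala
import org.apache.spark.SparkConf

// Illustrative values only -- tune to the cluster's actual cores and RAM.
val sparkConf = new SparkConf()
  .setAppName("groupby-job")
  // More heap per executor reduces GC pressure during wide shuffles.
  .set("spark.executor.memory", "8g")           // assumed value
  .set("spark.executor.cores", "4")             // assumed value
  // More partitions -> smaller per-task working sets in groupBy shuffles.
  .set("spark.sql.shuffle.partitions", "600")   // assumed value
  .set("spark.default.parallelism", "600")      // assumed value
  // Kryo typically produces smaller serialized objects than Java serialization.
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
```

Note that DataFrame shuffles (including groupBy) are sized by `spark.sql.shuffle.partitions` (default 200), not by `spark.default.parallelism`, so raising only the latter will not split an oversized SQL shuffle.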