RE: Unable to run Group BY on Large File

2015-09-03 Thread SAHA, DEBOBROTA
Unfortunately, groupBy is not the most efficient operation. What is it you're trying to do? It may be possible with one of the other *byKey transformations.
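The advice above can be sketched in plain Python that mimics the semantics of the two transformations. This is an illustrative model only, not the Spark API: it shows why a groupBy-style operation must hold every value for a key in memory, while a reduceByKey-style operation keeps only one running accumulator per key (Spark additionally applies the combine map-side, before the shuffle).

```python
from collections import defaultdict

pairs = [("a", 1), ("b", 2), ("a", 3), ("b", 4), ("a", 5)]

# groupByKey-style: every value for a key is materialized in a list
# before any reduction happens -- this is what exhausts memory on a
# very large input.
def group_by_key(kv):
    groups = defaultdict(list)
    for k, v in kv:
        groups[k].append(v)          # all values per key held at once
    return {k: sum(vs) for k, vs in groups.items()}

# reduceByKey-style: values are folded into a single accumulator per
# key as they arrive, so memory use is proportional to the number of
# distinct keys, not the number of records.
def reduce_by_key(kv, combine=lambda a, b: a + b):
    acc = {}
    for k, v in kv:
        acc[k] = combine(acc[k], v) if k in acc else v
    return acc

print(group_by_key(pairs))   # {'a': 9, 'b': 6}
print(reduce_by_key(pairs))  # {'a': 9, 'b': 6}
```

Both produce the same result here; the difference is only in how much intermediate state each one keeps, which is exactly the distinction the reply is drawing.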

Unable to run Group BY on Large File

2015-09-02 Thread SAHA, DEBOBROTA
Hi, I am getting the error below while trying to select data using Spark SQL from an RDD table:

java.lang.OutOfMemoryError: GC overhead limit exceeded
"Spark Context Cleaner" java.lang.InterruptedException

The file or table size is around 113 GB and I am running Spark 1.4 on a …
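As a hedged aside for readers hitting the same error: a common first response to "GC overhead limit exceeded" during a large Spark SQL aggregation is to give executors more memory and raise the shuffle partition count so each task handles a smaller slice. The values and names below are illustrative only; nothing in the thread specifies them, and they should be tuned to the actual cluster.

```shell
# Illustrative spark-submit invocation (Spark 1.x era syntax).
# "your.Main" and "your-job.jar" are placeholders, not from the thread.
spark-submit \
  --class your.Main \
  --executor-memory 8g \
  --conf spark.sql.shuffle.partitions=2000 \
  your-job.jar
```

Raising spark.sql.shuffle.partitions above its default (200) is often the more effective lever for a ~113 GB input, since it shrinks the per-task working set rather than just enlarging the heap.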

Re: Unable to run Group BY on Large File

2015-09-02 Thread Silvio Fiorito
> Subject: Unable to run Group BY on Large File
> Hi, I am getting the error below while trying to select data using Spark SQL from an RDD table: java.lang.OutOfMemoryError: GC overhead limit exceeded "Spark Context Cleaner" java.lang.InterruptedException. The file …

Re: Unable to run Group BY on Large File

2015-09-02 Thread Raghavendra Pandey
> …*byKey transformations.
>
> From: "SAHA, DEBOBROTA"
> Date: Wednesday, September 2, 2015 at 7:46 PM
> To: "'user@spark.apache.org'"
> Subject: Unable to run Group BY on Large File
>
> Hi,
>
> I am getting the error below while I am trying to select d…