To: SAHA, DEBOBROTA; 'user@spark.apache.org'
Subject: Re: Unable to run Group BY on Large File
Unfortunately, groupBy is not the most efficient operation. What is it you’re
trying to do? It may be possible with one of the other *byKey transformations.
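To illustrate the difference (a minimal sketch only — the RDD name `pairs` and the Int values are assumptions, not from your job):

```scala
// Sketch: assumes an RDD[(String, Int)] named `pairs`.
// groupByKey shuffles every value for a key to a single executor before
// aggregating, which is what tends to exhaust the heap on large inputs:
val counts = pairs.groupByKey().mapValues(_.sum)

// reduceByKey combines values map-side before the shuffle, so far less
// data crosses the network and sits in memory at once:
val countsEfficient = pairs.reduceByKey(_ + _)
```

The same idea applies to aggregateByKey and combineByKey when the merge logic is more involved than a simple sum.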
From: "SAHA, DEBOBROTA"
Date: Wednesday, September 2,
Hi ,
I am getting the error below while trying to select data using Spark SQL from
an RDD table.
java.lang.OutOfMemoryError: GC overhead limit exceeded
"Spark Context Cleaner" java.lang.InterruptedException
The file or table size is around 113 GB and I am running Spark 1.4 on a
Hi ,
I am using Spark 1.4 and I am getting an ArrayIndexOutOfBoundsException when I
am trying to read from a registered table in Spark.
For example, if I have 3 different text files with the content below:
Scenario 1:
A1|B1|C1
A2|B2|C2
Scenario 2:
A1| |C1
A2| |C2
Scenario 3:
A1| B1|
A2| B2|
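For reference, the difference between these scenarios usually comes down to how Java/Scala `String.split` treats trailing empty fields (a sketch in plain Scala):

```scala
// Why scenario 3 fails while the others parse:
"A1|B1|C1".split("\\|").length     // 3 -- scenario 1
"A1| |C1".split("\\|").length      // 3 -- scenario 2: middle field is " ", not empty
"A1| B1|".split("\\|").length      // 2 -- scenario 3: trailing empty field is DROPPED

// Accessing fields(2) on the scenario-3 array throws
// ArrayIndexOutOfBoundsException. A limit of -1 keeps trailing empties:
"A1| B1|".split("\\|", -1).length  // 3
```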
Hi ,
Can anyone help me with loading a column that may or may not have NULL values
into an RDD?
Thanks
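One possible approach, assuming pipe-delimited lines as in the scenarios above (the file path, column positions, and variable names here are illustrative, not from the original job):

```scala
// Sketch: `sc` is the SparkContext; path and column index are assumptions.
val rows = sc.textFile("/path/to/file.txt")
  .map(_.split("\\|", -1))            // -1 keeps trailing empty fields
  .map { fields =>
    // Model the possibly-missing middle column as Option[String],
    // treating blank or whitespace-only values as None.
    val middle =
      if (fields.length > 1 && fields(1).trim.nonEmpty) Some(fields(1).trim)
      else None
    (fields(0), middle, if (fields.length > 2) fields(2) else "")
  }
```

Mapping empty fields to `Option` (or an explicit sentinel) before registering the table avoids index errors downstream regardless of which column is blank.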