Hi,
I have fields from field_0 to field_26000. The query selects

max(cast($columnName as double)),
min(cast($columnName as double)), avg(cast($columnName as double)), count(*)

for all those 26000 fields in one query.
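For context, a select list of that size is typically generated in a loop rather than written by hand. A minimal Scala sketch of how the expression list might be built (the table name "records" and the exact 0..26000 range are my assumptions, not stated in the mail):

```scala
// Hypothetical sketch: build max/min/avg expressions for field_0 .. field_26000,
// plus a single count(*), and join them into one select statement.
val aggExprs: Seq[String] = (0 to 26000).flatMap { i =>
  val c = s"cast(field_$i as double)"
  Seq(s"max($c)", s"min($c)", s"avg($c)")
} :+ "count(*)"

// 26001 fields * 3 aggregates + 1 count(*) = 78004 expressions in one query,
// which is what makes the SQL parser's work so heavy.
val sql = s"select ${aggExprs.mkString(", ")} from records"
```

A select list this long is exactly what stresses SQL parsing and logical-plan construction; building the same aggregates through the DataFrame API instead of a SQL string would skip the parse step entirely.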





Regards,
Madhukara Phatak
http://datamantra.io/

On Tue, May 19, 2015 at 3:59 PM, ayan guha <guha.a...@gmail.com> wrote:

> can you kindly share your code?
>
> On Tue, May 19, 2015 at 8:04 PM, madhu phatak <phatak....@gmail.com>
> wrote:
>
>> Hi,
>> I am trying to run a Spark SQL aggregation on a file with 26k columns. The
>> number of rows is very small. I am running into an issue where Spark takes
>> a huge amount of time to parse the SQL and create a logical plan. Even with
>> just one row, it takes more than 1 hour just to get past the parsing.
>> Any idea how to optimize these kinds of scenarios?
>>
>>
>> Regards,
>> Madhukara Phatak
>> http://datamantra.io/
>>
>
>
>
> --
> Best Regards,
> Ayan Guha
>
