Hi,
Some additional information: the table is backed by a CSV file which is read using spark-csv from Databricks.
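Since the code was asked for below, here is a minimal sketch of the approach, assuming Spark 1.3-era APIs run from spark-shell (the file path and the temp table name wide_table are placeholders, not the real ones):

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)  // sc is provided by spark-shell

// Load the CSV via the spark-csv data source (path is a placeholder).
val df = sqlContext.load(
  "com.databricks.spark.csv",
  Map("path" -> "/path/to/data.csv", "header" -> "true"))

df.registerTempTable("wide_table")

// One max/min/avg triple per column, joined into a single SELECT.
val aggExprs = df.columns.map { c =>
  s"max(cast($c as double)), min(cast($c as double)), avg(cast($c as double))"
}.mkString(", ")

val result = sqlContext.sql(s"SELECT $aggExprs, count(*) FROM wide_table")
result.show()

Note that with 26000 columns this generates roughly 78000 aggregate expressions in one statement, which is what the SQL parser and analyzer have to work through.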
Regards,
Madhukara Phatak
http://datamantra.io/

On Tue, May 19, 2015 at 4:05 PM, madhu phatak <phatak....@gmail.com> wrote:

> Hi,
> I have fields from field_0 to field_26000. The query selects
>
>   max(cast($columnName as double)),
>   min(cast($columnName as double)),
>   avg(cast($columnName as double)),
>   count(*)
>
> for all those 26000 fields in one query.
>
> Regards,
> Madhukara Phatak
> http://datamantra.io/
>
> On Tue, May 19, 2015 at 3:59 PM, ayan guha <guha.a...@gmail.com> wrote:
>
>> Can you kindly share your code?
>>
>> On Tue, May 19, 2015 at 8:04 PM, madhu phatak <phatak....@gmail.com> wrote:
>>
>>> Hi,
>>> I am trying to run a Spark SQL aggregation on a file with 26k columns.
>>> The number of rows is very small. I am running into an issue where Spark
>>> takes a huge amount of time to parse the SQL and create a logical plan.
>>> Even if I have just one row, it takes more than 1 hour just to get past
>>> the parsing. Any idea how to optimize these kinds of scenarios?
>>>
>>> Regards,
>>> Madhukara Phatak
>>> http://datamantra.io/
>>
>> --
>> Best Regards,
>> Ayan Guha