Re: [Spark SQL] failure in query

2019-08-29 Thread Subash Prabakar
What is the number of part files in that big table? And what is the distribution of request ID? Is the variance of that column low or high? Because the partitionBy clause will move all data with the same request ID to one executor, and if that data is huge it can overload the executor.
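
As a side note, a quick way to check both of those things (a minimal sketch in Scala; big_table and request_id are placeholder names, since the actual table and column names are not given in this thread):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.desc

    val spark = SparkSession.builder().appName("skew-check").getOrCreate()

    // Placeholder table name; substitute the real one.
    val df = spark.table("big_table")

    // Number of files backing the table (a rough proxy for the part-file count).
    println(s"part files: ${df.inputFiles.length}")

    // Row count per request ID, largest first: a handful of very large groups
    // means the key is skewed and one executor will receive most of the rows.
    df.groupBy("request_id")
      .count()
      .orderBy(desc("count"))
      .show(20)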

[Spark SQL] failure in query

2019-08-25 Thread Tzahi File
Hi, I encountered an issue running a Spark SQL query and would be happy to get some advice. I'm trying to run a query on a very big data set (around 1.5 TB) and it fails on every attempt. A template of the query is as below: insert overwrite table partition(part) select /*+ BROADCAST(c) */
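
The template above is cut off in the archive. For readers, here is a minimal sketch of the general pattern it describes, an INSERT OVERWRITE into a partitioned table with a broadcast hint on one side of a join (all table and column names below are hypothetical placeholders, not the originals):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("broadcast-insert")
      .enableHiveSupport()
      .getOrCreate()

    // Dynamic partition inserts into Hive tables may also require:
    // spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    // Placeholder names throughout: target_table, big_table, small_table, part.
    spark.sql("""
      INSERT OVERWRITE TABLE target_table PARTITION (part)
      SELECT /*+ BROADCAST(c) */
             b.request_id,
             b.payload,
             c.lookup_value,
             b.part
      FROM big_table b
      JOIN small_table c
        ON b.request_id = c.request_id
    """)

The BROADCAST hint ships the smaller table to every executor so the join avoids a shuffle of the 1.5 TB side; it only helps if the hinted table actually fits in executor memory.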