Please fine the query plan
scala sqlContext.sql(SELECT dw.DAY_OF_WEEK, dw.HOUR, avg(dw.SDP_USAGE) AS
AVG_SDP_USAGE FROM (SELECT sdp.WID, DAY_OF_WEEK, HOUR, SUM(INTERVAL_VALUE)
AS SDP_USAGE FROM (SELECT * FROM date_d AS dd JOIN interval_f AS intf ON
intf.DATE_WID = dd.WID WHERE intf.DATE_WID =
)
Currently the query takes 15 minutes execution time where interval_f table
holds approx 170GB of raw data, date_d -- 170 MB and sdp_d -- 490MB
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Optimizing-SQL-Query-tp21948.html
Sent from the Apache Spark User
in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Optimizing-SQL-Query-tp21948.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org