Are all of your join keys the same? and I guess the join type are all “Left” 
join, https://github.com/apache/spark/pull/3362 probably is what you need.

And, SparkSQL doesn’t support the multiway-join (and multiway-broadcast join) 
currently,  https://github.com/apache/spark/pull/3270 should be another 
optimization for this.


From: Jianshi Huang [mailto:jianshi.hu...@gmail.com]
Sent: Wednesday, November 26, 2014 4:36 PM
To: user
Subject: Auto BroadcastJoin optimization failed in latest Spark

Hi,

I've confirmed that the latest Spark with either Hive 0.12 or 0.13.1 fails 
optimizing auto broadcast join in my query. I have a query that joins a huge 
fact table with 15 tiny dimension tables.

I'm currently using an older version of Spark which was built on Oct. 12.

Anyone else has met similar situation?

--
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/

Reply via email to