You should post the execution plan here, so we can provide more accurate support.
Since in your feature table, you are building it with projection ("where ...."), so my guess is that the following JIRA (SPARK-13383<https://issues.apache.org/jira/browse/SPARK-13383>) stops the broadcast join. This is fixed in the Spark 2.x. Can you try it on Spark 2.0? Yong ________________________________ From: Jone Zhang <joyoungzh...@gmail.com> Sent: Wednesday, May 10, 2017 7:10 AM To: user @spark/'user @spark'/spark users/user@spark Subject: Why spark.sql.autoBroadcastJoinThreshold not available Now i use spark1.6.0 in java I wish the following sql to be executed in BroadcastJoin way select * from sample join feature This is my step 1.set spark.sql.autoBroadcastJoinThreshold=100M 2.HiveContext.sql("cache lazy table feature as "select * from src where ...) which result size is only 100K 3.HiveContext.sql("select * from sample join feature") Why the join is SortMergeJoin? Grateful for any idea! Thanks.