In Spark 1.3 I would expect this to happen automatically when the Parquet table is small (< 10MB, configurable with spark.sql.autoBroadcastJoinThreshold). If you are running 1.3 and not seeing this, can you show the code you are using to create the table?
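For reference, something along these lines should work from Python (a rough sketch against the 1.3 API; the paths, table names, and join key below are placeholders):

    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext(appName="broadcast-join-example")
    sqlContext = SQLContext(sc)

    # Optionally raise the broadcast threshold, e.g. to 50MB
    # (the value is in bytes; the default is 10MB).
    sqlContext.sql("SET spark.sql.autoBroadcastJoinThreshold=52428800")

    # Load both Parquet files and register them as temp tables.
    large = sqlContext.parquetFile("hdfs:///data/large.parquet")
    small = sqlContext.parquetFile("hdfs:///data/small.parquet")
    large.registerTempTable("large_table")
    small.registerTempTable("small_table")

    joined = sqlContext.sql(
        "SELECT * FROM large_table l JOIN small_table s ON l.key = s.key")

    # The physical plan should show BroadcastHashJoin rather than
    # ShuffledHashJoin when small_table is under the threshold.
    joined.explain()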
On Tue, Mar 31, 2015 at 3:25 AM, jitesh129 <jitesh...@gmail.com> wrote:

> How can we implement a BroadcastHashJoin for Spark with Python?
>
> My SparkSQL inner joins are taking a lot of time since they are performing
> ShuffledHashJoin.
>
> The tables being joined are stored as parquet files.
>
> Please help.
>
> Thanks and regards,
> Jitesh