Hi there,

I am looking at the SparkSQL setting spark.sql.autoBroadcastJoinThreshold.
According to the programming guide:

*Note that currently statistics are only supported for Hive Metastore
tables where the command ANALYZE TABLE <tableName> COMPUTE STATISTICS
noscan has been run.*

My question: is the "NOSCAN" option required? If I run "ANALYZE TABLE
<tablename> COMPUTE STATISTICS" (without NOSCAN) in the Hive shell, will
SparkSQL use the resulting statistics to decide whether to broadcast a join?
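
To make the question concrete, here is a rough sketch of the two variants I
am comparing. The table name my_table is only a placeholder, and 10485760
(10 MB) is the documented default for the threshold, not a value I have tuned:

```sql
-- Spark SQL side: broadcast threshold in bytes (10485760 = 10 MB, the default).
SET spark.sql.autoBroadcastJoinThreshold=10485760;

-- Variant 1: the form quoted in the programming guide (my_table is a placeholder).
ANALYZE TABLE my_table COMPUTE STATISTICS noscan;

-- Variant 2: the same command without noscan, run from the Hive shell.
ANALYZE TABLE my_table COMPUTE STATISTICS;
```

What I would like to know is whether variant 2 is picked up by SparkSQL in
the same way as variant 1.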

Thanks.
