Hi there,
I am looking at the SparkSQL setting spark.sql.autoBroadcastJoinThreshold.
According to the programming guide:
*Note that currently statistics are only supported for Hive Metastore
tables where the command ANALYZE TABLE <tableName> COMPUTE STATISTICS
noscan has been run.*
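For readers following the thread, here is a minimal sketch of the size check that this setting controls. It is a simplified model, not Spark's planner code: the helper name can_broadcast is hypothetical, and treating a table with no statistics as "too large" is an assumption. The 10 MB default and the use of -1 to disable broadcasting come from the Spark SQL configuration docs.

```python
from typing import Optional

# Documented default for spark.sql.autoBroadcastJoinThreshold: 10 MB.
DEFAULT_THRESHOLD_BYTES = 10 * 1024 * 1024


def can_broadcast(size_in_bytes: Optional[int],
                  threshold: int = DEFAULT_THRESHOLD_BYTES) -> bool:
    """Hypothetical helper: simplified model of the size check, not Spark code."""
    if threshold < 0:
        # Setting the threshold to -1 disables automatic broadcast joins.
        return False
    if size_in_bytes is None:
        # Assumption: a table without statistics is treated as too large.
        return False
    # A table qualifies when its estimated size fits under the threshold.
    return size_in_bytes <= threshold
```

In this model, a table only becomes a broadcast candidate once a size estimate exists for it, which is why the ANALYZE TABLE step matters.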
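My question is: is the "NOSCAN" option a must? If I execute the "ANALYZE TABLE
<tableName> compute statistics" command in the Hive shell, are the statistics
going to be used by SparkSQL to decide on a broadcast join?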
>
> My question is: is the "NOSCAN" option a must? If I execute the "ANALYZE TABLE
> <tableName> compute statistics" command in the Hive shell, are the statistics
> going to be used by SparkSQL to decide on a broadcast join?
Yes, Spark SQL will only accept the simple noscan version. However, as
long as the sizeInBytes
Michael,
Thanks for the reply.
On Wed, Feb 10, 2016 at 11:44 AM, Michael Armbrust
wrote:
>> My question is: is the "NOSCAN" option a must? If I execute the "ANALYZE TABLE
>> <tableName> compute statistics" command in the Hive shell, are the statistics
>> going to be used by SparkSQL to