[ https://issues.apache.org/jira/browse/SPARK-15365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Parth Brahmbhatt updated SPARK-15365:
-------------------------------------
    Description: Currently, if a table is used in a join operation, we rely on the size returned by the Metastore to decide whether the operation can be converted to a broadcast join. This optimization only kicks in for tables that have statistics available in the metastore. Hive generally falls back to HDFS when the statistics are not available directly from the metastore, and this seems like a reasonable choice to adopt given the optimization benefit of broadcast joins.

  (was: Currently if a table is used in join operation we rely on Metastore returned size to calculate if we can convert the operation to Broadcast join. This optimization only kicks in for table's that have the statics available in metastore. Hive generally rolls over to HDFS if the statistics are not available directly from metastore and this seems like a reasonable choice to adopt given the optimization benefit of using broadcast joins.)

> Metastore relation should fallback to HDFS size if statistics are not available from table meta data.
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-15365
>                 URL: https://issues.apache.org/jira/browse/SPARK-15365
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Parth Brahmbhatt
>
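
As a rough illustration of the fallback proposed in the description, the sketch below picks a size estimate for a table and then decides whether it is small enough to broadcast. It is not Spark's implementation. The function names and the `hdfs_size_of` callback are hypothetical, the choice of the `totalSize` and `rawDataSize` table parameters as the metastore statistics to consult is an assumption, and the 10 MB constant mirrors the default of `spark.sql.autoBroadcastJoinThreshold`.

```python
from typing import Callable, Mapping, Optional

# Mirrors the default of spark.sql.autoBroadcastJoinThreshold (10 MB).
AUTO_BROADCAST_JOIN_THRESHOLD = 10 * 1024 * 1024


def estimate_table_size(
    table_params: Mapping[str, str],
    hdfs_size_of: Callable[[], Optional[int]],
) -> Optional[int]:
    """Return a size estimate in bytes, or None if nothing is known.

    table_params: metastore table parameters (assumed to carry statistics
                  under "totalSize" / "rawDataSize" when they exist).
    hdfs_size_of: hypothetical callback that returns the on-disk size of
                  the table location, or None if it cannot be determined.
    """
    # 1. Prefer statistics recorded in the metastore.
    for key in ("totalSize", "rawDataSize"):
        raw = table_params.get(key)
        # Missing, non-numeric, negative, or zero values count as "no statistics".
        if raw is not None and raw.strip().isdigit() and int(raw) > 0:
            return int(raw)

    # 2. Fall back to the file system only when the metastore has nothing.
    #    The callback keeps this potentially expensive lookup lazy.
    size = hdfs_size_of()
    if size is not None and size > 0:
        return size

    return None


def can_broadcast(
    size_in_bytes: Optional[int],
    threshold: int = AUTO_BROADCAST_JOIN_THRESHOLD,
) -> bool:
    """A table qualifies only if its size is known and within the threshold."""
    return size_in_bytes is not None and size_in_bytes <= threshold
```

With this sketch, a table whose parameters contain `totalSize` never triggers the file system lookup, while a table without statistics is still eligible for a broadcast join as long as the callback reports a size within the threshold. A table whose size cannot be determined either way is treated as not broadcastable.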