[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15620750#comment-15620750 ]

Franck Tago commented on SPARK-15616:
-------------------------------------

So I was not able to use the changes, for the following reasons:
1- I forgot to mention that I am working off the spark 2.0.1 branch.
2- I get the following error:
[info] Compiling 30 Scala sources and 2 Java sources to 
/export/home/devbld/spark_world/Mercury/pvt/ftago/spark-2.0.1/sql/hive/target/scala-2.11/classes...
[error] 
/export/home/devbld/spark_world/Mercury/pvt/ftago/spark-2.0.1/sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala:295:
 type mismatch;
[error]  found   : Seq[org.apache.spark.sql.catalyst.expressions.Expression]
[error]  required: Option[String]
[error]     MetastoreRelation(databaseName, tableName, 
partitionPruningPred)(catalogTable, client, sparkSession)
[error]                                                ^
[error] one error found
[error] Compile failed 

Can you please build a version of this fix off spark 2.0.1? I tried 
incorporating your changes but, as the error message shown above indicates, I 
was not able to.
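For illustration only (this is not the actual Spark source), the error above is the kind of mismatch you get when a patch written against a newer branch is applied to 2.0.1, where the constructor has a different parameter list. In the hypothetical sketch below, an `Option[String]` slot in the first parameter list receives a `Seq` argument intended for a slot that does not exist on the older signature; all names here are invented for the example:

```scala
object Demo {
  // Hypothetical 2.0.1-style signature: the first parameter list ends with
  // an alias slot of type Option[String], and a second parameter list
  // carries the remaining arguments.
  case class Relation(db: String, table: String, alias: Option[String])(
      val extra: String)

  def main(args: Array[String]): Unit = {
    // A patch written against a branch without the alias parameter would
    // pass its pruning predicates (a Seq) where Option[String] is expected,
    // producing exactly "found: Seq[...], required: Option[String]".
    // The fix is to match the older signature, argument for argument:
    val r = Relation("default", "t1", None)("extra")
    println(r.table)
  }
}
```

In other words, the patch likely needs to be rebased onto the 2.0.1 branch so that each call site of `MetastoreRelation` matches that branch's constructor, rather than the one on master.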

> Metastore relation should fallback to HDFS size of partitions that are 
> involved in Query if statistics are not available.
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-15616
>                 URL: https://issues.apache.org/jira/browse/SPARK-15616
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Lianhui Wang
>
> Currently, if some partitions of a partitioned table are used in a join 
> operation, we rely on the table size returned by the Metastore to decide 
> whether we can convert the operation to a broadcast join. 
> If a Filter can prune some partitions, Hive can prune partitions before 
> deciding to use a broadcast join, based on the HDFS size of the partitions 
> involved in the query. Spark SQL needs this capability, which can improve 
> join performance for partitioned tables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
