[ 
https://issues.apache.org/jira/browse/SPARK-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tejas Patil updated SPARK-14070:
--------------------------------
    Description: 
Currently, when one queries ORC tables in Hive, the plan generated by Spark
shows that it uses the `HiveTableScan` operator, which is generic to all
file formats. We could instead use the ORC data source for this so that we
get ORC-specific optimizations like predicate pushdown.

Current behaviour:

```
scala>  hqlContext.sql("SELECT * FROM orc_table").explain(true)
== Parsed Logical Plan ==
'Project [unresolvedalias(*, None)]
+- 'UnresolvedRelation `orc_table`, None

== Analyzed Logical Plan ==
key: string, value: string
Project [key#171,value#172]
+- MetastoreRelation default, orc_table, None

== Optimized Logical Plan ==
MetastoreRelation default, orc_table, None

== Physical Plan ==
HiveTableScan [key#171,value#172], MetastoreRelation default, orc_table, None
```
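
For comparison, here is a minimal sketch of reading the same data through the ORC data source directly, which is roughly the path this issue proposes Spark take automatically. It assumes `hqlContext` is the same `HiveContext` as above; the path `/path/to/warehouse/orc_table` and the filter value `some_key` are hypothetical placeholders, not taken from the original report.

```scala
// Assumption: hqlContext is the HiveContext used in the example above.
// Enable ORC predicate pushdown for the data source read path.
hqlContext.setConf("spark.sql.orc.filterPushdown", "true")

// Hypothetical path: substitute the actual location of the table's ORC files.
val orcDf = hqlContext.read.format("orc").load("/path/to/warehouse/orc_table")

// A filter on a column gives the ORC reader a predicate it can push down.
orcDf.filter(orcDf("key") === "some_key").explain(true)
```

With this read path the physical plan should no longer go through `HiveTableScan`, though the exact plan output depends on the Spark version.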


> Use ORC data source for SQL queries on ORC tables
> -------------------------------------------------
>
>                 Key: SPARK-14070
>                 URL: https://issues.apache.org/jira/browse/SPARK-14070
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.6.1
>            Reporter: Tejas Patil
>            Priority: Minor


