[ 
https://issues.apache.org/jira/browse/SPARK-20487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20487:
------------------------------------

    Assignee: Apache Spark

> `HiveTableScan` node is quite verbose in explained plan
> -------------------------------------------------------
>
>                 Key: SPARK-20487
>                 URL: https://issues.apache.org/jira/browse/SPARK-20487
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Tejas Patil
>            Assignee: Apache Spark
>            Priority: Trivial
>
> For Hive tables, `explain()` prints the full `CatalogTable` metadata for every
> relation node in the analyzed, optimized, and physical plans. This makes the
> plan hard to read, especially for large SQL queries with many tables.
> For example:
> {noformat}
> scala> hc.sql(" SELECT * FROM my_table WHERE name = 'foo' ").explain(true)
> == Parsed Logical Plan ==
> 'Project [*]
> +- 'Filter ('name = foo)
>    +- 'UnresolvedRelation `my_table`
> == Analyzed Logical Plan ==
> user_id: bigint, name: string, ds: string
> Project [user_id#13L, name#14, ds#15]
> +- Filter (name#14 = foo)
>    +- SubqueryAlias my_table
>       +- CatalogRelation CatalogTable(
> Database: default
> Table: my_table
> Owner: tejasp
> Created: Fri Apr 14 17:05:50 PDT 2017
> Last Access: Wed Dec 31 16:00:00 PST 1969
> Type: MANAGED
> Provider: hive
> Properties: [serialization.format=1]
> Statistics: 9223372036854775807 bytes
> Location: file:/tmp/warehouse/my_table
> Serde Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat: org.apache.hadoop.mapred.TextInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Partition Provider: Catalog
> Partition Columns: [`ds`]
> Schema: root
> -- user_id: long (nullable = true)
> -- name: string (nullable = true)
> -- ds: string (nullable = true)
> ), [user_id#13L, name#14], [ds#15]
> == Optimized Logical Plan ==
> Filter (isnotnull(name#14) && (name#14 = foo))
> +- CatalogRelation CatalogTable(
> Database: default
> Table: my_table
> Owner: tejasp
> Created: Fri Apr 14 17:05:50 PDT 2017
> Last Access: Wed Dec 31 16:00:00 PST 1969
> Type: MANAGED
> Provider: hive
> Properties: [serialization.format=1]
> Statistics: 9223372036854775807 bytes
> Location: file:/tmp/warehouse/my_table
> Serde Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat: org.apache.hadoop.mapred.TextInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Partition Provider: Catalog
> Partition Columns: [`ds`]
> Schema: root
> -- user_id: long (nullable = true)
> -- name: string (nullable = true)
> -- ds: string (nullable = true)
> ), [user_id#13L, name#14], [ds#15]
> == Physical Plan ==
> *Filter (isnotnull(name#14) && (name#14 = foo))
> +- HiveTableScan [user_id#13L, name#14, ds#15], CatalogRelation CatalogTable(
> Database: default
> Table: my_table
> Owner: tejasp
> Created: Fri Apr 14 17:05:50 PDT 2017
> Last Access: Wed Dec 31 16:00:00 PST 1969
> Type: MANAGED
> Provider: hive
> Properties: [serialization.format=1]
> Statistics: 9223372036854775807 bytes
> Location: file:/tmp/warehouse/my_table
> Serde Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat: org.apache.hadoop.mapred.TextInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Partition Provider: Catalog
> Partition Columns: [`ds`]
> Schema: root
> -- user_id: long (nullable = true)
> -- name: string (nullable = true)
> -- ds: string (nullable = true)
> ), [user_id#13L, name#14], [ds#15]
> {noformat}


