[ https://issues.apache.org/jira/browse/SPARK-20487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15985907#comment-15985907 ]
Apache Spark commented on SPARK-20487:
--------------------------------------

User 'tejasapatil' has created a pull request for this issue:
https://github.com/apache/spark/pull/17780

> `HiveTableScan` node is quite verbose in explained plan
> -------------------------------------------------------
>
>                 Key: SPARK-20487
>                 URL: https://issues.apache.org/jira/browse/SPARK-20487
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Tejas Patil
>            Priority: Trivial
>
> For Hive tables, `explain()` prints a lot of information. This makes it hard
> to read the plan (esp. for large SQL strings with numerous tables).
> e.g.
> {noformat}
> scala> hc.sql(" SELECT * FROM my_table WHERE name = 'foo' ").explain(true)
> == Parsed Logical Plan ==
> 'Project [*]
> +- 'Filter ('name = foo)
>    +- 'UnresolvedRelation `my_table`
> == Analyzed Logical Plan ==
> user_id: bigint, name: string, ds: string
> Project [user_id#13L, name#14, ds#15]
> +- Filter (name#14 = foo)
>    +- SubqueryAlias my_table
>       +- CatalogRelation CatalogTable(
>     Database: default
>     Table: my_table
>     Owner: tejasp
>     Created: Fri Apr 14 17:05:50 PDT 2017
>     Last Access: Wed Dec 31 16:00:00 PST 1969
>     Type: MANAGED
>     Provider: hive
>     Properties: [serialization.format=1]
>     Statistics: 9223372036854775807 bytes
>     Location: file:/tmp/warehouse/my_table
>     Serde Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>     InputFormat: org.apache.hadoop.mapred.TextInputFormat
>     OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>     Partition Provider: Catalog
>     Partition Columns: [`ds`]
>     Schema: root
>     -- user_id: long (nullable = true)
>     -- name: string (nullable = true)
>     -- ds: string (nullable = true)
> ), [user_id#13L, name#14], [ds#15]
> == Optimized Logical Plan ==
> Filter (isnotnull(name#14) && (name#14 = foo))
> +- CatalogRelation CatalogTable(
>     Database: default
>     Table: my_table
>     Owner: tejasp
>     Created: Fri Apr 14 17:05:50 PDT 2017
>     Last Access: Wed Dec 31 16:00:00 PST 1969
>     Type: MANAGED
>     Provider: hive
>     Properties: [serialization.format=1]
>     Statistics: 9223372036854775807 bytes
>     Location: file:/tmp/warehouse/my_table
>     Serde Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>     InputFormat: org.apache.hadoop.mapred.TextInputFormat
>     OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>     Partition Provider: Catalog
>     Partition Columns: [`ds`]
>     Schema: root
>     -- user_id: long (nullable = true)
>     -- name: string (nullable = true)
>     -- ds: string (nullable = true)
> ), [user_id#13L, name#14], [ds#15]
> == Physical Plan ==
> *Filter (isnotnull(name#14) && (name#14 = foo))
> +- HiveTableScan [user_id#13L, name#14, ds#15], CatalogRelation CatalogTable(
>     Database: default
>     Table: my_table
>     Owner: tejasp
>     Created: Fri Apr 14 17:05:50 PDT 2017
>     Last Access: Wed Dec 31 16:00:00 PST 1969
>     Type: MANAGED
>     Provider: hive
>     Properties: [serialization.format=1]
>     Statistics: 9223372036854775807 bytes
>     Location: file:/tmp/warehouse/my_table
>     Serde Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>     InputFormat: org.apache.hadoop.mapred.TextInputFormat
>     OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>     Partition Provider: Catalog
>     Partition Columns: [`ds`]
>     Schema: root
>     -- user_id: long (nullable = true)
>     -- name: string (nullable = true)
>     -- ds: string (nullable = true)
> ), [user_id#13L, name#14], [ds#15]
> {noformat}

--
This message was sent by Atlassian JIRA (v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org