[jira] [Updated] (SPARK-28930) Spark DESC FORMATTED TABLENAME information display issues

Dongjoon Hyun (Jira) Fri, 30 Aug 2019 10:09:06 -0700


     [ https://issues.apache.org/jira/browse/SPARK-28930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Dongjoon Hyun updated SPARK-28930:
----------------------------------
    Description: 
DESC FORMATTED TABLENAME in Spark has several display issues: it shows an 
incorrect *Last Access* time, and some other parts of the output could be 
presented better.

Test steps:
 1. Open spark-sql.
 2. Create a partitioned table:
 CREATE EXTERNAL TABLE IF NOT EXISTS employees_info_extended ( id INT, name STRING, usd_flag STRING, salary DOUBLE, deductions MAP<STRING, DOUBLE>, address STRING ) PARTITIONED BY (entrytime STRING) STORED AS TEXTFILE LOCATION 'hdfs://hacluster/user/sparkhive/warehouse';
 3. From spark-sql, check the table description:
 desc formatted tablename;
 4. From the Scala shell, check the table description:
 sql("desc formatted tablename").show()

*Issue 1:*
 When a column has no comment, the Spark Scala shell shows *"null" in 
lowercase*, whereas Hive beeline, Spark beeline, and Spark SQL all show 
*"NULL" in capitals*. It would be better to show the same value everywhere.

 
{code}
scala> sql("desc formatted employees_info_extended").show(false);
+----------------------------+----------------------------+-------+
|col_name                    |data_type                   |comment|
+----------------------------+----------------------------+-------+
|id                          |int                         |null   |
|name                        |string                      |null   |
|usd_flag                    |string                      |null   |
|salary                      |double                      |null   |
|deductions                  |map<string,double>          |null   |
|address                     |string                      |null   |
|entrytime                   |string                      |null   |
|# Partition Information     |                            |       |
|# col_name                  |data_type                   |comment|
|entrytime                   |string                      |null   |
|                            |                            |       |
|# Detailed Table Information|                            |       |
|Database                    |sparkdb__                   |       |
|Table                       |employees_info_extended     |       |
|Owner                       |root                        |       |
|Created Time                |Tue Aug 20 13:42:06 CST 2019|       |
|Last Access                 |Thu Jan 01 08:00:00 CST 1970|       |
|Created By                  |Spark 2.4.3                 |       |
|Type                        |EXTERNAL                    |       |
|Provider                    |hive                        |       |
+----------------------------+----------------------------+-------+
only showing top 20 rows

scala>
{code}
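
The consistency asked for in Issue 1 can be pictured as one rendering rule shared by every client. The sketch below is plain Python, not Spark code: `render_rows` and `NULL_TOKEN` are hypothetical names introduced only for illustration, and the choice of "NULL" as the shared token is an assumption based on what the other clients already print.

```python
from typing import Iterable, List, Optional, Sequence

# Hypothetical shared token; "NULL" is assumed because Hive beeline,
# Spark beeline and Spark SQL already print it for a missing comment.
NULL_TOKEN = "NULL"


def render_rows(rows: Iterable[Sequence[Optional[str]]],
                null_token: str = NULL_TOKEN) -> List[str]:
    """Join each row with tabs, replacing missing cells by null_token."""
    return [
        "\t".join(null_token if cell is None else cell for cell in row)
        for row in rows
    ]


# Columns without a comment, as in the report.
for line in render_rows([("id", "int", None), ("name", "string", None)]):
    print(line)
```

If every front end went through a rule like this, a missing comment would look the same in the Scala shell and in the SQL clients.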

*Issue 2:*
 In Spark SQL, "desc formatted tablename" does not show the header 
(# col_name, data_type, comment) at the top of the query result, although the 
same header is shown above the partition description. For better readability, 
show the header at the top of the query result as well.

{code}
spark-sql> desc formatted employees_info_extended1;
 id int NULL
 name string NULL
 usd_flag string NULL
 salary double NULL
 deductions map<string,double> NULL
 address string NULL
 entrytime string NULL
 # Partition Information
 # col_name data_type comment
 entrytime string NULL

 # Detailed Table Information
 Database sparkdb__
 Table employees_info_extended1
 Owner spark
 Created Time Tue Aug 20 14:50:37 CST 2019
 Last Access Thu Jan 01 08:00:00 CST 1970
 Created By Spark 2.3.2.0201
 Type EXTERNAL
 Provider hive
 Table Properties [transient_lastDdlTime=1566286655]
 Location hdfs://hacluster/user/sparkhive/warehouse
 Serde Library org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
 InputFormat org.apache.hadoop.mapred.TextInputFormat
 OutputFormat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
 Storage Properties [serialization.format=1]
 Partition Provider Catalog
 Time taken: 0.477 seconds, Fetched 27 row(s)
spark-sql>
{code}
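
A minimal sketch of the change requested in Issue 2, again in plain Python rather than Spark code. `HEADER` and `with_header` are hypothetical names; the only assumption is that the column section is available as a list of (col_name, data_type, comment) rows before it is printed.

```python
from typing import List, Optional, Sequence, Tuple

Row = Tuple[Optional[str], ...]

# Same header text that the output already prints above the partition columns.
HEADER: Row = ("# col_name", "data_type", "comment")


def with_header(rows: Sequence[Row]) -> List[Row]:
    """Return the rows with the header as first entry, without duplicating it."""
    rows = list(rows)
    if rows and tuple(rows[0]) == HEADER:
        return rows
    return [HEADER] + rows


columns = [("id", "int", None), ("name", "string", None)]
for row in with_header(columns):
    print(row)
```

The check on the first row keeps the helper idempotent, so applying it twice does not print the header twice.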
 

*Issue 3:*
 I created the table on Aug 20, and the created time is shown correctly. *But 
the Last Access time is shown as Jan 01 1970*. A Last Access time earlier than 
the created time is misleading. It would be better to show the correct date 
and time, or else show UNKNOWN.
 *[Created Time,Tue Aug 20 13:42:06 CST 2019,]*
 *[Last Access,Thu Jan 01 08:00:00 CST 1970,]*
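
The reported Last Access value is exactly what epoch time 0 looks like in a UTC+8 zone, which suggests (the report does not confirm this) that no access time was ever recorded. The plain-Python sketch below reproduces that string and shows the fallback proposed above. `format_timestamp`, `format_last_access`, and the fixed UTC+8 `CST` zone are assumptions made for illustration, not Spark code.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "CST" in the report is China Standard Time (UTC+8).
CST = timezone(timedelta(hours=8), "CST")
DATE_FORMAT = "%a %b %d %H:%M:%S %Z %Y"


def format_timestamp(epoch_millis: int, tz: timezone = CST) -> str:
    """Render epoch milliseconds in the same layout as the DESC output."""
    return datetime.fromtimestamp(epoch_millis / 1000, tz).strftime(DATE_FORMAT)


def format_last_access(epoch_millis: int, tz: timezone = CST) -> str:
    """Hypothetical helper: show UNKNOWN when no access time was recorded."""
    if epoch_millis <= 0:
        return "UNKNOWN"
    return format_timestamp(epoch_millis, tz)


print(format_timestamp(0))    # Thu Jan 01 08:00:00 CST 1970
print(format_last_access(0))  # UNKNOWN
```

Treating every non-positive value as "not recorded" is a design choice of this sketch; it keeps the output from showing a Last Access time that predates the table's creation.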


> Spark DESC FORMATTED TABLENAME information display issues
> ---------------------------------------------------------
>
>                 Key: SPARK-28930
>                 URL: https://issues.apache.org/jira/browse/SPARK-28930
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell, SQL
>    Affects Versions: 2.4.3
>            Reporter: jobit mathew
>            Priority: Minor
>



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
