[jira] [Commented] (SPARK-36086) The case of the delta table is inconsistent with parquet

2021-08-09 Thread Apache Spark (Jira)


    [ https://issues.apache.org/jira/browse/SPARK-36086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17395858#comment-17395858 ]

Apache Spark commented on SPARK-36086:
--

User 'AngersZh' has created a pull request for this issue:
https://github.com/apache/spark/pull/33686

> The case of the delta table is inconsistent with parquet
> 
>
>                 Key: SPARK-36086
>                 URL: https://issues.apache.org/jira/browse/SPARK-36086
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.1.1
>            Reporter: Yuming Wang
>            Assignee: angerszhu
>            Priority: Major
>             Fix For: 3.2.0
>
>
> How to reproduce this issue:
> {noformat}
> 1. Add delta-core_2.12-1.0.0-SNAPSHOT.jar to ${SPARK_HOME}/jars.
> 2. bin/spark-shell \
>      --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension \
>      --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog
> {noformat}
> {code:scala}
> spark.sql("create table t1 using parquet as select id, id as lower_id from range(5)")
> spark.sql("CREATE VIEW v1 as SELECT * FROM t1")
> spark.sql("CREATE TABLE t2 USING DELTA PARTITIONED BY (LOWER_ID) SELECT LOWER_ID, ID FROM v1")
> spark.sql("CREATE TABLE t3 USING PARQUET PARTITIONED BY (LOWER_ID) SELECT LOWER_ID, ID FROM v1")
> spark.sql("desc extended t2").show(false)
> spark.sql("desc extended t3").show(false)
> {code}
> {noformat}
> scala> spark.sql("desc extended t2").show(false)
> +----------------------------+--------------------------------------------------------------------------+-------+
> |col_name                    |data_type                                                                 |comment|
> +----------------------------+--------------------------------------------------------------------------+-------+
> |lower_id                    |bigint                                                                    |       |
> |id                          |bigint                                                                    |       |
> |                            |                                                                          |       |
> |# Partitioning              |                                                                          |       |
> |Part 0                      |lower_id                                                                  |       |
> |                            |                                                                          |       |
> |# Detailed Table Information|                                                                          |       |
> |Name                        |default.t2                                                                |       |
> |Location                    |file:/Users/yumwang/Downloads/spark-3.1.1-bin-hadoop2.7/spark-warehouse/t2|       |
> |Provider                    |delta                                                                     |       |
> |Table Properties            |[Type=MANAGED,delta.minReaderVersion=1,delta.minWriterVersion=2]          |       |
> +----------------------------+--------------------------------------------------------------------------+-------+
> scala> spark.sql("desc extended t3").show(false)
> +----------------------------+---------+-------+
> |col_name                    |data_type|comment|
> +----------------------------+---------+-------+
> |ID                          |bigint   |null   |
> |LOWER_ID                    |bigint   |null   |
> |# Partition Information     |         |       |
> |# col_name                  |data_type|comment|
> |LOWER_ID                    |bigint   |null   |
> |                            |         |       |
> |# Detailed Table Information|         |       |
> |Database                    |default  |       |
> |Table                       |t3       |       |
> |Owner                       |yumwang  |       |
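For readers following the repro: the SELECT in both CTAS statements spells the columns as LOWER_ID and ID, while the view v1 stores them as lower_id and id. With Spark's default case-insensitive analysis both spellings refer to the same columns, so the only visible difference is which spelling ends up in the new table's schema. The snippet below is a minimal sketch of that matching step in plain Scala, with no Spark dependency. `ColumnCase`, `resolve`, and `caseMismatches` are hypothetical names introduced here for illustration; they are not part of Spark or Delta Lake and do not show how either project resolves names internally.

```scala
// Illustrative only: ColumnCase and its methods are hypothetical names,
// not part of Spark or Delta Lake.
object ColumnCase {
  // Pair each name as the user wrote it with the schema's own spelling,
  // matching case-insensitively; None when the schema has no such column.
  def resolve(schema: Seq[String], written: Seq[String]): Seq[(String, Option[String])] =
    written.map(w => w -> schema.find(_.equalsIgnoreCase(w)))

  // Names that resolve but whose spelling differs from the schema by case.
  def caseMismatches(schema: Seq[String], written: Seq[String]): Seq[(String, String)] =
    resolve(schema, written).collect { case (w, Some(s)) if w != s => (w, s) }

  def main(args: Array[String]): Unit = {
    val viewSchema = Seq("id", "lower_id") // columns of v1 in the repro
    val written    = Seq("LOWER_ID", "ID") // names as typed in the CTAS SELECT
    caseMismatches(viewSchema, written).foreach { case (w, s) =>
      println(s"written as $w, stored in the source schema as $s")
    }
  }
}
```

Each pair the sketch reports corresponds to a column that `desc extended` shows with one spelling for t2 (Delta) and the other for t3 (Parquet) in the output quoted above.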

[jira] [Commented] (SPARK-36086) The case of the delta table is inconsistent with parquet

2021-08-09 Thread Apache Spark (Jira)


    [ https://issues.apache.org/jira/browse/SPARK-36086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17395856#comment-17395856 ]

Apache Spark commented on SPARK-36086:
--

User 'AngersZh' has created a pull request for this issue:
https://github.com/apache/spark/pull/33685


[jira] [Commented] (SPARK-36086) The case of the delta table is inconsistent with parquet

2021-08-02 Thread Wenchen Fan (Jira)


    [ https://issues.apache.org/jira/browse/SPARK-36086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17391679#comment-17391679 ]

Wenchen Fan commented on SPARK-36086:
-

[~krivosheinruslan] please open a ticket if you are working to improve the v2 
describe table command. This ticket is resolved because the column name case 
difference has been fixed.


[jira] [Commented] (SPARK-36086) The case of the delta table is inconsistent with parquet

2021-07-29 Thread Apache Spark (Jira)


    [ https://issues.apache.org/jira/browse/SPARK-36086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389802#comment-17389802 ]

Apache Spark commented on SPARK-36086:
--

User 'AngersZh' has created a pull request for this issue:
https://github.com/apache/spark/pull/33576

> The case of the delta table is inconsistent with parquet
> 
>
>                 Key: SPARK-36086
>                 URL: https://issues.apache.org/jira/browse/SPARK-36086
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.1
>            Reporter: Yuming Wang
>            Priority: Major
>

[jira] [Commented] (SPARK-36086) The case of the delta table is inconsistent with parquet

2021-07-27 Thread Ruslan Krivoshein (Jira)


    [ https://issues.apache.org/jira/browse/SPARK-36086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17388156#comment-17388156 ]

Ruslan Krivoshein commented on SPARK-36086:
---

Let me get on with it, please

> The case of the delta table is inconsistent with parquet
> 
>
>                 Key: SPARK-36086
>                 URL: https://issues.apache.org/jira/browse/SPARK-36086
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.1
>            Reporter: Yuming Wang
>            Priority: Major
>

[jira] [Commented] (SPARK-36086) The case of the delta table is inconsistent with parquet

2021-07-19 Thread Wenchen Fan (Jira)


    [ https://issues.apache.org/jira/browse/SPARK-36086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383195#comment-17383195 ]

Wenchen Fan commented on SPARK-36086:
-

Seems we should improve the v2 describe table command to include more 
information.

> The case of the delta table is inconsistent with parquet
> 
>
>                 Key: SPARK-36086
>                 URL: https://issues.apache.org/jira/browse/SPARK-36086
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.1
>            Reporter: Yuming Wang
>            Priority: Major
>