[jira] [Commented] (SPARK-34060) ALTER TABLE .. DROP PARTITION uncaches Hive table while updating table stats

2021-01-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266072#comment-17266072
 ] 

Apache Spark commented on SPARK-34060:
--

User 'MaxGekk' has created a pull request for this issue:
https://github.com/apache/spark/pull/31197

> ALTER TABLE .. DROP PARTITION uncaches Hive table while updating table stats
> 
>
> Key: SPARK-34060
> URL: https://issues.apache.org/jira/browse/SPARK-34060
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: Maxim Gekk
> Assignee: Maxim Gekk
> Priority: Major
> Fix For: 3.0.2, 3.2.0, 3.1.1
>
>
> The example below illustrates the issue:
> {code:scala}
> scala> spark.conf.set("spark.sql.statistics.size.autoUpdate.enabled", true)
> scala> sql(s"CREATE TABLE tbl (id int, part int) USING hive PARTITIONED BY (part)")
> 21/01/10 13:19:59 WARN HiveMetaStore: Location: file:/Users/maximgekk/proj/apache-spark/spark-warehouse/tbl specified for non-external table:tbl
> res12: org.apache.spark.sql.DataFrame = []
> scala> sql("INSERT INTO tbl PARTITION (part=0) SELECT 0")
> res13: org.apache.spark.sql.DataFrame = []
> scala> sql("INSERT INTO tbl PARTITION (part=1) SELECT 1")
> res14: org.apache.spark.sql.DataFrame = []
> scala> sql("CACHE TABLE tbl")
> res15: org.apache.spark.sql.DataFrame = []
> scala> sql("SELECT * FROM tbl").show(false)
> +---+----+
> |id |part|
> +---+----+
> |0  |0   |
> |1  |1   |
> +---+----+
> scala> spark.catalog.isCached("tbl")
> res17: Boolean = true
> scala> sql("ALTER TABLE tbl DROP PARTITION (part=0)")
> res18: org.apache.spark.sql.DataFrame = []
> scala> spark.catalog.isCached("tbl")
> res19: Boolean = false
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-34060) ALTER TABLE .. DROP PARTITION uncaches Hive table while updating table stats

2021-01-11 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17262524#comment-17262524
 ] 

Apache Spark commented on SPARK-34060:
--

User 'MaxGekk' has created a pull request for this issue:
https://github.com/apache/spark/pull/31126

> ALTER TABLE .. DROP PARTITION uncaches Hive table while updating table stats
> 
>
> Key: SPARK-34060
> URL: https://issues.apache.org/jira/browse/SPARK-34060
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: Maxim Gekk
> Assignee: Maxim Gekk
> Priority: Major
> Fix For: 3.2.0



[jira] [Commented] (SPARK-34060) ALTER TABLE .. DROP PARTITION uncaches Hive table while updating table stats

2021-01-11 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17262501#comment-17262501
 ] 

Apache Spark commented on SPARK-34060:
--

User 'MaxGekk' has created a pull request for this issue:
https://github.com/apache/spark/pull/31124

> ALTER TABLE .. DROP PARTITION uncaches Hive table while updating table stats
> 
>
> Key: SPARK-34060
> URL: https://issues.apache.org/jira/browse/SPARK-34060
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: Maxim Gekk
> Assignee: Maxim Gekk
> Priority: Major
> Fix For: 3.2.0



[jira] [Commented] (SPARK-34060) ALTER TABLE .. DROP PARTITION uncaches Hive table while updating table stats

2021-01-10 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17262146#comment-17262146
 ] 

Apache Spark commented on SPARK-34060:
--

User 'MaxGekk' has created a pull request for this issue:
https://github.com/apache/spark/pull/31112

> ALTER TABLE .. DROP PARTITION uncaches Hive table while updating table stats
> 
>
> Key: SPARK-34060
> URL: https://issues.apache.org/jira/browse/SPARK-34060
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: Maxim Gekk
> Priority: Major