[jira] [Resolved] (SPARK-34355) Add log and time cost for commit job
[ https://issues.apache.org/jira/browse/SPARK-34355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-34355. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 31471 [https://github.com/apache/spark/pull/31471] > Add log and time cost for commit job > > > Key: SPARK-34355 > URL: https://issues.apache.org/jira/browse/SPARK-34355 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: ulysses you >Assignee: ulysses you >Priority: Minor > Fix For: 3.2.0 > > > The commit job is a heavy operation, and we have seen Spark block at > this point many times due to slow RPC with the NameNode or other services. > > It's better to record the time that the commit job costs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
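The change resolved above wraps the commit call with elapsed-time logging. A minimal sketch of that pattern, using only the JDK; the `commitJob` stand-in and log message are illustrative placeholders, not Spark's actual `FileFormatWriter` internals:

```java
// Sketch of logging the time a commit job takes (SPARK-34355 pattern).
// commitJob() is a placeholder for the real, potentially slow, commit call.
public class CommitTiming {
    static void commitJob() throws InterruptedException {
        Thread.sleep(10); // stand-in for a slow RPC to the NameNode
    }

    public static long timedCommit() throws InterruptedException {
        long start = System.nanoTime();
        commitJob();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // In Spark this would go through the normal logging framework.
        System.out.println("Committed job, elapsed time: " + elapsedMs + " ms");
        return elapsedMs;
    }

    public static void main(String[] args) throws InterruptedException {
        timedCommit();
    }
}
```

The same wrap-and-log idiom applies to any blocking call whose latency you want visible in driver logs.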
[jira] [Assigned] (SPARK-34355) Add log and time cost for commit job
[ https://issues.apache.org/jira/browse/SPARK-34355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-34355: Assignee: ulysses you > Add log and time cost for commit job > > > Key: SPARK-34355 > URL: https://issues.apache.org/jira/browse/SPARK-34355 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: ulysses you >Assignee: ulysses you >Priority: Minor > > > The commit job is a heavy operation, and we have seen Spark block at > this point many times due to slow RPC with the NameNode or other services. > > It's better to record the time that the commit job costs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Issue Comment Deleted] (SPARK-34394) Unify output of SHOW FUNCTIONS and pass output attributes properly
[ https://issues.apache.org/jira/browse/SPARK-34394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-34394: --- Comment: was deleted (was: I'm working on.) > Unify output of SHOW FUNCTIONS and pass output attributes properly > -- > > Key: SPARK-34394 > URL: https://issues.apache.org/jira/browse/SPARK-34394 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: jiaan.geng >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34394) Unify output of SHOW FUNCTIONS and pass output attributes properly
[ https://issues.apache.org/jira/browse/SPARK-34394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280803#comment-17280803 ] Apache Spark commented on SPARK-34394: -- User 'beliefer' has created a pull request for this issue: https://github.com/apache/spark/pull/31519 > Unify output of SHOW FUNCTIONS and pass output attributes properly > -- > > Key: SPARK-34394 > URL: https://issues.apache.org/jira/browse/SPARK-34394 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: jiaan.geng >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34394) Unify output of SHOW FUNCTIONS and pass output attributes properly
[ https://issues.apache.org/jira/browse/SPARK-34394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280802#comment-17280802 ] Apache Spark commented on SPARK-34394: -- User 'beliefer' has created a pull request for this issue: https://github.com/apache/spark/pull/31519 > Unify output of SHOW FUNCTIONS and pass output attributes properly > -- > > Key: SPARK-34394 > URL: https://issues.apache.org/jira/browse/SPARK-34394 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: jiaan.geng >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34394) Unify output of SHOW FUNCTIONS and pass output attributes properly
[ https://issues.apache.org/jira/browse/SPARK-34394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34394: Assignee: Apache Spark > Unify output of SHOW FUNCTIONS and pass output attributes properly > -- > > Key: SPARK-34394 > URL: https://issues.apache.org/jira/browse/SPARK-34394 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: jiaan.geng >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34394) Unify output of SHOW FUNCTIONS and pass output attributes properly
[ https://issues.apache.org/jira/browse/SPARK-34394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34394: Assignee: (was: Apache Spark) > Unify output of SHOW FUNCTIONS and pass output attributes properly > -- > > Key: SPARK-34394 > URL: https://issues.apache.org/jira/browse/SPARK-34394 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: jiaan.geng >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-34375) Replaces `Mockito.initMocks` with `Mockito.openMocks`
[ https://issues.apache.org/jira/browse/SPARK-34375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-34375. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 31487 [https://github.com/apache/spark/pull/31487] > Replaces `Mockito.initMocks` with `Mockito.openMocks` > - > > Key: SPARK-34375 > URL: https://issues.apache.org/jira/browse/SPARK-34375 > Project: Spark > Issue Type: Improvement > Components: Kubernetes, Spark Core, Tests >Affects Versions: 3.2.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > Fix For: 3.2.0 > > > Mockito.initMocks is a deprecated api, should use openMocks(Object) instead. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34375) Replaces `Mockito.initMocks` with `Mockito.openMocks`
[ https://issues.apache.org/jira/browse/SPARK-34375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-34375: Assignee: Yang Jie > Replaces `Mockito.initMocks` with `Mockito.openMocks` > - > > Key: SPARK-34375 > URL: https://issues.apache.org/jira/browse/SPARK-34375 > Project: Spark > Issue Type: Improvement > Components: Kubernetes, Spark Core, Tests >Affects Versions: 3.2.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > > Mockito.initMocks is a deprecated api, should use openMocks(Object) instead. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
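The SPARK-34375 migration is mechanical: `MockitoAnnotations.initMocks(Object)` was deprecated in Mockito 3.4 in favor of `openMocks(Object)`, which returns an `AutoCloseable` that should be closed after the test. A hedged sketch, assuming Mockito 3.4+ on the classpath; `HypotheticalSuite` is an illustrative class, not Spark test code:

```java
import java.util.List;

import org.mockito.Mock;
import org.mockito.Mockito;
import org.mockito.MockitoAnnotations;

// Sketch of the initMocks -> openMocks migration (Mockito 3.4+).
public class HypotheticalSuite {
    @Mock
    private List<String> mockedList;

    public void run() throws Exception {
        // Old, deprecated style:
        //   MockitoAnnotations.initMocks(this);
        // New style: openMocks returns an AutoCloseable that releases
        // the inline mock maker's resources when closed.
        try (AutoCloseable mocks = MockitoAnnotations.openMocks(this)) {
            mockedList.add("x");                    // interaction is recorded
            Mockito.verify(mockedList).add("x");    // and can be verified
        }
    }
}
```

In JUnit-style suites the `close()` typically lands in an `@After`/`@AfterEach` method rather than a try-with-resources block.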
[jira] [Resolved] (SPARK-34391) Upgrade commons-io to 2.8.0
[ https://issues.apache.org/jira/browse/SPARK-34391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-34391. --- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 31503 [https://github.com/apache/spark/pull/31503] > Upgrade commons-io to 2.8.0 > --- > > Key: SPARK-34391 > URL: https://issues.apache.org/jira/browse/SPARK-34391 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.2.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Fix For: 3.2.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34391) Upgrade commons-io to 2.8.0
[ https://issues.apache.org/jira/browse/SPARK-34391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-34391: - Assignee: Dongjoon Hyun > Upgrade commons-io to 2.8.0 > --- > > Key: SPARK-34391 > URL: https://issues.apache.org/jira/browse/SPARK-34391 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.2.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-34309) Use Caffeine instead of Guava Cache
[ https://issues.apache.org/jira/browse/SPARK-34309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280769#comment-17280769 ] Yang Jie edited comment on SPARK-34309 at 2/8/21, 5:48 AM: --- Thx [~xkrogen] ~ I submitted a temporary pr which includes all the places where guava cache is used {quote}[~LuciferYang] can you elaborate on where Guava Cache is used in performance critical sections within Spark? Microbenchmarks are nice but it would be good to understand what kind of effect this could have on overall performance. {quote} I need to think about where to make a Microbenchmarks was (Author: luciferyang): Thx [~xkrogen] ~ I submitted a temporary pr which includes all the places where guava cache is used, {quote}[~LuciferYang] can you elaborate on where Guava Cache is used in performance critical sections within Spark? Microbenchmarks are nice but it would be good to understand what kind of effect this could have on overall performance. {quote} I need to think about where to make a Microbenchmarks > Use Caffeine instead of Guava Cache > --- > > Key: SPARK-34309 > URL: https://issues.apache.org/jira/browse/SPARK-34309 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Yang Jie >Priority: Minor > Attachments: image-2021-02-05-18-08-48-852.png, screenshot-1.png > > > Caffeine is a high performance, near optimal caching library based on Java 8, > it is used in a similar way to guava cache, but with better performance. The > comparison results are as follow are on the [caffeine benchmarks > |https://github.com/ben-manes/caffeine/wiki/Benchmarks] > At the same time, caffeine has been used in some open source projects like > Cassandra, Hbase, Neo4j, Druid, Spring and so on. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34309) Use Caffeine instead of Guava Cache
[ https://issues.apache.org/jira/browse/SPARK-34309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280769#comment-17280769 ] Yang Jie commented on SPARK-34309: -- Thx [~xkrogen] ~ I submitted a temporary pr which includes all the places where guava cache is used, {quote}[~LuciferYang] can you elaborate on where Guava Cache is used in performance critical sections within Spark? Microbenchmarks are nice but it would be good to understand what kind of effect this could have on overall performance. {quote} I need to think about where to set up a microbenchmark > Use Caffeine instead of Guava Cache > --- > > Key: SPARK-34309 > URL: https://issues.apache.org/jira/browse/SPARK-34309 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Yang Jie >Priority: Minor > Attachments: image-2021-02-05-18-08-48-852.png, screenshot-1.png > > > Caffeine is a high-performance, near-optimal caching library based on Java 8; > it is used in a similar way to Guava cache, but with better performance. The > comparison results are shown on the [caffeine benchmarks > |https://github.com/ben-manes/caffeine/wiki/Benchmarks] > At the same time, caffeine has been used in some open source projects like > Cassandra, Hbase, Neo4j, Druid, Spring and so on. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34309) Use Caffeine instead of Guava Cache
[ https://issues.apache.org/jira/browse/SPARK-34309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280768#comment-17280768 ] Apache Spark commented on SPARK-34309: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/31517 > Use Caffeine instead of Guava Cache > --- > > Key: SPARK-34309 > URL: https://issues.apache.org/jira/browse/SPARK-34309 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Yang Jie >Priority: Minor > Attachments: image-2021-02-05-18-08-48-852.png, screenshot-1.png > > > Caffeine is a high performance, near optimal caching library based on Java 8, > it is used in a similar way to guava cache, but with better performance. The > comparison results are as follow are on the [caffeine benchmarks > |https://github.com/ben-manes/caffeine/wiki/Benchmarks] > At the same time, caffeine has been used in some open source projects like > Cassandra, Hbase, Neo4j, Druid, Spring and so on. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34239) Unify output of SHOW COLUMNS pass output attributes properly
[ https://issues.apache.org/jira/browse/SPARK-34239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280767#comment-17280767 ] Apache Spark commented on SPARK-34239: -- User 'AngersZh' has created a pull request for this issue: https://github.com/apache/spark/pull/31518 > Unify output of SHOW COLUMNS pass output attributes properly > > > Key: SPARK-34239 > URL: https://issues.apache.org/jira/browse/SPARK-34239 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > Fix For: 3.2.0 > > > Unify output of SHOW COLUMNS pass output attributes properly -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34309) Use Caffeine instead of Guava Cache
[ https://issues.apache.org/jira/browse/SPARK-34309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34309: Assignee: (was: Apache Spark) > Use Caffeine instead of Guava Cache > --- > > Key: SPARK-34309 > URL: https://issues.apache.org/jira/browse/SPARK-34309 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Yang Jie >Priority: Minor > Attachments: image-2021-02-05-18-08-48-852.png, screenshot-1.png > > > Caffeine is a high performance, near optimal caching library based on Java 8, > it is used in a similar way to guava cache, but with better performance. The > comparison results are as follow are on the [caffeine benchmarks > |https://github.com/ben-manes/caffeine/wiki/Benchmarks] > At the same time, caffeine has been used in some open source projects like > Cassandra, Hbase, Neo4j, Druid, Spring and so on. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34309) Use Caffeine instead of Guava Cache
[ https://issues.apache.org/jira/browse/SPARK-34309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280764#comment-17280764 ] Apache Spark commented on SPARK-34309: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/31517 > Use Caffeine instead of Guava Cache > --- > > Key: SPARK-34309 > URL: https://issues.apache.org/jira/browse/SPARK-34309 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Yang Jie >Priority: Minor > Attachments: image-2021-02-05-18-08-48-852.png, screenshot-1.png > > > Caffeine is a high performance, near optimal caching library based on Java 8, > it is used in a similar way to guava cache, but with better performance. The > comparison results are as follow are on the [caffeine benchmarks > |https://github.com/ben-manes/caffeine/wiki/Benchmarks] > At the same time, caffeine has been used in some open source projects like > Cassandra, Hbase, Neo4j, Druid, Spring and so on. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34309) Use Caffeine instead of Guava Cache
[ https://issues.apache.org/jira/browse/SPARK-34309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34309: Assignee: Apache Spark > Use Caffeine instead of Guava Cache > --- > > Key: SPARK-34309 > URL: https://issues.apache.org/jira/browse/SPARK-34309 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Yang Jie >Assignee: Apache Spark >Priority: Minor > Attachments: image-2021-02-05-18-08-48-852.png, screenshot-1.png > > > Caffeine is a high performance, near optimal caching library based on Java 8, > it is used in a similar way to guava cache, but with better performance. The > comparison results are as follow are on the [caffeine benchmarks > |https://github.com/ben-manes/caffeine/wiki/Benchmarks] > At the same time, caffeine has been used in some open source projects like > Cassandra, Hbase, Neo4j, Druid, Spring and so on. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
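As the SPARK-34309 description notes, Caffeine deliberately mirrors Guava's cache API, so most call sites change little beyond the builder class and package. A hedged before/after sketch, assuming Caffeine 2.x on the classpath; the key/value types are illustrative, not taken from the PR:

```java
import java.util.concurrent.TimeUnit;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

// Sketch of swapping a Guava cache for Caffeine. Only the builder changes;
// put/getIfPresent and the eviction settings carry over almost verbatim.
public class CaffeineSketch {
    public static Cache<String, String> build() {
        // Guava equivalent, for comparison:
        //   com.google.common.cache.CacheBuilder.newBuilder()
        //       .maximumSize(1000)
        //       .expireAfterWrite(10, TimeUnit.MINUTES)
        //       .build();
        return Caffeine.newBuilder()
                .maximumSize(1000)
                .expireAfterWrite(10, TimeUnit.MINUTES)
                .build();
    }

    public static void main(String[] args) {
        Cache<String, String> cache = build();
        cache.put("k", "v");
        System.out.println(cache.getIfPresent("k")); // prints "v"
    }
}
```

The near-identical surface is what makes a mostly mechanical migration PR like the one above feasible; the benchmark question raised in the comments is about whether the swap pays off, not whether it is safe.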
[jira] [Commented] (SPARK-34239) Unify output of SHOW COLUMNS pass output attributes properly
[ https://issues.apache.org/jira/browse/SPARK-34239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280765#comment-17280765 ] Apache Spark commented on SPARK-34239: -- User 'AngersZh' has created a pull request for this issue: https://github.com/apache/spark/pull/31518 > Unify output of SHOW COLUMNS pass output attributes properly > > > Key: SPARK-34239 > URL: https://issues.apache.org/jira/browse/SPARK-34239 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > Fix For: 3.2.0 > > > Unify output of SHOW COLUMNS pass output attributes properly -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34238) Unify output of SHOW PARTITIONS pass output attributes properly
[ https://issues.apache.org/jira/browse/SPARK-34238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280763#comment-17280763 ] Apache Spark commented on SPARK-34238: -- User 'AngersZh' has created a pull request for this issue: https://github.com/apache/spark/pull/31516 > Unify output of SHOW PARTITIONS pass output attributes properly > --- > > Key: SPARK-34238 > URL: https://issues.apache.org/jira/browse/SPARK-34238 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > Fix For: 3.2.0 > > > Unify output of SHOW PARTITIONS -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34238) Unify output of SHOW PARTITIONS pass output attributes properly
[ https://issues.apache.org/jira/browse/SPARK-34238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280762#comment-17280762 ] Apache Spark commented on SPARK-34238: -- User 'AngersZh' has created a pull request for this issue: https://github.com/apache/spark/pull/31516 > Unify output of SHOW PARTITIONS pass output attributes properly > --- > > Key: SPARK-34238 > URL: https://issues.apache.org/jira/browse/SPARK-34238 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > Fix For: 3.2.0 > > > Unify output of SHOW PARTITIONS -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-34344) Have functionality to trace back Spark SQL queries from the application ID that got submitted on YARN
[ https://issues.apache.org/jira/browse/SPARK-34344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpan Bhandari updated SPARK-34344: --- Description: We need to have Application Id from resource manager mapped to the specific spark sql query that got executed with respect to that application Id so that back tracing is possible. For example : if i run a query using spark shell : spark.sql("select dt.d_year,item.i_brand_id brand_id,item.i_brand brand,sum(ss_ext_sales_price) sum_agg from date_dim dt,store_sales,item where dt.d_date_sk = store_sales.ss_sold_date_sk and store_sales.ss_item_sk = item.i_item_sk and item.i_manufact_id = 436 and dt.d_moy=12 group by dt.d_year,item.i_brand,item.i_brand_id order by dt.d_year,sum_agg desc,brand_id limit 100").show(); When i see the event logs or the history server i don't see the query anywhere, but the query plan is there, so it becomes difficult to trace back what query actually got submitted. (if have to map it to the specific application Id on yarn) was: We need to have Application Id from resource manager mapped to the specific spark sql query that got executed with respect to that application Id so that back tracing is possible. For example : if i run a query using spark shell : spark.sql("select dt.d_year,item.i_brand_id brand_id,item.i_brand brand,sum(ss_ext_sales_price) sum_agg from date_dim dt,store_sales,item where dt.d_date_sk = store_sales.ss_sold_date_sk and store_sales.ss_item_sk = item.i_item_sk and item.i_manufact_id = 436 and dt.d_moy=12 group by dt.d_year,item.i_brand,item.i_brand_id order by dt.d_year,sum_agg desc,brand_id limit 100").show(); When i see the event logs or the history server i don't see the query anywhere, but the query plan is there, so it becomes difficult to trace back what query actually got submitted. 
> Have functionality to trace back Spark SQL queries from the application ID > that got submitted on YARN > - > > Key: SPARK-34344 > URL: https://issues.apache.org/jira/browse/SPARK-34344 > Project: Spark > Issue Type: New Feature > Components: Spark Shell, Spark Submit >Affects Versions: 1.6.3, 2.3.0, 2.4.5 >Reporter: Arpan Bhandari >Priority: Major > > We need to have Application Id from resource manager mapped to the specific > spark sql query that got executed with respect to that application Id so that > back tracing is possible. > For example : if i run a query using spark shell : > spark.sql("select dt.d_year,item.i_brand_id brand_id,item.i_brand > brand,sum(ss_ext_sales_price) sum_agg from date_dim dt,store_sales,item where > dt.d_date_sk = store_sales.ss_sold_date_sk and store_sales.ss_item_sk = > item.i_item_sk and item.i_manufact_id = 436 and dt.d_moy=12 group by > dt.d_year,item.i_brand,item.i_brand_id order by dt.d_year,sum_agg > desc,brand_id limit 100").show(); > When i see the event logs or the history server i don't see the query > anywhere, but the query plan is there, so it becomes difficult to trace back > what query actually got submitted. (if have to map it to the specific > application Id on yarn) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34344) Have functionality to trace back Spark SQL queries from the application ID that got submitted on YARN
[ https://issues.apache.org/jira/browse/SPARK-34344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280760#comment-17280760 ] Arpan Bhandari commented on SPARK-34344: [~hyukjin.kwon] : I have updated the diescription , let me know if u need more details on this. > Have functionality to trace back Spark SQL queries from the application ID > that got submitted on YARN > - > > Key: SPARK-34344 > URL: https://issues.apache.org/jira/browse/SPARK-34344 > Project: Spark > Issue Type: New Feature > Components: Spark Shell, Spark Submit >Affects Versions: 1.6.3, 2.3.0, 2.4.5 >Reporter: Arpan Bhandari >Priority: Major > > We need to have Application Id from resource manager mapped to the specific > spark sql query that got executed with respect to that application Id so that > back tracing is possible. > For example : if i run a query using spark shell : > spark.sql("select dt.d_year,item.i_brand_id brand_id,item.i_brand > brand,sum(ss_ext_sales_price) sum_agg from date_dim dt,store_sales,item where > dt.d_date_sk = store_sales.ss_sold_date_sk and store_sales.ss_item_sk = > item.i_item_sk and item.i_manufact_id = 436 and dt.d_moy=12 group by > dt.d_year,item.i_brand,item.i_brand_id order by dt.d_year,sum_agg > desc,brand_id limit 100").show(); > When i see the event logs or the history server i don't see the query > anywhere, but the query plan is there, so it becomes difficult to trace back > what query actually got submitted. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-34344) Have functionality to trace back Spark SQL queries from the application ID that got submitted on YARN
[ https://issues.apache.org/jira/browse/SPARK-34344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280760#comment-17280760 ] Arpan Bhandari edited comment on SPARK-34344 at 2/8/21, 5:37 AM: - [~hyukjin.kwon] : I have updated the description , let me know if u need more details on this. was (Author: arpan3189): [~hyukjin.kwon] : I have updated the diescription , let me know if u need more details on this. > Have functionality to trace back Spark SQL queries from the application ID > that got submitted on YARN > - > > Key: SPARK-34344 > URL: https://issues.apache.org/jira/browse/SPARK-34344 > Project: Spark > Issue Type: New Feature > Components: Spark Shell, Spark Submit >Affects Versions: 1.6.3, 2.3.0, 2.4.5 >Reporter: Arpan Bhandari >Priority: Major > > We need to have Application Id from resource manager mapped to the specific > spark sql query that got executed with respect to that application Id so that > back tracing is possible. > For example : if i run a query using spark shell : > spark.sql("select dt.d_year,item.i_brand_id brand_id,item.i_brand > brand,sum(ss_ext_sales_price) sum_agg from date_dim dt,store_sales,item where > dt.d_date_sk = store_sales.ss_sold_date_sk and store_sales.ss_item_sk = > item.i_item_sk and item.i_manufact_id = 436 and dt.d_moy=12 group by > dt.d_year,item.i_brand,item.i_brand_id order by dt.d_year,sum_agg > desc,brand_id limit 100").show(); > When i see the event logs or the history server i don't see the query > anywhere, but the query plan is there, so it becomes difficult to trace back > what query actually got submitted. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-34344) Have functionality to trace back Spark SQL queries from the application ID that got submitted on YARN
[ https://issues.apache.org/jira/browse/SPARK-34344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpan Bhandari updated SPARK-34344: --- Description: We need to have Application Id from resource manager mapped to the specific spark sql query that got executed with respect to that application Id so that back tracing is possible. For example : if i run a query using spark shell : spark.sql("select dt.d_year,item.i_brand_id brand_id,item.i_brand brand,sum(ss_ext_sales_price) sum_agg from date_dim dt,store_sales,item where dt.d_date_sk = store_sales.ss_sold_date_sk and store_sales.ss_item_sk = item.i_item_sk and item.i_manufact_id = 436 and dt.d_moy=12 group by dt.d_year,item.i_brand,item.i_brand_id order by dt.d_year,sum_agg desc,brand_id limit 100").show(); When i see the event logs or the history server i don't see the query anywhere, but the query plan is there, so it becomes difficult to trace back what query actually got submitted. was:We need to have Application Id from resource manager mapped to the specific spark sql query that got executed with respect to that application Id so that back tracing is possible. > Have functionality to trace back Spark SQL queries from the application ID > that got submitted on YARN > - > > Key: SPARK-34344 > URL: https://issues.apache.org/jira/browse/SPARK-34344 > Project: Spark > Issue Type: New Feature > Components: Spark Shell, Spark Submit >Affects Versions: 1.6.3, 2.3.0, 2.4.5 >Reporter: Arpan Bhandari >Priority: Major > > We need to have Application Id from resource manager mapped to the specific > spark sql query that got executed with respect to that application Id so that > back tracing is possible. 
> For example : if i run a query using spark shell : > spark.sql("select dt.d_year,item.i_brand_id brand_id,item.i_brand > brand,sum(ss_ext_sales_price) sum_agg from date_dim dt,store_sales,item where > dt.d_date_sk = store_sales.ss_sold_date_sk and store_sales.ss_item_sk = > item.i_item_sk and item.i_manufact_id = 436 and dt.d_moy=12 group by > dt.d_year,item.i_brand,item.i_brand_id order by dt.d_year,sum_agg > desc,brand_id limit 100").show(); > When i see the event logs or the history server i don't see the query > anywhere, but the query plan is there, so it becomes difficult to trace back > what query actually got submitted. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34158) Incorrect url of the only developer Matei in pom.xml
[ https://issues.apache.org/jira/browse/SPARK-34158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-34158: Assignee: Kevin Su > Incorrect url of the only developer Matei in pom.xml > > > Key: SPARK-34158 > URL: https://issues.apache.org/jira/browse/SPARK-34158 > Project: Spark > Issue Type: Improvement > Components: Build, Spark Core >Affects Versions: 3.1.1 >Reporter: Jacek Laskowski >Assignee: Kevin Su >Priority: Minor > Fix For: 3.1.2 > > > {{[http://www.cs.berkeley.edu/~matei]}} in > [pom.xml|https://github.com/apache/spark/blob/53fe365edb948d0e05a5ccb62f349cd9fcb4bb5d/pom.xml#L51] > gives > {quote}Resource not found > The server has encountered a problem because the resource was not found. > Your request was : > [https://people.eecs.berkeley.edu/~matei] > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34158) Incorrect url of the only developer Matei in pom.xml
[ https://issues.apache.org/jira/browse/SPARK-34158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280747#comment-17280747 ] Kevin Su commented on SPARK-34158: -- Hi [~hyukjin.kwon], my Jira ID is pingsutw > Incorrect url of the only developer Matei in pom.xml > > > Key: SPARK-34158 > URL: https://issues.apache.org/jira/browse/SPARK-34158 > Project: Spark > Issue Type: Improvement > Components: Build, Spark Core >Affects Versions: 3.1.1 >Reporter: Jacek Laskowski >Priority: Minor > Fix For: 3.1.2 > > > {{[http://www.cs.berkeley.edu/~matei]}} in > [pom.xml|https://github.com/apache/spark/blob/53fe365edb948d0e05a5ccb62f349cd9fcb4bb5d/pom.xml#L51] > gives > {quote}Resource not found > The server has encountered a problem because the resource was not found. > Your request was : > [https://people.eecs.berkeley.edu/~matei] > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-34399) Add file commit time to metrics and shown in SQL Tab UI
angerszhu created SPARK-34399: - Summary: Add file commit time to metrics and shown in SQL Tab UI Key: SPARK-34399 URL: https://issues.apache.org/jira/browse/SPARK-34399 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.2.0 Reporter: angerszhu Add file commit time to metrics and shown in SQL Tab UI -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34399) Add file commit time to metrics and shown in SQL Tab UI
[ https://issues.apache.org/jira/browse/SPARK-34399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280726#comment-17280726 ] angerszhu commented on SPARK-34399: --- Working on it > Add file commit time to metrics and shown in SQL Tab UI > --- > > Key: SPARK-34399 > URL: https://issues.apache.org/jira/browse/SPARK-34399 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Priority: Major > > Add file commit time to metrics and shown in SQL Tab UI -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-34392) Invalid ID for offset-based ZoneId since Spark 3.0
[ https://issues.apache.org/jira/browse/SPARK-34392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280724#comment-17280724 ] Yuming Wang edited comment on SPARK-34392 at 2/8/21, 2:42 AM: -- https://docs.oracle.com/javase/8/docs/api/java/time/ZoneId.html was (Author: q79969786): https://docs.oracle.com/javase/8/docs/api/java/time/ZoneId.htmlhttps://docs.oracle.com/javase/8/docs/api/java/time/ZoneId.html > Invalid ID for offset-based ZoneId since Spark 3.0 > -- > > Key: SPARK-34392 > URL: https://issues.apache.org/jira/browse/SPARK-34392 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0, 3.0.1 >Reporter: Yuming Wang >Priority: Major > > How to reproduce this issue: > {code:sql} > select to_utc_timestamp("2020-02-07 16:00:00", "GMT+8:00"); > {code} > Spark 2.4: > {noformat} > spark-sql> select to_utc_timestamp("2020-02-07 16:00:00", "GMT+8:00"); > 2020-02-07 08:00:00 > Time taken: 0.089 seconds, Fetched 1 row(s) > {noformat} > Spark 3.x: > {noformat} > spark-sql> select to_utc_timestamp("2020-02-07 16:00:00", "GMT+8:00"); > 21/02/07 01:24:32 ERROR SparkSQLDriver: Failed in [select > to_utc_timestamp("2020-02-07 16:00:00", "GMT+8:00")] > java.time.DateTimeException: Invalid ID for offset-based ZoneId: GMT+8:00 > at java.time.ZoneId.ofWithPrefix(ZoneId.java:437) > at java.time.ZoneId.of(ZoneId.java:407) > at java.time.ZoneId.of(ZoneId.java:359) > at java.time.ZoneId.of(ZoneId.java:315) > at > org.apache.spark.sql.catalyst.util.DateTimeUtils$.getZoneId(DateTimeUtils.scala:53) > at > org.apache.spark.sql.catalyst.util.DateTimeUtils$.toUTCTime(DateTimeUtils.scala:814) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34392) Invalid ID for offset-based ZoneId since Spark 3.0
[ https://issues.apache.org/jira/browse/SPARK-34392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280724#comment-17280724 ] Yuming Wang commented on SPARK-34392: - https://docs.oracle.com/javase/8/docs/api/java/time/ZoneId.htmlhttps://docs.oracle.com/javase/8/docs/api/java/time/ZoneId.html > Invalid ID for offset-based ZoneId since Spark 3.0 > -- > > Key: SPARK-34392 > URL: https://issues.apache.org/jira/browse/SPARK-34392 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0, 3.0.1 >Reporter: Yuming Wang >Priority: Major > > How to reproduce this issue: > {code:sql} > select to_utc_timestamp("2020-02-07 16:00:00", "GMT+8:00"); > {code} > Spark 2.4: > {noformat} > spark-sql> select to_utc_timestamp("2020-02-07 16:00:00", "GMT+8:00"); > 2020-02-07 08:00:00 > Time taken: 0.089 seconds, Fetched 1 row(s) > {noformat} > Spark 3.x: > {noformat} > spark-sql> select to_utc_timestamp("2020-02-07 16:00:00", "GMT+8:00"); > 21/02/07 01:24:32 ERROR SparkSQLDriver: Failed in [select > to_utc_timestamp("2020-02-07 16:00:00", "GMT+8:00")] > java.time.DateTimeException: Invalid ID for offset-based ZoneId: GMT+8:00 > at java.time.ZoneId.ofWithPrefix(ZoneId.java:437) > at java.time.ZoneId.of(ZoneId.java:407) > at java.time.ZoneId.of(ZoneId.java:359) > at java.time.ZoneId.of(ZoneId.java:315) > at > org.apache.spark.sql.catalyst.util.DateTimeUtils$.getZoneId(DateTimeUtils.scala:53) > at > org.apache.spark.sql.catalyst.util.DateTimeUtils$.toUTCTime(DateTimeUtils.scala:814) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
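The regression above comes down to the JDK API change underneath: Spark 2.4 resolved time zone strings through the lenient legacy `java.util.TimeZone`, while Spark 3.x goes through the strict `java.time.ZoneId`, which rejects single-digit hours after a `GMT` prefix. A minimal JDK-only sketch (class name invented for illustration; this is not Spark code):

```java
import java.time.DateTimeException;
import java.time.ZoneId;
import java.util.TimeZone;

public class ZoneIdStrictness {
    public static void main(String[] args) {
        // Legacy API (the Spark 2.4 path) is lenient: custom GMT IDs may use
        // single-digit hours, so "GMT+8:00" resolves to an 8-hour offset.
        System.out.println(TimeZone.getTimeZone("GMT+8:00").getRawOffset()); // 28800000 ms

        // java.time (the Spark 3.x path) is strict: after the "GMT" prefix the
        // remainder must parse as a ZoneOffset, and "+8:00" does not.
        try {
            ZoneId.of("GMT+8:00");
        } catch (DateTimeException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        // The zero-padded form is accepted by both APIs.
        System.out.println(ZoneId.of("GMT+08:00")); // GMT+08:00
    }
}
```

So queries using `"GMT+08:00"` (zero-padded) keep working across both Spark versions, which suggests the workaround for the reported query.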
[jira] [Updated] (SPARK-34395) Clean up unused code for code simplifications
[ https://issues.apache.org/jira/browse/SPARK-34395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen updated SPARK-34395: - Priority: Trivial (was: Minor) > Clean up unused code for code simplifications > - > > Key: SPARK-34395 > URL: https://issues.apache.org/jira/browse/SPARK-34395 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.0, 3.2.0 >Reporter: yikf >Priority: Trivial > > Currently, we pass the default value `EmptyRow` to method `checkEvaluation` > in the StringExpressionsSuite, but the default value of the 'checkEvaluation' > method parameter is the `emptyRow`. > We can clean the parameter for Code Simplifications. > > example: > *before:* > {code:java} > def testConcat(inputs: String*): Unit = { > val expected = if (inputs.contains(null)) null else inputs.mkString > checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), > expected, EmptyRow) > }{code} > *after:* > {code:java} > def testConcat(inputs: String*): Unit = { > val expected = if (inputs.contains(null)) null else inputs.mkString > checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), expected) > }{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-34374) Use standard methods to extract keys or values from a Map.
[ https://issues.apache.org/jira/browse/SPARK-34374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen updated SPARK-34374: - Priority: Trivial (was: Minor) > Use standard methods to extract keys or values from a Map. > -- > > Key: SPARK-34374 > URL: https://issues.apache.org/jira/browse/SPARK-34374 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: Yang Jie >Priority: Trivial > > For keys: > *before* > {code:scala} > map.map(_._1) > {code} > *after* > {code:java} > map.keys > {code} > For values: > {code:scala} > map.map(_._2) > {code} > *after* > {code:java} > map.values > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
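The same cleanup has a direct plain-JDK analogue: rebuilding the key set by projecting over entries when the map already exposes a `keySet()` view. A small illustrative sketch (class and variable names invented; not Spark code):

```java
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class MapExtraction {
    public static void main(String[] args) {
        Map<String, Integer> m = Map.of("a", 1, "b", 2);

        // Roundabout: map over entries to rebuild the key set
        // (the Java analogue of Scala's map.map(_._1)).
        Set<String> viaStream = m.entrySet().stream()
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());

        // Idiomatic: the collection view already exists.
        Set<String> viaView = m.keySet();

        System.out.println(viaStream.equals(viaView)); // true
    }
}
```

Besides readability, the standard views avoid materializing a new collection, which is the same argument made for `map.keys`/`map.values` in Scala.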
[jira] [Commented] (SPARK-34346) io.file.buffer.size set by spark.buffer.size will override by hive-site.xml may cause perf regression
[ https://issues.apache.org/jira/browse/SPARK-34346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280700#comment-17280700 ] Apache Spark commented on SPARK-34346: -- User 'dongjoon-hyun' has created a pull request for this issue: https://github.com/apache/spark/pull/31515 > io.file.buffer.size set by spark.buffer.size will override by hive-site.xml > may cause perf regression > - > > Key: SPARK-34346 > URL: https://issues.apache.org/jira/browse/SPARK-34346 > Project: Spark > Issue Type: Bug > Components: Spark Core, SQL >Affects Versions: 3.0.2, 3.1.1 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Blocker > Fix For: 3.0.2, 3.1.1 > > > In many real-world cases, when interacting with the hive catalog through Spark > SQL, users may just share the `hive-site.xml` for their hive jobs and make a > copy to `SPARK_HOME`/conf w/o modification. In Spark, when we generate Hadoop > configurations, we will use `spark.buffer.size(65536)` to reset > `io.file.buffer.size(4096)`. But when we load the hive-site.xml, we may > ignore this behavior and reset `io.file.buffer.size` again according to > `hive-site.xml`. > 1. The configuration priority for setting Hadoop and Hive config here is not > right, while literally, the order should be `spark > spark.hive > > spark.hadoop > hive > hadoop` > 2. This breaks the `spark.buffer.size` config's behavior for tuning the IO > performance w/ HDFS if there is an existing `io.file.buffer.size` in > hive-site.xml -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
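The precedence order stated in the report (`spark > spark.hive > spark.hadoop > hive > hadoop`) amounts to a first-match lookup over ordered config layers. The following is a hypothetical sketch of that resolution rule, not Spark's actual implementation:

```java
import java.util.List;
import java.util.Map;

public class ConfigPrecedence {
    // First layer that defines the key wins; layers are ordered
    // highest-priority first.
    static String resolve(String key, List<Map<String, String>> layers) {
        for (Map<String, String> layer : layers) {
            String v = layer.get(key);
            if (v != null) return v;
        }
        return null;
    }

    public static void main(String[] args) {
        // Values taken from the report: Spark sets 65536 via
        // spark.buffer.size; hive-site.xml and Hadoop default to 4096.
        Map<String, String> sparkLayer = Map.of("io.file.buffer.size", "65536");
        Map<String, String> hiveSite = Map.of("io.file.buffer.size", "4096");
        Map<String, String> hadoopDefaults = Map.of("io.file.buffer.size", "4096");

        // Intended order: spark-level settings beat hive-site.xml, which
        // beats Hadoop defaults. The bug was applying hive-site.xml last.
        String v = resolve("io.file.buffer.size",
                List.of(sparkLayer, hiveSite, hadoopDefaults));
        System.out.println(v); // 65536
    }
}
```

The reported bug is equivalent to evaluating the layers in the wrong order, so the hive-site.xml value silently wins over the tuned Spark setting.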
[jira] [Resolved] (SPARK-34158) Incorrect url of the only developer Matei in pom.xml
[ https://issues.apache.org/jira/browse/SPARK-34158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-34158. -- Fix Version/s: 3.1.2 Resolution: Fixed Issue resolved by pull request 31512 [https://github.com/apache/spark/pull/31512] > Incorrect url of the only developer Matei in pom.xml > > > Key: SPARK-34158 > URL: https://issues.apache.org/jira/browse/SPARK-34158 > Project: Spark > Issue Type: Improvement > Components: Build, Spark Core >Affects Versions: 3.1.1 >Reporter: Jacek Laskowski >Priority: Minor > Fix For: 3.1.2 > > > {{[http://www.cs.berkeley.edu/~matei]}} in > [pom.xml|https://github.com/apache/spark/blob/53fe365edb948d0e05a5ccb62f349cd9fcb4bb5d/pom.xml#L51] > gives > {quote}Resource not found > The server has encountered a problem because the resource was not found. > Your request was : > [https://people.eecs.berkeley.edu/~matei] > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34398) Missing pyspark 3.1.1 doc migration
[ https://issues.apache.org/jira/browse/SPARK-34398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-34398: Assignee: raphael auv > Missing pyspark 3.1.1 doc migration > --- > > Key: SPARK-34398 > URL: https://issues.apache.org/jira/browse/SPARK-34398 > Project: Spark > Issue Type: Improvement > Components: Documentation >Affects Versions: 3.1.1 >Reporter: raphael auv >Assignee: raphael auv >Priority: Major > > The link on the page > [https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/_site/pyspark-migration-guide.html] > is not working. > Also, there is nothing covering the 3.0 to 3.1.1 migration on this page > https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/_site/api/python/migration_guide/index.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-34398) Missing pyspark 3.1.1 doc migration
[ https://issues.apache.org/jira/browse/SPARK-34398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-34398. -- Fix Version/s: 3.1.2 Resolution: Fixed Issue resolved by pull request 31514 [https://github.com/apache/spark/pull/31514] > Missing pyspark 3.1.1 doc migration > --- > > Key: SPARK-34398 > URL: https://issues.apache.org/jira/browse/SPARK-34398 > Project: Spark > Issue Type: Improvement > Components: Documentation >Affects Versions: 3.1.1 >Reporter: raphael auv >Assignee: raphael auv >Priority: Major > Fix For: 3.1.2 > > > The link on the page > [https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/_site/pyspark-migration-guide.html] > is not working. > Also, there is nothing covering the 3.0 to 3.1.1 migration on this page > https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/_site/api/python/migration_guide/index.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-34398) Missing pyspark 3.1.1 doc migration
[ https://issues.apache.org/jira/browse/SPARK-34398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-34398: - Priority: Major (was: Blocker) > Missing pyspark 3.1.1 doc migration > --- > > Key: SPARK-34398 > URL: https://issues.apache.org/jira/browse/SPARK-34398 > Project: Spark > Issue Type: Improvement > Components: Documentation >Affects Versions: 3.1.1 >Reporter: raphael auv >Priority: Major > > The link on the page > [https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/_site/pyspark-migration-guide.html] > is not working. > Also, there is nothing covering the 3.0 to 3.1.1 migration on this page > https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/_site/api/python/migration_guide/index.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34349) No python3 in docker images
[ https://issues.apache.org/jira/browse/SPARK-34349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280694#comment-17280694 ] Hyukjin Kwon commented on SPARK-34349: -- [~chethanuk] does it work if you set {{spark.kubernetes.pyspark.pythonVersion}} to {{3}}? > No python3 in docker images > > > Key: SPARK-34349 > URL: https://issues.apache.org/jira/browse/SPARK-34349 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 3.0.1 >Reporter: Tim Hughes >Priority: Critical > > The spark-py container image doesn't receive the instruction to use python3 > and defaults to python 2.7 > > The worker container was built using the following commands > {code:java} > mkdir ./tmp > wget -qO- > https://www.mirrorservice.org/sites/ftp.apache.org/spark/spark-3.0.1/spark-3.0.1-bin-hadoop3.2.tgz > | tar -C ./tmp/ -xzf - > cd ../spark-3.0.1-bin-hadoop3.2/ > ./bin/docker-image-tool.sh -r docker.io/timhughes -t > spark-3.0.1-bin-hadoop3.2 -p > kubernetes/dockerfiles/spark/bindings/python/Dockerfile build > docker push docker.io/timhughes/spark-py:spark-3.0.1-bin-hadoop3.2{code} > > This is the code I am using to initialize the workers > > {code:java} > import os > from pyspark import SparkContext, SparkConf > from pyspark.sql import SparkSession > # Create Spark config for our Kubernetes > based cluster manager > sparkConf = SparkConf() > sparkConf.setMaster("k8s://https://kubernetes.default.svc.cluster.local:443") > sparkConf.setAppName("spark") > sparkConf.set("spark.kubernetes.container.image", > "docker.io/timhughes/spark-py:spark-3.0.1-bin-hadoop3.2") > sparkConf.set("spark.kubernetes.namespace", "spark") > sparkConf.set("spark.executor.instances", "2") > sparkConf.set("spark.executor.cores", "1") > sparkConf.set("spark.driver.memory", "1024m") > sparkConf.set("spark.executor.memory", "1024m") > sparkConf.set("spark.kubernetes.pyspark.pythonVersion", "3") > 
sparkConf.set("spark.kubernetes.authenticate.driver.serviceAccountName", > "spark") > sparkConf.set("spark.kubernetes.authenticate.serviceAccountName", "spark") > sparkConf.set("spark.driver.port", "29413") > sparkConf.set("spark.driver.host", > "my-notebook-deployment.spark.svc.cluster.local") > # Initialize our Spark cluster, this will actually > # generate the worker nodes. > spark = SparkSession.builder.config(conf=sparkConf).getOrCreate() > sc = spark.sparkContext > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34398) Missing pyspark 3.1.1 doc migration
[ https://issues.apache.org/jira/browse/SPARK-34398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280691#comment-17280691 ] Apache Spark commented on SPARK-34398: -- User 'raphaelauv' has created a pull request for this issue: https://github.com/apache/spark/pull/31514 > Missing pyspark 3.1.1 doc migration > --- > > Key: SPARK-34398 > URL: https://issues.apache.org/jira/browse/SPARK-34398 > Project: Spark > Issue Type: Improvement > Components: Documentation >Affects Versions: 3.1.1 >Reporter: raphael auv >Priority: Blocker > > The link on the page > [https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/_site/pyspark-migration-guide.html] > is not working. > Also, there is nothing covering the 3.0 to 3.1.1 migration on this page > https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/_site/api/python/migration_guide/index.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34398) Missing pyspark 3.1.1 doc migration
[ https://issues.apache.org/jira/browse/SPARK-34398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34398: Assignee: (was: Apache Spark) > Missing pyspark 3.1.1 doc migration > --- > > Key: SPARK-34398 > URL: https://issues.apache.org/jira/browse/SPARK-34398 > Project: Spark > Issue Type: Improvement > Components: Documentation >Affects Versions: 3.1.1 >Reporter: raphael auv >Priority: Blocker > > The link on the page > [https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/_site/pyspark-migration-guide.html] > is not working. > Also, there is nothing covering the 3.0 to 3.1.1 migration on this page > https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/_site/api/python/migration_guide/index.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34398) Missing pyspark 3.1.1 doc migration
[ https://issues.apache.org/jira/browse/SPARK-34398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34398: Assignee: Apache Spark > Missing pyspark 3.1.1 doc migration > --- > > Key: SPARK-34398 > URL: https://issues.apache.org/jira/browse/SPARK-34398 > Project: Spark > Issue Type: Improvement > Components: Documentation >Affects Versions: 3.1.1 >Reporter: raphael auv >Assignee: Apache Spark >Priority: Blocker > > The link on the page > [https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/_site/pyspark-migration-guide.html] > is not working. > Also, there is nothing covering the 3.0 to 3.1.1 migration on this page > https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/_site/api/python/migration_guide/index.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-34398) Missing pyspark 3.1.1 doc migration
raphael auv created SPARK-34398: --- Summary: Missing pyspark 3.1.1 doc migration Key: SPARK-34398 URL: https://issues.apache.org/jira/browse/SPARK-34398 Project: Spark Issue Type: Improvement Components: Documentation Affects Versions: 3.1.1 Reporter: raphael auv The link on the page [https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/_site/pyspark-migration-guide.html] is not working. Also, there is nothing covering the 3.0 to 3.1.1 migration on this page https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/_site/api/python/migration_guide/index.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34361) Dynamic allocation on K8s kills executors with running tasks
[ https://issues.apache.org/jira/browse/SPARK-34361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280636#comment-17280636 ] Apache Spark commented on SPARK-34361: -- User 'attilapiros' has created a pull request for this issue: https://github.com/apache/spark/pull/31513 > Dynamic allocation on K8s kills executors with running tasks > > > Key: SPARK-34361 > URL: https://issues.apache.org/jira/browse/SPARK-34361 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 3.0.0, 3.0.1, 3.0.2, 3.1.0, 3.2.0, 3.1.1, 3.1.2 >Reporter: Attila Zsolt Piros >Priority: Major > > There is a race between the executor POD allocator and the cluster scheduler backend. > During downscaling (in dynamic allocation) we experienced a lot of killed new > executors with running tasks on them. > The pattern in the log is the following: > {noformat} > 21/02/01 15:12:03 INFO ExecutorMonitor: New executor 312 has registered (new > total is 138) > ... > 21/02/01 15:12:03 INFO TaskSetManager: Starting task 247.0 in stage 4.0 (TID > 2079, 100.100.18.138, executor 312, partition 247, PROCESS_LOCAL, 8777 bytes) > 21/02/01 15:12:03 INFO ExecutorPodsAllocator: Deleting 3 excess pod requests > (408,312,307). > ... > 21/02/01 15:12:04 ERROR TaskSchedulerImpl: Lost executor 312 on > 100.100.18.138: The executor with id 312 was deleted by a user or the > framework. > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34361) Dynamic allocation on K8s kills executors with running tasks
[ https://issues.apache.org/jira/browse/SPARK-34361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34361: Assignee: (was: Apache Spark) > Dynamic allocation on K8s kills executors with running tasks > > > Key: SPARK-34361 > URL: https://issues.apache.org/jira/browse/SPARK-34361 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 3.0.0, 3.0.1, 3.0.2, 3.1.0, 3.2.0, 3.1.1, 3.1.2 >Reporter: Attila Zsolt Piros >Priority: Major > > There is a race between the executor POD allocator and the cluster scheduler backend. > During downscaling (in dynamic allocation) we experienced a lot of killed new > executors with running tasks on them. > The pattern in the log is the following: > {noformat} > 21/02/01 15:12:03 INFO ExecutorMonitor: New executor 312 has registered (new > total is 138) > ... > 21/02/01 15:12:03 INFO TaskSetManager: Starting task 247.0 in stage 4.0 (TID > 2079, 100.100.18.138, executor 312, partition 247, PROCESS_LOCAL, 8777 bytes) > 21/02/01 15:12:03 INFO ExecutorPodsAllocator: Deleting 3 excess pod requests > (408,312,307). > ... > 21/02/01 15:12:04 ERROR TaskSchedulerImpl: Lost executor 312 on > 100.100.18.138: The executor with id 312 was deleted by a user or the > framework. > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34361) Dynamic allocation on K8s kills executors with running tasks
[ https://issues.apache.org/jira/browse/SPARK-34361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34361: Assignee: Apache Spark > Dynamic allocation on K8s kills executors with running tasks > > > Key: SPARK-34361 > URL: https://issues.apache.org/jira/browse/SPARK-34361 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 3.0.0, 3.0.1, 3.0.2, 3.1.0, 3.2.0, 3.1.1, 3.1.2 >Reporter: Attila Zsolt Piros >Assignee: Apache Spark >Priority: Major > > There is a race between the executor POD allocator and the cluster scheduler backend. > During downscaling (in dynamic allocation) we experienced a lot of killed new > executors with running tasks on them. > The pattern in the log is the following: > {noformat} > 21/02/01 15:12:03 INFO ExecutorMonitor: New executor 312 has registered (new > total is 138) > ... > 21/02/01 15:12:03 INFO TaskSetManager: Starting task 247.0 in stage 4.0 (TID > 2079, 100.100.18.138, executor 312, partition 247, PROCESS_LOCAL, 8777 bytes) > 21/02/01 15:12:03 INFO ExecutorPodsAllocator: Deleting 3 excess pod requests > (408,312,307). > ... > 21/02/01 15:12:04 ERROR TaskSchedulerImpl: Lost executor 312 on > 100.100.18.138: The executor with id 312 was deleted by a user or the > framework. > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
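The log pattern in the report can be condensed into a toy model of the race: the allocator snapshots its "excess" pod requests, but an executor from that snapshot may register with the scheduler backend (and receive a task) before the delete happens. The following is a hypothetical sketch of that timeline, not Spark's actual ExecutorPodsAllocator code:

```java
import java.util.HashSet;
import java.util.Set;

public class ExcessPodRace {
    public static void main(String[] args) {
        // Snapshot of pod requests the allocator still believes are pending
        // (IDs taken from the reported log line).
        Set<Integer> excess = new HashSet<>(Set.of(408, 312, 307));

        // Meanwhile, executor 312 registers with the scheduler backend and is
        // handed task 247.0, so the allocator's snapshot is now stale.
        Set<Integer> registered = Set.of(312);

        // Deleting the stale snapshot as-is kills a busy executor (the bug).
        // A safer allocator re-checks registration before deleting:
        excess.removeAll(registered);
        System.out.println(excess); // only 408 and 307 are deleted
    }
}
```

The fix direction implied by the report is exactly this re-check: reconcile the pending-request set against the scheduler backend's view before issuing deletions.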
[jira] [Commented] (SPARK-34349) No python3 in docker images
[ https://issues.apache.org/jira/browse/SPARK-34349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280634#comment-17280634 ] Chethan UK commented on SPARK-34349: Yes, this is fixed. I have tested with 3.1.1 > No python3 in docker images > > > Key: SPARK-34349 > URL: https://issues.apache.org/jira/browse/SPARK-34349 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 3.0.1 >Reporter: Tim Hughes >Priority: Critical > > The spark-py container image doesn't receive the instruction to use python3 > and defaults to python 2.7 > > The worker container was built using the following commands > {code:java} > mkdir ./tmp > wget -qO- > https://www.mirrorservice.org/sites/ftp.apache.org/spark/spark-3.0.1/spark-3.0.1-bin-hadoop3.2.tgz > | tar -C ./tmp/ -xzf - > cd ../spark-3.0.1-bin-hadoop3.2/ > ./bin/docker-image-tool.sh -r docker.io/timhughes -t > spark-3.0.1-bin-hadoop3.2 -p > kubernetes/dockerfiles/spark/bindings/python/Dockerfile build > docker push docker.io/timhughes/spark-py:spark-3.0.1-bin-hadoop3.2{code} > > This is the code I am using to initialize the workers > > {code:java} > import os > from pyspark import SparkContext, SparkConf > from pyspark.sql import SparkSession > # Create Spark config for our Kubernetes > based cluster manager > sparkConf = SparkConf() > sparkConf.setMaster("k8s://https://kubernetes.default.svc.cluster.local:443") > sparkConf.setAppName("spark") > sparkConf.set("spark.kubernetes.container.image", > "docker.io/timhughes/spark-py:spark-3.0.1-bin-hadoop3.2") > sparkConf.set("spark.kubernetes.namespace", "spark") > sparkConf.set("spark.executor.instances", "2") > sparkConf.set("spark.executor.cores", "1") > sparkConf.set("spark.driver.memory", "1024m") > sparkConf.set("spark.executor.memory", "1024m") > sparkConf.set("spark.kubernetes.pyspark.pythonVersion", "3") > sparkConf.set("spark.kubernetes.authenticate.driver.serviceAccountName", > "spark") > 
sparkConf.set("spark.kubernetes.authenticate.serviceAccountName", "spark") > sparkConf.set("spark.driver.port", "29413") > sparkConf.set("spark.driver.host", > "my-notebook-deployment.spark.svc.cluster.local") > # Initialize our Spark cluster, this will actually > # generate the worker nodes. > spark = SparkSession.builder.config(conf=sparkConf).getOrCreate() > sc = spark.sparkContext > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33603) Group exception messages in execution/command
[ https://issues.apache.org/jira/browse/SPARK-33603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280610#comment-17280610 ] Kevin Su commented on SPARK-33603: -- I'm working on it. > Group exception messages in execution/command > - > > Key: SPARK-33603 > URL: https://issues.apache.org/jira/browse/SPARK-33603 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Allison Wang >Priority: Major > > '/core/src/main/scala/org/apache/spark/sql/execution/command' > || Filename || Count || > | AnalyzeColumnCommand.scala | 3 | > | AnalyzePartitionCommand.scala | 2 | > | AnalyzeTableCommand.scala | 1 | > | SetCommand.scala | 2 | > | createDataSourceTables.scala | 2 | > | ddl.scala | 1 | > | functions.scala | 4 | > | tables.scala | 7 | > | views.scala | 3 | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34158) Incorrect url of the only developer Matei in pom.xml
[ https://issues.apache.org/jira/browse/SPARK-34158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34158: Assignee: (was: Apache Spark) > Incorrect url of the only developer Matei in pom.xml > > > Key: SPARK-34158 > URL: https://issues.apache.org/jira/browse/SPARK-34158 > Project: Spark > Issue Type: Improvement > Components: Build, Spark Core >Affects Versions: 3.1.1 >Reporter: Jacek Laskowski >Priority: Minor > > {{[http://www.cs.berkeley.edu/~matei]}} in > [pom.xml|https://github.com/apache/spark/blob/53fe365edb948d0e05a5ccb62f349cd9fcb4bb5d/pom.xml#L51] > gives > {quote}Resource not found > The server has encountered a problem because the resource was not found. > Your request was : > [https://people.eecs.berkeley.edu/~matei] > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34158) Incorrect url of the only developer Matei in pom.xml
[ https://issues.apache.org/jira/browse/SPARK-34158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280597#comment-17280597 ] Apache Spark commented on SPARK-34158: -- User 'pingsutw' has created a pull request for this issue: https://github.com/apache/spark/pull/31512 > Incorrect url of the only developer Matei in pom.xml > > > Key: SPARK-34158 > URL: https://issues.apache.org/jira/browse/SPARK-34158 > Project: Spark > Issue Type: Improvement > Components: Build, Spark Core >Affects Versions: 3.1.1 >Reporter: Jacek Laskowski >Priority: Minor > > {{[http://www.cs.berkeley.edu/~matei]}} in > [pom.xml|https://github.com/apache/spark/blob/53fe365edb948d0e05a5ccb62f349cd9fcb4bb5d/pom.xml#L51] > gives > {quote}Resource not found > The server has encountered a problem because the resource was not found. > Your request was : > [https://people.eecs.berkeley.edu/~matei] > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34158) Incorrect url of the only developer Matei in pom.xml
[ https://issues.apache.org/jira/browse/SPARK-34158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34158: Assignee: Apache Spark > Incorrect url of the only developer Matei in pom.xml > > > Key: SPARK-34158 > URL: https://issues.apache.org/jira/browse/SPARK-34158 > Project: Spark > Issue Type: Improvement > Components: Build, Spark Core >Affects Versions: 3.1.1 >Reporter: Jacek Laskowski >Assignee: Apache Spark >Priority: Minor > > {{[http://www.cs.berkeley.edu/~matei]}} in > [pom.xml|https://github.com/apache/spark/blob/53fe365edb948d0e05a5ccb62f349cd9fcb4bb5d/pom.xml#L51] > gives > {quote}Resource not found > The server has encountered a problem because the resource was not found. > Your request was : > [https://people.eecs.berkeley.edu/~matei] > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-34397) Support v2 `MSCK REPAIR TABLE`
Maxim Gekk created SPARK-34397: -- Summary: Support v2 `MSCK REPAIR TABLE` Key: SPARK-34397 URL: https://issues.apache.org/jira/browse/SPARK-34397 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: Maxim Gekk Implement the `MSCK REPAIR TABLE` command for tables from v2 catalogs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28330) ANSI SQL: Top-level &lt;result offset clause&gt; in &lt;query expression&gt;
[ https://issues.apache.org/jira/browse/SPARK-28330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280504#comment-17280504 ] ShengJun Zheng commented on SPARK-28330: Any progress? > ANSI SQL: Top-level &lt;result offset clause&gt; in &lt;query expression&gt; > > > Key: SPARK-28330 > URL: https://issues.apache.org/jira/browse/SPARK-28330 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.1.0 >Reporter: Yuming Wang >Priority: Major > > h2. {{LIMIT}} and {{OFFSET}} > LIMIT and OFFSET allow you to retrieve just a portion of the rows that are > generated by the rest of the query: > {noformat} > SELECT select_list > FROM table_expression > [ ORDER BY ... ] > [ LIMIT { number | ALL } ] [ OFFSET number ] > {noformat} > If a limit count is given, no more than that many rows will be returned (but > possibly fewer, if the query itself yields fewer rows). LIMIT ALL is the same > as omitting the LIMIT clause, as is LIMIT with a NULL argument. > OFFSET says to skip that many rows before beginning to return rows. OFFSET 0 > is the same as omitting the OFFSET clause, as is OFFSET with a NULL argument. > If both OFFSET and LIMIT appear, then OFFSET rows are skipped before starting > to count the LIMIT rows that are returned. > https://www.postgresql.org/docs/11/queries-limit.html > *Feature ID*: F861 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
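The LIMIT/OFFSET semantics quoted in the issue description can be checked with a small, standalone sketch. This uses Python's built-in sqlite3 (whose LIMIT/OFFSET behavior matches the rules above); the table and column names are illustrative only:

```python
import sqlite3

# In-memory database with ten rows, 0..9, to exercise LIMIT/OFFSET.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10)])

# LIMIT caps the number of returned rows (possibly fewer if the query yields fewer).
limited = conn.execute("SELECT n FROM t ORDER BY n LIMIT 3").fetchall()
print(limited)  # [(0,), (1,), (2,)]

# OFFSET skips rows *before* counting toward LIMIT: skip 4, then return 3.
paged = conn.execute("SELECT n FROM t ORDER BY n LIMIT 3 OFFSET 4").fetchall()
print(paged)  # [(4,), (5,), (6,)]

# OFFSET 0 is the same as omitting the OFFSET clause entirely.
assert conn.execute("SELECT n FROM t ORDER BY n LIMIT 3 OFFSET 0").fetchall() == limited
```

Note the ORDER BY: without it, which rows survive LIMIT/OFFSET is unspecified, which is why pagination queries should always sort.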
[jira] [Commented] (SPARK-34395) Clean up unused code for code simplifications
[ https://issues.apache.org/jira/browse/SPARK-34395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280493#comment-17280493 ] Apache Spark commented on SPARK-34395: -- User 'yikf' has created a pull request for this issue: https://github.com/apache/spark/pull/31510 > Clean up unused code for code simplifications > - > > Key: SPARK-34395 > URL: https://issues.apache.org/jira/browse/SPARK-34395 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.0, 3.2.0 >Reporter: yikf >Priority: Minor > > Currently, we pass the default value `EmptyRow` to method `checkEvaluation` > in the StringExpressionsSuite, but the default value of the 'checkEvaluation' > method parameter is the `emptyRow`. > We can clean the parameter for Code Simplifications. > > example: > *before:* > {code:java} > def testConcat(inputs: String*): Unit = { > val expected = if (inputs.contains(null)) null else inputs.mkString > checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), > expected, EmptyRow) > }{code} > *after:* > {code:java} > def testConcat(inputs: String*): Unit = { > val expected = if (inputs.contains(null)) null else inputs.mkString > checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), expected) > }{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
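The cleanup described above boils down to one rule: don't pass an argument that merely restates the parameter's default. A minimal Python analogue of the pattern (the helper and names here are illustrative, not Spark's actual `checkEvaluation` API):

```python
EMPTY_ROW = ()  # stand-in for the suite's default input row

def check_evaluation(expression, expected, input_row=EMPTY_ROW):
    """Evaluate `expression` against `input_row` and compare with `expected`."""
    result = expression(input_row)
    assert result == expected, f"got {result!r}, expected {expected!r}"

# A trivial expression standing in for Concat(...) over literals.
concat = lambda row: "ab"

# Before: the default is spelled out explicitly, adding noise at every call site.
check_evaluation(concat, "ab", EMPTY_ROW)

# After: omitting the argument is equivalent, because EMPTY_ROW *is* the default.
check_evaluation(concat, "ab")
```

Both calls are behaviorally identical, which is why the explicit `EmptyRow` argument in the Scala suite can be dropped without changing what the test checks.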
[jira] [Assigned] (SPARK-34395) Clean up unused code for code simplifications
[ https://issues.apache.org/jira/browse/SPARK-34395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34395: Assignee: (was: Apache Spark) > Clean up unused code for code simplifications > - > > Key: SPARK-34395 > URL: https://issues.apache.org/jira/browse/SPARK-34395 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.0, 3.2.0 >Reporter: yikf >Priority: Minor > > Currently, we pass the default value `EmptyRow` to method `checkEvaluation` > in the StringExpressionsSuite, but the default value of the 'checkEvaluation' > method parameter is the `emptyRow`. > We can clean the parameter for Code Simplifications. > > example: > *before:* > {code:java} > def testConcat(inputs: String*): Unit = { > val expected = if (inputs.contains(null)) null else inputs.mkString > checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), > expected, EmptyRow) > }{code} > *after:* > {code:java} > def testConcat(inputs: String*): Unit = { > val expected = if (inputs.contains(null)) null else inputs.mkString > checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), expected) > }{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34395) Clean up unused code for code simplifications
[ https://issues.apache.org/jira/browse/SPARK-34395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34395: Assignee: Apache Spark > Clean up unused code for code simplifications > - > > Key: SPARK-34395 > URL: https://issues.apache.org/jira/browse/SPARK-34395 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.0, 3.2.0 >Reporter: yikf >Assignee: Apache Spark >Priority: Minor > > Currently, we pass the default value `EmptyRow` to method `checkEvaluation` > in the StringExpressionsSuite, but the default value of the 'checkEvaluation' > method parameter is the `emptyRow`. > We can clean the parameter for Code Simplifications. > > example: > *before:* > {code:java} > def testConcat(inputs: String*): Unit = { > val expected = if (inputs.contains(null)) null else inputs.mkString > checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), > expected, EmptyRow) > }{code} > *after:* > {code:java} > def testConcat(inputs: String*): Unit = { > val expected = if (inputs.contains(null)) null else inputs.mkString > checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), expected) > }{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34396) Add a new built-in function delegate
[ https://issues.apache.org/jira/browse/SPARK-34396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280487#comment-17280487 ] Apache Spark commented on SPARK-34396: -- User 'ulysses-you' has created a pull request for this issue: https://github.com/apache/spark/pull/31509 > Add a new build-in function delegate > > > Key: SPARK-34396 > URL: https://issues.apache.org/jira/browse/SPARK-34396 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: ulysses you >Priority: Minor > > Delegate is just like a big box that can include some other expressions. It > will execute all of children and return the last child result as its result. > > The origin idea is from debug. Debug SQL is hard since SQL is always quite > long and complex. This new function can help debug with inject some help > functions, e.g., `println, sleep, raise_error`. > > Two usage examples: > {code:java} > -- raw sql > INSERT INTO TABLE t1 > SELECT coalesce(c1, c2) as c FROM t2 > > -- print the column data > INSERT INTO TABLE t1 > SELECT delegate( > java_method('scala.Console', 'println', concat('c1: ', c1, ', c2: ', c2)), > coalesce(c1, c2) > ) as c FROM t2{code} > > {code:java} > -- raw sql > SELECT if(spark_partition_id() = 1, c1, raise_error('test error')) FROM t2 > > -- add a sleep time before throw error > SELECT if(spark_partition_id() = 1, c1, delegate( > java_method('java.lang.Thread', 'sleep', 3000l), > raise_error('test error') > )) FROM t2{code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
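The proposed semantics — evaluate every child expression in order, then return only the last child's value — can be sketched outside Spark. This is a hypothetical Python model of the behavior, not the actual implementation; children are passed as zero-argument callables so their side effects (printing, sleeping) run in sequence, mirroring the SQL examples:

```python
def delegate(*children):
    """Evaluate each child thunk in order; return the last child's result."""
    result = None
    for child in children:
        result = child()
    return result

# First child exists only for its side effect (like the java_method println
# in the SQL example); the last child supplies the actual value.
value = delegate(
    lambda: print("debug: inspecting row"),  # side effect, result discarded
    lambda: 42,                              # stand-in for coalesce(c1, c2)
)
assert value == 42
```

This also makes the debugging use case clear: wrapping an expression in `delegate` leaves the query's result unchanged while letting arbitrary helper calls run first.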
[jira] [Assigned] (SPARK-34396) Add a new built-in function delegate
[ https://issues.apache.org/jira/browse/SPARK-34396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34396: Assignee: Apache Spark > Add a new build-in function delegate > > > Key: SPARK-34396 > URL: https://issues.apache.org/jira/browse/SPARK-34396 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: ulysses you >Assignee: Apache Spark >Priority: Minor > > Delegate is just like a big box that can include some other expressions. It > will execute all of children and return the last child result as its result. > > The origin idea is from debug. Debug SQL is hard since SQL is always quite > long and complex. This new function can help debug with inject some help > functions, e.g., `println, sleep, raise_error`. > > Two usage examples: > {code:java} > -- raw sql > INSERT INTO TABLE t1 > SELECT coalesce(c1, c2) as c FROM t2 > > -- print the column data > INSERT INTO TABLE t1 > SELECT delegate( > java_method('scala.Console', 'println', concat('c1: ', c1, ', c2: ', c2)), > coalesce(c1, c2) > ) as c FROM t2{code} > > {code:java} > -- raw sql > SELECT if(spark_partition_id() = 1, c1, raise_error('test error')) FROM t2 > > -- add a sleep time before throw error > SELECT if(spark_partition_id() = 1, c1, delegate( > java_method('java.lang.Thread', 'sleep', 3000l), > raise_error('test error') > )) FROM t2{code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34396) Add a new built-in function delegate
[ https://issues.apache.org/jira/browse/SPARK-34396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280485#comment-17280485 ] Apache Spark commented on SPARK-34396: -- User 'ulysses-you' has created a pull request for this issue: https://github.com/apache/spark/pull/31509 > Add a new build-in function delegate > > > Key: SPARK-34396 > URL: https://issues.apache.org/jira/browse/SPARK-34396 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: ulysses you >Priority: Minor > > Delegate is just like a big box that can include some other expressions. It > will execute all of children and return the last child result as its result. > > The origin idea is from debug. Debug SQL is hard since SQL is always quite > long and complex. This new function can help debug with inject some help > functions, e.g., `println, sleep, raise_error`. > > Two usage examples: > {code:java} > -- raw sql > INSERT INTO TABLE t1 > SELECT coalesce(c1, c2) as c FROM t2 > > -- print the column data > INSERT INTO TABLE t1 > SELECT delegate( > java_method('scala.Console', 'println', concat('c1: ', c1, ', c2: ', c2)), > coalesce(c1, c2) > ) as c FROM t2{code} > > {code:java} > -- raw sql > SELECT if(spark_partition_id() = 1, c1, raise_error('test error')) FROM t2 > > -- add a sleep time before throw error > SELECT if(spark_partition_id() = 1, c1, delegate( > java_method('java.lang.Thread', 'sleep', 3000l), > raise_error('test error') > )) FROM t2{code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34396) Add a new built-in function delegate
[ https://issues.apache.org/jira/browse/SPARK-34396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34396: Assignee: (was: Apache Spark) > Add a new build-in function delegate > > > Key: SPARK-34396 > URL: https://issues.apache.org/jira/browse/SPARK-34396 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: ulysses you >Priority: Minor > > Delegate is just like a big box that can include some other expressions. It > will execute all of children and return the last child result as its result. > > The origin idea is from debug. Debug SQL is hard since SQL is always quite > long and complex. This new function can help debug with inject some help > functions, e.g., `println, sleep, raise_error`. > > Two usage examples: > {code:java} > -- raw sql > INSERT INTO TABLE t1 > SELECT coalesce(c1, c2) as c FROM t2 > > -- print the column data > INSERT INTO TABLE t1 > SELECT delegate( > java_method('scala.Console', 'println', concat('c1: ', c1, ', c2: ', c2)), > coalesce(c1, c2) > ) as c FROM t2{code} > > {code:java} > -- raw sql > SELECT if(spark_partition_id() = 1, c1, raise_error('test error')) FROM t2 > > -- add a sleep time before throw error > SELECT if(spark_partition_id() = 1, c1, delegate( > java_method('java.lang.Thread', 'sleep', 3000l), > raise_error('test error') > )) FROM t2{code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-34396) Add a new built-in function delegate
[ https://issues.apache.org/jira/browse/SPARK-34396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ulysses you updated SPARK-34396: Description: Delegate is just like a big box that can include some other expressions. It will execute all of children and return the last child result as its result. The origin idea is from debug. Debug SQL is hard since SQL is always quite long and complex. This new function can help debug with inject some help functions, e.g., `println, sleep, raise_error`. Two usage examples: {code:java} -- raw sql INSERT INTO TABLE t1 SELECT coalesce(c1, c2) as c FROM t2 -- print the column data INSERT INTO TABLE t1 SELECT delegate( java_method('scala.Console', 'println', concat('c1: ', c1, ', c2: ', c2)), coalesce(c1, c2) ) as c FROM t2{code} {code:java} -- raw sql SELECT if(spark_partition_id() = 1, c1, raise_error('test error')) FROM t2 -- add a sleep time before throw error SELECT if(spark_partition_id() = 1, c1, delegate( java_method('java.lang.Thread', 'sleep', 3000l), raise_error('test error') )) FROM t2{code} was: Delegate is just like a big box that can include some other expressions. It will execute all of children and return the last child result as its result. The origin idea is from debug. Debug SQL is hard since SQL is always quite long and complex. This new function can help debug with inject some help functions, e.g., `println, sleep, raise_error`. 
Two usage examples: {code:java} -- raw sql INSERT INTO TABLE t1 SELECT coalesce(c1, c2) as c FROM t2 -- print the column data INSERT INTO TABLE t1 SELECT delegate( java_method('scala.Console', 'println', concat('c1: ', c1, ', c2: ', c2)), coalesce(c1, c2) ) as c FROM t2{code} {code:java} -- raw sql SELECT if(spark_partition_id() = 1, c1, raise_error('test error')) FROM t2 -- add a sleep time before throw error SELECT if(spark_partition_id() = 1, c1, delegate( java_method('java.lang.Thread', 'sleep', 3000l), raise_error('test error') )) FROM t2{code} > Add a new build-in function delegate > > > Key: SPARK-34396 > URL: https://issues.apache.org/jira/browse/SPARK-34396 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: ulysses you >Priority: Minor > > Delegate is just like a big box that can include some other expressions. It > will execute all of children and return the last child result as its result. > > The origin idea is from debug. Debug SQL is hard since SQL is always quite > long and complex. This new function can help debug with inject some help > functions, e.g., `println, sleep, raise_error`. > > Two usage examples: > {code:java} > -- raw sql > INSERT INTO TABLE t1 > SELECT coalesce(c1, c2) as c FROM t2 > > -- print the column data > INSERT INTO TABLE t1 > SELECT delegate( > java_method('scala.Console', 'println', concat('c1: ', c1, ', c2: ', c2)), > coalesce(c1, c2) > ) as c FROM t2{code} > > {code:java} > -- raw sql > SELECT if(spark_partition_id() = 1, c1, raise_error('test error')) FROM t2 > > -- add a sleep time before throw error > SELECT if(spark_partition_id() = 1, c1, delegate( > java_method('java.lang.Thread', 'sleep', 3000l), > raise_error('test error') > )) FROM t2{code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-34396) Add a new built-in function delegate
[ https://issues.apache.org/jira/browse/SPARK-34396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ulysses you updated SPARK-34396: Description: Delegate is just like a big box that can include some other expressions. It will execute all of children and return the last child result as its result. The origin idea is from debug. Debug SQL is hard since SQL is always quite long and complex. This new function can help debug with inject some help functions, e.g., `println, sleep, raise_error`. Two usage examples: {code:java} -- raw sql INSERT INTO TABLE t1 SELECT coalesce(c1, c2) as c FROM t2 -- print the column data INSERT INTO TABLE t1 SELECT delegate( java_method('scala.Console', 'println', concat('c1: ', c1, ', c2: ', c2)), coalesce(c1, c2) ) as c FROM t2{code} {code:java} -- raw sql SELECT if(spark_partition_id() = 1, c1, raise_error('test error')) FROM t2 -- add a sleep time before throw error SELECT if(spark_partition_id() = 1, c1, delegate( java_method('java.lang.Thread', 'sleep', 3000l), raise_error('test error') )) FROM t2{code} was: Delegate is just like a big box that can include some other expressions. It will execute all of children and return the last child result as its result. The origin idea is from debug. Debug SQL is hard since SQL is always quite long and complex. This new function can help debug with inject some help functions, e.g., `println, sleep, raise_error`. 
Two usage examples: {code:java} -- raw sql INSERT INTO TABLE t1 SELECT coalesce(c1, c2) as c FROM t2 -- print the column data INSERT INTO TABLE t1 SELECT delegate( java_method('scala.Console', 'println', concat('c1: ', c1, ', c2: ', c2)), coalesce(c1, c2) ) as c FROM t2{code} {code:java} -- raw sql SELECT if(spark_partition_id() = 1, c1, raise_error('test error')) FROM t2 -- add a sleep time before throw error SELECT if(spark_partition_id() = 1, c1, delegate( java_method('java.lang.Thread', 'sleep', 3000l), raise_error('test error') )) FROM t2{code} > Add a new build-in function delegate > > > Key: SPARK-34396 > URL: https://issues.apache.org/jira/browse/SPARK-34396 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: ulysses you >Priority: Minor > > Delegate is just like a big box that can include some other expressions. It > will execute all of children and return the last child result as its result. > > The origin idea is from debug. Debug SQL is hard since SQL is always quite > long and complex. This new function can help debug with inject some help > functions, e.g., `println, sleep, raise_error`. > > Two usage examples: > {code:java} > -- raw sql > INSERT INTO TABLE t1 > SELECT coalesce(c1, c2) as c FROM t2 > > -- print the column data > INSERT INTO TABLE t1 > SELECT delegate( > java_method('scala.Console', 'println', concat('c1: ', c1, ', c2: ', c2)), > coalesce(c1, c2) > ) as c FROM t2{code} > > > {code:java} > -- raw sql > SELECT if(spark_partition_id() = 1, c1, raise_error('test error')) FROM t2 > > -- add a sleep time before throw error > SELECT if(spark_partition_id() = 1, c1, delegate( > java_method('java.lang.Thread', 'sleep', 3000l), > raise_error('test error') > )) FROM t2{code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-34396) Add a new built-in function delegate
ulysses you created SPARK-34396: --- Summary: Add a new build-in function delegate Key: SPARK-34396 URL: https://issues.apache.org/jira/browse/SPARK-34396 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.2.0 Reporter: ulysses you Delegate is just like a big box that can include some other expressions. It will execute all of children and return the last child result as its result. The origin idea is from debug. Debug SQL is hard since SQL is always quite long and complex. This new function can help debug with inject some help functions, e.g., `println, sleep, raise_error`. Two usage examples: {code:java} -- raw sql INSERT INTO TABLE t1 SELECT coalesce(c1, c2) as c FROM t2 -- print the column data INSERT INTO TABLE t1 SELECT delegate( java_method('scala.Console', 'println', concat('c1: ', c1, ', c2: ', c2)), coalesce(c1, c2) ) as c FROM t2{code} {code:java} -- raw sql SELECT if(spark_partition_id() = 1, c1, raise_error('test error')) FROM t2 -- add a sleep time before throw error SELECT if(spark_partition_id() = 1, c1, delegate( java_method('java.lang.Thread', 'sleep', 3000l), raise_error('test error') )) FROM t2{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-34395) Clean up unused code for code simplifications
[ https://issues.apache.org/jira/browse/SPARK-34395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yikf updated SPARK-34395: - Description: Currently, we pass the default value `EmptyRow` to method `checkEvaluation` in the StringExpressionsSuite, but the default value of the 'checkEvaluation' method parameter is the `emptyRow`. We can clean the parameter for Code Simplifications. example: *before:* {code:java} def testConcat(inputs: String*): Unit = { val expected = if (inputs.contains(null)) null else inputs.mkString checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), expected, EmptyRow) }{code} *after:* {code:java} def testConcat(inputs: String*): Unit = { val expected = if (inputs.contains(null)) null else inputs.mkString checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), expected) }{code} was: Currently, we pass the default value `EmptyRow` to method `checkEvaluation` in the StringExpressionsSuite, but the default value of the 'checkEvaluation' method parameter is the `emptyRow`. We can clean the parameter for Code Simplifications. 
example: *before:* {code:java} def testConcat(inputs: String*): Unit = { val expected = if (inputs.contains(null)) null else inputs.mkString checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), expected, EmptyRow) }{code} *after:* {code:java} def testConcat(inputs: String*): Unit = { val expected = if (inputs.contains(null)) null else inputs.mkString checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), expected) }{code} > Clean up unused code for code simplifications > - > > Key: SPARK-34395 > URL: https://issues.apache.org/jira/browse/SPARK-34395 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.0, 3.2.0 >Reporter: yikf >Priority: Minor > > Currently, we pass the default value `EmptyRow` to method `checkEvaluation` > in the StringExpressionsSuite, but the default value of the 'checkEvaluation' > method parameter is the `emptyRow`. > We can clean the parameter for Code Simplifications. > > example: > *before:* > {code:java} > def testConcat(inputs: String*): Unit = { > val expected = if (inputs.contains(null)) null else inputs.mkString > checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), > expected, EmptyRow) > }{code} > *after:* > {code:java} > def testConcat(inputs: String*): Unit = { > val expected = if (inputs.contains(null)) null else inputs.mkString > checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), expected) > }{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-34395) Clean up unused code for code simplifications
[ https://issues.apache.org/jira/browse/SPARK-34395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yikf updated SPARK-34395: - Description: Currently, we pass the default value `EmptyRow` to method `checkEvaluation` in the StringExpressionsSuite, but the default value of the 'checkEvaluation' method parameter is the `emptyRow`. We can clean the parameter for Code Simplifications. example: *before:* {code:java} def testConcat(inputs: String*): Unit = { val expected = if (inputs.contains(null)) null else inputs.mkString checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), expected, EmptyRow) }{code} *after:* {code:java} def testConcat(inputs: String*): Unit = { val expected = if (inputs.contains(null)) null else inputs.mkString checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), expected) }{code} was: Currently, we pass the default value `EmptyRow` to method `checkEvaluation` in the StringExpressionsSuite, but the default value of the 'checkEvaluation' method parameter is the `emptyRow`. We can clean the parameter for Code Simplifications. example: *before:* *after:* > Clean up unused code for code simplifications > - > > Key: SPARK-34395 > URL: https://issues.apache.org/jira/browse/SPARK-34395 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.0, 3.2.0 >Reporter: yikf >Priority: Minor > > Currently, we pass the default value `EmptyRow` to method `checkEvaluation` > in the StringExpressionsSuite, but the default value of the 'checkEvaluation' > method parameter is the `emptyRow`. > We can clean the parameter for Code Simplifications. 
> > example: > *before:* > {code:java} > def testConcat(inputs: String*): Unit = { > val expected = if (inputs.contains(null)) null else inputs.mkString > checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), > expected, EmptyRow) > }{code} > *after:* > > {code:java} > def testConcat(inputs: String*): Unit = { > val expected = if (inputs.contains(null)) null else inputs.mkString > checkEvaluation(Concat(inputs.map(Literal.create(_, StringType))), expected) > }{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-34395) Clean up unused code for code simplifications
yikf created SPARK-34395: Summary: Clean up unused code for code simplifications Key: SPARK-34395 URL: https://issues.apache.org/jira/browse/SPARK-34395 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.0.0, 3.2.0 Reporter: yikf Currently, we pass the default value `EmptyRow` to method `checkEvaluation` in the StringExpressionsSuite, but the default value of the 'checkEvaluation' method parameter is the `emptyRow`. We can clean the parameter for Code Simplifications. example: *before:* *after:* -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34394) Unify output of SHOW FUNCTIONS and pass output attributes properly
[ https://issues.apache.org/jira/browse/SPARK-34394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280456#comment-17280456 ] jiaan.geng commented on SPARK-34394: I'm working on it. > Unify output of SHOW FUNCTIONS and pass output attributes properly > -- > > Key: SPARK-34394 > URL: https://issues.apache.org/jira/browse/SPARK-34394 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: jiaan.geng >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-34394) Unify output of SHOW FUNCTIONS and pass output attributes properly
jiaan.geng created SPARK-34394: -- Summary: Unify output of SHOW FUNCTIONS and pass output attributes properly Key: SPARK-34394 URL: https://issues.apache.org/jira/browse/SPARK-34394 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: jiaan.geng -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34393) Unify output of SHOW VIEWS and pass output attributes properly
[ https://issues.apache.org/jira/browse/SPARK-34393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280444#comment-17280444 ] Apache Spark commented on SPARK-34393: -- User 'beliefer' has created a pull request for this issue: https://github.com/apache/spark/pull/31508 > Unify output of SHOW VIEWS and pass output attributes properly > -- > > Key: SPARK-34393 > URL: https://issues.apache.org/jira/browse/SPARK-34393 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: jiaan.geng >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Issue Comment Deleted] (SPARK-34393) Unify output of SHOW VIEWS and pass output attributes properly
[ https://issues.apache.org/jira/browse/SPARK-34393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-34393: --- Comment: was deleted (was: I'm working on.) > Unify output of SHOW VIEWS and pass output attributes properly > -- > > Key: SPARK-34393 > URL: https://issues.apache.org/jira/browse/SPARK-34393 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: jiaan.geng >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34393) Unify output of SHOW VIEWS and pass output attributes properly
[ https://issues.apache.org/jira/browse/SPARK-34393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34393: Assignee: (was: Apache Spark) > Unify output of SHOW VIEWS and pass output attributes properly > -- > > Key: SPARK-34393 > URL: https://issues.apache.org/jira/browse/SPARK-34393 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: jiaan.geng >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34393) Unify output of SHOW VIEWS and pass output attributes properly
[ https://issues.apache.org/jira/browse/SPARK-34393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-34393:
------------------------------------

    Assignee: Apache Spark

> Unify output of SHOW VIEWS and pass output attributes properly
> --------------------------------------------------------------
>
>                 Key: SPARK-34393
>                 URL: https://issues.apache.org/jira/browse/SPARK-34393
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: jiaan.geng
>            Assignee: Apache Spark
>            Priority: Major
>
[jira] [Commented] (SPARK-34393) Unify output of SHOW VIEWS and pass output attributes properly
[ https://issues.apache.org/jira/browse/SPARK-34393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280439#comment-17280439 ]

jiaan.geng commented on SPARK-34393:
------------------------------------

I'm working on.

> Unify output of SHOW VIEWS and pass output attributes properly
> --------------------------------------------------------------
>
>                 Key: SPARK-34393
>                 URL: https://issues.apache.org/jira/browse/SPARK-34393
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: jiaan.geng
>            Priority: Major
>
[jira] [Created] (SPARK-34393) Unify output of SHOW VIEWS and pass output attributes properly
jiaan.geng created SPARK-34393:
-----------------------------------

             Summary: Unify output of SHOW VIEWS and pass output attributes properly
                 Key: SPARK-34393
                 URL: https://issues.apache.org/jira/browse/SPARK-34393
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.2.0
            Reporter: jiaan.geng
[jira] [Updated] (SPARK-33979) Filter predicate reorder
[ https://issues.apache.org/jira/browse/SPARK-33979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuming Wang updated SPARK-33979:
--------------------------------
    Description: 
Reorder filter predicate to improve query performance:
{noformat}
others < In < Like < UDF/CaseWhen/If < Inset < LikeAny/LikeAll
{noformat}
[https://www.ibm.com/support/knowledgecenter/SSSHTQ_8.1.0/com.ibm.netcool_OMNIbus.doc_8.1.0/omnibus/wip/admin/reference/omn_adm_per_optimizationrules.html#omn_adm_per_optimizationrules__reorder]
[https://docs.oracle.com/en/database/oracle/oracle-database/21/addci/extensible-optimizer-interface.html#GUID-28A4EDA6-19DD-4773-B3B8-1802C3B01E21]
[https://docs.oracle.com/cd/B10501_01/server.920/a96533/hintsref.htm#13676]
https://issues.apache.org/jira/browse/HIVE-21857

  was:
Reorder filter predicate to improve query performance:
{noformat}
others < In < Like < UDF/CaseWhen/If < Inset < LikeAny/LikeAll
{noformat}
https://www.ibm.com/support/knowledgecenter/SSSHTQ_8.1.0/com.ibm.netcool_OMNIbus.doc_8.1.0/omnibus/wip/admin/reference/omn_adm_per_optimizationrules.html#omn_adm_per_optimizationrules__reorder
https://docs.oracle.com/en/database/oracle/oracle-database/21/addci/extensible-optimizer-interface.html#GUID-28A4EDA6-19DD-4773-B3B8-1802C3B01E21
https://docs.oracle.com/cd/B10501_01/server.920/a96533/hintsref.htm#13676

> Filter predicate reorder
> ------------------------
>
>                 Key: SPARK-33979
>                 URL: https://issues.apache.org/jira/browse/SPARK-33979
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Yuming Wang
>            Priority: Major
>
> Reorder filter predicate to improve query performance:
> {noformat}
> others < In < Like < UDF/CaseWhen/If < Inset < LikeAny/LikeAll
> {noformat}
> [https://www.ibm.com/support/knowledgecenter/SSSHTQ_8.1.0/com.ibm.netcool_OMNIbus.doc_8.1.0/omnibus/wip/admin/reference/omn_adm_per_optimizationrules.html#omn_adm_per_optimizationrules__reorder]
> [https://docs.oracle.com/en/database/oracle/oracle-database/21/addci/extensible-optimizer-interface.html#GUID-28A4EDA6-19DD-4773-B3B8-1802C3B01E21]
> [https://docs.oracle.com/cd/B10501_01/server.920/a96533/hintsref.htm#13676]
> https://issues.apache.org/jira/browse/HIVE-21857
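The idea behind SPARK-33979 is that within a conjunction, predicates of cheaper kinds should be evaluated before expensive ones. A minimal Python sketch of that cost-ranked reordering, with an illustrative rank table and predicate tags (not Spark's actual optimizer code):

```python
# Cheaper predicate kinds get a lower rank and are evaluated first,
# following the ordering proposed in the ticket:
# others < In < Like < UDF/CaseWhen/If < Inset < LikeAny/LikeAll
COST_RANK = {
    "other": 0,
    "In": 1,
    "Like": 2,
    "UDF": 3, "CaseWhen": 3, "If": 3,
    "InSet": 4,
    "LikeAny": 5, "LikeAll": 5,
}

def reorder_predicates(predicates):
    """Stable-sort conjunctive predicates so cheaper kinds run first.

    Each predicate is modeled as a (kind, expression_text) pair; a stable
    sort preserves the original order among predicates of equal cost.
    """
    return sorted(predicates, key=lambda p: COST_RANK.get(p[0], 0))

preds = [
    ("LikeAny", "c LIKE ANY ('a%', 'b%')"),
    ("other", "a > 1"),
    ("UDF", "f(b) = 0"),
]
# reorder_predicates(preds) puts "a > 1" first and the LIKE ANY last.
```

A real implementation would also have to prove the reordered predicates are side-effect free (deterministic, non-failing), since short-circuit evaluation changes which expressions execute.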
[jira] [Created] (SPARK-34392) Invalid ID for offset-based ZoneId since Spark 3.0
Yuming Wang created SPARK-34392:
-----------------------------------

             Summary: Invalid ID for offset-based ZoneId since Spark 3.0
                 Key: SPARK-34392
                 URL: https://issues.apache.org/jira/browse/SPARK-34392
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.0.1, 3.0.0
            Reporter: Yuming Wang

How to reproduce this issue:
{code:sql}
select to_utc_timestamp("2020-02-07 16:00:00", "GMT+8:00");
{code}

Spark 2.4:
{noformat}
spark-sql> select to_utc_timestamp("2020-02-07 16:00:00", "GMT+8:00");
2020-02-07 08:00:00
Time taken: 0.089 seconds, Fetched 1 row(s)
{noformat}

Spark 3.x:
{noformat}
spark-sql> select to_utc_timestamp("2020-02-07 16:00:00", "GMT+8:00");
21/02/07 01:24:32 ERROR SparkSQLDriver: Failed in [select to_utc_timestamp("2020-02-07 16:00:00", "GMT+8:00")]
java.time.DateTimeException: Invalid ID for offset-based ZoneId: GMT+8:00
	at java.time.ZoneId.ofWithPrefix(ZoneId.java:437)
	at java.time.ZoneId.of(ZoneId.java:407)
	at java.time.ZoneId.of(ZoneId.java:359)
	at java.time.ZoneId.of(ZoneId.java:315)
	at org.apache.spark.sql.catalyst.util.DateTimeUtils$.getZoneId(DateTimeUtils.scala:53)
	at org.apache.spark.sql.catalyst.util.DateTimeUtils$.toUTCTime(DateTimeUtils.scala:814)
{noformat}
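The regression happens because Spark 3.x parses time zones through java.time.ZoneId, which rejects one-digit hours in offset IDs ("GMT+8:00"), while the old java.util.TimeZone accepted them. One user-side workaround (not Spark's eventual fix) is to pad such offsets before handing them to Spark; a hedged Python sketch of that normalization, where the function name and accepted prefixes are illustrative assumptions:

```python
import re

def normalize_gmt_offset(tz):
    """Pad a one-digit hour offset like 'GMT+8:00' to 'GMT+08:00'.

    java.time.ZoneId.of requires two-digit hours after a GMT/UTC/UT
    prefix; any string that does not match that shape (named zones,
    already-padded offsets) is returned unchanged.
    """
    m = re.fullmatch(r"(GMT|UTC|UT)([+-])(\d)(:\d{2})?", tz)
    if m:
        prefix, sign, hour, minutes = m.groups()
        return f"{prefix}{sign}0{hour}{minutes or ':00'}"
    return tz
```

With this applied, `normalize_gmt_offset("GMT+8:00")` yields `"GMT+08:00"`, which both Spark 2.4 and 3.x accept, while region IDs such as `"Asia/Shanghai"` pass through untouched.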