[jira] [Updated] (SPARK-37802) composite field name like `field name` doesn't work with Aggregate push down
[ https://issues.apache.org/jira/browse/SPARK-37802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-37802: --- Fix Version/s: 3.2.1 (was: 3.2.0) > composite field name like `field name` doesn't work with Aggregate push down > > > Key: SPARK-37802 > URL: https://issues.apache.org/jira/browse/SPARK-37802 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0, 3.3.0 >Reporter: Huaxin Gao >Assignee: Huaxin Gao >Priority: Minor > Fix For: 3.2.1, 3.3.0 > > > {code:java} > sql("SELECT SUM(`field name`) FROM h2.test.table") > org.apache.spark.sql.catalyst.parser.ParseException: > extraneous input 'name' expecting (line 1, pos 9) > at > org.apache.spark.sql.catalyst.parser.ParseErrorListener$.syntaxError(ParseDriver.scala:212) > at > org.antlr.v4.runtime.ProxyErrorListener.syntaxError(ProxyErrorListener.java:41) > at org.antlr.v4.runtime.Parser.notifyErrorListeners(Parser.java:544) > at > org.antlr.v4.runtime.DefaultErrorStrategy.reportUnwantedToken(DefaultErrorStrategy.java:377) > at > org.antlr.v4.runtime.DefaultErrorStrategy.singleTokenDeletion(DefaultErrorStrategy.java:548) > at > org.antlr.v4.runtime.DefaultErrorStrategy.recoverInline(DefaultErrorStrategy.java:467) > at org.antlr.v4.runtime.Parser.match(Parser.java:206) > at > org.apache.spark.sql.catalyst.parser.SqlBaseParser.singleMultipartIdentifier(SqlBaseParser.java:519) > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
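The parse failure above comes from embedding an unquoted multipart name in generated SQL. A minimal sketch (an illustration, not the actual SPARK-37802 patch; the helper name is hypothetical) of the quoting step such a push-down path needs before re-parsing composite field names:

{code:java}
// Hypothetical helper: backtick-quote a field name when it is not a plain
// identifier, escaping embedded backticks by doubling them.
def quoteIfNeeded(part: String): String =
  if (part.matches("[a-zA-Z0-9_]+")) part
  else s"`${part.replace("`", "``")}`"

quoteIfNeeded("id")         // id
quoteIfNeeded("field name") // `field name` -- now parses as one identifier
{code}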
[jira] [Created] (SPARK-37853) Clean up deprecation compilation warning related to log4j2
Yang Jie created SPARK-37853: Summary: Clean up deprecation compilation warning related to log4j2 Key: SPARK-37853 URL: https://issues.apache.org/jira/browse/SPARK-37853 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.3.0 Reporter: Yang Jie [WARNING] [Warn] /spark-source/core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala:64: [deprecation @ org.apache.spark.util.logging.DriverLogger.addLogAppender.fa | origin=org.apache.logging.log4j.core.appender.FileAppender.createAppender | version=] method createAppender in class FileAppender is deprecated [WARNING] [Warn] /spark-source/core/src/test/scala/org/apache/spark/SparkFunSuite.scala:268: [deprecation @ org.apache.spark.SparkFunSuite.LogAppender. | origin=org.apache.logging.log4j.core.appender.AbstractAppender. | version=] constructor AbstractAppender in class AbstractAppender is deprecated -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
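For the second warning, log4j2 provides a non-deprecated five-argument {{AbstractAppender}} constructor that additionally takes a {{Property}} array; the first warning is addressed by moving from {{FileAppender.createAppender}} to the {{FileAppender.newBuilder()}} API, which the deprecation javadoc points to. A minimal sketch of the constructor migration (an illustration, not the actual Spark patch; the class name is hypothetical):

{code:java}
import org.apache.logging.log4j.core.LogEvent
import org.apache.logging.log4j.core.appender.AbstractAppender
import org.apache.logging.log4j.core.config.Property

import scala.collection.mutable.ArrayBuffer

// Passing Property.EMPTY_ARRAY selects the non-deprecated five-argument
// constructor instead of the deprecated four-argument one.
class CollectingAppender(name: String)
    extends AbstractAppender(name, null, null, true, Property.EMPTY_ARRAY) {
  val events = new ArrayBuffer[LogEvent]()
  // LogEvent instances may be reused by log4j2, so keep an immutable copy.
  override def append(event: LogEvent): Unit = events += event.toImmutable
}
{code}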
[jira] [Updated] (SPARK-37851) Mark org.apache.spark.sql.hive.execution as slow tests
[ https://issues.apache.org/jira/browse/SPARK-37851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-37851: - Summary: Mark org.apache.spark.sql.hive.execution as slow tests (was: Mark TPCDS*Suite, SQLQuerySuite and org.apache.spark.sql.hive.execution as slow tests) > Mark org.apache.spark.sql.hive.execution as slow tests > -- > > Key: SPARK-37851 > URL: https://issues.apache.org/jira/browse/SPARK-37851 > Project: Spark > Issue Type: Improvement > Components: SQL, Tests >Affects Versions: 3.3.0 >Reporter: Hyukjin Kwon >Priority: Major > > Related to SPARK-33171 and SPARK-32884. We should rebalance both: > "sql - slow tests": > https://github.com/apache/spark/runs/4755996273?check_suite_focus=true > "sql - other tests": > https://github.com/apache/spark/runs/4755996343?check_suite_focus=true > and > "hive - slow tests": > https://github.com/apache/spark/runs/4755996153?check_suite_focus=true > "hive - other tests": > https://github.com/apache/spark/runs/4755996212?check_suite_focus=true -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37851) Mark org.apache.spark.sql.hive.execution as slow tests
[ https://issues.apache.org/jira/browse/SPARK-37851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-37851: - Description: Related to SPARK-33171 and SPARK-32884. We should rebalance both: "hive - slow tests": https://github.com/apache/spark/runs/4755996153?check_suite_focus=true "hive - other tests": https://github.com/apache/spark/runs/4755996212?check_suite_focus=true was: Related to SPARK-33171 and SPARK-32884. We should rebalance both: "sql - slow tests": https://github.com/apache/spark/runs/4755996273?check_suite_focus=true "sql - other tests": https://github.com/apache/spark/runs/4755996343?check_suite_focus=true and "hive - slow tests": https://github.com/apache/spark/runs/4755996153?check_suite_focus=true "hive - other tests": https://github.com/apache/spark/runs/4755996212?check_suite_focus=true > Mark org.apache.spark.sql.hive.execution as slow tests > -- > > Key: SPARK-37851 > URL: https://issues.apache.org/jira/browse/SPARK-37851 > Project: Spark > Issue Type: Improvement > Components: SQL, Tests >Affects Versions: 3.3.0 >Reporter: Hyukjin Kwon >Priority: Major > > Related to SPARK-33171 and SPARK-32884. We should rebalance both: > "hive - slow tests": > https://github.com/apache/spark/runs/4755996153?check_suite_focus=true > "hive - other tests": > https://github.com/apache/spark/runs/4755996212?check_suite_focus=true -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37853) Clean up deprecation compilation warning related to log4j2
[ https://issues.apache.org/jira/browse/SPARK-37853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17471785#comment-17471785 ] Apache Spark commented on SPARK-37853: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/35153 > Clean up deprecation compilation warning related to log4j2 > -- > > Key: SPARK-37853 > URL: https://issues.apache.org/jira/browse/SPARK-37853 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Yang Jie >Priority: Minor > > [WARNING] [Warn] > /spark-source/core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala:64: > [deprecation @ org.apache.spark.util.logging.DriverLogger.addLogAppender.fa > | origin=org.apache.logging.log4j.core.appender.FileAppender.createAppender | > version=] method createAppender in class FileAppender is deprecated > [WARNING] [Warn] > /spark-source/core/src/test/scala/org/apache/spark/SparkFunSuite.scala:268: > [deprecation @ org.apache.spark.SparkFunSuite.LogAppender. | > origin=org.apache.logging.log4j.core.appender.AbstractAppender. | > version=] constructor AbstractAppender in class AbstractAppender is deprecated -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37853) Clean up deprecation compilation warning related to log4j2
[ https://issues.apache.org/jira/browse/SPARK-37853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37853: Assignee: (was: Apache Spark) > Clean up deprecation compilation warning related to log4j2 > -- > > Key: SPARK-37853 > URL: https://issues.apache.org/jira/browse/SPARK-37853 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Yang Jie >Priority: Minor > > [WARNING] [Warn] > /spark-source/core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala:64: > [deprecation @ org.apache.spark.util.logging.DriverLogger.addLogAppender.fa > | origin=org.apache.logging.log4j.core.appender.FileAppender.createAppender | > version=] method createAppender in class FileAppender is deprecated > [WARNING] [Warn] > /spark-source/core/src/test/scala/org/apache/spark/SparkFunSuite.scala:268: > [deprecation @ org.apache.spark.SparkFunSuite.LogAppender. | > origin=org.apache.logging.log4j.core.appender.AbstractAppender. | > version=] constructor AbstractAppender in class AbstractAppender is deprecated -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37853) Clean up deprecation compilation warning related to log4j2
[ https://issues.apache.org/jira/browse/SPARK-37853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37853: Assignee: Apache Spark > Clean up deprecation compilation warning related to log4j2 > -- > > Key: SPARK-37853 > URL: https://issues.apache.org/jira/browse/SPARK-37853 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Yang Jie >Assignee: Apache Spark >Priority: Minor > > [WARNING] [Warn] > /spark-source/core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala:64: > [deprecation @ org.apache.spark.util.logging.DriverLogger.addLogAppender.fa > | origin=org.apache.logging.log4j.core.appender.FileAppender.createAppender | > version=] method createAppender in class FileAppender is deprecated > [WARNING] [Warn] > /spark-source/core/src/test/scala/org/apache/spark/SparkFunSuite.scala:268: > [deprecation @ org.apache.spark.SparkFunSuite.LogAppender. | > origin=org.apache.logging.log4j.core.appender.AbstractAppender. | > version=] constructor AbstractAppender in class AbstractAppender is deprecated -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37853) Clean up deprecation compilation warning related to log4j2
[ https://issues.apache.org/jira/browse/SPARK-37853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17471786#comment-17471786 ] Apache Spark commented on SPARK-37853: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/35153 > Clean up deprecation compilation warning related to log4j2 > -- > > Key: SPARK-37853 > URL: https://issues.apache.org/jira/browse/SPARK-37853 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Yang Jie >Priority: Minor > > [WARNING] [Warn] > /spark-source/core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala:64: > [deprecation @ org.apache.spark.util.logging.DriverLogger.addLogAppender.fa > | origin=org.apache.logging.log4j.core.appender.FileAppender.createAppender | > version=] method createAppender in class FileAppender is deprecated > [WARNING] [Warn] > /spark-source/core/src/test/scala/org/apache/spark/SparkFunSuite.scala:268: > [deprecation @ org.apache.spark.SparkFunSuite.LogAppender. | > origin=org.apache.logging.log4j.core.appender.AbstractAppender. | > version=] constructor AbstractAppender in class AbstractAppender is deprecated -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37854) Use type match to simplify TestUtils#withHttpConnection
Yang Jie created SPARK-37854: Summary: Use type match to simplify TestUtils#withHttpConnection Key: SPARK-37854 URL: https://issues.apache.org/jira/browse/SPARK-37854 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.3.0 Reporter: Yang Jie -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37854) Use type match to simplify TestUtils#withHttpConnection
[ https://issues.apache.org/jira/browse/SPARK-37854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37854: Assignee: (was: Apache Spark) > Use type match to simplify TestUtils#withHttpConnection > --- > > Key: SPARK-37854 > URL: https://issues.apache.org/jira/browse/SPARK-37854 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Yang Jie >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37854) Use type match to simplify TestUtils#withHttpConnection
[ https://issues.apache.org/jira/browse/SPARK-37854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17471799#comment-17471799 ] Apache Spark commented on SPARK-37854: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/35154 > Use type match to simplify TestUtils#withHttpConnection > --- > > Key: SPARK-37854 > URL: https://issues.apache.org/jira/browse/SPARK-37854 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Yang Jie >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37854) Use type match to simplify TestUtils#withHttpConnection
[ https://issues.apache.org/jira/browse/SPARK-37854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37854: Assignee: Apache Spark > Use type match to simplify TestUtils#withHttpConnection > --- > > Key: SPARK-37854 > URL: https://issues.apache.org/jira/browse/SPARK-37854 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Yang Jie >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37854) Use type match to simplify TestUtils#withHttpConnection
[ https://issues.apache.org/jira/browse/SPARK-37854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-37854: - Description: {code:java} if (connection.isInstanceOf[HttpsURLConnection]) { connection.asInstanceOf[HttpsURLConnection].setSSLSocketFactory(sslCtx.getSocketFactory()) connection.asInstanceOf[HttpsURLConnection].setHostnameVerifier(verifier) {code} > Use type match to simplify TestUtils#withHttpConnection > --- > > Key: SPARK-37854 > URL: https://issues.apache.org/jira/browse/SPARK-37854 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Yang Jie >Priority: Major > > {code:java} > if (connection.isInstanceOf[HttpsURLConnection]) { > > > connection.asInstanceOf[HttpsURLConnection].setSSLSocketFactory(sslCtx.getSocketFactory()) > > connection.asInstanceOf[HttpsURLConnection].setHostnameVerifier(verifier) > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
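A sketch of the simplification the ticket proposes, reusing {{connection}}, {{sslCtx}}, and {{verifier}} from the snippet above: a single type pattern binds the downcast value, removing the paired {{isInstanceOf}}/{{asInstanceOf}} calls.

{code:java}
connection match {
  case https: javax.net.ssl.HttpsURLConnection =>
    https.setSSLSocketFactory(sslCtx.getSocketFactory())
    https.setHostnameVerifier(verifier)
  case _ => // a plain HttpURLConnection needs no TLS configuration
}
{code}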
[jira] [Assigned] (SPARK-37818) Add option for show create table command
[ https://issues.apache.org/jira/browse/SPARK-37818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang reassigned SPARK-37818: -- Assignee: PengLei > Add option for show create table command > > > Key: SPARK-37818 > URL: https://issues.apache.org/jira/browse/SPARK-37818 > Project: Spark > Issue Type: Documentation > Components: Documentation >Affects Versions: 3.3.0 >Reporter: PengLei >Assignee: PengLei >Priority: Trivial > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37818) Add option for show create table command
[ https://issues.apache.org/jira/browse/SPARK-37818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-37818. Fix Version/s: 3.2.1 Resolution: Fixed Issue resolved by pull request 35107 [https://github.com/apache/spark/pull/35107] > Add option for show create table command > > > Key: SPARK-37818 > URL: https://issues.apache.org/jira/browse/SPARK-37818 > Project: Spark > Issue Type: Documentation > Components: Documentation >Affects Versions: 3.3.0 >Reporter: PengLei >Assignee: PengLei >Priority: Trivial > Fix For: 3.3.0, 3.2.1 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37818) Add option for show create table command
[ https://issues.apache.org/jira/browse/SPARK-37818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17471925#comment-17471925 ] Gengliang Wang commented on SPARK-37818: [~huaxingao] FYI I set the fixed version as 3.2.1. I saw there is a tag 3.2.1-rc1 already, so I will update the fixed version as 3.2.2 if this doc change can't make it on 3.2.1 > Add option for show create table command > > > Key: SPARK-37818 > URL: https://issues.apache.org/jira/browse/SPARK-37818 > Project: Spark > Issue Type: Documentation > Components: Documentation >Affects Versions: 3.3.0 >Reporter: PengLei >Assignee: PengLei >Priority: Trivial > Fix For: 3.2.1, 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-35442) Support propagate empty relation through aggregate
[ https://issues.apache.org/jira/browse/SPARK-35442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-35442. - Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35149 [https://github.com/apache/spark/pull/35149] > Support propagate empty relation through aggregate > -- > > Key: SPARK-35442 > URL: https://issues.apache.org/jira/browse/SPARK-35442 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: XiDuo You >Priority: Minor > Fix For: 3.3.0 > > > The Aggregate in AQE is different from the others; the `LogicalQueryStage` looks > like `LogicalQueryStage(Aggregate, BaseAggregate)`. We should handle this > case specially. > Logically, if the Aggregate's grouping expressions are not empty, we can > eliminate it safely. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
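To see why non-empty grouping expressions are the precondition, compare a grouped and a global aggregate over an empty input (a small illustration assuming an active {{SparkSession}} named {{spark}}, not the optimizer code itself):

{code:java}
import org.apache.spark.sql.functions.count

val empty = spark.range(10).filter("id < 0")  // an empty relation
empty.groupBy("id").count().count()           // 0 rows: emptiness propagates
empty.agg(count("*")).count()                 // 1 row (count = 0): it must not
{code}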
[jira] [Commented] (SPARK-35746) Task id in the Stage page timeline is incorrect
[ https://issues.apache.org/jira/browse/SPARK-35746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17471982#comment-17471982 ] Apache Spark commented on SPARK-35746: -- User 'stczwd' has created a pull request for this issue: https://github.com/apache/spark/pull/35155 > Task id in the Stage page timeline is incorrect > --- > > Key: SPARK-35746 > URL: https://issues.apache.org/jira/browse/SPARK-35746 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 3.0.0, 3.1.2 >Reporter: shahid >Assignee: shahid >Priority: Minor > Attachments: image-2021-06-12-07-03-09-808.png > > > !image-2021-06-12-07-03-09-808.png! -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35746) Task id in the Stage page timeline is incorrect
[ https://issues.apache.org/jira/browse/SPARK-35746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17471981#comment-17471981 ] Apache Spark commented on SPARK-35746: -- User 'stczwd' has created a pull request for this issue: https://github.com/apache/spark/pull/35155 > Task id in the Stage page timeline is incorrect > --- > > Key: SPARK-35746 > URL: https://issues.apache.org/jira/browse/SPARK-35746 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 3.0.0, 3.1.2 >Reporter: shahid >Assignee: shahid >Priority: Minor > Attachments: image-2021-06-12-07-03-09-808.png > > > !image-2021-06-12-07-03-09-808.png! -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37491) Fix Series.asof when values of the series is not sorted
[ https://issues.apache.org/jira/browse/SPARK-37491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472056#comment-17472056 ] pralabhkumar commented on SPARK-37491: -- Let's take the example of

{code:java}
pser = pd.Series([2, 1, np.nan, 4], index=[10, 20, 30, 40], name="Koalas")
{code}

pser.asof([5, 20]) gives the output [NaN, 1], while ps.from_pandas(pser).asof([5, 20]) gives the output [NaN, 2].

*Explanation*

This is the DataFrame created after applying the condition F.when(index_scol <= SF.lit(index).cast(index_type)), without applying the max aggregation:

{code}
+-----+------+-----------------+
|col_5|col_20|__index_level_0__|
+-----+------+-----------------+
| null|   2.0|               10|
| null|   1.0|               20|
| null|  null|               30|
| null|  null|               40|
+-----+------+-----------------+
{code}

Since we take max, the output comes out as 2. Ideally, what we need is the last non-null value of each column in increasing order of __index_level_0__.

To implement that logic, I plan to build the DataFrame below from the one above, using explode, partitioning, and row_number:

{code}
+-----------------+----------+-----+----------+
|__index_level_0__|identifier|value|row_number|
+-----------------+----------+-----+----------+
|               40|     col_5| null|         1|
|               30|     col_5| null|         2|
|               20|     col_5| null|         3|
|               10|     col_5| null|         4|
|               40|    col_20|    2|         1|
|               30|    col_20|    1|         2|
|               20|    col_20| null|         3|
|               10|    col_20| null|         4|
+-----------------+----------+-----+----------+
{code}

Then filter on row_number = 1. There are other details to take care of, but this is the bulk of the logic. Please let me know whether this is the right direction (it actually passes all the asof test cases, including the one described in this Jira). [~itholic] > Fix Series.asof when values of the series is not sorted > --- > > Key: SPARK-37491 > URL: https://issues.apache.org/jira/browse/SPARK-37491 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > > https://github.com/apache/spark/pull/34737#discussion_r758223279 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
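One way to express "last non-null value in index order" directly, as an alternative to the explode/row_number plan (a hedged sketch in the Scala DataFrame API over a hypothetical {{df}} shaped like the first table above, not the pandas-on-Spark patch):

{code:java}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, last}

// last(..., ignoreNulls = true) over an ordered, growing window carries the
// most recent non-null value forward; the final row then holds the asof answer.
val w = Window.orderBy("__index_level_0__")
  .rowsBetween(Window.unboundedPreceding, Window.currentRow)
val carried = df.withColumn("col_20_asof",
  last(col("col_20"), ignoreNulls = true).over(w))
{code}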
[jira] [Updated] (SPARK-37854) Use type match to simplify TestUtils#withHttpConnection
[ https://issues.apache.org/jira/browse/SPARK-37854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen updated SPARK-37854: - Priority: Trivial (was: Major) > Use type match to simplify TestUtils#withHttpConnection > --- > > Key: SPARK-37854 > URL: https://issues.apache.org/jira/browse/SPARK-37854 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Yang Jie >Priority: Trivial > > {code:java} > if (connection.isInstanceOf[HttpsURLConnection]) { > > > connection.asInstanceOf[HttpsURLConnection].setSSLSocketFactory(sslCtx.getSocketFactory()) > > connection.asInstanceOf[HttpsURLConnection].setHostnameVerifier(verifier) > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37855) IllegalStateException when transforming an array inside a nested struct
G Muciaccia created SPARK-37855: --- Summary: IllegalStateException when transforming an array inside a nested struct Key: SPARK-37855 URL: https://issues.apache.org/jira/browse/SPARK-37855 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.2.0 Environment: OS: Ubuntu 20.04.3 LTS Scala version: 2.12.12 Reporter: G Muciaccia

*NOTE*: this bug is only present in version {{3.2.0}}. Downgrading to {{3.1.2}} solves the problem.

h3. Prerequisites to reproduce the bug

# use Spark version 3.2.0
# create a DataFrame with an array field, which contains a struct field with a nested array field
# *apply a limit* to the DataFrame
# transform the outer array, renaming one of its fields
# transform the inner array too, which requires two {{getField}} in sequence

h3. Example that reproduces the bug

This is a minimal example (as minimal as I could make it) to reproduce the bug:

{code}
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._
import org.apache.spark.sql.{DataFrame, Row}

def makeInput(): DataFrame = {
  val innerElement1 = Row(3, 3.12)
  val innerElement2 = Row(4, 2.1)
  val innerElement3 = Row(1, 985.2)
  val innerElement4 = Row(10, 757548.0)
  val innerElement5 = Row(1223, 0.665)

  val outerElement1 = Row(1, Row(List(innerElement1, innerElement2)))
  val outerElement2 = Row(2, Row(List(innerElement3)))
  val outerElement3 = Row(3, Row(List(innerElement4, innerElement5)))

  val data = Seq(
    Row("row1", List(outerElement1)),
    Row("row2", List(outerElement2, outerElement3)),
  )

  val schema = new StructType()
    .add("name", StringType)
    .add("outer_array", ArrayType(new StructType()
      .add("id", IntegerType)
      .add("inner_array_struct", new StructType()
        .add("inner_array", ArrayType(new StructType()
          .add("id", IntegerType)
          .add("value", DoubleType)
        ))
      )
    ))

  spark.createDataFrame(spark.sparkContext.parallelize(data), schema)
}

// val df = makeInput()
val df = makeInput().limit(2)
// val df = makeInput().limit(2).cache()

val res = df.withColumn("extracted", transform(
  col("outer_array"),
  c1 => {
    struct(
      c1.getField("id").alias("outer_id"),
      transform(
        c1.getField("inner_array_struct").getField("inner_array"),
        c2 => {
          struct(
            c2.getField("value").alias("inner_value")
          )
        }
      )
    )
  }
))

res.printSchema()
res.show(false)
{code}

h4. Executing the example code

When executing it as-is, the execution will fail on the {{show}} statement, with

{code}
java.lang.IllegalStateException
Couldn't find _extract_inner_array#23 in [name#2,outer_array#3]
{code}

However, *if the limit is not applied, or if the DataFrame is cached after the limit, everything works* (you can uncomment the corresponding lines in the example to try it). -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37818) Add option for show create table command
[ https://issues.apache.org/jira/browse/SPARK-37818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472181#comment-17472181 ] Huaxin Gao commented on SPARK-37818: [~Gengliang.Wang] I am drafting the 3.2.1 voting email now. I will need to change the fixed version to 3.2.2, otherwise, the list of bug fixes will contain this one. I will change this back to 3.2.1 if RC1 doesn't pass. > Add option for show create table command > > > Key: SPARK-37818 > URL: https://issues.apache.org/jira/browse/SPARK-37818 > Project: Spark > Issue Type: Documentation > Components: Documentation >Affects Versions: 3.3.0 >Reporter: PengLei >Assignee: PengLei >Priority: Trivial > Fix For: 3.2.1, 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37818) Add option for show create table command
[ https://issues.apache.org/jira/browse/SPARK-37818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-37818: --- Fix Version/s: (was: 3.2.1) > Add option for show create table command > > > Key: SPARK-37818 > URL: https://issues.apache.org/jira/browse/SPARK-37818 > Project: Spark > Issue Type: Documentation > Components: Documentation >Affects Versions: 3.3.0 >Reporter: PengLei >Assignee: PengLei >Priority: Trivial > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37818) Add option for show create table command
[ https://issues.apache.org/jira/browse/SPARK-37818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472183#comment-17472183 ] Huaxin Gao commented on SPARK-37818: [~Gengliang.Wang] version 3.2.2 doesn't exist yet. I will just set the version to 3.3.0 for now. Will update the version to 3.2.2 later. > Add option for show create table command > > > Key: SPARK-37818 > URL: https://issues.apache.org/jira/browse/SPARK-37818 > Project: Spark > Issue Type: Documentation > Components: Documentation >Affects Versions: 3.3.0 >Reporter: PengLei >Assignee: PengLei >Priority: Trivial > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37856) Executor pods keep existing if driver container was restarted
Denis Krivenko created SPARK-37856: -- Summary: Executor pods keep existing if driver container was restarted Key: SPARK-37856 URL: https://issues.apache.org/jira/browse/SPARK-37856 Project: Spark Issue Type: Bug Components: Kubernetes Affects Versions: 3.2.0, 3.1.2 Environment: * Kubernetes 1.20 * Spark 3.1.2 * Hadoop 3.2.0 * Java 11 * Scala 2.12 and * Kubernetes 1.20 * Spark 3.2.0 * Hadoop 3.3.1 * Java 11 * Scala 2.12 Reporter: Denis Krivenko I run Spark Thrift Server on Kubernetes cluster, so the driver pod runs continuously and it creates and manages executor pods. From time to time OOM issue occurs on a driver pod or executor pods. When it happens on * executor - the executor pod is getting deleted and the driver creates a new executor pod instead. It works as expected. * driver - Kubernetes restarts the driver container and the driver creates new executor pods. All previous executors stop, but still exist with *Error* state for Spark 3.1.2 or with *Completed* state for Spark 3.2.0 The behavior can be reproduced by restarting a pod container with the command {code:java} kubectl exec POD_NAME -c CONTAINER_NAME -- /sbin/killall5{code} Property _spark.kubernetes.executor.deleteOnTermination_ is set to *true* by default. If I delete driver pod all executor pods (in any state) are also deleted completely. +Pod list+ {code:java} NAME READY STATUS RESTARTS AGE spark-thrift-server-85cf5d689b-vvrwd 1/1 Running 1 3d15h spark-thrift-server-198cc57e3f9a7400-exec-10 1/1 Running 0 86m spark-thrift-server-198cc57e3f9a7400-exec-6 1/1 Running 0 12h spark-thrift-server-198cc57e3f9a7400-exec-8 1/1 Running 0 9h spark-thrift-server-198cc57e3f9a7400-exec-9 1/1 Running 0 3h12m spark-thrift-server-1a9aee7e31f36eea-exec-17 0/1 Completed 0 38h spark-thrift-server-1a9aee7e31f36eea-exec-18 0/1 Completed 0 38h spark-thrift-server-1a9aee7e31f36eea-exec-19 0/1 Completed 0 36h spark-thrift-server-1a9aee7e31f36eea-exec-21 0/1 Completed 0 24h {code} +Driver pod+ {code:java} apiVersion: v1 kind: Pod metadata: name: spark-thrift-server-85cf5d689b-vvrwd uid: b69a7c68-a767-4e3b-939c-061347b1c25e spec: ... status: containerStatuses: - containerID: containerd://7206acf424aa30b6f8533c0e32c99ebfdc5ee80648e76289f6bd2f87460ddcd3 image: xxx/spark:3.2.0 lastState: terminated: containerID: containerd://fe3cacb8e6470ac37dcd50d525ae3d54c8b6bfef3558325bc22e7b40daab1703 exitCode: 143 finishedAt: "2022-01-09T16:09:50Z" reason: OOMKilled startedAt: "2022-01-07T00:32:21Z" name: spark-thrift-server ready: true restartCount: 1 started: true state: running: startedAt: "2022-01-09T16:09:51Z" {code} Executor pod {code:java} apiVersion: v1 kind: Pod metadata: name: spark-thrift-server-1a9aee7e31f36eea-exec-17 ownerReferences: - apiVersion: v1 controller: true kind: Pod name: spark-thrift-server-85cf5d689b-vvrwd uid: b69a7c68-a767-4e3b-939c-061347b1c25e spec: ... status: containerStatuses: - containerID: containerd://75c68190147ba980f4b9014eef3989ddc2ee30de321fd1119957b6684a995c19 image: xxx/spark:3.2.0 lastState: {} name: spark-kubernetes-executor ready: false restartCount: 0 started: false state: terminated: containerID: containerd://75c68190147ba980f4b9014eef3989ddc2ee30de321fd1119957b6684a995c19 exitCode: 0 finishedAt: "2022-01-09T16:08:57Z" reason: Completed startedAt: "2022-01-09T01:39:15Z" {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37856) Executor pods keep existing if driver container was restarted
[ https://issues.apache.org/jira/browse/SPARK-37856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denis Krivenko updated SPARK-37856: --- Environment: Kubernetes 1.20 | Spark 3.1.2 | Hadoop 3.2.0 | Java 11 | Scala 2.12 Kubernetes 1.20 | Spark 3.2.0 | Hadoop 3.3.1 | Java 11 | Scala 2.12 was: * Kubernetes 1.20 * Spark 3.1.2 * Hadoop 3.2.0 * Java 11 * Scala 2.12 and * Kubernetes 1.20 * Spark 3.2.0 * Hadoop 3.3.1 * Java 11 * Scala 2.12 > Executor pods keep existing if driver container was restarted > - > > Key: SPARK-37856 > URL: https://issues.apache.org/jira/browse/SPARK-37856 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 3.1.2, 3.2.0 > Environment: Kubernetes 1.20 | Spark 3.1.2 | Hadoop 3.2.0 | Java 11 | > Scala 2.12 > Kubernetes 1.20 | Spark 3.2.0 | Hadoop 3.3.1 | Java 11 | Scala 2.12 >Reporter: Denis Krivenko >Priority: Minor > > I run Spark Thrift Server on Kubernetes cluster, so the driver pod runs > continuously and it creates and manages executor pods. From time to time OOM > issue occurs on a driver pod or executor pods. > When it happens on > * executor - the executor pod is getting deleted and the driver creates a > new executor pod instead. It works as expected. > * driver - Kubernetes restarts the driver container and the driver > creates new executor pods. All previous executors stop, but still exist with > *Error* state for Spark 3.1.2 or with *Completed* state for Spark 3.2.0 > The behavior can be reproduced by restarting a pod container with the command > {code:java} > kubectl exec POD_NAME -c CONTAINER_NAME -- /sbin/killall5{code} > Property _spark.kubernetes.executor.deleteOnTermination_ is set to *true* by > default. > If I delete driver pod all executor pods (in any state) are also deleted > completely. > +Pod list+ > {code:java} > NAME READY STATUS RESTARTS > AGE > spark-thrift-server-85cf5d689b-vvrwd 1/1 Running 1 > 3d15h > spark-thrift-server-198cc57e3f9a7400-exec-10 1/1 Running 0 > 86m > spark-thrift-server-198cc57e3f9a7400-exec-6 1/1 Running 0 > 12h > spark-thrift-server-198cc57e3f9a7400-exec-8 1/1 Running 0 > 9h > spark-thrift-server-198cc57e3f9a7400-exec-9 1/1 Running 0 > 3h12m > spark-thrift-server-1a9aee7e31f36eea-exec-17 0/1 Completed 0 > 38h > spark-thrift-server-1a9aee7e31f36eea-exec-18 0/1 Completed 0 > 38h > spark-thrift-server-1a9aee7e31f36eea-exec-19 0/1 Completed 0 > 36h > spark-thrift-server-1a9aee7e31f36eea-exec-21 0/1 Completed 0 > 24h > {code} > +Driver pod+ > {code:java} > apiVersion: v1 > kind: Pod > metadata: > name: spark-thrift-server-85cf5d689b-vvrwd > uid: b69a7c68-a767-4e3b-939c-061347b1c25e > spec: > ... > status: > containerStatuses: > - containerID: > containerd://7206acf424aa30b6f8533c0e32c99ebfdc5ee80648e76289f6bd2f87460ddcd3 > image: xxx/spark:3.2.0 > lastState: > terminated: > containerID: > containerd://fe3cacb8e6470ac37dcd50d525ae3d54c8b6bfef3558325bc22e7b40daab1703 > exitCode: 143 > finishedAt: "2022-01-09T16:09:50Z" > reason: OOMKilled > startedAt: "2022-01-07T00:32:21Z" > name: spark-thrift-server > ready: true > restartCount: 1 > started: true > state: > running: > startedAt: "2022-01-09T16:09:51Z" {code} > Executor pod > {code:java} > apiVersion: v1 > kind: Pod > metadata: > name: spark-thrift-server-1a9aee7e31f36eea-exec-17 > ownerReferences: > - apiVersion: v1 > controller: true > kind: Pod > name: spark-thrift-server-85cf5d689b-vvrwd > uid: b69a7c68-a767-4e3b-939c-061347b1c25e > spec: > ... 
> status: > containerStatuses: > - containerID: > containerd://75c68190147ba980f4b9014eef3989ddc2ee30de321fd1119957b6684a995c19 > image: xxx/spark:3.2.0 > lastState: {} > name: spark-kubernetes-executor > ready: false > restartCount: 0 > started: false > state: > terminated: > containerID: > containerd://75c68190147ba980f4b9014eef3989ddc2ee30de321fd1119957b6684a995c19 > exitCode: 0 > finishedAt: "2022-01-09T16:08:57Z" > reason: Completed > startedAt: "2022-01-09T01:39:15Z" {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37857) Any depth search not working in get_json_object ($..foo)
Venkata Subramaniam created SPARK-37857: --- Summary: Any depth search not working in get_json_object ($..foo) Key: SPARK-37857 URL: https://issues.apache.org/jira/browse/SPARK-37857 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.2.0 Reporter: Venkata Subramaniam The following example should return a value _abc_ but instead returns null {code:java} spark.sql("""select get_json_object('{"k":{"value":"abc"}}', '$..value') as j""").show() {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
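For contrast, spelling out the path returns the value, which isolates the failure to the any-depth ({{$..}}) operator:

{code:java}
// Explicit path works; only the recursive-descent form returns null.
spark.sql("""select get_json_object('{"k":{"value":"abc"}}', '$.k.value') as j""").show()
// +---+
// |  j|
// +---+
// |abc|
// +---+
{code}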
[jira] [Commented] (SPARK-35262) Memory leak when dataset is being persisted
[ https://issues.apache.org/jira/browse/SPARK-35262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472248#comment-17472248 ] Denis Krivenko commented on SPARK-35262: [~iamelin] Could you please check/confirm the issue still exists in 3.2.0? > Memory leak when dataset is being persisted > --- > > Key: SPARK-35262 > URL: https://issues.apache.org/jira/browse/SPARK-35262 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.1 >Reporter: Igor Amelin >Priority: Major > > If a Java or Scala application with a SparkSession runs for a long time and > persists a lot of datasets, it can crash because of a memory leak. > I've noticed the following. When we have a dataset and persist it, the > SparkSession used to load that dataset is cloned in CacheManager, and this > clone is added as a listener to `listenersPlusTimers` in `ListenerBus`. But > this clone isn't removed from the list of listeners after that, e.g. on > unpersisting the dataset. If we persist a lot of datasets, the SparkSession > is cloned and added to `ListenerBus` many times. This leads to a memory leak > since the `listenersPlusTimers` list becomes very large. > I've found out that the SparkSession is cloned in CacheManager when the > parameters `spark.sql.sources.bucketing.autoBucketedScan.enabled` and > `spark.sql.adaptive.enabled` are true. The first one is true by default, and > this default behavior leads to the problem. When auto bucketed scan is > disabled, the SparkSession isn't cloned, and there are no duplicates in > ListenerBus, so the memory leak doesn't occur. > Here is a small Java application to reproduce the memory leak: > [https://github.com/iamelin/spark-memory-leak] -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
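A one-line mitigation implied by the description above (this disables the optimization that triggers the clone; it does not fix the underlying leak):

{code:java}
spark.conf.set("spark.sql.sources.bucketing.autoBucketedScan.enabled", "false")
{code}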
[jira] [Created] (SPARK-37858) Wrap Java exceptions from aes functions
Max Gekk created SPARK-37858: Summary: Wrap Java exceptions from aes functions Key: SPARK-37858 URL: https://issues.apache.org/jira/browse/SPARK-37858 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.3.0 Reporter: Max Gekk Assignee: Max Gekk Currently, Spark SQL can throw Java exceptions from the aes_encrypt()/aes_decrypt() functions, for instance: {code:java} java.lang.RuntimeException: javax.crypto.AEADBadTagException: Tag mismatch! at org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesInternal(ExpressionImplUtils.java:93) at org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesDecrypt(ExpressionImplUtils.java:43) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.project_doConsume_0$(Unknown Source) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759) at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:354) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365) at org.apache.spark.rdd.RDD.iterator(RDD.scala:329) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:136) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:507) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1468) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:510) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: javax.crypto.AEADBadTagException: Tag mismatch! at com.sun.crypto.provider.GaloisCounterMode.decryptFinal(GaloisCounterMode.java:620) at com.sun.crypto.provider.CipherCore.finalNoPadding(CipherCore.java:1116) at com.sun.crypto.provider.CipherCore.fillOutputBuffer(CipherCore.java:1053) at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:853) at com.sun.crypto.provider.AESCipher.engineDoFinal(AESCipher.java:446) at javax.crypto.Cipher.doFinal(Cipher.java:2226) at org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesInternal(ExpressionImplUtils.java:87) ... 19 more {code} That might confuse non-Scala/Java users. Need to wrap such kind of exception by Spark's exception. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
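A minimal sketch of the proposed wrapping ({{SparkAesException}} is a hypothetical name for illustration, not an actual Spark class):

{code:java}
import javax.crypto.{AEADBadTagException, Cipher}

class SparkAesException(msg: String, cause: Throwable)
  extends RuntimeException(msg, cause)

// Translate the JCE exception into a Spark-facing one with an actionable message.
def doFinalChecked(cipher: Cipher, bytes: Array[Byte]): Array[Byte] =
  try cipher.doFinal(bytes)
  catch {
    case e: AEADBadTagException =>
      throw new SparkAesException(
        "AES-GCM tag mismatch: wrong key or corrupted ciphertext", e)
  }
{code}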
[jira] [Updated] (SPARK-37858) Wrap Java exceptions from aes functions
[ https://issues.apache.org/jira/browse/SPARK-37858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-37858: - Issue Type: Improvement (was: Bug) > Wrap Java exceptions from aes functions > --- > > Key: SPARK-37858 > URL: https://issues.apache.org/jira/browse/SPARK-37858 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Currently, Spark SQL can throw Java exceptions from the > aes_encrypt()/aes_decrypt() functions, for instance: > {code:java} > java.lang.RuntimeException: javax.crypto.AEADBadTagException: Tag mismatch! > at > org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesInternal(ExpressionImplUtils.java:93) > at > org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesDecrypt(ExpressionImplUtils.java:43) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.project_doConsume_0$(Unknown > Source) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown > Source) > at > org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) > at > org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759) > at > org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:354) > at > org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890) > at > org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890) > at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:329) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) > at org.apache.spark.scheduler.Task.run(Task.scala:136) > at > org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:507) > at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1468) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:510) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: javax.crypto.AEADBadTagException: Tag mismatch! > at > com.sun.crypto.provider.GaloisCounterMode.decryptFinal(GaloisCounterMode.java:620) > at > com.sun.crypto.provider.CipherCore.finalNoPadding(CipherCore.java:1116) > at > com.sun.crypto.provider.CipherCore.fillOutputBuffer(CipherCore.java:1053) > at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:853) > at com.sun.crypto.provider.AESCipher.engineDoFinal(AESCipher.java:446) > at javax.crypto.Cipher.doFinal(Cipher.java:2226) > at > org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesInternal(ExpressionImplUtils.java:87) > ... 19 more > {code} > That might confuse non-Scala/Java users. Need to wrap such kind of exception > by Spark's exception. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36644) Push down boolean column filter
[ https://issues.apache.org/jira/browse/SPARK-36644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472269#comment-17472269 ] Apache Spark commented on SPARK-36644: -- User 'kazuyukitanimura' has created a pull request for this issue: https://github.com/apache/spark/pull/35156 > Push down boolean column filter > --- > > Key: SPARK-36644 > URL: https://issues.apache.org/jira/browse/SPARK-36644 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL >Affects Versions: 3.3.0 >Reporter: Kazuyuki Tanimura >Assignee: Kazuyuki Tanimura >Priority: Major > Fix For: 3.3.0 > > > The following query does not push down the filter > ``` > SELECT * FROM t WHERE boolean_field > ``` > although the following query pushes down the filter as expected. > ``` > SELECT * FROM t WHERE boolean_field = true > ``` -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
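One plausible shape of the fix (an illustration under assumptions, not the actual PR): normalize a bare boolean attribute in a filter into an explicit comparison so the existing push-down translation can match it.

{code:java}
import org.apache.spark.sql.catalyst.expressions.{AttributeReference, EqualTo, Expression, Literal}
import org.apache.spark.sql.types.BooleanType

// Rewrite `WHERE boolean_field` as `WHERE boolean_field = true`.
def normalize(pred: Expression): Expression = pred match {
  case a: AttributeReference if a.dataType == BooleanType => EqualTo(a, Literal(true))
  case other => other
}
{code}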
[jira] [Updated] (SPARK-36782) Deadlock between map-output-dispatcher and dispatcher-BlockManagerMaster upon migrating shuffle blocks
[ https://issues.apache.org/jira/browse/SPARK-36782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau updated SPARK-36782: - Fix Version/s: 3.1.3 > Deadlock between map-output-dispatcher and dispatcher-BlockManagerMaster upon > migrating shuffle blocks > -- > > Key: SPARK-36782 > URL: https://issues.apache.org/jira/browse/SPARK-36782 > Project: Spark > Issue Type: Bug > Components: Block Manager >Affects Versions: 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.1.3, 3.2.1, 3.3.0 >Reporter: Fabian Thiele >Assignee: Fabian Thiele >Priority: Major > Fix For: 3.2.0, 3.1.3 > > Attachments: > 0001-Add-test-showing-that-decommission-might-deadlock.patch, > spark_stacktrace_deadlock.txt > > > I can observe a deadlock on the driver that can be triggered rather reliably > in a job with a larger amount of tasks - upon using > {code:java} > spark.decommission.enabled: true > spark.storage.decommission.rddBlocks.enabled: true > spark.storage.decommission.shuffleBlocks.enabled: true > spark.storage.decommission.enabled: true{code} > > It origins in the {{dispatcher-BlockManagerMaster}} making a call to > {{updateBlockInfo}} when shuffles are migrated. This is not performed by a > thread from the pool but instead by the {{dispatcher-BlockManagerMaster}} > itself. I suppose this was done under the assumption that this would be very > fast. However if the block that is updated is a shuffle index block it calls > {code:java} > mapOutputTracker.updateMapOutput(shuffleId, mapId, blockManagerId){code} > for which it waits to acquire a write lock as part of the > {{MapOutputTracker}}. > If the timing is bad then one of the {{map-output-dispatchers}} are holding > this lock as part of e.g. {{serializedMapStatus}}. In this function > {{MapOutputTracker.serializeOutputStatuses}} is called and as part of that we > do > {code:java} > if (arrSize >= minBroadcastSize) { > // Use broadcast instead. > // Important arr(0) is the tag == DIRECT, ignore that while deserializing ! > // arr is a nested Array so that it can handle over 2GB serialized data > val arr = chunkedByteBuf.getChunks().map(_.array()) > val bcast = broadcastManager.newBroadcast(arr, isLocal){code} > which makes an RPC call to {{dispatcher-BlockManagerMaster}}. That one > however is unable to answer as it is blocked while waiting on the > aforementioned lock. Hence the deadlock. The ingredients of this deadlock are > therefore: sufficient size of the array to go the broadcast-path, as well as > timing of incoming {{updateBlockInfo}} call as happens regularly during > decommissioning. Potentially earlier versions than 3.1.0 are affected but I > could not sufficiently conclude that. > I have a stacktrace of all driver threads showing the deadlock: > [^spark_stacktrace_deadlock.txt] > A coworker of mine wrote a patch that replicates the issue as a test case as > well: [^0001-Add-test-showing-that-decommission-might-deadlock.patch] -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
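The deadlock described above reduces to a classic lock-plus-synchronous-reply cycle. An illustrative reduction (plain JDK concurrency, not Spark code): thread A plays the map-output dispatcher holding the lock while waiting on an "RPC" reply, and thread B plays {{dispatcher-BlockManagerMaster}}, which must take the write lock before it can answer.

{code:java}
import java.util.concurrent.SynchronousQueue
import java.util.concurrent.locks.ReentrantReadWriteLock

val lock = new ReentrantReadWriteLock()
val rpc  = new SynchronousQueue[String]()

val a = new Thread(() => {
  lock.readLock().lock()           // serializedMapStatus: holds the lock...
  try rpc.take()                   // ...then waits for the broadcast RPC reply
  finally lock.readLock().unlock()
})
val b = new Thread(() => {
  lock.writeLock().lock()          // updateMapOutput: wants the lock first
  try rpc.put("reply")             // the reply that would unblock thread A
  finally lock.writeLock().unlock()
})
a.start(); b.start()               // whichever locks first, both block forever
{code}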
[jira] [Updated] (SPARK-37858) Throw Spark exceptions from AES functions
[ https://issues.apache.org/jira/browse/SPARK-37858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-37858: - Summary: Throw Spark exceptions from AES functions (was: Wrap Java exceptions from aes functions) > Throw Spark exceptions from AES functions > - > > Key: SPARK-37858 > URL: https://issues.apache.org/jira/browse/SPARK-37858 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Currently, Spark SQL can throw Java exceptions from the > aes_encrypt()/aes_decrypt() functions, for instance: > {code:java} > java.lang.RuntimeException: javax.crypto.AEADBadTagException: Tag mismatch! > at > org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesInternal(ExpressionImplUtils.java:93) > at > org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesDecrypt(ExpressionImplUtils.java:43) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.project_doConsume_0$(Unknown > Source) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown > Source) > at > org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) > at > org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759) > at > org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:354) > at > org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890) > at > org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890) > at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:329) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) > at org.apache.spark.scheduler.Task.run(Task.scala:136) > at > org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:507) > at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1468) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:510) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: javax.crypto.AEADBadTagException: Tag mismatch! > at > com.sun.crypto.provider.GaloisCounterMode.decryptFinal(GaloisCounterMode.java:620) > at > com.sun.crypto.provider.CipherCore.finalNoPadding(CipherCore.java:1116) > at > com.sun.crypto.provider.CipherCore.fillOutputBuffer(CipherCore.java:1053) > at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:853) > at com.sun.crypto.provider.AESCipher.engineDoFinal(AESCipher.java:446) > at javax.crypto.Cipher.doFinal(Cipher.java:2226) > at > org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesInternal(ExpressionImplUtils.java:87) > ... 19 more > {code} > That might confuse non-Scala/Java users. Need to wrap such kind of exception > by Spark's exception. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37858) Throw Spark exceptions from AES functions
[ https://issues.apache.org/jira/browse/SPARK-37858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37858: Assignee: Apache Spark (was: Max Gekk) > Throw Spark exceptions from AES functions > - > > Key: SPARK-37858 > URL: https://issues.apache.org/jira/browse/SPARK-37858 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Assignee: Apache Spark >Priority: Major > > Currently, Spark SQL can throw Java exceptions from the > aes_encrypt()/aes_decrypt() functions, for instance: > {code:java} > java.lang.RuntimeException: javax.crypto.AEADBadTagException: Tag mismatch! > at > org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesInternal(ExpressionImplUtils.java:93) > at > org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesDecrypt(ExpressionImplUtils.java:43) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.project_doConsume_0$(Unknown > Source) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown > Source) > at > org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) > at > org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759) > at > org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:354) > at > org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890) > at > org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890) > at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:329) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) > at org.apache.spark.scheduler.Task.run(Task.scala:136) > at > org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:507) > at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1468) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:510) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: javax.crypto.AEADBadTagException: Tag mismatch! > at > com.sun.crypto.provider.GaloisCounterMode.decryptFinal(GaloisCounterMode.java:620) > at > com.sun.crypto.provider.CipherCore.finalNoPadding(CipherCore.java:1116) > at > com.sun.crypto.provider.CipherCore.fillOutputBuffer(CipherCore.java:1053) > at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:853) > at com.sun.crypto.provider.AESCipher.engineDoFinal(AESCipher.java:446) > at javax.crypto.Cipher.doFinal(Cipher.java:2226) > at > org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesInternal(ExpressionImplUtils.java:87) > ... 19 more > {code} > That might confuse non-Scala/Java users. Need to wrap such kind of exception > by Spark's exception. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37858) Throw Spark exceptions from AES functions
[ https://issues.apache.org/jira/browse/SPARK-37858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472294#comment-17472294 ] Apache Spark commented on SPARK-37858: -- User 'MaxGekk' has created a pull request for this issue: https://github.com/apache/spark/pull/35157 > Throw Spark exceptions from AES functions > - > > Key: SPARK-37858 > URL: https://issues.apache.org/jira/browse/SPARK-37858 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Currently, Spark SQL can throw Java exceptions from the > aes_encrypt()/aes_decrypt() functions, for instance: > {code:java} > java.lang.RuntimeException: javax.crypto.AEADBadTagException: Tag mismatch! > at > org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesInternal(ExpressionImplUtils.java:93) > at > org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesDecrypt(ExpressionImplUtils.java:43) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.project_doConsume_0$(Unknown > Source) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown > Source) > at > org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) > at > org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759) > at > org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:354) > at > org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890) > at > org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890) > at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:329) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) > at org.apache.spark.scheduler.Task.run(Task.scala:136) > at > org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:507) > at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1468) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:510) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: javax.crypto.AEADBadTagException: Tag mismatch! > at > com.sun.crypto.provider.GaloisCounterMode.decryptFinal(GaloisCounterMode.java:620) > at > com.sun.crypto.provider.CipherCore.finalNoPadding(CipherCore.java:1116) > at > com.sun.crypto.provider.CipherCore.fillOutputBuffer(CipherCore.java:1053) > at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:853) > at com.sun.crypto.provider.AESCipher.engineDoFinal(AESCipher.java:446) > at javax.crypto.Cipher.doFinal(Cipher.java:2226) > at > org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesInternal(ExpressionImplUtils.java:87) > ... 19 more > {code} > That might confuse non-Scala/Java users. Need to wrap such kind of exception > by Spark's exception. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37858) Throw Spark exceptions from AES functions
[ https://issues.apache.org/jira/browse/SPARK-37858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37858: Assignee: Max Gekk (was: Apache Spark) > Throw Spark exceptions from AES functions > - > > Key: SPARK-37858 > URL: https://issues.apache.org/jira/browse/SPARK-37858 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Currently, Spark SQL can throw Java exceptions from the > aes_encrypt()/aes_decrypt() functions, for instance: > {code:java} > java.lang.RuntimeException: javax.crypto.AEADBadTagException: Tag mismatch! > at > org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesInternal(ExpressionImplUtils.java:93) > at > org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesDecrypt(ExpressionImplUtils.java:43) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.project_doConsume_0$(Unknown > Source) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown > Source) > at > org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) > at > org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759) > at > org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:354) > at > org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890) > at > org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890) > at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:329) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) > at org.apache.spark.scheduler.Task.run(Task.scala:136) > at > org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:507) > at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1468) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:510) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: javax.crypto.AEADBadTagException: Tag mismatch! > at > com.sun.crypto.provider.GaloisCounterMode.decryptFinal(GaloisCounterMode.java:620) > at > com.sun.crypto.provider.CipherCore.finalNoPadding(CipherCore.java:1116) > at > com.sun.crypto.provider.CipherCore.fillOutputBuffer(CipherCore.java:1053) > at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:853) > at com.sun.crypto.provider.AESCipher.engineDoFinal(AESCipher.java:446) > at javax.crypto.Cipher.doFinal(Cipher.java:2226) > at > org.apache.spark.sql.catalyst.expressions.ExpressionImplUtils.aesInternal(ExpressionImplUtils.java:87) > ... 19 more > {code} > That might confuse non-Scala/Java users. Need to wrap such kind of exception > by Spark's exception. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37852) Enable flake8's E741 rule in PySpark
[ https://issues.apache.org/jira/browse/SPARK-37852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37852: -- Assignee: Hyukjin Kwon > Enable flake8's E741 rule in PySpark > --- > > Key: SPARK-37852 > URL: https://issues.apache.org/jira/browse/SPARK-37852 > Project: Spark > Issue Type: Improvement > Components: Project Infra, PySpark >Affects Versions: 3.3.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > > To comply with PEP 8, we should enable this rule (see also > https://www.python.org/dev/peps/pep-0008/#names-to-avoid) -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37852) Enable flake8's E741 rule in PySpark
[ https://issues.apache.org/jira/browse/SPARK-37852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37852. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35152 [https://github.com/apache/spark/pull/35152] > Enable flake8's E741 rule in PySpark > --- > > Key: SPARK-37852 > URL: https://issues.apache.org/jira/browse/SPARK-37852 > Project: Spark > Issue Type: Improvement > Components: Project Infra, PySpark >Affects Versions: 3.3.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Fix For: 3.3.0 > > > To comply with PEP 8, we should enable this rule (see also > https://www.python.org/dev/peps/pep-0008/#names-to-avoid) -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37859) SQL tables created with JDBC with Spark 3.1 are not readable with 3.2
Karen Feng created SPARK-37859: -- Summary: SQL tables created with JDBC with Spark 3.1 are not readable with 3.2 Key: SPARK-37859 URL: https://issues.apache.org/jira/browse/SPARK-37859 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.2.0 Reporter: Karen Feng In https://github.com/apache/spark/blob/bd24b4884b804fc85a083f82b864823851d5980c/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L312, a new metadata field is added during reading. As we do a full comparison of the user-provided schema and the actual schema in https://github.com/apache/spark/blob/bd24b4884b804fc85a083f82b864823851d5980c/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L356, resolution fails if a table created with Spark 3.1 is read with Spark 3.2. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
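The failure mode above is easy to see with plain schema objects: StructField equality includes the metadata map, so a schema that gains a metadata entry on read no longer equals the user-provided schema even though the types match. A small sketch (the metadata key below is made up for illustration; the real key is whatever JdbcUtils adds):

{code:java}
import org.apache.spark.sql.types._

val userProvided = StructType(Seq(StructField("id", LongType)))
val fromRead = StructType(Seq(StructField("id", LongType,
  metadata = new MetadataBuilder().putBoolean("someJdbcFlag", true).build())))

// The full comparison fails solely because of the metadata entry:
println(userProvided == fromRead)  // false

// Stripping metadata before comparing shows the types themselves agree:
def stripMetadata(s: StructType): StructType =
  StructType(s.map(_.copy(metadata = Metadata.empty)))
println(stripMetadata(userProvided) == stripMetadata(fromRead))  // true
{code}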
[jira] [Assigned] (SPARK-37859) SQL tables created with JDBC with Spark 3.1 are not readable with 3.2
[ https://issues.apache.org/jira/browse/SPARK-37859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37859: Assignee: (was: Apache Spark) > SQL tables created with JDBC with Spark 3.1 are not readable with 3.2 > - > > Key: SPARK-37859 > URL: https://issues.apache.org/jira/browse/SPARK-37859 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > In > https://github.com/apache/spark/blob/bd24b4884b804fc85a083f82b864823851d5980c/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L312, > a new metadata field is added during reading. As we do a full comparison of > the user-provided schema and the actual schema in > https://github.com/apache/spark/blob/bd24b4884b804fc85a083f82b864823851d5980c/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L356, > resolution fails if a table created with Spark 3.1 is read with Spark 3.2. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37859) SQL tables created with JDBC with Spark 3.1 are not readable with 3.2
[ https://issues.apache.org/jira/browse/SPARK-37859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37859: Assignee: Apache Spark > SQL tables created with JDBC with Spark 3.1 are not readable with 3.2 > - > > Key: SPARK-37859 > URL: https://issues.apache.org/jira/browse/SPARK-37859 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Assignee: Apache Spark >Priority: Major > > In > https://github.com/apache/spark/blob/bd24b4884b804fc85a083f82b864823851d5980c/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L312, > a new metadata field is added during reading. As we do a full comparison of > the user-provided schema and the actual schema in > https://github.com/apache/spark/blob/bd24b4884b804fc85a083f82b864823851d5980c/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L356, > resolution fails if a table created with Spark 3.1 is read with Spark 3.2. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37859) SQL tables created with JDBC with Spark 3.1 are not readable with 3.2
[ https://issues.apache.org/jira/browse/SPARK-37859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472351#comment-17472351 ] Apache Spark commented on SPARK-37859: -- User 'karenfeng' has created a pull request for this issue: https://github.com/apache/spark/pull/35158 > SQL tables created with JDBC with Spark 3.1 are not readable with 3.2 > - > > Key: SPARK-37859 > URL: https://issues.apache.org/jira/browse/SPARK-37859 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > In > https://github.com/apache/spark/blob/bd24b4884b804fc85a083f82b864823851d5980c/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L312, > a new metadata field is added during reading. As we do a full comparison of > the user-provided schema and the actual schema in > https://github.com/apache/spark/blob/bd24b4884b804fc85a083f82b864823851d5980c/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L356, > resolution fails if a table created with Spark 3.1 is read with Spark 3.2. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37859) SQL tables created with JDBC with Spark 3.1 are not readable with 3.2
[ https://issues.apache.org/jira/browse/SPARK-37859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472350#comment-17472350 ] Apache Spark commented on SPARK-37859: -- User 'karenfeng' has created a pull request for this issue: https://github.com/apache/spark/pull/35158 > SQL tables created with JDBC with Spark 3.1 are not readable with 3.2 > - > > Key: SPARK-37859 > URL: https://issues.apache.org/jira/browse/SPARK-37859 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > In > https://github.com/apache/spark/blob/bd24b4884b804fc85a083f82b864823851d5980c/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L312, > a new metadata field is added during reading. As we do a full comparison of > the user-provided schema and the actual schema in > https://github.com/apache/spark/blob/bd24b4884b804fc85a083f82b864823851d5980c/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L356, > resolution fails if a table created with Spark 3.1 is read with Spark 3.2. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37851) Mark org.apache.spark.sql.hive.execution as slow tests
[ https://issues.apache.org/jira/browse/SPARK-37851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-37851: - Assignee: Hyukjin Kwon > Mark org.apache.spark.sql.hive.execution as slow tests > -- > > Key: SPARK-37851 > URL: https://issues.apache.org/jira/browse/SPARK-37851 > Project: Spark > Issue Type: Improvement > Components: SQL, Tests >Affects Versions: 3.3.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > > Related to SPARK-33171 and SPARK-32884. We should rebalance both: > "hive - slow tests": > https://github.com/apache/spark/runs/4755996153?check_suite_focus=true > "hive - other tests": > https://github.com/apache/spark/runs/4755996212?check_suite_focus=true -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37851) Mark org.apache.spark.sql.hive.execution as slow tests
[ https://issues.apache.org/jira/browse/SPARK-37851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-37851. --- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35151 [https://github.com/apache/spark/pull/35151] > Mark org.apache.spark.sql.hive.execution as slow tests > -- > > Key: SPARK-37851 > URL: https://issues.apache.org/jira/browse/SPARK-37851 > Project: Spark > Issue Type: Improvement > Components: SQL, Tests >Affects Versions: 3.3.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Fix For: 3.3.0 > > > Related to SPARK-33171 and SPARK-32884. We should rebalance both: > "hive - slow tests": > https://github.com/apache/spark/runs/4755996153?check_suite_focus=true > "hive - other tests": > https://github.com/apache/spark/runs/4755996212?check_suite_focus=true -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
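For reference, routing a suite into a slow group is done with a test tag annotation that the CI workflow includes or excludes per job. A sketch, assuming the SlowHiveTest tag from Spark's common/tags module (the annotation and suite names here are assumptions, not quoted from the PR):

{code:java}
import org.apache.spark.SparkFunSuite
import org.apache.spark.tags.SlowHiveTest  // assumed tag from common/tags

@SlowHiveTest  // routes this suite to the "hive - slow tests" CI group
class SomeHiveExecutionSuite extends SparkFunSuite {
  test("placeholder") { assert(true) }
}
{code}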
[jira] [Updated] (SPARK-30789) Support (IGNORE | RESPECT) NULLS for LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE
[ https://issues.apache.org/jira/browse/SPARK-30789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-30789: --- Fix Version/s: 3.2.1 > Support (IGNORE | RESPECT) NULLS for LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE > -- > > Key: SPARK-30789 > URL: https://issues.apache.org/jira/browse/SPARK-30789 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: jiaan.geng >Assignee: Apache Spark >Priority: Major > Fix For: 3.2.0, 3.2.1 > > > All of LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE should support IGNORE NULLS > | RESPECT NULLS. For example: > {code:java} > LEAD (value_expr [, offset ]) > [ IGNORE NULLS | RESPECT NULLS ] > OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering ){code} > > {code:java} > LAG (value_expr [, offset ]) > [ IGNORE NULLS | RESPECT NULLS ] > OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering ){code} > > {code:java} > NTH_VALUE (expr, offset) > [ IGNORE NULLS | RESPECT NULLS ] > OVER > ( [ PARTITION BY window_partition ] > [ ORDER BY window_ordering > frame_clause ] ){code} > > *Oracle:* > [https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/NTH_VALUE.html#GUID-F8A0E88C-67E5-4AA6-9515-95D03A7F9EA0] > *Redshift* > [https://docs.aws.amazon.com/redshift/latest/dg/r_WF_NTH.html] > *Presto* > [https://prestodb.io/docs/current/functions/window.html] > *DB2* > [https://www.ibm.com/support/knowledgecenter/SSGU8G_14.1.0/com.ibm.sqls.doc/ids_sqs_1513.htm] > *Teradata* > [https://docs.teradata.com/r/756LNiPSFdY~4JcCCcR5Cw/GjCT6l7trjkIEjt~7Dhx4w] > *Snowflake* > [https://docs.snowflake.com/en/sql-reference/functions/lead.html] > [https://docs.snowflake.com/en/sql-reference/functions/lag.html] > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
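Once supported, the syntax can be exercised directly from Spark SQL. A minimal spark-shell sketch (data and column names are illustrative):

{code:java}
val df = Seq((1, Some("a")), (2, None), (3, Some("c"))).toDF("id", "v")
df.createOrReplaceTempView("t")

spark.sql("""
  SELECT id, LAST_VALUE(v) IGNORE NULLS OVER (ORDER BY id) AS last_non_null
  FROM t
""").show()
// Row id=2 yields "a": the NULL at id=2 is skipped rather than returned.
{code}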
[jira] [Updated] (SPARK-35714) Bug fix for deadlock during the executor shutdown
[ https://issues.apache.org/jira/browse/SPARK-35714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-35714: --- Fix Version/s: 3.2.1 > Bug fix for deadlock during the executor shutdown > - > > Key: SPARK-35714 > URL: https://issues.apache.org/jira/browse/SPARK-35714 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.1.2 >Reporter: Wan Kun >Assignee: Wan Kun >Priority: Minor > Fix For: 3.0.3, 3.2.0, 3.1.3, 3.2.1 > > Attachments: three_thread_lock.log > > > When an executor receives a TERM signal (the second TERM signal), it will lock the > java.lang.Shutdown class and then call the Shutdown.exit() method to exit the JVM. > Shutdown will call SparkShutdownHook to shut down the executor. > During the executor shutdown phase, a RemoteProcessDisconnected event will be > sent to the RPC inbox, and then WorkerWatcher will try to call > System.exit(-1) again. > Because java.lang.Shutdown is already locked, a deadlock occurs. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
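The deadlock is reproducible outside Spark: a second exit initiated from inside a shutdown hook blocks forever, because the first exit already holds the java.lang.Shutdown class lock while it runs the hooks. A standalone sketch:

{code:java}
object ShutdownDeadlockDemo {
  def main(args: Array[String]): Unit = {
    Runtime.getRuntime.addShutdownHook(new Thread(() => {
      println("hook running; calling System.exit again...")
      System.exit(-1)  // never returns: java.lang.Shutdown is already locked
      println("never reached")
    }))
    System.exit(0)  // the first exit takes the Shutdown lock and runs hooks
  }
}
{code}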
[jira] [Resolved] (SPARK-37853) Clean up deprecation compilation warning related to log4j2
[ https://issues.apache.org/jira/browse/SPARK-37853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-37853. --- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35153 [https://github.com/apache/spark/pull/35153] > Clean up deprecation compilation warning related to log4j2 > -- > > Key: SPARK-37853 > URL: https://issues.apache.org/jira/browse/SPARK-37853 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > Fix For: 3.3.0 > > > [WARNING] [Warn] > /spark-source/core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala:64: > [deprecation @ org.apache.spark.util.logging.DriverLogger.addLogAppender.fa > | origin=org.apache.logging.log4j.core.appender.FileAppender.createAppender | > version=] method createAppender in class FileAppender is deprecated > [WARNING] [Warn] > /spark-source/core/src/test/scala/org/apache/spark/SparkFunSuite.scala:268: > [deprecation @ org.apache.spark.SparkFunSuite.LogAppender. | > origin=org.apache.logging.log4j.core.appender.AbstractAppender. | > version=] constructor AbstractAppender in class AbstractAppender is deprecated -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
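For the second warning, the non-deprecated AbstractAppender constructor takes an extra Property array, and passing Property.EMPTY_ARRAY preserves the old behavior. A sketch of that migration (the subclass here is illustrative, not the actual SparkFunSuite code):

{code:java}
import org.apache.logging.log4j.core.LogEvent
import org.apache.logging.log4j.core.appender.AbstractAppender
import org.apache.logging.log4j.core.config.Property

// Illustrative appender mirroring the shape of a test log collector.
class CapturingAppender(name: String)
  extends AbstractAppender(name, null, null, true, Property.EMPTY_ARRAY) {
  override def append(event: LogEvent): Unit = {
    // collect or inspect the event here
  }
}
{code}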
[jira] [Assigned] (SPARK-37853) Clean up deprecation compilation warning related to log4j2
[ https://issues.apache.org/jira/browse/SPARK-37853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-37853: - Assignee: Yang Jie > Clean up deprecation compilation warning related to log4j2 > -- > > Key: SPARK-37853 > URL: https://issues.apache.org/jira/browse/SPARK-37853 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > > [WARNING] [Warn] > /spark-source/core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala:64: > [deprecation @ org.apache.spark.util.logging.DriverLogger.addLogAppender.fa > | origin=org.apache.logging.log4j.core.appender.FileAppender.createAppender | > version=] method createAppender in class FileAppender is deprecated > [WARNING] [Warn] > /spark-source/core/src/test/scala/org/apache/spark/SparkFunSuite.scala:268: > [deprecation @ org.apache.spark.SparkFunSuite.LogAppender. | > origin=org.apache.logging.log4j.core.appender.AbstractAppender. | > version=] constructor AbstractAppender in class AbstractAppender is deprecated -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-34399) Add file commit time to metrics and shown in SQL Tab UI
[ https://issues.apache.org/jira/browse/SPARK-34399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-34399: --- Fix Version/s: 3.2.1 (was: 3.2.0) > Add file commit time to metrics and shown in SQL Tab UI > --- > > Key: SPARK-34399 > URL: https://issues.apache.org/jira/browse/SPARK-34399 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > Fix For: 3.2.1 > > > Add file commit time to metrics and shown in SQL Tab UI -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-30789) Support (IGNORE | RESPECT) NULLS for LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE
[ https://issues.apache.org/jira/browse/SPARK-30789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-30789: --- Fix Version/s: (was: 3.2.0) > Support (IGNORE | RESPECT) NULLS for LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE > -- > > Key: SPARK-30789 > URL: https://issues.apache.org/jira/browse/SPARK-30789 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: jiaan.geng >Assignee: Apache Spark >Priority: Major > Fix For: 3.2.1 > > > All of LEAD/LAG/NTH_VALUE/FIRST_VALUE/LAST_VALUE should support IGNORE NULLS > | RESPECT NULLS. For example: > {code:java} > LEAD (value_expr [, offset ]) > [ IGNORE NULLS | RESPECT NULLS ] > OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering ){code} > > {code:java} > LAG (value_expr [, offset ]) > [ IGNORE NULLS | RESPECT NULLS ] > OVER ( [ PARTITION BY window_partition ] ORDER BY window_ordering ){code} > > {code:java} > NTH_VALUE (expr, offset) > [ IGNORE NULLS | RESPECT NULLS ] > OVER > ( [ PARTITION BY window_partition ] > [ ORDER BY window_ordering > frame_clause ] ){code} > > *Oracle:* > [https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/NTH_VALUE.html#GUID-F8A0E88C-67E5-4AA6-9515-95D03A7F9EA0] > *Redshift* > [https://docs.aws.amazon.com/redshift/latest/dg/r_WF_NTH.html] > *Presto* > [https://prestodb.io/docs/current/functions/window.html] > *DB2* > [https://www.ibm.com/support/knowledgecenter/SSGU8G_14.1.0/com.ibm.sqls.doc/ids_sqs_1513.htm] > *Teradata* > [https://docs.teradata.com/r/756LNiPSFdY~4JcCCcR5Cw/GjCT6l7trjkIEjt~7Dhx4w] > *Snowflake* > [https://docs.snowflake.com/en/sql-reference/functions/lead.html] > [https://docs.snowflake.com/en/sql-reference/functions/lag.html] > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36464) Fix Underlying Size Variable Initialization in ChunkedByteBufferOutputStream for Writing Over 2GB Data
[ https://issues.apache.org/jira/browse/SPARK-36464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-36464: --- Fix Version/s: 3.2.1 (was: 3.2.0) > Fix Underlying Size Variable Initialization in ChunkedByteBufferOutputStream > for Writing Over 2GB Data > -- > > Key: SPARK-36464 > URL: https://issues.apache.org/jira/browse/SPARK-36464 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.8, 3.0.3, 3.1.2, 3.3.0 >Reporter: Kazuyuki Tanimura >Assignee: Kazuyuki Tanimura >Priority: Major > Fix For: 3.1.3, 3.0.4, 3.2.1 > > > The `size` method of `ChunkedByteBufferOutputStream` returns a `Long` value; > however, the underlying `_size` variable is initialized as `Int`. > That causes an overflow, and the method returns a negative size when more than 2GB of data is > written into `ChunkedByteBufferOutputStream`. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
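The overflow is ordinary Int wraparound: counting bytes past 2GB in an Int goes negative even when the getter is typed as Long. A standalone illustration:

{code:java}
var sizeInt: Int = 0     // mirrors the buggy underlying counter
var sizeLong: Long = 0L  // mirrors the fixed one
val oneMiB = 1 << 20

// Simulate writing 2 GiB in 1 MiB chunks.
for (_ <- 0 until 2048) {
  sizeInt += oneMiB
  sizeLong += oneMiB
}
println(sizeInt.toLong)  // -2147483648: the Int counter wrapped
println(sizeLong)        // 2147483648: the correct byte count
{code}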
[jira] [Updated] (SPARK-36464) Fix Underlying Size Variable Initialization in ChunkedByteBufferOutputStream for Writing Over 2GB Data
[ https://issues.apache.org/jira/browse/SPARK-36464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-36464: --- Fix Version/s: 3.2.0 > Fix Underlying Size Variable Initialization in ChunkedByteBufferOutputStream > for Writing Over 2GB Data > -- > > Key: SPARK-36464 > URL: https://issues.apache.org/jira/browse/SPARK-36464 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.8, 3.0.3, 3.1.2, 3.3.0 >Reporter: Kazuyuki Tanimura >Assignee: Kazuyuki Tanimura >Priority: Major > Fix For: 3.2.0, 3.1.3, 3.0.4, 3.2.1 > > > The `size` method of `ChunkedByteBufferOutputStream` returns a `Long` value; > however, the underlying `_size` variable is initialized as `Int`. > That causes an overflow, and the method returns a negative size when more than 2GB of data is > written into `ChunkedByteBufferOutputStream`. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-33277) Python/Pandas UDF right after off-heap vectorized reader could cause executor crash.
[ https://issues.apache.org/jira/browse/SPARK-33277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-33277: --- Fix Version/s: 3.2.1 > Python/Pandas UDF right after off-heap vectorized reader could cause executor > crash. > > > Key: SPARK-33277 > URL: https://issues.apache.org/jira/browse/SPARK-33277 > Project: Spark > Issue Type: Bug > Components: PySpark, SQL >Affects Versions: 2.4.7, 3.0.1 >Reporter: Takuya Ueshin >Assignee: Takuya Ueshin >Priority: Major > Fix For: 2.4.8, 3.0.2, 3.1.0, 3.2.1 > > > Python/Pandas UDF right after off-heap vectorized reader could cause executor > crash. > E.g.,: > {code:python} > from pyspark.sql.functions import udf > from pyspark.sql.types import LongType > path = "/tmp/spark-33277"  # any writable path (placeholder) > spark.range(0, 10, 1, 1).write.parquet(path) > spark.conf.set("spark.sql.columnVector.offheap.enabled", True) > def f(x): >     return 0 > fUdf = udf(f, LongType()) > spark.read.parquet(path).select(fUdf('id')).head() > {code} > This is because the Python evaluation consumes the parent iterator in a > separate thread and it consumes more data from the parent even after the task > ends and the parent is closed. If an off-heap column vector exists in the > parent iterator, it could cause a segmentation fault that crashes the executor. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37853) Clean up deprecation compilation warning related to log4j2
[ https://issues.apache.org/jira/browse/SPARK-37853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-37853: -- Parent: SPARK-37814 Issue Type: Sub-task (was: Improvement) > Clean up deprecation compilation warning related to log4j2 > -- > > Key: SPARK-37853 > URL: https://issues.apache.org/jira/browse/SPARK-37853 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > Fix For: 3.3.0 > > > [WARNING] [Warn] > /spark-source/core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala:64: > [deprecation @ org.apache.spark.util.logging.DriverLogger.addLogAppender.fa > | origin=org.apache.logging.log4j.core.appender.FileAppender.createAppender | > version=] method createAppender in class FileAppender is deprecated > [WARNING] [Warn] > /spark-source/core/src/test/scala/org/apache/spark/SparkFunSuite.scala:268: > [deprecation @ org.apache.spark.SparkFunSuite.LogAppender. | > origin=org.apache.logging.log4j.core.appender.AbstractAppender. | > version=] constructor AbstractAppender in class AbstractAppender is deprecated -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37843) Suppress NoSuchFieldError at setMDCForTask
[ https://issues.apache.org/jira/browse/SPARK-37843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-37843: -- Parent: SPARK-37814 Issue Type: Sub-task (was: Bug) > Suppress NoSuchFieldError at setMDCForTask > -- > > Key: SPARK-37843 > URL: https://issues.apache.org/jira/browse/SPARK-37843 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Fix For: 3.3.0 > > > {code} > 00:57:11 2022-01-07 15:57:11.693 - stderr> Exception in thread "Executor task > launch worker-0" java.lang.NoSuchFieldError: mdc > 00:57:11 2022-01-07 15:57:11.693 - stderr>at > org.apache.log4j.MDCFriend.fixForJava9(MDCFriend.java:11) > 00:57:11 2022-01-07 15:57:11.693 - stderr>at > org.slf4j.impl.Log4jMDCAdapter.(Log4jMDCAdapter.java:38) > 00:57:11 2022-01-07 15:57:11.693 - stderr>at > org.slf4j.impl.StaticMDCBinder.getMDCA(StaticMDCBinder.java:59) > 00:57:11 2022-01-07 15:57:11.693 - stderr>at > org.slf4j.MDC.bwCompatibleGetMDCAdapterFromBinder(MDC.java:99) > 00:57:11 2022-01-07 15:57:11.693 - stderr>at > org.slf4j.MDC.(MDC.java:108) > 00:57:11 2022-01-07 15:57:11.693 - stderr>at > org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$setMDCForTask(Executor.scala:750) > 00:57:11 2022-01-07 15:57:11.693 - stderr>at > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:441) > 00:57:11 2022-01-07 15:57:11.693 - stderr>at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) > 00:57:11 2022-01-07 15:57:11.693 - stderr>at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) > 00:57:11 2022-01-07 15:57:11.693 - stderr>at > java.base/java.lang.Thread.run(Thread.java:833) > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37844) Remove slf4j-log4j12 dependency from hadoop-minikdc
[ https://issues.apache.org/jira/browse/SPARK-37844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-37844: -- Parent: SPARK-37814 Issue Type: Sub-task (was: Bug) > Remove slf4j-log4j12 dependency from hadoop-minikdc > --- > > Key: SPARK-37844 > URL: https://issues.apache.org/jira/browse/SPARK-37844 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, Tests >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36979) Add RewriteLateralSubquery rule into nonExcludableRules
[ https://issues.apache.org/jira/browse/SPARK-36979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-36979: --- Fix Version/s: 3.2.1 (was: 3.2.0) > Add RewriteLateralSubquery rule into nonExcludableRules > --- > > Key: SPARK-36979 > URL: https://issues.apache.org/jira/browse/SPARK-36979 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: XiDuo You >Assignee: XiDuo You >Priority: Minor > Fix For: 3.2.1 > > > A lateral join cannot be planned without the `RewriteLateralSubquery` rule, so if > we set > `spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.RewriteLateralSubquery`, > the lateral join query will fail with: > {code:java} > java.lang.AssertionError: assertion failed: No plan for LateralJoin > lateral-subquery#218 > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
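The failure reproduces in a spark-shell session on an affected build (the query shape below is illustrative):

{code:java}
spark.conf.set("spark.sql.optimizer.excludedRules",
  "org.apache.spark.sql.catalyst.optimizer.RewriteLateralSubquery")
// Any lateral join now fails at planning with
// "java.lang.AssertionError: assertion failed: No plan for LateralJoin ...":
spark.sql("SELECT * FROM VALUES (1) AS t(a), LATERAL (SELECT a + 1)").show()
{code}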
[jira] [Updated] (SPARK-36717) Wrong order of variable initialization may lead to incorrect behavior
[ https://issues.apache.org/jira/browse/SPARK-36717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-36717: --- Fix Version/s: 3.2.1 > Wrong order of variable initialization may lead to incorrect behavior > - > > Key: SPARK-36717 > URL: https://issues.apache.org/jira/browse/SPARK-36717 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.1.2 >Reporter: Jianmeng Li >Assignee: Jianmeng Li >Priority: Minor > Fix For: 3.2.0, 3.1.3, 3.0.4, 3.2.1, 3.3.0 > > > Incorrect order of variable initialization may lead to incorrect behavior. > Related code: > [TorrentBroadcast.scala|https://github.com/apache/spark/blob/0494dc90af48ce7da0625485a4dc6917a244d580/core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala#L94]. > TorrentBroadcast will get the wrong checksumEnabled value after > initialization, which is not what we want; we can move L94 in front of > setConf(SparkEnv.get.conf) to avoid this. In Snippet 1 below, the constructor statement setConf() runs before the initializer of checksumEnabled, so the later `= false` silently overwrites the value that setConf() assigned; declaring the var before the call, as in Snippet 2, fixes the order. > Supplement: > Snippet 1: > {code:java} > class Broadcast { > def setConf(): Unit = { > checksumEnabled = true > } > setConf() > var checksumEnabled = false > } > println(new Broadcast().checksumEnabled){code} > output: > {code:java} > false{code} > Snippet 2: > {code:java} > class Broadcast { > var checksumEnabled = false > def setConf(): Unit = { > checksumEnabled = true > } > setConf() > } > println(new Broadcast().checksumEnabled){code} > output: > {code:java} > true{code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37790) Upgrade SLF4J to 1.7.32
[ https://issues.apache.org/jira/browse/SPARK-37790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-37790: -- Parent: SPARK-37814 Issue Type: Sub-task (was: Improvement) > Upgrade SLF4J to 1.7.32 > --- > > Key: SPARK-37790 > URL: https://issues.apache.org/jira/browse/SPARK-37790 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.3.0 >Reporter: William Hyun >Assignee: William Hyun >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36717) Wrong order of variable initialization may lead to incorrect behavior
[ https://issues.apache.org/jira/browse/SPARK-36717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-36717: --- Fix Version/s: (was: 3.2.0) > Wrong order of variable initialization may lead to incorrect behavior > - > > Key: SPARK-36717 > URL: https://issues.apache.org/jira/browse/SPARK-36717 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.1.2 >Reporter: Jianmeng Li >Assignee: Jianmeng Li >Priority: Minor > Fix For: 3.1.3, 3.0.4, 3.2.1, 3.3.0 > > > Incorrect order of variable initialization may lead to incorrect behavior. > Related code: > [TorrentBroadcast.scala|https://github.com/apache/spark/blob/0494dc90af48ce7da0625485a4dc6917a244d580/core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala#L94]. > TorrentBroadcast will get the wrong checksumEnabled value after > initialization, which is not what we want; we can move L94 in front of > setConf(SparkEnv.get.conf) to avoid this. In Snippet 1 below, the constructor statement setConf() runs before the initializer of checksumEnabled, so the later `= false` silently overwrites the value that setConf() assigned; declaring the var before the call, as in Snippet 2, fixes the order. > Supplement: > Snippet 1: > {code:java} > class Broadcast { > def setConf(): Unit = { > checksumEnabled = true > } > setConf() > var checksumEnabled = false > } > println(new Broadcast().checksumEnabled){code} > output: > {code:java} > false{code} > Snippet 2: > {code:java} > class Broadcast { > var checksumEnabled = false > def setConf(): Unit = { > checksumEnabled = true > } > setConf() > } > println(new Broadcast().checksumEnabled){code} > output: > {code:java} > true{code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37860) [BUG] Revert: Fix taskid in the stage page task event timeline
Jackey Lee created SPARK-37860: -- Summary: [BUG] Revert: Fix taskid in the stage page task event timeline Key: SPARK-37860 URL: https://issues.apache.org/jira/browse/SPARK-37860 Project: Spark Issue Type: Bug Components: Web UI Affects Versions: 3.2.1 Reporter: Jackey Lee In [#32888|https://github.com/apache/spark/pull/32888], [@shahidki31|https://github.com/shahidki31] changed taskInfo.index to taskInfo.taskId. However, we generally use {{index.attempt}} or {{taskId}} to distinguish tasks within a stage, not {{taskId.attempt}}. Thus [#32888|https://github.com/apache/spark/pull/32888] was a wrong fix, and we should revert it. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35746) Task id in the Stage page timeline is incorrect
[ https://issues.apache.org/jira/browse/SPARK-35746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472389#comment-17472389 ] Apache Spark commented on SPARK-35746: -- User 'stczwd' has created a pull request for this issue: https://github.com/apache/spark/pull/35159 > Task id in the Stage page timeline is incorrect > --- > > Key: SPARK-35746 > URL: https://issues.apache.org/jira/browse/SPARK-35746 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 3.0.0, 3.1.2 >Reporter: shahid >Assignee: shahid >Priority: Minor > Attachments: image-2021-06-12-07-03-09-808.png > > > !image-2021-06-12-07-03-09-808.png! -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37159) Change HiveExternalCatalogVersionsSuite to be able to test with Java 17
[ https://issues.apache.org/jira/browse/SPARK-37159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472409#comment-17472409 ] Dongjoon Hyun commented on SPARK-37159: --- I'll collect this to a subtask of SPARK-33772, [~sarutak]. > Change HiveExternalCatalogVersionsSuite to be able to test with Java 17 > --- > > Key: SPARK-37159 > URL: https://issues.apache.org/jira/browse/SPARK-37159 > Project: Spark > Issue Type: Bug > Components: SQL, Tests >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Minor > Fix For: 3.3.0 > > > SPARK-37105 seems to have fixed most of tests in `sql/hive` for Java 17 but > `HiveExternalCatalogVersionsSuite`. > {code} > [info] org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED > *** (42 seconds, 526 milliseconds) > [info] spark-submit returned with exit code 1. > [info] Command line: > '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test-spark-d86af275-0c40-4b47-9cab-defa92a5ffa7/spark-3.2.0/bin/spark-submit' > '--name' 'prepare testing tables' '--master' 'local[2]' '--conf' > 'spark.ui.enabled=false' '--conf' 'spark.master.rest.enabled=false' '--conf' > 'spark.sql.hive.metastore.version=2.3' '--conf' > 'spark.sql.hive.metastore.jars=maven' '--conf' > 'spark.sql.warehouse.dir=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62' > '--conf' 'spark.sql.test.version.index=0' '--driver-java-options' > '-Dderby.system.home=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62' > > '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test15166225869206697603.py' > [info] > [info] 2021-10-28 06:07:18.486 - stderr> Using Spark's default log4j > profile: org/apache/spark/log4j-defaults.properties > [info] 2021-10-28 06:07:18.49 - stderr> 21/10/28 22:07:18 INFO > SparkContext: Running Spark version 3.2.0 > [info] 2021-10-28 06:07:18.537 - stderr> 21/10/28 22:07:18 WARN > NativeCodeLoader: Unable to load native-hadoop library for your platform... > using builtin-java classes where applicable > [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO > ResourceUtils: == > [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO > ResourceUtils: No custom resources configured for spark.driver. 
> [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO > ResourceUtils: == > [info] 2021-10-28 06:07:18.617 - stderr> 21/10/28 22:07:18 INFO > SparkContext: Submitted application: prepare testing tables > [info] 2021-10-28 06:07:18.632 - stderr> 21/10/28 22:07:18 INFO > ResourceProfile: Default ResourceProfile created, executor resources: > Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: > memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: > 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0) > [info] 2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO > ResourceProfile: Limiting resource is cpu > [info] 2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO > ResourceProfileManager: Added ResourceProfile id: 0 > [info] 2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing view acls to: kou > [info] 2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing modify acls to: kou > [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing view acls groups to: > [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing modify acls groups to: > [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: SecurityManager: authentication disabled; ui acls disabled; > users with view permissions: Set(kou); groups with view permissions: Set(); > users with modify permissions: Set(kou); groups with modify permissions: > Set() > [info] 2021-10-28 06:07:18.886 - stderr> 21/10/28 22:07:18 INFO Utils: > Successfully started service 'sparkDriver' on port 35867. > [info] 2021-10-28 06:07:18.906 - stderr> 21/10/28 22:07:18 INFO SparkEnv: > Registering MapOutputTracker > [info] 2021-10-28 06:07:18.93 - stderr> 21/10/28 22:07:18 INFO SparkEnv: > Registering BlockManagerMaster > [info] 2021-10-28 06:07:18.943 - stderr> 21/10/28 22
[jira] [Updated] (SPARK-37159) Change HiveExternalCatalogVersionsSuite to be able to test with Java 17
[ https://issues.apache.org/jira/browse/SPARK-37159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-37159: -- Parent: SPARK-33772 Issue Type: Sub-task (was: Bug) > Change HiveExternalCatalogVersionsSuite to be able to test with Java 17 > --- > > Key: SPARK-37159 > URL: https://issues.apache.org/jira/browse/SPARK-37159 > Project: Spark > Issue Type: Sub-task > Components: SQL, Tests >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Minor > Fix For: 3.3.0 > > > SPARK-37105 seems to have fixed most of tests in `sql/hive` for Java 17 but > `HiveExternalCatalogVersionsSuite`. > {code} > [info] org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED > *** (42 seconds, 526 milliseconds) > [info] spark-submit returned with exit code 1. > [info] Command line: > '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test-spark-d86af275-0c40-4b47-9cab-defa92a5ffa7/spark-3.2.0/bin/spark-submit' > '--name' 'prepare testing tables' '--master' 'local[2]' '--conf' > 'spark.ui.enabled=false' '--conf' 'spark.master.rest.enabled=false' '--conf' > 'spark.sql.hive.metastore.version=2.3' '--conf' > 'spark.sql.hive.metastore.jars=maven' '--conf' > 'spark.sql.warehouse.dir=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62' > '--conf' 'spark.sql.test.version.index=0' '--driver-java-options' > '-Dderby.system.home=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62' > > '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test15166225869206697603.py' > [info] > [info] 2021-10-28 06:07:18.486 - stderr> Using Spark's default log4j > profile: org/apache/spark/log4j-defaults.properties > [info] 2021-10-28 06:07:18.49 - stderr> 21/10/28 22:07:18 INFO > SparkContext: Running Spark version 3.2.0 > [info] 2021-10-28 06:07:18.537 - stderr> 21/10/28 22:07:18 WARN > NativeCodeLoader: Unable to load native-hadoop library for your platform... > using builtin-java classes where applicable > [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO > ResourceUtils: == > [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO > ResourceUtils: No custom resources configured for spark.driver. 
> [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO > ResourceUtils: == > [info] 2021-10-28 06:07:18.617 - stderr> 21/10/28 22:07:18 INFO > SparkContext: Submitted application: prepare testing tables > [info] 2021-10-28 06:07:18.632 - stderr> 21/10/28 22:07:18 INFO > ResourceProfile: Default ResourceProfile created, executor resources: > Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: > memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: > 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0) > [info] 2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO > ResourceProfile: Limiting resource is cpu > [info] 2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO > ResourceProfileManager: Added ResourceProfile id: 0 > [info] 2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing view acls to: kou > [info] 2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing modify acls to: kou > [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing view acls groups to: > [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing modify acls groups to: > [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: SecurityManager: authentication disabled; ui acls disabled; > users with view permissions: Set(kou); groups with view permissions: Set(); > users with modify permissions: Set(kou); groups with modify permissions: > Set() > [info] 2021-10-28 06:07:18.886 - stderr> 21/10/28 22:07:18 INFO Utils: > Successfully started service 'sparkDriver' on port 35867. > [info] 2021-10-28 06:07:18.906 - stderr> 21/10/28 22:07:18 INFO SparkEnv: > Registering MapOutputTracker > [info] 2021-10-28 06:07:18.93 - stderr> 21/10/28 22:07:18 INFO SparkEnv: > Registering BlockManagerMaster > [info] 2021-10-28 06:07:18.943 - stderr> 21/10/28 22:07:18 INFO > BlockManagerMasterEndpoint: Usin
[jira] [Commented] (SPARK-37860) [BUG] Revert: Fix taskid in the stage page task event timeline
[ https://issues.apache.org/jira/browse/SPARK-37860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472420#comment-17472420 ] Apache Spark commented on SPARK-37860: -- User 'stczwd' has created a pull request for this issue: https://github.com/apache/spark/pull/35160 > [BUG] Revert: Fix taskid in the stage page task event timeline > -- > > Key: SPARK-37860 > URL: https://issues.apache.org/jira/browse/SPARK-37860 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 3.2.1 >Reporter: Jackey Lee >Priority: Major > > In [#32888|https://github.com/apache/spark/pull/32888], > [@shahidki31|https://github.com/shahidki31] changed taskInfo.index to > taskInfo.taskId. However, we generally use {{index.attempt}} or {{taskId}} to > distinguish tasks within a stage, not {{taskId.attempt}}. > Thus [#32888|https://github.com/apache/spark/pull/32888] was a wrong fix, > and we should revert it. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37860) [BUG] Revert: Fix taskid in the stage page task event timeline
[ https://issues.apache.org/jira/browse/SPARK-37860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37860: Assignee: (was: Apache Spark) > [BUG] Revert: Fix taskid in the stage page task event timeline > -- > > Key: SPARK-37860 > URL: https://issues.apache.org/jira/browse/SPARK-37860 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 3.2.1 >Reporter: Jackey Lee >Priority: Major > > In [#32888|https://github.com/apache/spark/pull/32888], > [@shahidki31|https://github.com/shahidki31] changed taskInfo.index to > taskInfo.taskId. However, we generally use {{index.attempt}} or {{taskId}} to > distinguish tasks within a stage, not {{taskId.attempt}}. > Thus [#32888|https://github.com/apache/spark/pull/32888] was a wrong fix, > and we should revert it. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37860) [BUG] Revert: Fix taskid in the stage page task event timeline
[ https://issues.apache.org/jira/browse/SPARK-37860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37860: Assignee: Apache Spark > [BUG] Revert: Fix taskid in the stage page task event timeline > -- > > Key: SPARK-37860 > URL: https://issues.apache.org/jira/browse/SPARK-37860 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 3.2.1 >Reporter: Jackey Lee >Assignee: Apache Spark >Priority: Major > > In [#32888|https://github.com/apache/spark/pull/32888], > [@shahidki31|https://github.com/shahidki31] changed taskInfo.index to > taskInfo.taskId. However, we generally use {{index.attempt}} or {{taskId}} to > distinguish tasks within a stage, not {{taskId.attempt}}. > Thus [#32888|https://github.com/apache/spark/pull/32888] was a wrong fix, > and we should revert it. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37860) [BUG] Revert: Fix taskid in the stage page task event timeline
[ https://issues.apache.org/jira/browse/SPARK-37860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472421#comment-17472421 ] Apache Spark commented on SPARK-37860: -- User 'stczwd' has created a pull request for this issue: https://github.com/apache/spark/pull/35160 > [BUG] Revert: Fix taskid in the stage page task event timeline > -- > > Key: SPARK-37860 > URL: https://issues.apache.org/jira/browse/SPARK-37860 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 3.2.1 >Reporter: Jackey Lee >Priority: Major > > In [#32888|https://github.com/apache/spark/pull/32888], > [@shahidki31|https://github.com/shahidki31] changed taskInfo.index to > taskInfo.taskId. However, we generally use {{index.attempt}} or {{taskId}} to > distinguish tasks within a stage, not {{taskId.attempt}}. > Thus [#32888|https://github.com/apache/spark/pull/32888] was a wrong fix, > and we should revert it. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37159) Change HiveExternalCatalogVersionsSuite to be able to test with Java 17
[ https://issues.apache.org/jira/browse/SPARK-37159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472434#comment-17472434 ] Kousuke Saruta commented on SPARK-37159: All right. Thank you [~dongjoon]! > Change HiveExternalCatalogVersionsSuite to be able to test with Java 17 > --- > > Key: SPARK-37159 > URL: https://issues.apache.org/jira/browse/SPARK-37159 > Project: Spark > Issue Type: Sub-task > Components: SQL, Tests >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Minor > Fix For: 3.3.0 > > > SPARK-37105 seems to have fixed most of tests in `sql/hive` for Java 17 but > `HiveExternalCatalogVersionsSuite`. > {code} > [info] org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED > *** (42 seconds, 526 milliseconds) > [info] spark-submit returned with exit code 1. > [info] Command line: > '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test-spark-d86af275-0c40-4b47-9cab-defa92a5ffa7/spark-3.2.0/bin/spark-submit' > '--name' 'prepare testing tables' '--master' 'local[2]' '--conf' > 'spark.ui.enabled=false' '--conf' 'spark.master.rest.enabled=false' '--conf' > 'spark.sql.hive.metastore.version=2.3' '--conf' > 'spark.sql.hive.metastore.jars=maven' '--conf' > 'spark.sql.warehouse.dir=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62' > '--conf' 'spark.sql.test.version.index=0' '--driver-java-options' > '-Dderby.system.home=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62' > > '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test15166225869206697603.py' > [info] > [info] 2021-10-28 06:07:18.486 - stderr> Using Spark's default log4j > profile: org/apache/spark/log4j-defaults.properties > [info] 2021-10-28 06:07:18.49 - stderr> 21/10/28 22:07:18 INFO > SparkContext: Running Spark version 3.2.0 > [info] 2021-10-28 06:07:18.537 - stderr> 21/10/28 22:07:18 WARN > NativeCodeLoader: Unable to load native-hadoop library for your platform... > using builtin-java classes where applicable > [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO > ResourceUtils: == > [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO > ResourceUtils: No custom resources configured for spark.driver. 
> [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO > ResourceUtils: == > [info] 2021-10-28 06:07:18.617 - stderr> 21/10/28 22:07:18 INFO > SparkContext: Submitted application: prepare testing tables > [info] 2021-10-28 06:07:18.632 - stderr> 21/10/28 22:07:18 INFO > ResourceProfile: Default ResourceProfile created, executor resources: > Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: > memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: > 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0) > [info] 2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO > ResourceProfile: Limiting resource is cpu > [info] 2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO > ResourceProfileManager: Added ResourceProfile id: 0 > [info] 2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing view acls to: kou > [info] 2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing modify acls to: kou > [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing view acls groups to: > [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing modify acls groups to: > [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: SecurityManager: authentication disabled; ui acls disabled; > users with view permissions: Set(kou); groups with view permissions: Set(); > users with modify permissions: Set(kou); groups with modify permissions: > Set() > [info] 2021-10-28 06:07:18.886 - stderr> 21/10/28 22:07:18 INFO Utils: > Successfully started service 'sparkDriver' on port 35867. > [info] 2021-10-28 06:07:18.906 - stderr> 21/10/28 22:07:18 INFO SparkEnv: > Registering MapOutputTracker > [info] 2021-10-28 06:07:18.93 - stderr> 21/10/28 22:07:18 INFO SparkEnv: > Registering BlockManagerMaster > [info] 2021-10-28 06:07:18.943 - stderr> 21/10/28 22:07:18 INFO > Blo
[jira] [Resolved] (SPARK-37847) PushBlockStreamCallback should check isTooLate first to avoid NPE
[ https://issues.apache.org/jira/browse/SPARK-37847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mridul Muralidharan resolved SPARK-37847. - Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35146 [https://github.com/apache/spark/pull/35146] > PushBlockStreamCallback should check isTooLate first to avoid NPE > - > > Key: SPARK-37847 > URL: https://issues.apache.org/jira/browse/SPARK-37847 > Project: Spark > Issue Type: Sub-task > Components: Shuffle, Spark Core >Affects Versions: 3.2.1, 3.3.0 >Reporter: Cheng Pan >Assignee: Cheng Pan >Priority: Major > Fix For: 3.3.0 > > > {code:java} > 2022-01-07 21:06:14,464 INFO shuffle.RemoteBlockPushResolver: shuffle > partition application_1640143179334_0149_-1 102 6922, chunk_size=1, > meta_length=138, data_length=112632 > 2022-01-07 21:06:14,615 ERROR shuffle.RemoteBlockPushResolver: Encountered > issue when merging shufflePush_102_0_279_6922 > java.lang.NullPointerException > at > org.apache.spark.network.shuffle.RemoteBlockPushResolver$AppShuffleMergePartitionsInfo.access$200(RemoteBlockPushResolver.java:1017) > at > org.apache.spark.network.shuffle.RemoteBlockPushResolver$PushBlockStreamCallback.isStale(RemoteBlockPushResolver.java:806) > at > org.apache.spark.network.shuffle.RemoteBlockPushResolver$PushBlockStreamCallback.onData(RemoteBlockPushResolver.java:840) > at > org.apache.spark.network.server.TransportRequestHandler$3.onData(TransportRequestHandler.java:209) > at > org.apache.spark.network.client.StreamInterceptor.handle(StreamInterceptor.java:79) > at > org.apache.spark.network.util.TransportFrameDecoder.feedInterceptor(TransportFrameDecoder.java:263) > at > org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:87) > at > org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) > at > org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) > at > org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) > at > org.sparkproject.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) > at > org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) > at > org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) > at > org.sparkproject.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) > at > org.sparkproject.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) > at > org.sparkproject.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722) > at > org.sparkproject.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658) > at > org.sparkproject.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584) > at > org.sparkproject.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) > at > org.sparkproject.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) > at > org.sparkproject.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) > at > org.sparkproject.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.lang.Thread.run(Thread.java:748) > {code} -- 
This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
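A self-contained Scala sketch of the ordering fix described above. The names are simplified stand-ins for the Java code in RemoteBlockPushResolver, not the actual patch:

{code:scala}
// A finalized shuffle merge leaves the partition info null, so the
// "too late" (null) check must run before anything that dereferences it.
final class MergePartitionsInfo(val shuffleMergeId: Int)

object PushBlockSketch {
  // Dereferences info: throws NullPointerException if info is null.
  def isStale(info: MergePartitionsInfo, currentMergeId: Int): Boolean =
    info.shuffleMergeId < currentMergeId

  // Checking isTooLate first short-circuits before isStale can NPE,
  // which is the essence of the fix in pull request 35146.
  def shouldAccept(info: MergePartitionsInfo, currentMergeId: Int): Boolean = {
    val isTooLate = info == null
    !isTooLate && !isStale(info, currentMergeId)
  }
}
{code}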
[jira] [Assigned] (SPARK-37847) PushBlockStreamCallback should check isTooLate first to avoid NPE
[ https://issues.apache.org/jira/browse/SPARK-37847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mridul Muralidharan reassigned SPARK-37847: --- Assignee: Cheng Pan > PushBlockStreamCallback should check isTooLate first to avoid NPE > - > > Key: SPARK-37847 > URL: https://issues.apache.org/jira/browse/SPARK-37847 > Project: Spark > Issue Type: Sub-task > Components: Shuffle, Spark Core >Affects Versions: 3.2.1, 3.3.0 >Reporter: Cheng Pan >Assignee: Cheng Pan >Priority: Major > > {code:java} > 2022-01-07 21:06:14,464 INFO shuffle.RemoteBlockPushResolver: shuffle > partition application_1640143179334_0149_-1 102 6922, chunk_size=1, > meta_length=138, data_length=112632 > 2022-01-07 21:06:14,615 ERROR shuffle.RemoteBlockPushResolver: Encountered > issue when merging shufflePush_102_0_279_6922 > java.lang.NullPointerException > at > org.apache.spark.network.shuffle.RemoteBlockPushResolver$AppShuffleMergePartitionsInfo.access$200(RemoteBlockPushResolver.java:1017) > at > org.apache.spark.network.shuffle.RemoteBlockPushResolver$PushBlockStreamCallback.isStale(RemoteBlockPushResolver.java:806) > at > org.apache.spark.network.shuffle.RemoteBlockPushResolver$PushBlockStreamCallback.onData(RemoteBlockPushResolver.java:840) > at > org.apache.spark.network.server.TransportRequestHandler$3.onData(TransportRequestHandler.java:209) > at > org.apache.spark.network.client.StreamInterceptor.handle(StreamInterceptor.java:79) > at > org.apache.spark.network.util.TransportFrameDecoder.feedInterceptor(TransportFrameDecoder.java:263) > at > org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:87) > at > org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) > at > org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) > at > org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) > at > org.sparkproject.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) > at > org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) > at > org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) > at > org.sparkproject.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) > at > org.sparkproject.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) > at > org.sparkproject.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722) > at > org.sparkproject.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658) > at > org.sparkproject.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584) > at > org.sparkproject.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) > at > org.sparkproject.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) > at > org.sparkproject.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) > at > org.sparkproject.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org 
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37861) How to reload UDF jar classes without restarting the driver and executor
Jinpeng Chi created SPARK-37861: --- Summary: How to reload UDF jar classes without restarting the driver and executor Key: SPARK-37861 URL: https://issues.apache.org/jira/browse/SPARK-37861 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.1.2 Reporter: Jinpeng Chi How to reload UDF jar classes without restarting the driver and executor -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37840) Dynamically update the loaded Hive UDF JAR
[ https://issues.apache.org/jira/browse/SPARK-37840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472471#comment-17472471 ] Jinpeng Chi commented on SPARK-37840: - [~hyukjin.kwon] No > Dynamically update the loaded Hive UDF JAR > -- > > Key: SPARK-37840 > URL: https://issues.apache.org/jira/browse/SPARK-37840 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: melin >Priority: Major > > In the production environment, Spark ThriftServer needs to be restarted if > jar files are updated after UDF files are loaded. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37860) [BUG] Revert: Fix taskid in the stage page task event timeline
[ https://issues.apache.org/jira/browse/SPARK-37860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37860. Fix Version/s: 3.1.3 3.0.4 3.2.1 3.3.0 Assignee: Jackey Lee Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/35160 > [BUG] Revert: Fix taskid in the stage page task event timeline > -- > > Key: SPARK-37860 > URL: https://issues.apache.org/jira/browse/SPARK-37860 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 3.2.1 >Reporter: Jackey Lee >Assignee: Jackey Lee >Priority: Major > Fix For: 3.1.3, 3.0.4, 3.2.1, 3.3.0 > > > In [#32888|https://github.com/apache/spark/pull/32888], > [@shahidki31|https://github.com/shahidki31] changed taskInfo.index to > taskInfo.taskId. However, we generally use {{index.attempt}} or {{taskId}} to > distinguish tasks within a stage, not {{taskId.attempt}}. > Thus [#32888|https://github.com/apache/spark/pull/32888] was a wrong fix, > and we should revert it. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37860) [BUG] Revert: Fix taskid in the stage page task event timeline
[ https://issues.apache.org/jira/browse/SPARK-37860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472487#comment-17472487 ] Kousuke Saruta commented on SPARK-37860: Note: If the Spark 3.2.1 RC1 vote passes, the 3.2.1 fix version should be replaced with 3.2.2. > [BUG] Revert: Fix taskid in the stage page task event timeline > -- > > Key: SPARK-37860 > URL: https://issues.apache.org/jira/browse/SPARK-37860 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 3.2.1 >Reporter: Jackey Lee >Assignee: Jackey Lee >Priority: Major > Fix For: 3.1.3, 3.0.4, 3.2.1, 3.3.0 > > > In [#32888|https://github.com/apache/spark/pull/32888], > [@shahidki31|https://github.com/shahidki31] changed taskInfo.index to > taskInfo.taskId. However, we generally use {{index.attempt}} or {{taskId}} to > distinguish tasks within a stage, not {{taskId.attempt}}. > Thus [#32888|https://github.com/apache/spark/pull/32888] was a wrong fix, > and we should revert it. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37840) Dynamically update the loaded Hive UDF JAR
[ https://issues.apache.org/jira/browse/SPARK-37840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472488#comment-17472488 ] Hyukjin Kwon commented on SPARK-37840: -- ADD FILE|JAR|ARCHIVE tracks timestamps IIRC, and they can pick up newer files. What's the current behaviour? > Dynamically update the loaded Hive UDF JAR > -- > > Key: SPARK-37840 > URL: https://issues.apache.org/jira/browse/SPARK-37840 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: melin >Priority: Major > > In the production environment, Spark ThriftServer needs to be restarted if > jar files are updated after UDF files are loaded. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
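For readers following the thread, the commands under discussion look roughly like this. A minimal sketch: the jar path and UDF class name are placeholders, and the non-reload behaviour noted in the comments is the limitation the ticket reports, not guaranteed semantics:

{code:scala}
import org.apache.spark.sql.SparkSession

object AddJarSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("add-jar-sketch").getOrCreate()

    // ADD JAR puts the jar on the session's classpath; the path is a placeholder.
    spark.sql("ADD JAR /tmp/udfs/my-udfs.jar")
    spark.sql("CREATE TEMPORARY FUNCTION my_upper AS 'com.example.MyUpperUDF'")
    spark.sql("SELECT my_upper('hello')").show()

    // Re-issuing ADD JAR after replacing the file on disk does not swap out
    // classes the JVM classloader has already resolved, so the old UDF body
    // keeps running until the server restarts -- the problem reported here.
    spark.sql("ADD JAR /tmp/udfs/my-udfs.jar")

    spark.stop()
  }
}
{code}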
[jira] [Created] (SPARK-37862) RecordBinaryComparator should fast skip the check of aligning with unaligned platform
XiDuo You created SPARK-37862: - Summary: RecordBinaryComparator should fast skip the check of aligning with unaligned platform Key: SPARK-37862 URL: https://issues.apache.org/jira/browse/SPARK-37862 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.3.0 Reporter: XiDuo You Same as SPARK-37796. It would be better to fast-skip the alignment check if the platform supports unaligned access. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
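A minimal Scala sketch of the idea: the real comparator is Java code in Spark's SQL execution package, and this only illustrates gating the per-record check on a once-computed flag. `Platform.unaligned()` is assumed here to be the existing helper in org.apache.spark.unsafe:

{code:scala}
import org.apache.spark.unsafe.Platform

object AlignmentSketch {
  // Computed once: whether this JVM/architecture allows unaligned 8-byte reads.
  private val unaligned: Boolean = Platform.unaligned()

  // On unaligned-capable platforms the word-at-a-time comparison is always
  // usable, so the per-record offset check can be skipped entirely.
  def canCompareByWords(leftOffset: Long, rightOffset: Long): Boolean =
    unaligned || (leftOffset % 8 == 0 && rightOffset % 8 == 0)
}
{code}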
[jira] [Created] (SPARK-37863) Add submitTime for Spark Application
Zhou Yifan created SPARK-37863: -- Summary: Add submitTime for Spark Application Key: SPARK-37863 URL: https://issues.apache.org/jira/browse/SPARK-37863 Project: Spark Issue Type: New Feature Components: Spark Submit Affects Versions: 3.3.0 Reporter: Zhou Yifan Spark Driver may take a long time to start up when running in cluster mode, if YARN temporarily does not have enough resources, or HDFS performance is poor due to high load. If `submitTime` is added, we can detect such a situation by comparing it with `spark.app.startTime` (which already exists). -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
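A sketch of how the proposed property could be used. `spark.app.submitTime` is only this ticket's proposal (hypothetical), while `spark.app.startTime` already exists; error handling for a missing property is omitted:

{code:scala}
import org.apache.spark.sql.SparkSession

object SubmitDelaySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("submit-delay-sketch").getOrCreate()
    val conf = spark.sparkContext.getConf

    val startTime  = conf.get("spark.app.startTime").toLong   // existing property
    val submitTime = conf.get("spark.app.submitTime").toLong  // proposed in this ticket

    // A large gap suggests the driver waited on YARN resources or slow HDFS.
    println(s"Driver waited ${startTime - submitTime} ms between submit and start")
    spark.stop()
  }
}
{code}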
[jira] [Updated] (SPARK-37863) Add submitTime for Spark Application
[ https://issues.apache.org/jira/browse/SPARK-37863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhou Yifan updated SPARK-37863: --- Description: Spark Driver may take a long time to start up when running in cluster mode, if YARN temporarily does not have enough resources, or HDFS performance is poor due to high load. After `submitTime` is added, we can detect such a situation by comparing it with `spark.app.startTime` (which already exists). was: Spark Driver may take a long time to start up when running in cluster mode, if YARN temporarily does not have enough resources, or HDFS performance is poor due to high load. If `submitTime` is added, we can detect such a situation by comparing it with `spark.app.startTime` (which already exists). > Add submitTime for Spark Application > > > Key: SPARK-37863 > URL: https://issues.apache.org/jira/browse/SPARK-37863 > Project: Spark > Issue Type: New Feature > Components: Spark Submit >Affects Versions: 3.3.0 >Reporter: Zhou Yifan >Priority: Major > > Spark Driver may take a long time to start up when running in cluster mode, > if YARN temporarily does not have enough resources, or HDFS performance is > poor due to high load. > After `submitTime` is added, we can detect such a situation by comparing it with > `spark.app.startTime` (which already exists). -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37862) RecordBinaryComparator should fast skip the check of aligning with unaligned platform
[ https://issues.apache.org/jira/browse/SPARK-37862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472493#comment-17472493 ] Apache Spark commented on SPARK-37862: -- User 'ulysses-you' has created a pull request for this issue: https://github.com/apache/spark/pull/35161 > RecordBinaryComparator should fast skip the check of aligning with unaligned > platform > - > > Key: SPARK-37862 > URL: https://issues.apache.org/jira/browse/SPARK-37862 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: XiDuo You >Priority: Major > > Same as SPARK-37796. > It would be better to fast-skip the alignment check if the platform supports > unaligned access. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37862) RecordBinaryComparator should fast skip the check of aligning with unaligned platform
[ https://issues.apache.org/jira/browse/SPARK-37862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37862: Assignee: (was: Apache Spark) > RecordBinaryComparator should fast skip the check of aligning with unaligned > platform > - > > Key: SPARK-37862 > URL: https://issues.apache.org/jira/browse/SPARK-37862 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: XiDuo You >Priority: Major > > Same as SPARK-37796. > It would be better to fast-skip the alignment check if the platform supports > unaligned access. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37862) RecordBinaryComparator should fast skip the check of aligning with unaligned platform
[ https://issues.apache.org/jira/browse/SPARK-37862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37862: Assignee: Apache Spark > RecordBinaryComparator should fast skip the check of aligning with unaligned > platform > - > > Key: SPARK-37862 > URL: https://issues.apache.org/jira/browse/SPARK-37862 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: XiDuo You >Assignee: Apache Spark >Priority: Major > > Same as SPARK-37796. > It would be better to fast-skip the alignment check if the platform supports > unaligned access. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37863) Add submitTime for Spark Application
[ https://issues.apache.org/jira/browse/SPARK-37863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37863: Assignee: Apache Spark > Add submitTime for Spark Application > > > Key: SPARK-37863 > URL: https://issues.apache.org/jira/browse/SPARK-37863 > Project: Spark > Issue Type: New Feature > Components: Spark Submit >Affects Versions: 3.3.0 >Reporter: Zhou Yifan >Assignee: Apache Spark >Priority: Major > > Spark Driver may take a long time to start up when running in cluster mode, > if YARN temporarily does not have enough resources, or HDFS performance is > poor due to high load. > After `submitTime` is added, we can detect such a situation by comparing it with > `spark.app.startTime` (which already exists). -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37863) Add submitTime for Spark Application
[ https://issues.apache.org/jira/browse/SPARK-37863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472507#comment-17472507 ] Apache Spark commented on SPARK-37863: -- User 'zhouyifan279' has created a pull request for this issue: https://github.com/apache/spark/pull/35162 > Add submitTime for Spark Application > > > Key: SPARK-37863 > URL: https://issues.apache.org/jira/browse/SPARK-37863 > Project: Spark > Issue Type: New Feature > Components: Spark Submit >Affects Versions: 3.3.0 >Reporter: Zhou Yifan >Priority: Major > > Spark Driver may take a long time to start up when running in cluster mode, > if YARN temporarily does not have enough resources, or HDFS performance is > poor due to high load. > After `submitTime` is added, we can detect such a situation by comparing it with > `spark.app.startTime` (which already exists). -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37863) Add submitTime for Spark Application
[ https://issues.apache.org/jira/browse/SPARK-37863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-37863: Assignee: (was: Apache Spark) > Add submitTime for Spark Application > > > Key: SPARK-37863 > URL: https://issues.apache.org/jira/browse/SPARK-37863 > Project: Spark > Issue Type: New Feature > Components: Spark Submit >Affects Versions: 3.3.0 >Reporter: Zhou Yifan >Priority: Major > > Spark Driver may take a long time to start up when running in cluster mode, > if YARN temporarily does not have enough resources, or HDFS performance is > poor due to high load. > After `submitTime` is added, we can detect such a situation by comparing it with > `spark.app.startTime` (which already exists). -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37840) Dynamically update the loaded Hive UDF JAR
[ https://issues.apache.org/jira/browse/SPARK-37840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472519#comment-17472519 ] Jinpeng Chi commented on SPARK-37840: - Spark does not currently support deleting a jar > Dynamically update the loaded Hive UDF JAR > -- > > Key: SPARK-37840 > URL: https://issues.apache.org/jira/browse/SPARK-37840 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: melin >Priority: Major > > In the production environment, Spark ThriftServer needs to be restarted if > jar files are updated after UDF files are loaded. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37840) Dynamically update the loaded Hive UDF JAR
[ https://issues.apache.org/jira/browse/SPARK-37840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472521#comment-17472521 ] Jinpeng Chi commented on SPARK-37840: - [~hyukjin.kwon] But classes are still loaded from the previous jar > Dynamically update the loaded Hive UDF JAR > -- > > Key: SPARK-37840 > URL: https://issues.apache.org/jira/browse/SPARK-37840 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: melin >Priority: Major > > In the production environment, Spark ThriftServer needs to be restarted if > jar files are updated after UDF files are loaded. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org