[jira] [Updated] (SPARK-45370) Fix python test when ansi mode enabled

2023-09-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45370:
---
Labels: pull-request-available  (was: )

> Fix python test when ansi mode enabled
> --
>
> Key: SPARK-45370
> URL: https://issues.apache.org/jira/browse/SPARK-45370
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark, Tests
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-45364) Clean up the unnecessary Scala 2.12 logic in SparkBuild

2023-09-28 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen resolved SPARK-45364.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43158
[https://github.com/apache/spark/pull/43158]

> Clean up the unnecessary Scala 2.12 logic in SparkBuild
> -
>
> Key: SPARK-45364
> URL: https://issues.apache.org/jira/browse/SPARK-45364
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build, Project Infra
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Updated] (SPARK-45364) Clean up the unnecessary Scala 2.12 logic in SparkBuild

2023-09-28 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen updated SPARK-45364:
-
Priority: Trivial  (was: Minor)

> Clean up the unnecessary Scala 2.12 logic in SparkBuild
> -
>
> Key: SPARK-45364
> URL: https://issues.apache.org/jira/browse/SPARK-45364
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build, Project Infra
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-45364) Clean up the unnecessary Scala 2.12 logic in SparkBuild

2023-09-28 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen reassigned SPARK-45364:


Assignee: BingKun Pan

> Clean up the unnecessary Scala 2.12 logic in SparkBuild
> -
>
> Key: SPARK-45364
> URL: https://issues.apache.org/jira/browse/SPARK-45364
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build, Project Infra
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-45330) Upgrade ammonite to 2.6.0

2023-09-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45330:
---
Labels: pull-request-available  (was: )

> Upgrade ammonite to 2.6.0
> -
>
> Key: SPARK-45330
> URL: https://issues.apache.org/jira/browse/SPARK-45330
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-45330) Upgrade ammonite to 2.5.11

2023-09-28 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-45330:
-
Summary: Upgrade ammonite to 2.5.11  (was: Upgrade ammonite to 2.6.0)

> Upgrade ammonite to 2.5.11
> --
>
> Key: SPARK-45330
> URL: https://issues.apache.org/jira/browse/SPARK-45330
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-45360) Initialize spark session builder configuration from SPARK_REMOTE

2023-09-28 Thread Herman van Hövell (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Herman van Hövell resolved SPARK-45360.
---
Fix Version/s: 4.0.0
   3.5.1
 Assignee: Yihong He
   Resolution: Fixed

> Initialize spark session builder configuration from SPARK_REMOTE
> 
>
> Key: SPARK-45360
> URL: https://issues.apache.org/jira/browse/SPARK-45360
> Project: Spark
>  Issue Type: New Feature
>  Components: Connect
>Affects Versions: 3.5.0, 4.0.0
>Reporter: Yihong He
>Assignee: Yihong He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.5.1
>
>
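For context, a Spark Connect client identifies its server through a connection string of the form sc://host:port, optionally followed by /;key=value parameters, and SPARK_REMOTE carries that string in the environment. The sketch below is illustrative only, not the actual client code; the default port 15002 is an assumption based on Spark Connect's documented default:

```python
import os

def parse_spark_remote(remote=None):
    """Parse a Spark Connect connection string like
    'sc://host:15002/;token=abc' into (host, port, params).
    Falls back to the SPARK_REMOTE environment variable."""
    remote = remote or os.environ.get("SPARK_REMOTE", "")
    if not remote.startswith("sc://"):
        raise ValueError(f"not a Spark Connect URI: {remote!r}")
    rest = remote[len("sc://"):]
    netloc, _, param_str = rest.partition("/;")
    host, _, port = netloc.rstrip("/").partition(":")
    params = dict(kv.split("=", 1) for kv in param_str.split(";") if kv)
    return host, int(port) if port else 15002, params
```

A session builder could call such a helper first and let explicit builder options override the parsed values.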







[jira] [Resolved] (SPARK-45266) Refactor ResolveFunctions analyzer rule to delay making lateral join when table arguments are used

2023-09-28 Thread Takuya Ueshin (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takuya Ueshin resolved SPARK-45266.
---
Fix Version/s: 4.0.0
 Assignee: Takuya Ueshin
   Resolution: Fixed

Issue resolved by pull request 43042
https://github.com/apache/spark/pull/43042

> Refactor ResolveFunctions analyzer rule to delay making lateral join when 
> table arguments are used
> --
>
> Key: SPARK-45266
> URL: https://issues.apache.org/jira/browse/SPARK-45266
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 4.0.0
>Reporter: Takuya Ueshin
>Assignee: Takuya Ueshin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Created] (SPARK-45371) Fix shading problem in Spark Connect

2023-09-28 Thread Herman van Hövell (Jira)
Herman van Hövell created SPARK-45371:
-

 Summary: Fix shading problem in Spark Connect
 Key: SPARK-45371
 URL: https://issues.apache.org/jira/browse/SPARK-45371
 Project: Spark
  Issue Type: New Feature
  Components: Connect
Affects Versions: 3.5.0
Reporter: Herman van Hövell
Assignee: Herman van Hövell


See: 
https://stackoverflow.com/questions/77151840/spark-connect-client-failing-with-java-lang-noclassdeffounderror






[jira] [Created] (SPARK-45372) Handle ClassNotFound when load extension

2023-09-28 Thread Zhongwei Zhu (Jira)
Zhongwei Zhu created SPARK-45372:


 Summary: Handle ClassNotFound when load extension
 Key: SPARK-45372
 URL: https://issues.apache.org/jira/browse/SPARK-45372
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.5.0
Reporter: Zhongwei Zhu


When loading an extension fails with a ClassNotFoundException, SparkContext fails 
to initialize. A better behavior would be to skip that extension and log an error 
without failing SparkContext initialization.
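The proposed skip-and-log behavior can be sketched in Python (the real change lives in Spark Core's Scala extension loading; the helper below and its name are hypothetical):

```python
import logging

logger = logging.getLogger("spark.extensions")

def load_extensions(class_names):
    """Load each extension class by name; skip and log on failure
    instead of aborting initialization (hypothetical helper)."""
    loaded = []
    for name in class_names:
        try:
            module_name, _, cls_name = name.rpartition(".")
            module = __import__(module_name, fromlist=[cls_name])
            cls = getattr(module, cls_name)
        except (ImportError, AttributeError, ValueError) as e:
            # Analogue of catching ClassNotFoundException: log and continue.
            logger.error("Skipping unloadable extension %s: %s", name, e)
            continue
        loaded.append(cls)
    return loaded
```

With this shape, one missing class no longer prevents the remaining extensions from being registered.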






[jira] [Updated] (SPARK-45372) Handle ClassNotFoundException when load extension

2023-09-28 Thread Zhongwei Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhongwei Zhu updated SPARK-45372:
-
Summary: Handle ClassNotFoundException when load extension  (was: Handle 
ClassNotFound when load extension)

> Handle ClassNotFoundException when load extension
> -
>
> Key: SPARK-45372
> URL: https://issues.apache.org/jira/browse/SPARK-45372
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.5.0
>Reporter: Zhongwei Zhu
>Priority: Minor
>
> When loading an extension fails with a ClassNotFoundException, SparkContext 
> fails to initialize. A better behavior would be to skip that extension and 
> log an error without failing SparkContext initialization.






[jira] [Updated] (SPARK-45372) Handle ClassNotFoundException when load extension

2023-09-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45372:
---
Labels: pull-request-available  (was: )

> Handle ClassNotFoundException when load extension
> -
>
> Key: SPARK-45372
> URL: https://issues.apache.org/jira/browse/SPARK-45372
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.5.0
>Reporter: Zhongwei Zhu
>Priority: Minor
>  Labels: pull-request-available
>
> When loading an extension fails with a ClassNotFoundException, SparkContext 
> fails to initialize. A better behavior would be to skip that extension and 
> log an error without failing SparkContext initialization.






[jira] [Updated] (SPARK-36112) Enable DecorrelateInnerQuery for IN/EXISTS subqueries

2023-09-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-36112:
---
Labels: pull-request-available  (was: )

> Enable DecorrelateInnerQuery for IN/EXISTS subqueries
> -
>
> Key: SPARK-36112
> URL: https://issues.apache.org/jira/browse/SPARK-36112
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Allison Wang
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-04-25-21-51-55-961.png
>
>
> Currently, `DecorrelateInnerQuery` is only enabled for scalar and lateral 
> subqueries. We should enable `DecorrelateInnerQuery` for IN/EXISTS 
> subqueries. Note we need to add the logic to rewrite domain joins in 
> `RewritePredicateSubquery`.






[jira] [Resolved] (SPARK-45365) Allow the daily tests of branch-3.4 to use the new test group tags

2023-09-28 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun resolved SPARK-45365.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

> Allow the daily tests of branch-3.4 to use the new test group tags
> --
>
> Key: SPARK-45365
> URL: https://issues.apache.org/jira/browse/SPARK-45365
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-45365) Allow the daily tests of branch-3.4 to use the new test group tags

2023-09-28 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun reassigned SPARK-45365:


Assignee: Yang Jie

> Allow the daily tests of branch-3.4 to use the new test group tags
> --
>
> Key: SPARK-45365
> URL: https://issues.apache.org/jira/browse/SPARK-45365
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Updated] (SPARK-43049) Use CLOB instead of VARCHAR(255) for StringType for Oracle jdbc

2023-09-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-43049:
---
Labels: pull-request-available  (was: )

> Use CLOB instead of VARCHAR(255) for StringType for Oracle jdbc
> ---
>
> Key: SPARK-43049
> URL: https://issues.apache.org/jira/browse/SPARK-43049
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.5.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>







[jira] [Created] (SPARK-45373) Minimizing calls to HiveMetaStore layer for getting partitions, when tables are repeated

2023-09-28 Thread Asif (Jira)
Asif created SPARK-45373:


 Summary: Minimizing calls to HiveMetaStore layer for getting 
partitions,  when tables are repeated
 Key: SPARK-45373
 URL: https://issues.apache.org/jira/browse/SPARK-45373
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.5.1
Reporter: Asif
 Fix For: 3.5.1


In the rule PruneFileSourcePartitions, where the CatalogFileIndex gets converted 
to an InMemoryFileIndex, the HMS calls can get very expensive if:
1) The filter string translated for push-down to the HMS layer becomes empty, 
resulting in all partitions being fetched, and the same table is referenced 
multiple times in the query; or
2) The same table is referenced multiple times in the query with different 
partition filters.
In such cases the current code makes multiple calls to the HMS layer. This can 
be avoided by grouping the tables by CatalogFileIndex, passing a common minimal 
filter (filter1 || filter2), and obtaining a base PrunedInMemoryFileIndex that 
can serve as the basis for each specific table.
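The grouping idea can be sketched abstractly in Python (names are illustrative, not Spark's actual classes): references that share a CatalogFileIndex are grouped, their partition filters are OR-ed into a single metastore call, and each reference then re-prunes the shared result locally.

```python
from collections import defaultdict

def prune_with_grouping(table_refs, fetch_partitions):
    """table_refs: list of (catalog_file_index_id, filter_fn) pairs.
    fetch_partitions(index_id, combined_filter) stands in for the HMS call.
    Returns per-reference partition lists, calling HMS once per index."""
    groups = defaultdict(list)
    for pos, (index_id, flt) in enumerate(table_refs):
        groups[index_id].append((pos, flt))

    results = [None] * len(table_refs)
    for index_id, refs in groups.items():
        filters = [flt for _, flt in refs]
        # One HMS call with the union (OR) of all filters for this index.
        combined = lambda p, fs=filters: any(f(p) for f in fs)
        base = fetch_partitions(index_id, combined)
        # Each reference prunes the shared base result with its own filter.
        for pos, flt in refs:
            results[pos] = [p for p in base if flt(p)]
    return results
```

Two references to the same table thus cost one metastore round trip instead of two, at the price of fetching the union of their partitions once.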






[jira] [Commented] (SPARK-45373) Minimizing calls to HiveMetaStore layer for getting partitions, when tables are repeated

2023-09-28 Thread Asif (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-45373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17770220#comment-17770220
 ] 

Asif commented on SPARK-45373:
--

Will be generating a PR for this.

> Minimizing calls to HiveMetaStore layer for getting partitions,  when tables 
> are repeated
> -
>
> Key: SPARK-45373
> URL: https://issues.apache.org/jira/browse/SPARK-45373
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.5.1
>Reporter: Asif
>Priority: Minor
> Fix For: 3.5.1
>
>
> In the rule PruneFileSourcePartitions, where the CatalogFileIndex gets 
> converted to an InMemoryFileIndex, the HMS calls can get very expensive if:
> 1) The filter string translated for push-down to the HMS layer becomes empty, 
> resulting in all partitions being fetched, and the same table is referenced 
> multiple times in the query; or
> 2) The same table is referenced multiple times in the query with different 
> partition filters.
> In such cases the current code makes multiple calls to the HMS layer. This 
> can be avoided by grouping the tables by CatalogFileIndex, passing a common 
> minimal filter (filter1 || filter2), and obtaining a base 
> PrunedInMemoryFileIndex that can serve as the basis for each specific table.






[jira] [Resolved] (SPARK-45362) Project out PARTITION BY expressions before 'eval' method consumes input rows

2023-09-28 Thread Takuya Ueshin (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takuya Ueshin resolved SPARK-45362.
---
Fix Version/s: 4.0.0
 Assignee: Daniel
   Resolution: Fixed

Issue resolved by pull request 43156
https://github.com/apache/spark/pull/43156

> Project out PARTITION BY expressions before 'eval' method consumes input rows
> -
>
> Key: SPARK-45362
> URL: https://issues.apache.org/jira/browse/SPARK-45362
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark, SQL
>Affects Versions: 4.0.0
>Reporter: Daniel
>Assignee: Daniel
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Updated] (SPARK-42205) Remove logging of Accumulables in Task/Stage start events in JsonProtocol

2023-09-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-42205:
---
Labels: pull-request-available  (was: )

> Remove logging of Accumulables in Task/Stage start events in JsonProtocol
> -
>
> Key: SPARK-42205
> URL: https://issues.apache.org/jira/browse/SPARK-42205
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Josh Rosen
>Assignee: Josh Rosen
>Priority: Major
>  Labels: pull-request-available
>
> Spark's JsonProtocol event logs (used by the history server) are impacted by 
> a race condition when tasks / stages finish very quickly:
> The SparkListenerTaskStart and SparkListenerStageSubmitted events contain 
> mutable TaskInfo and StageInfo objects, which in turn contain Accumulables 
> fields. When a task or stage is submitted, Accumulables is initially empty. 
> When the task or stage finishes, this field is updated with values from the 
> task.
> If a task or stage finishes before the start event has been logged by the 
> event logging listener then the _start_ event will contain the Accumulable 
> values from the task or stage _end_ event. 
> This information isn't used by the History Server and contributes to wasteful 
> bloat in event log sizes. In one real-world log, I found that ~10% of the 
> uncompressed log size was due to these redundant Accumulable fields.
> I propose that we update JsonProtocol to skip the logging of this field for 
> Start/Submitted events. 
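The proposed fix amounts to omitting the mutable field when serializing start/submitted events. A rough Python analogy follows (the event and field names mirror the JSON event-log format, but this is not Spark's actual JsonProtocol code):

```python
import json

# Start events whose Accumulables are redundant with the later end event.
START_EVENTS = {"SparkListenerTaskStart", "SparkListenerStageSubmitted"}

def event_to_json(event):
    """Serialize a listener event as JSON, dropping Accumulables for
    start events, since their values only become meaningful at end."""
    payload = dict(event)
    if payload.get("Event") in START_EVENTS:
        # Note: shallow copy; the nested info dict is mutated in place,
        # which is acceptable for this illustrative sketch.
        info = payload.get("Task Info") or payload.get("Stage Info") or {}
        info.pop("Accumulables", None)
    return json.dumps(payload)
```

End events keep their Accumulables untouched, so no information the history server uses is lost.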






[jira] [Resolved] (SPARK-44895) Considering 'daemon', 'priority' from higher JDKs for ThreadStackTrace class

2023-09-28 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen resolved SPARK-44895.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43095
[https://github.com/apache/spark/pull/43095]

> Considering 'daemon', 'priority' from higher JDKs for ThreadStackTrace class
> 
>
> Key: SPARK-44895
> URL: https://issues.apache.org/jira/browse/SPARK-44895
> Project: Spark
>  Issue Type: Sub-task
>  Components: Web UI
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> {code:java}
> jshell> var t = java.lang.management.ManagementFactory.getThreadMXBean()
> t ==> com.sun.management.internal.HotSpotThreadImpl@7daf6ecc
>
> jshell> var tt = t.dumpAllThreads(true, true)
> tt ==> ThreadInfo[10] { "main" prio=5 Id=1 RUNNABLE  at  ... k$NonfairSync@27fa135a }
>
> jshell> for (java.lang.management.ThreadInfo t1: tt) { System.out.println(t1.toString()); }
> "main" prio=5 Id=1 RUNNABLE
>   at java.management@20.0.1/sun.management.ThreadImpl.dumpThreads0(Native Method)
>   at java.management@20.0.1/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:540)
>   at java.management@20.0.1/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:527)
>   at REPL.$JShell$12.do_it$Aux($JShell$12.java:7)
>   at REPL.$JShell$12.do_it$($JShell$12.java:11)
>   at java.base@20.0.1/java.lang.invoke.DirectMethodHandle$Holder.invokeStatic(DirectMethodHandle$Holder)
>   at java.base@20.0.1/java.lang.invoke.LambdaForm$MH/0x007001008c00.invoke(LambdaForm$MH)
>   at java.base@20.0.1/java.lang.invoke.Invokers$Holder.invokeExact_MT(Invokers$Holder)
> ...
>
> "Reference Handler" daemon prio=10 Id=8 RUNNABLE
>   at java.base@20.0.1/java.lang.ref.Reference.waitForReferencePendingList(Native Method)
>   at java.base@20.0.1/java.lang.ref.Reference.processPendingReferences(Reference.java:246)
>   at java.base@20.0.1/java.lang.ref.Reference$ReferenceHandler.run(Reference.java:208)
> {code}
> The `daemon prio=10` fields are not available in the `ThreadInfo` output on JDK 8.






[jira] [Assigned] (SPARK-44895) Considering 'daemon', 'priority' from higher JDKs for ThreadStackTrace class

2023-09-28 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen reassigned SPARK-44895:


Assignee: Kent Yao

> Considering 'daemon', 'priority' from higher JDKs for ThreadStackTrace class
> 
>
> Key: SPARK-44895
> URL: https://issues.apache.org/jira/browse/SPARK-44895
> Project: Spark
>  Issue Type: Sub-task
>  Components: Web UI
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> jshell> var t = java.lang.management.ManagementFactory.getThreadMXBean()
> t ==> com.sun.management.internal.HotSpotThreadImpl@7daf6ecc
>
> jshell> var tt = t.dumpAllThreads(true, true)
> tt ==> ThreadInfo[10] { "main" prio=5 Id=1 RUNNABLE  at  ... k$NonfairSync@27fa135a }
>
> jshell> for (java.lang.management.ThreadInfo t1: tt) { System.out.println(t1.toString()); }
> "main" prio=5 Id=1 RUNNABLE
>   at java.management@20.0.1/sun.management.ThreadImpl.dumpThreads0(Native Method)
>   at java.management@20.0.1/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:540)
>   at java.management@20.0.1/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:527)
>   at REPL.$JShell$12.do_it$Aux($JShell$12.java:7)
>   at REPL.$JShell$12.do_it$($JShell$12.java:11)
>   at java.base@20.0.1/java.lang.invoke.DirectMethodHandle$Holder.invokeStatic(DirectMethodHandle$Holder)
>   at java.base@20.0.1/java.lang.invoke.LambdaForm$MH/0x007001008c00.invoke(LambdaForm$MH)
>   at java.base@20.0.1/java.lang.invoke.Invokers$Holder.invokeExact_MT(Invokers$Holder)
> ...
>
> "Reference Handler" daemon prio=10 Id=8 RUNNABLE
>   at java.base@20.0.1/java.lang.ref.Reference.waitForReferencePendingList(Native Method)
>   at java.base@20.0.1/java.lang.ref.Reference.processPendingReferences(Reference.java:246)
>   at java.base@20.0.1/java.lang.ref.Reference$ReferenceHandler.run(Reference.java:208)
> {code}
> The `daemon prio=10` fields are not available in the `ThreadInfo` output on JDK 8.






[jira] [Assigned] (SPARK-44937) Add SSL/TLS support for RPC and Shuffle communications

2023-09-28 Thread Mridul Muralidharan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mridul Muralidharan reassigned SPARK-44937:
---

Assignee: Hasnain Lakhani

> Add SSL/TLS support for RPC and Shuffle communications
> --
>
> Key: SPARK-44937
> URL: https://issues.apache.org/jira/browse/SPARK-44937
> Project: Spark
>  Issue Type: New Feature
>  Components: Block Manager, Security, Shuffle, Spark Core
>Affects Versions: 4.0.0
>Reporter: Hasnain Lakhani
>Assignee: Hasnain Lakhani
>Priority: Major
>  Labels: pull-request-available
>
> Add support for SSL/TLS based communication for Spark RPCs and block 
> transfers - providing an alternative to the existing encryption / 
> authentication implementation documented at 
> [https://spark.apache.org/docs/latest/security.html#spark-rpc-communication-protocol-between-spark-processes]
> This is a superset of the functionality discussed in 
> https://issues.apache.org/jira/browse/SPARK-6373
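Independent of Spark's internal Netty-based implementation, wrapping a channel in TLS using Python's standard library looks like the sketch below; this is a generic illustration of the technique, not the API added by this issue:

```python
import ssl

def make_client_context(ca_file=None):
    """Build a TLS client context such as an RPC client could use to
    wrap its socket; ca_file optionally pins a custom CA bundle
    (e.g. for certificates issued by an internal CA)."""
    ctx = ssl.create_default_context(cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse legacy protocols
    return ctx
```

The resulting context verifies the server certificate and hostname by default, which is the property that distinguishes TLS from the pre-existing shared-secret encryption mentioned in the linked security docs.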






[jira] [Resolved] (SPARK-44937) Add SSL/TLS support for RPC and Shuffle communications

2023-09-28 Thread Mridul Muralidharan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mridul Muralidharan resolved SPARK-44937.
-
Fix Version/s: 3.3.4
   3.5.1
   4.0.0
   3.4.2
   Resolution: Fixed

Issue resolved by pull request 43162
[https://github.com/apache/spark/pull/43162]

> Add SSL/TLS support for RPC and Shuffle communications
> --
>
> Key: SPARK-44937
> URL: https://issues.apache.org/jira/browse/SPARK-44937
> Project: Spark
>  Issue Type: New Feature
>  Components: Block Manager, Security, Shuffle, Spark Core
>Affects Versions: 4.0.0
>Reporter: Hasnain Lakhani
>Assignee: Hasnain Lakhani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.4, 3.5.1, 4.0.0, 3.4.2
>
>
> Add support for SSL/TLS based communication for Spark RPCs and block 
> transfers - providing an alternative to the existing encryption / 
> authentication implementation documented at 
> [https://spark.apache.org/docs/latest/security.html#spark-rpc-communication-protocol-between-spark-processes]
> This is a superset of the functionality discussed in 
> https://issues.apache.org/jira/browse/SPARK-6373






[jira] [Reopened] (SPARK-44937) Add SSL/TLS support for RPC and Shuffle communications

2023-09-28 Thread Hasnain Lakhani (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hasnain Lakhani reopened SPARK-44937:
-

> Add SSL/TLS support for RPC and Shuffle communications
> --
>
> Key: SPARK-44937
> URL: https://issues.apache.org/jira/browse/SPARK-44937
> Project: Spark
>  Issue Type: New Feature
>  Components: Block Manager, Security, Shuffle, Spark Core
>Affects Versions: 4.0.0
>Reporter: Hasnain Lakhani
>Assignee: Hasnain Lakhani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.2, 4.0.0, 3.5.1, 3.3.4
>
>
> Add support for SSL/TLS based communication for Spark RPCs and block 
> transfers - providing an alternative to the existing encryption / 
> authentication implementation documented at 
> [https://spark.apache.org/docs/latest/security.html#spark-rpc-communication-protocol-between-spark-processes]
> This is a superset of the functionality discussed in 
> https://issues.apache.org/jira/browse/SPARK-6373






[jira] [Updated] (SPARK-44937) [umbrella] Add SSL/TLS support for RPC and Shuffle communications

2023-09-28 Thread Hasnain Lakhani (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hasnain Lakhani updated SPARK-44937:

Summary: [umbrella] Add SSL/TLS support for RPC and Shuffle communications  
(was: Add SSL/TLS support for RPC and Shuffle communications)

> [umbrella] Add SSL/TLS support for RPC and Shuffle communications
> -
>
> Key: SPARK-44937
> URL: https://issues.apache.org/jira/browse/SPARK-44937
> Project: Spark
>  Issue Type: New Feature
>  Components: Block Manager, Security, Shuffle, Spark Core
>Affects Versions: 4.0.0
>Reporter: Hasnain Lakhani
>Assignee: Hasnain Lakhani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.2, 4.0.0, 3.5.1, 3.3.4
>
>
> Add support for SSL/TLS based communication for Spark RPCs and block 
> transfers - providing an alternative to the existing encryption / 
> authentication implementation documented at 
> [https://spark.apache.org/docs/latest/security.html#spark-rpc-communication-protocol-between-spark-processes]
> This is a superset of the functionality discussed in 
> https://issues.apache.org/jira/browse/SPARK-6373






[jira] [Updated] (SPARK-44937) [umbrella] Add SSL/TLS support for RPC and Shuffle communications

2023-09-28 Thread Hasnain Lakhani (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hasnain Lakhani updated SPARK-44937:

Issue Type: Epic  (was: New Feature)

> [umbrella] Add SSL/TLS support for RPC and Shuffle communications
> -
>
> Key: SPARK-44937
> URL: https://issues.apache.org/jira/browse/SPARK-44937
> Project: Spark
>  Issue Type: Epic
>  Components: Block Manager, Security, Shuffle, Spark Core
>Affects Versions: 4.0.0
>Reporter: Hasnain Lakhani
>Assignee: Hasnain Lakhani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.2, 4.0.0, 3.5.1, 3.3.4
>
>
> Add support for SSL/TLS based communication for Spark RPCs and block 
> transfers - providing an alternative to the existing encryption / 
> authentication implementation documented at 
> [https://spark.apache.org/docs/latest/security.html#spark-rpc-communication-protocol-between-spark-processes]
> This is a superset of the functionality discussed in 
> https://issues.apache.org/jira/browse/SPARK-6373






[jira] [Created] (SPARK-45374) [CORE] Add test SSL keys

2023-09-28 Thread Hasnain Lakhani (Jira)
Hasnain Lakhani created SPARK-45374:
---

 Summary: [CORE] Add test SSL keys
 Key: SPARK-45374
 URL: https://issues.apache.org/jira/browse/SPARK-45374
 Project: Spark
  Issue Type: Task
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Hasnain Lakhani


Add test SSL keys which will be used for unit and integration tests of the new 
SSL RPC functionality






[jira] [Updated] (SPARK-45374) [CORE] Add test keys for SSL functionality

2023-09-28 Thread Hasnain Lakhani (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hasnain Lakhani updated SPARK-45374:

Summary: [CORE] Add test keys for SSL functionality  (was: [CORE] Add test 
SSL keys)

> [CORE] Add test keys for SSL functionality
> --
>
> Key: SPARK-45374
> URL: https://issues.apache.org/jira/browse/SPARK-45374
> Project: Spark
>  Issue Type: Task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Hasnain Lakhani
>Priority: Major
>
> Add test SSL keys which will be used for unit and integration tests of the 
> new SSL RPC functionality






[jira] [Resolved] (SPARK-45375) [CORE] Mark connection as timedOut in TransportClient.close

2023-09-28 Thread Hasnain Lakhani (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hasnain Lakhani resolved SPARK-45375.
-
Resolution: Fixed

Resolved by [https://github.com/apache/spark/pull/43162]

[~mridul] I'd appreciate a bit of help trying to link this to the PR

> [CORE] Mark connection as timedOut in TransportClient.close
> ---
>
> Key: SPARK-45375
> URL: https://issues.apache.org/jira/browse/SPARK-45375
> Project: Spark
>  Issue Type: Task
>  Components: Spark Core
>Affects Versions: 3.4.2, 4.0.0, 3.5.1, 3.3.4
>Reporter: Hasnain Lakhani
>Priority: Major
>
> Avoid a race condition where a connection that is in the process of being
> closed could be returned by the TransportClientFactory, only to be closed
> immediately afterwards and cause errors upon use.
>  
> This doesn't happen much in practice but is observed more frequently as part 
> of efforts to add SSL support
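A minimal sketch of the idea behind the fix (hypothetical names, not Spark's actual TransportClient/TransportClientFactory code): close() flips a flag before tearing the connection down, so a concurrent factory lookup sees the client as inactive and hands out a fresh one instead of the dying connection.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical miniature of the pooled-client race described above.
class SketchClient {
    private volatile boolean timedOut = false;

    boolean isActive() {
        // Mirrors the isActive() check the factory performs: a timed-out
        // client is never considered active.
        return !timedOut;
    }

    void close() {
        // The fix discussed above: mark the client invalid *before* tearing
        // the channel down, so a concurrent factory lookup rejects it.
        timedOut = true;
        // ... channel teardown would happen here ...
    }
}

class SketchClientPool {
    private final ConcurrentHashMap<String, SketchClient> pool = new ConcurrentHashMap<>();

    SketchClient getOrCreate(String host) {
        SketchClient cached = pool.get(host);
        if (cached != null && cached.isActive()) {
            return cached;               // safe: close() flips the flag first
        }
        SketchClient fresh = new SketchClient();
        pool.put(host, fresh);
        return fresh;
    }
}
```

Without the flag, a caller could receive the cached client between the start and end of its teardown and fail on first use.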






[jira] [Updated] (SPARK-44895) Considering 'daemon', 'priority' from higher JDKs for ThreadStackTrace class

2023-09-28 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen updated SPARK-44895:
-
Priority: Minor  (was: Major)

> Considering 'daemon', 'priority' from higher JDKs for ThreadStackTrace class
> 
>
> Key: SPARK-44895
> URL: https://issues.apache.org/jira/browse/SPARK-44895
> Project: Spark
>  Issue Type: Sub-task
>  Components: Web UI
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> {code:java}
> jshell> var t = java.lang.management.ManagementFactory.getThreadMXBean()t ==> 
> com.sun.management.internal.HotSpotThreadImpl@7daf6ecc
> jshell> var tt = t.dumpAllThreads(true, true)tt ==> ThreadInfo[10] { "main" 
> prio=5 Id=1 RUNNABLE  at  ... k$NonfairSync@27fa135a
>  }
> jshell> for (java.lang.management.ThreadInfo t1: tt) 
> {System.out.println(t1.toString());}"main" prio=5 Id=1 RUNNABLE  at 
> java.management@20.0.1/sun.management.ThreadImpl.dumpThreads0(Native Method) 
> at 
> java.management@20.0.1/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:540)
>  at 
> java.management@20.0.1/sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:527)
>  at REPL.$JShell$12.do_it$Aux($JShell$12.java:7) at 
> REPL.$JShell$12.do_it$($JShell$12.java:11)   at 
> java.base@20.0.1/java.lang.invoke.DirectMethodHandle$Holder.invokeStatic(DirectMethodHandle$Holder)
>   at 
> java.base@20.0.1/java.lang.invoke.LambdaForm$MH/0x007001008c00.invoke(LambdaForm$MH)
>  at 
> java.base@20.0.1/java.lang.invoke.Invokers$Holder.invokeExact_MT(Invokers$Holder)
> ...
> "Reference Handler" daemon prio=10 Id=8 RUNNABLE  at 
> java.base@20.0.1/java.lang.ref.Reference.waitForReferencePendingList(Native 
> Method)  at 
> java.base@20.0.1/java.lang.ref.Reference.processPendingReferences(Reference.java:246)
> at 
> java.base@20.0.1/java.lang.ref.Reference$ReferenceHandler.run(Reference.java:208)
>  {code}
> The `daemon prio=10` attribute is not available in the ThreadInfo output of JDK 8.
>  
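For context, ThreadInfo.isDaemon() and ThreadInfo.getPriority() only exist since Java 9. A hedged sketch (hypothetical helper, not Spark's actual code) of probing for them reflectively so the same class still compiles and runs on JDK 8:

```java
import java.lang.management.ThreadInfo;
import java.lang.reflect.Method;
import java.util.Optional;

// Hypothetical compatibility shim: look the Java 9+ accessors up by name and
// report "unknown" (empty Optional) when running on a JDK 8 runtime.
public class ThreadInfoCompat {
    static Optional<Boolean> daemonOf(ThreadInfo info) {
        return invoke(info, "isDaemon");
    }

    static Optional<Integer> priorityOf(ThreadInfo info) {
        return invoke(info, "getPriority");
    }

    @SuppressWarnings("unchecked")
    private static <T> Optional<T> invoke(ThreadInfo info, String name) {
        try {
            Method m = ThreadInfo.class.getMethod(name);
            return Optional.of((T) m.invoke(info));
        } catch (ReflectiveOperationException e) {
            return Optional.empty();   // JDK 8: attribute not exposed
        }
    }
}
```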






[jira] [Resolved] (SPARK-45374) [CORE] Add test SSL keys

2023-09-28 Thread Hasnain Lakhani (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hasnain Lakhani resolved SPARK-45374.
-
Resolution: Fixed

Fixed by [https://github.com/apache/spark/pull/43163]

[~mridul] / [~joshrosen]: I couldn't find how to link the existing PR to this 
ticket; please let me know if there is a way.

> [CORE] Add test SSL keys
> 
>
> Key: SPARK-45374
> URL: https://issues.apache.org/jira/browse/SPARK-45374
> Project: Spark
>  Issue Type: Task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Hasnain Lakhani
>Priority: Major
>
> Add test SSL keys which will be used for unit and integration tests of the 
> new SSL RPC functionality






[jira] [Created] (SPARK-45375) [CORE] Mark connection as timedOut in TransportClient.close

2023-09-28 Thread Hasnain Lakhani (Jira)
Hasnain Lakhani created SPARK-45375:
---

 Summary: [CORE] Mark connection as timedOut in 
TransportClient.close
 Key: SPARK-45375
 URL: https://issues.apache.org/jira/browse/SPARK-45375
 Project: Spark
  Issue Type: Task
  Components: Spark Core
Affects Versions: 3.4.2, 4.0.0, 3.5.1, 3.3.4
Reporter: Hasnain Lakhani


Avoid a race condition where a connection that is in the process of being 
closed could be returned by the TransportClientFactory, only to be closed 
immediately afterwards and cause errors upon use.

 

This doesn't happen much in practice but is observed more frequently as part of 
efforts to add SSL support






[jira] [Created] (SPARK-45376) [CORE] Add netty-tcnative-boringssl-static dependency

2023-09-28 Thread Hasnain Lakhani (Jira)
Hasnain Lakhani created SPARK-45376:
---

 Summary: [CORE] Add netty-tcnative-boringssl-static dependency
 Key: SPARK-45376
 URL: https://issues.apache.org/jira/browse/SPARK-45376
 Project: Spark
  Issue Type: Task
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Hasnain Lakhani


Add the boringssl dependency, which is needed for SSL functionality to work, and 
provide the network-common test helper to other test modules that need to test 
SSL functionality.






[jira] [Created] (SPARK-45377) [CORE] Handle InputStream in NettyLogger

2023-09-28 Thread Hasnain Lakhani (Jira)
Hasnain Lakhani created SPARK-45377:
---

 Summary: [CORE] Handle InputStream in NettyLogger
 Key: SPARK-45377
 URL: https://issues.apache.org/jira/browse/SPARK-45377
 Project: Spark
  Issue Type: Task
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Hasnain Lakhani


Allow NettyLogger to also print the size of InputStreams, which aids debugging 
of the SSL functionality.
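A minimal sketch of the idea (hypothetical names, not the actual NettyLogger code): when the logged message is an InputStream, report its currently-available byte count instead of an opaque toString(). This matters for SSL traffic, where zero-copy FileRegions are replaced by streams.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical formatter illustrating the InputStream-aware logging path.
public class StreamAwareFormatter {
    static String describe(Object msg) {
        if (msg instanceof InputStream) {
            try {
                // available() gives the bytes readable without blocking,
                // which is enough for a debug trace of stream sizes.
                return "InputStream(available=" + ((InputStream) msg).available() + ")";
            } catch (IOException e) {
                return "InputStream(available=?)";
            }
        }
        return String.valueOf(msg);   // all other message types unchanged
    }
}
```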






[jira] [Created] (SPARK-45378) [CORE] Add convertToNettyForSsl to ManagedBuffer

2023-09-28 Thread Hasnain Lakhani (Jira)
Hasnain Lakhani created SPARK-45378:
---

 Summary: [CORE] Add convertToNettyForSsl to ManagedBuffer
 Key: SPARK-45378
 URL: https://issues.apache.org/jira/browse/SPARK-45378
 Project: Spark
  Issue Type: Task
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Hasnain Lakhani


Since Netty's SSL support does not allow zero-copy transfers, add another API 
to ManagedBuffer so we can get buffers in a format that works with SSL.
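A hedged sketch of the API shape (hypothetical names and signatures, not Spark's actual ManagedBuffer): the existing zero-copy path may hand back a FileRegion, which Netty's SslHandler cannot encrypt, so the SSL variant must return plain bytes the handler can read.

```java
import java.nio.ByteBuffer;

// Hypothetical miniature of the two conversion paths described above.
abstract class ManagedBufferSketch {
    /** Existing path: an object suitable for zero-copy transfer (e.g. a FileRegion). */
    abstract Object convertToNetty();

    /** New path: plain bytes that an SSL handler can encrypt chunk by chunk. */
    abstract ByteBuffer convertToNettyForSsl();
}

class InMemoryBuffer extends ManagedBufferSketch {
    private final byte[] data;

    InMemoryBuffer(byte[] data) { this.data = data; }

    @Override
    Object convertToNetty() { return ByteBuffer.wrap(data); }

    @Override
    ByteBuffer convertToNettyForSsl() {
        // For heap data both paths can share bytes; a file-backed buffer
        // would instead return a chunked view over the file here rather
        // than a memory-mapped region or FileRegion.
        return ByteBuffer.wrap(data).asReadOnlyBuffer();
    }
}
```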






[jira] [Assigned] (SPARK-45057) Deadlock caused by rdd replication level of 2

2023-09-28 Thread Mridul Muralidharan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mridul Muralidharan reassigned SPARK-45057:
---

Assignee: Zhongwei Zhu

> Deadlock caused by rdd replication level of 2
> -
>
> Key: SPARK-45057
> URL: https://issues.apache.org/jira/browse/SPARK-45057
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.4.1
>Reporter: Zhongwei Zhu
>Assignee: Zhongwei Zhu
>Priority: Major
>  Labels: pull-request-available
>
>  
> When 2 tasks try to compute the same RDD with a replication level of 2 while 
> running on only 2 executors, a deadlock will happen.
> A task only releases its lock after writing to the local machine and 
> replicating to the remote executor.
>  
> ||Time||Exe 1 (Task Thread T1)||Exe 1 (Shuffle Server Thread T2)||Exe 2 (Task 
> Thread T3)||Exe 2 (Shuffle Server Thread T4)||
> |T0|write lock of rdd| | | |
> |T1| | |write lock of rdd| |
> |T2|replicate -> UploadBlockSync (blocked by T4)| | | |
> |T3| | | |Received UploadBlock request from T1 (blocked by T4)|
> |T4| | |replicate -> UploadBlockSync (blocked by T2)| |
> |T5| |Received UploadBlock request from T3 (blocked by T1)| | |
> |T6|Deadlock|Deadlock|Deadlock|Deadlock|
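The timeline above can be reproduced in miniature. The sketch below is hypothetical code (not Spark's), using two ReentrantLocks as the per-executor RDD write locks; tryLock with a timeout stands in for the blocked UploadBlockSync so the program terminates and reports the deadlock instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class ReplicationDeadlockSketch {
    // One "RDD write lock" per executor, as in the timeline above.
    private static final ReentrantLock exec1Lock = new ReentrantLock();
    private static final ReentrantLock exec2Lock = new ReentrantLock();
    // Ensures both tasks hold their local lock before either replicates (T0/T1).
    private static final CountDownLatch bothHoldLocal = new CountDownLatch(2);

    private static boolean replicate(ReentrantLock local, ReentrantLock remote)
            throws InterruptedException {
        local.lock();                            // take write lock of the rdd
        try {
            bothHoldLocal.countDown();
            bothHoldLocal.await();               // peer now holds its lock too
            // replicate -> UploadBlockSync: needs the peer's lock (T2/T4).
            if (remote.tryLock(200, TimeUnit.MILLISECONDS)) {
                remote.unlock();
                return true;                     // replication succeeded
            }
            return false;                        // would have deadlocked (T6)
        } finally {
            local.unlock();                      // released only after replicating
        }
    }

    /** Returns how many of the two replications succeeded; 0 means deadlock. */
    public static int runScenario() throws InterruptedException {
        AtomicInteger successes = new AtomicInteger();
        Thread t1 = new Thread(() -> {
            try { if (replicate(exec1Lock, exec2Lock)) successes.incrementAndGet(); }
            catch (InterruptedException ignored) { }
        });
        Thread t3 = new Thread(() -> {
            try { if (replicate(exec2Lock, exec1Lock)) successes.incrementAndGet(); }
            catch (InterruptedException ignored) { }
        });
        t1.start(); t3.start();
        t1.join(); t3.join();
        return successes.get();
    }
}
```

Because each thread still holds its local lock while waiting for the peer's, both acquisitions time out, matching row T6 of the table.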






[jira] [Resolved] (SPARK-45057) Deadlock caused by rdd replication level of 2

2023-09-28 Thread Mridul Muralidharan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mridul Muralidharan resolved SPARK-45057.
-
Fix Version/s: 3.3.4
   3.5.1
   4.0.0
   3.4.2
   Resolution: Fixed

Issue resolved by pull request 43067
[https://github.com/apache/spark/pull/43067]

> Deadlock caused by rdd replication level of 2
> -
>
> Key: SPARK-45057
> URL: https://issues.apache.org/jira/browse/SPARK-45057
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.4.1
>Reporter: Zhongwei Zhu
>Assignee: Zhongwei Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.4, 3.5.1, 4.0.0, 3.4.2
>
>
>  
> When 2 tasks try to compute the same RDD with a replication level of 2 while 
> running on only 2 executors, a deadlock will happen.
> A task only releases its lock after writing to the local machine and 
> replicating to the remote executor.
>  
> ||Time||Exe 1 (Task Thread T1)||Exe 1 (Shuffle Server Thread T2)||Exe 2 (Task 
> Thread T3)||Exe 2 (Shuffle Server Thread T4)||
> |T0|write lock of rdd| | | |
> |T1| | |write lock of rdd| |
> |T2|replicate -> UploadBlockSync (blocked by T4)| | | |
> |T3| | | |Received UploadBlock request from T1 (blocked by T4)|
> |T4| | |replicate -> UploadBlockSync (blocked by T2)| |
> |T5| |Received UploadBlock request from T3 (blocked by T1)| | |
> |T6|Deadlock|Deadlock|Deadlock|Deadlock|






[jira] [Resolved] (SPARK-45276) Replace Java 8 and Java 11 installed in the Dockerfile with Java

2023-09-28 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie resolved SPARK-45276.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43076
[https://github.com/apache/spark/pull/43076]

> Replace Java 8 and Java 11 installed in the Dockerfile with Java
> 
>
> Key: SPARK-45276
> URL: https://issues.apache.org/jira/browse/SPARK-45276
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> including dev/create-release/spark-rm/Dockerfile and 
> connector/docker/spark-test/base/Dockerfile
> There might be others as well.






[jira] [Assigned] (SPARK-45276) Replace Java 8 and Java 11 installed in the Dockerfile with Java

2023-09-28 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie reassigned SPARK-45276:


Assignee: BingKun Pan

> Replace Java 8 and Java 11 installed in the Dockerfile with Java
> 
>
> Key: SPARK-45276
> URL: https://issues.apache.org/jira/browse/SPARK-45276
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: BingKun Pan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> including dev/create-release/spark-rm/Dockerfile and 
> connector/docker/spark-test/base/Dockerfile
> There might be others as well.






[jira] [Updated] (SPARK-45227) Fix a subtle thread-safety issue with CoarseGrainedExecutorBackend where an executor process randomly gets stuck

2023-09-28 Thread Mridul Muralidharan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mridul Muralidharan updated SPARK-45227:

Fix Version/s: 3.4.2
   4.0.0
   3.5.1

> Fix a subtle thread-safety issue with CoarseGrainedExecutorBackend where an 
> executor process randomly gets stuck
> 
>
> Key: SPARK-45227
> URL: https://issues.apache.org/jira/browse/SPARK-45227
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.3.1, 3.5.0, 4.0.0
>Reporter: Bo Xiong
>Priority: Critical
>  Labels: hang, infinite-loop, pull-request-available, 
> race-condition, stuck, threadsafe
> Fix For: 3.4.2, 4.0.0, 3.5.1
>
> Attachments: hashtable1.png, hashtable2.png
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> h2. Symptom
> Our Spark 3 app running on EMR 6.10.0 with Spark 3.3.1 got stuck in the very 
> last step of writing a data frame to S3 by calling {{{}df.write{}}}. Looking 
> at Spark UI, we saw that an executor process hung over 1 hour. After we 
> manually killed the executor process, the app succeeded. Note that the same 
> EMR cluster with two worker nodes was able to run the same app without any 
> issue before and after the incident.
> h2. Observations
> Below is what's observed from relevant container logs and thread dump.
>  * A regular task that's sent to the executor, which also reported back to 
> the driver upon the task completion.
> {quote}$zgrep 'task 150' container_1694029806204_12865_01_01/stderr.gz
> 23/09/12 18:13:55 INFO TaskSetManager: Starting task 150.0 in stage 23.0 (TID 
> 923) (ip-10-0-185-107.ec2.internal, executor 3, partition 150, NODE_LOCAL, 
> 4432 bytes) taskResourceAssignments Map()
> 23/09/12 18:13:55 INFO TaskSetManager: Finished task 150.0 in stage 23.0 (TID 
> 923) in 126 ms on ip-10-0-185-107.ec2.internal (executor 3) (16/200)
> $zgrep 'task 923' container_1694029806204_12865_01_04/stderr.gz
> 23/09/12 18:13:55 INFO YarnCoarseGrainedExecutorBackend: Got assigned task 923
> $zgrep 'task 150' container_1694029806204_12865_01_04/stderr.gz
> 23/09/12 18:13:55 INFO Executor: Running task 150.0 in stage 23.0 (TID 923)
> 23/09/12 18:13:55 INFO Executor: Finished task 150.0 in stage 23.0 (TID 923). 
> 4495 bytes result sent to driver
> {quote} * Another task that's sent to the executor but didn't get launched 
> since the single-threaded dispatcher was stuck (presumably in an "infinite 
> loop" as explained later).
> {quote}$zgrep 'task 153' container_1694029806204_12865_01_01/stderr.gz
> 23/09/12 18:13:55 INFO TaskSetManager: Starting task 153.0 in stage 23.0 (TID 
> 924) (ip-10-0-185-107.ec2.internal, executor 3, partition 153, NODE_LOCAL, 
> 4432 bytes) taskResourceAssignments Map()
> $zgrep ' 924' container_1694029806204_12865_01_04/stderr.gz
> 23/09/12 18:13:55 INFO YarnCoarseGrainedExecutorBackend: Got assigned task 924
> $zgrep 'task 153' container_1694029806204_12865_01_04/stderr.gz
> >> note that the above command has no matching result, indicating that task 
> >> 153.0 in stage 23.0 (TID 924) was never launched
> {quote}* Thread dump shows that the dispatcher-Executor thread has the 
> following stack trace.
> {quote}"dispatcher-Executor" #40 daemon prio=5 os_prio=0 
> tid=0x98e37800 nid=0x1aff runnable [0x73bba000]
> java.lang.Thread.State: RUNNABLE
> at scala.runtime.BoxesRunTime.equalsNumObject(BoxesRunTime.java:142)
> at scala.runtime.BoxesRunTime.equals2(BoxesRunTime.java:131)
> at scala.runtime.BoxesRunTime.equals(BoxesRunTime.java:123)
> at scala.collection.mutable.HashTable.elemEquals(HashTable.scala:365)
> at scala.collection.mutable.HashTable.elemEquals$(HashTable.scala:365)
> at scala.collection.mutable.HashMap.elemEquals(HashMap.scala:44)
> at scala.collection.mutable.HashTable.findEntry0(HashTable.scala:140)
> at scala.collection.mutable.HashTable.findOrAddEntry(HashTable.scala:169)
> at scala.collection.mutable.HashTable.findOrAddEntry$(HashTable.scala:167)
> at scala.collection.mutable.HashMap.findOrAddEntry(HashMap.scala:44)
> at scala.collection.mutable.HashMap.put(HashMap.scala:126)
> at scala.collection.mutable.HashMap.update(HashMap.scala:131)
> at 
> org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:200)
> at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:115)
> at 
> org.apache.spark.rpc.netty.Inbox$$Lambda$323/1930826709.apply$mcV$sp(Unknown 
> Source)
> at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
> at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
> at 
> org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$n

[jira] [Resolved] (SPARK-45227) Fix a subtle thread-safety issue with CoarseGrainedExecutorBackend where an executor process randomly gets stuck

2023-09-28 Thread Mridul Muralidharan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mridul Muralidharan resolved SPARK-45227.
-
Resolution: Fixed

> Fix a subtle thread-safety issue with CoarseGrainedExecutorBackend where an 
> executor process randomly gets stuck
> 
>
> Key: SPARK-45227
> URL: https://issues.apache.org/jira/browse/SPARK-45227
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.3.1, 3.5.0, 4.0.0
>Reporter: Bo Xiong
>Assignee: Bo Xiong
>Priority: Critical
>  Labels: hang, infinite-loop, pull-request-available, 
> race-condition, stuck, threadsafe
> Fix For: 3.4.2, 4.0.0, 3.5.1
>
> Attachments: hashtable1.png, hashtable2.png
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> h2. Symptom
> Our Spark 3 app running on EMR 6.10.0 with Spark 3.3.1 got stuck in the very 
> last step of writing a data frame to S3 by calling {{{}df.write{}}}. Looking 
> at Spark UI, we saw that an executor process hung over 1 hour. After we 
> manually killed the executor process, the app succeeded. Note that the same 
> EMR cluster with two worker nodes was able to run the same app without any 
> issue before and after the incident.
> h2. Observations
> Below is what's observed from relevant container logs and thread dump.
>  * A regular task that's sent to the executor, which also reported back to 
> the driver upon the task completion.
> {quote}$zgrep 'task 150' container_1694029806204_12865_01_01/stderr.gz
> 23/09/12 18:13:55 INFO TaskSetManager: Starting task 150.0 in stage 23.0 (TID 
> 923) (ip-10-0-185-107.ec2.internal, executor 3, partition 150, NODE_LOCAL, 
> 4432 bytes) taskResourceAssignments Map()
> 23/09/12 18:13:55 INFO TaskSetManager: Finished task 150.0 in stage 23.0 (TID 
> 923) in 126 ms on ip-10-0-185-107.ec2.internal (executor 3) (16/200)
> $zgrep 'task 923' container_1694029806204_12865_01_04/stderr.gz
> 23/09/12 18:13:55 INFO YarnCoarseGrainedExecutorBackend: Got assigned task 923
> $zgrep 'task 150' container_1694029806204_12865_01_04/stderr.gz
> 23/09/12 18:13:55 INFO Executor: Running task 150.0 in stage 23.0 (TID 923)
> 23/09/12 18:13:55 INFO Executor: Finished task 150.0 in stage 23.0 (TID 923). 
> 4495 bytes result sent to driver
> {quote} * Another task that's sent to the executor but didn't get launched 
> since the single-threaded dispatcher was stuck (presumably in an "infinite 
> loop" as explained later).
> {quote}$zgrep 'task 153' container_1694029806204_12865_01_01/stderr.gz
> 23/09/12 18:13:55 INFO TaskSetManager: Starting task 153.0 in stage 23.0 (TID 
> 924) (ip-10-0-185-107.ec2.internal, executor 3, partition 153, NODE_LOCAL, 
> 4432 bytes) taskResourceAssignments Map()
> $zgrep ' 924' container_1694029806204_12865_01_04/stderr.gz
> 23/09/12 18:13:55 INFO YarnCoarseGrainedExecutorBackend: Got assigned task 924
> $zgrep 'task 153' container_1694029806204_12865_01_04/stderr.gz
> >> note that the above command has no matching result, indicating that task 
> >> 153.0 in stage 23.0 (TID 924) was never launched
> {quote}* Thread dump shows that the dispatcher-Executor thread has the 
> following stack trace.
> {quote}"dispatcher-Executor" #40 daemon prio=5 os_prio=0 
> tid=0x98e37800 nid=0x1aff runnable [0x73bba000]
> java.lang.Thread.State: RUNNABLE
> at scala.runtime.BoxesRunTime.equalsNumObject(BoxesRunTime.java:142)
> at scala.runtime.BoxesRunTime.equals2(BoxesRunTime.java:131)
> at scala.runtime.BoxesRunTime.equals(BoxesRunTime.java:123)
> at scala.collection.mutable.HashTable.elemEquals(HashTable.scala:365)
> at scala.collection.mutable.HashTable.elemEquals$(HashTable.scala:365)
> at scala.collection.mutable.HashMap.elemEquals(HashMap.scala:44)
> at scala.collection.mutable.HashTable.findEntry0(HashTable.scala:140)
> at scala.collection.mutable.HashTable.findOrAddEntry(HashTable.scala:169)
> at scala.collection.mutable.HashTable.findOrAddEntry$(HashTable.scala:167)
> at scala.collection.mutable.HashMap.findOrAddEntry(HashMap.scala:44)
> at scala.collection.mutable.HashMap.put(HashMap.scala:126)
> at scala.collection.mutable.HashMap.update(HashMap.scala:131)
> at 
> org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:200)
> at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:115)
> at 
> org.apache.spark.rpc.netty.Inbox$$Lambda$323/1930826709.apply$mcV$sp(Unknown 
> Source)
> at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
> at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
> at 
> org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$r

[jira] [Assigned] (SPARK-45227) Fix a subtle thread-safety issue with CoarseGrainedExecutorBackend where an executor process randomly gets stuck

2023-09-28 Thread Mridul Muralidharan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mridul Muralidharan reassigned SPARK-45227:
---

Assignee: Bo Xiong

> Fix a subtle thread-safety issue with CoarseGrainedExecutorBackend where an 
> executor process randomly gets stuck
> 
>
> Key: SPARK-45227
> URL: https://issues.apache.org/jira/browse/SPARK-45227
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.3.1, 3.5.0, 4.0.0
>Reporter: Bo Xiong
>Assignee: Bo Xiong
>Priority: Critical
>  Labels: hang, infinite-loop, pull-request-available, 
> race-condition, stuck, threadsafe
> Fix For: 3.4.2, 4.0.0, 3.5.1
>
> Attachments: hashtable1.png, hashtable2.png
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> h2. Symptom
> Our Spark 3 app running on EMR 6.10.0 with Spark 3.3.1 got stuck in the very 
> last step of writing a data frame to S3 by calling {{{}df.write{}}}. Looking 
> at Spark UI, we saw that an executor process hung over 1 hour. After we 
> manually killed the executor process, the app succeeded. Note that the same 
> EMR cluster with two worker nodes was able to run the same app without any 
> issue before and after the incident.
> h2. Observations
> Below is what's observed from relevant container logs and thread dump.
>  * A regular task that's sent to the executor, which also reported back to 
> the driver upon the task completion.
> {quote}$zgrep 'task 150' container_1694029806204_12865_01_01/stderr.gz
> 23/09/12 18:13:55 INFO TaskSetManager: Starting task 150.0 in stage 23.0 (TID 
> 923) (ip-10-0-185-107.ec2.internal, executor 3, partition 150, NODE_LOCAL, 
> 4432 bytes) taskResourceAssignments Map()
> 23/09/12 18:13:55 INFO TaskSetManager: Finished task 150.0 in stage 23.0 (TID 
> 923) in 126 ms on ip-10-0-185-107.ec2.internal (executor 3) (16/200)
> $zgrep 'task 923' container_1694029806204_12865_01_04/stderr.gz
> 23/09/12 18:13:55 INFO YarnCoarseGrainedExecutorBackend: Got assigned task 923
> $zgrep 'task 150' container_1694029806204_12865_01_04/stderr.gz
> 23/09/12 18:13:55 INFO Executor: Running task 150.0 in stage 23.0 (TID 923)
> 23/09/12 18:13:55 INFO Executor: Finished task 150.0 in stage 23.0 (TID 923). 
> 4495 bytes result sent to driver
> {quote} * Another task that's sent to the executor but didn't get launched 
> since the single-threaded dispatcher was stuck (presumably in an "infinite 
> loop" as explained later).
> {quote}$zgrep 'task 153' container_1694029806204_12865_01_01/stderr.gz
> 23/09/12 18:13:55 INFO TaskSetManager: Starting task 153.0 in stage 23.0 (TID 
> 924) (ip-10-0-185-107.ec2.internal, executor 3, partition 153, NODE_LOCAL, 
> 4432 bytes) taskResourceAssignments Map()
> $zgrep ' 924' container_1694029806204_12865_01_04/stderr.gz
> 23/09/12 18:13:55 INFO YarnCoarseGrainedExecutorBackend: Got assigned task 924
> $zgrep 'task 153' container_1694029806204_12865_01_04/stderr.gz
> >> note that the above command has no matching result, indicating that task 
> >> 153.0 in stage 23.0 (TID 924) was never launched
> {quote}* Thread dump shows that the dispatcher-Executor thread has the 
> following stack trace.
> {quote}"dispatcher-Executor" #40 daemon prio=5 os_prio=0 
> tid=0x98e37800 nid=0x1aff runnable [0x73bba000]
> java.lang.Thread.State: RUNNABLE
> at scala.runtime.BoxesRunTime.equalsNumObject(BoxesRunTime.java:142)
> at scala.runtime.BoxesRunTime.equals2(BoxesRunTime.java:131)
> at scala.runtime.BoxesRunTime.equals(BoxesRunTime.java:123)
> at scala.collection.mutable.HashTable.elemEquals(HashTable.scala:365)
> at scala.collection.mutable.HashTable.elemEquals$(HashTable.scala:365)
> at scala.collection.mutable.HashMap.elemEquals(HashMap.scala:44)
> at scala.collection.mutable.HashTable.findEntry0(HashTable.scala:140)
> at scala.collection.mutable.HashTable.findOrAddEntry(HashTable.scala:169)
> at scala.collection.mutable.HashTable.findOrAddEntry$(HashTable.scala:167)
> at scala.collection.mutable.HashMap.findOrAddEntry(HashMap.scala:44)
> at scala.collection.mutable.HashMap.put(HashMap.scala:126)
> at scala.collection.mutable.HashMap.update(HashMap.scala:131)
> at 
> org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:200)
> at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:115)
> at 
> org.apache.spark.rpc.netty.Inbox$$Lambda$323/1930826709.apply$mcV$sp(Unknown 
> Source)
> at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
> at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
> at 
> org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageL

[jira] [Closed] (SPARK-45227) Fix a subtle thread-safety issue with CoarseGrainedExecutorBackend where an executor process randomly gets stuck

2023-09-28 Thread Mridul Muralidharan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mridul Muralidharan closed SPARK-45227.
---

> Fix a subtle thread-safety issue with CoarseGrainedExecutorBackend where an 
> executor process randomly gets stuck
> 
>
> Key: SPARK-45227
> URL: https://issues.apache.org/jira/browse/SPARK-45227
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.3.1, 3.5.0, 4.0.0
>Reporter: Bo Xiong
>Assignee: Bo Xiong
>Priority: Critical
>  Labels: hang, infinite-loop, pull-request-available, 
> race-condition, stuck, threadsafe
> Fix For: 3.4.2, 4.0.0, 3.5.1
>
> Attachments: hashtable1.png, hashtable2.png
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> h2. Symptom
> Our Spark 3 app running on EMR 6.10.0 with Spark 3.3.1 got stuck in the very 
> last step of writing a data frame to S3 by calling {{{}df.write{}}}. Looking 
> at Spark UI, we saw that an executor process hung over 1 hour. After we 
> manually killed the executor process, the app succeeded. Note that the same 
> EMR cluster with two worker nodes was able to run the same app without any 
> issue before and after the incident.
> h2. Observations
> Below is what's observed from relevant container logs and thread dump.
>  * A regular task that's sent to the executor, which also reported back to 
> the driver upon the task completion.
> {quote}$zgrep 'task 150' container_1694029806204_12865_01_01/stderr.gz
> 23/09/12 18:13:55 INFO TaskSetManager: Starting task 150.0 in stage 23.0 (TID 
> 923) (ip-10-0-185-107.ec2.internal, executor 3, partition 150, NODE_LOCAL, 
> 4432 bytes) taskResourceAssignments Map()
> 23/09/12 18:13:55 INFO TaskSetManager: Finished task 150.0 in stage 23.0 (TID 
> 923) in 126 ms on ip-10-0-185-107.ec2.internal (executor 3) (16/200)
> $zgrep 'task 923' container_1694029806204_12865_01_04/stderr.gz
> 23/09/12 18:13:55 INFO YarnCoarseGrainedExecutorBackend: Got assigned task 923
> $zgrep 'task 150' container_1694029806204_12865_01_04/stderr.gz
> 23/09/12 18:13:55 INFO Executor: Running task 150.0 in stage 23.0 (TID 923)
> 23/09/12 18:13:55 INFO Executor: Finished task 150.0 in stage 23.0 (TID 923). 
> 4495 bytes result sent to driver
> {quote} * Another task that's sent to the executor but didn't get launched 
> since the single-threaded dispatcher was stuck (presumably in an "infinite 
> loop" as explained later).
> {quote}$zgrep 'task 153' container_1694029806204_12865_01_01/stderr.gz
> 23/09/12 18:13:55 INFO TaskSetManager: Starting task 153.0 in stage 23.0 (TID 
> 924) (ip-10-0-185-107.ec2.internal, executor 3, partition 153, NODE_LOCAL, 
> 4432 bytes) taskResourceAssignments Map()
> $zgrep ' 924' container_1694029806204_12865_01_04/stderr.gz
> 23/09/12 18:13:55 INFO YarnCoarseGrainedExecutorBackend: Got assigned task 924
> $zgrep 'task 153' container_1694029806204_12865_01_04/stderr.gz
> >> note that the above command has no matching result, indicating that task 
> >> 153.0 in stage 23.0 (TID 924) was never launched
> {quote}* Thread dump shows that the dispatcher-Executor thread has the 
> following stack trace.
> {quote}"dispatcher-Executor" #40 daemon prio=5 os_prio=0 
> tid=0x98e37800 nid=0x1aff runnable [0x73bba000]
> java.lang.Thread.State: RUNNABLE
> at scala.runtime.BoxesRunTime.equalsNumObject(BoxesRunTime.java:142)
> at scala.runtime.BoxesRunTime.equals2(BoxesRunTime.java:131)
> at scala.runtime.BoxesRunTime.equals(BoxesRunTime.java:123)
> at scala.collection.mutable.HashTable.elemEquals(HashTable.scala:365)
> at scala.collection.mutable.HashTable.elemEquals$(HashTable.scala:365)
> at scala.collection.mutable.HashMap.elemEquals(HashMap.scala:44)
> at scala.collection.mutable.HashTable.findEntry0(HashTable.scala:140)
> at scala.collection.mutable.HashTable.findOrAddEntry(HashTable.scala:169)
> at scala.collection.mutable.HashTable.findOrAddEntry$(HashTable.scala:167)
> at scala.collection.mutable.HashMap.findOrAddEntry(HashMap.scala:44)
> at scala.collection.mutable.HashMap.put(HashMap.scala:126)
> at scala.collection.mutable.HashMap.update(HashMap.scala:131)
> at 
> org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:200)
> at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:115)
> at 
> org.apache.spark.rpc.netty.Inbox$$Lambda$323/1930826709.apply$mcV$sp(Unknown 
> Source)
> at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
> at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
> at 
> org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.sca
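The stack trace above shows the dispatcher thread spinning inside `scala.collection.mutable.HashMap.put`: that map is not thread-safe, and a racy update during a resize can corrupt the internal hash table so that `findEntry0` chases a cyclic entry chain forever. A minimal sketch of the class of fix, replacing the unsynchronized map with a `ConcurrentHashMap`, is shown below; the field name is illustrative only, not Spark's actual code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SafeRegistry {
    // Hypothetical stand-in for the executor backend's task-tracking map.
    // ConcurrentHashMap tolerates concurrent writers; an unsynchronized
    // HashMap accessed from multiple threads can corrupt its table during
    // a racy resize, producing the infinite loop seen in the thread dump.
    static final Map<Long, String> taskResources = new ConcurrentHashMap<>();

    public static void main(String[] args) throws InterruptedException {
        // Two threads registering disjoint task ids concurrently.
        Thread t1 = new Thread(() -> {
            for (long i = 0; i < 10_000; i++) taskResources.put(i, "a");
        });
        Thread t2 = new Thread(() -> {
            for (long i = 10_000; i < 20_000; i++) taskResources.put(i, "b");
        });
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(taskResources.size()); // prints 20000
    }
}
```

The alternative of wrapping each access in `synchronized` would also be correct but serializes all map operations; `ConcurrentHashMap` only contends on the affected bins.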

[jira] [Updated] (SPARK-45377) [CORE] Handle InputStream in NettyLogger

2023-09-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45377:
---
Labels: pull-request-available  (was: )

> [CORE] Handle InputStream in NettyLogger
> 
>
> Key: SPARK-45377
> URL: https://issues.apache.org/jira/browse/SPARK-45377
> Project: Spark
>  Issue Type: Task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Hasnain Lakhani
>Priority: Major
>  Labels: pull-request-available
>
> Allow NettyLogger to also print the size of InputStreams, which aids debugging 
> of SSL functionality.
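The idea can be sketched with plain JDK streams; this is a hypothetical helper, not Spark's actual NettyLogger API: when the logged object is an `InputStream`, report how many bytes are immediately readable instead of the object's default `toString()`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamSizeLog {
    // Hypothetical helper: describe an arbitrary logged object, adding the
    // readable byte count when the object is an InputStream.
    static String describe(Object msg) throws IOException {
        if (msg instanceof InputStream) {
            // available() is an estimate of bytes readable without blocking;
            // for in-memory streams it is the exact remaining size.
            return "InputStream(available=" + ((InputStream) msg).available() + ")";
        }
        return String.valueOf(msg);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[42]);
        System.out.println(describe(in)); // prints InputStream(available=42)
    }
}
```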



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45378) [CORE] Add convertToNettyForSsl to ManagedBuffer

2023-09-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45378:
---
Labels: pull-request-available  (was: )

> [CORE] Add convertToNettyForSsl to ManagedBuffer
> 
>
> Key: SPARK-45378
> URL: https://issues.apache.org/jira/browse/SPARK-45378
> Project: Spark
>  Issue Type: Task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Hasnain Lakhani
>Priority: Major
>  Labels: pull-request-available
>
> Since Netty's SSL support does not allow zero-copy transfers, add another 
> API to ManagedBuffer so we can get buffers in a format that works with SSL.
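The reason is that SSL must encrypt bytes in user space, so the kernel-level `sendfile` path (Netty's `FileRegion`) cannot be used; the bytes have to be materialized into a buffer first. A minimal sketch of that non-zero-copy path, using plain NIO (the method name is hypothetical, loosely mirroring the proposed `convertToNettyForSsl`):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SslBufferSketch {
    // Hypothetical stand-in: instead of handing the kernel a
    // (file, offset, length) triple for sendfile, read the region into a
    // heap buffer so an SSL handler can encrypt it in user space.
    static ByteBuffer toHeapBuffer(Path file, long offset, int length) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(length);
            ch.read(buf, offset);  // positional read; does not move the channel position
            buf.flip();            // make the bytes readable
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("buf", ".bin");
        Files.write(tmp, new byte[]{1, 2, 3, 4, 5, 6});
        ByteBuffer buf = toHeapBuffer(tmp, 2, 3);
        System.out.println(buf.remaining()); // prints 3
        System.out.println(buf.get(0));      // prints 3 (the byte at file offset 2)
    }
}
```

The cost is an extra copy and extra heap pressure per transfer, which is exactly what zero-copy avoids on the non-SSL path.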






[jira] [Updated] (SPARK-44034) Add a new test group for sql module

2023-09-28 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-44034:
-
Affects Version/s: 3.4.2
   3.3.4

> Add a new test group for sql module
> ---
>
> Key: SPARK-44034
> URL: https://issues.apache.org/jira/browse/SPARK-44034
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 3.4.2, 3.5.0, 3.3.4
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>







[jira] [Updated] (SPARK-44074) `Logging plan changes for execution` test failed

2023-09-28 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-44074:
-
Affects Version/s: 3.4.2
   3.3.4

> `Logging plan changes for execution` test failed
> 
>
> Key: SPARK-44074
> URL: https://issues.apache.org/jira/browse/SPARK-44074
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Tests
>Affects Versions: 3.4.2, 3.5.0, 3.3.4
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> run {{build/sbt clean "sql/test" 
> -Dtest.exclude.tags=org.apache.spark.tags.ExtendedSQLTest,org.apache.spark.tags.SlowSQLTest}}
> {code:java}
> 2023-06-15T19:58:34.4105460Z [info] QueryExecutionSuite:
> 2023-06-15T19:58:34.5395268Z [info] - dumping query execution info to a file (77 milliseconds)
> 2023-06-15T19:58:34.5856902Z [info] - dumping query execution info to an existing file (49 milliseconds)
> 2023-06-15T19:58:34.6099849Z [info] - dumping query execution info to non-existing folder (25 milliseconds)
> 2023-06-15T19:58:34.6136467Z [info] - dumping query execution info by invalid path (4 milliseconds)
> 2023-06-15T19:58:34.6425071Z [info] - dumping query execution info to a file - explainMode=formatted (28 milliseconds)
> 2023-06-15T19:58:34.7084916Z [info] - limit number of fields by sql config (66 milliseconds)
> 2023-06-15T19:58:34.7432299Z [info] - check maximum fields restriction (34 milliseconds)
> 2023-06-15T19:58:34.7554546Z [info] - toString() exception/error handling (11 milliseconds)
> 2023-06-15T19:58:34.7621424Z [info] - SPARK-28346: clone the query plan between different stages (6 milliseconds)
> 2023-06-15T19:58:34.8001412Z [info] - Logging plan changes for execution *** FAILED *** (12 milliseconds)
> 2023-06-15T19:58:34.8007977Z [info]   testAppender.loggingEvents.exists(((x$10: org.apache.logging.log4j.core.LogEvent) => x$10.getMessage().getFormattedMessage().contains(expectedMsg))) was false (QueryExecutionSuite.scala:232)
> {code}
>  
> but running {{build/sbt "sql/testOnly *QueryExecutionSuite"}} alone does not 
> reproduce this issue; it needs investigation.


