[jira] [Assigned] (SPARK-44843) flaky test: RocksDBStateStoreStreamingAggregationSuite

2023-11-03 Thread Kent Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao reassigned SPARK-44843:


Assignee: Kent Yao  (was: Yang Jie)

> flaky test: RocksDBStateStoreStreamingAggregationSuite
> --
>
> Key: SPARK-44843
> URL: https://issues.apache.org/jira/browse/SPARK-44843
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Tests
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.2, 4.0.0, 3.5.1, 3.3.4
>
>
> I've seen this more than once; let's record it for now.
> [https://github.com/apache/spark/actions/runs/5875252243/job/15931264374]
> {code:java}
> 2023-08-16T06:49:14.0550627Z [info] - 
> SPARK-35896: metrics in StateOperatorProgress are output correctly 
> (RocksDBStateStore) *** FAILED *** (1 minute, 1 second)
> 2023-08-16T06:49:14.0560354Z [info]   Timed out 
> waiting for stream: The code passed to failAfter did not complete within 60 
> seconds.
> 2023-08-16T06:49:14.0568703Z [info]   
> java.lang.Thread.getStackTrace(Thread.java:1564)
> 2023-08-16T06:49:14.0578526Z [info]  
> org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:277)
> 2023-08-16T06:49:14.0600495Z [info]  
> org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
> 2023-08-16T06:49:14.0609443Z [info]  
> org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
> 2023-08-16T06:49:14.0630028Z [info]  
> org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
> 2023-08-16T06:49:14.0638142Z [info]  
> org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$7(StreamTest.scala:481)
> 2023-08-16T06:49:14.0704798Z [info]  
> org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$7$adapted(StreamTest.scala:480)
> 2023-08-16T06:49:14.0732716Z [info]  
> scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
> 2023-08-16T06:49:14.0743783Z [info]  
> scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
> 2023-08-16T06:49:14.0753421Z [info]  
> scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
> 2023-08-16T06:49:14.0765553Z [info]   
> 2023-08-16T06:49:14.0773522Z [info]  Caused 
> by:  null
> 2023-08-16T06:49:14.0787123Z [info]  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> 2023-08-16T06:49:14.0796604Z [info]  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2173)
> 2023-08-16T06:49:14.0808419Z [info]  
> org.apache.spark.sql.execution.streaming.StreamExecution.awaitOffset(StreamExecution.scala:481)
> 2023-08-16T06:49:14.0817018Z [info]  
> org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$8(StreamTest.scala:482)
> 2023-08-16T06:49:14.0824218Z [info]  
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> 2023-08-16T06:49:14.0831608Z [info]  
> org.scalatest.enablers.Timed$$anon$1.timeoutAfter(Timed.scala:127)
> 2023-08-16T06:49:14.0838059Z [info]  
> org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:282)
> 2023-08-16T06:49:14.0847335Z [info]  
> org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
> 2023-08-16T06:49:14.0854180Z [info]  
> org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
> 2023-08-16T06:49:14.0861298Z [info]  
> org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
> 2023-08-16T06:49:14.0866845Z [info]   
> 2023-08-16T06:49:14.0872599Z [info]   
> 2023-08-16T06:49:14.0880688Z [info]   == 
> Progress ==
> 

[jira] [Resolved] (SPARK-45785) Support `spark.deploy.appNumberModulo` to rotate app number

2023-11-03 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-45785.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43654
[https://github.com/apache/spark/pull/43654]

> Support `spark.deploy.appNumberModulo` to rotate app number
> ---
>
> Key: SPARK-45785
> URL: https://issues.apache.org/jira/browse/SPARK-45785
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45785) Support `spark.deploy.appNumberModulo` to rotate app number

2023-11-03 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-45785:
-

Assignee: Dongjoon Hyun

> Support `spark.deploy.appNumberModulo` to rotate app number
> ---
>
> Key: SPARK-45785
> URL: https://issues.apache.org/jira/browse/SPARK-45785
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45789) Support DESCRIBE TABLE for clustering columns

2023-11-03 Thread Terry Kim (Jira)
Terry Kim created SPARK-45789:
-

 Summary: Support DESCRIBE TABLE for clustering columns
 Key: SPARK-45789
 URL: https://issues.apache.org/jira/browse/SPARK-45789
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Terry Kim






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45788) Support SHOW CREATE TABLE for clustering columns

2023-11-03 Thread Terry Kim (Jira)
Terry Kim created SPARK-45788:
-

 Summary: Support SHOW CREATE TABLE for clustering columns
 Key: SPARK-45788
 URL: https://issues.apache.org/jira/browse/SPARK-45788
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Terry Kim






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45787) Support Catalog.listColumns() for clustering columns

2023-11-03 Thread Terry Kim (Jira)
Terry Kim created SPARK-45787:
-

 Summary: Support Catalog.listColumns() for clustering columns
 Key: SPARK-45787
 URL: https://issues.apache.org/jira/browse/SPARK-45787
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Terry Kim


Support Catalog.listColumns() for clustering columns so that 
`org.apache.spark.sql.catalog.Column` contains clustering info (e.g., isCluster).
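
For reference, a minimal sketch against the current API (assuming an existing table named some_table; the isCluster field itself does not exist yet):
{code:scala}
// Catalog.listColumns already surfaces per-column partition and bucket
// membership; the proposal adds an analogous clustering flag.
spark.catalog.listColumns("some_table")
  .select("name", "dataType", "isPartition", "isBucket")
  .show()
{code}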



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45786) Inaccurate Decimal multiplication and division results

2023-11-03 Thread Kazuyuki Tanimura (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kazuyuki Tanimura updated SPARK-45786:
--
Affects Version/s: 4.0.0

> Inaccurate Decimal multiplication and division results
> --
>
> Key: SPARK-45786
> URL: https://issues.apache.org/jira/browse/SPARK-45786
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.4, 3.3.3, 3.4.1, 3.5.0, 4.0.0
>Reporter: Kazuyuki Tanimura
>Priority: Major
>
> Decimal multiplication and division results may be inaccurate due to rounding 
> issues.
> h2. Multiplication:
> {code:scala}
> scala> sql("select  -14120025096157587712113961295153.858047 * 
> -0.4652").show(truncate=false)
> +----------------------------------------------------+
> |(-14120025096157587712113961295153.858047 * -0.4652)|
> +----------------------------------------------------+
> |6568635674732509803675414794505.574764              |
> +----------------------------------------------------+
> {code}
> The correct answer is
> {quote}6568635674732509803675414794505.574763
> {quote}
> Please note that the last digit is 3 instead of 4 as
>  
> {code:scala}
> scala> 
> java.math.BigDecimal("-14120025096157587712113961295153.858047").multiply(java.math.BigDecimal("-0.4652"))
> val res21: java.math.BigDecimal = 6568635674732509803675414794505.5747634644
> {code}
> Since the fractional part .574763 is followed by 4644, it should not be 
> rounded up.
> h2. Division:
> {code:scala}
> scala> sql("select -0.172787979 / 
> 533704665545018957788294905796.5").show(truncate=false)
> +-------------------------------------------------+
> |(-0.172787979 / 533704665545018957788294905796.5)|
> +-------------------------------------------------+
> |-3.237521E-31                                    |
> +-------------------------------------------------+
> {code}
> The correct answer is
> {quote}-3.237520E-31
> {quote}
> Please note that the last digit is 0 instead of 1 as
>  
> {code:scala}
> scala> 
> java.math.BigDecimal("-0.172787979").divide(java.math.BigDecimal("533704665545018957788294905796.5"),
>  100, java.math.RoundingMode.DOWN)
> val res22: java.math.BigDecimal = 
> -3.237520489418037889998826491401059986665344697406144511563561222578738E-31
> {code}
> Since the fractional part .237520 is followed by 4894..., it should not be 
> rounded up.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45780) Propagate all Spark Connect client threadlocal in InheritableThread

2023-11-03 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-45780:


Assignee: Juliusz Sompolski

> Propagate all Spark Connect client threadlocal in InheritableThread
> ---
>
> Key: SPARK-45780
> URL: https://issues.apache.org/jira/browse/SPARK-45780
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 4.0.0
>Reporter: Juliusz Sompolski
>Assignee: Juliusz Sompolski
>Priority: Major
>  Labels: pull-request-available
>
> Propagate all thread locals that can be set in SparkConnectClient, not only 
> 'tags'



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-45780) Propagate all Spark Connect client threadlocal in InheritableThread

2023-11-03 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-45780.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43649
[https://github.com/apache/spark/pull/43649]

> Propagate all Spark Connect client threadlocal in InheritableThread
> ---
>
> Key: SPARK-45780
> URL: https://issues.apache.org/jira/browse/SPARK-45780
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 4.0.0
>Reporter: Juliusz Sompolski
>Assignee: Juliusz Sompolski
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Propagate all thread locals that can be set in SparkConnectClient, not only 
> 'tags'



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45786) Inaccurate Decimal multiplication and division results

2023-11-03 Thread Kazuyuki Tanimura (Jira)
Kazuyuki Tanimura created SPARK-45786:
-

 Summary: Inaccurate Decimal multiplication and division results
 Key: SPARK-45786
 URL: https://issues.apache.org/jira/browse/SPARK-45786
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.5.0, 3.4.1, 3.3.3, 3.2.4
Reporter: Kazuyuki Tanimura


Decimal multiplication and division results may be inaccurate due to rounding 
issues.
h2. Multiplication:
{code:scala}
scala> sql("select  -14120025096157587712113961295153.858047 * 
-0.4652").show(truncate=false)
+----------------------------------------------------+
|(-14120025096157587712113961295153.858047 * -0.4652)|
+----------------------------------------------------+
|6568635674732509803675414794505.574764              |
+----------------------------------------------------+
{code}
The correct answer is
{quote}6568635674732509803675414794505.574763
{quote}

Please note that the last digit is 3 instead of 4 as

 
{code:scala}
scala> 
java.math.BigDecimal("-14120025096157587712113961295153.858047").multiply(java.math.BigDecimal("-0.4652"))
val res21: java.math.BigDecimal = 6568635674732509803675414794505.5747634644
{code}
Since the fractional part .574763 is followed by 4644, it should not be rounded 
up.
h2. Division:
{code:scala}
scala> sql("select -0.172787979 / 
533704665545018957788294905796.5").show(truncate=false)
+-------------------------------------------------+
|(-0.172787979 / 533704665545018957788294905796.5)|
+-------------------------------------------------+
|-3.237521E-31                                    |
+-------------------------------------------------+
{code}
The correct answer is
{quote}-3.237520E-31
{quote}

Please note that the last digit is 0 instead of 1 as

 
{code:scala}
scala> 
java.math.BigDecimal("-0.172787979").divide(java.math.BigDecimal("533704665545018957788294905796.5"),
 100, java.math.RoundingMode.DOWN)
val res22: java.math.BigDecimal = 
-3.237520489418037889998826491401059986665344697406144511563561222578738E-31
{code}
Since the fractional part .237520 is followed by 4894..., it should not be 
rounded up.
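
A quick way to check the expected digits outside Spark (a sketch, not part of the ticket): round the full-precision BigDecimal results with HALF_UP and compare.
{code:scala}
import java.math.{BigDecimal => JBigDecimal, MathContext, RoundingMode}

// Multiplication: the exact product ends in ...5747634644, so rounding
// to scale 6 with HALF_UP keeps the last digit 3, not 4.
val product = new JBigDecimal("-14120025096157587712113961295153.858047")
  .multiply(new JBigDecimal("-0.4652"))
println(product.setScale(6, RoundingMode.HALF_UP))
// 6568635674732509803675414794505.574763

// Division: the exact quotient starts -3.2375204894...E-31, so a
// 7-significant-digit result rounded with HALF_UP ends in 0, not 1.
val quotient = new JBigDecimal("-0.172787979")
  .divide(new JBigDecimal("533704665545018957788294905796.5"),
    new MathContext(7, RoundingMode.HALF_UP))
println(quotient) // -3.237520E-31
{code}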



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-44317) Define the computing logic through PartitionEvaluator API and use it in ShuffledHashJoinExec

2023-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-44317:
---
Labels: pull-request-available  (was: )

> Define the computing logic through PartitionEvaluator API and use it in 
> ShuffledHashJoinExec
> 
>
> Key: SPARK-44317
> URL: https://issues.apache.org/jira/browse/SPARK-44317
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.5.0
>Reporter: Vinod KC
>Priority: Major
>  Labels: pull-request-available
>
> Define the computing logic through PartitionEvaluator API and use it in 
> ShuffledHashJoinExec
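> A hedged sketch of the PartitionEvaluator pattern this family of tickets 
> applies (a toy evaluator; the actual change moves the join logic of 
> ShuffledHashJoinExec into eval()):
> {code:scala}
> import org.apache.spark.{PartitionEvaluator, PartitionEvaluatorFactory}
> 
> // The factory is Serializable and shipped to executors; the evaluator
> // holds the per-partition computing logic.
> class UpperCaseEvaluatorFactory extends PartitionEvaluatorFactory[String, String] {
>   override def createEvaluator(): PartitionEvaluator[String, String] =
>     new PartitionEvaluator[String, String] {
>       override def eval(partitionIndex: Int,
>                         inputs: Iterator[String]*): Iterator[String] =
>         inputs.head.map(_.toUpperCase)
>     }
> }
> // usage sketch: rdd.mapPartitionsWithEvaluator(new UpperCaseEvaluatorFactory)
> {code}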



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-44483) When using Spark to read the hive table, the number of file partitions cannot be set using Spark's configuration settings

2023-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-44483:
---
Labels: pull-request-available  (was: )

> When using Spark to read the hive table, the number of file partitions cannot 
> be set using Spark's configuration settings
> -
>
> Key: SPARK-44483
> URL: https://issues.apache.org/jira/browse/SPARK-44483
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.1
>Reporter: hao
>Priority: Major
>  Labels: pull-request-available
>
> When using Spark to read a Hive table, the number of file partitions cannot 
> be set using Spark's configuration settings.
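> A hedged repro sketch (table name and values are illustrative): for 
> non-converted Hive table scans, split sizing follows the Hadoop 
> input-format settings rather than Spark's file-source knob.
> {code:scala}
> // Has no effect on a Hive table scan:
> spark.conf.set("spark.sql.files.maxPartitionBytes", "16m")
> // The Hadoop-side setting is what actually drives the split count:
> spark.sparkContext.hadoopConfiguration
>   .set("mapreduce.input.fileinputformat.split.maxsize", "16777216")
> println(spark.table("db.hive_table").rdd.getNumPartitions)
> {code}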



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45785) Support `spark.deploy.appNumberModulo` to rotate app number

2023-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45785:
---
Labels: pull-request-available  (was: )

> Support `spark.deploy.appNumberModulo` to rotate app number
> ---
>
> Key: SPARK-45785
> URL: https://issues.apache.org/jira/browse/SPARK-45785
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45785) Support `spark.deploy.appNumberModulo` to rotate app number

2023-11-03 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-45785:
-

 Summary: Support `spark.deploy.appNumberModulo` to rotate app 
number
 Key: SPARK-45785
 URL: https://issues.apache.org/jira/browse/SPARK-45785
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun
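
A hedged illustration of the intent (assumed semantics; not the actual Master code): with a modulo configured, the app number embedded in generated application IDs wraps around instead of growing without bound.
{code:scala}
// Hypothetical sketch: rotate the counter used for IDs like
// "app-20231103123456-0042".
val appNumberModulo = 10000 // illustrative value for spark.deploy.appNumberModulo
var nextAppNumber = 0
def newApplicationId(createDate: String): String = {
  val appId = "app-%s-%04d".format(createDate, nextAppNumber)
  nextAppNumber = (nextAppNumber + 1) % appNumberModulo
  appId
}
{code}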






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-44886) Introduce CLUSTER BY SQL clause to CREATE/REPLACE table

2023-11-03 Thread Terry Kim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Terry Kim updated SPARK-44886:
--
Summary: Introduce CLUSTER BY SQL clause to CREATE/REPLACE table  (was: 
Introduce CLUSTER BY clause to CREATE/REPLACE table)

> Introduce CLUSTER BY SQL clause to CREATE/REPLACE table
> ---
>
> Key: SPARK-44886
> URL: https://issues.apache.org/jira/browse/SPARK-44886
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Terry Kim
>Priority: Major
>
> This proposes to introduce a CLUSTER BY clause to the CREATE/REPLACE TABLE SQL 
> syntax:
> {code:sql}
> CREATE TABLE tbl(a int, b string) CLUSTER BY (a, b){code}
> Spark itself will not provide an implementation; it is up to the catalog 
> implementation (e.g., Delta, Iceberg, etc.) to utilize the clustering 
> information.
> Note that specifying CLUSTER BY will throw an exception if the table being 
> created uses a v1 source or the session catalog (e.g., a v2 source with the 
> session catalog).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45784) Introduce clustering mechanism to Spark

2023-11-03 Thread Terry Kim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Terry Kim updated SPARK-45784:
--
Description: This proposes to introduce a clustering mechanism such that 
different data sources (e.g., Delta, Iceberg, etc.) can implement format 
specific clustering.  (was: This proposes to introduce CLUSTER BY clause to 
CREATE/REPLACE SQL syntax:
{code:java}
CREATE TABLE tbl(a int, b string) CLUSTER BY (a, b){code}
There will not be an implementation, but it's up to the catalog implementation 
to utilize the clustering information (e.g., Delta, Iceberg, etc.).

Note that specifying CLUSTER BY will throw an exception if the table being 
created is for v1 source or session catalog (e.g., v2 source w/ session 
catalog).)

> Introduce clustering mechanism to Spark
> ---
>
> Key: SPARK-45784
> URL: https://issues.apache.org/jira/browse/SPARK-45784
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Terry Kim
>Priority: Major
>
> This proposes to introduce a clustering mechanism such that different data 
> sources (e.g., Delta, Iceberg, etc.) can implement format-specific clustering.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-44886) Introduce CLUSTER BY SQL clause to CREATE/REPLACE TABLE

2023-11-03 Thread Terry Kim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Terry Kim updated SPARK-44886:
--
Summary: Introduce CLUSTER BY SQL clause to CREATE/REPLACE TABLE  (was: 
Introduce CLUSTER BY SQL clause to CREATE/REPLACE table)

> Introduce CLUSTER BY SQL clause to CREATE/REPLACE TABLE
> ---
>
> Key: SPARK-44886
> URL: https://issues.apache.org/jira/browse/SPARK-44886
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Terry Kim
>Priority: Major
>
> This proposes to introduce a CLUSTER BY clause to the CREATE/REPLACE TABLE SQL 
> syntax:
> {code:sql}
> CREATE TABLE tbl(a int, b string) CLUSTER BY (a, b){code}
> Spark itself will not provide an implementation; it is up to the catalog 
> implementation (e.g., Delta, Iceberg, etc.) to utilize the clustering 
> information.
> Note that specifying CLUSTER BY will throw an exception if the table being 
> created uses a v1 source or the session catalog (e.g., a v2 source with the 
> session catalog).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-44886) Introduce CLUSTER BY clause to CREATE/REPLACE table

2023-11-03 Thread Terry Kim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Terry Kim updated SPARK-44886:
--
Parent: SPARK-45784
Issue Type: Sub-task  (was: New Feature)

> Introduce CLUSTER BY clause to CREATE/REPLACE table
> ---
>
> Key: SPARK-44886
> URL: https://issues.apache.org/jira/browse/SPARK-44886
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Terry Kim
>Priority: Major
>
> This proposes to introduce a CLUSTER BY clause to the CREATE/REPLACE TABLE SQL 
> syntax:
> {code:sql}
> CREATE TABLE tbl(a int, b string) CLUSTER BY (a, b){code}
> Spark itself will not provide an implementation; it is up to the catalog 
> implementation (e.g., Delta, Iceberg, etc.) to utilize the clustering 
> information.
> Note that specifying CLUSTER BY will throw an exception if the table being 
> created uses a v1 source or the session catalog (e.g., a v2 source with the 
> session catalog).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45784) Introduce clustering mechanism to Spark

2023-11-03 Thread Terry Kim (Jira)
Terry Kim created SPARK-45784:
-

 Summary: Introduce clustering mechanism to Spark
 Key: SPARK-45784
 URL: https://issues.apache.org/jira/browse/SPARK-45784
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Affects Versions: 4.0.0
Reporter: Terry Kim


This proposes to introduce a CLUSTER BY clause to the CREATE/REPLACE TABLE SQL 
syntax:
{code:sql}
CREATE TABLE tbl(a int, b string) CLUSTER BY (a, b){code}
Spark itself will not provide an implementation; it is up to the catalog 
implementation (e.g., Delta, Iceberg, etc.) to utilize the clustering information.

Note that specifying CLUSTER BY will throw an exception if the table being 
created uses a v1 source or the session catalog (e.g., a v2 source with the 
session catalog).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45783) Improve exception message when no remote url is set

2023-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45783:
---
Labels: pull-request-available  (was: )

> Improve exception message when no remote url is set 
> 
>
> Key: SPARK-45783
> URL: https://issues.apache.org/jira/browse/SPARK-45783
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect, PySpark
>Affects Versions: 3.5.0, 4.0.0
>Reporter: Allison Wang
>Priority: Major
>  Labels: pull-request-available
>
> When "SPARK_CONNECT_MODE_ENABLED" but no spark remote url is set, PySpark 
> currently throws this exception:
> AttributeError: 'NoneType' object has no attribute 'startswith'
> We should improve this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45783) Improve exception message when no remote url is set

2023-11-03 Thread Allison Wang (Jira)
Allison Wang created SPARK-45783:


 Summary: Improve exception message when no remote url is set 
 Key: SPARK-45783
 URL: https://issues.apache.org/jira/browse/SPARK-45783
 Project: Spark
  Issue Type: Improvement
  Components: Connect, PySpark
Affects Versions: 3.5.0, 4.0.0
Reporter: Allison Wang


When "SPARK_CONNECT_MODE_ENABLED" but no spark remote url is set, PySpark 
currently throws this exception:

AttributeError: 'NoneType' object has no attribute 'startswith'

We should improve this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45782) Add Dataframe API df.explainString()

2023-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45782:
---
Labels: pull-request-available  (was: )

> Add Dataframe API df.explainString()
> 
>
> Key: SPARK-45782
> URL: https://issues.apache.org/jira/browse/SPARK-45782
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect, PySpark, Spark Core
>Affects Versions: 4.0.0
>Reporter: Khalid Mammadov
>Priority: Minor
>  Labels: pull-request-available
>
> This is a frequently needed feature for performance optimization purposes. Users 
> often want to inspect this output on running systems and would like to 
> save/extract it for later analysis.
> The current API is only provided for Scala, i.e. 
> {{df.queryExecution.toString()}},
> and it is not located in an intuitive place where the average Spark user (i.e. a 
> non-expert, non-Scala dev) can find it immediately.
> It would also spare users workarounds that capture the printed output, such as:
> {code:python}
> import io
> from contextlib import redirect_stdout
> with io.StringIO() as buf, redirect_stdout(buf):
>     df.explain(True)
>     plan = buf.getvalue()
> {code}
>  
> So it would help users a lot to have this output available as
> df.explainString()
> next to
> df.explain()
> so users can easily locate and use it.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45782) Add Dataframe API df.explainString()

2023-11-03 Thread Khalid Mammadov (Jira)
Khalid Mammadov created SPARK-45782:
---

 Summary: Add Dataframe API df.explainString()
 Key: SPARK-45782
 URL: https://issues.apache.org/jira/browse/SPARK-45782
 Project: Spark
  Issue Type: Improvement
  Components: Connect, PySpark, Spark Core
Affects Versions: 4.0.0
Reporter: Khalid Mammadov


This is a frequently needed feature for performance optimization purposes. Users 
often want to inspect this output on running systems and would like to 
save/extract it for later analysis.

The current API is only provided for Scala, i.e. 

{{df.queryExecution.toString()}},

and it is not located in an intuitive place where the average Spark user (i.e. a 
non-expert, non-Scala dev) can find it immediately.

It would also spare users workarounds that capture the printed output, such as:
{code:python}
import io
from contextlib import redirect_stdout

with io.StringIO() as buf, redirect_stdout(buf):
    df.explain(True)
    plan = buf.getvalue()
{code}

So it would help users a lot to have this output available as

df.explainString()

next to

df.explain()

so users can easily locate and use it.
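
For reference, a minimal sketch of the Scala-side access mentioned above (df is any DataFrame; explainString itself does not exist yet):
{code:scala}
import org.apache.spark.sql.functions.col

val df = spark.range(10).filter(col("id") > 5)
// Today the full plan text is only reachable through queryExecution:
val planText: String = df.queryExecution.toString()
// The proposal: expose something like df.explainString() next to df.explain().
{code}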

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-44843) flaky test: RocksDBStateStoreStreamingAggregationSuite

2023-11-03 Thread Kent Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao resolved SPARK-44843.
--
Fix Version/s: 3.3.4
   3.5.1
   4.0.0
   3.4.2
   Resolution: Fixed

Issue resolved by pull request 43647
[https://github.com/apache/spark/pull/43647]

> flaky test: RocksDBStateStoreStreamingAggregationSuite
> --
>
> Key: SPARK-44843
> URL: https://issues.apache.org/jira/browse/SPARK-44843
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Tests
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.4, 3.5.1, 4.0.0, 3.4.2
>
>
> I've seen this more than once; let's record it for now.
> [https://github.com/apache/spark/actions/runs/5875252243/job/15931264374]
> {code:java}
> 2023-08-16T06:49:14.0550627Z [info] - 
> SPARK-35896: metrics in StateOperatorProgress are output correctly 
> (RocksDBStateStore) *** FAILED *** (1 minute, 1 second)
> 2023-08-16T06:49:14.0560354Z [info]   Timed out 
> waiting for stream: The code passed to failAfter did not complete within 60 
> seconds.
> 2023-08-16T06:49:14.0568703Z [info]   
> java.lang.Thread.getStackTrace(Thread.java:1564)
> 2023-08-16T06:49:14.0578526Z [info]  
> org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:277)
> 2023-08-16T06:49:14.0600495Z [info]  
> org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
> 2023-08-16T06:49:14.0609443Z [info]  
> org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
> 2023-08-16T06:49:14.0630028Z [info]  
> org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
> 2023-08-16T06:49:14.0638142Z [info]  
> org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$7(StreamTest.scala:481)
> 2023-08-16T06:49:14.0704798Z [info]  
> org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$7$adapted(StreamTest.scala:480)
> 2023-08-16T06:49:14.0732716Z [info]  
> scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
> 2023-08-16T06:49:14.0743783Z [info]  
> scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
> 2023-08-16T06:49:14.0753421Z [info]  
> scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
> 2023-08-16T06:49:14.0765553Z [info]   
> 2023-08-16T06:49:14.0773522Z [info]  Caused 
> by:  null
> 2023-08-16T06:49:14.0787123Z [info]  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> 2023-08-16T06:49:14.0796604Z [info]  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2173)
> 2023-08-16T06:49:14.0808419Z [info]  
> org.apache.spark.sql.execution.streaming.StreamExecution.awaitOffset(StreamExecution.scala:481)
> 2023-08-16T06:49:14.0817018Z [info]  
> org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$8(StreamTest.scala:482)
> 2023-08-16T06:49:14.0824218Z [info]  
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> 2023-08-16T06:49:14.0831608Z [info]  
> org.scalatest.enablers.Timed$$anon$1.timeoutAfter(Timed.scala:127)
> 2023-08-16T06:49:14.0838059Z [info]  
> org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:282)
> 2023-08-16T06:49:14.0847335Z [info]  
> org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
> 2023-08-16T06:49:14.0854180Z [info]  
> org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
> 2023-08-16T06:49:14.0861298Z [info]  
> org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
> 2023-08-16T06:49:14.0866845Z [info]   
> 2

[jira] [Assigned] (SPARK-44843) flaky test: RocksDBStateStoreStreamingAggregationSuite

2023-11-03 Thread Kent Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao reassigned SPARK-44843:


Assignee: Yang Jie

> flaky test: RocksDBStateStoreStreamingAggregationSuite
> --
>
> Key: SPARK-44843
> URL: https://issues.apache.org/jira/browse/SPARK-44843
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Tests
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>
> I've seen this more than once; let's record it for now.
> [https://github.com/apache/spark/actions/runs/5875252243/job/15931264374]
> {code:java}
> 2023-08-16T06:49:14.0550627Z [info] - 
> SPARK-35896: metrics in StateOperatorProgress are output correctly 
> (RocksDBStateStore) *** FAILED *** (1 minute, 1 second)
> 2023-08-16T06:49:14.0560354Z [info]   Timed out 
> waiting for stream: The code passed to failAfter did not complete within 60 
> seconds.
> 2023-08-16T06:49:14.0568703Z [info]   
> java.lang.Thread.getStackTrace(Thread.java:1564)
> 2023-08-16T06:49:14.0578526Z [info]  
> org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:277)
> 2023-08-16T06:49:14.0600495Z [info]  
> org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
> 2023-08-16T06:49:14.0609443Z [info]  
> org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
> 2023-08-16T06:49:14.0630028Z [info]  
> org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
> 2023-08-16T06:49:14.0638142Z [info]  
> org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$7(StreamTest.scala:481)
> 2023-08-16T06:49:14.0704798Z [info]  
> org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$7$adapted(StreamTest.scala:480)
> 2023-08-16T06:49:14.0732716Z [info]  
> scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
> 2023-08-16T06:49:14.0743783Z [info]  
> scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
> 2023-08-16T06:49:14.0753421Z [info]  
> scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
> 2023-08-16T06:49:14.0765553Z [info]   
> 2023-08-16T06:49:14.0773522Z [info]  Caused 
> by:  null
> 2023-08-16T06:49:14.0787123Z [info]  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> 2023-08-16T06:49:14.0796604Z [info]  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2173)
> 2023-08-16T06:49:14.0808419Z [info]  
> org.apache.spark.sql.execution.streaming.StreamExecution.awaitOffset(StreamExecution.scala:481)
> 2023-08-16T06:49:14.0817018Z [info]  
> org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$8(StreamTest.scala:482)
> 2023-08-16T06:49:14.0824218Z [info]  
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> 2023-08-16T06:49:14.0831608Z [info]  
> org.scalatest.enablers.Timed$$anon$1.timeoutAfter(Timed.scala:127)
> 2023-08-16T06:49:14.0838059Z [info]  
> org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:282)
> 2023-08-16T06:49:14.0847335Z [info]  
> org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
> 2023-08-16T06:49:14.0854180Z [info]  
> org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
> 2023-08-16T06:49:14.0861298Z [info]  
> org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
> 2023-08-16T06:49:14.0866845Z [info]   
> 2023-08-16T06:49:14.0872599Z [info]   
> 2023-08-16T06:49:14.0880688Z [info]   == 
> Progress ==
> 2023-08-16T06:49:14.0887109Z [info] 

[jira] [Resolved] (SPARK-45779) Add a check step for jira issue ticket to be resolved

2023-11-03 Thread Kent Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao resolved SPARK-45779.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43648
[https://github.com/apache/spark/pull/43648]

> Add a check step for jira issue ticket to be resolved
> -
>
> Key: SPARK-45779
> URL: https://issues.apache.org/jira/browse/SPARK-45779
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45779) Add a check step for jira issue ticket to be resolved

2023-11-03 Thread Kent Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao reassigned SPARK-45779:


Assignee: Kent Yao

> Add a check step for jira issue ticket to be resolved
> -
>
> Key: SPARK-45779
> URL: https://issues.apache.org/jira/browse/SPARK-45779
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45781) Upgrade Arrow to 14.0.0

2023-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45781:
---
Labels: pull-request-available  (was: )

> Upgrade Arrow to 14.0.0
> ---
>
> Key: SPARK-45781
> URL: https://issues.apache.org/jira/browse/SPARK-45781
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>
> https://arrow.apache.org/release/14.0.0.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45781) Upgrade Arrow to 14.0.0

2023-11-03 Thread Yang Jie (Jira)
Yang Jie created SPARK-45781:


 Summary: Upgrade Arrow to 14.0.0
 Key: SPARK-45781
 URL: https://issues.apache.org/jira/browse/SPARK-45781
 Project: Spark
  Issue Type: Improvement
  Components: Build
Affects Versions: 4.0.0
Reporter: Yang Jie


https://arrow.apache.org/release/14.0.0.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-45688) Clean up the deprecated API usage related to `MapOps`

2023-11-03 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie resolved SPARK-45688.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43578
[https://github.com/apache/spark/pull/43578]

> Clean up the deprecated API usage related to `MapOps`
> -
>
> Key: SPARK-45688
> URL: https://issues.apache.org/jira/browse/SPARK-45688
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: BingKun Pan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> * method - in trait MapOps is deprecated (since 2.13.0)
>  * method -- in trait MapOps is deprecated (since 2.13.0)
>  * method + in trait MapOps is deprecated (since 2.13.0)
>  
> {code:java}
> [warn] 
> /Users/yangjie01/SourceCode/git/spark-mine-sbt/core/src/main/scala/org/apache/spark/deploy/worker/CommandUtils.scala:84:27:
>  method + in trait MapOps is deprecated (since 2.13.0): Consider requiring an 
> immutable Map or fall back to Map.concat.
> [warn] Applicable -Wconf / @nowarn filters for this warning: msg=<part of the message>, cat=deprecation, 
> site=org.apache.spark.deploy.worker.CommandUtils.buildLocalCommand.newEnvironment,
>  origin=scala.collection.MapOps.+, version=2.13.0
> [warn]       command.environment + ((libraryPathName, 
> libraryPaths.mkString(File.pathSeparator)))
> [warn]                           ^
> [warn] 
> /Users/yangjie01/SourceCode/git/spark-mine-sbt/core/src/main/scala/org/apache/spark/deploy/worker/CommandUtils.scala:91:22:
>  method + in trait MapOps is deprecated (since 2.13.0): Consider requiring an 
> immutable Map or fall back to Map.concat.
> [warn] Applicable -Wconf / @nowarn filters for this warning: msg=<part of the message>, cat=deprecation, 
> site=org.apache.spark.deploy.worker.CommandUtils.buildLocalCommand, 
> origin=scala.collection.MapOps.+, version=2.13.0
> [warn]       newEnvironment += (SecurityManager.ENV_AUTH_SECRET -> 
> securityMgr.getSecretKey())
> [warn]                      ^ {code}
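> A hedged sketch of the kind of rewrite such a cleanup applies (illustrative 
> names, not the actual Spark code):
> {code:scala}
> import scala.collection.mutable
> 
> val env = mutable.Map("A" -> "1")
> // Deprecated since 2.13: env + (("B", "2"))
> val merged: Map[String, String] = env.toMap + ("B" -> "2") // immutable result, no warning
> env += ("B" -> "2") // or mutate in place, which is not deprecated
> {code}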



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-45774) Support `spark.master.ui.historyServerUrl` in `ApplicationPage`

2023-11-03 Thread Kent Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao resolved SPARK-45774.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43643
[https://github.com/apache/spark/pull/43643]

> Support `spark.master.ui.historyServerUrl` in `ApplicationPage`
> ---
>
> Key: SPARK-45774
> URL: https://issues.apache.org/jira/browse/SPARK-45774
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45780) Propagate all Spark Connect client threadlocal in InheritableThread

2023-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45780:
---
Labels: pull-request-available  (was: )

> Propagate all Spark Connect client threadlocal in InheritableThread
> ---
>
> Key: SPARK-45780
> URL: https://issues.apache.org/jira/browse/SPARK-45780
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 4.0.0
>Reporter: Juliusz Sompolski
>Priority: Major
>  Labels: pull-request-available
>
> Propagate all thread locals that can be set in SparkConnectClient, not only 
> 'tags'



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45780) Propagate all Spark Connect client threadlocal in InheritableThread

2023-11-03 Thread Juliusz Sompolski (Jira)
Juliusz Sompolski created SPARK-45780:
-

 Summary: Propagate all Spark Connect client threadlocal in 
InheritableThread
 Key: SPARK-45780
 URL: https://issues.apache.org/jira/browse/SPARK-45780
 Project: Spark
  Issue Type: Improvement
  Components: Connect
Affects Versions: 4.0.0
Reporter: Juliusz Sompolski


Propagate all thread locals that can be set in SparkConnectClient, not only 
'tags'
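
A minimal, self-contained sketch of the propagation pattern (hypothetical names, not the SparkConnectClient internals): snapshot each thread-local in the parent thread and restore it in the child, as is already done for 'tags'.
{code:scala}
object ThreadLocalPropagation {
  // Stand-in for one of the client's thread-local fields.
  val tags = new ThreadLocal[Set[String]] {
    override def initialValue(): Set[String] = Set.empty
  }

  // Capture the parent's value at construction time, restore it in the child.
  def inheritableThread(body: => Unit): Thread = {
    val parentTags = tags.get()
    new Thread(() => {
      tags.set(parentTags)
      try body finally tags.remove()
    })
  }

  def main(args: Array[String]): Unit = {
    tags.set(Set("job-1"))
    val t = inheritableThread { println(s"child sees: ${tags.get()}") }
    t.start(); t.join()
  }
}
{code}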



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45738) client will wait forever if session in spark connect server is evicted

2023-11-03 Thread xie shuiahu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xie shuiahu updated SPARK-45738:

Labels: pull-request-available query-lifecycle  (was: 
pull-request-available)

> client will wait forever if session in spark connect server is evicted
> --
>
> Key: SPARK-45738
> URL: https://issues.apache.org/jira/browse/SPARK-45738
> Project: Spark
>  Issue Type: Bug
>  Components: Connect
>Affects Versions: 3.5.0
>Reporter: xie shuiahu
>Priority: Critical
>  Labels: pull-request-available, query-lifecycle
>
> Step 1. Start a Spark Connect server.
> Step 2. Submit a Spark job that runs for a long time:
> {code:python}
> from pyspark.sql import SparkSession
> spark = SparkSession.builder.remote("sc://HOST:PORT/;user_id=job").create()
> spark.sql("A SQL will run longer than creating 100 sessions").show() {code}
>  
> Step 3. Create more than 100 sessions, running concurrently with Step 2:
> {code:python}
> for i in range(0, 200):
>     spark = SparkSession.builder.remote(f"sc://HOST:PORT/;user_id={i}").create()
>     spark.sql("show databases") {code}
>  
> *When the Python code in Step 3 is executed, the session created in Step 2 will 
> be evicted, and the client will wait forever.*
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45779) Add a check step for jira issue ticket to be resolved

2023-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45779:
---
Labels: pull-request-available  (was: )

> Add a check step for jira issue ticket to be resolved
> -
>
> Key: SPARK-45779
> URL: https://issues.apache.org/jira/browse/SPARK-45779
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45779) Add a check step for jira issue ticket to be resolved

2023-11-03 Thread Kent Yao (Jira)
Kent Yao created SPARK-45779:


 Summary: Add a check step for jira issue ticket to be resolved
 Key: SPARK-45779
 URL: https://issues.apache.org/jira/browse/SPARK-45779
 Project: Spark
  Issue Type: Improvement
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Kent Yao






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-45776) Remove the defensive null check added in SPARK-39553.

2023-11-03 Thread Kent Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao resolved SPARK-45776.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 43644
[https://github.com/apache/spark/pull/43644]

> Remove the defensive null check added in SPARK-39553.
> -
>
> Key: SPARK-45776
> URL: https://issues.apache.org/jira/browse/SPARK-45776
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> {code:java}
> def unregisterShuffle(shuffleId: Int): Unit = {
>     shuffleStatuses.remove(shuffleId).foreach { shuffleStatus =>
>       // SPARK-39553: Add protection for Scala 2.13 due to 
> https://github.com/scala/bug/issues/12613
>       // We should revert this if Scala 2.13 solves this issue.
>       if (shuffleStatus != null) {
>         shuffleStatus.invalidateSerializedMapOutputStatusCache()
>         shuffleStatus.invalidateSerializedMergeOutputStatusCache()
>       }
>     }
>   } {code}
> This issue has been fixed in Scala 2.13.9.
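> A hedged sketch of the cleaned-up shape once the workaround is dropped 
> (assuming nothing else in the method changes):
> {code:scala}
> def unregisterShuffle(shuffleId: Int): Unit = {
>   shuffleStatuses.remove(shuffleId).foreach { shuffleStatus =>
>     shuffleStatus.invalidateSerializedMapOutputStatusCache()
>     shuffleStatus.invalidateSerializedMergeOutputStatusCache()
>   }
> }
> {code}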



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45776) Remove the defensive null check added in SPARK-39553.

2023-11-03 Thread Kent Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao reassigned SPARK-45776:


Assignee: Yang Jie

> Remove the defensive null check added in SPARK-39553.
> -
>
> Key: SPARK-45776
> URL: https://issues.apache.org/jira/browse/SPARK-45776
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
>  Labels: pull-request-available
>
> {code:java}
> def unregisterShuffle(shuffleId: Int): Unit = {
>     shuffleStatuses.remove(shuffleId).foreach { shuffleStatus =>
>       // SPARK-39553: Add protection for Scala 2.13 due to 
> https://github.com/scala/bug/issues/12613
>       // We should revert this if Scala 2.13 solves this issue.
>       if (shuffleStatus != null) {
>         shuffleStatus.invalidateSerializedMapOutputStatusCache()
>         shuffleStatus.invalidateSerializedMergeOutputStatusCache()
>       }
>     }
>   } {code}
> This issue has been fixed in Scala 2.13.9.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-44843) flaky test: RocksDBStateStoreStreamingAggregationSuite

2023-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-44843:
--

Assignee: (was: Apache Spark)

> flaky test: RocksDBStateStoreStreamingAggregationSuite
> --
>
> Key: SPARK-44843
> URL: https://issues.apache.org/jira/browse/SPARK-44843
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Tests
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>
> I've seen this more than once; let's record it for now.
> [https://github.com/apache/spark/actions/runs/5875252243/job/15931264374]
> {code:java}
> 2023-08-16T06:49:14.0550627Z [info] - 
> SPARK-35896: metrics in StateOperatorProgress are output correctly 
> (RocksDBStateStore) *** FAILED *** (1 minute, 1 second)
> 2023-08-16T06:49:14.0560354Z [info]   Timed out 
> waiting for stream: The code passed to failAfter did not complete within 60 
> seconds.
> 2023-08-16T06:49:14.0568703Z [info]   
> java.lang.Thread.getStackTrace(Thread.java:1564)
> 2023-08-16T06:49:14.0578526Z [info]  
> org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:277)
> 2023-08-16T06:49:14.0600495Z [info]  
> org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
> 2023-08-16T06:49:14.0609443Z [info]  
> org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
> 2023-08-16T06:49:14.0630028Z [info]  
> org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
> 2023-08-16T06:49:14.0638142Z [info]  
> org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$7(StreamTest.scala:481)
> 2023-08-16T06:49:14.0704798Z [info]  
> org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$7$adapted(StreamTest.scala:480)
> 2023-08-16T06:49:14.0732716Z [info]  
> scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
> 2023-08-16T06:49:14.0743783Z [info]  
> scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
> 2023-08-16T06:49:14.0753421Z [info]  
> scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
> 2023-08-16T06:49:14.0765553Z [info]   
> 2023-08-16T06:49:14.0773522Z [info]  Caused 
> by:  null
> 2023-08-16T06:49:14.0787123Z [info]  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> 2023-08-16T06:49:14.0796604Z [info]  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2173)
> 2023-08-16T06:49:14.0808419Z [info]  
> org.apache.spark.sql.execution.streaming.StreamExecution.awaitOffset(StreamExecution.scala:481)
> 2023-08-16T06:49:14.0817018Z [info]  
> org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$8(StreamTest.scala:482)
> 2023-08-16T06:49:14.0824218Z [info]  
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> 2023-08-16T06:49:14.0831608Z [info]  
> org.scalatest.enablers.Timed$$anon$1.timeoutAfter(Timed.scala:127)
> 2023-08-16T06:49:14.0838059Z [info]  
> org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:282)
> 2023-08-16T06:49:14.0847335Z [info]  
> org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
> 2023-08-16T06:49:14.0854180Z [info]  
> org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
> 2023-08-16T06:49:14.0861298Z [info]  
> org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
> 2023-08-16T06:49:14.0866845Z [info]   
> 2023-08-16T06:49:14.0872599Z [info]   
> 2023-08-16T06:49:14.0880688Z [info]   == 
> Progress ==
> 2023-08-16T06:49:14.0887109Z [info]  
> S

[jira] [Assigned] (SPARK-44843) flaky test: RocksDBStateStoreStreamingAggregationSuite

2023-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot reassigned SPARK-44843:
--

Assignee: Apache Spark

> flaky test: RocksDBStateStoreStreamingAggregationSuite
> --
>
> Key: SPARK-44843
> URL: https://issues.apache.org/jira/browse/SPARK-44843
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Tests
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Major
>  Labels: pull-request-available
>
> I've seen this more than once; let's record it for now.
> [https://github.com/apache/spark/actions/runs/5875252243/job/15931264374]
> {code:java}
> 2023-08-16T06:49:14.0550627Z [info] - SPARK-35896: metrics in StateOperatorProgress are output correctly (RocksDBStateStore) *** FAILED *** (1 minute, 1 second)
> 2023-08-16T06:49:14.0560354Z [info]   Timed out waiting for stream: The code passed to failAfter did not complete within 60 seconds.
> 2023-08-16T06:49:14.0568703Z [info]   java.lang.Thread.getStackTrace(Thread.java:1564)
> 2023-08-16T06:49:14.0578526Z [info]  org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:277)
> 2023-08-16T06:49:14.0600495Z [info]  org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
> 2023-08-16T06:49:14.0609443Z [info]  org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
> 2023-08-16T06:49:14.0630028Z [info]  org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
> 2023-08-16T06:49:14.0638142Z [info]  org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$7(StreamTest.scala:481)
> 2023-08-16T06:49:14.0704798Z [info]  org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$7$adapted(StreamTest.scala:480)
> 2023-08-16T06:49:14.0732716Z [info]  scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
> 2023-08-16T06:49:14.0743783Z [info]  scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
> 2023-08-16T06:49:14.0753421Z [info]  scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
> 2023-08-16T06:49:14.0765553Z [info]
> 2023-08-16T06:49:14.0773522Z [info]  Caused by: null
> 2023-08-16T06:49:14.0787123Z [info]  java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> 2023-08-16T06:49:14.0796604Z [info]  java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2173)
> 2023-08-16T06:49:14.0808419Z [info]  org.apache.spark.sql.execution.streaming.StreamExecution.awaitOffset(StreamExecution.scala:481)
> 2023-08-16T06:49:14.0817018Z [info]  org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$8(StreamTest.scala:482)
> 2023-08-16T06:49:14.0824218Z [info]  scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> 2023-08-16T06:49:14.0831608Z [info]  org.scalatest.enablers.Timed$$anon$1.timeoutAfter(Timed.scala:127)
> 2023-08-16T06:49:14.0838059Z [info]  org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:282)
> 2023-08-16T06:49:14.0847335Z [info]  org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
> 2023-08-16T06:49:14.0854180Z [info]  org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
> 2023-08-16T06:49:14.0861298Z [info]  org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
> 2023-08-16T06:49:14.0866845Z [info]
> 2023-08-16T06:49:14.0872599Z [info]
> 2023-08-16T06:49:14.0880688Z [info]   == Progress ==
> 2023-08-16T06:49:14.0887109Z [info]

[jira] [Updated] (SPARK-44843) flaky test: RocksDBStateStoreStreamingAggregationSuite

2023-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-44843:
---
Labels: pull-request-available  (was: )

> flaky test: RocksDBStateStoreStreamingAggregationSuite
> --
>
> Key: SPARK-44843
> URL: https://issues.apache.org/jira/browse/SPARK-44843
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Tests
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>
> I've seen this more than once; let's record it for now.
> [https://github.com/apache/spark/actions/runs/5875252243/job/15931264374]
> {code:java}
> 2023-08-16T06:49:14.0550627Z [info] - SPARK-35896: metrics in StateOperatorProgress are output correctly (RocksDBStateStore) *** FAILED *** (1 minute, 1 second)
> 2023-08-16T06:49:14.0560354Z [info]   Timed out waiting for stream: The code passed to failAfter did not complete within 60 seconds.
> 2023-08-16T06:49:14.0568703Z [info]   java.lang.Thread.getStackTrace(Thread.java:1564)
> 2023-08-16T06:49:14.0578526Z [info]  org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:277)
> 2023-08-16T06:49:14.0600495Z [info]  org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
> 2023-08-16T06:49:14.0609443Z [info]  org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
> 2023-08-16T06:49:14.0630028Z [info]  org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
> 2023-08-16T06:49:14.0638142Z [info]  org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$7(StreamTest.scala:481)
> 2023-08-16T06:49:14.0704798Z [info]  org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$7$adapted(StreamTest.scala:480)
> 2023-08-16T06:49:14.0732716Z [info]  scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
> 2023-08-16T06:49:14.0743783Z [info]  scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
> 2023-08-16T06:49:14.0753421Z [info]  scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
> 2023-08-16T06:49:14.0765553Z [info]
> 2023-08-16T06:49:14.0773522Z [info]  Caused by: null
> 2023-08-16T06:49:14.0787123Z [info]  java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> 2023-08-16T06:49:14.0796604Z [info]  java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2173)
> 2023-08-16T06:49:14.0808419Z [info]  org.apache.spark.sql.execution.streaming.StreamExecution.awaitOffset(StreamExecution.scala:481)
> 2023-08-16T06:49:14.0817018Z [info]  org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$8(StreamTest.scala:482)
> 2023-08-16T06:49:14.0824218Z [info]  scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> 2023-08-16T06:49:14.0831608Z [info]  org.scalatest.enablers.Timed$$anon$1.timeoutAfter(Timed.scala:127)
> 2023-08-16T06:49:14.0838059Z [info]  org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:282)
> 2023-08-16T06:49:14.0847335Z [info]  org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
> 2023-08-16T06:49:14.0854180Z [info]  org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
> 2023-08-16T06:49:14.0861298Z [info]  org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
> 2023-08-16T06:49:14.0866845Z [info]
> 2023-08-16T06:49:14.0872599Z [info]
> 2023-08-16T06:49:14.0880688Z [info]   == Progress ==
> 2023-08-16T06:49:14.0887109Z [info]  St
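
Reading the trace: the test thread blocks on a java.util.concurrent condition inside StreamExecution.awaitOffset, and when the 60-second failAfter limit expires the thread is interrupted (hence reportInterruptAfterWait, and the message-less InterruptedException printed as "Caused by: null"). A minimal, self-contained sketch of that mechanism follows; it is not Spark's test code, only an illustration using ScalaTest's TimeLimits with ThreadSignaler and an arbitrary 2-second limit:

{code:scala}
import java.util.concurrent.locks.ReentrantLock

import org.scalatest.concurrent.{Signaler, ThreadSignaler}
import org.scalatest.concurrent.TimeLimits.failAfter
import org.scalatest.time.SpanSugar._

object FailAfterSketch extends App {
  // A thread-interrupting signaler must be in scope for failAfter to
  // interrupt the timed-out thread; the reportInterruptAfterWait frame in
  // the trace shows that this is what happens in the Spark suite.
  implicit val signaler: Signaler = ThreadSignaler

  val lock = new ReentrantLock()
  val condition = lock.newCondition()

  try {
    failAfter(2.seconds) {       // StreamTest uses a 60-second limit
      lock.lock()
      try condition.await()      // blocks, like awaitOffset's condition wait
      finally lock.unlock()
    }
  } catch {
    // The timeout failure's cause is the InterruptedException raised
    // inside await(); its message is null, matching the log above.
    case t: Throwable => println(s"$t / caused by: ${t.getCause}")
  }
}
{code}

Run standalone, this should print a timeout failure whose cause is an InterruptedException, matching the shape of the failure in the CI log.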

[jira] [Updated] (SPARK-45778) Get SHS URL from Spark applications

2023-11-03 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-45778:
--
Description: 
This issue was created based on the following review comments:

[https://github.com/apache/spark/pull/43643#discussion_r1381212592]

> How about we get this address from the user's app first, and if not, then 
> from the standalone master's config?

 

 

> We can add it to the 
> [ApplicationDescription|https://github.com/apache/spark/blob/b9d379a6b84b67b29ccb578938764b888d64f293/core/src/main/scala/org/apache/spark/deploy/ApplicationDescription.scala#L24-L36]
>  ahead when you meet such a use case in the future:)

  was:
This issue was created based on the following review comments:

[https://github.com/apache/spark/pull/43643#discussion_r1381212592]

> How about we get this address from the user's app first, and if not, then 
> from the standalone master's config?


> Get SHS URL from Spark applications
> ---
>
> Key: SPARK-45778
> URL: https://issues.apache.org/jira/browse/SPARK-45778
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>
> This issue was created based on the following review comments:
> [https://github.com/apache/spark/pull/43643#discussion_r1381212592]
> > How about we get this address from the user's app first, and if not, then 
> > from the standalone master's config?
>  
>  
> > We can add it to the 
> > [ApplicationDescription|https://github.com/apache/spark/blob/b9d379a6b84b67b29ccb578938764b888d64f293/core/src/main/scala/org/apache/spark/deploy/ApplicationDescription.scala#L24-L36]
> >  ahead when you meet such a use case in the future:)
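
In other words, the lookup order under discussion is: the URL provided by the application first, then the standalone master's own configuration. A minimal sketch of that fallback, assuming a hypothetical history-server URL carried by the application (e.g. a new optional field on ApplicationDescription) and the `spark.master.ui.historyServerUrl` key from SPARK-45774; the actual wiring is in the linked pull request:

{code:scala}
import org.apache.spark.SparkConf

object ShsUrlSketch {
  // appHistoryUrl: hypothetical URL submitted with the application.
  // masterConf: the standalone master's SparkConf.
  def resolve(appHistoryUrl: Option[String], masterConf: SparkConf): Option[String] =
    appHistoryUrl // prefer the user's app setting ...
      .orElse(masterConf.getOption("spark.master.ui.historyServerUrl")) // ... then the master's config
}
{code}

The second review comment then suggests adding the field to ApplicationDescription only once a concrete use case calls for it.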



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45778) Get SHS URL from Spark applications

2023-11-03 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-45778:
-

 Summary: Get SHS URL from Spark applications
 Key: SPARK-45778
 URL: https://issues.apache.org/jira/browse/SPARK-45778
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun


This issue was created based on the following review comments:

[https://github.com/apache/spark/pull/43643#discussion_r1381212592]

> How about we get this address from the user's app first, and if not, then 
> from the standalone master's config?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45774) Support `spark.master.ui.historyServerUrl` in `ApplicationPage`

2023-11-03 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-45774:
--
Summary: Support `spark.master.ui.historyServerUrl` in `ApplicationPage`  
(was: Support `spark.ui.historyServerUrl` in `ApplicationPage`)

> Support `spark.master.ui.historyServerUrl` in `ApplicationPage`
> ---
>
> Key: SPARK-45774
> URL: https://issues.apache.org/jira/browse/SPARK-45774
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>
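
A sketch of what the new key enables, under the assumption that ApplicationPage would link a completed application to the history server only when the master has the key configured, and that the SHS serves applications under a /history/ path (the hypothetical helper below is illustrative, not the actual patch):

{code:scala}
import org.apache.spark.SparkConf

object HistoryLinkSketch {
  // Build the link ApplicationPage could render for an application once
  // `spark.master.ui.historyServerUrl` is set on the master; returns None
  // when the master has no history server configured.
  def historyPageLink(conf: SparkConf, appId: String): Option[String] =
    conf.getOption("spark.master.ui.historyServerUrl")
      .map(url => s"${url.stripSuffix("/")}/history/$appId")
}
{code}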




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org