[jira] [Updated] (SPARK-48578) Add new expressions for UTF8 string validation
[ https://issues.apache.org/jira/browse/SPARK-48578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-48578:
-----------------------------------
    Labels: pull-request-available  (was: )

> Add new expressions for UTF8 string validation
>
> Key: SPARK-48578
> URL: https://issues.apache.org/jira/browse/SPARK-48578
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Uroš Bojanić
> Priority: Major
> Labels: pull-request-available

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-48577) Replace invalid byte sequences in UTF8Strings
[ https://issues.apache.org/jira/browse/SPARK-48577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-48577:
-----------------------------------
    Labels: pull-request-available  (was: )

> Replace invalid byte sequences in UTF8Strings
>
> Key: SPARK-48577
> URL: https://issues.apache.org/jira/browse/SPARK-48577
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Uroš Bojanić
> Priority: Major
> Labels: pull-request-available
[jira] [Updated] (SPARK-48582) Bump `braces` from 3.0.2 to 3.0.3 in /ui-test
[ https://issues.apache.org/jira/browse/SPARK-48582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-48582:
-----------------------------------
    Labels: pull-request-available  (was: )

> Bump `braces` from 3.0.2 to 3.0.3 in /ui-test
>
> Key: SPARK-48582
> URL: https://issues.apache.org/jira/browse/SPARK-48582
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 4.0.0
> Reporter: Yang Jie
> Priority: Major
> Labels: pull-request-available
[jira] [Created] (SPARK-48582) Bump `braces` from 3.0.2 to 3.0.3 in /ui-test
Yang Jie created SPARK-48582:
--------------------------------

             Summary: Bump `braces` from 3.0.2 to 3.0.3 in /ui-test
                 Key: SPARK-48582
                 URL: https://issues.apache.org/jira/browse/SPARK-48582
             Project: Spark
          Issue Type: Improvement
          Components: Build
    Affects Versions: 4.0.0
            Reporter: Yang Jie
[jira] [Updated] (SPARK-48581) Upgrade dropwizard metrics to 4.2.26
[ https://issues.apache.org/jira/browse/SPARK-48581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei Guo updated SPARK-48581:
----------------------------
    Summary: Upgrade dropwizard metrics to 4.2.26  (was: Upgrade dropwizard metrics 4.2.26)

> Upgrade dropwizard metrics to 4.2.26
>
> Key: SPARK-48581
> URL: https://issues.apache.org/jira/browse/SPARK-48581
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 4.0.0
> Reporter: Wei Guo
> Priority: Minor
> Labels: pull-request-available
[jira] [Updated] (SPARK-48581) Upgrade dropwizard metrics 4.2.26
[ https://issues.apache.org/jira/browse/SPARK-48581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-48581:
-----------------------------------
    Labels: pull-request-available  (was: )

> Upgrade dropwizard metrics 4.2.26
>
> Key: SPARK-48581
> URL: https://issues.apache.org/jira/browse/SPARK-48581
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 4.0.0
> Reporter: Wei Guo
> Priority: Minor
> Labels: pull-request-available
[jira] [Created] (SPARK-48581) Upgrade dropwizard metrics 4.2.26
Wei Guo created SPARK-48581:
-------------------------------

             Summary: Upgrade dropwizard metrics 4.2.26
                 Key: SPARK-48581
                 URL: https://issues.apache.org/jira/browse/SPARK-48581
             Project: Spark
          Issue Type: Improvement
          Components: Build
    Affects Versions: 4.0.0
            Reporter: Wei Guo
[jira] [Resolved] (SPARK-48565) Fix thread dump display in UI
[ https://issues.apache.org/jira/browse/SPARK-48565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kent Yao resolved SPARK-48565.
------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 46916
[https://github.com/apache/spark/pull/46916]

> Fix thread dump display in UI
>
> Key: SPARK-48565
> URL: https://issues.apache.org/jira/browse/SPARK-48565
> Project: Spark
> Issue Type: Bug
> Components: UI
> Affects Versions: 4.0.0
> Reporter: Cheng Pan
> Assignee: Cheng Pan
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
[jira] [Assigned] (SPARK-48565) Fix thread dump display in UI
[ https://issues.apache.org/jira/browse/SPARK-48565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kent Yao reassigned SPARK-48565:
--------------------------------
    Assignee: Cheng Pan

> Fix thread dump display in UI
>
> Key: SPARK-48565
> URL: https://issues.apache.org/jira/browse/SPARK-48565
> Project: Spark
> Issue Type: Bug
> Components: UI
> Affects Versions: 4.0.0
> Reporter: Cheng Pan
> Assignee: Cheng Pan
> Priority: Major
> Labels: pull-request-available
[jira] [Resolved] (SPARK-48563) Upgrade pickle to 1.5
[ https://issues.apache.org/jira/browse/SPARK-48563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Jie resolved SPARK-48563.
------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 46913
[https://github.com/apache/spark/pull/46913]

> Upgrade pickle to 1.5
>
> Key: SPARK-48563
> URL: https://issues.apache.org/jira/browse/SPARK-48563
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 4.0.0
> Reporter: Yang Jie
> Assignee: Yang Jie
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
[jira] [Updated] (SPARK-48580) MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data
[ https://issues.apache.org/jira/browse/SPARK-48580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaoyajun02 updated SPARK-48580:
-------------------------------
    Description:

When push-based shuffle is enabled, 0.03% of the Spark applications in our cluster experienced shuffle data loss. The metrics of the Exchange are as follows:

!image-2024-06-11-10-19-57-227.png|width=405,height=170!

We eventually found some WARN logs on the shuffle server:

{code:java}
WARN shuffle-server-8-216 org.apache.spark.network.shuffle.RemoteBlockPushResolver: Application application_ shuffleId 0 shuffleMergeId 0 reduceId 133 update to index/meta failed{code}

We then analyzed the cause from the code: the merge metadata obtained by the reduce side from the driver comes from the {{mapTracker}} in the server's memory, while the actual reading of chunk data is based on the records in the shuffle server's {{metaFile}}. There is no consistency check between the two.

    (was: the same description, with "The metrics for the job execution plan's Exchange are as follows" instead of "The metrics of the Exchange are as follows")

> MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data
>
> Key: SPARK-48580
> URL: https://issues.apache.org/jira/browse/SPARK-48580
> Project: Spark
> Issue Type: Bug
> Components: Shuffle
> Affects Versions: 3.2.0, 3.3.0, 3.4.0, 3.5.0
> Reporter: gaoyajun02
> Priority: Major
> Attachments: image-2024-06-11-10-19-57-227.png
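The report above describes two metadata sources with no cross-check: the chunk counts the driver hands the reduce side (from the server's in-memory mapTracker) and the chunks actually recorded in the shuffle server's meta file. A minimal plain-Python sketch of the kind of consistency check the reporter says is missing (hypothetical names, not Spark's actual classes):

```python
# Hypothetical sketch, not Spark's actual API: before serving a merged
# block, compare the chunk count the driver's merge metadata reports
# with what the shuffle server's meta file actually recorded.

def validate_merged_block(driver_chunk_count, metafile_chunk_counts, reduce_id):
    """Return the chunk count to serve, or raise if the two sources disagree."""
    recorded = metafile_chunk_counts.get(reduce_id)
    if recorded is None:
        raise RuntimeError(f"reduceId {reduce_id}: no entry in meta file")
    if recorded != driver_chunk_count:
        # A mismatch means the index/meta update failed after the driver's
        # mapTracker was updated; the reader should fall back to fetching
        # the original (unmerged) blocks rather than read a partial merge.
        raise RuntimeError(
            f"reduceId {reduce_id}: driver reports {driver_chunk_count} "
            f"chunks, meta file has {recorded}"
        )
    return recorded

# Consistent case passes through:
assert validate_merged_block(4, {133: 4}, 133) == 4
```

On a mismatch (e.g. the driver reports 5 chunks but the meta file has 4, as in the WARN log scenario) the sketch raises instead of silently serving fewer chunks.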
[jira] [Updated] (SPARK-48580) MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data
[ https://issues.apache.org/jira/browse/SPARK-48580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaoyajun02 updated SPARK-48580:
-------------------------------
    Description:

When push-based shuffle is enabled, 0.03% of the Spark applications in our cluster experienced shuffle data loss. The metrics for the job execution plan's Exchange are as follows:

!image-2024-06-11-10-19-57-227.png|width=405,height=170!

We eventually found some WARN logs on the shuffle server:

{code:java}
WARN shuffle-server-8-216 org.apache.spark.network.shuffle.RemoteBlockPushResolver: Application application_ shuffleId 0 shuffleMergeId 0 reduceId 133 update to index/meta failed{code}

We then analyzed the cause from the code: the merge metadata obtained by the reduce side from the driver comes from the {{mapTracker}} in the server's memory, while the actual reading of chunk data is based on the records in the shuffle server's {{metaFile}}. There is no consistency check between the two.

    (was: an empty description)

> MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data
>
> Key: SPARK-48580
> URL: https://issues.apache.org/jira/browse/SPARK-48580
> Project: Spark
> Issue Type: Bug
> Components: Shuffle
> Affects Versions: 3.2.0, 3.3.0, 3.4.0, 3.5.0
> Reporter: gaoyajun02
> Priority: Major
> Attachments: image-2024-06-11-10-19-57-227.png
[jira] [Updated] (SPARK-48580) MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data
[ https://issues.apache.org/jira/browse/SPARK-48580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaoyajun02 updated SPARK-48580:
-------------------------------
    Attachment: (was: image-2024-06-11-10-19-22-284.png)

> MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data
>
> Key: SPARK-48580
> URL: https://issues.apache.org/jira/browse/SPARK-48580
> Project: Spark
> Issue Type: Bug
> Components: Shuffle
> Affects Versions: 3.2.0, 3.3.0, 3.4.0, 3.5.0
> Reporter: gaoyajun02
> Priority: Major
> Attachments: image-2024-06-11-10-19-57-227.png
[jira] [Updated] (SPARK-48580) MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data
[ https://issues.apache.org/jira/browse/SPARK-48580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaoyajun02 updated SPARK-48580:
-------------------------------
    Attachment: image-2024-06-11-10-19-57-227.png

> MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data
>
> Key: SPARK-48580
> URL: https://issues.apache.org/jira/browse/SPARK-48580
> Project: Spark
> Issue Type: Bug
> Components: Shuffle
> Affects Versions: 3.2.0, 3.3.0, 3.4.0, 3.5.0
> Reporter: gaoyajun02
> Priority: Major
> Attachments: image-2024-06-11-10-19-22-284.png, image-2024-06-11-10-19-57-227.png
[jira] [Updated] (SPARK-48580) MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data
[ https://issues.apache.org/jira/browse/SPARK-48580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaoyajun02 updated SPARK-48580:
-------------------------------
    Attachment: image-2024-06-11-10-19-22-284.png

> MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data
>
> Key: SPARK-48580
> URL: https://issues.apache.org/jira/browse/SPARK-48580
> Project: Spark
> Issue Type: Bug
> Components: Shuffle
> Affects Versions: 3.2.0, 3.3.0, 3.4.0, 3.5.0
> Reporter: gaoyajun02
> Priority: Major
> Attachments: image-2024-06-11-10-19-22-284.png
[jira] [Updated] (SPARK-48580) MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data
[ https://issues.apache.org/jira/browse/SPARK-48580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaoyajun02 updated SPARK-48580:
-------------------------------
    Summary: MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data  (was: The merge blocks read by reduce have missing chunks, leading to inconsistent shuffle data)

> MergedBlock read by reduce have missing chunks, leading to inconsistent shuffle data
>
> Key: SPARK-48580
> URL: https://issues.apache.org/jira/browse/SPARK-48580
> Project: Spark
> Issue Type: Bug
> Components: Shuffle
> Affects Versions: 3.2.0, 3.3.0, 3.4.0, 3.5.0
> Reporter: gaoyajun02
> Priority: Major
[jira] [Created] (SPARK-48580) The merge blocks read by reduce have missing chunks, leading to inconsistent shuffle data
gaoyajun02 created SPARK-48580:
----------------------------------

             Summary: The merge blocks read by reduce have missing chunks, leading to inconsistent shuffle data
                 Key: SPARK-48580
                 URL: https://issues.apache.org/jira/browse/SPARK-48580
             Project: Spark
          Issue Type: Bug
          Components: Shuffle
    Affects Versions: 3.5.0, 3.4.0, 3.3.0, 3.2.0
            Reporter: gaoyajun02
[jira] [Updated] (SPARK-48480) StreamingQueryListener thread should not be interruptable
[ https://issues.apache.org/jira/browse/SPARK-48480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-48480:
-----------------------------------
    Labels: pull-request-available  (was: )

> StreamingQueryListener thread should not be interruptable
>
> Key: SPARK-48480
> URL: https://issues.apache.org/jira/browse/SPARK-48480
> Project: Spark
> Issue Type: New Feature
> Components: Connect, SS
> Affects Versions: 4.0.0
> Reporter: Wei Liu
> Priority: Major
> Labels: pull-request-available
[jira] [Resolved] (SPARK-48569) Connect - StreamingQuery.name should return null when not specified
[ https://issues.apache.org/jira/browse/SPARK-48569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-48569.
----------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 46920
[https://github.com/apache/spark/pull/46920]

> Connect - StreamingQuery.name should return null when not specified
>
> Key: SPARK-48569
> URL: https://issues.apache.org/jira/browse/SPARK-48569
> Project: Spark
> Issue Type: New Feature
> Components: Connect, SS
> Affects Versions: 4.0.0
> Reporter: Wei Liu
> Assignee: Wei Liu
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
[jira] [Assigned] (SPARK-48410) Fix InitCap expression
[ https://issues.apache.org/jira/browse/SPARK-48410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan reassigned SPARK-48410:
-----------------------------------
    Assignee: Uroš Bojanić

> Fix InitCap expression
>
> Key: SPARK-48410
> URL: https://issues.apache.org/jira/browse/SPARK-48410
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Uroš Bojanić
> Assignee: Uroš Bojanić
> Priority: Major
> Labels: pull-request-available
[jira] [Resolved] (SPARK-48410) Fix InitCap expression
[ https://issues.apache.org/jira/browse/SPARK-48410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-48410.
---------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 46732
[https://github.com/apache/spark/pull/46732]

> Fix InitCap expression
>
> Key: SPARK-48410
> URL: https://issues.apache.org/jira/browse/SPARK-48410
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Uroš Bojanić
> Assignee: Uroš Bojanić
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
[jira] [Resolved] (SPARK-48403) Fix Upper & Lower expressions for UTF8_BINARY_LCASE
[ https://issues.apache.org/jira/browse/SPARK-48403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-48403.
---------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 46720
[https://github.com/apache/spark/pull/46720]

> Fix Upper & Lower expressions for UTF8_BINARY_LCASE
>
> Key: SPARK-48403
> URL: https://issues.apache.org/jira/browse/SPARK-48403
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Uroš Bojanić
> Assignee: Uroš Bojanić
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
[jira] [Resolved] (SPARK-48564) Propagate cached schema in set operations
[ https://issues.apache.org/jira/browse/SPARK-48564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-48564.
----------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 46915
[https://github.com/apache/spark/pull/46915]

> Propagate cached schema in set operations
>
> Key: SPARK-48564
> URL: https://issues.apache.org/jira/browse/SPARK-48564
> Project: Spark
> Issue Type: Sub-task
> Components: Connect, PySpark
> Affects Versions: 4.0.0
> Reporter: Ruifeng Zheng
> Assignee: Ruifeng Zheng
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
[jira] [Assigned] (SPARK-48564) Propagate cached schema in set operations
[ https://issues.apache.org/jira/browse/SPARK-48564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-48564:
------------------------------------
    Assignee: Ruifeng Zheng

> Propagate cached schema in set operations
>
> Key: SPARK-48564
> URL: https://issues.apache.org/jira/browse/SPARK-48564
> Project: Spark
> Issue Type: Sub-task
> Components: Connect, PySpark
> Affects Versions: 4.0.0
> Reporter: Ruifeng Zheng
> Assignee: Ruifeng Zheng
> Priority: Major
> Labels: pull-request-available
[jira] [Assigned] (SPARK-48342) [M0] Parser support
[ https://issues.apache.org/jira/browse/SPARK-48342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot reassigned SPARK-48342:
--------------------------------------
    Assignee: (was: Apache Spark)

> [M0] Parser support
>
> Key: SPARK-48342
> URL: https://issues.apache.org/jira/browse/SPARK-48342
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: David Milicevic
> Priority: Major
> Labels: pull-request-available
>
> Implement the parser for SQL scripting, with all supporting changes for the upcoming interpreter implementation and future extensions of the parser:
> * Parser - support only compound statements
> * Parser testing
>
> For more details, the design doc can be found in the parent Jira item.
[jira] [Assigned] (SPARK-48342) [M0] Parser support
[ https://issues.apache.org/jira/browse/SPARK-48342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot reassigned SPARK-48342:
--------------------------------------
    Assignee: Apache Spark

> [M0] Parser support
>
> Key: SPARK-48342
> URL: https://issues.apache.org/jira/browse/SPARK-48342
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: David Milicevic
> Assignee: Apache Spark
> Priority: Major
> Labels: pull-request-available
>
> Implement the parser for SQL scripting, with all supporting changes for the upcoming interpreter implementation and future extensions of the parser:
> * Parser - support only compound statements
> * Parser testing
>
> For more details, the design doc can be found in the parent Jira item.
[jira] [Updated] (SPARK-48579) [M1] Merge DatabricksSqlParser with SparkSqlParser
[ https://issues.apache.org/jira/browse/SPARK-48579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Milicevic updated SPARK-48579:
------------------------------------
    Description:

OSS has a single parser for SQL statements - SparkSqlParser. However, in the runtime we have an additional parser - DatabricksSqlParser. This parser is used for edge SQL rules, but there is no clear separation because folks keep adding edge rules/features to SparkSqlParser as well.

More details can be found in this design doc comment: [https://docs.google.com/document/d/1DIsMf2LQJvD4UC5JR1-YBH6Rv3UvGK3dD2QTOf_0SK4/edit?disco=AAABNCpDpdo]

It seems like there is no reason not to merge these two parsers into one, but it needs to be investigated before refactoring.

    (was: the same description, with the design doc comment given as a broken Jira link instead of a bare URL)

> [M1] Merge DatabricksSqlParser with SparkSqlParser
>
> Key: SPARK-48579
> URL: https://issues.apache.org/jira/browse/SPARK-48579
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: David Milicevic
> Priority: Major
[jira] [Updated] (SPARK-48579) [M1] Merge DatabricksSqlParser with SparkSqlParser
[ https://issues.apache.org/jira/browse/SPARK-48579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Milicevic updated SPARK-48579:
------------------------------------
    Description:

OSS has a single parser for SQL statements - SparkSqlParser. However, in the runtime we have an additional parser - DatabricksSqlParser. This parser is used for edge SQL rules, but there is no clear separation because folks keep adding edge rules/features to SparkSqlParser as well.

More details can be found in [this design doc comment|https://docs.google.com/document/d/1DIsMf2LQJvD4UC5JR1-YBH6Rv3UvGK3dD2QTOf_0SK4/edit?disco=AAABNCpDpdo]

It seems like there is no reason not to merge these two parsers into one, but it needs to be investigated before refactoring.

    (was: the same description, with a trailing period inside the link markup)

> [M1] Merge DatabricksSqlParser with SparkSqlParser
>
> Key: SPARK-48579
> URL: https://issues.apache.org/jira/browse/SPARK-48579
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: David Milicevic
> Priority: Major
[jira] [Updated] (SPARK-48579) [M1] Merge DatabricksSqlParser with SparkSqlParser
[ https://issues.apache.org/jira/browse/SPARK-48579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Milicevic updated SPARK-48579: Description: OSS has single parser for SQL statements - SparkSqlParser. However, in runtime we have additional parser - DatabricksSqlParser. This parser is used for edge SQL rules, but there is no clear separation because folks keep adding edge rules/features to the SparkSqlParser as well. More details can be found in [this design doc comment|[https://docs.google.com/document/d/1DIsMf2LQJvD4UC5JR1-YBH6Rv3UvGK3dD2QTOf_0SK4/edit?disco=AAABNCpDpdo]]. It seems like there is no reason not to merge these two parsers into one, but it needs to be investigated first before refactoring. was: OSS has single parser for SQL statements - SparkSqlParser. However, in runtime we have additional parser - DatabricksSqlParser. This parser is used for edge SQL rules, but there is no clear separation because folks keep adding edge rules/features to the SparkSqlParser as well. More details can be found in [this design doc comment|[https://docs.google.com/document/d/1DIsMf2LQJvD4UC5JR1-YBH6Rv3UvGK3dD2QTOf_0SK4/edit?disco=AAABNCpDpdo].] It seems like there is no reason not to merge these two parsers into one, but it needs to be investigated first before refactoring. > [M1] Merge DatabricksSqlParser with SparkSqlParser > -- > > Key: SPARK-48579 > URL: https://issues.apache.org/jira/browse/SPARK-48579 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: David Milicevic >Priority: Major > > OSS has single parser for SQL statements - SparkSqlParser. > However, in runtime we have additional parser - DatabricksSqlParser. This > parser is used for edge SQL rules, but there is no clear separation because > folks keep adding edge rules/features to the SparkSqlParser as well. 
> More details can be found in [this design doc comment|https://docs.google.com/document/d/1DIsMf2LQJvD4UC5JR1-YBH6Rv3UvGK3dD2QTOf_0SK4/edit?disco=AAABNCpDpdo]. > > It seems there is no reason not to merge these two parsers into one, but this needs to be investigated before refactoring. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-48579) [M1] Merge DatabricksSqlParser with SparkSqlParser
David Milicevic created SPARK-48579: --- Summary: [M1] Merge DatabricksSqlParser with SparkSqlParser Key: SPARK-48579 URL: https://issues.apache.org/jira/browse/SPARK-48579 Project: Spark Issue Type: Sub-task Components: Spark Core Affects Versions: 4.0.0 Reporter: David Milicevic OSS has a single parser for SQL statements - SparkSqlParser. However, at runtime we have an additional parser - DatabricksSqlParser. This parser is used for edge SQL rules, but there is no clear separation because folks keep adding edge rules/features to SparkSqlParser as well. More details can be found in [this design doc comment|https://docs.google.com/document/d/1DIsMf2LQJvD4UC5JR1-YBH6Rv3UvGK3dD2QTOf_0SK4/edit?disco=AAABNCpDpdo]. It seems there is no reason not to merge these two parsers into one, but this needs to be investigated before refactoring. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
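For readers unfamiliar with the setup described above, the relationship between the two parsers can be sketched as a delegation pattern. This is only an illustration: the real SparkSqlParser and DatabricksSqlParser are Scala classes, and every class, method, and keyword below is invented for the example.

```python
# Illustrative sketch only: class names, method signatures, and the
# "EDGE RULE" keyword are all hypothetical, not Spark's actual API.
class MainParser:
    """Stands in for SparkSqlParser: handles general SQL statements."""
    def parse(self, sql):
        return ("main", sql)

class EdgeParser:
    """Stands in for DatabricksSqlParser: handles a few edge rules and
    delegates everything else to the main parser."""
    def __init__(self, delegate):
        self.delegate = delegate
        self.edge_prefixes = ("EDGE RULE",)  # hypothetical edge-rule syntax

    def parse(self, sql):
        if sql.upper().startswith(self.edge_prefixes):
            return ("edge", sql)
        # Statements not recognized here fall through to the main parser.
        # Merging the two parsers would eliminate this delegation step.
        return self.delegate.parse(sql)
```

With a single merged parser, the fallback branch disappears and edge rules become ordinary grammar rules, which is the refactoring the ticket proposes to investigate.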
[jira] [Commented] (SPARK-48311) Nested pythonUDF in groupBy and aggregate result in Binding Exception
[ https://issues.apache.org/jira/browse/SPARK-48311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17853582#comment-17853582 ] Sumit Singh commented on SPARK-48311: - I have created the PR based on the details explained in the design doc.
> Nested pythonUDF in groupBy and aggregate result in Binding Exception
> --
>
> Key: SPARK-48311
> URL: https://issues.apache.org/jira/browse/SPARK-48311
> Project: Spark
> Issue Type: Bug
> Components: PySpark, SQL
> Affects Versions: 3.3.2
> Reporter: Sumit Singh
> Priority: Major
> Labels: pull-request-available
>
> Steps to Reproduce
> 1. Data creation
> {code:python}
> from datetime import datetime
> from pyspark.sql import SparkSession
> from pyspark.sql.types import StructType, StructField, LongType, TimestampType, StringType
>
> spark = SparkSession.builder.getOrCreate()
>
> # Define the schema
> schema = StructType([
>     StructField("col1", LongType(), nullable=True),
>     StructField("col2", TimestampType(), nullable=True),
>     StructField("col3", StringType(), nullable=True)
> ])
>
> # Define the data
> data = [
>     (1, datetime(2023, 5, 15, 12, 30), "Discount"),
>     (2, datetime(2023, 5, 16, 16, 45), "Promotion"),
>     (3, datetime(2023, 5, 17, 9, 15), "Coupon")
> ]
>
> # Create the DataFrame and register it as a temporary view
> df = spark.createDataFrame(data, schema)
> df.createOrReplaceTempView("temp_offers")
>
> # Query the temporary view using SQL.
> # DISTINCT is required to reproduce the issue.
> testDf = spark.sql("""
>     SELECT DISTINCT col1, col2, col3 FROM temp_offers
> """) {code}
> 2. UDF registration
> {code:python}
> import pyspark.sql.functions as F
> import pyspark.sql.types as T
>
> # Create UDF functions
> def udf1(d):
>     return d
>
> def udf2(d):
>     if d.isoweekday() in (1, 2, 3, 4):
>         return 'WEEKDAY'
>     else:
>         return 'WEEKEND'
>
> udf1_name = F.udf(udf1, T.TimestampType())
> udf2_name = F.udf(udf2, T.StringType()) {code}
> 3. Adding UDFs in grouping and aggregation
> {code:python}
> groupBy_cols = ['col1', 'col4', 'col5', 'col3']
> temp = testDf \
>     .select('*', udf1_name(F.col('col2')).alias('col4')) \
>     .select('*', udf2_name('col4').alias('col5'))
> result = temp.groupBy(*groupBy_cols).agg(F.countDistinct('col5').alias('col6')) {code}
> 4. Result
> {code:python}
> result.show(5, False) {code}
> *We get the error below:*
> {code:java}
> An error was encountered:
> An error occurred while calling o1079.showString.
> : java.lang.IllegalStateException: Couldn't find pythonUDF0#1108 in [col1#978L,groupingPythonUDF#1104,groupingPythonUDF#1105,col3#980,count(pythonUDF0#1108)#1080L]
>     at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1.applyOrElse(BoundAttribute.scala:80)
>     at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1.applyOrElse(BoundAttribute.scala:73)
> {code}
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
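For reference, the result the failing query should produce can be computed without Spark. The sketch below mirrors the reproduction in plain Python: `classify` reimplements `udf2`, `udf1` is the identity, and the grouping and distinct count are done by hand. The helper name `classify` is invented for this illustration.

```python
from datetime import datetime

def classify(d):
    # Mirrors udf2 from the reproduction: Mon-Thu -> WEEKDAY, Fri-Sun -> WEEKEND
    return 'WEEKDAY' if d.isoweekday() in (1, 2, 3, 4) else 'WEEKEND'

rows = [
    (1, datetime(2023, 5, 15, 12, 30), "Discount"),
    (2, datetime(2023, 5, 16, 16, 45), "Promotion"),
    (3, datetime(2023, 5, 17, 9, 15), "Coupon"),
]

# Apply udf1 (identity) and udf2, then group by (col1, col4, col5, col3)
# and count distinct col5 per group, as the failing query does.
groups = {}
for col1, col2, col3 in rows:
    col4 = col2            # udf1 is the identity function
    col5 = classify(col4)  # udf2
    groups.setdefault((col1, col4, col5, col3), set()).add(col5)

result = {key: len(distinct) for key, distinct in groups.items()}
# Each input row forms its own group, so result has 3 entries, each with count 1.
```

All three sample timestamps fall on Monday through Wednesday, so every row classifies as WEEKDAY; the exception is thrown during physical planning before Spark ever produces this output.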
[jira] [Updated] (SPARK-48311) Nested pythonUDF in groupBy and aggregate result in Binding Exception
[ https://issues.apache.org/jira/browse/SPARK-48311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-48311: --- Labels: pull-request-available (was: ) > Nested pythonUDF in groupBy and aggregate result in Binding Exception > -- > > Key: SPARK-48311 > URL: https://issues.apache.org/jira/browse/SPARK-48311 > Project: Spark > Issue Type: Bug > Components: PySpark, SQL >Affects Versions: 3.3.2 >Reporter: Sumit Singh >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-48576) Rename UTF8_BINARY_LCASE to UTF8_LCASE
[ https://issues.apache.org/jira/browse/SPARK-48576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-48576: --- Labels: pull-request-available (was: ) > Rename UTF8_BINARY_LCASE to UTF8_LCASE > -- > > Key: SPARK-48576 > URL: https://issues.apache.org/jira/browse/SPARK-48576 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-48280) Improve collation testing surface area using expression walking
[ https://issues.apache.org/jira/browse/SPARK-48280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mihailo Milosevic updated SPARK-48280: -- Summary: Improve collation testing surface area using expression walking (was: Add Expression Walker for Testing) > Improve collation testing surface area using expression walking > --- > > Key: SPARK-48280 > URL: https://issues.apache.org/jira/browse/SPARK-48280 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Mihailo Milosevic >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org