[jira] [Assigned] (SPARK-48559) Fetch globalTempDatabase name directly without invoking initialization of GlobalTempViewManager

2024-06-07 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48559:
---

Assignee: Wenchen Fan

> Fetch globalTempDatabase name directly without invoking initialization of 
> GlobalTempViewManager
> 
>
> Key: SPARK-48559
> URL: https://issues.apache.org/jira/browse/SPARK-48559
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48559) Fetch globalTempDatabase name directly without invoking initialization of GlobalTempViewManager

2024-06-07 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48559.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46907
[https://github.com/apache/spark/pull/46907]

> Fetch globalTempDatabase name directly without invoking initialization of 
> GlobalTempViewManager
> 
>
> Key: SPARK-48559
> URL: https://issues.apache.org/jira/browse/SPARK-48559
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48286) Analyze 'exists' default expression instead of 'current' default expression in structField to v2 column conversion

2024-06-06 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48286:
---

Assignee: Uros Stankovic

> Analyze 'exists' default expression instead of 'current' default expression 
> in structField to v2 column conversion
> --
>
> Key: SPARK-48286
> URL: https://issues.apache.org/jira/browse/SPARK-48286
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Uros Stankovic
>Assignee: Uros Stankovic
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> The org.apache.spark.sql.catalyst.util.ResolveDefaultColumns#analyze method
> accepts three parameters:
> 1) Field to analyze
> 2) Statement type - String
> 3) Metadata key - CURRENT_DEFAULT or EXISTS_DEFAULT
> The method
> org.apache.spark.sql.connector.catalog.CatalogV2Util#structFieldToV2Column
> passes fieldToAnalyze and EXISTS_DEFAULT as the second parameter, so
> EXISTS_DEFAULT is treated as the statement type rather than the metadata key,
> and the wrong default expression is analyzed.
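
The mix-up described above is a positional-argument error. The following is a minimal Python analogue of that pattern, not Spark code: the function body, the string values of the two keys, the "CREATE TABLE" statement type, and the assumption that the metadata key parameter defaults to CURRENT_DEFAULT are all illustrative.

```python
# Hypothetical stand-ins for the two metadata keys named in the issue.
CURRENT_DEFAULT = "CURRENT_DEFAULT"
EXISTS_DEFAULT = "EXISTS_DEFAULT"


def analyze(field, statement_type, metadata_key=CURRENT_DEFAULT):
    """Return the default expression stored in `field` under `metadata_key`.

    `statement_type` is only used for the error message in this sketch.
    """
    if metadata_key not in field:
        raise KeyError(f"{statement_type}: no default stored under {metadata_key}")
    return field[metadata_key]


# A field whose current and exists defaults differ.
field = {CURRENT_DEFAULT: "42", EXISTS_DEFAULT: "41"}

# Buggy call: EXISTS_DEFAULT lands in the statement_type slot, so the
# metadata key silently falls back to CURRENT_DEFAULT.
buggy = analyze(field, EXISTS_DEFAULT)

# Intended call: pass a statement type, then name the metadata key.
fixed = analyze(field, "CREATE TABLE", metadata_key=EXISTS_DEFAULT)

print(buggy, fixed)
```

Because both arguments are plain strings, nothing in the type system flags the swapped call. Passing the metadata key by name is one way to make the intent explicit.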






[jira] [Resolved] (SPARK-48286) Analyze 'exists' default expression instead of 'current' default expression in structField to v2 column conversion

2024-06-06 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48286.
-
Fix Version/s: 3.5.2
   Resolution: Fixed

Issue resolved by pull request 46594
[https://github.com/apache/spark/pull/46594]

> Analyze 'exists' default expression instead of 'current' default expression 
> in structField to v2 column conversion
> --
>
> Key: SPARK-48286
> URL: https://issues.apache.org/jira/browse/SPARK-48286
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Uros Stankovic
>Assignee: Uros Stankovic
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.5.2, 4.0.0
>
>
> The org.apache.spark.sql.catalyst.util.ResolveDefaultColumns#analyze method
> accepts three parameters:
> 1) Field to analyze
> 2) Statement type - String
> 3) Metadata key - CURRENT_DEFAULT or EXISTS_DEFAULT
> The method
> org.apache.spark.sql.connector.catalog.CatalogV2Util#structFieldToV2Column
> passes fieldToAnalyze and EXISTS_DEFAULT as the second parameter, so
> EXISTS_DEFAULT is treated as the statement type rather than the metadata key,
> and the wrong default expression is analyzed.






[jira] [Resolved] (SPARK-48283) Implement modified Lowercase operation for UTF8_BINARY_LCASE

2024-06-06 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48283.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46700
[https://github.com/apache/spark/pull/46700]

> Implement modified Lowercase operation for UTF8_BINARY_LCASE
> 
>
> Key: SPARK-48283
> URL: https://issues.apache.org/jira/browse/SPARK-48283
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Resolved] (SPARK-48435) UNICODE collation should not support binary equality

2024-06-06 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48435.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46772
[https://github.com/apache/spark/pull/46772]

> UNICODE collation should not support binary equality
> 
>
> Key: SPARK-48435
> URL: https://issues.apache.org/jira/browse/SPARK-48435
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48435) UNICODE collation should not support binary equality

2024-06-06 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48435:
---

Assignee: Uroš Bojanić

> UNICODE collation should not support binary equality
> 
>
> Key: SPARK-48435
> URL: https://issues.apache.org/jira/browse/SPARK-48435
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Assigned] (SPARK-48526) Allow passing custom sink to StreamTest::testStream

2024-06-06 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48526:
---

Assignee: Johan Lasperas

> Allow passing custom sink to StreamTest::testStream
> ---
>
> Key: SPARK-48526
> URL: https://issues.apache.org/jira/browse/SPARK-48526
> Project: Spark
>  Issue Type: Test
>  Components: Structured Streaming
>Affects Versions: 4.0.0
>Reporter: Johan Lasperas
>Assignee: Johan Lasperas
>Priority: Trivial
>  Labels: pull-request-available
>
> The testing helpers for streaming don't allow providing a custom sink. This
> is limiting in (at least) two ways:
>  * A sink can't be reused across multiple calls to `testStream`, e.g. when
> canceling and resuming streaming.
>  * A custom sink implementation other than `MemorySink` can't be provided.
> One use case is testing the Delta streaming sink by wrapping it in a
> MemorySink interface and passing it to the test framework.






[jira] [Resolved] (SPARK-48526) Allow passing custom sink to StreamTest::testStream

2024-06-06 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48526.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46866
[https://github.com/apache/spark/pull/46866]

> Allow passing custom sink to StreamTest::testStream
> ---
>
> Key: SPARK-48526
> URL: https://issues.apache.org/jira/browse/SPARK-48526
> Project: Spark
>  Issue Type: Test
>  Components: Structured Streaming
>Affects Versions: 4.0.0
>Reporter: Johan Lasperas
>Assignee: Johan Lasperas
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> The testing helpers for streaming don't allow providing a custom sink. This
> is limiting in (at least) two ways:
>  * A sink can't be reused across multiple calls to `testStream`, e.g. when
> canceling and resuming streaming.
>  * A custom sink implementation other than `MemorySink` can't be provided.
> One use case is testing the Delta streaming sink by wrapping it in a
> MemorySink interface and passing it to the test framework.






[jira] [Assigned] (SPARK-48546) Fix ExpressionEncoder after replacing NullPointerExceptions with proper error classes in AssertNotNull expression

2024-06-06 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48546:
---

Assignee: Daniel

> Fix ExpressionEncoder after replacing NullPointerExceptions with proper error 
> classes in AssertNotNull expression
> -
>
> Key: SPARK-48546
> URL: https://issues.apache.org/jira/browse/SPARK-48546
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Daniel
>Assignee: Daniel
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-48546) Fix ExpressionEncoder after replacing NullPointerExceptions with proper error classes in AssertNotNull expression

2024-06-06 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48546.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46888
[https://github.com/apache/spark/pull/46888]

> Fix ExpressionEncoder after replacing NullPointerExceptions with proper error 
> classes in AssertNotNull expression
> -
>
> Key: SPARK-48546
> URL: https://issues.apache.org/jira/browse/SPARK-48546
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Daniel
>Assignee: Daniel
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Created] (SPARK-48552) multi-line CSV schema inference should also throw FAILED_READ_FILE

2024-06-06 Thread Wenchen Fan (Jira)
Wenchen Fan created SPARK-48552:
---

 Summary: multi-line CSV schema inference should also throw 
FAILED_READ_FILE
 Key: SPARK-48552
 URL: https://issues.apache.org/jira/browse/SPARK-48552
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 4.0.0
Reporter: Wenchen Fan









[jira] [Resolved] (SPARK-48307) InlineCTE should keep not-inlined relations in the original WithCTE node

2024-06-04 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48307.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46617
[https://github.com/apache/spark/pull/46617]

> InlineCTE should keep not-inlined relations in the original WithCTE node
> 
>
> Key: SPARK-48307
> URL: https://issues.apache.org/jira/browse/SPARK-48307
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48318) Hash join support for strings with collation (complex types)

2024-06-04 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48318:
---

Assignee: Uroš Bojanić

> Hash join support for strings with collation (complex types)
> 
>
> Key: SPARK-48318
> URL: https://issues.apache.org/jira/browse/SPARK-48318
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-48318) Hash join support for strings with collation (complex types)

2024-06-04 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48318.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46722
[https://github.com/apache/spark/pull/46722]

> Hash join support for strings with collation (complex types)
> 
>
> Key: SPARK-48318
> URL: https://issues.apache.org/jira/browse/SPARK-48318
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-47972) Restrict CAST expression for collations

2024-06-03 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-47972:
---

Assignee: Mihailo Milosevic

> Restrict CAST expression for collations
> ---
>
> Key: SPARK-47972
> URL: https://issues.apache.org/jira/browse/SPARK-47972
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Assignee: Mihailo Milosevic
>Priority: Major
>  Labels: pull-request-available
>
> The current code allows calls like CAST(1 AS STRING COLLATE
> UNICODE). We want to restrict the CAST expression so that it can only cast to
> the default-collation string type, and allow only the COLLATE expression to
> produce explicitly collated strings.






[jira] [Resolved] (SPARK-47972) Restrict CAST expression for collations

2024-06-03 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-47972.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46474
[https://github.com/apache/spark/pull/46474]

> Restrict CAST expression for collations
> ---
>
> Key: SPARK-47972
> URL: https://issues.apache.org/jira/browse/SPARK-47972
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Assignee: Mihailo Milosevic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> The current code allows calls like CAST(1 AS STRING COLLATE
> UNICODE). We want to restrict the CAST expression so that it can only cast to
> the default-collation string type, and allow only the COLLATE expression to
> produce explicitly collated strings.






[jira] [Resolved] (SPARK-48413) ALTER COLUMN with collation

2024-06-03 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48413.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46734
[https://github.com/apache/spark/pull/46734]

> ALTER COLUMN with collation
> ---
>
> Key: SPARK-48413
> URL: https://issues.apache.org/jira/browse/SPARK-48413
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Nikola Mandic
>Assignee: Nikola Mandic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Add support for changing the collation of a column with the ALTER COLUMN command.






[jira] [Assigned] (SPARK-48503) Scalar subquery with group-by and non-equality predicate incorrectly allowed, wrong results

2024-06-03 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48503:
---

Assignee: Jack Chen

> Scalar subquery with group-by and non-equality predicate incorrectly allowed, 
> wrong results
> ---
>
> Key: SPARK-48503
> URL: https://issues.apache.org/jira/browse/SPARK-48503
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Jack Chen
>Assignee: Jack Chen
>Priority: Major
>  Labels: pull-request-available
>
> This query is not legal and should give an error, but instead we incorrectly 
> allow it and it returns wrong results.
> {code:java}
> create table x(x1 int, x2 int);
> insert into x values (1, 1);
> create table y(y1 int, y2 int);
> insert into y values (2, 2), (3, 3);
> select *, (select count(*) from y where y1 > x1 group by y1) from x; {code}
> It returns two rows, even though there's only one row of x.
> The correct result is an error: more than one row returned by a subquery used 
> as an expression (as seen in postgres for example)
>  
> This is a longstanding bug. The bug is in CheckAnalysis, in
> checkAggregateInScalarSubquery. It allows grouping columns that are
> present in correlation predicates, but doesn’t check whether those predicates
> are equalities, because when that code was written, non-equality
> correlation wasn’t allowed. Therefore, it looks like this bug has existed
> since non-equality correlation was added (~2 years ago).
>  
> Various other expressions that are not equi-joins between the inner and outer 
> fields hit this too, e.g. `where y1 + y2 = x1 group by y1`.
> Another bugged case is if the correlation condition is an equality but it's 
> under another operator like an OUTER JOIN or UNION.
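
To see why the reproduction yields two rows, the correlated subquery can be evaluated by hand. The sketch below is a plain Python simulation of the query semantics over the rows from the report, not Spark's analyzer or executor. The helper names `scalar_subquery` and `strict_scalar` are hypothetical, and the error text is borrowed from the expected message quoted above.

```python
from collections import Counter

# Rows from the reproduction: x(x1, x2) and y(y1, y2).
x_rows = [(1, 1)]
y_rows = [(2, 2), (3, 3)]


def scalar_subquery(x1):
    """Evaluate: select count(*) from y where y1 > x1 group by y1."""
    groups = Counter(y1 for (y1, _y2) in y_rows if y1 > x1)
    # One output row per group, ordered by the grouping key.
    return [count for _key, count in sorted(groups.items())]


def strict_scalar(rows):
    """Enforce scalar semantics: at most one row, otherwise an error."""
    if len(rows) > 1:
        raise ValueError(
            "more than one row returned by a subquery used as an expression"
        )
    return rows[0] if rows else None


# Subquery output for each outer row, keyed by x1.
results = {x1: scalar_subquery(x1) for (x1, _x2) in x_rows}
print(results)
```

With `y1 > x1`, both y1 = 2 and y1 = 3 qualify for the single outer row, so grouping by y1 produces two groups. With an equality predicate `y1 = x1`, the grouping column is pinned to one value per outer row and at most one group can exist, which is the property the check relies on.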






[jira] [Resolved] (SPARK-48503) Scalar subquery with group-by and non-equality predicate incorrectly allowed, wrong results

2024-06-03 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48503.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46839
[https://github.com/apache/spark/pull/46839]

> Scalar subquery with group-by and non-equality predicate incorrectly allowed, 
> wrong results
> ---
>
> Key: SPARK-48503
> URL: https://issues.apache.org/jira/browse/SPARK-48503
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Jack Chen
>Assignee: Jack Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> This query is not legal and should give an error, but instead we incorrectly 
> allow it and it returns wrong results.
> {code:java}
> create table x(x1 int, x2 int);
> insert into x values (1, 1);
> create table y(y1 int, y2 int);
> insert into y values (2, 2), (3, 3);
> select *, (select count(*) from y where y1 > x1 group by y1) from x; {code}
> It returns two rows, even though there's only one row of x.
> The correct result is an error: more than one row returned by a subquery used 
> as an expression (as seen in postgres for example)
>  
> This is a longstanding bug. The bug is in CheckAnalysis, in
> checkAggregateInScalarSubquery. It allows grouping columns that are
> present in correlation predicates, but doesn’t check whether those predicates
> are equalities, because when that code was written, non-equality
> correlation wasn’t allowed. Therefore, it looks like this bug has existed
> since non-equality correlation was added (~2 years ago).
>  
> Various other expressions that are not equi-joins between the inner and outer 
> fields hit this too, e.g. `where y1 + y2 = x1 group by y1`.
> Another bugged case is if the correlation condition is an equality but it's 
> under another operator like an OUTER JOIN or UNION.






[jira] [Assigned] (SPARK-48391) use addAll instead of add function in TaskMetrics to accelerate

2024-05-31 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48391:
---

Assignee: jiahong.li

> use addAll instead of add function in TaskMetrics to accelerate
> -
>
> Key: SPARK-48391
> URL: https://issues.apache.org/jira/browse/SPARK-48391
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.5.0, 3.5.1
>Reporter: jiahong.li
>Assignee: jiahong.li
>Priority: Major
>  Labels: pull-request-available
>
> In the fromAccumulators method of TaskMetrics, we should use
> `tm._externalAccums.addAll` instead of `tm._externalAccums.add`, as
> _externalAccums is an instance of CopyOnWriteArrayList, which copies its
> backing array on every write.
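
The cost difference can be illustrated with a toy model. The sketch below is a Python simulation of copy-on-write behaviour, not the Java `CopyOnWriteArrayList` itself. The class, its `slots_written` counter, and the choice of 1000 elements are all hypothetical and serve only to count how many array slots each approach writes.

```python
class CopyOnWriteList:
    """Toy copy-on-write list that counts array slots written on each write."""

    def __init__(self):
        self._items = ()         # immutable snapshot of the contents
        self.slots_written = 0   # total slots filled across all new snapshots

    def add(self, item):
        # Each add builds a fresh snapshot holding every old element plus one.
        self.slots_written += len(self._items) + 1
        self._items = self._items + (item,)

    def add_all(self, items):
        # A bulk add builds a single fresh snapshot for the whole batch.
        items = tuple(items)
        self.slots_written += len(self._items) + len(items)
        self._items = self._items + items

    def __len__(self):
        return len(self._items)


accums = list(range(1000))

one_by_one = CopyOnWriteList()
for accum in accums:
    one_by_one.add(accum)

bulk = CopyOnWriteList()
bulk.add_all(accums)

print(one_by_one.slots_written, bulk.slots_written)
```

In this model, adding 1000 elements one at a time writes 500500 slots, while a single bulk add writes 1000. The gap grows quadratically with the number of elements.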






[jira] [Resolved] (SPARK-48391) use addAll instead of add function in TaskMetrics to accelerate

2024-05-31 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48391.
-
Fix Version/s: 3.5.2
   4.0.0
   Resolution: Fixed

Issue resolved by pull request 46705
[https://github.com/apache/spark/pull/46705]

> use addAll instead of add function in TaskMetrics to accelerate
> -
>
> Key: SPARK-48391
> URL: https://issues.apache.org/jira/browse/SPARK-48391
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.5.0, 3.5.1
>Reporter: jiahong.li
>Assignee: jiahong.li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.2, 4.0.0
>
>
> In the fromAccumulators method of TaskMetrics, we should use
> `tm._externalAccums.addAll` instead of `tm._externalAccums.add`, as
> _externalAccums is an instance of CopyOnWriteArrayList, which copies its
> backing array on every write.






[jira] [Resolved] (SPARK-48465) Avoid no-op empty relation propagation in AQE

2024-05-31 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48465.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46814
[https://github.com/apache/spark/pull/46814]

> Avoid no-op empty relation propagation in AQE
> -
>
> Key: SPARK-48465
> URL: https://issues.apache.org/jira/browse/SPARK-48465
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Ziqi Liu
>Assignee: Ziqi Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> We should avoid no-op empty relation propagation in AQE: if we convert an
> empty QueryStageExec to an empty relation, it is wrapped into a new
> query stage and executed, which produces an empty result and triggers empty
> relation propagation again. This issue is currently not exposed because AQE
> tries to reuse the shuffle.






[jira] [Resolved] (SPARK-48430) Fix map value extraction when map contains collated strings

2024-05-31 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48430.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46758
[https://github.com/apache/spark/pull/46758]

> Fix map value extraction when map contains collated strings
> ---
>
> Key: SPARK-48430
> URL: https://issues.apache.org/jira/browse/SPARK-48430
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Nikola Mandic
>Assignee: Nikola Mandic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> The following queries return unexpected results:
> {code:java}
> select collation(map('a', 'b' collate utf8_binary_lcase)['a']);
> select collation(element_at(map('a', 'b' collate utf8_binary_lcase), 
> 'a'));{code}
> Both return UTF8_BINARY instead of UTF8_BINARY_LCASE.






[jira] [Resolved] (SPARK-48476) NPE thrown when delimiter set to null in CSV

2024-05-31 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48476.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46810
[https://github.com/apache/spark/pull/46810]

> NPE thrown when delimiter set to null in CSV
> 
>
> Key: SPARK-48476
> URL: https://issues.apache.org/jira/browse/SPARK-48476
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Milan Stefanovic
>Assignee: Milan Stefanovic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> When customers set the delimiter to null, we currently throw an NPE. We should
> throw a customer-facing error instead.
> Repro:
> spark.read.format("csv")
> .option("delimiter", null)
> .load()
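
The requested behaviour amounts to validating the option before it is used. The following is a minimal Python sketch of that idea, not Spark's actual fix. The error class `InvalidOptionError`, the helper `resolve_delimiter`, the message text, and the comma fallback are all hypothetical.

```python
class InvalidOptionError(ValueError):
    """Hypothetical user-facing error for a bad reader option."""


def resolve_delimiter(options):
    """Return the CSV delimiter, rejecting null with a readable message."""
    # Fall back to a comma only when the option is not set at all.
    delimiter = options.get("delimiter", ",")
    if delimiter is None:
        # Fail early with an actionable message instead of a null dereference.
        raise InvalidOptionError("CSV option 'delimiter' must not be null")
    return delimiter


print(resolve_delimiter({"delimiter": ";"}))
```

Checking at option-resolution time keeps the failure close to the user's mistake, so the message can name the offending option directly.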






[jira] [Assigned] (SPARK-48476) NPE thrown when delimiter set to null in CSV

2024-05-31 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48476:
---

Assignee: Milan Stefanovic

> NPE thrown when delimiter set to null in CSV
> 
>
> Key: SPARK-48476
> URL: https://issues.apache.org/jira/browse/SPARK-48476
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Milan Stefanovic
>Assignee: Milan Stefanovic
>Priority: Major
>  Labels: pull-request-available
>
> When customers set the delimiter to null, we currently throw an NPE. We should
> throw a customer-facing error instead.
> Repro:
> spark.read.format("csv")
> .option("delimiter", null)
> .load()






[jira] [Assigned] (SPARK-48419) Foldable propagation replace foldable column should use origin column

2024-05-30 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48419:
---

Assignee: KnightChess

> Foldable propagation replace foldable column should use origin column
> -
>
> Key: SPARK-48419
> URL: https://issues.apache.org/jira/browse/SPARK-48419
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.3, 3.1.3, 3.2.4, 4.0.0, 3.5.1, 3.3.4
>Reporter: KnightChess
>Assignee: KnightChess
>Priority: Major
>  Labels: pull-request-available
>
> The column name is changed by `FoldablePropagation` in the optimizer.
> Before the optimizer:
> ```shell
> 'Project ['x, 'y, 'z]
> +- 'Project ['a AS x#112, str AS Y#113, 'b AS z#114]
>    +- LocalRelation , [a#0, b#1]
> ```
> After the optimizer:
> ```shell
> Project [x#112, str AS Y#113, z#114]
> +- Project [a#0 AS x#112, str AS Y#113, b#1 AS z#114]
>    +- LocalRelation , [a#0, b#1]
> ```
> The column name `y` is replaced with `Y`.






[jira] [Resolved] (SPARK-48419) Foldable propagation replace foldable column should use origin column

2024-05-30 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48419.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46742
[https://github.com/apache/spark/pull/46742]

> Foldable propagation replace foldable column should use origin column
> -
>
> Key: SPARK-48419
> URL: https://issues.apache.org/jira/browse/SPARK-48419
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.3, 3.1.3, 3.2.4, 4.0.0, 3.5.1, 3.3.4
>Reporter: KnightChess
>Assignee: KnightChess
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> The column name will be changed by `FoldablePropagation` in the optimizer.
> Before the optimizer:
> ```shell
> 'Project ['x, 'y, 'z]
> +- 'Project ['a AS x#112, str AS Y#113, 'b AS z#114]
>    +- LocalRelation , [a#0, b#1]
> ```
> after optimizer:
> ```shell
> Project [x#112, str AS Y#113, z#114]
> +- Project [a#0 AS x#112, str AS Y#113, b#1 AS z#114]
>    +- LocalRelation , [a#0, b#1]
> ```
> The column name `y` will be replaced with `Y`.






[jira] [Resolved] (SPARK-48468) Add LogicalQueryStage interface in catalyst

2024-05-30 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48468.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

> Add LogicalQueryStage interface in catalyst
> ---
>
> Key: SPARK-48468
> URL: https://issues.apache.org/jira/browse/SPARK-48468
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Ziqi Liu
>Assignee: Ziqi Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Add `LogicalQueryStage` interface in catalyst so that it's visible in logical 
> rules






[jira] [Assigned] (SPARK-48468) Add LogicalQueryStage interface in catalyst

2024-05-30 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48468:
---

Assignee: Ziqi Liu

> Add LogicalQueryStage interface in catalyst
> ---
>
> Key: SPARK-48468
> URL: https://issues.apache.org/jira/browse/SPARK-48468
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Ziqi Liu
>Assignee: Ziqi Liu
>Priority: Major
>  Labels: pull-request-available
>
> Add `LogicalQueryStage` interface in catalyst so that it's visible in logical 
> rules






[jira] [Resolved] (SPARK-48477) Refactor CollationSuite, CoalesceShufflePartitionsSuite, SQLExecutionSuite

2024-05-30 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48477.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

> Refactor CollationSuite, CoalesceShufflePartitionsSuite, SQLExecutionSuite
> --
>
> Key: SPARK-48477
> URL: https://issues.apache.org/jira/browse/SPARK-48477
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Resolved] (SPARK-48292) Revert [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage when committed file not consistent with task status

2024-05-30 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48292.
-
Fix Version/s: 4.0.0
 Assignee: angerszhu  (was: L. C. Hsieh)
   Resolution: Fixed

> Revert [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage 
> when committed file not consistent with task status
> --
>
> Key: SPARK-48292
> URL: https://issues.apache.org/jira/browse/SPARK-48292
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: L. C. Hsieh
>Assignee: angerszhu
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> When a task attempt fails but is authorized to commit, 
> OutputCommitCoordinator fails the stage with a reason message saying that 
> the task commit succeeded, although the driver never actually knows whether a 
> task commit succeeded. We should update the reason message to make it less 
> confusing.
> See https://github.com/apache/spark/pull/36564#discussion_r1598660630






[jira] [Assigned] (SPARK-48292) Revert [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage when committed file not consistent with task status

2024-05-30 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48292:
---

Assignee: L. C. Hsieh

> Revert [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage 
> when committed file not consistent with task status
> --
>
> Key: SPARK-48292
> URL: https://issues.apache.org/jira/browse/SPARK-48292
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Minor
>  Labels: pull-request-available
>
> When a task attempt fails but is authorized to commit, 
> OutputCommitCoordinator fails the stage with a reason message saying that 
> the task commit succeeded, although the driver never actually knows whether a 
> task commit succeeded. We should update the reason message to make it less 
> confusing.
> See https://github.com/apache/spark/pull/36564#discussion_r1598660630






[jira] [Resolved] (SPARK-48431) Do not forward predicates on collated columns to file readers

2024-05-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48431.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46760
[https://github.com/apache/spark/pull/46760]

> Do not forward predicates on collated columns to file readers
> -
>
> Key: SPARK-48431
> URL: https://issues.apache.org/jira/browse/SPARK-48431
> Project: Spark
>  Issue Type: Task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Jan-Ole Sasse
>Assignee: Jan-Ole Sasse
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> SPARK-47657 allows pushing filters on collated columns to file sources that 
> support them. When such filters are pushed to a file source, the file source 
> must not forward them to the actual file readers (e.g. Parquet or CSV 
> readers), because there is no guarantee that those readers support collations.
> With this task, filters on collated columns are widened to AlwaysTrue when 
> we translate filters for file sources.
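
The widening described above can be illustrated with a small sketch. It is written in plain Python with hypothetical filter classes (AlwaysTrue, EqualTo, And, Or, Not) that only mimic the shape of a filter tree; it is not the code from the pull request. The one rule it relies on is that a widened filter may accept more rows than the original but never fewer, so any sub-filter that touches a collated column is replaced by AlwaysTrue, and a Not is widened as a whole rather than having its child widened.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlwaysTrue:
    pass


@dataclass(frozen=True)
class EqualTo:
    column: str
    value: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


@dataclass(frozen=True)
class Not:
    child: object


def references(f):
    """Return the set of column names a filter refers to."""
    if isinstance(f, EqualTo):
        return {f.column}
    if isinstance(f, (And, Or)):
        return references(f.left) | references(f.right)
    if isinstance(f, Not):
        return references(f.child)
    return set()


def widen(f, collated):
    """Replace sub-filters on collated columns with AlwaysTrue."""
    if isinstance(f, (And, Or)):
        # Widening each side of And/Or can only let more rows through.
        return type(f)(widen(f.left, collated), widen(f.right, collated))
    # Leaves and Not: widen the whole node if it touches a collated column.
    # (Widening only the child of a Not would narrow the result instead.)
    if references(f) & collated:
        return AlwaysTrue()
    return f
```

For example, with `name` as the only collated column, `And(EqualTo("name", "a"), EqualTo("id", 1))` becomes `And(AlwaysTrue(), EqualTo("id", 1))`, so the reader can still use the predicate on `id`. The collated predicate is no longer applied by the reader and would have to be evaluated elsewhere.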






[jira] [Resolved] (SPARK-48462) Refactor HiveQuerySuite.scala and HiveTableScanSuite

2024-05-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48462.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46792
[https://github.com/apache/spark/pull/46792]

> Refactor HiveQuerySuite.scala and HiveTableScanSuite
> 
>
> Key: SPARK-48462
> URL: https://issues.apache.org/jira/browse/SPARK-48462
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48281) Alter string search logic for: instr, substring_index (UTF8_BINARY_LCASE)

2024-05-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48281:
---

Assignee: Uroš Bojanić

> Alter string search logic for: instr, substring_index (UTF8_BINARY_LCASE)
> -
>
> Key: SPARK-48281
> URL: https://issues.apache.org/jira/browse/SPARK-48281
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-48281) Alter string search logic for: instr, substring_index (UTF8_BINARY_LCASE)

2024-05-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48281.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46589
[https://github.com/apache/spark/pull/46589]

> Alter string search logic for: instr, substring_index (UTF8_BINARY_LCASE)
> -
>
> Key: SPARK-48281
> URL: https://issues.apache.org/jira/browse/SPARK-48281
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Resolved] (SPARK-48444) Refactor SQLQuerySuite

2024-05-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48444.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46778
[https://github.com/apache/spark/pull/46778]

> Refactor SQLQuerySuite
> --
>
> Key: SPARK-48444
> URL: https://issues.apache.org/jira/browse/SPARK-48444
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Resolved] (SPARK-48000) Hash join support for strings with collation (StringType only)

2024-05-28 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48000.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46599
[https://github.com/apache/spark/pull/46599]

> Hash join support for strings with collation (StringType only)
> --
>
> Key: SPARK-48000
> URL: https://issues.apache.org/jira/browse/SPARK-48000
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48000) Hash join support for strings with collation (StringType only)

2024-05-28 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48000:
---

Assignee: Uroš Bojanić

> Hash join support for strings with collation (StringType only)
> --
>
> Key: SPARK-48000
> URL: https://issues.apache.org/jira/browse/SPARK-48000
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Assigned] (SPARK-48221) Alter string search logic for: startsWith, endsWith, contains, locate (UTF8_BINARY_LCASE)

2024-05-28 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48221:
---

Assignee: Uroš Bojanić

> Alter string search logic for: startsWith, endsWith, contains, locate 
> (UTF8_BINARY_LCASE)
> -
>
> Key: SPARK-48221
> URL: https://issues.apache.org/jira/browse/SPARK-48221
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-48221) Alter string search logic for: startsWith, endsWith, contains, locate (UTF8_BINARY_LCASE)

2024-05-28 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48221.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46511
[https://github.com/apache/spark/pull/46511]

> Alter string search logic for: startsWith, endsWith, contains, locate 
> (UTF8_BINARY_LCASE)
> -
>
> Key: SPARK-48221
> URL: https://issues.apache.org/jira/browse/SPARK-48221
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48273) Late rewrite of PlanWithUnresolvedIdentifier

2024-05-28 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48273:
---

Assignee: Nikola Mandic

> Late rewrite of PlanWithUnresolvedIdentifier
> 
>
> Key: SPARK-48273
> URL: https://issues.apache.org/jira/browse/SPARK-48273
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Nikola Mandic
>Assignee: Nikola Mandic
>Priority: Major
>  Labels: pull-request-available
>
> PlanWithUnresolvedIdentifier is rewritten late in analysis, which causes 
> rules such as SubstituteUnresolvedOrdinals to miss the new plan. As a result, 
> the following queries fail:
> {code:java}
> create temporary view identifier('v1') as (select my_col from (values (1), 
> (2), (1) as (my_col)) group by 1);
> --
> cache table identifier('t1') as (select my_col from (values (1), (2), (1) as 
> (my_col)) group by 1); 
> --
> create table identifier('t2') as (select my_col from (values (1), (2), (1) 
> as (my_col)) group by 1);
> insert into identifier('t2') select my_col from (values (3) as (my_col)) 
> group by 1; {code}






[jira] [Resolved] (SPARK-48159) Datetime expressions (all collations)

2024-05-28 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48159.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46618
[https://github.com/apache/spark/pull/46618]

> Datetime expressions (all collations)
> -
>
> Key: SPARK-48159
> URL: https://issues.apache.org/jira/browse/SPARK-48159
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Nebojsa Savic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Resolved] (SPARK-48273) Late rewrite of PlanWithUnresolvedIdentifier

2024-05-28 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48273.
-
Fix Version/s: 3.5.2
   4.0.0
   Resolution: Fixed

Issue resolved by pull request 46580
[https://github.com/apache/spark/pull/46580]

> Late rewrite of PlanWithUnresolvedIdentifier
> 
>
> Key: SPARK-48273
> URL: https://issues.apache.org/jira/browse/SPARK-48273
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Nikola Mandic
>Assignee: Nikola Mandic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.2, 4.0.0
>
>
> PlanWithUnresolvedIdentifier is rewritten late in analysis, which causes 
> rules such as SubstituteUnresolvedOrdinals to miss the new plan. As a result, 
> the following queries fail:
> {code:java}
> create temporary view identifier('v1') as (select my_col from (values (1), 
> (2), (1) as (my_col)) group by 1);
> --
> cache table identifier('t1') as (select my_col from (values (1), (2), (1) as 
> (my_col)) group by 1); 
> --
> create table identifier('t2') as (select my_col from (values (1), (2), (1) 
> as (my_col)) group by 1);
> insert into identifier('t2') select my_col from (values (3) as (my_col)) 
> group by 1; {code}






[jira] [Resolved] (SPARK-46841) Language support for collations

2024-05-28 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-46841.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46180
[https://github.com/apache/spark/pull/46180]

> Language support for collations
> ---
>
> Key: SPARK-46841
> URL: https://issues.apache.org/jira/browse/SPARK-46841
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Aleksandar Tomic
>Assignee: Nikola Mandic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Languages and localization for collations are supported by the ICU library. 
> The collation naming format is as follows:
> {code:java}
> <2-letter language code>[_<4-letter script>][_<3-letter country 
> code>][_specifier_specifier...]{code}
> The locale specifier consists of the first part of the collation name (language + 
> script + country). Locale specifiers need to be stable across ICU versions; 
> to keep existing ids and names invariant we introduce a golden file with the 
> locale table, which should cause a CI failure on any silent change.
> Currently supported optional specifiers:
>  * CS/CI - case sensitivity, default is case-sensitive; supported by 
> configuring ICU collation levels
>  * AS/AI - accent sensitivity; default is accent-sensitive; supported by 
> configuring ICU collation levels
>  * /LCASE/UCASE - case conversion performed prior to 
> comparisons; supported by internal implementation relying on ICU locale-aware 
> conversions
> Users can give collation specifiers in any order, except for the locale, which is 
> mandatory and must come first. There is a one-to-one mapping between collation 
> ids and collation names, defined in CollationFactory.
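
The naming format above can be made concrete with a small parser. This is a sketch in plain Python, not the logic in CollationFactory: the function name parse_collation_name and the returned dictionary layout are hypothetical, only the CS/CI, AS/AI and LCASE/UCASE specifiers listed in the ticket are recognized, and the CS and AS defaults follow the description. The parts can be told apart by length alone, since scripts have 4 letters, countries 3, and no listed specifier has either length.

```python
CASE = {"CS", "CI"}
ACCENT = {"AS", "AI"}
CONVERSION = {"LCASE", "UCASE"}


def parse_collation_name(name):
    """Split a collation name into locale parts and optional specifiers."""
    parts = name.split("_")
    language = parts[0]
    if len(language) != 2 or not language.isalpha():
        raise ValueError(f"expected a 2-letter language code, got {language!r}")

    result = {
        "language": language,
        "script": None,
        "country": None,
        "case": "CS",        # default: case-sensitive
        "accent": "AS",      # default: accent-sensitive
        "conversion": None,  # no case conversion unless LCASE/UCASE is given
    }

    i = 1
    # Optional 4-letter script, then optional 3-letter country.
    if i < len(parts) and len(parts[i]) == 4 and parts[i].isalpha():
        result["script"] = parts[i]
        i += 1
    if i < len(parts) and len(parts[i]) == 3 and parts[i].isalpha():
        result["country"] = parts[i]
        i += 1

    # The remaining parts are specifiers, accepted in any order.
    groups = (("case", CASE), ("accent", ACCENT), ("conversion", CONVERSION))
    seen = set()
    for spec in parts[i:]:
        for key, allowed in groups:
            if spec in allowed:
                if key in seen:
                    raise ValueError(f"conflicting {key} specifier {spec!r}")
                seen.add(key)
                result[key] = spec
                break
        else:
            raise ValueError(f"unknown specifier {spec!r}")
    return result
```

Under these assumptions, `parse_collation_name("de_AI_CI")` and `parse_collation_name("de_CI_AI")` return the same result, which matches the rule that specifiers may appear in any order after the locale.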






[jira] [Assigned] (SPARK-46841) Language support for collations

2024-05-28 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-46841:
---

Assignee: Nikola Mandic

> Language support for collations
> ---
>
> Key: SPARK-46841
> URL: https://issues.apache.org/jira/browse/SPARK-46841
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core, SQL
>Affects Versions: 4.0.0
>Reporter: Aleksandar Tomic
>Assignee: Nikola Mandic
>Priority: Major
>  Labels: pull-request-available
>
> Languages and localization for collations are supported by the ICU library. 
> The collation naming format is as follows:
> {code:java}
> <2-letter language code>[_<4-letter script>][_<3-letter country 
> code>][_specifier_specifier...]{code}
> The locale specifier consists of the first part of the collation name (language + 
> script + country). Locale specifiers need to be stable across ICU versions; 
> to keep existing ids and names invariant we introduce a golden file with the 
> locale table, which should cause a CI failure on any silent change.
> Currently supported optional specifiers:
>  * CS/CI - case sensitivity, default is case-sensitive; supported by 
> configuring ICU collation levels
>  * AS/AI - accent sensitivity; default is accent-sensitive; supported by 
> configuring ICU collation levels
>  * /LCASE/UCASE - case conversion performed prior to 
> comparisons; supported by internal implementation relying on ICU locale-aware 
> conversions
> Users can give collation specifiers in any order, except for the locale, which is 
> mandatory and must come first. There is a one-to-one mapping between collation 
> ids and collation names, defined in CollationFactory.






[jira] [Assigned] (SPARK-48364) Type casting for AbstractMapType

2024-05-22 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48364:
---

Assignee: Uroš Bojanić

> Type casting for AbstractMapType
> 
>
> Key: SPARK-48364
> URL: https://issues.apache.org/jira/browse/SPARK-48364
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-48364) Type casting for AbstractMapType

2024-05-22 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48364.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46661
[https://github.com/apache/spark/pull/46661]

> Type casting for AbstractMapType
> 
>
> Key: SPARK-48364
> URL: https://issues.apache.org/jira/browse/SPARK-48364
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Resolved] (SPARK-48215) DateFormatClass (all collations)

2024-05-22 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48215.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46561
[https://github.com/apache/spark/pull/46561]

> DateFormatClass (all collations)
> 
>
> Key: SPARK-48215
> URL: https://issues.apache.org/jira/browse/SPARK-48215
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Nebojsa Savic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Enable collation support for the *DateFormatClass* built-in function in 
> Spark. First confirm what is the expected behaviour for this expression when 
> given collated strings, and then move on to implementation and testing. You 
> will find this expression in the *datetimeExpressions.scala* file, and it 
> should be considered a pass-through function with respect to collation 
> awareness. Implement the corresponding E2E SQL tests 
> (CollationSQLExpressionsSuite) to reflect how this function should be used 
> with collation in SparkSQL, and feel free to use your chosen Spark SQL Editor 
> to experiment with the existing functions to learn more about how they work. 
> In addition, look into the possible use-cases and implementation of similar 
> functions within other open-source DBMSs, such as 
> [PostgreSQL|https://www.postgresql.org/docs/].
>  
> The goal for this Jira ticket is to implement the *DateFormatClass* 
> expression so that it supports all collation types currently supported in 
> Spark. To understand what changes were introduced in order to enable full 
> collation support for other existing functions in Spark, take a look at the 
> Spark PRs and Jira tickets for completed tasks in this parent (for example: 
> Ascii, Chr, Base64, UnBase64, Decode, StringDecode, Encode, ToBinary, 
> FormatNumber, Sentences).
>  
> Read more about ICU [Collation Concepts|http://example.com/] and 
> [Collator|http://example.com/] class. Also, refer to the Unicode Technical 
> Standard for string 
> [collation|https://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Type_Fallback].






[jira] [Assigned] (SPARK-48215) DateFormatClass (all collations)

2024-05-22 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48215:
---

Assignee: Nebojsa Savic

> DateFormatClass (all collations)
> 
>
> Key: SPARK-48215
> URL: https://issues.apache.org/jira/browse/SPARK-48215
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Nebojsa Savic
>Priority: Major
>  Labels: pull-request-available
>
> Enable collation support for the *DateFormatClass* built-in function in 
> Spark. First confirm what is the expected behaviour for this expression when 
> given collated strings, and then move on to implementation and testing. You 
> will find this expression in the *datetimeExpressions.scala* file, and it 
> should be considered a pass-through function with respect to collation 
> awareness. Implement the corresponding E2E SQL tests 
> (CollationSQLExpressionsSuite) to reflect how this function should be used 
> with collation in SparkSQL, and feel free to use your chosen Spark SQL Editor 
> to experiment with the existing functions to learn more about how they work. 
> In addition, look into the possible use-cases and implementation of similar 
> functions within other open-source DBMSs, such as 
> [PostgreSQL|https://www.postgresql.org/docs/].
>  
> The goal for this Jira ticket is to implement the *DateFormatClass* 
> expression so that it supports all collation types currently supported in 
> Spark. To understand what changes were introduced in order to enable full 
> collation support for other existing functions in Spark, take a look at the 
> Spark PRs and Jira tickets for completed tasks in this parent (for example: 
> Ascii, Chr, Base64, UnBase64, Decode, StringDecode, Encode, ToBinary, 
> FormatNumber, Sentences).
>  
> Read more about ICU [Collation Concepts|http://example.com/] and 
> [Collator|http://example.com/] class. Also, refer to the Unicode Technical 
> Standard for string 
> [collation|https://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Type_Fallback].






[jira] [Resolved] (SPARK-48305) CurrentLike - Database/Schema, Catalog, User (all collations)

2024-05-20 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48305.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46613
[https://github.com/apache/spark/pull/46613]

> CurrentLike - Database/Schema, Catalog, User (all collations)
> -
>
> Key: SPARK-48305
> URL: https://issues.apache.org/jira/browse/SPARK-48305
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48175) Store collation information in metadata and not in type for SER/DE

2024-05-18 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48175:
---

Assignee: Stefan Kandic

> Store collation information in metadata and not in type for SER/DE
> --
>
> Key: SPARK-48175
> URL: https://issues.apache.org/jira/browse/SPARK-48175
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark, SQL
>Affects Versions: 4.0.0
>Reporter: Stefan Kandic
>Assignee: Stefan Kandic
>Priority: Major
>  Labels: pull-request-available
>
> Change the serialization and deserialization of collated strings so that the 
> collation information is stored in the metadata of the enclosing struct field 
> and read back from there during parsing.
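
The idea can be sketched as a JSON round trip. This is a plain-Python illustration, not Spark's schema serializer: the helper names field_to_json and field_from_json and the metadata key "collation" are hypothetical, and the real key and layout used by Spark may differ. The serialized type stays a plain "string" and the collation travels in the field's metadata.

```python
import json

COLLATION_KEY = "collation"  # hypothetical metadata key, for illustration only


def field_to_json(name, collation=None):
    """Serialize a string field, keeping the collation in the field metadata."""
    metadata = {}
    if collation is not None:
        metadata[COLLATION_KEY] = collation
    return json.dumps(
        {"name": name, "type": "string", "nullable": True, "metadata": metadata}
    )


def field_from_json(text):
    """Parse a field and read the collation back from its metadata, if any."""
    obj = json.loads(text)
    collation = obj.get("metadata", {}).get(COLLATION_KEY)
    return obj["name"], obj["type"], collation
```

One plausible benefit of this layout, not stated in the ticket, is that a reader unaware of collations still sees an ordinary string field and can ignore the extra metadata entry.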






[jira] [Resolved] (SPARK-48175) Store collation information in metadata and not in type for SER/DE

2024-05-18 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48175.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46280
[https://github.com/apache/spark/pull/46280]

> Store collation information in metadata and not in type for SER/DE
> --
>
> Key: SPARK-48175
> URL: https://issues.apache.org/jira/browse/SPARK-48175
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark, SQL
>Affects Versions: 4.0.0
>Reporter: Stefan Kandic
>Assignee: Stefan Kandic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Changing serialization and deserialization of collated strings so that the 
> collation information is put in the metadata of the enclosing struct field - 
> and then read back from there during parsing.






[jira] [Resolved] (SPARK-48308) Unify getting data schema without partition columns in FileSourceStrategy

2024-05-16 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48308.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46619
[https://github.com/apache/spark/pull/46619]

> Unify getting data schema without partition columns in FileSourceStrategy
> -
>
> Key: SPARK-48308
> URL: https://issues.apache.org/jira/browse/SPARK-48308
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.5.1
>Reporter: Johan Lasperas
>Assignee: Johan Lasperas
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> In 
> [FileSourceStrategy|https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala#L191],
> the schema of the data, excluding partition columns, is computed twice in 
> slightly different ways:
>  
> {code:java}
> val dataColumnsWithoutPartitionCols = 
> dataColumns.filterNot(partitionSet.contains) {code}
>  
> vs 
> {code:java}
> val readDataColumns = dataColumns
>   .filterNot(partitionColumns.contains) {code}
>  
>  
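
A rough Python sketch of why the two filters quoted above may not be interchangeable. It assumes, without confirmation from the ticket, that the set-based lookup matches columns by an expression ID while the sequence-based lookup uses full equality. The `Attr` class and both helper functions are hypothetical stand-ins, not Spark APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attr:
    """Hypothetical column attribute: a name plus an expression ID."""
    name: str
    expr_id: int


def without_partitions_by_id(data_columns, partition_columns):
    """Drop partition columns, matching on expression ID only (set-style lookup)."""
    partition_ids = {p.expr_id for p in partition_columns}
    return [c for c in data_columns if c.expr_id not in partition_ids]


def without_partitions_by_equality(data_columns, partition_columns):
    """Drop partition columns, matching on full equality (sequence-style lookup)."""
    return [c for c in data_columns if c not in partition_columns]


data = [Attr("id", 1), Attr("value", 2), Attr("Date", 3)]
partitions = [Attr("date", 3)]  # same expression ID, different name casing

by_id = without_partitions_by_id(data, partitions)
by_equality = without_partitions_by_equality(data, partitions)
```

Under these assumptions the ID-based filter drops the partition column, while the equality-based filter keeps it because the names differ in case. Computing the result in one place avoids this kind of divergence.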






[jira] [Assigned] (SPARK-48308) Unify getting data schema without partition columns in FileSourceStrategy

2024-05-16 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48308:
---

Assignee: Johan Lasperas

> Unify getting data schema without partition columns in FileSourceStrategy
> -
>
> Key: SPARK-48308
> URL: https://issues.apache.org/jira/browse/SPARK-48308
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.5.1
>Reporter: Johan Lasperas
>Assignee: Johan Lasperas
>Priority: Trivial
>  Labels: pull-request-available
>
> In 
> [FileSourceStrategy|https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala#L191],
> the schema of the data, excluding partition columns, is computed twice in 
> slightly different ways:
>  
> {code:java}
> val dataColumnsWithoutPartitionCols = 
> dataColumns.filterNot(partitionSet.contains) {code}
>  
> vs 
> {code:java}
> val readDataColumns = dataColumns
>   .filterNot(partitionColumns.contains) {code}
>  
>  






[jira] [Resolved] (SPARK-48288) Add source data type to connector.Cast expression

2024-05-16 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48288.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46596
[https://github.com/apache/spark/pull/46596]

> Add source data type to connector.Cast expression
> -
>
> Key: SPARK-48288
> URL: https://issues.apache.org/jira/browse/SPARK-48288
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Uros Stankovic
>Assignee: Uros Stankovic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Currently, V2ExpressionBuilder builds a connector.Cast expression from a 
> catalyst.Cast expression.
> The Catalyst cast carries the data type of its input expression, but the 
> connector cast does not.
> Since some casts are not allowed on external engines, we need to know both the 
> source and target data types so that unsupported casts can be blocked at a 
> finer granularity.
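
A small Python sketch of the motivation above, under stated assumptions: once a cast node carries both the source and the target type, a connector can reject specific pairs instead of rejecting by target type alone. The `ConnectorCast` class, the `UNSUPPORTED_CASTS` pairs, and `can_push_down` are hypothetical and do not mirror the real connector API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectorCast:
    """Hypothetical cast node carrying both the source and the target type."""
    column: str
    source_type: str
    target_type: str


# Made-up (source, target) pairs that some external engine cannot execute.
UNSUPPORTED_CASTS = {("binary", "int"), ("timestamp", "boolean")}


def can_push_down(cast):
    """Allow the cast unless this exact source/target pair is unsupported."""
    return (cast.source_type, cast.target_type) not in UNSUPPORTED_CASTS


string_to_int = can_push_down(ConnectorCast("c", "string", "int"))
binary_to_int = can_push_down(ConnectorCast("c", "binary", "int"))
```

Both casts target `int`, yet only one is blocked. Without the source type, the connector would have to allow or block both.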






[jira] [Resolved] (SPARK-48252) Update CommonExpressionRef when necessary

2024-05-15 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48252.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46552
[https://github.com/apache/spark/pull/46552]

> Update CommonExpressionRef when necessary
> -
>
> Key: SPARK-48252
> URL: https://issues.apache.org/jira/browse/SPARK-48252
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48252) Update CommonExpressionRef when necessary

2024-05-15 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48252:
---

Assignee: Wenchen Fan

> Update CommonExpressionRef when necessary
> -
>
> Key: SPARK-48252
> URL: https://issues.apache.org/jira/browse/SPARK-48252
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-48172) Fix escaping issues in JDBCDialects

2024-05-15 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48172.
-
Fix Version/s: 3.4.4
   3.5.2
   4.0.0
   Resolution: Fixed

Issue resolved by pull request 46588
[https://github.com/apache/spark/pull/46588]

> Fix escaping issues in JDBCDialects
> ---
>
> Key: SPARK-48172
> URL: https://issues.apache.org/jira/browse/SPARK-48172
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Assignee: Mihailo Milosevic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.4, 3.5.2, 4.0.0
>
>







[jira] [Resolved] (SPARK-48277) Improve error message for ErrorClassesJsonReader.getErrorMessage

2024-05-15 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48277.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46584
[https://github.com/apache/spark/pull/46584]

> Improve error message for ErrorClassesJsonReader.getErrorMessage
> 
>
> Key: SPARK-48277
> URL: https://issues.apache.org/jira/browse/SPARK-48277
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48160) XPath expressions (all collations)

2024-05-15 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48160:
---

Assignee: Uroš Bojanić

> XPath expressions (all collations)
> --
>
> Key: SPARK-48160
> URL: https://issues.apache.org/jira/browse/SPARK-48160
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-48160) XPath expressions (all collations)

2024-05-15 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48160.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46508
[https://github.com/apache/spark/pull/46508]

> XPath expressions (all collations)
> --
>
> Key: SPARK-48160
> URL: https://issues.apache.org/jira/browse/SPARK-48160
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48162) Miscellaneous expressions (all collations)

2024-05-15 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48162:
---

Assignee: Uroš Bojanić

> Miscellaneous expressions (all collations)
> --
>
> Key: SPARK-48162
> URL: https://issues.apache.org/jira/browse/SPARK-48162
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-48162) Miscellaneous expressions (all collations)

2024-05-15 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48162.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46461
[https://github.com/apache/spark/pull/46461]

> Miscellaneous expressions (all collations)
> --
>
> Key: SPARK-48162
> URL: https://issues.apache.org/jira/browse/SPARK-48162
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Updated] (SPARK-48271) Turn match error in RowEncoder into UNSUPPORTED_DATA_TYPE_FOR_ENCODER

2024-05-14 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan updated SPARK-48271:

Summary: Turn match error in RowEncoder into 
UNSUPPORTED_DATA_TYPE_FOR_ENCODER  (was: support char/varchar in RowEncoder)

> Turn match error in RowEncoder into UNSUPPORTED_DATA_TYPE_FOR_ENCODER
> -
>
> Key: SPARK-48271
> URL: https://issues.apache.org/jira/browse/SPARK-48271
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Wenchen Fan
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-48263) Collate function support for non UTF8_BINARY strings

2024-05-14 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48263.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46574
[https://github.com/apache/spark/pull/46574]

> Collate function support for non UTF8_BINARY strings
> 
>
> Key: SPARK-48263
> URL: https://issues.apache.org/jira/browse/SPARK-48263
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Nebojsa Savic
>Assignee: Nebojsa Savic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> When the default collation config is set to a collation other than 
> UTF8_BINARY (e.g. UTF8_BINARY_LCASE), executing a COLLATE (or 
> collation) expression fails because it only accepts 
> StringType(0) as the argument type for the collation name.
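
The failure mode described above can be illustrated with a short Python sketch. It is only an illustration: `StringType` here is a hypothetical stand-in with a numeric collation ID (0 assumed to mean UTF8_BINARY, 1 made up for another collation), and `relaxed_check` is one possible relaxation, not necessarily what pull request 46574 does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StringType:
    """Hypothetical string type tagged with a numeric collation ID."""
    collation_id: int = 0  # assumption: 0 denotes UTF8_BINARY


def strict_check(arg_type):
    """Accept only the UTF8_BINARY string type, as the report describes."""
    return arg_type == StringType(0)


def relaxed_check(arg_type):
    """Accept a string type with any collation."""
    return isinstance(arg_type, StringType)


# With a non-default session collation, the literal holding the collation
# name would itself be typed with that collation (ID 1 is made up here).
collation_name_type = StringType(collation_id=1)

strict_ok = strict_check(collation_name_type)
relaxed_ok = relaxed_check(collation_name_type)
```

In this sketch the strict check rejects the collation-name argument as soon as the default collation changes, which matches the reported failure. The relaxed check accepts it.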






[jira] [Assigned] (SPARK-48263) Collate function support for non UTF8_BINARY strings

2024-05-14 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48263:
---

Assignee: Nebojsa Savic

> Collate function support for non UTF8_BINARY strings
> 
>
> Key: SPARK-48263
> URL: https://issues.apache.org/jira/browse/SPARK-48263
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Nebojsa Savic
>Assignee: Nebojsa Savic
>Priority: Major
>  Labels: pull-request-available
>
> When the default collation config is set to a collation other than 
> UTF8_BINARY (e.g. UTF8_BINARY_LCASE), executing a COLLATE (or 
> collation) expression fails because it only accepts 
> StringType(0) as the argument type for the collation name.






[jira] [Resolved] (SPARK-48172) Fix escaping issues in JDBCDialects

2024-05-14 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48172.
-
Fix Version/s: 3.4.4
   3.5.2
   4.0.0
   Resolution: Fixed

Issue resolved by pull request 46437
[https://github.com/apache/spark/pull/46437]

> Fix escaping issues in JDBCDialects
> ---
>
> Key: SPARK-48172
> URL: https://issues.apache.org/jira/browse/SPARK-48172
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Assignee: Mihailo Milosevic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.4, 3.5.2, 4.0.0
>
>







[jira] [Resolved] (SPARK-48155) PropagateEmpty relation cause LogicalQueryStage only with broadcast without join then execute failed

2024-05-14 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48155.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46523
[https://github.com/apache/spark/pull/46523]

> PropagateEmpty relation cause LogicalQueryStage only with broadcast without 
> join then execute failed
> 
>
> Key: SPARK-48155
> URL: https://issues.apache.org/jira/browse/SPARK-48155
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.1, 3.5.1, 3.3.4
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> {code:java}
> 24/05/07 09:48:55 ERROR [main] PlanChangeLogger:
> === Applying Rule 
> org.apache.spark.sql.execution.adaptive.AQEPropagateEmptyRelation ===
>  Project [date#124, station_name#0, shipment_id#14]
>  +- Filter (status#2L INSET 1, 149, 2, 36, 400, 417, 418, 419, 49, 5, 50, 581 
> AND station_type#1 IN (3,12))
>     +- Aggregate [date#124, shipment_id#14], [date#124, shipment_id#14, ... 3 
> more fields] 
> !      +- Join LeftOuter, ((cast(date#124 as timestamp) >= 
> cast(from_unixtime((ctime#27L - 0), -MM-dd HH:mm:ss, 
> Some(Asia/Singapore)) as timestamp)) AND (cast(date#124 as timestamp) + 
> INTERVAL '-4' DAY <= cast(from_unixtime((ctime#27L - 0), -MM-dd HH:mm:ss, 
> Some(Asia/Singapore)) as timestamp)))
> !         :- LogicalQueryStage Generate 
> explode(org.apache.spark.sql.catalyst.expressions.UnsafeArrayData@3a191e40), 
> false, [date#124], BroadcastQueryStage 0
> !         +- LocalRelation , [shipment_id#14, station_name#5, ... 3 
> more fields]24/05/07 09:48:55 ERROR [main] 
> Project [date#124, station_name#0, shipment_id#14]
>  +- Filter (status#2L INSET 1, 149, 2, 36, 400, 417, 418, 419, 49, 5, 50, 581 
> AND station_type#1 IN (3,12))
>     +- Aggregate [date#124, shipment_id#14], [date#124, shipment_id#14, ... 3 
> more fields]
> !      +- Project [date#124, cast(null as string) AS shipment_id#14, ... 4 
> more fields]
> !         +- LogicalQueryStage Generate 
> explode(org.apache.spark.sql.catalyst.expressions.UnsafeArrayData@3a191e40), 
> false, [date#124], BroadcastQueryStage 0 {code}
> {code:java}
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> java.lang.UnsupportedOperationException: BroadcastExchange does not support 
> the execute() code path.at 
> org.apache.spark.sql.errors.QueryExecutionErrors$.executeCodePathUnsupportedError(QueryExecutionErrors.scala:1652)
> at 
> org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecute(BroadcastExchangeExec.scala:203)
> at 
> org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:184)
> at 
> org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:222)
> at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at 
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:219)
> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:180)
> at 
> org.apache.spark.sql.execution.adaptive.QueryStageExec.doExecute(QueryStageExec.scala:119)
> at 
> org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:184)
> at 
> org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:222)
> at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at 
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:219)
> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:180)
> at 
> org.apache.spark.sql.execution.InputAdapter.inputRDD(WholeStageCodegenExec.scala:526)
> at 
> org.apache.spark.sql.execution.InputRDDCodegen.inputRDDs(WholeStageCodegenExec.scala:454)
> at 
> org.apache.spark.sql.execution.InputRDDCodegen.inputRDDs$(WholeStageCodegenExec.scala:453)
> at 
> org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:497)
> at 
> org.apache.spark.sql.execution.ProjectExec.inputRDDs(basicPhysicalOperators.scala:50)
> at org.apache.spark.sql.execution.SortExec.inputRDDs(SortExec.scala:132)  
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:750)
> at 
> org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:184)
> at 
> org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:222)
> at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at 
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:219)
> at 

[jira] [Assigned] (SPARK-48155) PropagateEmpty relation cause LogicalQueryStage only with broadcast without join then execute failed

2024-05-14 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48155:
---

Assignee: angerszhu

> PropagateEmpty relation cause LogicalQueryStage only with broadcast without 
> join then execute failed
> 
>
> Key: SPARK-48155
> URL: https://issues.apache.org/jira/browse/SPARK-48155
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.1, 3.5.1, 3.3.4
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> 24/05/07 09:48:55 ERROR [main] PlanChangeLogger:
> === Applying Rule 
> org.apache.spark.sql.execution.adaptive.AQEPropagateEmptyRelation ===
>  Project [date#124, station_name#0, shipment_id#14]
>  +- Filter (status#2L INSET 1, 149, 2, 36, 400, 417, 418, 419, 49, 5, 50, 581 
> AND station_type#1 IN (3,12))
>     +- Aggregate [date#124, shipment_id#14], [date#124, shipment_id#14, ... 3 
> more fields] 
> !      +- Join LeftOuter, ((cast(date#124 as timestamp) >= 
> cast(from_unixtime((ctime#27L - 0), -MM-dd HH:mm:ss, 
> Some(Asia/Singapore)) as timestamp)) AND (cast(date#124 as timestamp) + 
> INTERVAL '-4' DAY <= cast(from_unixtime((ctime#27L - 0), -MM-dd HH:mm:ss, 
> Some(Asia/Singapore)) as timestamp)))
> !         :- LogicalQueryStage Generate 
> explode(org.apache.spark.sql.catalyst.expressions.UnsafeArrayData@3a191e40), 
> false, [date#124], BroadcastQueryStage 0
> !         +- LocalRelation , [shipment_id#14, station_name#5, ... 3 
> more fields]24/05/07 09:48:55 ERROR [main] 
> Project [date#124, station_name#0, shipment_id#14]
>  +- Filter (status#2L INSET 1, 149, 2, 36, 400, 417, 418, 419, 49, 5, 50, 581 
> AND station_type#1 IN (3,12))
>     +- Aggregate [date#124, shipment_id#14], [date#124, shipment_id#14, ... 3 
> more fields]
> !      +- Project [date#124, cast(null as string) AS shipment_id#14, ... 4 
> more fields]
> !         +- LogicalQueryStage Generate 
> explode(org.apache.spark.sql.catalyst.expressions.UnsafeArrayData@3a191e40), 
> false, [date#124], BroadcastQueryStage 0 {code}
> {code:java}
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> java.lang.UnsupportedOperationException: BroadcastExchange does not support 
> the execute() code path.at 
> org.apache.spark.sql.errors.QueryExecutionErrors$.executeCodePathUnsupportedError(QueryExecutionErrors.scala:1652)
> at 
> org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecute(BroadcastExchangeExec.scala:203)
> at 
> org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:184)
> at 
> org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:222)
> at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at 
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:219)
> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:180)
> at 
> org.apache.spark.sql.execution.adaptive.QueryStageExec.doExecute(QueryStageExec.scala:119)
> at 
> org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:184)
> at 
> org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:222)
> at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at 
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:219)
> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:180)
> at 
> org.apache.spark.sql.execution.InputAdapter.inputRDD(WholeStageCodegenExec.scala:526)
> at 
> org.apache.spark.sql.execution.InputRDDCodegen.inputRDDs(WholeStageCodegenExec.scala:454)
> at 
> org.apache.spark.sql.execution.InputRDDCodegen.inputRDDs$(WholeStageCodegenExec.scala:453)
> at 
> org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:497)
> at 
> org.apache.spark.sql.execution.ProjectExec.inputRDDs(basicPhysicalOperators.scala:50)
> at org.apache.spark.sql.execution.SortExec.inputRDDs(SortExec.scala:132)  
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:750)
> at 
> org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:184)
> at 
> org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:222)
> at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at 
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:219)
> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:180)
> at 
> 

[jira] [Created] (SPARK-48271) support char/varchar in RowEncoder

2024-05-14 Thread Wenchen Fan (Jira)
Wenchen Fan created SPARK-48271:
---

 Summary: support char/varchar in RowEncoder
 Key: SPARK-48271
 URL: https://issues.apache.org/jira/browse/SPARK-48271
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 4.0.0
Reporter: Wenchen Fan









[jira] [Resolved] (SPARK-48157) CSV expressions (all collations)

2024-05-14 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48157.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46504
[https://github.com/apache/spark/pull/46504]

> CSV expressions (all collations)
> 
>
> Key: SPARK-48157
> URL: https://issues.apache.org/jira/browse/SPARK-48157
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Enable collation support for *CSV* built-in string functions in Spark 
> ({*}CsvToStructs{*}, {*}SchemaOfCsv{*}, {*}StructsToCsv{*}). First confirm 
> the expected behaviour of these functions when given collated 
> strings, and then move on to implementation and testing. You will find these 
> expressions in the *csvExpressions.scala* file, and they should mostly be 
> pass-through functions. Implement the corresponding E2E SQL tests 
> (CollationSQLExpressionsSuite) to reflect how these functions should be used 
> with collation in SparkSQL, and feel free to use your chosen Spark SQL Editor 
> to experiment with the existing functions to learn more about how they work. 
> In addition, look into the possible use-cases and implementation of similar 
> functions within other open-source DBMS, such as 
> [PostgreSQL|https://www.postgresql.org/docs/].
>  
> The goal for this Jira ticket is to implement the *CSV* expressions so that 
> they support all collation types currently supported in Spark. To understand 
> what changes were introduced in order to enable full collation support for 
> other existing functions in Spark, take a look at the Spark PRs and Jira 
> tickets for completed tasks in this parent (for example: Ascii, Chr, Base64, 
> UnBase64, Decode, StringDecode, Encode, ToBinary, FormatNumber, Sentences).
>  
> Read more about ICU [Collation Concepts|http://example.com/] and 
> [Collator|http://example.com/] class. Also, refer to the Unicode Technical 
> Standard for string 
> [collation|https://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Type_Fallback].






[jira] [Assigned] (SPARK-48157) CSV expressions (all collations)

2024-05-14 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48157:
---

Assignee: Uroš Bojanić

> CSV expressions (all collations)
> 
>
> Key: SPARK-48157
> URL: https://issues.apache.org/jira/browse/SPARK-48157
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
>
> Enable collation support for *CSV* built-in string functions in Spark 
> ({*}CsvToStructs{*}, {*}SchemaOfCsv{*}, {*}StructsToCsv{*}). First confirm 
> the expected behaviour of these functions when given collated 
> strings, and then move on to implementation and testing. You will find these 
> expressions in the *csvExpressions.scala* file, and they should mostly be 
> pass-through functions. Implement the corresponding E2E SQL tests 
> (CollationSQLExpressionsSuite) to reflect how these functions should be used 
> with collation in SparkSQL, and feel free to use your chosen Spark SQL Editor 
> to experiment with the existing functions to learn more about how they work. 
> In addition, look into the possible use-cases and implementation of similar 
> functions within other open-source DBMS, such as 
> [PostgreSQL|https://www.postgresql.org/docs/].
>  
> The goal for this Jira ticket is to implement the *CSV* expressions so that 
> they support all collation types currently supported in Spark. To understand 
> what changes were introduced in order to enable full collation support for 
> other existing functions in Spark, take a look at the Spark PRs and Jira 
> tickets for completed tasks in this parent (for example: Ascii, Chr, Base64, 
> UnBase64, Decode, StringDecode, Encode, ToBinary, FormatNumber, Sentences).
>  
> Read more about ICU [Collation Concepts|http://example.com/] and 
> [Collator|http://example.com/] class. Also, refer to the Unicode Technical 
> Standard for string 
> [collation|https://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Type_Fallback].






[jira] [Resolved] (SPARK-48229) inputFile expressions (all collations)

2024-05-14 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48229.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46503
[https://github.com/apache/spark/pull/46503]

> inputFile expressions (all collations)
> --
>
> Key: SPARK-48229
> URL: https://issues.apache.org/jira/browse/SPARK-48229
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48265) Infer window group limit batch should do constant folding

2024-05-13 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48265:
---

Assignee: angerszhu

> Infer window group limit batch should do constant folding
> -
>
> Key: SPARK-48265
> URL: https://issues.apache.org/jira/browse/SPARK-48265
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0, 3.5.1
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger:
> === Result of Batch LocalRelation ===
>  GlobalLimit 21                                                               
>                                                               GlobalLimit 21
>  +- LocalLimit 21                                                             
>                                                               +- LocalLimit 21
> !   +- Union false, false                                                     
>                                                                  +- 
> LocalLimit 21
> !      :- LocalLimit 21                                                       
>                                                                     +- 
> Project [item_id#647L]
> !      :  +- Project [item_id#647L]                                           
>                                                                        +- 
> Filter (((isnotnull(tz_type#734) AND (tz_type#734 = local)) AND 
> (grass_region#735 = BR)) AND isnotnull(grass_region#735))
> !      :     +- Filter (((isnotnull(tz_type#734) AND (tz_type#734 = local)) 
> AND (grass_region#735 = BR)) AND isnotnull(grass_region#735))               
> +- Relation db.table[,... 91 more fields] parquet
> !      :        +- Relation db.table[,... 91 more fields] parquet
> !      +- LocalLimit 21
> !         +- Project [item_id#738L]
> !            +- LocalRelation , [, ... 91 more fields]
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch Check Cartesian 
> Products has no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch RewriteSubquery has no 
> effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch 
> NormalizeFloatingNumbers has no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch 
> ReplaceUpdateFieldsExpression has no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch Optimize Metadata Only 
> Query has no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch PartitionPruning has 
> no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch InjectRuntimeFilter 
> has no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch Pushdown Filters from 
> PartitionPruning has no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch Cleanup filters that 
> cannot be pushed down has no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch Extract Python UDFs 
> has no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger:
> === Applying Rule org.apache.spark.sql.catalyst.optimizer.EliminateLimits ===
>  GlobalLimit 21                                                               
>                                                            GlobalLimit 21
> !+- LocalLimit 21                                                             
>                                                            +- LocalLimit 
> least(, ... 2 more fields)
> !   +- LocalLimit 21                                                          
>                                                               +- Project 
> [item_id#647L]
> !      +- Project [item_id#647L]                                              
>                                                                  +- Filter 
> (((isnotnull(tz_type#734) AND (tz_type#734 = local)) AND (grass_region#735 = 
> BR)) AND isnotnull(grass_region#735))
> !         +- Filter (((isnotnull(tz_type#734) AND (tz_type#734 = local)) AND 
> (grass_region#735 = BR)) AND isnotnull(grass_region#735))            +- 
> Relation db.table[,... 91 more fields] parquet
> !            +- Relation db.table[,... 91 more fields] parquet
>  {code}






[jira] [Resolved] (SPARK-48265) Infer window group limit batch should do constant folding

2024-05-13 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48265.
-
Fix Version/s: 3.5.2
   4.0.0
   Resolution: Fixed

Issue resolved by pull request 46568
[https://github.com/apache/spark/pull/46568]

> Infer window group limit batch should do constant folding
> -
>
> Key: SPARK-48265
> URL: https://issues.apache.org/jira/browse/SPARK-48265
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0, 3.5.1
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.2, 4.0.0
>
>
> {code:java}
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger:
> === Result of Batch LocalRelation ===
>  GlobalLimit 21                                                               
>                                                               GlobalLimit 21
>  +- LocalLimit 21                                                             
>                                                               +- LocalLimit 21
> !   +- Union false, false                                                     
>                                                                  +- 
> LocalLimit 21
> !      :- LocalLimit 21                                                       
>                                                                     +- 
> Project [item_id#647L]
> !      :  +- Project [item_id#647L]                                           
>                                                                        +- 
> Filter (((isnotnull(tz_type#734) AND (tz_type#734 = local)) AND 
> (grass_region#735 = BR)) AND isnotnull(grass_region#735))
> !      :     +- Filter (((isnotnull(tz_type#734) AND (tz_type#734 = local)) 
> AND (grass_region#735 = BR)) AND isnotnull(grass_region#735))               
> +- Relation db.table[,... 91 more fields] parquet
> !      :        +- Relation db.table[,... 91 more fields] parquet
> !      +- LocalLimit 21
> !         +- Project [item_id#738L]
> !            +- LocalRelation , [, ... 91 more fields]
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch Check Cartesian 
> Products has no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch RewriteSubquery has no 
> effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch 
> NormalizeFloatingNumbers has no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch 
> ReplaceUpdateFieldsExpression has no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch Optimize Metadata Only 
> Query has no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch PartitionPruning has 
> no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch InjectRuntimeFilter 
> has no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch Pushdown Filters from 
> PartitionPruning has no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch Cleanup filters that 
> cannot be pushed down has no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger: Batch Extract Python UDFs 
> has no effect.
> 24/05/13 17:39:25 ERROR [main] PlanChangeLogger:
> === Applying Rule org.apache.spark.sql.catalyst.optimizer.EliminateLimits ===
>  GlobalLimit 21                                                               
>                                                            GlobalLimit 21
> !+- LocalLimit 21                                                             
>                                                            +- LocalLimit 
> least(, ... 2 more fields)
> !   +- LocalLimit 21                                                          
>                                                               +- Project 
> [item_id#647L]
> !      +- Project [item_id#647L]                                              
>                                                                  +- Filter 
> (((isnotnull(tz_type#734) AND (tz_type#734 = local)) AND (grass_region#735 = 
> BR)) AND isnotnull(grass_region#735))
> !         +- Filter (((isnotnull(tz_type#734) AND (tz_type#734 = local)) AND 
> (grass_region#735 = BR)) AND isnotnull(grass_region#735))            +- 
> Relation db.table[,... 91 more fields] parquet
> !            +- Relation db.table[,... 91 more fields] parquet
>  {code}






[jira] [Resolved] (SPARK-48241) CSV parsing failure with char/varchar type columns

2024-05-13 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48241.
-
Fix Version/s: 3.5.2
   Resolution: Fixed

Issue resolved by pull request 46565
[https://github.com/apache/spark/pull/46565]

> CSV parsing failure with char/varchar type columns
> --
>
> Key: SPARK-48241
> URL: https://issues.apache.org/jira/browse/SPARK-48241
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.5.1
>Reporter: Jiayi Liu
>Assignee: Jiayi Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.2, 4.0.0
>
>
> Selecting from a CSV table that contains char and varchar columns fails with 
> the following error:
> {code:java}
> java.lang.IllegalArgumentException: requirement failed: requiredSchema 
> (struct) should be the subset of dataSchema 
> (struct).
>     at scala.Predef$.require(Predef.scala:281)
>     at 
> org.apache.spark.sql.catalyst.csv.UnivocityParser.(UnivocityParser.scala:56)
>     at 
> org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.$anonfun$buildReader$2(CSVFileFormat.scala:127)
>     at 
> org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:155)
>     at 
> org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:140)
>     at 
> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:231)
>     at 
> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:293)
>     at 
> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:125){code}
> The error occurs because the StringType columns in the dataSchema and the 
> requiredSchema of UnivocityParser are inconsistent: the StringType 
> StructField in the dataSchema carries metadata that is missing from the 
> requiredSchema. We need to retain the metadata when resolving the schema.
>  
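
The mismatch described above can be illustrated with a minimal, Spark-free sketch in Python. Everything in it is an assumption made for illustration: the `Field` class, the `is_subset` helper, and the metadata key are hypothetical stand-ins, not Spark's StructField or UnivocityParser code. It only shows why a subset check based on field equality fails once one side drops its metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema field; equality covers metadata too."""
    name: str
    data_type: str
    metadata: tuple = ()  # hashable stand-in for field metadata


def is_subset(required, data):
    """True when every required field is equal to some field of the data schema."""
    return all(f in data for f in required)


# The metadata key below is a made-up name used only for this sketch.
data_schema = [
    Field("c", "string", (("original_type", "char(5)"),)),
    Field("v", "string", (("original_type", "varchar(10)"),)),
]

# Same columns, but the metadata was lost while resolving the schema.
required_without_metadata = [Field(f.name, f.data_type) for f in data_schema]
# Same columns with the metadata retained.
required_with_metadata = list(data_schema)

print(is_subset(required_without_metadata, data_schema))
print(is_subset(required_with_metadata, data_schema))
```

Under these assumptions the first check fails and the second succeeds, which matches the direction of the fix named in the ticket: keep the metadata when resolving the schema.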






[jira] [Assigned] (SPARK-48241) CSV parsing failure with char/varchar type columns

2024-05-13 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48241:
---

Assignee: Jiayi Liu

> CSV parsing failure with char/varchar type columns
> --
>
> Key: SPARK-48241
> URL: https://issues.apache.org/jira/browse/SPARK-48241
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.5.1
>Reporter: Jiayi Liu
>Assignee: Jiayi Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Selecting from a CSV table that contains char and varchar columns fails with 
> the following error:
> {code:java}
> java.lang.IllegalArgumentException: requirement failed: requiredSchema 
> (struct) should be the subset of dataSchema 
> (struct).
>     at scala.Predef$.require(Predef.scala:281)
>     at 
> org.apache.spark.sql.catalyst.csv.UnivocityParser.(UnivocityParser.scala:56)
>     at 
> org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.$anonfun$buildReader$2(CSVFileFormat.scala:127)
>     at 
> org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:155)
>     at 
> org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:140)
>     at 
> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:231)
>     at 
> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:293)
>     at 
> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:125){code}
> The error occurs because the StringType columns in the dataSchema and the 
> requiredSchema of UnivocityParser are inconsistent: the StringType 
> StructField in the dataSchema carries metadata that is missing from the 
> requiredSchema. We need to retain the metadata when resolving the schema.
>  






[jira] [Resolved] (SPARK-48206) Add tests for window expression rewrites in RewriteWithExpression

2024-05-13 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48206.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46492
[https://github.com/apache/spark/pull/46492]

> Add tests for window expression rewrites in RewriteWithExpression
> -
>
> Key: SPARK-48206
> URL: https://issues.apache.org/jira/browse/SPARK-48206
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Kelvin Jiang
>Assignee: Kelvin Jiang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Window expressions can be problematic if a window expression is pulled 
> out of its `Window` operator. This should not happen right now, but we 
> should add tests to make sure it does not break.






[jira] [Assigned] (SPARK-48206) Add tests for window expression rewrites in RewriteWithExpression

2024-05-13 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48206:
---

Assignee: Kelvin Jiang

> Add tests for window expression rewrites in RewriteWithExpression
> -
>
> Key: SPARK-48206
> URL: https://issues.apache.org/jira/browse/SPARK-48206
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Kelvin Jiang
>Assignee: Kelvin Jiang
>Priority: Major
>  Labels: pull-request-available
>
> Window expressions can be problematic if a window expression is pulled 
> out of its `Window` operator. This should not happen right now, but we 
> should add tests to make sure it does not break.






[jira] [Assigned] (SPARK-48031) Add schema evolution options to views

2024-05-13 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48031:
---

Assignee: Serge Rielau

> Add schema evolution options to views 
> --
>
> Key: SPARK-48031
> URL: https://issues.apache.org/jira/browse/SPARK-48031
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Serge Rielau
>Assignee: Serge Rielau
>Priority: Major
>  Labels: pull-request-available
>
> We want to give views the ability to react to changes in query resolution 
> in ways other than simply failing the view.
> For example, we want the view to be able to compensate for type changes by 
> casting the query result to the view column types,
> or to adopt any kind of column arity change into the view.
>  
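
The two reactions named above (casting to the view's column types, and adopting arity changes) can be sketched in plain Python. This is only an illustration under stated assumptions: the function names and the tuple-based row model are hypothetical and say nothing about the syntax or implementation the ticket actually introduces.

```python
def cast_to_view_types(row, view_types):
    """Compensation sketch: keep the view's column types and cast each value."""
    if len(row) != len(view_types):
        # Casting only addresses type changes, not a changed column count.
        raise ValueError("column arity changed; casting alone cannot compensate")
    return tuple(cast(value) for cast, value in zip(view_types, row))


def adopt_query_columns(query_columns):
    """Evolution sketch: the view takes over whatever columns the query now yields."""
    return list(query_columns)


# The view was defined with (int, float), but the query now yields (str, int).
print(cast_to_view_types(("7", 3), (int, float)))

# The underlying query gained a column; the view adopts the new column list.
print(adopt_query_columns(["id", "amount", "note"]))
```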






[jira] [Resolved] (SPARK-48031) Add schema evolution options to views

2024-05-13 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48031.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46267
[https://github.com/apache/spark/pull/46267]

> Add schema evolution options to views 
> --
>
> Key: SPARK-48031
> URL: https://issues.apache.org/jira/browse/SPARK-48031
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Serge Rielau
>Assignee: Serge Rielau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> We want to give views the ability to react to changes in query resolution 
> in ways other than simply failing the view.
> For example, we want the view to be able to compensate for type changes by 
> casting the query result to the view column types,
> or to adopt any kind of column arity change into the view.
>  






[jira] [Created] (SPARK-48260) disable output committer coordination in one test of ParquetIOSuite

2024-05-13 Thread Wenchen Fan (Jira)
Wenchen Fan created SPARK-48260:
---

 Summary: disable output committer coordination in one test of 
ParquetIOSuite
 Key: SPARK-48260
 URL: https://issues.apache.org/jira/browse/SPARK-48260
 Project: Spark
  Issue Type: Test
  Components: SQL
Affects Versions: 4.0.0
Reporter: Wenchen Fan









[jira] [Created] (SPARK-48252) Update CommonExpressionRef when necessary

2024-05-13 Thread Wenchen Fan (Jira)
Wenchen Fan created SPARK-48252:
---

 Summary: Update CommonExpressionRef when necessary
 Key: SPARK-48252
 URL: https://issues.apache.org/jira/browse/SPARK-48252
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 4.0.0
Reporter: Wenchen Fan






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-48146) Fix error with aggregate function in With child

2024-05-10 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48146:
---

Assignee: Kelvin Jiang

> Fix error with aggregate function in With child
> ---
>
> Key: SPARK-48146
> URL: https://issues.apache.org/jira/browse/SPARK-48146
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Kelvin Jiang
>Assignee: Kelvin Jiang
>Priority: Major
>  Labels: pull-request-available
>
> Right now, if we have an aggregate function in the child of a With 
> expression, we fail an assertion. However, queries like this used to work:
> {code:sql}
> select
> id between cast(max(id between 1 and 2) as int) and id
> from range(10)
> group by id
> {code}
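
To make clear what the query above is expected to return once it works again, here is a small plain-Python evaluation of the same logic. It is a sketch of the query's semantics only, with a hypothetical `between` helper, and does not model the With expression or the failing assertion.

```python
def between(value, low, high):
    """SQL-style inclusive BETWEEN for non-null values."""
    return low <= value <= high


# from range(10) group by id: every group holds exactly one id.
groups = {}
for id_ in range(10):
    groups.setdefault(id_, []).append(id_)

result = {}
for key, members in groups.items():
    # cast(max(id between 1 and 2) as int): 1 if any member lies in [1, 2], else 0.
    lower = int(max(between(m, 1, 2) for m in members))
    # id between <lower> and id
    result[key] = between(key, lower, key)

print(result)
```

The inner aggregate yields 1 for ids 1 and 2 and 0 otherwise, so the outer predicate holds for every group.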






[jira] [Resolved] (SPARK-48146) Fix error with aggregate function in With child

2024-05-10 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48146.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46443
[https://github.com/apache/spark/pull/46443]

> Fix error with aggregate function in With child
> ---
>
> Key: SPARK-48146
> URL: https://issues.apache.org/jira/browse/SPARK-48146
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Kelvin Jiang
>Assignee: Kelvin Jiang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Right now, if we have an aggregate function in the child of a With 
> expression, we fail an assertion. However, queries like this used to work:
> {code:sql}
> select
> id between cast(max(id between 1 and 2) as int) and id
> from range(10)
> group by id
> {code}






[jira] [Resolved] (SPARK-48158) XML expressions (all collations)

2024-05-10 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48158.
-
Fix Version/s: 4.0.0
 Assignee: Uroš Bojanić
   Resolution: Fixed

> XML expressions (all collations)
> 
>
> Key: SPARK-48158
> URL: https://issues.apache.org/jira/browse/SPARK-48158
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Enable collation support for *XML* built-in string functions in Spark 
> ({*}XmlToStructs{*}, {*}SchemaOfXml{*}, {*}StructsToXml{*}). First confirm 
> what the expected behaviour of these functions is when given collated 
> strings, and then move on to implementation and testing. You will find these 
> expressions in the *xmlExpressions.scala* file, and they should mostly be 
> pass-through functions. Implement the corresponding E2E SQL tests 
> (CollationSQLExpressionsSuite) to reflect how these functions should be used 
> with collation in SparkSQL, and feel free to use your chosen Spark SQL Editor 
> to experiment with the existing functions to learn more about how they work. 
> In addition, look into the possible use-cases and implementation of similar 
> functions within other open-source DBMS, such as 
> [PostgreSQL|https://www.postgresql.org/docs/].
>  
> The goal for this Jira ticket is to implement the *XML* expressions so that 
> they support all collation types currently supported in Spark. To understand 
> what changes were introduced in order to enable full collation support for 
> other existing functions in Spark, take a look at the Spark PRs and Jira 
> tickets for completed tasks in this parent (for example: Ascii, Chr, Base64, 
> UnBase64, Decode, StringDecode, Encode, ToBinary, FormatNumber, Sentences).
>  
> Read more about ICU [Collation Concepts|http://example.com/] and 
> [Collator|http://example.com/] class. Also, refer to the Unicode Technical 
> Standard for string 
> [collation|https://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Type_Fallback].






[jira] [Assigned] (SPARK-48222) Sync Ruby Bundler to 2.4.22 and refresh Gem lock file

2024-05-09 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48222:
---

Assignee: Nicholas Chammas

> Sync Ruby Bundler to 2.4.22 and refresh Gem lock file
> -
>
> Key: SPARK-48222
> URL: https://issues.apache.org/jira/browse/SPARK-48222
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, Documentation
>Affects Versions: 4.0.0
>Reporter: Nicholas Chammas
>Assignee: Nicholas Chammas
>Priority: Minor
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-48222) Sync Ruby Bundler to 2.4.22 and refresh Gem lock file

2024-05-09 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48222.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46512
[https://github.com/apache/spark/pull/46512]

> Sync Ruby Bundler to 2.4.22 and refresh Gem lock file
> -
>
> Key: SPARK-48222
> URL: https://issues.apache.org/jira/browse/SPARK-48222
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, Documentation
>Affects Versions: 4.0.0
>Reporter: Nicholas Chammas
>Assignee: Nicholas Chammas
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-47409) StringTrim & StringTrimLeft/Right/Both (binary & lowercase collation only)

2024-05-09 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-47409:
---

Assignee: David Milicevic

> StringTrim & StringTrimLeft/Right/Both (binary & lowercase collation only)
> --
>
> Key: SPARK-47409
> URL: https://issues.apache.org/jira/browse/SPARK-47409
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: David Milicevic
>Priority: Major
>  Labels: pull-request-available
>
> Enable collation support for the *StringTrim* built-in string function in 
> Spark (including {*}StringTrimBoth{*}, {*}StringTrimLeft{*}, 
> {*}StringTrimRight{*}). First confirm what the expected behaviour of 
> these functions is when given collated strings, and then move on to 
> implementation and testing. One way to go about this is to consider using 
> {_}StringSearch{_}, an efficient ICU service for string matching. Implement 
> the corresponding unit tests (CollationStringExpressionsSuite) and E2E tests 
> (CollationSuite) to reflect how these functions should be used with collation 
> in SparkSQL, and feel free to use your chosen Spark SQL Editor to experiment 
> with the existing functions to learn more about how they work. In addition, 
> look into the possible use-cases and implementation of similar functions 
> within other open-source DBMS, such as 
> [PostgreSQL|https://www.postgresql.org/docs/].
>  
> The goal for this Jira ticket is to implement the *StringTrim* function so it 
> supports binary & lowercase collation types currently supported in Spark. To 
> understand what changes were introduced in order to enable full collation 
> support for other existing functions in Spark, take a look at the Spark PRs 
> and Jira tickets for completed tasks in this parent (for example: Contains, 
> StartsWith, EndsWith).
>  
> Read more about ICU [Collation Concepts|http://example.com/] and 
> [Collator|http://example.com/] class, as well as _StringSearch_ using the 
> [ICU user 
> guide|https://unicode-org.github.io/icu/userguide/collation/string-search.html]
>  and [ICU 
> docs|https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/text/StringSearch.html].
>  Also, refer to the Unicode Technical Standard for string 
> [searching|https://www.unicode.org/reports/tr10/#Searching] and 
> [collation|https://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Type_Fallback].
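
As a rough illustration of the difference between binary and lowercase trimming, the following plain-Python sketch may help. It rests on a loud assumption: "lowercase collation" is modelled here as comparing characters after per-character `str.lower()`, which is a simplification and not Spark's or ICU's actual matching logic. The `trim_lcase` function and its parameters are hypothetical.

```python
def trim_lcase(source, trim_chars=" ", left=True, right=True):
    """Trim characters that match trim_chars case-insensitively (simplified model)."""
    targets = {c.lower() for c in trim_chars}
    start, end = 0, len(source)
    if left:
        # Advance past leading characters whose lowercase form is a trim target.
        while start < end and source[start].lower() in targets:
            start += 1
    if right:
        # Step back over trailing characters whose lowercase form is a trim target.
        while end > start and source[end - 1].lower() in targets:
            end -= 1
    return source[start:end]


# Binary behaviour, using Python's exact-match strip as a stand-in.
print("xXSparkXx".strip("x"))

# Lowercase behaviour: 'x' and 'X' are treated as the same trim character.
print(trim_lcase("xXSparkXx", "x"))
print(trim_lcase("xXSparkXx", "X", right=False))
print(trim_lcase("xXSparkXx", "X", left=False))
```

The `left` and `right` flags mirror the Left/Right/Both variants named in the ticket title; characters whose lowercase form has a different length are deliberately not handled in this sketch.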






[jira] [Resolved] (SPARK-47409) StringTrim & StringTrimLeft/Right/Both (binary & lowercase collation only)

2024-05-09 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-47409.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46206
[https://github.com/apache/spark/pull/46206]

> StringTrim & StringTrimLeft/Right/Both (binary & lowercase collation only)
> --
>
> Key: SPARK-47409
> URL: https://issues.apache.org/jira/browse/SPARK-47409
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: David Milicevic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Enable collation support for the *StringTrim* built-in string function in 
> Spark (including {*}StringTrimBoth{*}, {*}StringTrimLeft{*}, 
> {*}StringTrimRight{*}). First confirm what the expected behaviour of 
> these functions is when given collated strings, and then move on to 
> implementation and testing. One way to go about this is to consider using 
> {_}StringSearch{_}, an efficient ICU service for string matching. Implement 
> the corresponding unit tests (CollationStringExpressionsSuite) and E2E tests 
> (CollationSuite) to reflect how these functions should be used with collation 
> in SparkSQL, and feel free to use your chosen Spark SQL Editor to experiment 
> with the existing functions to learn more about how they work. In addition, 
> look into the possible use-cases and implementation of similar functions 
> within other open-source DBMS, such as 
> [PostgreSQL|https://www.postgresql.org/docs/].
>  
> The goal for this Jira ticket is to implement the *StringTrim* function so it 
> supports binary & lowercase collation types currently supported in Spark. To 
> understand what changes were introduced in order to enable full collation 
> support for other existing functions in Spark, take a look at the Spark PRs 
> and Jira tickets for completed tasks in this parent (for example: Contains, 
> StartsWith, EndsWith).
>  
> Read more about ICU [Collation Concepts|http://example.com/] and the 
> [Collator|http://example.com/] class, as well as _StringSearch_, using the 
> [ICU user guide|https://unicode-org.github.io/icu/userguide/collation/string-search.html] 
> and [ICU docs|https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/text/StringSearch.html]. 
> Also, refer to the Unicode Technical Standard for string 
> [searching|https://www.unicode.org/reports/tr10/#Searching] and 
> [collation|https://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Type_Fallback].
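
The difference between binary and lowercase trimming described in the ticket can be illustrated with a small standalone sketch. It is not Spark's implementation: the function name trim_collated and its parameters are hypothetical, and it assumes that under lowercase collation two characters match when their lowercased forms are equal, which is exactly the behaviour the ticket asks to confirm first.

```python
def trim_collated(source, trim_chars=" ", lowercase=False, left=True, right=True):
    """Trim characters in `trim_chars` from the ends of `source`.

    Hypothetical helper for illustration only (not a Spark API).
    lowercase=False models binary collation: characters must match exactly.
    lowercase=True models lowercase collation under the assumption that
    characters match when their lowercased forms are equal.
    """
    # Map each character to the key used for comparison.
    key = str.lower if lowercase else (lambda ch: ch)
    targets = {key(ch) for ch in trim_chars}

    start, end = 0, len(source)
    if left:
        # Advance past leading characters that match a trim character.
        while start < end and key(source[start]) in targets:
            start += 1
    if right:
        # Step back over trailing characters that match a trim character.
        while end > start and key(source[end - 1]) in targets:
            end -= 1
    return source[start:end]


if __name__ == "__main__":
    print(trim_collated("xXabcXx", "x"))                   # binary comparison
    print(trim_collated("xXabcXx", "x", lowercase=True))   # lowercase comparison
```

Comparing per-character lowercase keys is a simplification; a collation-aware implementation would likely need to handle characters whose lowercase form spans several code points, which is one reason the ticket points to ICU.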






[jira] [Resolved] (SPARK-47421) URL expressions (all collations)

2024-05-09 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-47421.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46460
[https://github.com/apache/spark/pull/46460]

> URL expressions (all collations)
> 
>
> Key: SPARK-47421
> URL: https://issues.apache.org/jira/browse/SPARK-47421
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-47421) URL expressions (all collations)

2024-05-09 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-47421:
---

Assignee: Uroš Bojanić

> URL expressions (all collations)
> 
>
> Key: SPARK-47421
> URL: https://issues.apache.org/jira/browse/SPARK-47421
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-47354) Variant expressions (all collations)

2024-05-09 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-47354.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46424
[https://github.com/apache/spark/pull/46424]

> Variant expressions (all collations)
> 
>
> Key: SPARK-47354
> URL: https://issues.apache.org/jira/browse/SPARK-47354
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-47354) Variant expressions (all collations)

2024-05-09 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-47354:
---

Assignee: Uroš Bojanić

> Variant expressions (all collations)
> 
>
> Key: SPARK-47354
> URL: https://issues.apache.org/jira/browse/SPARK-47354
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-48186) Add support for AbstractMapType

2024-05-09 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48186.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46458
[https://github.com/apache/spark/pull/46458]

> Add support for AbstractMapType
> ---
>
> Key: SPARK-48186
> URL: https://issues.apache.org/jira/browse/SPARK-48186
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48186) Add support for AbstractMapType

2024-05-09 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-48186:
---

Assignee: Uroš Bojanić

> Add support for AbstractMapType
> ---
>
> Key: SPARK-48186
> URL: https://issues.apache.org/jira/browse/SPARK-48186
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Assignee: Uroš Bojanić
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-48197) avoid assert error for invalid lambda function

2024-05-08 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-48197.
-
Fix Version/s: 3.5.2
   4.0.0
   Resolution: Fixed

Issue resolved by pull request 46475
[https://github.com/apache/spark/pull/46475]

> avoid assert error for invalid lambda function
> --
>
> Key: SPARK-48197
> URL: https://issues.apache.org/jira/browse/SPARK-48197
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.2, 4.0.0
>
>







[jira] [Created] (SPARK-48204) fix release script for Spark 4.0+

2024-05-08 Thread Wenchen Fan (Jira)
Wenchen Fan created SPARK-48204:
---

 Summary: fix release script for Spark 4.0+
 Key: SPARK-48204
 URL: https://issues.apache.org/jira/browse/SPARK-48204
 Project: Spark
  Issue Type: Bug
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Wenchen Fan








