[jira] [Updated] (SPARK-48280) Improve collation testing surface area using expression walking

2024-06-10 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-48280:
--
Summary: Improve collation testing surface area using expression walking  
(was: Add Expression Walker for Testing)

> Improve collation testing surface area using expression walking
> ---
>
> Key: SPARK-48280
> URL: https://issues.apache.org/jira/browse/SPARK-48280
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-48574) Add support for StructTypes with collations

2024-06-08 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-48574:
--
Description: While adding the expression walker, we noticed that StructType 
support is broken. One problem is that `CollationTypeCasts` applies a cast in 
all BinaryExpressions, which includes ExtractValue. Consequently, we are unable 
to extract a value if a cast is applied there, as ExtractValue only supports 
nonNullLiterals as extraction keys.

> Add support for StructTypes with collations
> ---
>
> Key: SPARK-48574
> URL: https://issues.apache.org/jira/browse/SPARK-48574
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>
> While adding the expression walker, we noticed that StructType support is 
> broken. One problem is that `CollationTypeCasts` applies a cast in all 
> BinaryExpressions, which includes ExtractValue. Consequently, we are unable to 
> extract a value if a cast is applied there, as ExtractValue only supports 
> nonNullLiterals as extraction keys.






[jira] [Created] (SPARK-48574) Add support for StructTypes with collations

2024-06-08 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-48574:
-

 Summary: Add support for StructTypes with collations
 Key: SPARK-48574
 URL: https://issues.apache.org/jira/browse/SPARK-48574
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic









[jira] [Updated] (SPARK-48572) Fix DateSub, DateAdd, WindowTime, TimeWindow and SessionWindow expressions

2024-06-08 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-48572:
--
Description: While adding Expression Walker testing, these expressions were 
found to be faulty. They need to be fixed to work with collated strings.

> Fix DateSub, DateAdd, WindowTime, TimeWindow and SessionWindow expressions
> --
>
> Key: SPARK-48572
> URL: https://issues.apache.org/jira/browse/SPARK-48572
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>
> While adding Expression Walker testing, these expressions were found to be 
> faulty. They need to be fixed to work with collated strings.






[jira] [Updated] (SPARK-48572) Fix DateSub, DateAdd, WindowTime, TimeWindow and SessionWindow expressions

2024-06-08 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-48572:
--
Summary: Fix DateSub, DateAdd, WindowTime, TimeWindow and SessionWindow 
expressions  (was: Fix DateSub and DateAdd expression implicit casting)

> Fix DateSub, DateAdd, WindowTime, TimeWindow and SessionWindow expressions
> --
>
> Key: SPARK-48572
> URL: https://issues.apache.org/jira/browse/SPARK-48572
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>







[jira] [Created] (SPARK-48572) Fix DateSub and DateAdd expression implicit casting

2024-06-08 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-48572:
-

 Summary: Fix DateSub and DateAdd expression implicit casting
 Key: SPARK-48572
 URL: https://issues.apache.org/jira/browse/SPARK-48572
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic









[jira] [Updated] (SPARK-48472) Enable reflect expressions with collated strings

2024-06-07 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-48472:
--
Description: As part of the move to a collated world, we need to make sure all 
expressions support the appropriate collations. Expression walker testing 
found `CallMethodViaReflection` to be erroneous. This expression is used as a 
replacement for all reflection methods and needs to be improved. This ticket 
should update the methods of this expression. Relevant code can be found in 
these files: `CallMethodViaReflection.scala`, `TypeCoercion.scala`, 
`AnsiTypeCoercion.scala`, `CollationTypeCasts.scala`, and for testing 
`Collation*Suite.scala` files.

> Enable reflect expressions with collated strings
> 
>
> Key: SPARK-48472
> URL: https://issues.apache.org/jira/browse/SPARK-48472
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>
> As part of the move to a collated world, we need to make sure all expressions 
> support the appropriate collations. Expression walker testing found 
> `CallMethodViaReflection` to be erroneous. This expression is used as a 
> replacement for all reflection methods and needs to be improved. This ticket 
> should update the methods of this expression. Relevant code can be found in 
> these files: `CallMethodViaReflection.scala`, `TypeCoercion.scala`, 
> `AnsiTypeCoercion.scala`, `CollationTypeCasts.scala`, and for testing 
> `Collation*Suite.scala` files.






[jira] [Updated] (SPARK-48472) Enable reflect expressions with collated strings

2024-06-07 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-48472:
--
Summary: Enable reflect expressions with collated strings  (was: Expression 
Walker Test)

> Enable reflect expressions with collated strings
> 
>
> Key: SPARK-48472
> URL: https://issues.apache.org/jira/browse/SPARK-48472
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>







[jira] [Updated] (SPARK-48472) Expression Walker Test

2024-06-07 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-48472:
--
Parent: SPARK-46837
Issue Type: Sub-task  (was: Improvement)

> Expression Walker Test
> --
>
> Key: SPARK-48472
> URL: https://issues.apache.org/jira/browse/SPARK-48472
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>







[jira] [Updated] (SPARK-48472) Expression Walker Test

2024-06-07 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-48472:
--
Epic Link: (was: SPARK-46830)

> Expression Walker Test
> --
>
> Key: SPARK-48472
> URL: https://issues.apache.org/jira/browse/SPARK-48472
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>







[jira] [Created] (SPARK-48472) Expression Walker Test

2024-05-30 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-48472:
-

 Summary: Expression Walker Test
 Key: SPARK-48472
 URL: https://issues.apache.org/jira/browse/SPARK-48472
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic









[jira] [Created] (SPARK-48280) Add Expression Walker for Testing

2024-05-15 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-48280:
-

 Summary: Add Expression Walker for Testing
 Key: SPARK-48280
 URL: https://issues.apache.org/jira/browse/SPARK-48280
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic









[jira] [Updated] (SPARK-48263) Collate function support for non UTF8_BINARY strings

2024-05-14 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-48263:
--
Summary: Collate function support for non UTF8_BINARY strings  (was: 
Collate expression not working when default collation set)

> Collate function support for non UTF8_BINARY strings
> 
>
> Key: SPARK-48263
> URL: https://issues.apache.org/jira/browse/SPARK-48263
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Nebojsa Savic
>Priority: Major
>  Labels: pull-request-available
>
> When the default collation config is set to a collation other than 
> UTF8_BINARY (e.g. UTF8_BINARY_LCASE) and we try to execute a COLLATE (or 
> collation) expression, it fails because the expression only accepts 
> StringType(0) as the argument for the collation name.






[jira] [Created] (SPARK-48262) Substitute BinaryExpression for explicit Expressions in CollationTypeCast

2024-05-13 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-48262:
-

 Summary: Substitute BinaryExpression for explicit Expressions in 
CollationTypeCast
 Key: SPARK-48262
 URL: https://issues.apache.org/jira/browse/SPARK-48262
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic









[jira] [Updated] (SPARK-48172) Fix escaping issues in JDBCDialects

2024-05-08 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-48172:
--
Summary: Fix escaping issues in JDBCDialects  (was: Fix escaping issue in 
JDBCDialects)

> Fix escaping issues in JDBCDialects
> ---
>
> Key: SPARK-48172
> URL: https://issues.apache.org/jira/browse/SPARK-48172
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-48172) Fix escaping issue in JDBCDialects

2024-05-08 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-48172:
--
Summary: Fix escaping issue in JDBCDialects  (was: Fix escaping issue for 
mysql)

> Fix escaping issue in JDBCDialects
> --
>
> Key: SPARK-48172
> URL: https://issues.apache.org/jira/browse/SPARK-48172
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Created] (SPARK-48172) Fix escaping issue for mysql

2024-05-07 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-48172:
-

 Summary: Fix escaping issue for mysql
 Key: SPARK-48172
 URL: https://issues.apache.org/jira/browse/SPARK-48172
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic









[jira] [Updated] (SPARK-47408) Fix mathExpressions that use StringType

2024-04-25 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47408:
--
Summary: Fix mathExpressions that use StringType  (was: TBD)

> Fix mathExpressions that use StringType
> ---
>
> Key: SPARK-47408
> URL: https://issues.apache.org/jira/browse/SPARK-47408
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Priority: Major
>







[jira] [Updated] (SPARK-47972) Restrict CAST expression for collations

2024-04-24 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47972:
--
Description: The current code allows calls like CAST(1 AS STRING COLLATE 
UNICODE). We want to restrict the CAST expression so that it can only cast to 
the default-collation string type, and allow only the COLLATE expression to 
produce explicitly collated strings.
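
A minimal sketch of the proposed restriction, written as a toy Python model rather than Spark code. All names here (`StringType`, `TypedString`, `cast_to_string`, `collate`) are hypothetical, and `UTF8_BINARY` is only assumed to be the default collation; in Spark the default may be configurable.

```python
from dataclasses import dataclass

DEFAULT_COLLATION = "UTF8_BINARY"  # assumption: the default collation name


@dataclass(frozen=True)
class StringType:
    """Toy string type carrying a collation name (hypothetical model)."""
    collation: str = DEFAULT_COLLATION


@dataclass(frozen=True)
class TypedString:
    """A string value paired with its string type."""
    value: str
    dtype: StringType


def cast_to_string(value, target=StringType()):
    """CAST may only target the default-collation string type."""
    if target.collation != DEFAULT_COLLATION:
        raise ValueError(
            f"CAST to STRING COLLATE {target.collation} is not allowed; use COLLATE"
        )
    return TypedString(str(value), target)


def collate(s, collation):
    """COLLATE is the only operation that yields an explicitly collated string."""
    return TypedString(s.value, StringType(collation))
```

Under this model, the equivalent of CAST(1 AS STRING COLLATE UNICODE) is rejected, while `collate(cast_to_string(1), "UNICODE")` is the accepted way to get the same result.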

> Restrict CAST expression for collations
> ---
>
> Key: SPARK-47972
> URL: https://issues.apache.org/jira/browse/SPARK-47972
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>
> The current code allows calls like CAST(1 AS STRING COLLATE UNICODE). We 
> want to restrict the CAST expression so that it can only cast to the 
> default-collation string type, and allow only the COLLATE expression to 
> produce explicitly collated strings.






[jira] [Updated] (SPARK-47692) Fix default StringType meaning in implicit casting

2024-04-21 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47692:
--
Summary: Fix default StringType meaning in implicit casting  (was: Addition 
of priority flag to StringType)

> Fix default StringType meaning in implicit casting
> --
>
> Key: SPARK-47692
> URL: https://issues.apache.org/jira/browse/SPARK-47692
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-47356) Add support for ConcatWs & Elt (all collations)

2024-04-15 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47356:
--
Summary: Add support for ConcatWs & Elt (all collations)  (was: ConcatWs & 
Elt (all collations))

> Add support for ConcatWs & Elt (all collations)
> ---
>
> Key: SPARK-47356
> URL: https://issues.apache.org/jira/browse/SPARK-47356
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Priority: Major
>







[jira] [Commented] (SPARK-47356) ConcatWs & Elt (all collations)

2024-04-15 Thread Mihailo Milosevic (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-47356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837145#comment-17837145
 ] 

Mihailo Milosevic commented on SPARK-47356:
---

Working on this.

> ConcatWs & Elt (all collations)
> ---
>
> Key: SPARK-47356
> URL: https://issues.apache.org/jira/browse/SPARK-47356
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Priority: Major
>







[jira] [Updated] (SPARK-47357) Add support for Upper, Lower, InitCap (all collations)

2024-04-11 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47357:
--
Summary: Add support for Upper, Lower, InitCap (all collations)  (was: 
Upper, Lower, InitCap (all collations))

> Add support for Upper, Lower, InitCap (all collations)
> --
>
> Key: SPARK-47357
> URL: https://issues.apache.org/jira/browse/SPARK-47357
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Priority: Major
>







[jira] [Updated] (SPARK-47736) Add support for AbstractArrayType

2024-04-09 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47736:
--
Summary: Add support for AbstractArrayType  (was: Add support for 
AbstractArrayType(StringTypeCollated))

> Add support for AbstractArrayType
> -
>
> Key: SPARK-47736
> URL: https://issues.apache.org/jira/browse/SPARK-47736
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>







[jira] [Created] (SPARK-47765) Add SET COLLATION to parser rules

2024-04-08 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-47765:
-

 Summary: Add SET COLLATION to parser rules
 Key: SPARK-47765
 URL: https://issues.apache.org/jira/browse/SPARK-47765
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic









[jira] [Updated] (SPARK-47736) Add support for AbstractArrayType(StringTypeCollated)

2024-04-05 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47736:
--
Summary: Add support for AbstractArrayType(StringTypeCollated)  (was: Add 
support for ArrayType(StringTypeAnyCollation))

> Add support for AbstractArrayType(StringTypeCollated)
> -
>
> Key: SPARK-47736
> URL: https://issues.apache.org/jira/browse/SPARK-47736
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>







[jira] [Created] (SPARK-47736) Add support for ArrayType(StringTypeAnyCollation)

2024-04-05 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-47736:
-

 Summary: Add support for ArrayType(StringTypeAnyCollation)
 Key: SPARK-47736
 URL: https://issues.apache.org/jira/browse/SPARK-47736
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic









[jira] [Updated] (SPARK-47692) Addition of priority flag to StringType

2024-04-02 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47692:
--
Summary: Addition of priority flag to StringType  (was: Addition of default 
priority flag)

> Addition of priority flag to StringType
> ---
>
> Key: SPARK-47692
> URL: https://issues.apache.org/jira/browse/SPARK-47692
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>







[jira] [Created] (SPARK-47692) Addition of default priority flag

2024-04-02 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-47692:
-

 Summary: Addition of default priority flag
 Key: SPARK-47692
 URL: https://issues.apache.org/jira/browse/SPARK-47692
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic









[jira] [Updated] (SPARK-47626) Addition for Map Implicit Casting of Collated Strings

2024-03-28 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47626:
--
Description: The initial ticket for collation implicit casting, SPARK-47210, 
introduced support for casting arrays and plain string types. This ticket 
needs to address casting of MapType.  (was: Initial 
PR for addition of collation implicit casting [SPARK-47210] introduced support 
for casting of arrays and normal string types.)

> Addition for Map Implicit Casting of Collated Strings
> -
>
> Key: SPARK-47626
> URL: https://issues.apache.org/jira/browse/SPARK-47626
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>
> The initial ticket for collation implicit casting, SPARK-47210, introduced 
> support for casting arrays and plain string types. This ticket needs to 
> address casting of MapType.






[jira] [Updated] (SPARK-47626) Addition for Map Implicit Casting of Collated Strings

2024-03-28 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47626:
--
Description: Initial PR for addition of collation implicit casting 
[SPARK-47210] introduced support for casting of arrays and normal string types.

> Addition for Map Implicit Casting of Collated Strings
> -
>
> Key: SPARK-47626
> URL: https://issues.apache.org/jira/browse/SPARK-47626
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>
> Initial PR for addition of collation implicit casting [SPARK-47210] 
> introduced support for casting of arrays and normal string types.






[jira] [Updated] (SPARK-47625) Addition of Indeterminate Collation Support

2024-03-28 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47625:
--
Description: 
{{INDETERMINATE_COLLATION}} should only be thrown on comparison operations and 
memory storing of data, and we should be able to combine different implicit 
collations for certain operations like concat and possibly others in the future.
This is why we have to add another predefined collation id named 
{{INDETERMINATE_COLLATION_ID}}, which means that the result is a combination of 
conflicting non-default implicit collations. Right now it would have an id of 
-1, so it fails if it ever reaches the {{CollatorFactory}}.
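
The sentinel idea described above can be sketched roughly as follows. This is a toy Python model, not Spark's implementation: `concat_result_id`, `fetch_collation`, `KNOWN_COLLATIONS`, and `IndeterminateCollationError` are hypothetical names, and the collation ids other than -1 (including 0 as the default) are assumptions made for illustration.

```python
INDETERMINATE_COLLATION_ID = -1  # sentinel value named in the ticket
DEFAULT_COLLATION_ID = 0         # assumption: 0 denotes the default collation

# Hypothetical registry standing in for CollatorFactory; the ids are illustrative.
KNOWN_COLLATIONS = {0: "UTF8_BINARY", 1: "UTF8_BINARY_LCASE", 2: "UNICODE"}


class IndeterminateCollationError(Exception):
    """Stand-in for the INDETERMINATE_COLLATION error."""


def concat_result_id(left_id, right_id):
    """Collation id of concat(left, right) when both collations are implicit."""
    if left_id == right_id:
        return left_id
    if left_id == DEFAULT_COLLATION_ID:
        return right_id
    if right_id == DEFAULT_COLLATION_ID:
        return left_id
    # Conflicting non-default implicit collations: mark the result and
    # defer the error until something actually needs a concrete collation.
    return INDETERMINATE_COLLATION_ID


def fetch_collation(collation_id):
    """Lookup used by comparisons and storage; the sentinel never resolves."""
    if collation_id == INDETERMINATE_COLLATION_ID:
        raise IndeterminateCollationError("INDETERMINATE_COLLATION")
    return KNOWN_COLLATIONS[collation_id]
```

In this sketch, concat of two differently collated columns succeeds and yields id -1; the error surfaces only when a comparison tries to resolve that id through `fetch_collation`.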

> Addition of Indeterminate Collation Support
> ---
>
> Key: SPARK-47625
> URL: https://issues.apache.org/jira/browse/SPARK-47625
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>
> {{INDETERMINATE_COLLATION}} should only be thrown on comparison operations 
> and memory storing of data, and we should be able to combine different 
> implicit collations for certain operations like concat and possibly others in 
> the future.
> This is why we have to add another predefined collation id named 
> {{INDETERMINATE_COLLATION_ID}}, which means that the result is a combination 
> of conflicting non-default implicit collations. Right now it would have an id 
> of -1, so it fails if it ever reaches the {{CollatorFactory}}.






[jira] [Created] (SPARK-47626) Addition for Map Implicit Casting of Collated Strings

2024-03-28 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-47626:
-

 Summary: Addition for Map Implicit Casting of Collated Strings
 Key: SPARK-47626
 URL: https://issues.apache.org/jira/browse/SPARK-47626
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic









[jira] [Created] (SPARK-47625) Addition of Indeterminate Collation Support

2024-03-28 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-47625:
-

 Summary: Addition of Indeterminate Collation Support
 Key: SPARK-47625
 URL: https://issues.apache.org/jira/browse/SPARK-47625
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic









[jira] [Updated] (SPARK-47210) Addition of implicit casting without indeterminate support

2024-03-28 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47210:
--
Description: 
*What changes were proposed in this pull request?*
This PR adds automatic casting and collations resolution as per `PGSQL` 
behaviour:

1. Collations set on the metadata level are implicit
2. Collations set using the `COLLATE` expression are explicit
3. When there is a combination of expressions of multiple collations the output 
will be:
 - if there are explicit collations and all of them are equal then that 
collation will be the output
 - if there are multiple different explicit collations 
`COLLATION_MISMATCH.EXPLICIT` will be thrown
 - if there are no explicit collations and only a single non-default 
collation, that one will be used
 - if there are no explicit collations and multiple non-default implicit ones 
`COLLATION_MISMATCH.IMPLICIT` will be thrown
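
The resolution rules above can be sketched roughly as follows. This is an illustrative Python model, not the code from the pull request: `Operand`, `resolve_collation`, and `CollationMismatch` are hypothetical names, `UTF8_BINARY` is assumed to be the default collation, and falling back to the default when every operand already uses it is an assumption the ticket does not spell out.

```python
from dataclasses import dataclass

DEFAULT_COLLATION = "UTF8_BINARY"  # assumption: the default collation name


class CollationMismatch(Exception):
    """Hypothetical stand-in for the COLLATION_MISMATCH error classes."""

    def __init__(self, kind, collations):
        super().__init__(f"COLLATION_MISMATCH.{kind}: {sorted(collations)}")
        self.kind = kind


@dataclass(frozen=True)
class Operand:
    collation: str
    explicit: bool = False  # True when set through a COLLATE expression


def resolve_collation(operands):
    """Pick the output collation for an expression over the given operands."""
    operands = list(operands)

    # Explicit collations take precedence over implicit ones.
    explicit = {o.collation for o in operands if o.explicit}
    if len(explicit) == 1:
        return next(iter(explicit))
    if len(explicit) > 1:
        raise CollationMismatch("EXPLICIT", explicit)

    # No explicit collations: look at the non-default implicit ones.
    implicit = {o.collation for o in operands if o.collation != DEFAULT_COLLATION}
    if len(implicit) == 1:
        return next(iter(implicit))
    if len(implicit) > 1:
        raise CollationMismatch("IMPLICIT", implicit)

    # Assumed fall-through: every operand uses the default collation.
    return DEFAULT_COLLATION
```

For example, an explicit `UNICODE` operand combined with an implicit `UTF8_BINARY_LCASE` column resolves to `UNICODE`, while two implicit columns with different non-default collations raise the implicit mismatch.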

*Why are the changes needed?*
We need to be able to compare columns and values with different collations and 
set a way of explicitly changing the collation we want to use.

  was:
*What changes were proposed in this pull request?*
This PR adds automatic casting and collations resolution as per `PGSQL` 
behaviour:

1. Collations set on the metadata level are implicit
2. Collations set using the `COLLATE` expression are explicit
3. When there is a combination of expressions of multiple collations the output 
will be:
- if there are explicit collations and all of them are equal then that 
collation will be the output
- if there are multiple different explicit collations 
`COLLATION_MISMATCH.EXPLICIT` will be thrown
- if there are no explicit collations and only a single type of non-default 
collation, that one will be used
- if there are no explicit collations and multiple non-default implicit ones 
`COLLATION_MISMATCH.IMPLICIT` will be thrown


Another thing is that `INDETERMINATE_COLLATION` should only be thrown on 
comparison operations, and we should be able to combine different implicit 
collations for certain operations like concat and possibly others in the future.
This is why I had to add another predefined collation id named 
`INDETERMINATE_COLLATION_ID` which means that the result is a combination of 
conflicting non-default implicit collations. Right now it has an id of -1 so it 
fails if it ever goes to the `CollatorFactory`.


*Why are the changes needed?*
We need to be able to compare columns and values with different collations and 
set a way of explicitly changing the collation we want to use.


> Addition of implicit casting without indeterminate support
> --
>
> Key: SPARK-47210
> URL: https://issues.apache.org/jira/browse/SPARK-47210
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>  Labels: pull-request-available
>
> *What changes were proposed in this pull request?*
> This PR adds automatic casting and collation resolution, following `PGSQL` 
> behaviour:
> 1. Collations set at the metadata level are implicit.
> 2. Collations set using the `COLLATE` expression are explicit.
> 3. When expressions with multiple collations are combined, the output is 
> determined as follows:
>  - if there are explicit collations and all of them are equal, that 
> collation is the output
>  - if there are multiple different explicit collations, 
> `COLLATION_MISMATCH.EXPLICIT` is thrown
>  - if there are no explicit collations and only a single non-default 
> collation, that one is used
>  - if there are no explicit collations and multiple different non-default 
> implicit collations, `COLLATION_MISMATCH.IMPLICIT` is thrown
> *Why are the changes needed?*
> We need to be able to compare columns and values that have different 
> collations, and to provide a way of explicitly choosing the collation to use.






[jira] [Updated] (SPARK-47210) Addition of implicit casting without indeterminate support

2024-03-28 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47210:
--
Summary: Addition of implicit casting without indeterminate support  (was: 
Implicit casting on collated expressions)

> Addition of implicit casting without indeterminate support
> --
>
> Key: SPARK-47210
> URL: https://issues.apache.org/jira/browse/SPARK-47210
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>  Labels: pull-request-available
>
> *What changes were proposed in this pull request?*
> This PR adds automatic casting and collation resolution, following `PGSQL` 
> behaviour:
> 1. Collations set at the metadata level are implicit.
> 2. Collations set using the `COLLATE` expression are explicit.
> 3. When expressions with multiple collations are combined, the output is 
> determined as follows:
> - if there are explicit collations and all of them are equal, that 
> collation is the output
> - if there are multiple different explicit collations, 
> `COLLATION_MISMATCH.EXPLICIT` is thrown
> - if there are no explicit collations and only a single non-default 
> collation, that one is used
> - if there are no explicit collations and multiple different non-default 
> implicit collations, `COLLATION_MISMATCH.IMPLICIT` is thrown
> Additionally, `INDETERMINATE_COLLATION` should only be thrown on 
> comparison operations; we should still be able to combine different implicit 
> collations for certain operations such as concat, and possibly others in the 
> future.
> This is why I added another predefined collation id named 
> `INDETERMINATE_COLLATION_ID`, which means that the result is a combination of 
> conflicting non-default implicit collations. Right now it has an id of -1, so 
> it fails if it ever reaches the `CollatorFactory`.
> *Why are the changes needed?*
> We need to be able to compare columns and values that have different 
> collations, and to provide a way of explicitly choosing the collation to use.






[jira] [Updated] (SPARK-47210) Implicit casting on collated expressions

2024-03-28 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47210:
--
Parent: SPARK-47624
Issue Type: Sub-task  (was: Improvement)

> Implicit casting on collated expressions
> 
>
> Key: SPARK-47210
> URL: https://issues.apache.org/jira/browse/SPARK-47210
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>  Labels: pull-request-available
>
> *What changes were proposed in this pull request?*
> This PR adds automatic casting and collation resolution, following `PGSQL` 
> behaviour:
> 1. Collations set at the metadata level are implicit.
> 2. Collations set using the `COLLATE` expression are explicit.
> 3. When expressions with multiple collations are combined, the output is 
> determined as follows:
> - if there are explicit collations and all of them are equal, that 
> collation is the output
> - if there are multiple different explicit collations, 
> `COLLATION_MISMATCH.EXPLICIT` is thrown
> - if there are no explicit collations and only a single non-default 
> collation, that one is used
> - if there are no explicit collations and multiple different non-default 
> implicit collations, `COLLATION_MISMATCH.IMPLICIT` is thrown
> Additionally, `INDETERMINATE_COLLATION` should only be thrown on 
> comparison operations; we should still be able to combine different implicit 
> collations for certain operations such as concat, and possibly others in the 
> future.
> This is why I added another predefined collation id named 
> `INDETERMINATE_COLLATION_ID`, which means that the result is a combination of 
> conflicting non-default implicit collations. Right now it has an id of -1, so 
> it fails if it ever reaches the `CollatorFactory`.
> *Why are the changes needed?*
> We need to be able to compare columns and values that have different 
> collations, and to provide a way of explicitly choosing the collation to use.






[jira] [Updated] (SPARK-47210) Implicit casting on collated expressions

2024-03-28 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47210:
--
Epic Link: (was: SPARK-46830)

> Implicit casting on collated expressions
> 
>
> Key: SPARK-47210
> URL: https://issues.apache.org/jira/browse/SPARK-47210
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>  Labels: pull-request-available
>
> *What changes were proposed in this pull request?*
> This PR adds automatic casting and collation resolution, following `PGSQL` 
> behaviour:
> 1. Collations set at the metadata level are implicit.
> 2. Collations set using the `COLLATE` expression are explicit.
> 3. When expressions with multiple collations are combined, the output is 
> determined as follows:
> - if there are explicit collations and all of them are equal, that 
> collation is the output
> - if there are multiple different explicit collations, 
> `COLLATION_MISMATCH.EXPLICIT` is thrown
> - if there are no explicit collations and only a single non-default 
> collation, that one is used
> - if there are no explicit collations and multiple different non-default 
> implicit collations, `COLLATION_MISMATCH.IMPLICIT` is thrown
> Additionally, `INDETERMINATE_COLLATION` should only be thrown on 
> comparison operations; we should still be able to combine different implicit 
> collations for certain operations such as concat, and possibly others in the 
> future.
> This is why I added another predefined collation id named 
> `INDETERMINATE_COLLATION_ID`, which means that the result is a combination of 
> conflicting non-default implicit collations. Right now it has an id of -1, so 
> it fails if it ever reaches the `CollatorFactory`.
> *Why are the changes needed?*
> We need to be able to compare columns and values that have different 
> collations, and to provide a way of explicitly choosing the collation to use.






[jira] [Created] (SPARK-47624) Collation Implicit Casting Support

2024-03-28 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-47624:
-

 Summary: Collation Implicit Casting Support
 Key: SPARK-47624
 URL: https://issues.apache.org/jira/browse/SPARK-47624
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic









[jira] [Updated] (SPARK-47477) SubstringIndex, StringLocate (all collations)

2024-03-26 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47477:
--
Parent: (was: SPARK-46837)
Issue Type: New Feature  (was: Sub-task)

> SubstringIndex, StringLocate (all collations)
> -
>
> Key: SPARK-47477
> URL: https://issues.apache.org/jira/browse/SPARK-47477
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Priority: Major
>
> Enable collation support for the *StringInstr* and *FindInSet* built-in 
> string functions in Spark. First confirm what is the expected behaviour for 
> these functions when given collated strings, and then move on to 
> implementation and testing. One way to go about this is to consider using 
> {_}StringSearch{_}, an efficient ICU service for string matching. Implement 
> the corresponding unit tests (CollationStringExpressionsSuite) and E2E tests 
> (CollationSuite) to reflect how this function should be used with collation 
> in SparkSQL, and feel free to use your chosen Spark SQL Editor to experiment 
> with the existing functions to learn more about how they work. In addition, 
> look into the possible use-cases and implementation of similar functions 
> within other open-source DBMS, such as 
> [PostgreSQL|https://www.postgresql.org/docs/].
>  
> The goal for this Jira ticket is to implement the *StringInstr* and 
> *FindInSet* functions so that they support all collation types currently 
> supported in Spark. To understand what changes were introduced in order to 
> enable full collation support for other existing functions in Spark, take a 
> look at the Spark PRs and Jira tickets for completed tasks in this parent 
> (for example: Contains, StartsWith, EndsWith).
>  
> Read more about ICU [Collation Concepts|http://example.com/] and 
> [Collator|http://example.com/] class, as well as _StringSearch_ using the 
> [ICU user 
> guide|https://unicode-org.github.io/icu/userguide/collation/string-search.html]
>  and [ICU 
> docs|https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/text/StringSearch.html].
>  Also, refer to the Unicode Technical Standard for string 
> [searching|https://www.unicode.org/reports/tr10/#Searching] and 
> [collation|https://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Type_Fallback].
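
As a starting point for confirming the expected behaviour, the following Python sketch models how the two functions might behave under a case-insensitive collation. It is only a behavioural model under stated assumptions: `str.casefold` stands in for a real collator, the function and parameter names are illustrative, and it does not use ICU `StringSearch`.

```python
def find_in_set(word: str, str_array: str, key=str.casefold) -> int:
    """1-based position of `word` in a comma-separated list, or 0 if absent."""
    if "," in word:
        return 0  # a search word containing a comma can never match an item
    target = key(word)
    for position, item in enumerate(str_array.split(","), start=1):
        if key(item) == target:
            return position
    return 0


def instr(string: str, substring: str, key=str.casefold) -> int:
    """1-based position of the first match of `substring`, or 0 if absent.

    Caveat: folding can change the length of some strings, so the position
    refers to the folded string; a real collator-based search avoids this.
    """
    return key(string).find(key(substring)) + 1
```

Passing `key=str` gives binary (exact) matching, which makes it easy to compare the two behaviours side by side when writing test expectations.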






[jira] [Updated] (SPARK-47477) SubstringIndex, StringLocate (all collations)

2024-03-26 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47477:
--
Epic Link: SPARK-46830

> SubstringIndex, StringLocate (all collations)
> -
>
> Key: SPARK-47477
> URL: https://issues.apache.org/jira/browse/SPARK-47477
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Priority: Major
>
> Enable collation support for the *StringInstr* and *FindInSet* built-in 
> string functions in Spark. First confirm what is the expected behaviour for 
> these functions when given collated strings, and then move on to 
> implementation and testing. One way to go about this is to consider using 
> {_}StringSearch{_}, an efficient ICU service for string matching. Implement 
> the corresponding unit tests (CollationStringExpressionsSuite) and E2E tests 
> (CollationSuite) to reflect how this function should be used with collation 
> in SparkSQL, and feel free to use your chosen Spark SQL Editor to experiment 
> with the existing functions to learn more about how they work. In addition, 
> look into the possible use-cases and implementation of similar functions 
> within other open-source DBMS, such as 
> [PostgreSQL|https://www.postgresql.org/docs/].
>  
> The goal for this Jira ticket is to implement the *StringInstr* and 
> *FindInSet* functions so that they support all collation types currently 
> supported in Spark. To understand what changes were introduced in order to 
> enable full collation support for other existing functions in Spark, take a 
> look at the Spark PRs and Jira tickets for completed tasks in this parent 
> (for example: Contains, StartsWith, EndsWith).
>  
> Read more about ICU [Collation Concepts|http://example.com/] and 
> [Collator|http://example.com/] class, as well as _StringSearch_ using the 
> [ICU user 
> guide|https://unicode-org.github.io/icu/userguide/collation/string-search.html]
>  and [ICU 
> docs|https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/text/StringSearch.html].
>  Also, refer to the Unicode Technical Standard for string 
> [searching|https://www.unicode.org/reports/tr10/#Searching] and 
> [collation|https://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Type_Fallback].






[jira] [Updated] (SPARK-47477) SubstringIndex, StringLocate (all collations)

2024-03-26 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47477:
--
Labels:   (was: pull-request-available)

> SubstringIndex, StringLocate (all collations)
> -
>
> Key: SPARK-47477
> URL: https://issues.apache.org/jira/browse/SPARK-47477
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Uroš Bojanić
>Priority: Major
>
> Enable collation support for the *StringInstr* and *FindInSet* built-in 
> string functions in Spark. First confirm what is the expected behaviour for 
> these functions when given collated strings, and then move on to 
> implementation and testing. One way to go about this is to consider using 
> {_}StringSearch{_}, an efficient ICU service for string matching. Implement 
> the corresponding unit tests (CollationStringExpressionsSuite) and E2E tests 
> (CollationSuite) to reflect how this function should be used with collation 
> in SparkSQL, and feel free to use your chosen Spark SQL Editor to experiment 
> with the existing functions to learn more about how they work. In addition, 
> look into the possible use-cases and implementation of similar functions 
> within other open-source DBMS, such as 
> [PostgreSQL|https://www.postgresql.org/docs/].
>  
> The goal for this Jira ticket is to implement the *StringInstr* and 
> *FindInSet* functions so that they support all collation types currently 
> supported in Spark. To understand what changes were introduced in order to 
> enable full collation support for other existing functions in Spark, take a 
> look at the Spark PRs and Jira tickets for completed tasks in this parent 
> (for example: Contains, StartsWith, EndsWith).
>  
> Read more about ICU [Collation Concepts|http://example.com/] and 
> [Collator|http://example.com/] class, as well as _StringSearch_ using the 
> [ICU user 
> guide|https://unicode-org.github.io/icu/userguide/collation/string-search.html]
>  and [ICU 
> docs|https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/text/StringSearch.html].
>  Also, refer to the Unicode Technical Standard for string 
> [searching|https://www.unicode.org/reports/tr10/#Searching] and 
> [collation|https://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Type_Fallback].






[jira] [Created] (SPARK-47504) Resolve AbstractDataType simpleStrings for StringTypeCollated

2024-03-21 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-47504:
-

 Summary: Resolve AbstractDataType simpleStrings for 
StringTypeCollated
 Key: SPARK-47504
 URL: https://issues.apache.org/jira/browse/SPARK-47504
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic


*SPARK-47296* introduced a change that makes all unsupported functions fail. As 
part of that change, the expected *inputTypes* in *ExpectsInputTypes* had to be 
changed. This in turn altered user-facing error messages, which now print 
*"STRING_ANY_COLLATION"* in places where we previously printed *"STRING"*. 
Concretely, if we get an input of Int where *StringTypeAnyCollation* was 
expected, users will see this faulty message.






[jira] [Created] (SPARK-47431) Add session level default Collation

2024-03-17 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-47431:
-

 Summary: Add session level default Collation
 Key: SPARK-47431
 URL: https://issues.apache.org/jira/browse/SPARK-47431
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic









[jira] [Updated] (SPARK-47210) Implicit casting on collated expressions

2024-03-11 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47210:
--
Description: 
*What changes were proposed in this pull request?*
This PR adds automatic casting and collation resolution, following `PGSQL` 
behaviour:

1. Collations set at the metadata level are implicit.
2. Collations set using the `COLLATE` expression are explicit.
3. When expressions with multiple collations are combined, the output is 
determined as follows:
- if there are explicit collations and all of them are equal, that 
collation is the output
- if there are multiple different explicit collations, 
`COLLATION_MISMATCH.EXPLICIT` is thrown
- if there are no explicit collations and only a single non-default 
collation, that one is used
- if there are no explicit collations and multiple different non-default 
implicit collations, `COLLATION_MISMATCH.IMPLICIT` is thrown


Additionally, `INDETERMINATE_COLLATION` should only be thrown on 
comparison operations; we should still be able to combine different implicit 
collations for certain operations such as concat, and possibly others in the 
future.
This is why I added another predefined collation id named 
`INDETERMINATE_COLLATION_ID`, which means that the result is a combination of 
conflicting non-default implicit collations. Right now it has an id of -1, so it 
fails if it ever reaches the `CollatorFactory`.
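
The role of the sentinel id can be illustrated with a short Python sketch. Only the value -1 comes from the description above; the id table, the function names, and the error types are hypothetical and do not reflect Spark's actual code.

```python
INDETERMINATE_COLLATION_ID = -1  # value stated in the description
DEFAULT_COLLATION_ID = 0         # assumption: id of the default collation

# Hypothetical registry: the position in the list is the collation id.
COLLATION_NAMES = ["UTF8_BINARY", "UNICODE", "UNICODE_CI"]


def combine_implicit(ids):
    """Combine implicit collation ids; conflicting non-default ids give the sentinel."""
    non_default = {i for i in ids if i != DEFAULT_COLLATION_ID}
    if len(non_default) > 1:
        return INDETERMINATE_COLLATION_ID
    return non_default.pop() if non_default else DEFAULT_COLLATION_ID


def concat_collation(ids):
    # Concat tolerates an indeterminate result and simply propagates it.
    return combine_implicit(ids)


def comparison_collation(ids):
    # Comparisons need one concrete collation, so the sentinel is an error here.
    result = combine_implicit(ids)
    if result == INDETERMINATE_COLLATION_ID:
        raise ValueError("INDETERMINATE_COLLATION")
    return result


def fetch_collation_name(collation_id):
    # Explicit bounds check: a negative Python index would otherwise wrap around.
    if not 0 <= collation_id < len(COLLATION_NAMES):
        raise LookupError(f"no collation with id {collation_id}")
    return COLLATION_NAMES[collation_id]
```

The point of the design is that an indeterminate result can flow through operations that do not need to compare strings, while any attempt to actually look up a collator for it fails loudly.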


*Why are the changes needed?*
We need to be able to compare columns and values that have different 
collations, and to provide a way of explicitly choosing the collation to use.

> Implicit casting on collated expressions
> 
>
> Key: SPARK-47210
> URL: https://issues.apache.org/jira/browse/SPARK-47210
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>  Labels: pull-request-available
>
> *What changes were proposed in this pull request?*
> This PR adds automatic casting and collation resolution, following `PGSQL` 
> behaviour:
> 1. Collations set at the metadata level are implicit.
> 2. Collations set using the `COLLATE` expression are explicit.
> 3. When expressions with multiple collations are combined, the output is 
> determined as follows:
> - if there are explicit collations and all of them are equal, that 
> collation is the output
> - if there are multiple different explicit collations, 
> `COLLATION_MISMATCH.EXPLICIT` is thrown
> - if there are no explicit collations and only a single non-default 
> collation, that one is used
> - if there are no explicit collations and multiple different non-default 
> implicit collations, `COLLATION_MISMATCH.IMPLICIT` is thrown
> Additionally, `INDETERMINATE_COLLATION` should only be thrown on 
> comparison operations; we should still be able to combine different implicit 
> collations for certain operations such as concat, and possibly others in the 
> future.
> This is why I added another predefined collation id named 
> `INDETERMINATE_COLLATION_ID`, which means that the result is a combination of 
> conflicting non-default implicit collations. Right now it has an id of -1, so 
> it fails if it ever reaches the `CollatorFactory`.
> *Why are the changes needed?*
> We need to be able to compare columns and values that have different 
> collations, and to provide a way of explicitly choosing the collation to use.






[jira] [Updated] (SPARK-47169) Disable bucketing on collated columns

2024-03-11 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47169:
--
Description: 
*What changes were proposed in this pull request?*
Disable bucketing on columns that use a non-default collation.

*Why are the changes needed?*
With the current implementation, bucket ids are generated from the raw string 
value, which assumes that only identical strings are equal. When collation is 
turned on this is not the case: strings that differ in their raw value can 
compare as equal and yet be assigned different bucket ids.
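
To illustrate the problem, here is a small Python sketch. It is not Spark's bucketing code: `zlib.crc32` stands in for the real hash function, and case-folded comparison stands in for a case-insensitive collation; both are assumptions made for the example.

```python
import zlib


def raw_hash(value: str) -> int:
    # Hash of the raw UTF-8 bytes; crc32 is a stand-in for the real hash.
    return zlib.crc32(value.encode("utf-8"))


def bucket_id(value: str, num_buckets: int) -> int:
    # The bucket is derived purely from the raw hash of the string.
    return raw_hash(value) % num_buckets


def equal_under_ci_collation(a: str, b: str) -> bool:
    # Simplified model of a case-insensitive collation.
    return a.casefold() == b.casefold()


a, b = "abc", "ABC"
print(equal_under_ci_collation(a, b))  # the collation treats the two as equal
print(raw_hash(a) == raw_hash(b))      # but their raw hashes are different
print(bucket_id(a, 1024), bucket_id(b, 1024))  # so the buckets need not match
```

Rows that a collation-aware join or lookup would treat as the same key can therefore end up in different buckets, which is why bucketing on such columns is disabled rather than silently producing wrong results.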

  was:
What changes were proposed in this pull request?
Disable bucketing on columns that use a non-default collation.

Why are the changes needed?
With current implementation bucketIds are generated from a string value where 
each unique string guarantees unique id, but when collation is turned on, this 
is not the case.


> Disable bucketing on collated columns
> --
>
> Key: SPARK-47169
> URL: https://issues.apache.org/jira/browse/SPARK-47169
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>  Labels: pull-request-available
>
> *What changes were proposed in this pull request?*
> Disable bucketing on columns that use a non-default collation.
> *Why are the changes needed?*
> With the current implementation, bucket ids are generated from the raw string 
> value, which assumes that only identical strings are equal. When collation is 
> turned on this is not the case: strings that differ in their raw value can 
> compare as equal and yet be assigned different bucket ids.






[jira] [Updated] (SPARK-47169) Disable bucketing on collated columns

2024-03-11 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47169:
--
Description: 
What changes were proposed in this pull request?
Disable bucketing on columns that use a non-default collation.

Why are the changes needed?
With current implementation bucketIds are generated from a string value where 
each unique string guarantees unique id, but when collation is turned on, this 
is not the case.

> Disable bucketing on collated columns
> --
>
> Key: SPARK-47169
> URL: https://issues.apache.org/jira/browse/SPARK-47169
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>  Labels: pull-request-available
>
> What changes were proposed in this pull request?
> Disable bucketing on columns that use a non-default collation.
> Why are the changes needed?
> With current implementation bucketIds are generated from a string value where 
> each unique string guarantees unique id, but when collation is turned on, 
> this is not the case.






[jira] [Updated] (SPARK-47326) Moving tests to related Suites

2024-03-11 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47326:
--
Description: 
*What changes were proposed in this pull request?*
Tests from `QueryCompilationErrorsSuite` were moved to `DDLSuite` and 
`JDBCTableCatalogSuite`.

*Why are the changes needed?*
We should move tests to related test suites in order to improve testing.

> Moving tests to related Suites
> --
>
> Key: SPARK-47326
> URL: https://issues.apache.org/jira/browse/SPARK-47326
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Mihailo Milosevic
>Priority: Major
>
> *What changes were proposed in this pull request?*
> Tests from `QueryCompilationErrorsSuite` were moved to `DDLSuite` and 
> `JDBCTableCatalogSuite`.
> *Why are the changes needed?*
> We should move tests to related test suites in order to improve testing.






[jira] [Updated] (SPARK-47015) Disable partitioning on collated columns

2024-03-11 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47015:
--
Description: (was: *What changes were proposed in this pull request?*
Tests from `QueryCompilationErrorsSuite` were moved to `DDLSuite` and 
`JDBCTableCatalogSuite`.

*Why are the changes needed?*
We should move tests to related test suites in order to improve testing.)

> Disable partitioning on collated columns
> 
>
> Key: SPARK-47015
> URL: https://issues.apache.org/jira/browse/SPARK-47015
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Stefan Kandic
>Assignee: Stefan Kandic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Updated] (SPARK-47015) Disable partitioning on collated columns

2024-03-11 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47015:
--
Description: 
*What changes were proposed in this pull request?*
Tests from `QueryCompilationErrorsSuite` were moved to `DDLSuite` and 
`JDBCTableCatalogSuite`.

*Why are the changes needed?*
We should move tests to related test suites in order to improve testing.

  was:
### What changes were proposed in this pull request?
Tests from `QueryCompilationErrorsSuite` were moved to `DDLSuite` and 
`JDBCTableCatalogSuite`.

### Why are the changes needed?
We should move tests to related test suites in order to improve testing.


> Disable partitioning on collated columns
> 
>
> Key: SPARK-47015
> URL: https://issues.apache.org/jira/browse/SPARK-47015
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Stefan Kandic
>Assignee: Stefan Kandic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> *What changes were proposed in this pull request?*
> Tests from `QueryCompilationErrorsSuite` were moved to `DDLSuite` and 
> `JDBCTableCatalogSuite`.
> *Why are the changes needed?*
> We should move tests to related test suites in order to improve testing.






[jira] [Updated] (SPARK-47015) Disable partitioning on collated columns

2024-03-11 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47015:
--
Description: 
### What changes were proposed in this pull request?
Tests from `QueryCompilationErrorsSuite` were moved to `DDLSuite` and 
`JDBCTableCatalogSuite`.

### Why are the changes needed?
We should move tests to related test suites in order to improve testing.

> Disable partitioning on collated columns
> 
>
> Key: SPARK-47015
> URL: https://issues.apache.org/jira/browse/SPARK-47015
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Stefan Kandic
>Assignee: Stefan Kandic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> ### What changes were proposed in this pull request?
> Tests from `QueryCompilationErrorsSuite` were moved to `DDLSuite` and 
> `JDBCTableCatalogSuite`.
> ### Why are the changes needed?
> We should move tests to related test suites in order to improve testing.






[jira] [Commented] (SPARK-47326) Moving tests to related Suites

2024-03-11 Thread Mihailo Milosevic (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-47326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17825203#comment-17825203
 ] 

Mihailo Milosevic commented on SPARK-47326:
---

Issue resolved by pull request 86361

[https://github.com/databricks/runtime/pull/86361|http://example.com]

> Moving tests to related Suites
> --
>
> Key: SPARK-47326
> URL: https://issues.apache.org/jira/browse/SPARK-47326
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Mihailo Milosevic
> Priority: Major
>




[jira] [Created] (SPARK-47326) Moving tests to related Suites

2024-03-08 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-47326:
-

Summary: Moving tests to related Suites
Key: SPARK-47326
URL: https://issues.apache.org/jira/browse/SPARK-47326
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic






[jira] [Created] (SPARK-47210) Implicit casting on collated expressions

2024-02-28 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-47210:
-

Summary: Implicit casting on collated expressions
Key: SPARK-47210
URL: https://issues.apache.org/jira/browse/SPARK-47210
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic






[jira] [Created] (SPARK-47169) Disable bucketint on collated collumns

2024-02-26 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-47169:
-

Summary: Disable bucketint on collated collumns
Key: SPARK-47169
URL: https://issues.apache.org/jira/browse/SPARK-47169
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic






[jira] [Updated] (SPARK-47169) Disable bucketing on collated collumns

2024-02-26 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47169:
--
Summary: Disable bucketing on collated collumns  (was: Disable bucketint on 
collated collumns)

> Disable bucketing on collated collumns
> --
>
> Key: SPARK-47169
> URL: https://issues.apache.org/jira/browse/SPARK-47169
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Mihailo Milosevic
> Priority: Major
>




[jira] [Updated] (SPARK-47102) Add COLLATION_ENABLED config flag

2024-02-23 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47102:
--
Description: 
*What changes were proposed in this pull request?*
This PR adds COLLATION_ENABLED config to `SQLConf` and introduces new error 
class `COLLATION_SUPPORT_NOT_ENABLED` to appropriately report error on usage of 
feature under development. 

*Why are the changes needed?*
We want to make collations configurable on this flag. These changes disable 
usage of `collate` and `collation` functions, along with any `COLLATE` syntax 
when the flag is set to false. By default, the flag is set to false.

  was:
*What changes were proposed in this pull request?*
This PR adds COLLATION_ENABLED config to `SQLConf` and introduces new error 
class `COLLATION_SUPPORT_DISABLED` to appropriately report error on usage of 
feature under development. 

*Why are the changes needed?*
We want to make collations configurable on this some flag. These changes 
disable usage of `collate` and `collation` functions, along with any `COLLATE` 
syntax when the flag is set to false. By default, the flag is set to false.


> Add COLLATION_ENABLED config flag
> -
>
> Key: SPARK-47102
> URL: https://issues.apache.org/jira/browse/SPARK-47102
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Mihailo Milosevic
> Priority: Major
> Labels: pull-request-available
>
> *What changes were proposed in this pull request?*
> This PR adds COLLATION_ENABLED config to `SQLConf` and introduces new error 
> class `COLLATION_SUPPORT_NOT_ENABLED` to appropriately report error on usage 
> of feature under development. 
> *Why are the changes needed?*
> We want to make collations configurable on this flag. These changes disable 
> usage of `collate` and `collation` functions, along with any `COLLATE` syntax 
> when the flag is set to false. By default, the flag is set to false.
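The gating described above can be pictured with a small sketch. This is not Spark's code: the `Conf` class, the `collate` helper, the config key `example.collation.enabled`, and the exception type are hypothetical stand-ins written in Python for illustration. Only the flag name COLLATION_ENABLED, the error class name `COLLATION_SUPPORT_NOT_ENABLED`, and the default of false are taken from the description.

```python
class CollationSupportNotEnabled(Exception):
    """Hypothetical error that exposes an error-class name."""

    error_class = "COLLATION_SUPPORT_NOT_ENABLED"


class Conf:
    """Hypothetical stand-in for a session configuration object."""

    # Hypothetical key; the real config key is not given in the issue.
    COLLATION_ENABLED = "example.collation.enabled"

    def __init__(self):
        # The description states that the flag is false by default.
        self._settings = {self.COLLATION_ENABLED: False}

    def set(self, key, value):
        self._settings[key] = value

    def get(self, key):
        return self._settings[key]


def collate(conf, value, collation_name):
    """Attach a collation name to a value, but only when the flag is on."""
    if not conf.get(Conf.COLLATION_ENABLED):
        # Feature is still under development: refuse instead of proceeding.
        raise CollationSupportNotEnabled(CollationSupportNotEnabled.error_class)
    return (value, collation_name)
```

With a fresh `Conf()`, calling `collate` raises the error; after `conf.set(Conf.COLLATION_ENABLED, True)` the same call returns the tagged value. Checking the flag at a single entry point is one possible design; the issue does not say where the real check lives.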



[jira] [Updated] (SPARK-47102) Add COLLATION_ENABLED config flag

2024-02-22 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47102:
--
Description: 
*What changes were proposed in this pull request?*
This PR adds COLLATION_ENABLED config to `SQLConf` and introduces new error 
class `COLLATION_SUPPORT_DISABLED` to appropriately report error on usage of 
feature under development. 

*Why are the changes needed?*
We want to make collations configurable on this some flag. These changes 
disable usage of `collate` and `collation` functions, along with any `COLLATE` 
syntax when the flag is set to false. By default, the flag is set to false.

  was:
### What changes were proposed in this pull request?
This PR adds COLLATION_ENABLED config to `SQLConf` and introduces new error 
class `COLLATION_SUPPORT_DISABLED` to appropriately report error on usage of 
feature under development. 

### Why are the changes needed?
We want to make collations configurable on this some flag. These changes 
disable usage of `collate` and `collation` functions, along with any `COLLATE` 
syntax when the flag is set to false. By default, the flag is set to false.


> Add COLLATION_ENABLED config flag
> -
>
> Key: SPARK-47102
> URL: https://issues.apache.org/jira/browse/SPARK-47102
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Mihailo Milosevic
> Priority: Major
> Labels: pull-request-available
>
> *What changes were proposed in this pull request?*
> This PR adds COLLATION_ENABLED config to `SQLConf` and introduces new error 
> class `COLLATION_SUPPORT_DISABLED` to appropriately report error on usage of 
> feature under development. 
> *Why are the changes needed?*
> We want to make collations configurable on this some flag. These changes 
> disable usage of `collate` and `collation` functions, along with any 
> `COLLATE` syntax when the flag is set to false. By default, the flag is set 
> to false.



[jira] [Updated] (SPARK-47102) Add COLLATION_ENABLED config flag

2024-02-22 Thread Mihailo Milosevic (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihailo Milosevic updated SPARK-47102:
--
Description: 
### What changes were proposed in this pull request?
This PR adds COLLATION_ENABLED config to `SQLConf` and introduces new error 
class `COLLATION_SUPPORT_DISABLED` to appropriately report error on usage of 
feature under development. 

### Why are the changes needed?
We want to make collations configurable on this some flag. These changes 
disable usage of `collate` and `collation` functions, along with any `COLLATE` 
syntax when the flag is set to false. By default, the flag is set to false.

> Add COLLATION_ENABLED config flag
> -
>
> Key: SPARK-47102
> URL: https://issues.apache.org/jira/browse/SPARK-47102
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Mihailo Milosevic
> Priority: Major
> Labels: pull-request-available
>
> ### What changes were proposed in this pull request?
> This PR adds COLLATION_ENABLED config to `SQLConf` and introduces new error 
> class `COLLATION_SUPPORT_DISABLED` to appropriately report error on usage of 
> feature under development. 
> ### Why are the changes needed?
> We want to make collations configurable on this some flag. These changes 
> disable usage of `collate` and `collation` functions, along with any 
> `COLLATE` syntax when the flag is set to false. By default, the flag is set 
> to false.



[jira] [Created] (SPARK-47102) Add COLLATION_ENABLED config flag

2024-02-20 Thread Mihailo Milosevic (Jira)
Mihailo Milosevic created SPARK-47102:
-

Summary: Add COLLATION_ENABLED config flag
Key: SPARK-47102
URL: https://issues.apache.org/jira/browse/SPARK-47102
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.0.0
Reporter: Mihailo Milosevic






[jira] [Commented] (SPARK-43259) Assign a name to the error class _LEGACY_ERROR_TEMP_2024

2024-02-20 Thread Mihailo Milosevic (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-43259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818730#comment-17818730
 ] 

Mihailo Milosevic commented on SPARK-43259:
---

I want to work on this issue.

I raised a PR for it: https://github.com/apache/spark/pull/45095

> Assign a name to the error class _LEGACY_ERROR_TEMP_2024
> 
>
> Key: SPARK-43259
> URL: https://issues.apache.org/jira/browse/SPARK-43259
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.5.0
> Reporter: Max Gekk
> Priority: Minor
> Labels: pull-request-available, starter
>
> Choose a proper name for the error class *_LEGACY_ERROR_TEMP_2024* defined in
> {*}core/src/main/resources/error/error-classes.json{*}. The name should be
> short but complete (see the examples in error-classes.json).
> Add a test that triggers the error from user code if no such test exists yet.
> Check the exception fields by using {*}checkError(){*}. That function
> checks only the significant error fields and does not depend on the text of
> the error message. This way, tech editors can modify the error format in
> error-classes.json without worrying about Spark's internal tests. Migrate other
> tests that might trigger the error to checkError().
> If you cannot reproduce the error from user space (using a SQL query), replace
> the error with an internal error, see {*}SparkException.internalError(){*}.
> Improve the error message format in error-classes.json if the current one is
> not clear. Propose to users how to avoid and fix this kind of error.
> Please look at the PRs below as examples:
>  * [https://github.com/apache/spark/pull/38685]
>  * [https://github.com/apache/spark/pull/38656]
>  * [https://github.com/apache/spark/pull/38490]
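
The idea of checking error fields rather than message text can be sketched as follows. This is a hypothetical Python analogue, not Spark's `checkError()`: the `ClassifiedError` type, the `check_error` helper, the `parse_positive` function, and the error class name `NON_POSITIVE_VALUE` are all invented for illustration.

```python
class ClassifiedError(Exception):
    """Hypothetical exception that carries structured fields next to its text."""

    def __init__(self, error_class, parameters, message):
        super().__init__(message)
        self.error_class = error_class
        self.parameters = parameters


def check_error(exc, error_class, parameters):
    """Compare only the structured fields; the message text is never inspected."""
    assert exc.error_class == error_class, f"unexpected class: {exc.error_class}"
    assert exc.parameters == parameters, f"unexpected parameters: {exc.parameters}"


def parse_positive(text):
    """Hypothetical user-facing function that raises a classified error."""
    value = int(text)
    if value <= 0:
        raise ClassifiedError(
            "NON_POSITIVE_VALUE",
            {"value": text},
            f"Expected a positive number but got {text}.",
        )
    return value


def test_parse_positive_rejects_negative_input():
    # Trigger the error from "user code" and check its fields only.
    try:
        parse_positive("-3")
    except ClassifiedError as exc:
        check_error(exc, "NON_POSITIVE_VALUE", {"value": "-3"})
    else:
        raise AssertionError("expected ClassifiedError")
```

Because the test never looks at the message string, rewording "Expected a positive number but got ..." would not break it, which is the property the description asks for.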


