[jira] [Updated] (SPARK-48158) XML expressions (all collations)
[ https://issues.apache.org/jira/browse/SPARK-48158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-48158:
---
    Labels: pull-request-available  (was: )

> XML expressions (all collations)
>
>                 Key: SPARK-48158
>                 URL: https://issues.apache.org/jira/browse/SPARK-48158
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Uroš Bojanić
>            Priority: Major
>              Labels: pull-request-available
>
> Enable collation support for *XML* built-in string functions in Spark
> ({*}XmlToStructs{*}, {*}SchemaOfXml{*}, {*}StructsToXml{*}). First confirm
> the expected behaviour of these functions when given collated strings, and
> then move on to implementation and testing. You will find these expressions
> in the *xmlExpressions.scala* file, and they should mostly be pass-through
> functions. Implement the corresponding E2E SQL tests
> (CollationSQLExpressionsSuite) to reflect how these functions should be used
> with collation in SparkSQL, and feel free to use your chosen Spark SQL Editor
> to experiment with the existing functions to learn more about how they work.
> In addition, look into the possible use-cases and implementation of similar
> functions within other open-source DBMS, such as
> [PostgreSQL|https://www.postgresql.org/docs/].
>
> The goal for this Jira ticket is to implement the *XML* expressions so that
> they support all collation types currently supported in Spark. To understand
> what changes were introduced in order to enable full collation support for
> other existing functions in Spark, take a look at the Spark PRs and Jira
> tickets for completed tasks in this parent (for example: Ascii, Chr, Base64,
> UnBase64, Decode, StringDecode, Encode, ToBinary, FormatNumber, Sentences).
>
> Read more about ICU [Collation Concepts|http://example.com/] and the
> [Collator|http://example.com/] class. Also, refer to the Unicode Technical
> Standard for string
> [collation|https://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Type_Fallback].

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-48158) XML expressions (all collations)
[ https://issues.apache.org/jira/browse/SPARK-48158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uroš Bojanić updated SPARK-48158:
---
    Component/s: SQL  (was: Spark Core)

> XML expressions (all collations)
>
>                 Key: SPARK-48158
>                 URL: https://issues.apache.org/jira/browse/SPARK-48158
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Uroš Bojanić
>            Priority: Major
[jira] [Updated] (SPARK-48158) XML expressions (all collations)
[ https://issues.apache.org/jira/browse/SPARK-48158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uroš Bojanić updated SPARK-48158:
---
    Description:
Enable collation support for *XML* built-in string functions in Spark
({*}XmlToStructs{*}, {*}SchemaOfXml{*}, {*}StructsToXml{*}). First confirm
the expected behaviour of these functions when given collated strings, and
then move on to implementation and testing. You will find these expressions
in the *xmlExpressions.scala* file, and they should mostly be pass-through
functions. Implement the corresponding E2E SQL tests
(CollationSQLExpressionsSuite) to reflect how these functions should be used
with collation in SparkSQL, and feel free to use your chosen Spark SQL Editor
to experiment with the existing functions to learn more about how they work.
In addition, look into the possible use-cases and implementation of similar
functions within other open-source DBMS, such as
[PostgreSQL|https://www.postgresql.org/docs/].

The goal for this Jira ticket is to implement the *XML* expressions so that
they support all collation types currently supported in Spark. To understand
what changes were introduced in order to enable full collation support for
other existing functions in Spark, take a look at the Spark PRs and Jira
tickets for completed tasks in this parent (for example: Ascii, Chr, Base64,
UnBase64, Decode, StringDecode, Encode, ToBinary, FormatNumber, Sentences).

Read more about ICU [Collation Concepts|http://example.com/] and the
[Collator|http://example.com/] class. Also, refer to the Unicode Technical
Standard for string
[collation|https://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Type_Fallback].

> XML expressions (all collations)
>
>                 Key: SPARK-48158
>                 URL: https://issues.apache.org/jira/browse/SPARK-48158
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>    Affects Versions: 4.0.0
>            Reporter: Uroš Bojanić
>            Priority: Major
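
The ticket expects the XML expressions to be "mostly pass-through" with respect to collation, while leaving the exact behaviour to be confirmed first. The toy model below, in plain Python rather than Spark code, sketches one possible reading of that phrase: parsing XML never compares strings, so the collation attached to the input is simply carried over to the extracted values and only matters when those values are compared later. CollatedString, xml_to_fields and collated_equals are hypothetical names invented for this illustration. The collation names UTF8_BINARY and UTF8_LCASE are assumed here, and the lowercase comparison is a deliberate simplification.

```python
from dataclasses import dataclass
from typing import Dict
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class CollatedString:
    """Hypothetical stand-in for a string value tagged with a collation name."""
    value: str
    collation: str = "UTF8_BINARY"  # assumed default collation name


def xml_to_fields(xml: CollatedString) -> Dict[str, CollatedString]:
    """Extract the root's child elements as collated strings.

    The parse itself performs no string comparison, so the input collation
    is passed through unchanged to every extracted field.
    """
    root = ET.fromstring(xml.value)
    return {
        child.tag: CollatedString(child.text or "", xml.collation)
        for child in root
    }


def collated_equals(a: CollatedString, b: CollatedString) -> bool:
    """Compare two values under their shared collation (simplified)."""
    if a.collation != b.collation:
        raise ValueError("cannot compare strings with different collations")
    if a.collation == "UTF8_LCASE":
        # Simplification: treat the collation as lowercase-then-compare.
        return a.value.lower() == b.value.lower()
    return a.value == b.value


if __name__ == "__main__":
    doc = CollatedString("<p><name>Spark</name><lang>Scala</lang></p>", "UTF8_LCASE")
    fields = xml_to_fields(doc)
    # The extracted field keeps the input collation, so a later comparison
    # against "SPARK" is case-insensitive.
    print(fields["name"], collated_equals(fields["name"], CollatedString("SPARK", "UTF8_LCASE")))
```

In this model the collation only influences the final comparison; the XML parsing step is identical for every collation, which is the sense in which such a function would be pass-through.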