Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
cloud-fan closed pull request #46618: [SPARK-48159][SQL] Extending support for collated strings on datetime expressions URL: https://github.com/apache/spark/pull/46618 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
cloud-fan commented on PR #46618: URL: https://github.com/apache/spark/pull/46618#issuecomment-2135727861 thanks, merging to master!
Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
nebojsa-db commented on code in PR #46618: URL: https://github.com/apache/spark/pull/46618#discussion_r1610009218

## sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala:
@@ -1584,6 +1587,237 @@ class CollationSQLExpressionsSuite
     })
   }
 
+  test("CurrentTimeZone expression with collation") {
+    // Supported collations
+    testSuppCollations.foreach(collationName => {
+      val query = "select current_timezone()"
+      // Data type check
+      withSQLConf(SqlApiConf.DEFAULT_COLLATION -> collationName) {
+        val testQuery = sql(query)
+        val dataType = StringType(collationName)
+        assert(testQuery.schema.fields.head.dataType.sameType(dataType))
+      }
+    })
+  }
+
+  test("DayName expression with collation") {
+    // Supported collations
+    testSuppCollations.foreach(collationName => {
+      val query = "select dayname(current_date())"
+      // Data type check
+      withSQLConf(SqlApiConf.DEFAULT_COLLATION -> collationName) {
+        val testQuery = sql(query)
+        val dataType = StringType(collationName)
+        assert(testQuery.schema.fields.head.dataType.sameType(dataType))
+      }
+    })
+  }
+
+  test("ToUnixTimestamp expression with collation") {
+    // Supported collations
+    testSuppCollations.foreach(collationName => {
+      val query =
+        s"""
+           |select to_unix_timestamp(collate('2021-01-01 00:00:00', '${collationName}'),
+           |collate('yyyy-MM-dd HH:mm:ss', '${collationName}'))
+           |""".stripMargin
+      // Result & data type check
+      val testQuery = sql(query)
+      val dataType = LongType
+      val expectedResult = 1609488000L
+      assert(testQuery.schema.fields.head.dataType.sameType(dataType))
+      checkAnswer(testQuery, Row(expectedResult))
+    })
+  }
+
+  test("FromUnixTime expression with collation") {
+    // Supported collations
+    testSuppCollations.foreach(collationName => {
+      val query =
+        s"""
+           |select from_unixtime(1609488000, collate('yyyy-MM-dd HH:mm:ss', '${collationName}'))
+           |""".stripMargin
+      // Result & data type check
+      withSQLConf(SqlApiConf.DEFAULT_COLLATION -> collationName) {
+        val testQuery = sql(query)
+        val dataType = StringType(collationName)
+        val expectedResult = "2021-01-01 00:00:00"
+        assert(testQuery.schema.fields.head.dataType.sameType(dataType))
+        checkAnswer(testQuery, Row(expectedResult))
+      }
+    })
+  }
+
+  test("NextDay expression with collation") {
+    // Supported collations
+    testSuppCollations.foreach(collationName => {
+      val query =
+        s"""
+           |select next_day('2015-01-14', collate('TU', '${collationName}'))
+           |""".stripMargin
+      // Result & data type check
+      withSQLConf(SqlApiConf.DEFAULT_COLLATION -> collationName) {
+        val testQuery = sql(query)
+        val dataType = DateType
+        val expectedResult = "2015-01-20"
+        assert(testQuery.schema.fields.head.dataType.sameType(dataType))
+        checkAnswer(testQuery, Row(Date.valueOf(expectedResult)))
+      }
+    })
+  }
+
+  test("FromUTCTimestamp expression with collation") {
+    // Supported collations
+    testSuppCollations.foreach(collationName => {
+      val query =
+        s"""
+           |select from_utc_timestamp(collate('2016-08-31', '${collationName}'),
+           |collate('Asia/Seoul', '${collationName}'))
+           |""".stripMargin
+      // Result & data type check
+      val testQuery = sql(query)
+      val dataType = TimestampType
+      val expectedResult = "2016-08-31 09:00:00.0"
+      assert(testQuery.schema.fields.head.dataType.sameType(dataType))
+      checkAnswer(testQuery, Row(Timestamp.valueOf(expectedResult)))
+    })
+  }
+
+  test("ToUTCTimestamp expression with collation") {
+    // Supported collations
+    testSuppCollations.foreach(collationName => {
+      val query =
+        s"""
+           |select to_utc_timestamp(collate('2016-08-31 09:00:00', '${collationName}'),
+           |collate('Asia/Seoul', '${collationName}'))
+           |""".stripMargin
+      // Result & data type check
+      val testQuery = sql(query)
+      val dataType = TimestampType
+      val expectedResult = "2016-08-31 00:00:00.0"
+      assert(testQuery.schema.fields.head.dataType.sameType(dataType))
+      checkAnswer(testQuery, Row(Timestamp.valueOf(expectedResult)))
+    })
+  }
+
+  test("ParseToDate expression with collation") {
+    // Supported collations
+    testSuppCollations.foreach(collationName => {
+      val query =
+        s"""
+           |select to_date(collate('2016-12-31', '${collationName}'),
+           |collate('yyyy-MM-dd', '${collationName}'))
+           |""".stripMargin
+      // Result & data type check
+      val testQuery = sql(query)
+      val dataType = DateType
+      val expectedResult = "2016-12-31"
+      assert(testQuery.schema.fields.head.dataType.sameType(dataType))
Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
uros-db commented on code in PR #46618: URL: https://github.com/apache/spark/pull/46618#discussion_r1606726766 on sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala, quoting the same test hunk as the comment above.
Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
nebojsa-db commented on code in PR #46618: URL: https://github.com/apache/spark/pull/46618#discussion_r1606703312 on sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala, quoting the same test hunk as the comment above.
Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
uros-db commented on code in PR #46618: URL: https://github.com/apache/spark/pull/46618#discussion_r1606554130 on sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala, quoting the same test hunk as the comment above.
Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
uros-db commented on code in PR #46618: URL: https://github.com/apache/spark/pull/46618#discussion_r1604620097

## sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala:
@@ -1584,6 +1584,234 @@ class CollationSQLExpressionsSuite
     })
   }
 
+  test("CurrentTimeZone expression with collation") {
+    // Supported collations
+    Seq("UTF8_BINARY", "UTF8_BINARY_LCASE", "UNICODE", "UNICODE_CI").foreach(collationName => {
+      val query = "select current_timezone()"
+      // Result
+      withSQLConf(SqlApiConf.DEFAULT_COLLATION -> collationName) {
+        val testQuery = sql(query)
+        val dataType = StringType(collationName)
+        assertResult(dataType)(testQuery.schema.fields.head.dataType)
+      }
+    })
+  }
+
+  test("DayName expression with collation") {
+    // Supported collations
+    Seq("UTF8_BINARY", "UTF8_BINARY_LCASE", "UNICODE", "UNICODE_CI").foreach(collationName => {
+      val query = "select dayname(current_date())"
+      // Result
+      withSQLConf(SqlApiConf.DEFAULT_COLLATION -> collationName) {
+        val testQuery = sql(query)
+        val dataType = StringType(collationName)
+        assertResult(dataType)(testQuery.schema.fields.head.dataType)
+      }
+    })
+  }
+
+  test("ToUnixTimestamp expression with collation") {
+    // Supported collations
+    Seq("UTF8_BINARY", "UTF8_BINARY_LCASE", "UNICODE", "UNICODE_CI").foreach(collationName => {
+      val query =
+        s"""
+           |select to_unix_timestamp(collate('2021-01-01 00:00:00', '${collationName}'),
+           |collate('yyyy-MM-dd HH:mm:ss', '${collationName}'))
+           |""".stripMargin
+      // Result
+      val testQuery = sql(query)
+      val dataType = LongType
+      val expectedResult = 1609488000L
+      assertResult(dataType)(testQuery.schema.fields.head.dataType)
+      assertResult(expectedResult)(testQuery.collect().head.getLong(0))
+    })
+  }
+
+  test("FromUnixTime expression with collation") {
+    // Supported collations
+    Seq("UTF8_BINARY", "UTF8_BINARY_LCASE", "UNICODE", "UNICODE_CI").foreach(collationName => {
+      val query =
+        s"""
+           |select from_unixtime(1609488000, collate('yyyy-MM-dd HH:mm:ss', '${collationName}'))
+           |""".stripMargin
+      // Result
+      withSQLConf(SqlApiConf.DEFAULT_COLLATION -> collationName) {
+        val testQuery = sql(query)
+        val dataType = StringType(collationName)
+        val expectedResult = "2021-01-01 00:00:00"
+        assertResult(dataType)(testQuery.schema.fields.head.dataType)
+        assertResult(expectedResult)(testQuery.collect().head.getString(0))
+      }
+    })
+  }
+
+  test("NextDay expression with collation") {
+    // Supported collations
+    Seq("UTF8_BINARY", "UTF8_BINARY_LCASE", "UNICODE", "UNICODE_CI").foreach(collationName => {
+      val query =
+        s"""
+           |select next_day('2015-01-14', collate('TU', '${collationName}'))
+           |""".stripMargin
+      // Result
+      withSQLConf(SqlApiConf.DEFAULT_COLLATION -> collationName) {
+        val testQuery = sql(query)
+        val dataType = DateType
+        val expectedResult = "2015-01-20"
+        assertResult(dataType)(testQuery.schema.fields.head.dataType)
+        assertResult(expectedResult)(testQuery.collect().head.getDate(0).toString)
+      }
+    })
+  }
+
+  test("FromUTCTimestamp expression with collation") {
+    // Supported collations
+    Seq("UTF8_BINARY", "UTF8_BINARY_LCASE", "UNICODE", "UNICODE_CI").foreach(collationName => {
+      val query =
+        s"""
+           |select from_utc_timestamp(collate('2016-08-31', '${collationName}'),
+           |collate('Asia/Seoul', '${collationName}'))
+           |""".stripMargin
+      // Result
+      val testQuery = sql(query)
+      val dataType = TimestampType
+      val expectedResult = "2016-08-31 09:00:00.0"
+      assertResult(dataType)(testQuery.schema.fields.head.dataType)
+      assertResult(expectedResult)(testQuery.collect().head.getTimestamp(0).toString)
+    })
+  }
+
+  test("ToUTCTimestamp expression with collation") {
+    // Supported collations
+    Seq("UTF8_BINARY", "UTF8_BINARY_LCASE", "UNICODE", "UNICODE_CI").foreach(collationName => {
+      val query =
+        s"""
+           |select to_utc_timestamp(collate('2016-08-31 09:00:00', '${collationName}'),
+           |collate('Asia/Seoul', '${collationName}'))
+           |""".stripMargin
+      // Result
+      val testQuery = sql(query)
+      val dataType = TimestampType
+      val expectedResult = "2016-08-31 00:00:00.0"
+      assertResult(dataType)(testQuery.schema.fields.head.dataType)
+      assertResult(expectedResult)(testQuery.collect().head.getTimestamp(0).toString)
+    })
+  }
+
+  test("ParseToDate expression with collation") {
+    // Supported collations
+    Seq("UTF8_BINARY", "UTF8_BINARY_LCASE", "UNICODE", "UNICODE_CI").foreach(collationName => {
+      val query =
+        s"""
+           |select to_d
Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
uros-db commented on code in PR #46618: URL: https://github.com/apache/spark/pull/46618#discussion_r1604614789

## sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala:
@@ -1584,6 +1584,234 @@ class CollationSQLExpressionsSuite
+  test("CurrentTimeZone expression with collation") {
+    // Supported collations
+    Seq("UTF8_BINARY", "UTF8_BINARY_LCASE", "UNICODE", "UNICODE_CI").foreach(collationName => {
Review Comment: since we're now using this `Seq("UTF8_BINARY", "UTF8_BINARY_LCASE", "UNICODE", "UNICODE_CI")` a lot, let's separate it out and call it something like `testCollationsSeq`
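The refactor suggested here can be sketched roughly as follows. This is an illustration in the suite's context, not the exact merged code; the shared value is shown with the name `testSuppCollations`, which is what the revised diffs in this thread use:

```scala
// Shared list of collations exercised by these tests, factored out of the
// repeated Seq(...) literal.
private val testSuppCollations =
  Seq("UTF8_BINARY", "UTF8_BINARY_LCASE", "UNICODE", "UNICODE_CI")

test("CurrentTimeZone expression with collation") {
  testSuppCollations.foreach(collationName => {
    val query = "select current_timezone()"
    // Data type check
    withSQLConf(SqlApiConf.DEFAULT_COLLATION -> collationName) {
      val testQuery = sql(query)
      assert(testQuery.schema.fields.head.dataType.sameType(StringType(collationName)))
    }
  })
}
```

Factoring the list out also means a future collation (or a renamed one) only has to be added in one place rather than in every test.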
Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
uros-db commented on code in PR #46618: URL: https://github.com/apache/spark/pull/46618#discussion_r1604612518

## sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala:
+      val query = "select current_timezone()"
+      // Result
Review Comment: (goes for other similar tests too)

## sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala:
+      assertResult(dataType)(testQuery.schema.fields.head.dataType)
+      assertResult(expectedResult)(testQuery.collect().head.getLong(0))
Review Comment: (goes for other similar tests too)
Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
uros-db commented on code in PR #46618: URL: https://github.com/apache/spark/pull/46618#discussion_r1604611590

## sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala:
+      assertResult(dataType)(testQuery.schema.fields.head.dataType)
+      assertResult(expectedResult)(testQuery.collect().head.getLong(0))
Review Comment: use `checkAnswer` instead
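Applied to the quoted test, the suggestion reads roughly like this (a sketch in the suite's context; `checkAnswer` is the `QueryTest` helper that compares a DataFrame's full result set against expected `Row`s):

```scala
// Before: pull the value out of the first row by hand
// assertResult(1609488000L)(testQuery.collect().head.getLong(0))

// After: compare the whole result set; checkAnswer also normalizes row
// ordering and prints a readable diff on failure
checkAnswer(testQuery, Row(1609488000L))
```

The same rewrite applies to the `getString`, `getDate`, and `getTimestamp` variants in the other tests, wrapping the expected value in `Row(...)` (via `Date.valueOf` / `Timestamp.valueOf` where needed).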
Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
uros-db commented on code in PR #46618: URL: https://github.com/apache/spark/pull/46618#discussion_r1604609258

## sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala:
+      val query = "select current_timezone()"
+      // Result
Review Comment:
```suggestion
      // Data type
```
Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
uros-db commented on code in PR #46618: URL: https://github.com/apache/spark/pull/46618#discussion_r1604608546

## sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala:
+      val query = "select current_timezone()"
+      // Result
Review Comment: this is not a result check, but rather a data type check
Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
nebojsa-db commented on PR #46618: URL: https://github.com/apache/spark/pull/46618#issuecomment-2115535153 @cloud-fan Please review :)
Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
mihailom-db commented on code in PR #46618: URL: https://github.com/apache/spark/pull/46618#discussion_r1603211716

## sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala: ##

```diff
@@ -1584,6 +1584,240 @@ class CollationSQLExpressionsSuite
     })
   }

+  test("CurrentTimeZone expression with collation") {
+    // Supported collations
+    Seq("UTF8_BINARY", "UTF8_BINARY_LCASE", "UNICODE", "UNICODE_CI").foreach(collationName => {
+      val query =
+        s"""
+           |select current_timezone()
+           |""".stripMargin
```

Review Comment: The practice is that if a string fits on one line we should keep it on one line, and this one seems small enough.
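For illustration, the suggested change amounts to the following (a sketch of the review suggestion, not the exact PR code):

```scala
// Before: a multi-line interpolated string for a short query
// (the s"""…""".stripMargin form also carries surrounding newlines).
val queryBefore =
  s"""
     |select current_timezone()
     |""".stripMargin

// After: the same query on a single line, as the review suggests.
val queryAfter = "select current_timezone()"
```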
Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
mihailom-db commented on code in PR #46618: URL: https://github.com/apache/spark/pull/46618#discussion_r1603210909

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala: ##

```diff
@@ -2938,20 +2940,20 @@ case class Extract(field: Expression, source: Expression, replacement: Expressio
 object Extract {
   def createExpr(funcName: String, field: Expression, source: Expression): Expression = {
     // both string and null literals are allowed.
-    if ((field.dataType == StringType || field.dataType == NullType) && field.foldable) {
-      val fieldStr = field.eval().asInstanceOf[UTF8String]
```

Review Comment: You do not need to change this to a pattern match here; you can use `field.dataType.isInstanceOf[StringType]` instead.
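The reviewer's point can be sketched with simplified, hypothetical stand-in types (not the real `org.apache.spark.sql.types` classes): once `StringType` carries a collation, an equality check against one specific instance misses the collated variants, while `isInstanceOf` matches all of them.

```scala
// Hypothetical, simplified stand-ins for Spark's DataType hierarchy,
// used only to illustrate the isInstanceOf suggestion.
sealed trait DataType
case class StringType(collationId: Int = 0) extends DataType
case object NullType extends DataType

object Demo extends App {
  val collated: DataType = StringType(collationId = 1)

  // Case-class equality only matches the exact instance
  // (i.e. the default collation here):
  println(collated == StringType())          // false

  // isInstanceOf matches any collated StringType, which is the intent:
  println(collated.isInstanceOf[StringType]) // true
}
```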
Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
mihailom-db commented on code in PR #46618: URL: https://github.com/apache/spark/pull/46618#discussion_r1603212139

## sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala: ##

```diff
@@ -1584,6 +1584,240 @@ class CollationSQLExpressionsSuite
     })
   }

+  test("CurrentTimeZone expression with collation") {
+    // Supported collations
+    Seq("UTF8_BINARY", "UTF8_BINARY_LCASE", "UNICODE", "UNICODE_CI").foreach(collationName => {
+      val query =
+        s"""
+           |select current_timezone()
+           |""".stripMargin
+      // Result
+      withSQLConf(SqlApiConf.DEFAULT_COLLATION -> collationName) {
+        val testQuery = sql(query)
+        val dataType = StringType(collationName)
+        assertResult(dataType)(testQuery.schema.fields.head.dataType)
+      }
+    })
+  }
+
+  test("DayName expression with collation") {
+    // Supported collations
+    Seq("UTF8_BINARY", "UTF8_BINARY_LCASE", "UNICODE", "UNICODE_CI").foreach(collationName => {
+      val query =
+        s"""
+           |select dayname(current_date())
```

Review Comment: ditto
Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
nebojsa-db commented on PR #46618: URL: https://github.com/apache/spark/pull/46618#issuecomment-2115033658

Please take a look @nikolamand-db @stefankandic @uros-db @mihailom-db @dbatomic
[PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]
nebojsa-db opened a new pull request, #46618: URL: https://github.com/apache/spark/pull/46618

### What changes were proposed in this pull request?
This PR introduces changes that allow collated strings to be passed to various datetime expressions, and allow those expressions to return collated strings. Impacted datetime expressions:
- current_timezone
- to_unix_timestamp
- from_unixtime
- next_day
- from_utc_timestamp
- to_utc_timestamp
- to_date
- to_timestamp
- trunc
- date_trunc
- make_timestamp
- date_part
- convert_timezone

### Why are the changes needed?
This PR is part of the ongoing effort to support collated strings in Spark SQL.

### Does this PR introduce _any_ user-facing change?
Yes, users will be able to use collated strings with datetime expressions.

### How was this patch tested?
Added corresponding tests.

### Was this patch authored or co-authored using generative AI tooling?
No.
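As a usage sketch of what the PR enables (illustrative only; assumes a Spark build that includes this change, and the full `yyyy-MM-dd HH:mm:ss` pattern is my assumption):

```scala
// Illustrative sketch: pass collated strings to a datetime expression.
// Assumes a Spark version with this PR's collation support.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()

// Both the input string and the format string carry a non-default
// collation; to_unix_timestamp now accepts them like plain strings.
spark.sql(
  """select to_unix_timestamp(
    |  collate('2021-01-01 00:00:00', 'UNICODE_CI'),
    |  collate('yyyy-MM-dd HH:mm:ss', 'UNICODE_CI'))""".stripMargin
).show()
```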