[GitHub] spark pull request #22943: [SPARK-25098][SQL] Trim the string when cast stri...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22943 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22943: [SPARK-25098][SQL] Trim the string when cast stri...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22943#discussion_r231382309 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala --- @@ -140,16 +140,10 @@ class DateTimeUtilsSuite extends SparkFunSuite { c = Calendar.getInstance() c.set(2015, 2, 18, 0, 0, 0) c.set(Calendar.MILLISECOND, 0) -assert(stringToDate(UTF8String.fromString("2015-03-18")).get === - millisToDays(c.getTimeInMillis)) -assert(stringToDate(UTF8String.fromString("2015-03-18 ")).get === - millisToDays(c.getTimeInMillis)) -assert(stringToDate(UTF8String.fromString("2015-03-18 123142")).get === - millisToDays(c.getTimeInMillis)) -assert(stringToDate(UTF8String.fromString("2015-03-18T123123")).get === - millisToDays(c.getTimeInMillis)) -assert(stringToDate(UTF8String.fromString("2015-03-18T")).get === - millisToDays(c.getTimeInMillis)) +Seq("2015-03-18", "2015-03-18 ", " 2015-03-18", " 2015-03-18 ", "2015-03-18 123142", --- End diff -- ah i see --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22943: [SPARK-25098][SQL] Trim the string when cast stri...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22943#discussion_r231381218 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala --- @@ -140,16 +140,10 @@ class DateTimeUtilsSuite extends SparkFunSuite { c = Calendar.getInstance() c.set(2015, 2, 18, 0, 0, 0) c.set(Calendar.MILLISECOND, 0) -assert(stringToDate(UTF8String.fromString("2015-03-18")).get === - millisToDays(c.getTimeInMillis)) -assert(stringToDate(UTF8String.fromString("2015-03-18 ")).get === - millisToDays(c.getTimeInMillis)) -assert(stringToDate(UTF8String.fromString("2015-03-18 123142")).get === - millisToDays(c.getTimeInMillis)) -assert(stringToDate(UTF8String.fromString("2015-03-18T123123")).get === - millisToDays(c.getTimeInMillis)) -assert(stringToDate(UTF8String.fromString("2015-03-18T")).get === - millisToDays(c.getTimeInMillis)) +Seq("2015-03-18", "2015-03-18 ", " 2015-03-18", " 2015-03-18 ", "2015-03-18 123142", --- End diff -- New test cases (with space padding) are added; e.g. ` 2015-03-18` and ` 2015-03-18 `. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22943: [SPARK-25098][SQL] Trim the string when cast stri...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22943#discussion_r231380552 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala --- @@ -140,16 +140,10 @@ class DateTimeUtilsSuite extends SparkFunSuite { c = Calendar.getInstance() c.set(2015, 2, 18, 0, 0, 0) c.set(Calendar.MILLISECOND, 0) -assert(stringToDate(UTF8String.fromString("2015-03-18")).get === - millisToDays(c.getTimeInMillis)) -assert(stringToDate(UTF8String.fromString("2015-03-18 ")).get === - millisToDays(c.getTimeInMillis)) -assert(stringToDate(UTF8String.fromString("2015-03-18 123142")).get === - millisToDays(c.getTimeInMillis)) -assert(stringToDate(UTF8String.fromString("2015-03-18T123123")).get === - millisToDays(c.getTimeInMillis)) -assert(stringToDate(UTF8String.fromString("2015-03-18T")).get === - millisToDays(c.getTimeInMillis)) +Seq("2015-03-18", "2015-03-18 ", " 2015-03-18", " 2015-03-18 ", "2015-03-18 123142", --- End diff -- the test result doesn't change? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22943: [SPARK-25098][SQL] Trim the string when cast stri...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22943#discussion_r231207627 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -441,7 +441,7 @@ object DateTimeUtils { } /** - * Parses a given UTF8 date string to a corresponding [[Int]] value. + * Parses a trimmed UTF8 date string to a corresponding [[Int]] value. --- End diff -- `Parses a trimmed UTF8` -> `Trim and parse a given UTF8`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22943: [SPARK-25098][SQL] Trim the string when cast stri...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22943#discussion_r231207128 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -274,7 +274,7 @@ object DateTimeUtils { } /** - * Parses a given UTF8 date string to the corresponding a corresponding [[Long]] value. + * Parses a trimmed UTF8 date string to the corresponding a corresponding [[Long]] value. --- End diff -- `Parses a trimmed UTF8` -> `Trim and parse a given UTF8`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22943: [SPARK-25098][SQL] Trim the string when cast stri...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22943#discussion_r230998508 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala --- @@ -359,7 +359,7 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String // TimestampConverter private[this] def castToTimestamp(from: DataType): Any => Any = from match { case StringType => - buildCast[UTF8String](_, utfs => DateTimeUtils.stringToTimestamp(utfs, timeZone).orNull) + buildCast[UTF8String](_, s => DateTimeUtils.stringToTimestamp(s.trim(), timeZone).orNull) --- End diff -- Ur, I'd like not to rename it. One line function document will suffice. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22943: [SPARK-25098][SQL] Trim the string when cast stri...
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/22943#discussion_r230970713 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala --- @@ -359,7 +359,7 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String // TimestampConverter private[this] def castToTimestamp(from: DataType): Any => Any = from match { case StringType => - buildCast[UTF8String](_, utfs => DateTimeUtils.stringToTimestamp(utfs, timeZone).orNull) + buildCast[UTF8String](_, s => DateTimeUtils.stringToTimestamp(s.trim(), timeZone).orNull) --- End diff -- How about change `stringToDate` to `trimStringToDate` and update `trimStringToDate` to: ![image](https://user-images.githubusercontent.com/5399861/48036353-ec49a100-e1a2-11e8-80e6-b52b4a007493.png) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22943: [SPARK-25098][SQL] Trim the string when cast stri...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22943#discussion_r230911199 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala --- @@ -359,7 +359,7 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String // TimestampConverter private[this] def castToTimestamp(from: DataType): Any => Any = from match { case StringType => - buildCast[UTF8String](_, utfs => DateTimeUtils.stringToTimestamp(utfs, timeZone).orNull) + buildCast[UTF8String](_, s => DateTimeUtils.stringToTimestamp(s.trim(), timeZone).orNull) --- End diff -- cc @gatorsmile --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22943: [SPARK-25098][SQL] Trim the string when cast stri...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22943#discussion_r230911108 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala --- @@ -359,7 +359,7 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String // TimestampConverter private[this] def castToTimestamp(from: DataType): Any => Any = from match { case StringType => - buildCast[UTF8String](_, utfs => DateTimeUtils.stringToTimestamp(utfs, timeZone).orNull) + buildCast[UTF8String](_, s => DateTimeUtils.stringToTimestamp(s.trim(), timeZone).orNull) --- End diff -- What about changing `stringToDate` and `stringToTimestamp` instead? Those functions are used only in `Cast` and they already handle `null`cases, too. I didn't look at the detail of this PR. The change looks a little less robust when `s` is `null`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22943: [SPARK-25098][SQL] Trim the string when cast stri...
GitHub user wangyum opened a pull request: https://github.com/apache/spark/pull/22943 [SPARK-25098][SQL] Trim the string when cast stringToTimestamp and stringToDate ## What changes were proposed in this pull request? **Hive** and **Oracle** trim the string when cast `stringToTimestamp` and `stringToDate`. this PR support this feature: ![image](https://user-images.githubusercontent.com/5399861/47979721-793b1e80-e0ff-11e8-97c8-24b10950ee9e.png) ![image](https://user-images.githubusercontent.com/5399861/47979725-7dffd280-e0ff-11e8-87d4-5767a00ed46e.png) ## How was this patch tested? unit tests Closes https://github.com/apache/spark/pull/22089 You can merge this pull request into a Git repository by running: $ git pull https://github.com/wangyum/spark SPARK-25098 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22943.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22943 commit d297817b7457fef40eb78b803542aed213afb7fc Author: Yuming Wang Date: 2018-11-05T05:31:22Z trim() the string when cast stringToTimestamp and stringToDate --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org