marmbrus commented on a change in pull request #27710: [SPARK-30960][SQL] add 
back the legacy date/timestamp format support in CSV/JSON parser
URL: https://github.com/apache/spark/pull/27710#discussion_r384715716
 
 

 ##########
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala
 ##########
 @@ -175,10 +175,30 @@ class UnivocityParser(
       }
 
     case _: TimestampType => (d: String) =>
-      nullSafeDatum(d, name, nullable, options)(timestampFormatter.parse)
+      nullSafeDatum(d, name, nullable, options) { datum =>
+        try {
+          timestampFormatter.parse(datum)
+        } catch {
+          case NonFatal(e) =>
+            // If fails to parse, then tries the way used in 2.0 and 1.x for 
backwards
+            // compatibility.
+            val str = UTF8String.fromString(datum)
+            DateTimeUtils.stringToTimestamp(str, 
options.zoneId).getOrElse(throw e)
 
 Review comment:
   I agree with your analysis of the relative costs here.
   
   I think there are probably very few Spark programs in existence where the 
specifics of date handling for dates before October 15, 1582 are critical to 
correctness. This is especially true if you still get back some valid date 
(rather than `null`). If thats really the only difference with the calendar 
switch, then I think you could even make an argument that making the calendar 
configurable is too high a cost for the project.
   
   Removing support for previously supported date format parsing strings in 
contrast seems pretty disastrous to me. Having to remove test cases really 
should have been a red flag to reviewers of those changes that we were going to 
break users' programs.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to