peng bo created SPARK-27638:
-------------------------------

Summary: date format yyyy-M-dd comparison isn't handled properly
Key: SPARK-27638
URL: https://issues.apache.org/jira/browse/SPARK-27638
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.4.2
Reporter: peng bo

The example below works in both MySQL and Hive, but not in Spark.

{code:java}
mysql> select * from date_test where date_col >= '2000-1-1';
+------------+
| date_col   |
+------------+
| 2000-01-01 |
+------------+
{code}

The reason is that Spark casts both sides to String type when comparing a date with a string, in order to support partial dates. Compared as strings, '2000-01-01' sorts before '2000-1-1', so the row is filtered out. Please find more details in https://issues.apache.org/jira/browse/SPARK-8420.

Based on some tests, Hive and MySQL handle Date and String comparison as follows:

Hive: casts to Date; partial dates are not supported.
MySQL: casts to Date; certain "partial dates" are supported by defining date string parse rules. See {{str_to_datetime}} in https://github.com/mysql/mysql-server/blob/5.5/sql-common/my_time.c

Here are two proposals:

a. Follow the MySQL parse rules, although some partial date string comparison cases would then not be supported.
b. Cast the String value to Date; if the cast succeeds, use date.toString, otherwise keep the original string.
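
To make proposal (b) concrete, here is a minimal sketch in plain Python rather than Spark code. The helper names {{normalize_date_string}} and {{date_ge_string}} are hypothetical, and the accepted pattern (a four-digit year, then a one- or two-digit month and day) is an assumption about what "cast to date" would accept; the actual Spark cast rules may differ.

```python
import re
from datetime import date

# Assumed pattern for strings that count as a full date: yyyy-M-d,
# with optional zero padding on month and day.
_DATE_RE = re.compile(r"^\s*(\d{4})-(\d{1,2})-(\d{1,2})\s*$")


def normalize_date_string(s: str) -> str:
    """Return the canonical yyyy-MM-dd form if s is a valid full date,
    otherwise return s unchanged (hypothetical helper)."""
    m = _DATE_RE.match(s)
    if m is None:
        return s  # partial date or non-date: keep the original string
    year, month, day = (int(g) for g in m.groups())
    try:
        return date(year, month, day).isoformat()
    except ValueError:
        return s  # e.g. 2000-2-30 is not a real date: keep the original


def date_ge_string(d: date, s: str) -> bool:
    """Emulate `date_col >= 'literal'` using string comparison after
    normalizing the literal (hypothetical helper)."""
    return d.isoformat() >= normalize_date_string(s)


# Plain string comparison reproduces the reported problem:
print("2000-01-01" >= "2000-1-1")                 # False
# After normalization the row matches:
print(date_ge_string(date(2000, 1, 1), "2000-1-1"))  # True
# A partial date still falls back to string comparison:
print(date_ge_string(date(2000, 1, 1), "2000"))      # True
```

In this sketch a literal that is not a complete, valid date is left untouched, so partial date strings such as '2000' keep the existing string comparison behavior.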