[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15857459#comment-15857459 ]
Hyukjin Kwon edited comment on SPARK-19496 at 2/8/17 6:16 AM: -------------------------------------------------------------- - Hive {code} hive> SELECT to_date('2014-31-12'); 2016-07-12 ... hive> SELECT to_date('2014-12-32'); 2015-01-01 ... hive> SELECT to_date('2014-12-31'); 2014-12-31 {code} {code} hive> SELECT to_date('2015-07-22', 'yyyy-dd-MM') org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:7 Arguments length mismatch ''yyyy-dd-MM'': to_date() requires 1 argument, got 2 ... {code} - Postgres {code} postgres=# SELECT to_date('2014-12-31'); ERROR: function to_date(unknown) does not exist LINE 1: SELECT to_date('2014-12-31'); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. {code} {code} postgres=# SELECT to_date('2014-12-31'); ERROR: function to_date(unknown) does not exist LINE 1: SELECT to_date('2014-12-31'); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. postgres=# SELECT to_date('2014-12-31', 'yyyy-MM-dd'); to_date ------------ 2014-12-31 (1 row) postgres=# SELECT to_date('2014-13-31', 'yyyy-MM-dd'); to_date ------------ 2015-01-31 (1 row) {code} - Spark {code} spark-sql> SELECT to_date('2014-31-12'); NULL ... spark-sql> SELECT to_date('2014-12-32'); NULL ... spark-sql> SELECT to_date('2014-12-31'); 2014-12-31 ... {code} {code} spark-sql> SELECT to_date('2015-07-22', 'yyyy-dd-MM') 2016-10-07 ... {code} - MySQL {code} mysql> SELECT str_to_date('2014-12-31'); ERROR 1582 (42000): Incorrect parameter count in the call to native function 'str_to_date' {code} {code} mysql> SELECT str_to_date('2014-12-31', '%Y-%m-%d'); +---------------------------------------+ | str_to_date('2014-12-31', '%Y-%m-%d') | +---------------------------------------+ | 2014-12-31 | +---------------------------------------+ 1 row in set (0.00 sec) mysql> SELECT str_to_date('2014-13-31', '%Y-%m-%d'); +---------------------------------------+ | str_to_date('2014-13-31', '%Y-%m-%d') | +---------------------------------------+ | NULL | +---------------------------------------+ 1 row in set, 1 warning (0.00 sec) {code} MySQL/SparkSQL - it seems returning {{NULL}}. MySQL seems not supporting omitting the format. Hive/Postgres - it seems returning calculated dates. Hive seems not supporting the format. Postgres seems not supporting omitting the format. was (Author: hyukjin.kwon): - Hive {code} hive> SELECT to_date('2014-31-12'); 2016-07-12 ... hive> SELECT to_date('2014-12-32'); 2015-01-01 ... hive> SELECT to_date('2014-12-31'); 2014-12-31 {code} {code} hive> SELECT to_date('2015-07-22', 'yyyy-dd-MM') org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:7 Arguments length mismatch ''yyyy-dd-MM'': to_date() requires 1 argument, got 2 ... {code} - Postgres {code} postgres=# SELECT to_date('2014-12-31'); ERROR: function to_date(unknown) does not exist LINE 1: SELECT to_date('2014-12-31'); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. postgres=# SELECT to_date('2014-12-31', 'yyyy-MM-dd'); to_date ------------ 2014-12-31 (1 row) postgres=# SELECT to_date('2014-13-31', 'yyyy-MM-dd'); to_date ------------ 2015-01-31 (1 row) {code} - Spark {code} spark-sql> SELECT to_date('2014-31-12'); NULL ... spark-sql> SELECT to_date('2014-12-32'); NULL ... spark-sql> SELECT to_date('2014-12-31'); 2014-12-31 ... {code} {code} spark-sql> SELECT to_date('2015-07-22', 'yyyy-dd-MM') 2016-10-07 ... {code} - MySQL {code} mysql> SELECT str_to_date('2014-12-31', '%Y-%m-%d'); +---------------------------------------+ | str_to_date('2014-12-31', '%Y-%m-%d') | +---------------------------------------+ | 2014-12-31 | +---------------------------------------+ 1 row in set (0.00 sec) mysql> SELECT str_to_date('2014-13-31', '%Y-%m-%d'); +---------------------------------------+ | str_to_date('2014-13-31', '%Y-%m-%d') | +---------------------------------------+ | NULL | +---------------------------------------+ 1 row in set, 1 warning (0.00 sec) {code} MySQL/SparkSQL - it seems returning {{NULL}}. MySQL seems not supporting omitting the format. Hive/Postgres - it seems returning calculated dates. Hive seems not supporting the format. Postgres seems not supporting omitting the format. > to_date with format has weird behavior > -------------------------------------- > > Key: SPARK-19496 > URL: https://issues.apache.org/jira/browse/SPARK-19496 > Project: Spark > Issue Type: Bug > Components: SQL > Affects Versions: 2.1.0 > Reporter: Wenchen Fan > > Today, if we run > {code} > SELECT to_date('2015-07-22', 'yyyy-dd-MM') > {code} > will result to `2016-10-07`, while running > {code} > SELECT to_date('2014-31-12') # default format > {code} > will return null. > this behavior is weird and we should check other systems like hive to see if > this is expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org