[ https://issues.apache.org/jira/browse/SPARK-24969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-24969.
----------------------------------
    Resolution: Incomplete

> SQL: to_date function can't parse date strings in different locales.
> --------------------------------------------------------------------
>
>                 Key: SPARK-24969
>                 URL: https://issues.apache.org/jira/browse/SPARK-24969
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.1
>         Environment: Bare Spark 2.2.1 installation, on RHEL 6.
>            Reporter: Valentino Pinna
>            Priority: Major
>              Labels: bulk-closed
>
> The locale for {{org.apache.spark.sql.catalyst.util.DateTimeUtils}}, which is
> used internally by the {{to_date}} SQL function, is hard-coded to
> {{Locale.US}}.
> This causes problems when parsing a dataset whose dates are written in a
> different language (Italian in this case).
> {code:java}
> spark.read.format("csv")
>   .option("sep", ";")
>   .csv(logFile)
>   .toDF("DATA", .....)
>   .withColumn("DATA2", to_date(col("DATA"), "yyyy MMM"))
>   .show(10)
> {code}
> Results from example dataset:
> |*DATA*|*DATA2*|
> |2018 giu|null|
> |2018 mag|null|
> |2018 apr|2018-04-01|
> |2018 mar|2018-03-01|
> |2018 feb|2018-02-01|
> |2018 gen|null|
> |2017 dic|null|
> |2017 nov|2017-11-01|
> |2017 ott|null|
> |2017 set|null|
> Expected results: All values converted. Only the months whose Italian
> abbreviation coincides with the English one (apr, mar, feb, nov) are parsed.
> TEMPORARY WORKAROUND:
> In object {{org.apache.spark.sql.catalyst.util.DateTimeUtils}}, replace all
> instances of {{Locale.US}} with {{Locale.<your locale>}}.
> ADDITIONAL NOTES:
> I can make a pull request available on GitHub.
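
The behaviour described in the report can be reproduced outside Spark. The following is a minimal sketch, assuming the affected code path behaves like a plain `java.text.SimpleDateFormat` constructed with a fixed locale; the class name `LocaleMonthParsing` and the helper `toIsoDate` are hypothetical and are not part of Spark. It parses the same `yyyy MMM` strings once with a US locale and once with an Italian one.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of Spark.
public class LocaleMonthParsing {

    // Parses text such as "2018 giu" using the pattern "yyyy MMM" and the
    // locale named by languageTag (for example "en-US" or "it").
    // Returns the date as yyyy-MM-dd, or null when the text cannot be parsed,
    // mirroring the null values shown in the report's result table.
    public static String toIsoDate(String text, String languageTag) {
        Locale locale = Locale.forLanguageTag(languageTag);
        SimpleDateFormat parser = new SimpleDateFormat("yyyy MMM", locale);
        SimpleDateFormat printer = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);

        // Use one fixed time zone for parsing and printing so the result
        // does not depend on the machine's default zone.
        TimeZone utc = TimeZone.getTimeZone("UTC");
        parser.setTimeZone(utc);
        printer.setTimeZone(utc);

        try {
            return printer.format(parser.parse(text));
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String[] samples = {"2018 giu", "2018 apr", "2017 dic"};
        for (String sample : samples) {
            System.out.println(sample
                    + " | en-US: " + toIsoDate(sample, "en-US")
                    + " | it: " + toIsoDate(sample, "it"));
        }
    }
}
```

The locale is passed as a parameter here, so the caller chooses it. Whether a given abbreviation is recognised still depends on the locale data shipped with the JDK in use.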