Glen Maisey created SPARK-16239:
-----------------------------------

             Summary: SQL issues with cast from date to string around daylight savings time
                 Key: SPARK-16239
                 URL: https://issues.apache.org/jira/browse/SPARK-16239
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.6.1
            Reporter: Glen Maisey
            Priority: Critical


Hi all,

I have a DataFrame with a date column. When I cast it to a string using the 
Spark SQL cast function, it converts it to the wrong date on certain days. 
Looking into it, this occurs once a year, on the day daylight saving time 
starts.

I've tried to show the issue in the code below. The toString() column shows 
the correct date whereas the cast does not.

Unfortunately my users write SQL rather than Scala DataFrame code, so this 
workaround does not apply to them. The problem was actually picked up when a 
user wrote something like "SELECT date1 UNION ALL select date2", where date1 
was a string and date2 was a date type. The date must be implicitly converted 
to a string, which produces the wrong date.

I'm in the Australia/Sydney timezone (see the time changes here 
http://www.timeanddate.com/time/zone/australia/sydney) 

// Explicit import so that col(...) resolves
import org.apache.spark.sql.functions.col

val dates = Array("2014-10-03", "2014-10-04", "2014-10-05", "2014-10-06",
                  "2015-10-02", "2015-10-03", "2015-10-04", "2015-10-05")
val df = sc.parallelize(dates)
            .toDF("txn_date")
            .select(col("txn_date").cast("Date"))

df.select(
        col("txn_date"),
        col("txn_date").cast("Timestamp").alias("txn_date_timestamp"),
        col("txn_date").cast("String").alias("txn_date_str_cast"),
        // toString() is applied to the literal "txn_date", so this column is
        // the original date column under a new name
        col("txn_date".toString()).alias("txn_date_str_toString")
        )
    .show()

+----------+--------------------+-----------------+---------------------+
|  txn_date|  txn_date_timestamp|txn_date_str_cast|txn_date_str_toString|
+----------+--------------------+-----------------+---------------------+
|2014-10-03|2014-10-02 14:00:...|       2014-10-03|           2014-10-03|
|2014-10-04|2014-10-03 14:00:...|       2014-10-04|           2014-10-04|
|2014-10-05|2014-10-04 13:00:...|       2014-10-04|           2014-10-05|
|2014-10-06|2014-10-05 13:00:...|       2014-10-06|           2014-10-06|
|2015-10-02|2015-10-01 14:00:...|       2015-10-02|           2015-10-02|
|2015-10-03|2015-10-02 14:00:...|       2015-10-03|           2015-10-03|
|2015-10-04|2015-10-03 13:00:...|       2015-10-03|           2015-10-04|
|2015-10-05|2015-10-04 13:00:...|       2015-10-05|           2015-10-05|
+----------+--------------------+-----------------+---------------------+
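One possible explanation, which is my assumption and not something confirmed 
above, is that the date (stored as days since 1970-01-01) is turned into a 
timestamp by subtracting the zone offset looked up at UTC midnight of that 
date rather than at local midnight. On the day daylight saving time starts in 
Sydney those two offsets differ by an hour, so the result lands at 23:00 on 
the previous day. The sketch below is plain Scala with no Spark dependency. 
The helpers toEpochDays, naiveDaysToMillis, render and roundTrip are 
hypothetical names written for this illustration and are not Spark's code.

```scala
import java.text.SimpleDateFormat
import java.util.{Date, TimeZone}

val sydney = TimeZone.getTimeZone("Australia/Sydney")
val utc = TimeZone.getTimeZone("UTC")
val MillisPerDay = 24L * 60 * 60 * 1000

// Parse a yyyy-MM-dd string into days since 1970-01-01 (a plain calendar date).
def toEpochDays(date: String): Long = {
  val fmt = new SimpleDateFormat("yyyy-MM-dd")
  fmt.setTimeZone(utc)
  fmt.parse(date).getTime / MillisPerDay
}

// Hypothetical conversion (an assumption, not taken from Spark): the offset is
// looked up at UTC midnight of the date instead of at local midnight.
def naiveDaysToMillis(days: Long, tz: TimeZone): Long = {
  val millisUtc = days * MillisPerDay
  millisUtc - tz.getOffset(millisUtc)
}

// Format an instant as local wall-clock time in the given zone.
def render(millis: Long, tz: TimeZone): String = {
  val fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm")
  fmt.setTimeZone(tz)
  fmt.format(new Date(millis))
}

// Date -> instant -> local wall-clock string, all in the Sydney zone.
def roundTrip(date: String): String =
  render(naiveDaysToMillis(toEpochDays(date), sydney), sydney)

Seq("2014-10-04", "2014-10-05", "2014-10-06").foreach { d =>
  println(s"$d -> ${roundTrip(d)}")
}
// 2014-10-04 -> 2014-10-04 00:00
// 2014-10-05 -> 2014-10-04 23:00
// 2014-10-06 -> 2014-10-06 00:00
```

The 23:00 result for 2014-10-05 corresponds to 13:00 UTC on 2014-10-04, which 
lines up with the 13:00 value in the txn_date_timestamp column above. That is 
consistent with this guess but does not prove it.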


