Hyukjin Kwon created SPARK-22632:
------------------------------------

             Summary: Fix the behavior of timestamp values for R's DataFrame to respect session timezone
                 Key: SPARK-22632
                 URL: https://issues.apache.org/jira/browse/SPARK-22632
             Project: Spark
          Issue Type: Bug
          Components: SparkR, SQL
    Affects Versions: 2.3.0
            Reporter: Hyukjin Kwon


Note: the wording is borrowed from SPARK-22395. The symptom is similar, and I 
think that JIRA describes it well.

When converting between R's DataFrame and a Spark DataFrame using 
{{createDataFrame}} or {{collect}}, timestamp values respect the R system 
timezone instead of the session timezone.

For example, say we use "America/Los_Angeles" as the session timezone and 
have a timestamp value "1970-01-01 00:00:01" in that timezone. I'm in South 
Korea, so the R system timezone is "KST".
The current collect() returns the following:

```r
> sparkR.session(master = "local[*]", sparkConfig = list(spark.sql.session.timeZone = "America/Los_Angeles"))
> collect(sql("SELECT cast(cast(28801 as timestamp) as string) as ts"))
                   ts
1 1970-01-01 00:00:01
> collect(sql("SELECT cast(28801 as timestamp) as ts"))
                   ts
1 1970-01-01 17:00:01
```

As you can see, the value becomes "1970-01-01 17:00:01" because it respects the 
R system timezone rather than the session timezone.
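
The arithmetic behind the two results can be checked in base R without Spark. 
This is only a sketch of the timezone conversion, not of SparkR internals: the 
variable names are made up for illustration, and "Asia/Seoul" is assumed to be 
the zone behind the "KST" system timezone mentioned above.

```r
# Epoch value used in the report: 28801 seconds after 1970-01-01 00:00:00 UTC.
ts <- as.POSIXct(28801, origin = "1970-01-01", tz = "UTC")

fmt <- "%Y-%m-%d %H:%M:%S"

# The same instant rendered in the session timezone (what the string cast shows).
session_view <- format(ts, format = fmt, tz = "America/Los_Angeles")

# The same instant rendered in the assumed R system timezone (what collect() shows).
system_view <- format(ts, format = fmt, tz = "Asia/Seoul")

print(session_view)  # "1970-01-01 00:00:01"
print(system_view)   # "1970-01-01 17:00:01"
```

Both strings describe the same instant; only the timezone used for rendering 
differs, which matches the 17-hour gap between the two collect() results.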



