[jira] [Commented] (SPARK-22632) Fix the behavior of timestamp values for R's DataFrame to respect session timezone

2018-09-10 Thread Felix Cheung (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-22632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610083#comment-16610083
 ] 

Felix Cheung commented on SPARK-22632:
--

A mismatch between the R and JVM time zones could be an issue, but it is not a blocker 
for the release. Let's move this to 3.0.

> Fix the behavior of timestamp values for R's DataFrame to respect session 
> timezone
> --
>
> Key: SPARK-22632
> URL: https://issues.apache.org/jira/browse/SPARK-22632
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR, SQL
>Affects Versions: 2.3.0
>Reporter: Hyukjin Kwon
>Priority: Major
>
> Note: the wording is borrowed from SPARK-22395. The symptom is similar, and I think 
> that JIRA describes it well.
>
> When converting between an R data.frame and a Spark DataFrame using 
> {{createDataFrame}} or {{collect}}, timestamp values respect the R system timezone 
> instead of the session timezone.
>
> For example, say we use "America/Los_Angeles" as the session timezone and have a 
> timestamp value "1970-01-01 00:00:01" in that timezone. By the way, I'm in 
> South Korea, so the R timezone would be "KST".
>
> The timestamp value from the current collect() is the following:
> {code}
> > sparkR.session(master = "local[*]", sparkConfig = list(spark.sql.session.timeZone = "America/Los_Angeles"))
> > collect(sql("SELECT cast(cast(28801 as timestamp) as string) as ts"))
>                    ts
> 1 1970-01-01 00:00:01
> > collect(sql("SELECT cast(28801 as timestamp) as ts"))
>                    ts
> 1 1970-01-01 17:00:01
> {code}
> As you can see, the second value becomes "1970-01-01 17:00:01" because it respects 
> the R system timezone rather than the session timezone.
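
A minimal SparkR sketch of the usual workaround, assuming it is acceptable to pin R's 
own time zone to the same zone as {{spark.sql.session.timeZone}} for the session (the 
{{Sys.setenv(TZ = ...)}} call and the zone name below are illustrative, not the fix 
proposed by this issue):

{code}
library(SparkR)

# Workaround sketch: make R's local time zone match the Spark session time zone,
# so the POSIXct values returned by collect() render in the same zone Spark uses.
Sys.setenv(TZ = "America/Los_Angeles")

sparkR.session(master = "local[*]",
               sparkConfig = list(spark.sql.session.timeZone = "America/Los_Angeles"))

# With both zones aligned, the two queries from the description agree:
collect(sql("SELECT cast(cast(28801 as timestamp) as string) as ts"))  # "1970-01-01 00:00:01"
collect(sql("SELECT cast(28801 as timestamp) as ts"))                  #  1970-01-01 00:00:01
{code}

This only masks the mismatch for a single, fixed session time zone; the actual fix 
would be for {{createDataFrame}}/{{collect}} to convert using 
{{spark.sql.session.timeZone}} directly, as SPARK-22395 did on the Python/Pandas side.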






[jira] [Commented] (SPARK-22632) Fix the behavior of timestamp values for R's DataFrame to respect session timezone

2018-09-10 Thread Wenchen Fan (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-22632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609292#comment-16609292
 ] 

Wenchen Fan commented on SPARK-22632:
-

Is this still a problem now?







[jira] [Commented] (SPARK-22632) Fix the behavior of timestamp values for R's DataFrame to respect session timezone

2018-01-08 Thread Sameer Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16316915#comment-16316915
 ] 

Sameer Agarwal commented on SPARK-22632:


Thanks, guys. I'll move this to 2.4.0.







[jira] [Commented] (SPARK-22632) Fix the behavior of timestamp values for R's DataFrame to respect session timezone

2018-01-07 Thread Felix Cheung (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16315436#comment-16315436
 ] 

Felix Cheung commented on SPARK-22632:
--

Yes. First, I'd agree we should generalize this to R and Python.
Second, I think the different treatment of time zones between the host language and 
Spark has in general been a source of confusion (it has been reported at least a few 
times).
Lastly, this isn't a regression AFAIK, so it's not necessarily a blocker for 2.3, 
although it might be very good to have.








[jira] [Commented] (SPARK-22632) Fix the behavior of timestamp values for R's DataFrame to respect session timezone

2018-01-04 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312356#comment-16312356
 ] 

Hyukjin Kwon commented on SPARK-22632:
--

To me, no, I don't think so, although it might be important to have. In the case of 
the related PySpark <-> Pandas issue, it was fixed with a configuration to control the 
behaviour.

I was trying to take a look at it at the time, but I am not sure it's safe to have 
this at this stage, or that I can make it within the 2.3.0 timeline ... PySpark 
itself still has the issue too, FYI.








[jira] [Commented] (SPARK-22632) Fix the behavior of timestamp values for R's DataFrame to respect session timezone

2018-01-04 Thread Sameer Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312287#comment-16312287
 ] 

Sameer Agarwal commented on SPARK-22632:


[~hyukjin.kwon] [~felixcheung] should this be a blocker for 2.3?

cc [~ueshin]







[jira] [Commented] (SPARK-22632) Fix the behavior of timestamp values for R's DataFrame to respect session timezone

2017-12-21 Thread Felix Cheung (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301006#comment-16301006
 ] 

Felix Cheung commented on SPARK-22632:
--

How are we doing on this for 2.3?







[jira] [Commented] (SPARK-22632) Fix the behavior of timestamp values for R's DataFrame to respect session timezone

2017-11-30 Thread Felix Cheung (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274029#comment-16274029
 ] 

Felix Cheung commented on SPARK-22632:
--

Interesting, re: time zones on macOS:
https://cran.r-project.org/src/base/NEWS




