[ https://issues.apache.org/jira/browse/SPARK-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136567#comment-16136567 ]

Vinayak edited comment on SPARK-18350 at 8/30/17 12:56 PM:
-----------------------------------------------------------

[~ueshin]
I have set the value below to change the session timezone to UTC, but Spark is still adding the current timezone offset to the parsed timestamps even though the input is already in UTC.

spark.conf.set("spark.sql.session.timeZone", "UTC")
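
A quick sanity check (a trivial sketch, not part of the attached repro) that the setting took effect:

// should return "UTC" after the spark.conf.set(...) call above
spark.conf.get("spark.sql.session.timeZone")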

Please find the attached CSV data for reference.

Expected: the time should remain the same as the input, since it is already in UTC format.

var df1 = spark.read
  .option("delimiter", ",")
  .option("qualifier", "\"")
  .option("inferSchema", "true")
  .option("header", "true")
  .option("mode", "PERMISSIVE")
  .option("timestampFormat", "MM/dd/yyyy'T'HH:mm:ss.SSS")
  .option("dateFormat", "MM/dd/yyyy'T'HH:mm:ss")
  .csv("DateSpark.csv")

df1: org.apache.spark.sql.DataFrame = [Name: string, Age: int ... 5 more fields]

scala> df1.show(false);


+----+---+----+-------------------+-------------------+----------------------+-------------------+
|Name|Age|Add |Date               |SparkDate          |SparkDate1            |SparkDate2         |
+----+---+----+-------------------+-------------------+----------------------+-------------------+
|abc |21 |bvxc|04/22/2017T03:30:02|2017-03-21 03:30:02|2017-03-21 09:00:02.02|2017-03-21 05:30:00|
+----+---+----+-------------------+-------------------+----------------------+-------------------+
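
For comparison, here is a minimal sketch of pinning the parsing timezone at the reader level as well. This assumes the per-source "timeZone" option that the CSV reader gained alongside this change (it defaults to the session timezone); the file name, delimiter and formats are the same as in the repro above:

// Sketch: read the same file with the timestamp parsing timezone pinned to UTC
// at the reader level, independently of spark.sql.session.timeZone.
val df2 = spark.read
  .option("delimiter", ",")
  .option("header", "true")
  .option("inferSchema", "true")
  .option("timestampFormat", "MM/dd/yyyy'T'HH:mm:ss.SSS")
  .option("dateFormat", "MM/dd/yyyy'T'HH:mm:ss")
  .option("timeZone", "UTC")   // assumed per-source option; defaults to the session timezone
  .csv("DateSpark.csv")        // same attached file as above
df2.show(false)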



> Support session local timezone
> ------------------------------
>
>                 Key: SPARK-18350
>                 URL: https://issues.apache.org/jira/browse/SPARK-18350
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Reynold Xin
>            Assignee: Takuya Ueshin
>              Labels: releasenotes
>             Fix For: 2.2.0
>
>         Attachments: sample.csv
>
>
> As of Spark 2.1, Spark SQL assumes the machine timezone for datetime 
> manipulation, which is bad if users are not in the same timezones as the 
> machines, or if different users have different timezones.
> We should introduce a session local timezone setting that is used for 
> execution.
> An explicit non-goal is locale handling.
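
For illustration, a minimal sketch (assuming the spark.sql.session.timeZone setting that shipped with this feature in 2.2.0) of how the session timezone changes what the same instant renders as:

// The same instant, displayed under two different session timezones.
spark.conf.set("spark.sql.session.timeZone", "UTC")
spark.sql("SELECT current_timestamp() AS now").show(false)   // rendered in UTC

spark.conf.set("spark.sql.session.timeZone", "Asia/Kolkata")
spark.sql("SELECT current_timestamp() AS now").show(false)   // same instant, rendered at UTC+05:30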


