[jira] [Comment Edited] (SPARK-18350) Support session local timezone
[ https://issues.apache.org/jira/browse/SPARK-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202322#comment-16202322 ] Alexandre Dupriez edited comment on SPARK-18350 at 10/12/17 5:30 PM:
-
Hello all, I have a use case where a {{Dataset}} contains a column of type {{java.sql.Timestamp}} (let's call it {{_time}}), which I use to derive new columns carrying the year, month, day and hour specified by the {{_time}} column, with something like:
{code:java}
session.read.schema(mySchema)
  .json(path)
  .withColumn("year", year($"_time"))
  .withColumn("month", month($"_time"))
  .withColumn("day", dayofmonth($"_time"))
  .withColumn("hour", hour($"_time"))
{code}
using the standard {{year}}, {{month}}, {{dayofmonth}} and {{hour}} functions defined in {{org.apache.spark.sql.functions}}. Now let's assume the timezone is row-dependent, and let's call {{_tz}} the column which contains it. Because the timezone varies at the row level, I cannot configure the {{DataFrameWriter}} with a {{timeZone}} option. I wondered if something like this would be advisable:
{code:java}
session.read.schema(mySchema)
  .json(path)
  .withColumn("year", year($"_time"))
  .withColumn("month", month($"_time"))
  .withColumn("day", dayofmonth($"_time"))
  .withColumn("hour", hour($"_time", $"_tz"))
{code}
Looking at the definition of the {{hour}} function, it uses an {{Hour}} expression which can be constructed with an optional {{timeZoneId}}. I have been trying to create an {{Hour}} expression directly, but it is a Spark-internal construct and the API forbids using it that way. I guess providing a function {{hour(t: Column, tz: Column)}} alongside the existing {{hour(t: Column)}} would not be a satisfying design. Do you think an elegant solution exists for this use case? Or is my methodology flawed, i.e. should I not derive the hour from a timestamp column when it relies on a row-dependent, not-predefined time zone like this?
> Support session local timezone
> ------------------------------
>
>                 Key: SPARK-18350
>                 URL: https://issues.apache.org/jira/browse/SPARK-18350
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Reynold Xin
>            Assignee: Takuya Ueshin
>              Labels: releasenotes
>             Fix For: 2.2.0
>
>         Attachments: sample.csv
>
>
> As of Spark 2.1, Spark SQL assumes the machine timezone for datetime
> manipulation, which is bad if users are not in the same timezones as the
> machines, or if different users have different timezones.
> We should introduce a session local timezone setting that is used for
> execution.
> An explicit non-goal is locale handling.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
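As a side note on the question above: since the built-in {{hour(Column)}} offers no per-row time-zone parameter, one workaround (not part of Spark's API; names here are hypothetical) is to wrap the per-row computation in a UDF. The computation itself is plain {{java.time}}, sketched below:

```java
import java.time.Instant;
import java.time.ZoneId;

public class HourInZone {
    // Hypothetical helper: the hour of an epoch-millis timestamp interpreted
    // in a per-row zone. In Spark this could back a UDF registered with
    // udf(...), since hour(Column) itself cannot take a time-zone column.
    static int hourInZone(long epochMillis, String zoneId) {
        return Instant.ofEpochMilli(epochMillis)
                      .atZone(ZoneId.of(zoneId))
                      .getHour();
    }

    public static void main(String[] args) {
        long utcMidnight = 0L; // 1970-01-01T00:00:00Z
        System.out.println(hourInZone(utcMidnight, "UTC"));          // 0
        System.out.println(hourInZone(utcMidnight, "Asia/Kolkata")); // 5 (UTC+05:30)
    }
}
```

This sidesteps the internal {{Hour}} expression entirely, at the usual cost of a UDF being opaque to the optimizer.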
[jira] [Comment Edited] (SPARK-18350) Support session local timezone
[ https://issues.apache.org/jira/browse/SPARK-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136567#comment-16136567 ] Vinayak edited comment on SPARK-18350 at 8/30/17 12:56 PM:
-
[~ueshin] I have set the value below to set the timezone to UTC, but Spark still applies the current machine timezone even though the input is already in UTC format.
{code:java}
spark.conf.set("spark.sql.session.timeZone", "UTC")
{code}
Find the attached csv data for reference. Expected: the time should remain the same as the input, since it is already in UTC.
{code}
var df1 = spark.read.option("delimiter", ",")
  .option("qualifier", "\"")
  .option("inferSchema", "true")
  .option("header", "true")
  .option("mode", "PERMISSIVE")
  .option("timestampFormat", "MM/dd/'T'HH:mm:ss.SSS")
  .option("dateFormat", "MM/dd/'T'HH:mm:ss")
  .csv("DateSpark.csv")
df1: org.apache.spark.sql.DataFrame = [Name: string, Age: int ... 5 more fields]

scala> df1.show(false)
+----+---+----+-------------------+-------------------+----------------------+-------------------+
|Name|Age|Add |Date               |SparkDate          |SparkDate1            |SparkDate2         |
+----+---+----+-------------------+-------------------+----------------------+-------------------+
|abc |21 |bvxc|04/22/2017T03:30:02|2017-03-21 03:30:02|2017-03-21 09:00:02.02|2017-03-21 05:30:00|
+----+---+----+-------------------+-------------------+----------------------+-------------------+
{code}
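The shift reported in the comment above comes down to which zone the parser assumes when turning a zone-less wall-clock string into an instant. A minimal plain-Java illustration (not Spark code, and the string/zone values are just examples):

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class ZoneParseDemo {
    // The same wall-clock string maps to different epoch instants depending
    // on which zone the parser attaches to it; the returned Duration is the
    // offset between the "interpreted in `zone`" and "interpreted in UTC"
    // readings of the same string.
    static Duration shift(String text, String zone) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
        LocalDateTime local = LocalDateTime.parse(text, fmt);
        Instant asUtc = local.atZone(ZoneOffset.UTC).toInstant();
        Instant asZone = local.atZone(ZoneId.of(zone)).toInstant();
        return Duration.between(asZone, asUtc);
    }

    public static void main(String[] args) {
        // Interpreting the string in IST (UTC+05:30) yields an instant 330
        // minutes earlier than interpreting it in UTC.
        System.out.println(shift("2017-03-21 03:30:02", "Asia/Kolkata").toMinutes()); // 330
    }
}
```

If the input really is UTC, the parse must attach UTC (or the session time zone must be honoured at parse time) for the displayed value to match the input.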
[jira] [Comment Edited] (SPARK-18350) Support session local timezone
[ https://issues.apache.org/jira/browse/SPARK-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938096#comment-15938096 ] Giorgio Massignani edited comment on SPARK-18350 at 3/23/17 11:29 AM:
-
I'd like to share what we did to handle the Oracle _TIMESTAMP WITH TIME ZONE_ type. We are looking to upgrade to the latest Spark version, but since nothing has changed in this area, we implemented it on Spark 1.6.1 with Scala. In our case we build _StructType_ and _StructField_ programmatically, creating DataFrames from RDDs.

The first problem with timezones: how do you embed the timezone in a timestamp column? Our workaround was to create a new type, _TimestampTz_, with _UserDefinedType_ and _Kryo_ serialisers:
{code:java}
@SQLUserDefinedType(udt = classOf[TimestampTzUdt])
@DefaultSerializer(classOf[TimestampTzKryo])
class TimestampTz(val time: Long, val timeZoneId: String)
{code}
The second problem: how do you customise Spark where it calls _PreparedStatement.setXXX_? This forced us to create a new _DataFrameWriter_, duplicating its code, because it is a _final class_. The _CustomDataFrameWriter_ then has to call _JdbcUtils_, where the customisation belongs, so we created a _CustomJdbcUtils_, a proxy of _JdbcUtils_ that changes only the place where it calls _PreparedStatement.setTimestamp_:
{code:java}
case TimestampTzUdt =>
  val timestampTz = row.getAs[TimestampTz](i)
  val cal = timestampTz.getCalendar
  stmt.setTimestamp(i + 1, new java.sql.Timestamp(timestampTz.time), cal)
{code}
It would be perfect if the Oracle driver worked as we expected, sending the timezone to the column. However, to make it work we need to call an Oracle-specific class:
{code:java}
case TimestampTzUdt =>
  val timestampTz = row.getAs[TimestampTz](i)
  val cal = timestampTz.getCalendar
  if (isOracle)
    stmt.setObject(i + 1, new oracle.sql.TIMESTAMPTZ(conn, new java.sql.Timestamp(timestampTz.time), cal))
  else
    stmt.setTimestamp(i + 1, new java.sql.Timestamp(timestampTz.time), cal)
{code}
In summary, what do we expect from Spark? Some shortcuts that make it easier to customise Spark SQL for cases like these.
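The _TimestampTz_ carrier described above is essentially an epoch instant paired with the zone it belongs to. A plain-Java sketch of that shape (the _getCalendar_ helper is assumed from the comment's JDBC snippets; it is not a Spark or JDBC API):

```java
import java.util.Calendar;
import java.util.TimeZone;

public class TimestampTzSketch {
    // Sketch of the carrier: epoch millis plus a zone id. getCalendar builds
    // the Calendar the JDBC write path would pass to
    // PreparedStatement.setTimestamp(i + 1, new java.sql.Timestamp(time), cal).
    static class TimestampTz {
        final long time;          // epoch millis
        final String timeZoneId;  // e.g. "America/New_York" (example value)

        TimestampTz(long time, String timeZoneId) {
            this.time = time;
            this.timeZoneId = timeZoneId;
        }

        Calendar getCalendar() {
            return Calendar.getInstance(TimeZone.getTimeZone(timeZoneId));
        }
    }

    public static void main(String[] args) {
        TimestampTz tz = new TimestampTz(0L, "America/New_York");
        System.out.println(tz.getCalendar().getTimeZone().getID()); // America/New_York
    }
}
```

Keeping the zone next to the instant is what lets the writer choose between the generic `setTimestamp(..., cal)` path and the Oracle-specific `TIMESTAMPTZ` path shown above.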
[jira] [Comment Edited] (SPARK-18350) Support session local timezone
[ https://issues.apache.org/jira/browse/SPARK-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938096#comment-15938096 ] Giorgio Massignani edited comment on SPARK-18350 at 3/23/17 11:29 AM: -- I'd like to share what we did to solve the oracle _TIMESTAMP WITH TIME ZONE_ We are looking for to upgrade to the latest spark version, but because it hasn't changes about it, we did in the _spark 1.6.1_ with scala. In our case, we are creating _StructType_ and _StructField_ programatically creating DataFrames from RDDs. The first problem with the TimeZones are, how to send the TimeZone embedded into a Timestamp column? Our workaround was creating the a new type _TimestampTz_ which has _UserDefinedType_ and _Kryo_ serialisers. {code:java} @SQLUserDefinedType(udt = classOf[TimestampTzUdt]) @DefaultSerializer(classOf[TimestampTzKryo]) class TimestampTz(val time: Long, val timeZoneId:String) {code} The second problem, how to customise spark when it is call _PreparedStatement.setXXX_? It makes me create a new _DataFrameWriter_ duplicating the code because it is a _final class_ With a _CustomDataFrameWriter_ it has to to call the _JdbcUtils_ where the customisation should be done. We created a _CustomJdbcUtils_ which is a Proxy of _JdbcUtils_ but with a change only where it call the _PreparedStatement.setTimestamp_ {code:java} case TimestampTzUdt => val timestampTz = row.getAs[TimestampTz](i) val cal = timestampTz.getCalendar stmt.setTimestamp(i + 1, new java.sql.Timestamp(timestampTz.time), cal) {code} It would be perfect if the oracle driver worked as we expected, sending the timezone to the column. However, to work, we need to call a specific oracle class. 
{code:java}
case TimestampTzUdt =>
  val timestampTz = row.getAs[TimestampTz](i)
  val cal = timestampTz.getCalendar
  if (isOracle)
    stmt.setObject(i + 1, new oracle.sql.TIMESTAMPTZ(conn, new java.sql.Timestamp(timestampTz.time), cal))
  else
    stmt.setTimestamp(i + 1, new java.sql.Timestamp(timestampTz.time), cal)
{code}
In summary, what do we expect from Spark? Some extension points that make it easier to customise Spark SQL for cases like these.
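The snippets above rely on a {{TimestampTz.getCalendar}} helper that is never shown. A minimal JDK-only sketch of what it presumably does, assuming the stored {{timeZoneId}} is a standard zone id (the class and field names come from the comment; the method body is an assumption): {{PreparedStatement.setTimestamp(int, Timestamp, Calendar)}} uses the calendar's time zone to interpret the epoch millis, which is the whole point of carrying the zone on the row.

```java
import java.util.Calendar;
import java.util.TimeZone;

public class TimestampTzSketch {
    // Hypothetical equivalent of TimestampTz.getCalendar: build a Calendar
    // whose TimeZone matches the row-level zone id, so the JDBC driver
    // interprets the timestamp in that zone rather than the JVM default.
    static Calendar calendarFor(String timeZoneId) {
        // Note: TimeZone.getTimeZone silently falls back to GMT for
        // unrecognised ids, so invalid zone ids should be validated upstream.
        return Calendar.getInstance(TimeZone.getTimeZone(timeZoneId));
    }

    public static void main(String[] args) {
        Calendar cal = calendarFor("Europe/London");
        System.out.println(cal.getTimeZone().getID()); // Europe/London
    }
}
```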
> Support session local timezone > -- > > Key: SPARK-18350 > URL: https://issues.apache.org/jira/browse/SPARK-18350 > Project: Spark > Issue Type: New Feature > Components: SQL > Reporter: Reynold Xin > Assignee: Takuya Ueshin > Labels: releasenotes > Fix For: 2.2.0 > > > As of Spark 2.1, Spark SQL assumes the machine timezone for datetime > manipulation, which is bad if users are not in the same timezones as the > machines, or if different users have different timezones. > We should introduce a session local timezone setting that is used for > execution. > An explicit non-goal is locale handling.
[jira] [Comment Edited] (SPARK-18350) Support session local timezone
[ https://issues.apache.org/jira/browse/SPARK-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15649858#comment-15649858 ] Xiao Li edited comment on SPARK-18350 at 11/9/16 5:26 AM: -- The following might be needed if we want to support a session timezone:
- Add a SQL statement and API to set the current session timezone. Link: https://docs.oracle.com/cd/B19306_01/server.102/b14225/ch4datetime.htm#i1006728
- Add a SQL statement and API to get the current session timezone. Link: https://www.ibm.com/support/knowledgecenter/SSEPEK_10.0.0/sqlref/src/tpc/db2z_currenttz.html
- Add time-zone-specific expressions. Link: http://www.ibm.com/support/knowledgecenter/SSEPEK_10.0.0/sqlref/src/tpc/db2z_tzspecificexpression.html
More work is needed if we want to add a new data type, {{TIMESTAMP WITH TIME ZONE}}. Link: https://docs.oracle.com/cd/B19306_01/server.102/b14225/ch4datetime.htm#i1005946
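The issue description's core problem, which the proposals above address, can be shown with plain JDK time APIs (not the Spark API): the same instant yields different calendar fields depending on which zone is used for rendering, so whether an engine uses the machine zone or a session zone changes the result of expressions like {{hour}}.

```java
import java.time.Instant;
import java.time.ZoneId;

public class SessionTzDemo {
    // The hour-of-day for one instant depends entirely on the zone chosen
    // to render it; a "session timezone" setting would pin this choice
    // instead of inheriting the JVM/machine default.
    static int hourIn(long epochMillis, String zoneId) {
        return Instant.ofEpochMilli(epochMillis)
                .atZone(ZoneId.of(zoneId))
                .getHour();
    }

    public static void main(String[] args) {
        long t = 0L; // 1970-01-01T00:00:00Z
        System.out.println(hourIn(t, "UTC"));        // 0
        System.out.println(hourIn(t, "Asia/Tokyo")); // 9
    }
}
```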
> Support session local timezone > -- > > Key: SPARK-18350 > URL: https://issues.apache.org/jira/browse/SPARK-18350 > Project: Spark > Issue Type: New Feature > Components: SQL > Reporter: Reynold Xin > > As of Spark 2.1, Spark SQL assumes the machine timezone for datetime > manipulation, which is bad if users are not in the same timezones as the > machines, or if different users have different timezones. > We should introduce a session local timezone setting that is used for > execution. > An explicit non-goal is locale handling. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org