[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps

2017-11-09 Thread zivanfi
Github user zivanfi commented on the issue: https://github.com/apache/spark/pull/19250 Yes, that is correct. We introduced the table property to address the 2nd problem I mentioned above: "The adjustment depends on the local timezone." (details in my [previous comm

[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps

2017-11-08 Thread zivanfi
Github user zivanfi commented on the issue: https://github.com/apache/spark/pull/19250 Yes, you understand correctly, the table property affects both the read path and the write path, while the current workaround used by Hive and Impala only affects the read path. (Both are Parquet

[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps

2017-11-08 Thread zivanfi
Github user zivanfi commented on the issue: https://github.com/apache/spark/pull/19250 Hive and Impala introduced the following workaround for timestamp interoperability long ago: The footer of the Parquet file contains metadata about the library that wrote the file. For Hive and

[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps

2017-11-07 Thread zivanfi
Github user zivanfi commented on the issue: https://github.com/apache/spark/pull/19250 The interoperability issue is that Impala follows timezone-agnostic timestamp semantics as mandated by the SQL standard, while SparkSQL follows UTC-normalized semantics instead (which is not SQL

[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps

2017-10-11 Thread zivanfi
Github user zivanfi commented on the issue: https://github.com/apache/spark/pull/19250 @attilajeges has just found a problem with the behavior specified in the requirements: * Partitions of a table can use different file formats. * As a result, a single table can have data

[GitHub] spark pull request #19250: [SPARK-12297] Table timezone correction for Times...

2017-10-09 Thread zivanfi
Github user zivanfi commented on a diff in the pull request: https://github.com/apache/spark/pull/19250#discussion_r143462649 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala --- @@ -266,6 +267,10 @@ final class DataFrameWriter[T] private[sql](ds

[GitHub] spark pull request #19250: [SPARK-12297] Table timezone correction for Times...

2017-10-06 Thread zivanfi
Github user zivanfi commented on a diff in the pull request: https://github.com/apache/spark/pull/19250#discussion_r143257840 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -1213,6 +1213,71 @@ case class

[GitHub] spark pull request #16781: [SPARK-12297][SQL][POC] Hive compatibility for Pa...

2017-03-07 Thread zivanfi
Github user zivanfi commented on a diff in the pull request: https://github.com/apache/spark/pull/16781#discussion_r104673553 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/ParquetHiveCompatibilitySuite.scala --- @@ -137,8 +141,190 @@ class

[GitHub] spark pull request #16781: [SPARK-12297][SQL][POC] Hive compatibility for Pa...

2017-03-07 Thread zivanfi
Github user zivanfi commented on a diff in the pull request: https://github.com/apache/spark/pull/16781#discussion_r104668320 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java --- @@ -89,11 +92,23

[GitHub] spark issue #16781: [SPARK-12297][SQL][POC] Hive compatibility for Parquet T...

2017-03-07 Thread zivanfi
Github user zivanfi commented on the issue: https://github.com/apache/spark/pull/16781 Please update the pull request description, because the one dated Feb 2 does not correspond to the fix any more.

[GitHub] spark pull request #16781: [SPARK-12297][SQL][POC] Hive compatibility for Pa...

2017-03-07 Thread zivanfi
Github user zivanfi commented on a diff in the pull request: https://github.com/apache/spark/pull/16781#discussion_r104660877 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java --- @@ -89,11 +92,23

[GitHub] spark pull request #16781: [SPARK-12297][SQL][POC] Hive compatibility for Pa...

2017-03-07 Thread zivanfi
Github user zivanfi commented on a diff in the pull request: https://github.com/apache/spark/pull/16781#discussion_r104664187 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -674,6 +674,12 @@ object SQLConf { .stringConf