Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/19250
@squito we can close this, right?
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands,
Github user squito commented on the issue:
https://github.com/apache/spark/pull/19250
@cloud-fan yes, I can do that; it will be next week before I get to it.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19250
OK @squito, can we send a new PR to do it? Basically, in the parquet read task,
get the writer info from the footer. If the writer is Impala and a config is
set, we treat the seconds as seconds from
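A minimal sketch of the check described in this comment, assuming the footer's `created_by` string identifies the writer and that Impala-written files start it with `impala`. The function names and the boolean config flag are hypothetical, not Spark APIs, and the sketch only decides whether the special handling applies, not what that handling is:

```python
from typing import Optional


def writer_is_impala(created_by: Optional[str]) -> bool:
    """Return True if the footer's created_by string looks like Impala.

    Assumption: Impala-written files carry a created_by value that
    starts with "impala" (case-insensitive). Missing metadata counts
    as "not Impala".
    """
    if created_by is None:
        return False
    return created_by.strip().lower().startswith("impala")


def should_apply_impala_handling(created_by: Optional[str],
                                 conf_enabled: bool) -> bool:
    """Apply the special timestamp handling only when both conditions hold:
    the (hypothetical) config is enabled and the file was written by Impala."""
    return conf_enabled and writer_is_impala(created_by)
```

Keeping the decision in a pure function like this makes it easy to evaluate once per file in the read task, since the footer is per file rather than per row.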
Github user zivanfi commented on the issue:
https://github.com/apache/spark/pull/19250
Yes, that is correct. We introduced the table property to address the 2nd
problem I mentioned above: "The adjustment depends on the local timezone."
(details in my [previous
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19250
Actually, I took a look at #16781. It also proposed a table property
instead of a simple Spark config.
Github user zivanfi commented on the issue:
https://github.com/apache/spark/pull/19250
Yes, you understand correctly, the table property affects both the read
path and the write path, while the current workaround used by Hive and Impala
only affects the read path. (Both are
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19250
IIUC, using the `parquet.timezone-adjustment` table property requires
changing the writer. E.g., if Impala creates a table and Hive wants to write data
to it, then Hive needs to write
Github user zivanfi commented on the issue:
https://github.com/apache/spark/pull/19250
Hive and Impala introduced the following workaround for timestamp
interoperability long ago: The footer of the Parquet file contains metadata
about the library that wrote the file. For Hive and
Github user squito commented on the issue:
https://github.com/apache/spark/pull/19250
> I think we can follow what Hive/Impala did for interoperability, i.e.
create a config to interpret parquet INT96 as timezone-agnostic timestamp in
parquet reader of Spark.
If I understand
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19250
Ah, now I understand this issue. Yes, Spark doesn't follow the SQL standard:
the Spark timestamp is actually TIMESTAMP WITH LOCAL TIME ZONE, which is not
SQL standard but used in some databases
Github user zivanfi commented on the issue:
https://github.com/apache/spark/pull/19250
The interoperability issue is that Impala follows timezone-agnostic
timestamp semantics as mandated by the SQL standard, while SparkSQL follows
UTC-normalized semantics instead (which is not
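The difference between the two semantics named in this comment can be illustrated with a small sketch. The fixed UTC offsets below are illustrative stand-ins for a writer's and a reader's local time zones, and the function names are hypothetical; this is not how any of the engines is implemented, only a model of the observable behavior:

```python
from datetime import datetime, timedelta, timezone

# Illustrative fixed offsets (assumptions): a writer at UTC-8, a reader at UTC+1.
WRITER_TZ = timezone(timedelta(hours=-8))
READER_TZ = timezone(timedelta(hours=1))


def utc_normalized_roundtrip(wall_clock, writer_tz, reader_tz):
    """UTC-normalized semantics: the writer interprets the wall-clock value
    in its own zone and stores the corresponding UTC instant; the reader
    renders that instant in its own zone."""
    stored_instant = wall_clock.replace(tzinfo=writer_tz).astimezone(timezone.utc)
    return stored_instant.astimezone(reader_tz).replace(tzinfo=None)


def timezone_agnostic_roundtrip(wall_clock, writer_tz, reader_tz):
    """Timezone-agnostic semantics: the wall-clock value is stored and
    returned unchanged; neither zone plays a role."""
    return wall_clock


written = datetime(2017, 1, 1, 0, 0, 0)
seen_normalized = utc_normalized_roundtrip(written, WRITER_TZ, READER_TZ)
seen_agnostic = timezone_agnostic_roundtrip(written, WRITER_TZ, READER_TZ)
```

Under these assumptions the reader sees a value shifted by the nine-hour difference between the two zones in the UTC-normalized case, and the original wall-clock value in the timezone-agnostic case.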
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19250
What's the interoperability issue with Impala? I think both Spark and
Impala store timestamp as parquet INT96, representing nanoseconds from epoch,
there is no timezone confusion. Internally
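For reference, a minimal decoding sketch of the INT96 timestamp layout as it is commonly written by these engines: 8 bytes of nanoseconds within the day followed by a 4-byte Julian day number, both little-endian. The function name is hypothetical. Note that the result is a naive date-time; whether it denotes a UTC instant or a wall-clock value is exactly the question under discussion in this thread:

```python
import struct
from datetime import datetime, timedelta

# Julian day number of 1970-01-01, the Unix epoch.
JULIAN_DAY_OF_UNIX_EPOCH = 2440588


def decode_int96(raw: bytes) -> datetime:
    """Decode a 12-byte INT96 timestamp into a naive datetime.

    Layout: little-endian int64 nanoseconds-of-day, then little-endian
    int32 Julian day. Sub-microsecond precision is truncated because
    Python's datetime only resolves microseconds.
    """
    if len(raw) != 12:
        raise ValueError("INT96 values are exactly 12 bytes")
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days_since_epoch = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    return datetime(1970, 1, 1) + timedelta(
        days=days_since_epoch, microseconds=nanos_of_day // 1000
    )
```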
Github user squito commented on the issue:
https://github.com/apache/spark/pull/19250
@cloud-fan I think you misunderstand the purpose of this change.
The primary purpose is actually to deal with parquet, where that option
doesn't do anything. We need this for parquet for
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19250
Why is this patch so complicated? Based on the fact that data sources
accept a "timezone" option for read/write, I'd expect it to be just:
* when `CreateTable`, set session local
Github user zivanfi commented on the issue:
https://github.com/apache/spark/pull/19250
@attilajeges has just found a problem with the behavior specified in the
requirements:
* Partitions of a table can use different file formats.
* As a result, a single table can have data
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82610/
Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Merged build finished. Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #82610 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82610/testReport)**
for PR 19250 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #82610 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82610/testReport)**
for PR 19250 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82595/
Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #82595 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82595/testReport)**
for PR 19250 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82592/
Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Merged build finished. Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #82592 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82592/testReport)**
for PR 19250 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #82595 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82595/testReport)**
for PR 19250 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #82592 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82592/testReport)**
for PR 19250 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82561/
Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Merged build finished. Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #82561 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82561/testReport)**
for PR 19250 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #82561 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82561/testReport)**
for PR 19250 at commit
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/19250
I pushed a bunch of changes to address feedback here and also my own
feedback that I didn't bother to write down. Main changes:
- cache the TimeZone instances in generated code
- rename
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/19250
FYI Imran is probably going to be out for a few weeks so I'll try to
address the feedback here. It would be nice to have people take a look at this,
though.
Github user squito commented on the issue:
https://github.com/apache/spark/pull/19250
@HyukjinKwon you might be interested in this one also
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82037/
Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #82037 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82037/testReport)**
for PR 19250 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #82037 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82037/testReport)**
for PR 19250 at commit
Github user squito commented on the issue:
https://github.com/apache/spark/pull/19250
Also cc @yhuai @liancheng: I would appreciate a review, since you've looked at
SQL & Hive compatibility in the past.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81943/
Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #81943 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81943/testReport)**
for PR 19250 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #81943 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81943/testReport)**
for PR 19250 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81929/
Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Merged build finished. Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #81929 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81929/testReport)**
for PR 19250 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81931/
Test FAILed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Merged build finished. Test FAILed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #81931 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81931/testReport)**
for PR 19250 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #81931 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81931/testReport)**
for PR 19250 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #81929 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81929/testReport)**
for PR 19250 at commit
Github user squito commented on the issue:
https://github.com/apache/spark/pull/19250
Hi @ueshin @rxin, could you please review? thanks!
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81892/
Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #81892 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81892/testReport)**
for PR 19250 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #81892 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81892/testReport)**
for PR 19250 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81888/
Test FAILed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Merged build finished. Test FAILed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #81888 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81888/testReport)**
for PR 19250 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #81888 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81888/testReport)**
for PR 19250 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81833/
Test FAILed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #81833 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81833/testReport)**
for PR 19250 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #81833 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81833/testReport)**
for PR 19250 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81830/
Test FAILed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19250
Merged build finished. Test FAILed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #81830 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81830/testReport)**
for PR 19250 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19250
**[Test build #81830 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81830/testReport)**
for PR 19250 at commit