Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12247 )

Change subject: IMPALA-5051: Add INT64 timestamp write support in Parquet
......................................................................


Patch Set 9: Code-Review+1

(5 comments)

http://gerrit.cloudera.org:8080/#/c/12247/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12247/8//COMMIT_MSG@39
PS8, Line 39: without conversion to UTC
> The main problem with normalizing to UTC from Impala's standpoint is the pe
OK, thanks for the explanation for both of you!


http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/exec/parquet/hdfs-parquet-table-writer.cc
File be/src/exec/parquet/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/exec/parquet/hdfs-parquet-table-writer.cc@579
PS8, Line 579: result_
> result_ shouldn't be a local variable, as we return a pointer to it. This i
Ah, I completely missed that.


http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/exec/parquet/parquet-metadata-utils.cc@142
PS8, Line 142: /// converted_type is not set because Impala always writes 
timestamps without UTC
> That's true, but it has a metadata field to tell the caller the desired sem
Sorry, I was really thinking about Hive and Spark.

But it doesn't really matter, since now we have this 'isAdjustedToUTC' field. 
And Zoltan also said in one of his comments that "'pure' TIMESTAMP and 
TIMESTAMP WITHOUT TIME ZONE shall not be normalized to UTC"


http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/runtime/timestamp-value.inline.h
File be/src/runtime/timestamp-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/runtime/timestamp-value.inline.h@154
PS8, Line 154: // TODO: consider optimizing this (IMPALA-8268)
             :   kudu::int128_t nanos128 =
             :     static_cast<kudu::int128_t>(unixtime_seconds) * NANOS_PER_SEC
             :     + time_.fractional_seconds();
             :
             :   if (nanos128 <  std::numeric_limits<int64_t>::min()
> I created a ticket for benchmarking and optimizing these new functions: IMP
OK, thanks!


http://gerrit.cloudera.org:8080/#/c/12247/8/testdata/workloads/functional-query/queries/QueryTest/parquet-int64-timestamps.test
File 
testdata/workloads/functional-query/queries/QueryTest/parquet-int64-timestamps.test:

http://gerrit.cloudera.org:8080/#/c/12247/8/testdata/workloads/functional-query/queries/QueryTest/parquet-int64-timestamps.test@102
PS8, Line 102: ---- QUERY
             : create table int96_nanos (ts timestamp) stored as parquet;
             : ====
             : ---- QUERY
             : # Insert edge values as "normal" int96 timestamps that can 
represent all values.
             : set parquet_timestamp_type=INT96_NANOS;
             : insert into int96_nanos values
> Thanks for the tip!
You're welcome!



--
To view, visit http://gerrit.cloudera.org:8080/12247
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib41ad532ec902ed5a9a1528513726eac1c11441f
Gerrit-Change-Number: 12247
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi+ger...@cloudera.com>
Gerrit-Comment-Date: Fri, 01 Mar 2019 15:43:45 +0000
Gerrit-HasComments: Yes

Reply via email to