Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12247 )

Change subject: IMPALA-5051: Add INT64 timestamp write support in Parquet
......................................................................


Patch Set 8:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/12247/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12247/8//COMMIT_MSG@39
PS8, Line 39: without conversion to UTC
Whouldn't it be better to convert to UTC? You write later that old readers also 
assume that the data is written in UTC.


http://gerrit.cloudera.org:8080/#/c/12247/8//COMMIT_MSG@39
PS8, Line 39: tha
nit: the


http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/exec/parquet/hdfs-parquet-table-writer.cc
File be/src/exec/parquet/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/exec/parquet/hdfs-parquet-table-writer.cc@579
PS8, Line 579: result_
What about deleting the member 'result_', and only have it here as a local 
variable?

Now it looks strange that we invoke a member function here and pass a member 
variable to it. Also, this way we will have some aliasing inside 
ConvertTimestamp() that might prevent some compiler optimizations, but I'm not 
really sure honestly.


http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/exec/parquet/parquet-metadata-utils.h
File be/src/exec/parquet/parquet-metadata-utils.h:

http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/exec/parquet/parquet-metadata-utils.h@60
PS8, Line 60: Return
nit: Returns


http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/exec/parquet/parquet-metadata-utils.cc@142
PS8, Line 142: /// converted_type is not set because Impala always writes 
timestamps without UTC
Does Parquet-MR also write INT64 timestamps un-normalized?


http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/runtime/timestamp-value.inline.h
File be/src/runtime/timestamp-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/runtime/timestamp-value.inline.h@154
PS8, Line 154: kudu::int128_t nanos128 =
             :     static_cast<kudu::int128_t>(unixtime_seconds) * NANOS_PER_SEC
             :     + time_.fractional_seconds();
             :
             :   if (nanos128 <  std::numeric_limits<int64_t>::min()
             :       || nanos128 >  std::numeric_limits<int64_t>::max()) return 
false;
I think we can still avoid using int128_t.


http://gerrit.cloudera.org:8080/#/c/12247/8/testdata/workloads/functional-query/queries/QueryTest/parquet-int64-timestamps.test
File 
testdata/workloads/functional-query/queries/QueryTest/parquet-int64-timestamps.test:

http://gerrit.cloudera.org:8080/#/c/12247/8/testdata/workloads/functional-query/queries/QueryTest/parquet-int64-timestamps.test@102
PS8, Line 102: ---- QUERY
             : create table int96_nanos (ts timestamp) stored as parquet;
             : ====
             : ---- QUERY
             : # Insert edge values as "normal" int96 timestamps that can 
represent all values.
             : set parquet_timestamp_type=INT96_NANOS;
             : insert into int96_nanos values
nit: you dont't need to start a new QUERY block for each query when you don't 
check the results.
E.g. it could be:

 ====
 ---- QUERY
 create table...
 set ...
 insert into ...
 create table ...
 ====



--
To view, visit http://gerrit.cloudera.org:8080/12247
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib41ad532ec902ed5a9a1528513726eac1c11441f
Gerrit-Change-Number: 12247
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi+ger...@cloudera.com>
Gerrit-Comment-Date: Thu, 28 Feb 2019 15:35:07 +0000
Gerrit-HasComments: Yes

Reply via email to