Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9986 )

Change subject: IMPALA-3307: Add support for IANA time-zone db
......................................................................


Patch Set 4:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/9986/4/CMakeLists.txt
File CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/9986/4/CMakeLists.txt@281
PS4, Line 281: Cctz
> nit: CCTZ
Done


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@139
PS4, Line 139: new_time_zone_(time_zone), new_tz_
> From reading the names of these 2 variables it's not clear what de differen
Done


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@140
PS4, Line 140: /*overwrite*/
> Do you need this comment?
Removed it


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@153
PS4, Line 153: Timezone *
> nit: Timezone* Expect..()
Done


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@155
PS4, Line 155: new_tz_ = TimezoneDatabase::FindTimezone(new_time_zone_);
> I wonder if it makes sense to do this assignment in the constructor and the
The main reason I implemented the class this way was that some tests that use 
'ScopedTimeZoneOverride' don't need the actual Timezone pointer.

On the other hand, checking always that the timezone name is valid doesn't hurt 
anyone. Done.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc@503
PS4, Line 503: namespace
> Why did you need this namespace?
Putting  stuff into an unnamed namespace is a common C++ idiom: it says that 
everything in the unnamed namespace is "local" to this translation unit. 
They're not visible from the outside, and their names won't clash with names in 
other translation units.

Impala uses unnamed namespaces elsewhere too.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc@504
PS4, Line 504: / TODO
> What is the plan to get rid of the "Duplicate code" TODOs in this review?
I removed the TODO comment.

I cannot think of a better place for this function. We can create a shared 
class for this function only but that would be awkward.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc
File be/src/exprs/timestamp-functions.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc@55
PS4, Line 55:     // This should raise some sort of error or at least return 
null. Hive just ignores it.
> Shouldn't this be a TODO?
Fixed the comment.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc@87
PS4, Line 87:     // This should raise some sort of error or at least return 
null. Hive just ignores it.
> Same as above
Fixed the comment.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc
File be/src/exprs/timezone_db-test.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@57
PS4, Line 57: TzAbbev
> nit: TzAbbrev?
Done


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@68
PS4, Line 68:   // Abbreviations must start with an uppercase letter.
> If it has to start with an uppercase letter, can we add a test this with an
Done


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@105
PS4, Line 105:   // Misformatted time-zone names.
> Can you again play around with upper vs lower case letters here?
Done


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h
File be/src/exprs/timezone_db.h:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h@1
PS4, Line 1: // with the License.  You may obtain a copy of the License at
> Hmm, is the top of the Apache comment missing?
Not sure what happened there. I probably inadvertently executed some mysterious 
vim command :)


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h@91
PS4, Line 91: tz_seg
> nit: might be just my preference but tz_segment is still short and I think
Done


http://gerrit.cloudera.org:8080/#/c/9986/1/be/src/exprs/timezone_db.cc
File be/src/exprs/timezone_db.cc:

http://gerrit.cloudera.org:8080/#/c/9986/1/be/src/exprs/timezone_db.cc@147
PS1, Line 147:
> What I meant is that you could get rid of the offset_sec paramater if you c
I understand.

The problem is that then we would have to allocate an int64_t in the function 
and return a pointer to it wrapped in unique_ptr<int64_t> to avoid memory 
leaks. Seems more pain then gain.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc
File be/src/exprs/timezone_db.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc@365
PS4, Line 365:   hdfsFile hdfs_file = hdfsOpenFile(
> Just for my information: LoadZoneInfoFromHdfs() copies the zone info file t
LoadZoneInfoFromHdfs() uses cctz::load_time_zone() to load time-zone files into 
memory. cctz::load_time_zone() works only with local filesystem paths, it 
cannot grab files from HDFS directly.

LoadZoneAbbreviationsfromHdfs() is a lot simpler. It can read the config file 
directly from HDFS, creating a temp file on the local filesystem is not 
necessary.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc@391
PS4, Line 391:             ErrorMsg(TErrorCode::GENERAL,
> nit: I'm not exactly sure about the rules, but I feel that this line could
Done


http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/runtime/runtime-state.cc@136
PS5, Line 136:     local_time_zone_ = &TimezoneDatabase::GetUtcTimezone();
> This has already been set to GetUtcTimezone() in the constructor, right?
True, but I wanted to be more explicit here.

I was also thinking about setting 'local_time_zone_' to 'nullptr' in the 
constructor and then setting it properly in Init(), but I wanted to make clear 
to the reader that 'local_time_zone_' is always set to a valid Timezone address.

TimezoneDatabase::GetUtcTimezone() call is inexpensive, it just returns a 
static const reference.



--
To view, visit http://gerrit.cloudera.org:8080/9986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77
Gerrit-Change-Number: 9986
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Jeges <atti...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <atti...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Fri, 11 May 2018 12:40:54 +0000
Gerrit-HasComments: Yes

Reply via email to