Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/9986 )
Change subject: IMPALA-3307: Add support for IANA time-zone db ...................................................................... Patch Set 4: (18 comments) http://gerrit.cloudera.org:8080/#/c/9986/4/CMakeLists.txt File CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/9986/4/CMakeLists.txt@281 PS4, Line 281: Cctz > nit: CCTZ Done http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@139 PS4, Line 139: new_time_zone_(time_zone), new_tz_ > From reading the names of these 2 variables it's not clear what de differen Done http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@140 PS4, Line 140: /*overwrite*/ > Do you need this comment? Removed it http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@153 PS4, Line 153: Timezone * > nit: Timezone* Expect..() Done http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@155 PS4, Line 155: new_tz_ = TimezoneDatabase::FindTimezone(new_time_zone_); > I wonder if it makes sense to do this assignment in the constructor and the The main reason I implemented the class this way was that some tests that use 'ScopedTimeZoneOverride' don't need the actual Timezone pointer. On the other hand, checking always that the timezone name is valid doesn't hurt anyone. Done. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc File be/src/exprs/timestamp-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc@503 PS4, Line 503: namespace > Why did you need this namespace? Putting stuff into an unnamed namespace is a common C++ idiom: it says that everything in the unnamed namespace is "local" to this translation unit. They're not visible from the outside, and their names won't clash with names in other translation units. Impala uses unnamed namespaces elsewhere too. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc@504 PS4, Line 504: / TODO > What is the plan to get rid of the "Duplicate code" TODOs in this review? I removed the TODO comment. I cannot think of a better place for this function. We can create a shared class for this function only but that would be awkward. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc File be/src/exprs/timestamp-functions.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc@55 PS4, Line 55: // This should raise some sort of error or at least return null. Hive just ignores it. > Shouldn't this be a TODO? Fixed the comment. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc@87 PS4, Line 87: // This should raise some sort of error or at least return null. Hive just ignores it. > Same as above Fixed the comment. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc File be/src/exprs/timezone_db-test.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@57 PS4, Line 57: TzAbbev > nit: TzAbbrev? Done http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@68 PS4, Line 68: // Abbreviations must start with an uppercase letter. > If it has to start with an uppercase letter, can we add a test this with an Done http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@105 PS4, Line 105: // Misformatted time-zone names. > Can you again play around with upper vs lower case letters here? Done http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h File be/src/exprs/timezone_db.h: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h@1 PS4, Line 1: // with the License. You may obtain a copy of the License at > Hmm, is the top of the Apache comment missing? Not sure what happened there. I probably inadvertently executed some mysterious vim command :) http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h@91 PS4, Line 91: tz_seg > nit: might be just my preference but tz_segment is still short and I think Done http://gerrit.cloudera.org:8080/#/c/9986/1/be/src/exprs/timezone_db.cc File be/src/exprs/timezone_db.cc: http://gerrit.cloudera.org:8080/#/c/9986/1/be/src/exprs/timezone_db.cc@147 PS1, Line 147: > What I meant is that you could get rid of the offset_sec paramater if you c I understand. The problem is that then we would have to allocate an int64_t in the function and return a pointer to it wrapped in unique_ptr<int64_t> to avoid memory leaks. Seems more pain then gain. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc File be/src/exprs/timezone_db.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc@365 PS4, Line 365: hdfsFile hdfs_file = hdfsOpenFile( > Just for my information: LoadZoneInfoFromHdfs() copies the zone info file t LoadZoneInfoFromHdfs() uses cctz::load_time_zone() to load time-zone files into memory. cctz::load_time_zone() works only with local filesystem paths, it cannot grab files from HDFS directly. LoadZoneAbbreviationsfromHdfs() is a lot simpler. It can read the config file directly from HDFS, creating a temp file on the local filesystem is not necessary. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc@391 PS4, Line 391: ErrorMsg(TErrorCode::GENERAL, > nit: I'm not exactly sure about the rules, but I feel that this line could Done http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/runtime/runtime-state.cc@136 PS5, Line 136: local_time_zone_ = &TimezoneDatabase::GetUtcTimezone(); > This has already been set to GetUtcTimezone() in the constructor, right? True, but I wanted to be more explicit here. I was also thinking about setting 'local_time_zone_' to 'nullptr' in the constructor and then setting it properly in Init(), but I wanted to make clear to the reader that 'local_time_zone_' is always set to a valid Timezone address. TimezoneDatabase::GetUtcTimezone() call is inexpensive, it just returns a static const reference. -- To view, visit http://gerrit.cloudera.org:8080/9986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77 Gerrit-Change-Number: 9986 Gerrit-PatchSet: 4 Gerrit-Owner: Attila Jeges <atti...@cloudera.com> Gerrit-Reviewer: Attila Jeges <atti...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Fri, 11 May 2018 12:40:54 +0000 Gerrit-HasComments: Yes