Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/9986 )
Change subject: IMPALA-3307: Add support for IANA time-zone db ...................................................................... Patch Set 5: (19 comments) Thanks for dealing with my previous comments, Attila! I filed some more but basically I'm fine with the changes and feel free to start involving someone with more experience on this topic. http://gerrit.cloudera.org:8080/#/c/9986/4/CMakeLists.txt File CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/9986/4/CMakeLists.txt@281 PS4, Line 281: Cctz nit: CCTZ http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@139 PS4, Line 139: new_time_zone_(time_zone), new_tz_ >From reading the names of these 2 variables it's not clear what de difference >is. Can you have a new_time_zone_ and a new_time_zone_name_ or such? http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@140 PS4, Line 140: /*overwrite*/ Do you need this comment? http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@153 PS4, Line 153: Timezone * nit: Timezone* Expect..() http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@155 PS4, Line 155: new_tz_ = TimezoneDatabase::FindTimezone(new_time_zone_); I wonder if it makes sense to do this assignment in the constructor and then this function can be changed to something like "getTimezone" that simply returns the corresponding member. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc File be/src/exprs/timestamp-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc@503 PS4, Line 503: namespace Why did you need this namespace? http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc@504 PS4, Line 504: nline What is the plan to get rid of the "Duplicate code" TODOs in this review? http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc File be/src/exprs/timestamp-functions.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc@55 PS4, Line 55: // This should raise some sort of error or at least return null. Hive just ignores it. Shouldn't this be a TODO? http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc@87 PS4, Line 87: // This should raise some sort of error or at least return null. Hive just ignores it. Same as above http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc File be/src/exprs/timezone_db-test.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@57 PS4, Line 57: TzAbbev nit: TzAbbrev? http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@68 PS4, Line 68: // Abbreviations must start with an uppercase letter. If it has to start with an uppercase letter, can we add a test this with an input that would be valid if the first letter was uppercase? e.g. "pST", "pst", "singapore" etc. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@105 PS4, Line 105: // Misformatted time-zone names. Can you again play around with upper vs lower case letters here? http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h File be/src/exprs/timezone_db.h: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h@1 PS4, Line 1: // with the License. You may obtain a copy of the License at Hmm, is the top of the Apache comment missing? http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h@23 PS4, Line 23: /// 'TimezoneDatabase' class contains functions to load and access the IANA time-zone Nice description, thx :) http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h@91 PS4, Line 91: tz_seg nit: might be just my preference but tz_segment is still short and I think better to read. Up to you. http://gerrit.cloudera.org:8080/#/c/9986/1/be/src/exprs/timezone_db.cc File be/src/exprs/timezone_db.cc: http://gerrit.cloudera.org:8080/#/c/9986/1/be/src/exprs/timezone_db.cc@147 PS1, Line 147: > Returning a pointer just to be able to signal failure with nullptr seems co What I meant is that you could get rid of the offset_sec paramater if you changed the return value to int64_t* and you would still be able to detect failure if the return value is null. Not a big deal, though. Your choice. http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc File be/src/exprs/timezone_db.cc: http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc@365 PS4, Line 365: hdfsFile hdfs_file = hdfsOpenFile( Just for my information: LoadZoneInfoFromHdfs() copies the zone info file to a local dir, meanwhile this function uses hdfsOpenFile to get the abbrev data. Is there a reason that they don't follow the same approach? http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc@391 PS4, Line 391: ErrorMsg(TErrorCode::GENERAL, nit: I'm not exactly sure about the rules, but I feel that this line could be merged with the previous one. http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/runtime/runtime-state.cc@136 PS5, Line 136: local_time_zone_ = &TimezoneDatabase::GetUtcTimezone(); This has already been set to GetUtcTimezone() in the constructor, right? -- To view, visit http://gerrit.cloudera.org:8080/9986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77 Gerrit-Change-Number: 9986 Gerrit-PatchSet: 5 Gerrit-Owner: Attila Jeges <atti...@cloudera.com> Gerrit-Reviewer: Attila Jeges <atti...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Thu, 10 May 2018 18:20:13 +0000 Gerrit-HasComments: Yes