David Ribeiro Alves has posted comments on this change. Change subject: WIP: Add a new TIMESTAMP type ......................................................................
Patch Set 3: (44 comments) Addressed all comments on the TimestampValue class (posted in another patch). Remaining stuff (not that much) will come in follow ups. http://gerrit.cloudera.org:8080/#/c/5819/3/src/kudu/client/scan_batch.h File src/kudu/client/scan_batch.h: Line 31: #include "kudu/util/timestamp_value.h" > is it possible to move this to common/ ? it seems like it's not so general I only had this in util/ cuz that's where slice was. Done PS3, Line 219: their > 'its'? Done PS3, Line 220: getter > getters Done http://gerrit.cloudera.org:8080/#/c/5819/3/src/kudu/client/scanner-internal.cc File src/kudu/client/scanner-internal.cc: PS3, Line 325: LOG(INFO) << "ScanRequest: " << scan->DebugString(); : LOG(INFO) << "PartitionKey: " << partition_key; : > I don't think you meant to keep these Done http://gerrit.cloudera.org:8080/#/c/5819/3/src/kudu/client/value-internal.h File src/kudu/client/value-internal.h: Line 30: class KuduValue::Data { > error: use of undeclared identifier 'KuduValue' [clang-diagnostic-error] Done Line 30: class KuduValue::Data { > not your fault, but I guess we need to be including value.h or somesuch Done Line 46: TimestampValue timestamp_val_; > Looking back at the original code review where this was implemented, slice_ Done http://gerrit.cloudera.org:8080/#/c/5819/3/src/kudu/client/value.cc File src/kudu/client/value.cc: PS3, Line 90: auto > nit: can you use 'auto*' here so it's more obvious it's a pointer? did this for the other methods too and pushed another patch. Line 209: *val_void = ×tamp_val_; > is this also a reasonable place to be checking that the value is well-forme Done http://gerrit.cloudera.org:8080/#/c/5819/3/src/kudu/client/value.h File src/kudu/client/value.h: Line 54: static KuduValue* FromTimestamp(TimestampValue t); > warning: function 'kudu::client::KuduValue::FromTimestamp' has a definition Done Line 54: static KuduValue* FromTimestamp(TimestampValue t); > add doxygen Done http://gerrit.cloudera.org:8080/#/c/5819/3/src/kudu/common/key_encoder.h File src/kudu/common/key_encoder.h: Line 131: uint32_t u_date; > could we just call: we sure can, missed that, thanks Line 145: static void EncodeWithSeparators(const void* key, bool is_last, Buffer* dst) { > warning: parameter 'is_last' is unused [misc-unused-parameters] Done http://gerrit.cloudera.org:8080/#/c/5819/3/src/kudu/util/timestamp_value.cc File src/kudu/util/timestamp_value.cc: Line 24: #include <sstream> > warning: #includes are not sorted properly [llvm-include-order] Done Line 42: Status ToPTime(const int32_t* date_val, const int64_t* time_duration_val, ptime* pt) { > can be in an anonymous namespace? (also a few things below) Done PS3, Line 49: *reinterpret_cast<const date*>(date_val) > I think bit_cast<date>(date_val) would be safer and also inherently include Done Line 60: *time_duration_val)); > include e.what() or something? removed. PS3, Line 65: class > why's the class keyword necessary here? that's odd Done Line 68: *date_val = *reinterpret_cast<int32_t*>(&date_temp); > same bitcast Done Line 73: if (PREDICT_FALSE(gmtime_r(&unix_time, &temp_tm) == NULL)) { > A few weeks back I benchmarked different ways of going from time_t to ptime this has the problem of only allowing values of unix_time that fit an int32_t. Since the speed is quite different I created a fast/slow path. Line 90: Status TimestampValue::FromUnixTime(int64_t unix_time, int64_t nanos, TimestampValue* out) { > it seems this is attempting to allow any range of 'nanos'. maybe the code c Done Line 96: temp += boost::posix_time::nanoseconds(nanos); > error: no member named 'nanoseconds' in namespace 'boost::posix_time' [clan same Line 111: if (PREDICT_FALSE(!t.IsValid())) { > isn't there already a DCHECK to this effect in the above constructor? seems removed the check on the ctor now all From() methods make sure the generated timestamp is valid and the client methods do the same Line 128: pt += boost::posix_time::nanoseconds(1); > error: no member named 'nanoseconds' in namespace 'boost::posix_time' [clan hum apparently tidy bot is not picking up the definition that enables nanosecond resolution for boost. Line 139: pt -= boost::posix_time::nanoseconds(1); > error: no member named 'nanoseconds' in namespace 'boost::posix_time' [clan same PS3, Line 138: CHECK_OK(ToPTime(&date_, &time_duration_, &pt)); : pt -= boost::posix_time::nanoseconds(1); : if (!pt.is_not_a_date_time()) { : FromPTime(pt, &date_, &time_duration_); : return true; : } : > actually wondering if this would be simpler to implement by just doing: this is slightly wrong (nanos_ must be set to 86399999999999L) but changed to be this way. Line 149: ptime min = boost::posix_time::ptime(boost::posix_time::min_date_time); > any idea what this actually represents? maybe clearer to just hard-code the Done Line 168: if (!pt.time_of_day().is_special()) { > how do we get "special" values? removed this. Line 175: bool TimestampValue::IsConsecutive(const TimestampValue& other) const { > again think this might be clearer just using the underlying fields instead made this use the other methods. Line 178: next_pt += boost::posix_time::nanoseconds(1); > error: no member named 'nanoseconds' in namespace 'boost::posix_time' [clan same http://gerrit.cloudera.org:8080/#/c/5819/3/src/kudu/util/timestamp_value.h File src/kudu/util/timestamp_value.h: > Since this is client facing, here are a few other things to keep in mind: thanks for the pointers. my responses: 1 - I just inlined the hottest functions since this class is also used on the server side. These functions only call other inline functions that, in turn, only use the plain date_ and time_duration_ fields. If you feel that is acceptable I would leave it as is but document my reasoning. 2 - This is meant to be memory compatible with impala's timestamp, meaning that they have to have the same fields in the same order and occupy the same space in memory. I'm actually changing the impala format to match this one (theirs isn't packed). Line 17: #pragma once > I think with client-facing headers we have to use old-style include guards. Done Line 19: #include <glog/logging.h> > this probably needs the various incantations that other client-facing heade Done Line 26: class TimestampValue { > this needs a KUDU_EXPORT, right? Done PS3, Line 28: local > We should add lots of extra-verbose documentation about timezones with this Done Line 36: // Returns a non-OK status if the conversion could not be made. > should document any "well-formedness" requirements and allowable range. Is Done Line 39: static Status FromDateAndTimeDuration(int32_t date, int64_t time_duration, TimestampValue* out); > document the parameters here. Also the bit above about well-formedness Done PS3, Line 41: // The minimum possible value for a TimestampValue; : static TimestampValue Min(); : : // The maximum possible value for a TimestampValue; : static TimestampValue Max(); > are these valid values to actually insert? or only valid in the sense of a these are valid values to insert but users shouldn't have a use for them, added the macro and moved this to the new private api section Line 47: // "Raw" ctor, for internal use only. > I think there is a doxygen way to un-document this. Or maybe make it privat moved this to pvt PS3, Line 77: strings > typo Done Line 85: } else if (*this > other) { > warning: don't use else after return [readability-else-after-return] Done PS3, Line 92: // Returns whether 'other' is consecutive to 'this'. : bool IsConsecutive(const TimestampValue& other) const; > I think the naming and docs are slightly confusing here - not 100% obvious Done Line 113: int32_t date_; > document valid range Done Line 115: // 8 bytes to store nanoseconds within the current day. > should document the invariant that this is in the range (0..1e9] Done -- To view, visit http://gerrit.cloudera.org:8080/5819 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I63271adf6b87f048e30bf2d6d04c758884752afc Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro Alves <dral...@apache.org> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: David Ribeiro Alves <dral...@apache.org> Gerrit-Reviewer: Jean-Daniel Cryans <jdcry...@apache.org> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-HasComments: Yes