Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12109 )
Change subject: tablet: add support for diff scan iterator ...................................................................... Patch Set 1: (12 comments) http://gerrit.cloudera.org:8080/#/c/12109/1/src/kudu/common/schema.cc File src/kudu/common/schema.cc: http://gerrit.cloudera.org:8080/#/c/12109/1/src/kudu/common/schema.cc@387 PS1, Line 387: if (!col.has_read_default()) { : return Status::InvalidArgument(Substitute("Virtual column $0 $1 must have a default value", : col.name(), col.TypeToString())); : } Why is this important? What should the invariant be for virtual columns? For IS_DELETED specifically, a read default of false simplifies the implementation, but if we wanted to we could allow for no default at all and ensure that iterators always set the value (currently the deltas code will set false->true for DELETEs and true->false for REINSERTs, but we'd need to also set it to false when projecting base data). http://gerrit.cloudera.org:8080/#/c/12109/1/src/kudu/common/schema.cc@392 PS1, Line 392: TODO(mpercy): Is this safe? > @adar any idea offhand on this one? if not i will dig in more on this Seems OK. max_col_id_ is immutable in Schema once constructed, so there's no danger of the same schema somehow reusing the next ID for something else. Furthermore, once you call Tablet::GetMappedReadProjection(), you get a safe pointer to a Schema object that is never freed, even if the Tablet's schema is changed via AlterTablet. http://gerrit.cloudera.org:8080/#/c/12109/1/src/kudu/tablet/diff_scan-test.cc File src/kudu/tablet/diff_scan-test.cc: http://gerrit.cloudera.org:8080/#/c/12109/1/src/kudu/tablet/diff_scan-test.cc@18 PS1, Line 18: #include <stdint.h> Nit: cstdint http://gerrit.cloudera.org:8080/#/c/12109/1/src/kudu/tablet/diff_scan-test.cc@58 PS1, Line 58: tablet::MvccSnapshot snap1; You're in kudu::tablet::, so why do you need this prefix? http://gerrit.cloudera.org:8080/#/c/12109/1/src/kudu/tablet/diff_scan-test.cc@70 PS1, Line 70: // 3. Reinsert the rows and flush the MRS. Nit: It's not really a REINSERT though; it's just an INSERT into a new MRS (and then a new DRS). http://gerrit.cloudera.org:8080/#/c/12109/1/src/kudu/tablet/diff_scan-test.cc@75 PS1, Line 75: std::vector<std::string> rows; using http://gerrit.cloudera.org:8080/#/c/12109/1/src/kudu/tablet/diff_scan-test.cc@96 PS1, Line 96: ASSERT_OK(tablet->NewRowIterator(projection, If you expose RowIteratorOptions at this level, you could expose the ghosts by just setting include_deleted_rows=true; no need to add a virtual column. Granted, you'll probably want the virtual column later to make sure the MergeIterator did the right thing. http://gerrit.cloudera.org:8080/#/c/12109/1/src/kudu/tablet/tablet.h File src/kudu/tablet/tablet.h: http://gerrit.cloudera.org:8080/#/c/12109/1/src/kudu/tablet/tablet.h@199 PS1, Line 199: // Create a new row iterator for some historical snapshot. Update the doc. Would be good to specify why we aren't using RowIteratorOptions here; is it because we have to pass 'projection' by cref and not by pointer? http://gerrit.cloudera.org:8080/#/c/12109/1/src/kudu/tablet/tablet.h@575 PS1, Line 575: member Specifically, just the pointers in 'opts'. Everything else is a value type. http://gerrit.cloudera.org:8080/#/c/12109/1/src/kudu/tablet/tablet.h@795 PS1, Line 795: Schema schema_; Nit: why rename from projection_? It's more appropriate given this is an iterator. http://gerrit.cloudera.org:8080/#/c/12109/1/src/kudu/tablet/tablet.cc File src/kudu/tablet/tablet.cc: http://gerrit.cloudera.org:8080/#/c/12109/1/src/kudu/tablet/tablet.cc@420 PS1, Line 420: RowIteratorOptions opts; Need to also set an IOContext, no? Oh, I see you're setting those in the Iterator constructor, once the Schema and IOContext are initialized. Could you add a comment about that here, since it's a little unusual? http://gerrit.cloudera.org:8080/#/c/12109/1/src/kudu/tablet/tablet.cc@428 PS1, Line 428: for (const auto& c : projection.columns()) { You can use Schema::find_first_is_deleted_virtual_column. -- To view, visit http://gerrit.cloudera.org:8080/12109 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9aa0ce2276bdd37688e7a91a2efcf49a8f802eb5 Gerrit-Change-Number: 12109 Gerrit-PatchSet: 1 Gerrit-Owner: Mike Percy <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Wed, 19 Dec 2018 18:19:28 +0000 Gerrit-HasComments: Yes
