Mike Percy has posted comments on this change.

Change subject: KUDU-236 (part 1). Implement tablet history GC
......................................................................


Patch Set 19:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/3076/19/src/kudu/tablet/compaction-test.cc
File src/kudu/tablet/compaction-test.cc:

Line 66:   return { HistoryGcOpts::GC_DISABLED, Timestamp::kInvalidTimestamp };
> is there special handling for kInvalidTimestamp? I think kInvalidTimestamp 
Replaced with Timestamp(0)


Line 217: 
> nit: no need for blank lines
Done


Line 387:       vector<rowid_t> gced_input_rows;
> unused
Done


Line 585:   HistoryGcOpts history_gc_opts = { HistoryGcOpts::GC_DISABLED, 
Timestamp(0) };
> use NoHistoryGC()?
Done


Line 639:   HistoryGcOpts history_gc_opts = { HistoryGcOpts::GC_DISABLED, 
Timestamp(0) };
> same
Done


http://gerrit.cloudera.org:8080/#/c/3076/19/src/kudu/tablet/compaction.cc
File src/kudu/tablet/compaction.cc:

Line 638:     DVLOG(2) << "Ancient history mark: " << 
history_gc_opts.ancient_history_mark.ToString()
> given this is a per-row log message, i think DVLOG(3) or DVLOG(4) is probab
Done


Line 647:           prev_undo->set_next(nullptr);
> hrm, the comment on line 629-631 is no longer correct, then. (we were const
Oh, good point. Hmm. I factored this into its own function and added a call to 
it at call sites that also call ApplyMutationsAndGenerateUndos() prior to 
passing the row into ApplyMutationsAndGenerateUndos(). Alternatively, we could 
change ApplyMutationsAndGenerateUndos() to take a non-const input row. I think 
the approach I took is probably better, though.


Line 800:   rowid_t input_row_offset = -1;
> is this variable necessary? seems like it's only used in the DVLOG below, b
Removed


Line 838:       DVLOG(2) << "Output Row: " << 
dst_row.schema()->DebugRow(dst_row) <<
> bump to a higher DVLOG level since this is a big message and every row
Done


PS19, Line 863:       DVLOG(2) << "Output Row: " << 
dst_row.schema()->DebugRow(dst_row)
              :                << "; RowId: " << index_in_current_drs;
> this vlog is redundant with the new one on 838-844
Bumped to DVLOG(5) since it shows the index in the current drs and we can't get 
that for GCed rows


PS19, Line 954: history_gc_opts.enabled == HistoryGcOpts::GC_ENABLED &&
              :               
mut->timestamp().ComesBefore(history_gc_opts.ancient_history_mark
> this condition shows up a lot. what about adding a method like bool History
added a method IsAncientHistory()


Line 956:             DVLOG(2) << "Skipping GCed input row: " << 
schema->DebugRow(row.row)
> this log message isn't accurate since it could still be reinserted later. p
Done


http://gerrit.cloudera.org:8080/#/c/3076/19/src/kudu/tablet/compaction.h
File src/kudu/tablet/compaction.h:

PS19, Line 38:  GC_ENABLED,
             :     GC_DISABLED
> reorder these and make GC_DISABLED have the value 0 so that if this is acci
Did the class thing.


Line 46:   // A timestamp prior to which no history will be preserved.
> add "Ignored if enabled != GC_ENABLED"
Done


Line 155: // 'is_history_truncated': Set to true if this row had a "reinsert 
after
> update (it's an int, not a bool)
Done


http://gerrit.cloudera.org:8080/#/c/3076/19/src/kudu/tablet/tablet-test.cc
File src/kudu/tablet/tablet-test.cc:

PS19, Line 564:   // TODO: Should we be reopening the tablet in a different way 
to persist the
              :   // clock / timestamps?
> does this need to be addressed here?
I don't think this needs to be addressed as part of this patch.


http://gerrit.cloudera.org:8080/#/c/3076/19/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

Line 704:   *ancient_history_mark = 
HybridClock::AddPhysicalTimeToTimestamp(clock_->Now(), neg_hist_age);
> slightly worried about underflow here if the user says "I want to keep all 
It looks to me like HybridTime has enough bits that this isn't possible, unless 
their clock says it's 1970 or something. If that was the case, NTP would be 
messed up, and I don't think Kudu would start. HybridTime uses 52 bits for the 
physical time in microseconds and 12 bits for the logical time. Here's the 
napkin math:

  mpercy@mpercy-T460p:~/src/kudu$ perl -e 'print((time() * 1000) << 12, "\n");'
  6034973429760000
  mpercy@mpercy-T460p:~/src/kudu$ perl -e 'print(2**31 * 1000, "\n");'
     2147483648000
  mpercy@mpercy-T460p:~/src/kudu$ perl -e 'print(((time() * 1000) << 12) - 
(2**31 * 1000), "\n");'
  6032826847232000


http://gerrit.cloudera.org:8080/#/c/3076/19/src/kudu/tablet/tablet_history_gc-test.cc
File src/kudu/tablet/tablet_history_gc-test.cc:

PS19, Line 193: // Test that no UNDO DELETE is generated after deleting a row 
from the MRS and
              : // waiting for AHM to pass, then flushing the MRS.
              : T
> this test actually tests two separate cases. can you expand this comment to
Done


PS19, Line 243: represents the insert. In the UNDO
              : // representation, this insert looks like  DELETE
> I think this description is confusing. the UNDO *undoes* the insert, doesn'
Done


PS19, Line 425:  7,
> some commentary at the top of what these special row offsets are would be n
I added some comments at the top. Let me know if you think I need to rework 
this test to be more understandable.


http://gerrit.cloudera.org:8080/#/c/3076/19/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

PS19, Line 1512: prohibits with 
> missing word
Done


PS19, Line 1534: C
> should be a lower case c, since it comes after a ':' in the formatted messa
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3076
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If9833a863f118eb82be80ea56204d0d9141611c2
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: Yes

Reply via email to