Mike Percy has posted comments on this change. Change subject: KUDU-1601. Delete ancient UNDO delta blocks in the background ......................................................................
Patch Set 12: (36 comments) http://gerrit.cloudera.org:8080/#/c/4363/9/src/kudu/tablet/delta_tracker.cc File src/kudu/tablet/delta_tracker.cc: PS9, Line 434: update.RemoveUndoDeltaBlocks(block_ids_to_remove); > Ah, didn't know that. Think that having that info somewhere would be great. Done, added a comment to DeleteAncientUndoDeltas() http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/delta_tracker.cc File src/kudu/tablet/delta_tracker.cc: PS12, Line 287: compactions > compactions and GC now, right? That's true. Fixed. PS12, Line 300: No consistency problems will be visible > not totally your fault, but I don't really understand why this is the case. Because it's specified to only be used for compactions, not flushes. And that is true. PS12, Line 300: // No consistency problems will be visible if we don't successfully : // Flush(), so no need to CHECK_OK here. > think we need a full explanation why this is the case, here or somewhere el I added an extra sentence at the end. LMK if you're still not comfortable with it. PS12, Line 367: if (max_deltas_to_initialize != -1 && : tmp_blocks_initialized == max_deltas_to_initialize) break; > as I mentioned in the header, don't think we need both a time and count lim Done PS12, Line 396: effectively a compaction > i think i understand what you're saying, but this is misleading without add Updated the comment PS12, Line 420: -1 > use a constant? Removed this param http://gerrit.cloudera.org:8080/#/c/4363/9/src/kudu/tablet/delta_tracker.h File src/kudu/tablet/delta_tracker.h: PS9, Line 173: ed. > because kInvalidTimestamp is a big positive number whereas kInitialTimestam I look at it in the opposite way, hit me up on Slack if you want to discuss this point. A high timestamp is way in the future whereas a low one is way in the past. http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/delta_tracker.h File src/kudu/tablet/delta_tracker.h: PS12, Line 181: int64_t max_deltas_to_initialize, : MonoTime deadline, > do we need both a max_deltas_to_initialize and a dealine? these seem a bit You are right... at some point I wanted more control but I don't think it's used now. http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/diskrowset-test.cc File src/kudu/tablet/diskrowset-test.cc: PS12, Line 564: // Major delta compaction is additive. > ? Tried to be terse but I guess it's cryptic. I reworded this. PS12, Line 567: We should be left with no delta stores. > nit: There shouldn't be any delta stores left. Done PS12, Line 568: constexpr > curious, why constexpr? any way think it would make sense to have this cons I am going to remove this parameter also, since it's never used. http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/diskrowset.cc File src/kudu/tablet/diskrowset.cc: PS12, Line 545: since we've already updated the metadata > nit: surround with commas Done PS12, Line 556: We don't CHECK_OK on Flush here because if we don't successfully flush we : // don't have consistency problems in the case of major delta compaction > as I mentioned elsewhere I think we need a good explanation as to why. I as I added an explanation right below this line: "we are not adding additional mutations that weren't already present" http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/mt-tablet-test.cc File src/kudu/tablet/mt-tablet-test.cc: PS12, Line 331: /* : if (bytes_in_ancient_undos > 0) { : LOG(INFO) << "Found " << bytes_in_ancient_undos << " bytes of ancient delta undos"; : } : */ > don't think you meant to leave this. Done http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/rowset.h File src/kudu/tablet/rowset.h: PS12, Line 146: // Initialize up to 'max_deltas_to_initialize' deltas that while 'deadline' has : // not yet been passed and while a delta file with a max timestamp later than : // 'ancient_history_mark' has not yet been encountered. Return the number of : // deltas initialized during this invocation in 'num_deltas_initialized' and : // the total amount of on-disk data known to be entirely composed of ancient : // undo delta blocks in 'bytes_in_ancient_undos'. > is this comment different from the delta_tracker one? if not maybe make tha Good idea, done. PS12, Line 159: // timestamp lower than the given ancient history mark. This method only : // checks the oldest 'max_deltas_to_delete' UNDO delta files, and only if : // they are already initialized. : // : // Does not flush updates to the rowset metadata. The caller must flush the : // metadata explicitly. > same Done http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/rowset_metadata.cc File src/kudu/tablet/rowset_metadata.cc: PS12, Line 193: auto iter = undo_delta_blocks_.begin(); : while (iter != undo_delta_blocks_.end()) { : if (ContainsKey(undos_to_remove, *iter)) { : removed.push_back(*iter); : iter = undo_delta_blocks_.erase(iter); : } else { : ++iter; : } : } > all the blocks we are removing should be present, right? can we check for t Done http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/tablet.cc File src/kudu/tablet/tablet.cc: PS12, Line 285: // First, enter the shutdown state to no additional mm jobs get started. : { : std::lock_guard<rw_spinlock> lock(component_lock_); : state_ = kShutdown; : } : : // Wait until all external maintenance manager operations complete. : LOG_WITH_PREFIX(INFO) << "Waiting until all external maintenance ops complete"; : { : std::unique_lock<std::mutex> l(external_mm_op_lock_); \ : external_mm_ops_finished_.wait(l, [this] { return num_external_mm_ops_running_ == 0; }); : } : LOG_WITH_PREFIX(INFO) << "External maintenance ops are done"; > this is a bit scary, if not for correctness at least for shutdown ordering Yeah, it worked but was not pretty. I was able to remove it. PS12, Line 1682: Major compacting all delta stores > append: ", for tests." Done PS12, Line 1687: if (!rs->IsAvailableForCompaction()) continue; > I forget, when would this happen? In any case maybe log it (the test might The MRS is never available for compaction so that is the purpose of this. The other case is that someone is holding the compact_flush_lock for the RowSet, which means they are compacting. If we log it might be a bit spammy because of the MRS thing. http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/tablet.h File src/kudu/tablet/tablet.h: PS12, Line 286: Find and delete all undo delta files that have a maximum op : // timestamp prior to the current ancient history mark. > xs nit: weird wrapping relative to the paragraph below Done PS12, Line 534: // Lock to protect 'num_external_mm_ops_running_' and the associated : // condition variable 'external_mm_ops_finished_'. No other Tablet locks : // should be acquired while holding this lock; This lock should be acquired : // last. : mutable std::mutex external_mm_op_lock_; : : // Indicates how many externally-driven mm ops are running. Blocks shutdown : // when greater than zero. : int32_t num_external_mm_ops_running_; : mutable std::condition_variable external_mm_ops_finished_; > think with the simplifications we discussed yesterday (IO on Perform() low Yep, done http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/tablet_history_gc-test.cc File src/kudu/tablet/tablet_history_gc-test.cc: PS12, Line 550: // No deletes. > why is this relevant? Hmm it's not, removed PS12, Line 559: MonoTime deadline = MonoTime::Now() + MonoDelta::FromSeconds(60); > use MonoTime::Max() here Changed the method signature to a timeout, changed this to an initialized MonoDelta. PS12, Line 560: bytes_in_ancient_undos > assert that this is GT 0 Done http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tablet/tablet_metrics.cc File src/kudu/tablet/tablet_metrics.cc: PS12, Line 119: Does not include bytes garbage collected during compactions. > why? think it would make sense to have both, no? I think it would be nice to have a separate metric for that, but I don't think it needs to be part of this patch. PS12, Line 126: Does not include blocks garbage collected during compactions. > not sure it's worth it to have both metrics. what does the number of blocks Yeah that's fair, I don't think people will really care about this metric. Removing it. http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tserver/tablet_service.cc File src/kudu/tserver/tablet_service.cc: PS12, Line 1131: WARN_NOT_OK(server_->tablet_manager()->GetTabletPeers(&peers), "Cannot list tablets" > return on not OK? Reverted this change to the method signature http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tserver/ts_tablet_manager.cc File src/kudu/tserver/ts_tablet_manager.cc: PS12, Line 842: WARN_NOT_OK(GetTabletPeers(&peers_to_shutdown), "Cannot list tablets"); > should this be a warn? what happens if we don't get all of them? from what Reverted this http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tserver/ts_tablet_manager.h File src/kudu/tserver/ts_tablet_manager.h: PS12, Line 190: fs_manager > docs Reverted this http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tserver/tserver-path-handlers.cc File src/kudu/tserver/tserver-path-handlers.cc: PS12, Line 114: WARN_NOT_OK(tserver_->tablet_manager()->GetTabletPeers(&peers), "Cannot list tablets"); > what happens below on not-OK? reverted PS12, Line 182: WARN_NOT_OK(tserver_->tablet_manager()->GetTabletPeers(&peers), "Cannot list tablets"); > same reverted http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tserver/tserver_mm_ops.cc File src/kudu/tserver/tserver_mm_ops.cc: PS12, Line 59: Total Undo Delta Block GC Init Duration > think the explanation text below is better than the metric name and not muc I ended up removing these "total" metrics since there wasn't an analog for them elsewhere. PS12, Line 108: enable_undo_delta_block_gc > same comment as before regarding merging the enable_ flag with the undo max Since it's a new feature, it seems safer to have a flag to turn off undo delta block gc independently of changing the ancient history mark. Or do you mean --data_gc_min_size_mb ? http://gerrit.cloudera.org:8080/#/c/4363/12/src/kudu/tserver/tserver_mm_ops.h File src/kudu/tserver/tserver_mm_ops.h: PS12, Line 50: class UndoDeltaBlockGCOp : public MaintenanceOp > likely can make this a regular tablet op with the changes we discussed. Done -- To view, visit http://gerrit.cloudera.org:8080/4363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0309bf7acfb6d018860c80f354012c3500da5c68 Gerrit-PatchSet: 12 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy <mpe...@apache.org> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: David Ribeiro Alves <dral...@apache.org> Gerrit-Reviewer: Jean-Daniel Cryans <jdcry...@apache.org> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <mpe...@apache.org> Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes