Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )
Change subject: disk failure: make various delta paths more robust to errors ...................................................................... Patch Set 2: (10 comments) http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.h File src/kudu/tablet/delta_tracker.h: http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.h@197 PS1, Line 197: // readers in 'stores'. > Nit: indentation of the continuation lines. Done http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc File src/kudu/tablet/delta_tracker.cc: http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@244 PS1, Line 244: size_t i = 1; i < list.size(); i++) { > It's not introduced in this changelist, but I suspect it would it be an iss Interesting catch, thanks! http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@299 PS1, Line 299: // Sanity check that the last store we are adding would logically appear > What if we return from ValidateDeltasOrdered _after_ this? Won't our in-mem I've done some reordering here so that if there's an error, no state will be changed. http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@341 PS1, Line 341: AtomicUpdateStores(to_remove, new_stores, type); : vector<BlockId> removed_blocks; : rowset_metadata_->Commit > This is no longer true, and isn't that a problem? I've done some reordering here so that if there's an error, no state will be changed. http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@345 PS1, Line 345: > Not sure what this comment is explaining. Sorry, was useful as a note to self. http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@722 PS1, Line 722: CHECK_EQ(redo_delta_stores_.back(), old_dms) > Since component_lock_ is released between the swap and the cleanup, do we h Given we do the same below, I think so. We're also protected by compact_flush_lock_ up at L687 which ensures a single compaction at a time (ie IIUC this prevents other threads updating the stores). I'll add a CHECK to be sure. http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@758 PS1, Line 758: // rename to final path! > warning: missing username/bug in TODO [google-readability-todo] Done http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc File src/kudu/tablet/diskrowset.cc: http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc@529 PS1, Line 529: RETURN_NOT_OK(DeltaTracker::Open(rowset_metadata_, num_rows, > Nit: indentation of the continuation lines. Done http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc@560 PS1, Line 560: > I tried to track this up to the call chains and in the chains Ah, thanks for digging that up! Whether intentional or not, this CHECK_OK() served to ensure 1) that we were not left in a half-updated state when doing the atomic calls, and 2) that we were able to verify that correctness by reading from disk. I just updated UpdateDeltaTracker to be actually atomic, so now the returned error serves only to tell us that we've hit an error reading from disk (which shouldn't be fatal at this level), and if we have, we're now still in a consistent state. http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc@594 PS1, Line 594: // Update the delta tracker with the changes. > Indeed. Done -- To view, visit http://gerrit.cloudera.org:8080/8441 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141 Gerrit-Change-Number: 8441 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Comment-Date: Mon, 06 Nov 2017 19:43:14 +0000 Gerrit-HasComments: Yes
