Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )
Change subject: disk failure: pass Statuses instead of failing CHECKs along the Flush DMS path ...................................................................... Patch Set 1: (7 comments) This is thorny code that I'm not terribly familiar with, so please also get a review from Todd. http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.h File src/kudu/tablet/delta_tracker.h: http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.h@197 PS1, Line 197: Status ValidateDeltaOrder(const std::shared_ptr<DeltaStore>& first, Nit: indentation of the continuation lines. http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc File src/kudu/tablet/delta_tracker.cc: http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@299 PS1, Line 299: stores_to_update->erase(start_it, end_it); What if we return from ValidateDeltasOrdered _after_ this? Won't our in-memory state be sort of corrupt? I understand that the validation is for testing only anyway, but if the goal is to have a fully formed DeltaTracker in the event of an injected disk failure, we should probably think about the order here. http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@341 PS1, Line 341: // Once we successfully commit to the rowset metadata, let's ensure we update : // the delta stores to maintain consistency between the two. We enforce this : // with a CHECK_OK here. This is no longer true, and isn't that a problem? http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@345 PS1, Line 345: // This goes down the path of opening blocks. Not sure what this comment is explaining. http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@722 PS1, Line 722: std::lock_guard<rw_spinlock> lock(component_lock_); Since component_lock_ is released between the swap and the cleanup, do we have assurances that the last entry of redo_delta_stores_ was old_dms? http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc File src/kudu/tablet/diskrowset.cc: http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc@529 PS1, Line 529: log_anchor_registry_, Nit: indentation of the continuation lines. http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc@594 PS1, Line 594: // TODO(awong): update here this message. Indeed. -- To view, visit http://gerrit.cloudera.org:8080/8441 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141 Gerrit-Change-Number: 8441 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Wong <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Comment-Date: Wed, 01 Nov 2017 20:35:20 +0000 Gerrit-HasComments: Yes
