Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8441 )

Change subject: disk failure: make various delta paths more robust to errors
......................................................................


Patch Set 2:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.h
File src/kudu/tablet/delta_tracker.h:

http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.h@197
PS1, Line 197:   // readers in 'stores'.
> Nit: indentation of the continuation lines.
Done


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@244
PS1, Line 244: size_t i = 1; i < list.size(); i++) {
> It's not introduced in this changelist, but I suspect it would it be an iss
Interesting catch, thanks!


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@299
PS1, Line 299:       // Sanity check that the last store we are adding would 
logically appear
> What if we return from ValidateDeltasOrdered _after_ this? Won't our in-mem
I've done some reordering here so that if there's an error, no state will be 
changed.


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@341
PS1, Line 341:   AtomicUpdateStores(to_remove, new_stores, type);
             :   vector<BlockId> removed_blocks;
             :   rowset_metadata_->Commit
> This is no longer true, and isn't that a problem?
I've done some reordering here so that if there's an error, no state will be 
changed.


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@345
PS1, Line 345:
> Not sure what this comment is explaining.
Sorry, was useful as a note to self.


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@722
PS1, Line 722:     CHECK_EQ(redo_delta_stores_.back(), old_dms)
> Since component_lock_ is released between the swap and the cleanup, do we h
Given we do the same below, I think so. We're also protected by 
compact_flush_lock_ up at L687 which ensures a single compaction at a time (ie 
IIUC this prevents other threads updating the stores). I'll add a CHECK to be 
sure.


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@758
PS1, Line 758:   // rename to final path!
> warning: missing username/bug in TODO [google-readability-todo]
Done


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc@529
PS1, Line 529:   RETURN_NOT_OK(DeltaTracker::Open(rowset_metadata_, num_rows,
> Nit: indentation of the continuation lines.
Done


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc@560
PS1, Line 560:
> I tried to track this up to the call chains and in the chains
Ah, thanks for digging that up!

Whether intentional or not, this CHECK_OK() served to ensure 1) that we were 
not left in a half-updated state when doing the atomic calls, and 2) that we 
were able to verify that correctness by reading from disk.

I just updated UpdateDeltaTracker to be actually atomic, so now the returned 
error serves only to tell us that we've hit an error reading from disk (which 
shouldn't be fatal at this level), and if we have, we're now still in a 
consistent state.


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc@594
PS1, Line 594:     // Update the delta tracker with the changes.
> Indeed.
Done



--
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Comment-Date: Mon, 06 Nov 2017 19:43:14 +0000
Gerrit-HasComments: Yes

Reply via email to