Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8402 )

Change subject: [docs] Document how to recover from a majority failed tablet
......................................................................


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/8402/4/docs/administration.adoc
File docs/administration.adoc:

http://gerrit.cloudera.org:8080/#/c/8402/4/docs/administration.adoc@709
PS4, Line 709: Reviving a tablet that's lost a majority of replicas
How about: "Bringing a tablet that's lost a majority of replicas back online"?


http://gerrit.cloudera.org:8080/#/c/8402/4/docs/administration.adoc@711
PS4, Line 711: If a tablet has permanently lost a majority of its replicas, it cannot recover
It is critical to emphasize that in a majority-lost scenario, permanent data
loss is likely, and there is no guarantee that any data can be recovered. If
users get some or all of their data back after this procedure, it may be due
only to luck. We should also emphasize that this procedure should be performed
only if it is not possible to bring the majority back online.
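
To make the "majority lost" condition concrete, here is a minimal sketch of the
check, assuming the usual rule that a majority is more than half of the
configured replicas. The function name has_majority is hypothetical and not part
of the kudu CLI; the replica counts would be read by hand from the ksck output.

```shell
# Hypothetical helper (not part of the kudu CLI).
# Succeeds if the live replicas still form a majority of the tablet's
# configured replicas, i.e. live >= floor(total / 2) + 1.
has_majority() {
  local total="$1" live="$2"
  local majority=$(( total / 2 + 1 ))
  [ "$live" -ge "$majority" ]
}

# Example: a 3-replica tablet with a single surviving replica.
if has_majority 3 1; then
  echo "majority still available; do not use the unsafe procedure"
else
  echo "majority lost"
fi
```

With 3 replicas the majority is 2, so one survivor is not enough; with 5
replicas the majority is 3.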


http://gerrit.cloudera.org:8080/#/c/8402/4/docs/administration.adoc@723
PS4, Line 723:   638a20403e3e4ae3b55d4d07d920e6de (tserver-00:7150): RUNNING [LEADER]
This is kind of a cool scenario, but this whole thing only works if the leader
survives. I think it's worth indicating how to handle the case where the leader
did not survive, along with a discussion of the implications of that. If the
leader survives, the likelihood of losing data is much lower (although not
zero, because in some nasty cases it could have been an old, partitioned
leader).


http://gerrit.cloudera.org:8080/#/c/8402/4/docs/administration.adoc@760
PS4, Line 760: $ kudu remote_replica delete tserver-01:7150 e822cab6c0584bc0858219d1539a17e6 "delete failed replica"
This is not actually required; the master should delete the failed replicas
automatically once they are evicted by the unsafe config change.


http://gerrit.cloudera.org:8080/#/c/8402/4/docs/administration.adoc@767
PS4, Line 767: [source,bash]
             : ----
             : $ kudu remote_replica unsafe_change_config <tserver address> <tablet id> <uuid 1> <uuid 2> ...
             : ----
I found this confusing. It looks like a command to run, so I was trying to
figure out what uuid 1 and uuid 2 were and why we're changing the config to
those two, etc. I think we need to pick either the "prototype" or the
"example" form for the same command. I actually think the prototype (this one)
is more useful than the example below, except that it lists a "uuid 2", which
doesn't apply here.


http://gerrit.cloudera.org:8080/#/c/8402/4/docs/administration.adoc@775
PS4, Line 775: [source,bash]
If you are going to put this in, at least mark it with a label like "Example:"


http://gerrit.cloudera.org:8080/#/c/8402/4/docs/administration.adoc@777
PS4, Line 777: $ kudu remote_replica unsafe_change_config tserver-00:7150 e822cab6c0584bc0858219d1539a17e6 638a20403e3e4ae3b55d4d07d920e6de
Because a long UUID for the tablet id next to a long UUID for the tablet server
id can be confusing, and these example UUIDs are never going to be what a user
would actually paste in, I think a compromise between what you wrote on line
770 and what is here on line 777 would be ideal:

$ kudu remote_replica unsafe_change_config tserver-00:7150 <tablet_id> <tserver-00-uuid>

with an explanation that <tserver-00-uuid> is the tablet server UUID of the
remaining replica on tserver-00.
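
To illustrate which value goes in which position, here is a small sketch that
only assembles and prints the command line; it does not run kudu. The function
name build_unsafe_change_config is hypothetical, and the argument order is
assumed to be exactly the one shown in the prototype above (tablet server
address, tablet id, UUID of the tablet server hosting the remaining replica).

```shell
# Hypothetical helper: prints the unsafe_change_config command for a tablet
# with a single surviving replica. It only formats text; nothing is executed.
#   $1: address of the tablet server hosting the surviving replica
#   $2: id of the affected tablet
#   $3: UUID of that same tablet server (the only peer left in the new config)
build_unsafe_change_config() {
  if [ "$#" -ne 3 ]; then
    echo "usage: build_unsafe_change_config <tserver address> <tablet id> <tserver uuid>" >&2
    return 1
  fi
  printf 'kudu remote_replica unsafe_change_config %s %s %s\n' "$1" "$2" "$3"
}

# Placeholders are kept literal here; a user would substitute real values.
build_unsafe_change_config tserver-00:7150 '<tablet_id>' '<tserver-00-uuid>'
```

Keeping the placeholders separate from the host name makes it harder to mix up
the tablet id with the tablet server UUID.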



--
To view, visit http://gerrit.cloudera.org:8080/8402
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic6326f65d029a1cd75e487b16ce5be4baea2f215
Gerrit-Change-Number: 8402
Gerrit-PatchSet: 4
Gerrit-Owner: Will Berkeley <wdberke...@gmail.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jdcry...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-Comment-Date: Fri, 15 Dec 2017 22:46:36 +0000
Gerrit-HasComments: Yes
