[ 
https://issues.apache.org/jira/browse/CASSANDRA-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254729#comment-13254729
 ] 

Sylvain Lebresne commented on CASSANDRA-3912:
---------------------------------------------

bq.  the solution is for the user to look at the cluster layout, and use exact 
tokens, right?

Well, kinda. The user should look at the layout, pick one of the node's 
ranges, and then submit a repair on a subset of that range. I fully agree this 
is not for the faint of heart (which is why I prefer not exposing it to 
nodetool just yet), but as it stands I'll admit I'm not sure how to improve 
that error message much.

bq. it should be possible to repair a range that falls on the boundary of two 
getLocalRanges, assuming it can be fully contained in their aggregate

Actually no. Or more precisely, in general it still has the problem mentioned 
above. Given 2 local ranges, there will be some neighbors that share one but 
not both of those ranges (so with these nodes the repair would be imprecise). 
In other words, to repair a range on the boundary of two local ranges, you'd 
really want to do 2 repairs, one on each subrange, because each time the set 
of neighbors will be different. We could do that splitting at the 
StorageService level, but we should probably keep it simple for now: I see 
this ticket as just exposing existing code to advanced users, not reinventing 
repair.
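
To illustrate the splitting described above, here is a minimal Python sketch of what an external script could do before submitting repairs. It is not the StorageService code: the function name `split_on_local_ranges` is hypothetical, tokens are assumed to be plain integers, ranges are assumed to be `(left, right]` pairs with `left < right` (no wrap-around), and the local ranges are assumed to be disjoint.

```python
from typing import List, Tuple

# A range is (left, right]; this sketch assumes left < right (no wrap-around).
Range = Tuple[int, int]


def split_on_local_ranges(requested: Range, local_ranges: List[Range]) -> List[Range]:
    """Cut `requested` into pieces that each lie inside a single local range."""
    req_left, req_right = requested
    pieces = []
    for left, right in sorted(local_ranges):
        # Intersection of the requested range with this local range.
        lo, hi = max(req_left, left), min(req_right, right)
        if lo < hi:
            pieces.append((lo, hi))
    # With disjoint local ranges, the pieces cover the request exactly
    # only if their lengths add up to the length of the request.
    covered = sum(hi - lo for lo, hi in pieces)
    if covered != req_right - req_left:
        raise ValueError("requested range is not fully contained in the local ranges")
    return pieces
```

Each returned piece would then be submitted as its own repair, since the set of neighbors can differ from one piece to the next.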

bq. For JMX's sake.

Fixed :) (I've rebased the patch)

bq. Would including the 'future.session.getName' in this log message be useful

That's logged before the session is created.
                
> support incremental repair controlled by external agent
> -------------------------------------------------------
>
>                 Key: CASSANDRA-3912
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3912
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>             Fix For: 1.2
>
>         Attachments: 3912_v2.txt, CASSANDRA-3912-trunk-v1.txt, 
> CASSANDRA-3912-v2-001-add-nodetool-commands.txt, 
> CASSANDRA-3912-v2-002-fix-antientropyservice.txt
>
>
> As a poor man's precursor to CASSANDRA-2699, exposing the ability to repair 
> small parts of a range is extremely useful because it allows (with external 
> scripting logic) slowly repairing a node's content over time. Other than 
> avoiding the bulkiness of complete repairs, it means that you can safely do 
> repairs even if you absolutely cannot afford e.g. disk space spikes (see 
> CASSANDRA-2699 for what the issues are).
> Attaching a patch that exposes a "repairincremental" command to nodetool, 
> where you specify a step and the number of total steps. Incrementally 
> performing a repair in 100 steps, for example, would be done by:
> {code}
> nodetool repairincremental 0 100
> nodetool repairincremental 1 100
> ...
> nodetool repairincremental 99 100
> {code}
> An external script can be used to keep track of what has been repaired and 
> when. This should (1) allow incremental repair to happen now/soon, and 
> (2) allow experimentation and evaluation for an implementation of 
> CASSANDRA-2699, which I still think is a good idea. This patch does nothing 
> to help the average deployment, but at least makes incremental repair 
> possible given sufficient effort spent on external scripting.
> The big "no-no" about the patch is that it is entirely specific to 
> RandomPartitioner and BigIntegerToken. If someone can suggest a way to 
> implement this command generically using the Range/Token abstractions, I'd be 
> happy to hear suggestions.
> An alternative would be to provide a nodetool command that allows you to 
> simply specify the specific token ranges on the command line. It makes using 
> it a bit more difficult, but would mean that it works for any partitioner and 
> token type.
> Unless someone can suggest a better way to do this, I think I'll provide a 
> patch that does this. I'm still leaning towards supporting the simple "step N 
> out of M" form though.
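
As a rough illustration of the "step N out of M" idea in the description above, the following Python sketch computes the token subrange for one step. It is not taken from the attached patches: the function name `step_subrange` is hypothetical, tokens are assumed to be integers (as with RandomPartitioner's BigIntegerToken), and the range is assumed not to wrap around the ring.

```python
from typing import Tuple


def step_subrange(left: int, right: int, step: int, total_steps: int) -> Tuple[int, int]:
    """Return the (start, end] token subrange for `step` out of `total_steps`."""
    if total_steps <= 0 or not 0 <= step < total_steps:
        raise ValueError("step must satisfy 0 <= step < total_steps")
    if left >= right:
        raise ValueError("wrapping ranges are not handled in this sketch")
    width = right - left
    # Integer arithmetic keeps consecutive steps contiguous: the end of
    # step k is computed exactly like the start of step k + 1.
    start = left + width * step // total_steps
    end = left + width * (step + 1) // total_steps
    return start, end
```

An external script could loop over `step` from 0 to `total_steps - 1` and record which subranges have been repaired and when. If the range is narrower than `total_steps` tokens, some steps come out empty (`start == end`) and can be skipped.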


        
