[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2017-09-27 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16182634#comment-16182634
 ] 

Jeremy Hanna commented on CASSANDRA-10070:
--

It doesn't look like CASSANDRA-8911 is moving forward, so it seems like this 
ticket, the things that [~vinaykumarcse] was talking about at NGCC yesterday, 
or a combination of the two could move forward.

> Automatic repair scheduling
> ---
>
> Key: CASSANDRA-10070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10070
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>Priority: Minor
> Fix For: 4.x
>
> Attachments: Distributed Repair Scheduling.doc, Distributed Repair 
> Scheduling_V2.doc
>
>
> Scheduling and running repairs in a Cassandra cluster is most often a 
> required task, but it can be hard for new users and also requires a bit of 
> manual configuration. There are good tools out there that can simplify 
> things, but wouldn't this be a good feature to have inside Cassandra? 
> Automatically scheduling and running repairs, so that when you start up your 
> cluster it basically maintains itself in terms of normal anti-entropy, with 
> the possibility of manual configuration.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-06-10 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15324979#comment-15324979
 ] 

Paulo Motta commented on CASSANDRA-10070:
-

After discussion at NGCC we decided to put this on hold until we have a better 
definition of mutation-based repair (MBR) (CASSANDRA-8911): if that moves 
forward we will deprecate merkle-tree based repair in favor of MBR, removing 
the need for automatic repair scheduling since MBR will be continuous.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-05-04 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15270836#comment-15270836
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

@[~jbellis]
bq. How closely does this match the design doc from February? Is it worth 
posting an updated design for those of us joining late?
I'd say there have been enough changes for it to be a good idea to update the 
document, so I'll work on that! :)



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-05-03 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15269618#comment-15269618
 ] 

Jonathan Ellis commented on CASSANDRA-10070:


How closely does this match the design doc from February?  Is it worth posting 
an updated design for those of us joining late?



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-03-02 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176519#comment-15176519
 ] 

Paulo Motta commented on CASSANDRA-10070:
-

bq. We start with the resource locking and then move on to the maintenance 
scheduling API. And after that I think most tasks could be discussed in 
parallel. Also I removed the task for management commands since I think it 
would be easier to add them while implementing the features. 

+1

bq. I've now created the sub-tasks and linked them to this issue. I didn't 
include the node configuration since it might be redundant but we might add it 
later on if we feel the need to.

Awesome!



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-26 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169498#comment-15169498
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

I've now created the sub-tasks and linked them to this issue. I didn't include 
the node configuration since it might be redundant but we might add it later on 
if we feel the need to.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-24 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15163462#comment-15163462
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

For the basic implementation I think the tasks could be broken down as:
- Resource locking API & implementation
-- Maintenance scheduling API & basic repair scheduling
--- Rejection policy interface & default implementations
--- Configuration support
---- Table configuration
---- Global configuration (for pausing repairs in the basic implementation)
---- Node configuration
--- Aborting/interrupting repairs (Requires CASSANDRA-3486, CASSANDRA-11190)
--- Polling and monitoring module
--- Failure handling and retry

So we start with the resource locking and then move on to the maintenance 
scheduling API. After that I think most tasks could be discussed in parallel. 
Also, I removed the task for management commands since I think it would be 
easier to add them while implementing the features. WDYT?



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-23 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159286#comment-15159286
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

bq. Sounds good! We could ask the user to pause, but I think doing that 
automatically via "system interrupts" is better. It just occurred to me that 
both "the pause" and "system interrupts" will prevent new repairs from 
starting, but what about already running repairs? We will probably want to 
interrupt already running repairs as well in some situations. For this reason 
CASSANDRA-3486 is also relevant for this ticket (adding it as a dependency of 
this ticket).
+1

bq. Then I think we should either have a timeout, or add the ability to 
cancel/interrupt a running scheduled repair in the initial version, to avoid 
having hanging repairs render the automatic repair scheduling useless.
I think the timeout would be good enough in the initial version. I guess the 
interruption of repairs would be handled by CASSANDRA-3486? Perhaps it would be 
possible to extend that feature later to be able to cancel a scheduled repair? 
Here I'm thinking that the interruption is stopping the running repair and 
allowing the scheduled job to retry it immediately, while cancelling it would 
prevent the scheduled job from retrying it immediately.
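
The interrupt-versus-cancel distinction described here could be modeled 
roughly as follows (an illustrative in-memory sketch, not Cassandra code; all 
class and method names are hypothetical):

```python
class ScheduledRepairJob:
    """Hypothetical model of a scheduled repair job, illustrating the
    proposed distinction: interrupting stops the running repair and lets
    the scheduler retry it immediately, while cancelling also defers the
    retry by the normal delay."""

    def __init__(self, table, retry_delay=3600):
        self.table = table
        self.retry_delay = retry_delay  # seconds to wait after a cancel
        self.next_run = 0               # eligible to run immediately
        self.running = False

    def start(self):
        self.running = True

    def interrupt(self, now):
        # Stop the running repair; allow an immediate retry.
        self.running = False
        self.next_run = now

    def cancel(self, now):
        # Stop the running repair; prevent an immediate retry.
        self.running = False
        self.next_run = now + self.retry_delay

    def is_eligible(self, now):
        return not self.running and now >= self.next_run
```

After an interrupt the job is immediately eligible again; after a cancel it 
only becomes eligible once the retry delay has elapsed.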

bq. WDYT? Feel free to update or break-up into smaller or larger subtasks, and 
then create the actual subtasks to start work on them.
Sounds good, I'll have a closer look on the subtasks tomorrow! I guess we will 
have sort of a dependency tree for some of the tasks.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-23 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158968#comment-15158968
 ] 

Paulo Motta commented on CASSANDRA-10070:
-

bq. But in that case the pause/stop feature should be implemented as early as 
possible to avoid having an upgrade scenario that requires the user to upgrade 
to the version that introduces the pause feature before upgrading to the 
latest. Another way would be to have the "system interrupts" feature in place 
early, so that the repairs would be paused during an upgrade.

Sounds good! We could ask the user to pause, but I think doing that 
automatically via "system interrupts" is better. It just occurred to me that 
both "the pause" and "system interrupts" will prevent new repairs from 
starting, but what about already running repairs? We will probably want to 
interrupt already running repairs as well in some situations. For this reason 
CASSANDRA-3486 is also relevant for this ticket (adding it as a dependency of 
this ticket).

bq. I think the timeout might be good to have to prevent a hang from stopping 
the entire repair process. But I think it would only work if the repair only 
hangs occasionally; otherwise the same repair would be retried until it is 
marked as a "fail".

+1. Then I think we should either have a timeout, or add the ability to 
cancel/interrupt a running scheduled repair in the initial version, to avoid 
having hanging repairs render the automatic repair scheduling useless.

bq. Another option is to have a "slow repair" detector that would log a 
warning if a repair session is taking too long, to avoid aborting it if it's 
actually repairing, leaving it up to the user to handle it. Either way I'd say 
it's out of the scope of the initial version.

bq. We might also want to be able to detect if it would be impossible to 
repair the whole cluster within gc grace and report it to the user. This could 
happen for multiple reasons, like too many tables, too many nodes, too few 
parallel repairs or simply overload. I guess it would be hard to make accurate 
predictions with all of these variables, so it might be good enough to look 
through the history of repairs, estimate the time and compare it to gc grace? 
I think this is out of scope for the first version, but I thought I'd mention 
it here to remember it.

Nice! These could probably live in a separate repair metrics and alerts module 
in the future, allowing users to track statistics and issue alerts/warnings 
based on history, and allowing the scheduler to perform more advanced adaptive 
scheduling. Some metrics to track:
* Repair time per session
** Break up of time per phase (validation, sync, anticompaction, etc)
* Repair time per node
* Validation mismatch %
* Fail count
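
As a rough illustration, per-session tracking of these metrics could be as 
simple as the following sketch (names and structure are assumptions for the 
sketch, not a proposed Cassandra API):

```python
from dataclasses import dataclass, field

@dataclass
class RepairSessionMetrics:
    """Illustrative per-session metrics record."""
    # Wall-clock seconds spent in each phase, e.g. validation, sync,
    # anticompaction.
    phase_times: dict = field(default_factory=dict)
    validation_mismatch_pct: float = 0.0
    failed: bool = False

    @property
    def total_time(self):
        return sum(self.phase_times.values())

def fail_count(sessions):
    """Number of failed sessions in a history of metrics records."""
    return sum(1 for s in sessions if s.failed)
```

A scheduler could then aggregate these records per node or per table to drive 
alerts and adaptive scheduling.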

bq. Should we maybe compile a list of "features that should be in the initial 
version" and an "improvements" list for future work, to make the scope clear?

Sounds good! Below is a suggested list of subtasks:

* Basic functionality
** Resource locking API and implementation
** Maintenance scheduling API and metadata
** Basic scheduling support
** Polling and monitoring module
** Pausing and aborting support 
** Rejection policies (includes system interrupts and maintenance windows)
** Failure handling and retry
** Configuration support
** Frontend support (table options, management commands)

* Optional/deferred functionality
** Parallel repair session support
** Subrange repair support
** Maintenance history
** Timeout
** Metrics
** Alerts

WDYT? Feel free to update or break-up into smaller or larger subtasks, and then 
create the actual subtasks to start work on them.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-23 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158634#comment-15158634
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

bq. We could probably replace the single resource lock 
('RepairResource-{dc}-{i}') with a global ('Global-{dc}-{i}') or mutually 
exclusive resource ('CleanupAndRepairResource-{dc}-{i}') later if necessary. 
We'll probably only need some special care during upgrades when we introduce 
these new locks, but other than that I don't see any problem that could arise 
with renaming the resources later if necessary. Do you see any issue with this 
approach?
No, that should probably work, so we can have it as 'RepairResource-{dc}-{i}' 
for now. For upgrades we could add a release note that says something like 
"pause/stop all scheduled repairs while upgrading from x.y to x.z". But in 
that case the pause/stop feature should be implemented as early as possible, 
to avoid an upgrade scenario that requires the user to first upgrade to the 
version that introduces the pause feature before upgrading to the latest. 
Another way would be to have the "system interrupts" feature in place early, 
so that repairs would be paused during an upgrade.

bq. Created CASSANDRA-11190 for failing repairs fast and linked as a 
requirement of this ticket.
Great!

bq. No, unless there is a bug. Repair messages are undroppable, and the nodes 
report to the coordinator on failure.
bq. We could probably handle explicit failures in CASSANDRA-11190 making sure 
all nodes are properly informed and abort their operations in case of failures 
in any of the nodes. The timeout in this context could be helpful in case of 
hangs in streaming or validation. But I suppose that as the protocol becomes 
more mature/correct and with fail fast in place these hanging situations will 
become more rare so I'm not sure timeouts would be required if we assume there 
are no hangs. I guess we can leave them out of the initial version for 
simplicity and add them later if necessary.
I think the timeout might be good to have to prevent a hang from stopping the 
entire repair process. But I think it would only work if the repair only hangs 
occasionally; otherwise the same repair would be retried until it is marked as 
a "fail". Another option is to have a "slow repair" detector that would log a 
warning if a repair session is taking too long, to avoid aborting it if it's 
actually repairing, leaving it up to the user to handle it. Either way I'd say 
it's out of the scope of the initial version.

---

We might also want to be able to detect if it would be impossible to repair 
the whole cluster within gc grace and report it to the user. This could happen 
for multiple reasons, like too many tables, too many nodes, too few parallel 
repairs or simply overload. I guess it would be hard to make accurate 
predictions with all of these variables, so it might be good enough to look 
through the history of repairs, estimate the time and compare it to gc grace? 
I think this is out of scope for the first version, but I thought I'd mention 
it here to remember it.
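
The history-based estimate mentioned here might look something like this 
sketch (a hypothetical helper; in practice gc grace is configured per table 
and the durations would come from a repair history table):

```python
def repair_fits_in_gc_grace(past_node_durations, num_nodes,
                            parallel_repairs, gc_grace_seconds):
    """Estimate whether a full-cluster repair is likely to complete
    within gc grace, based on historical per-node repair durations
    (in seconds). Returns (fits, estimated_seconds)."""
    if not past_node_durations:
        return True, 0.0  # no history yet, nothing to compare against
    avg_per_node = sum(past_node_durations) / len(past_node_durations)
    # Nodes are repaired in batches of `parallel_repairs` at a time.
    batches = -(-num_nodes // parallel_repairs)  # ceiling division
    estimate = batches * avg_per_node
    return estimate <= gc_grace_seconds, estimate
```

If the estimate exceeds gc grace, the scheduler could log a warning and leave 
it to the operator to add capacity or raise the parallelism.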

Should we maybe compile a list of "features that should be in the initial 
version" and an "improvements" list for future work, to make the scope clear?



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-19 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154215#comment-15154215
 ] 

Paulo Motta commented on CASSANDRA-10070:
-

bq. Another thing we should probably consider is whether or not multiple types 
of maintenance work should run simultaneously. If we need to add this 
constraint, should they use the same lock resources?

We could probably replace the single resource lock 
('RepairResource-{dc}-{i}') with a global ('Global-{dc}-{i}') or mutually 
exclusive resource ('CleanupAndRepairResource-{dc}-{i}') later if necessary. 
We'll probably only need some special care during upgrades when we introduce 
these new locks, but other than that I don't see any problem that could arise 
with renaming the resources later if necessary. Do you see any issue with this 
approach?
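
The naming scheme under discussion could be sketched as follows (illustrative 
only; whether two task types share resource names is exactly the 
mutual-exclusion choice described above):

```python
def lock_resources(task, dc, num_parallel, exclusive_with_cleanup=False):
    """Generate lock resource names, one per allowed parallel operation
    per data center. Giving two maintenance task types the same resource
    names makes them mutually exclusive, since a resource can only have
    one holder at a time."""
    base = ("CleanupAndRepairResource" if exclusive_with_cleanup
            else f"{task}Resource")
    return [f"{base}-{dc}-{i}" for i in range(1, num_parallel + 1)]
```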

bq. Sounds good, let's start with the lockResource field in the repair session 
and move to scheduled repairs all together later on (maybe optionally scheduled 
via JMX at first?).

+1

bq. But as you said, it should be done in a separate ticket.

Created CASSANDRA-11190 for failing repairs fast and linked as a requirement of 
this ticket.

bq. Would it be possible for a node to "drop" a validation/streaming without 
notifying the repair coordinator? 

No, unless there is a bug. Repair messages are undroppable, and the nodes 
report to the coordinator on failure.

bq. Do we have any timeout scenarios that we could foresee before they occur? 
If we could detect that, it would be good to abort the repair as early as 
possible, assuming that the timeout would be set rather high.

We could probably handle explicit failures in CASSANDRA-11190 making sure all 
nodes are properly informed and abort their operations in case of failures in 
any of the nodes. The timeout in this context could be helpful in case of hangs 
in streaming or validation. But I suppose that as the protocol becomes more 
mature/correct and with fail fast in place these hanging situations will become 
more rare so I'm not sure timeouts would be required if we assume there are no 
hangs. I guess we can leave them out of the initial version for simplicity and 
add them later if necessary.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-16 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148842#comment-15148842
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

bq. Do we intend to reuse the lock table for other maintenance tasks as well? 
If so, we must add a generic "holder" column to the lock table so we can reuse 
it to identify resources other than the parent repair session in the future. 
We could also add an "attributes" map in the lock table to store additional 
attributes such as status, or have a separate table to maintain status, to 
keep the lock table simple.

I think it could be reused, so it's probably better to make it generic from 
the start. As long as we don't put too much data in the attributes map, it 
could be stored in the lock table. Another thing is that it's tightly bound to 
the lock itself, since we will use it to clean up repairs without a lock, 
which means keeping it in a single table is probably the easiest solution.
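
Such a generic lock table might behave like the sketch below (an in-memory 
stand-in for what would presumably be a CQL table written with lightweight 
transactions; all names are assumptions):

```python
class LockTable:
    """In-memory stand-in for a lock table keyed by resource name, with
    a generic holder id and an attributes map (e.g. repair status)."""

    def __init__(self):
        self._locks = {}  # resource -> (holder, attributes)

    def try_lock(self, resource, holder, attributes=None):
        # Compare-and-set: succeeds only if nobody holds the resource.
        # In Cassandra this would be an INSERT ... IF NOT EXISTS,
        # presumably with a TTL so a dead holder's lock expires.
        if resource in self._locks:
            return False
        self._locks[resource] = (holder, dict(attributes or {}))
        return True

    def holder_of(self, resource):
        entry = self._locks.get(resource)
        return entry[0] if entry else None

    def release(self, resource, holder):
        # Only the current holder may release the lock.
        if self._locks.get(resource, (None,))[0] == holder:
            del self._locks[resource]
```

The attributes map is what a background thread could scan to find repair 
sessions whose lock has disappeared and clean them up.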

Another thing we should probably consider is whether or not multiple types of 
maintenance work should run simultaneously. If we need to add this constraint, 
should they use the same lock resources?

bq. Ideally all repairs would go through this interface, but this would 
probably add complexity at this stage. So we should probably just add a 
"lockResource" attribute to each repair session object, and each node would go 
through all currently running repairs, checking whether it still holds the 
lock when the "lockResource" field is set.

Sounds good, let's start with the lockResource field in the repair session and 
move to scheduled repairs all together later on (maybe optionally scheduled via 
JMX at first?).

{quote}
It would probably be safe to abort ongoing validation and streaming background 
tasks and clean up repair state on all involved nodes before starting a new 
repair session on the same ranges. This doesn't seem to be done currently. As 
far as I understood, if there are nodes A, B, C running repair, A is the 
coordinator. If validation or streaming fails on node B, the coordinator (A) 
is notified and fails the repair session, but node C will keep doing 
validation and/or streaming, which could cause problems (or increased load) if 
we start another repair session on the same range.

We will probably need to extend the repair protocol to perform this 
cleanup/abort step on failure. We already have a legacy cleanup message that 
doesn't seem to be used in the current protocol, which we could maybe reuse to 
clean up repair state after a failure. This repair abortion will probably 
overlap with CASSANDRA-3486. In any case, this is a separate (but related) 
issue; we should address it in an independent ticket and make this ticket 
dependent on that.
{quote}

Right now it seems that the cleanup message is only used to remove the parent 
repair session from the ActiveRepairService's map. I guess that if we want to 
use it we would have to rewrite it to stop validation and streaming as well. 
But as you said, it should be done in a separate ticket.

bq. Another unrelated option that we should probably include in the future is a 
timeout, and abort repair sessions running longer than that.

Agreed. Do we have any timeout scenarios that we could foresee before they 
occur? Would it be possible for a node to "drop" a validation/streaming 
without notifying the repair coordinator? If we could detect that, it would be 
good to abort the repair as early as possible, assuming that the timeout would 
be set rather high.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-15 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15147751#comment-15147751
 ] 

Paulo Motta commented on CASSANDRA-10070:
-

Starting with a single repair per dc and adding support for parallel repair 
sessions later sounds like a good idea.

bq. I agree and we could probably store the parent repair session id in an 
extra column of the lock table and have a thread wake up periodically to see if 
there are repair sessions without locks. 

Do we intend to reuse the lock table for other maintenance tasks as well? If 
so, we must add a generic "holder" column to the lock table so we can reuse it 
to identify resources other than the parent repair session in the future. We 
could also add an "attributes" map in the lock table to store additional 
attributes such as status, or have a separate table to maintain status, to 
keep the lock table simple.

bq. But then we must somehow be able to differentiate user-defined and 
automatically scheduled repair sessions. It could be done by having all repairs 
go through this scheduling interface, which also would reduce user mistakes 
with multiple repairs in parallel. Another alternative is to have a custom flag 
in the parent repair that makes the garbage collector ignore it if it's 
user-defined. I think that the garbage collector/cancel repairs when unable to 
lock feature is something that should be included in the first pass.

Ideally all repairs would go through this interface, but this would probably 
add complexity at this stage. So we should probably just add a "lockResource" 
attribute to each local repair session object (as opposed to only the parent 
repair object), and each node would go through all repairs currently running 
checking if it still holds the lock if the "lockResource" field is set.

bq. The most basic failure scenarios should be covered by retrying a repair if 
it fails and logging a warning/error based on how many times it has failed. 
Could the retry behaviour cause some unexpected consequences?

It would probably be safe to abort ongoing validation and streaming background 
tasks and clean up repair state on all involved nodes before starting a new 
repair session on the same ranges. This doesn't seem to be done currently. As 
far as I understand, if nodes A, B and C are running a repair with A as the 
coordinator and validation or streaming fails on node B, the coordinator (A) 
is notified and fails the repair session, but node C will keep doing 
validation and/or streaming, which could cause problems (or increased load) if 
we start another repair session on the same range. 

We will probably need to extend the repair protocol to perform this 
cleanup/abort step on failure. There is already a legacy cleanup message, 
apparently unused in the current protocol, that we could maybe reuse to clean 
up repair state after a failure. Repair abortion will probably overlap with 
CASSANDRA-3486. In any case, this is a separate (but related) issue, so we 
should address it in an independent ticket and make this ticket depend on it.

Another option we should probably add in the future is a timeout, so that 
repair sessions running longer than the limit are aborted.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-15 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15147579#comment-15147579
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

{quote}
All data centers involved in a repair must be available for a repair to 
start/succeed, so if we make the lock resource dc-aware and try to create the 
lock by contacting a node in each involved data center with LOCAL_SERIAL 
consistency that should be sufficient to ensure correctness without the need 
for a global lock. This will also play along well with both dc_parallelism 
global option and with the --local or --dcs table repair options.
{quote}

{quote}
The second alternative is probably the most desirable. Actually dc_parallelism 
by itself might cause problems, since we can have a situation where all repairs 
run on a single node or range, overloading those nodes. If we are to support 
concurrent repairs in the first pass, I think we need both dc_parallelism and 
node_parallelism options together.
{quote}

{quote}
This is becoming a bit complex and there are probably some edge cases and/or 
starvation scenarios, so we should think carefully before jumping into 
implementation. What do you think about this approach? Should we stick to a 
simpler non-parallel version in the first pass, or think this through and 
already support parallelism in the first version?
{quote}

I like the approach of using LOCAL_SERIAL for each dc and having specialized 
keys. I think we could include the dc parallelism lock as 
"RepairResource-\{dc}-\{i}" but only allow one repair per data center by 
hardcoding "i" to 1 in the first pass. This should make upgrades easier 
when we do allow parallel repairs. I like the node locks approach as well, but 
as you say there are probably some edge cases, so we could wait to add them 
until we allow parallel repairs; I don't think introducing them later would 
break upgrades.

{quote}
We should also think better about possible failure scenarios and network 
partitions. What happens if the node cannot renew locks in a remote DC due to a 
temporary network partition but the repair is still running? We should 
probably cancel a repair when unable to renew the lock, and also have some kind 
of garbage collector to kill ongoing repair sessions without associated locks, 
to protect against exceeding the configured dc_parallelism and 
node_parallelism.
{quote}
I agree, and we could probably store the parent repair session id in an extra 
column of the lock table and have a thread wake up periodically to check for 
repair sessions without locks. But then we must somehow be able to 
differentiate user-defined and automatically scheduled repair sessions. It 
could be done by having all repairs go through this scheduling interface, which 
would also reduce user mistakes with multiple repairs in parallel. Another 
alternative is to have a custom flag in the parent repair that makes the 
garbage collector ignore it if it's user-defined. I think that the garbage 
collector/cancel-repairs-when-unable-to-lock feature is something that should 
be included in the first pass.

The most basic failure scenarios should be covered by retrying a repair if it 
fails and logging a warning/error based on how many times it has failed. Could 
the retry behaviour cause some unexpected consequences?



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-11 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142861#comment-15142861
 ] 

Paulo Motta commented on CASSANDRA-10070:
-

Sorry for the delay, I will try to be faster on the next iterations. Below are 
some comments on your previous reply:

bq. A problem with this table is that if we have a setup with two data centers 
and three replicas in each data center, then we have a total of six replicas 
and QUORUM would require four replicas to succeed. This would require that both 
data centers are available to be able to run repair. 

All data centers involved in a repair must be available for a repair to 
start/succeed, so if we make the lock resource dc-aware and try to create the 
lock by contacting a node in each involved data center with LOCAL_SERIAL 
consistency that should be sufficient to ensure correctness without the need 
for a global lock. This will also play along well with both dc_parallelism 
global option and with the {{\-\-local}} or {{\-\-dcs}} table repair options.

I thought of something along those lines:

{noformat}
dc_locks = {}
# depends on both keyspace settings and table repair settings (--local or --dcs)
dcs = repair_dcs(keyspace, table)

for dc in dcs:
  for i in 0..dc_parallelism(dc):
    if ((lock = get_node(dc).execute("INSERT INTO lock (resource) VALUES ('RepairResource-{dc}-{i}') IF NOT EXISTS USING TTL 30;", LOCAL_SERIAL)) != nil)
      dc_locks[dc] = lock
      break

if len(dc_locks) != len(dcs):
  release_locks(dc_locks)
else:
  start_repair(table)
{noformat}
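As a runnable illustration of the pseudocode above, here is a tiny in-memory 
simulation of the all-or-nothing DC lock pass. The {{LockTable}} class stands 
in for the CQL lock table and its LWT semantics; all names here are 
illustrative, not from any patch:

```python
# In-memory stand-in for the CQL lock table: try_acquire() mimics
# "INSERT INTO lock (resource) VALUES (...) IF NOT EXISTS" under
# LOCAL_SERIAL. Purely illustrative.

class LockTable:
    def __init__(self):
        self.held = set()

    def try_acquire(self, resource):
        if resource in self.held:   # LWT applied == false
            return None
        self.held.add(resource)
        return resource

    def release(self, resource):
        self.held.discard(resource)

def acquire_dc_locks(table, dcs, dc_parallelism):
    """Grab one 'RepairResource-{dc}-{i}' slot per involved data center.

    All-or-nothing: if any DC has no free slot, release what was taken
    and return None so another scheduler can proceed."""
    dc_locks = {}
    for dc in dcs:
        for i in range(dc_parallelism):
            lock = table.try_acquire(f"RepairResource-{dc}-{i}")
            if lock is not None:
                dc_locks[dc] = lock
                break
    if len(dc_locks) != len(dcs):
        for lock in dc_locks.values():
            table.release(lock)
        return None
    return dc_locks
```

With dc_parallelism of 1, a second concurrent scheduler finds both slots taken 
and backs off without leaking a partial lock.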

bq. Just a question regarding your suggestion with the node_repair_parallelism. 
Should it be used to specify the number of repairs a node can initiate or how 
many repairs the node can be an active part of in parallel? I guess the second 
alternative would be harder to implement, but it is probably what one would 
expect.

The second alternative is probably the most desirable. Actually dc_parallelism 
by itself might cause problems, since we can have a situation where all repairs 
run on a single node or range, overloading those nodes. If we are to support 
concurrent repairs in the first pass, I think we need both dc_parallelism and 
node_parallelism options together.

I thought we could extend the previous lock acquiring algorithm with:

{noformat}
dc_locks = previous algorithm

if len(dc_locks) != len(dcs):
  release_locks(dc_locks)
  return

node_locks = {}
nodes = repair_nodes(table, range)

for node in nodes:
  for i in 0..node_parallelism(node):
    if ((lock = node.execute("INSERT INTO lock (resource) VALUES ('RepairResource-{node}-{i}') IF NOT EXISTS USING TTL 30;", LOCAL_SERIAL)) != nil)
      node_locks[node] = lock
      break

if len(node_locks) != len(nodes):
  release_locks(dc_locks)
  release_locks(node_locks)
else:
  start_repair(table)
{noformat}
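A self-contained sketch of this second pass: after the DC slots are held, try 
one per-replica slot, releasing everything (node locks and the previously 
acquired DC locks) on partial failure. The set-based lock table and all names 
are illustrative only:

```python
# Illustrative model of the node-level lock pass described above.
held = set()

def try_acquire(resource):
    if resource in held:          # mimics "INSERT ... IF NOT EXISTS" failing
        return None
    held.add(resource)
    return resource

def release(resource):
    held.discard(resource)

def acquire_node_locks(nodes, node_parallelism, dc_locks):
    node_locks = {}
    for node in nodes:
        for i in range(node_parallelism):
            lock = try_acquire(f"RepairResource-{node}-{i}")
            if lock is not None:
                node_locks[node] = lock
                break
    if len(node_locks) != len(nodes):
        # Roll back both levels so configured parallelism is never exceeded.
        for lock in list(node_locks.values()) + list(dc_locks.values()):
            release(lock)
        return None
    return node_locks
```

Note that a failed node pass also gives back the DC slots, which is exactly 
the starvation-prone step the comment above warns about.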

This is becoming a bit complex and there are probably some edge cases and/or 
starvation scenarios, so we should think carefully before jumping into 
implementation. What do you think about this approach? Should we stick to a 
simpler non-parallel version in the first pass, or think this through and 
already support parallelism in the first version?

bq. It should be possible to extend the repair scheduler with subrange repairs

I like the token_division approach for supporting subrange repairs in addition 
to {{-pr}}, but we can think about this later.

bq. Agreed, are there any other scenarios that we might have to take into 
account?

I can only think of upgrades and range movements (bootstrap, move, removenode, 
etc) right now.

We should also think better about possible failure scenarios and network 
partitions. What happens if the node cannot renew locks in a remote DC due to a 
temporary network partition but the repair is still running? We should 
probably cancel a repair when unable to renew the lock, and also have some kind 
of garbage collector to kill ongoing repair sessions without associated locks, 
to protect against exceeding the configured {{dc_parallelism}} and 
{{node_parallelism}}.


[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-05 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15134231#comment-15134231
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

[~yukim] [~pauloricardomg] Thanks for the comments, great questions/suggestions!

Regarding your questions about the locking:
{quote}
* What would the "lock resource" be like for repair scheduling? I think the 
value controls the number of repair jobs running at a given time in the whole 
cluster, and we don't want to run too many repair jobs at once.
* I second Yuki Morishita's first question above, in that we need to better 
specify how cluster-wide repair parallelism is handled: is it fixed or 
configurable? Can a node run repair for multiple ranges in parallel? Perhaps we 
should have a node_repair_parallelism (default 1) and dc_repair_parallelism 
(default 1) global config and reject starting repairs above those thresholds.
{quote}
The thought with the lock resource was that it could be something simple, like 
a table defined as:
{noformat}
CREATE TABLE lock (
resource text PRIMARY KEY
)
{noformat}
And then the different nodes would try to get the lock using LWT with TTL:
{noformat}
INSERT INTO lock (resource) VALUES ('RepairResource') IF NOT EXISTS USING TTL 
30;
{noformat}
After that, the node would have to keep updating the locked resource while 
running the repair, to prevent someone else from taking the lock. The 
value "RepairResource" could just as easily be defined as "RepairResource-N", 
so that it would be possible to allow repairs to run in parallel.
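As a rough, self-contained model of this acquire-and-keep-renewing behaviour 
(the clock is simulated, the API is invented, and in CQL the renewal would 
likely be a conditional UPDATE rather than another INSERT):

```python
# Toy model of a TTL-based lock: the holder must renew before the TTL
# lapses, otherwise any other node can take over 'RepairResource'.
# The 30s TTL follows the sketch above; everything else is illustrative.

TTL = 30

class TtlLock:
    def __init__(self):
        self.rows = {}  # resource -> (holder, expiry)

    def try_acquire(self, resource, holder, now):
        row = self.rows.get(resource)
        if row is not None and row[1] > now and row[0] != holder:
            return False                       # live lock held by someone else
        self.rows[resource] = (holder, now + TTL)  # fresh acquire or renewal
        return True
```

If the holder stops renewing (crash, partition), the row simply expires and 
another node's insert succeeds, which is what makes the scheme self-healing.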

A problem with this table is that if we have a setup with two data centers and 
three replicas in each data center, then we have a total of six replicas and 
QUORUM would require four replicas to succeed. This would require that both 
data centers are available to be able to run repair. Since some of the 
keyspaces might not be replicated across both data centers we would still have 
to be able to run repair even if one of the data centers is unavailable. This 
also applies if we should "force" local dc repairs if a data center has been 
unavailable too long. There are two options as I see it on how to solve this:
* Get the lock with local_serial during these scenarios.
* Have a separate lock table for each data center *and* a global one.

I guess the easiest solution would be to use local_serial, but I'm not sure if 
it might cause some unexpected behavior. If we would go for the other option 
with separate tables it would probably increase the overall complexity, but it 
would make it easier to restrict the number of parallel repairs in a single 
data center.

Just a question regarding your suggestion with the node_repair_parallelism. 
Should it be used to specify the number of repairs a node can initiate or how 
many repairs the node can be an active part of in parallel? I guess the second 
alternative would be harder to implement, but it is probably what one would 
expect.

---

{quote}
* It seems the scheduling only makes sense for repairing the primary range of 
the node ('nodetool -pr'), since we end up repairing all nodes eventually. Are 
you considering other options like subrange ('nodetool -st -et') repair?
* For subrange repair, we could maybe have something similar to reaper's 
segmentCount option, but since this would add more complexity we could leave 
for a separate ticket.
{quote}

It should be possible to extend the repair scheduler with subrange repairs, 
either by having it as an option per table or by having a separate scheduler 
for it. The separate scheduler would just be another plugin that could replace 
the default repair scheduler. If we go for a table configuration it could be 
that the user either specifies pr or the number of segments to divide the token 
range in, something like:
{noformat}
repair_options = {..., token_division='pr'};   // Use primary range repair
or
repair_options = {..., token_division='2048'}; // Divide the token range in 2048 slices
{noformat}
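A small sketch of what a token_division='2048' setting could translate to for 
the full Murmur3 token space (the bounds and the even-split strategy are 
assumptions for illustration; real subrange repair would account for vnode 
ownership):

```python
# Illustrative token-range splitter for a hypothetical token_division=N
# option: divide the full Murmur3 token space into N contiguous slices.

MIN_TOKEN = -2**63      # Murmur3Partitioner minimum token
MAX_TOKEN = 2**63 - 1   # Murmur3Partitioner maximum token

def split_token_range(divisions):
    span = (MAX_TOKEN - MIN_TOKEN) // divisions
    slices = []
    start = MIN_TOKEN
    for i in range(divisions):
        # Last slice absorbs the rounding remainder so the cover is exact.
        end = MAX_TOKEN if i == divisions - 1 else start + span
        slices.append((start, end))
        start = end
    return slices
```

A scheduler could then feed each (start, end) slice to a subrange repair, one 
at a time or according to the parallelism limits discussed above.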
If we would have a separate scheduler it could just be a configuration for it. 
Personally I would prefer to have it all in a single scheduler and I agree that 
it should probably be a separate ticket to keep the complexity of the base 
scheduler to a minimum. But I think this is a feature that will be very much 
needed both with non-vnode token assignment and also with the possibility to 
reduce the number of vnodes as of CASSANDRA-7032.

---

{quote}
* While pausing repair is a nice feature for user-based interruptions, we could 
probably embed system-known interruptions (such as when a bootstrap or upgrade 
is going on) in the default rejection logic.
{quote}

Agreed, are there any other scenarios that we might have to take into account?


[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-04 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15133253#comment-15133253
 ] 

Paulo Motta commented on CASSANDRA-10070:
-

Nice work [~molsson]. Overall the design doc looks great and addresses most of 
the issues raised previously, just a few minor comments/questions:
* I second [~yukim]'s first question above, in that we need to better specify 
how cluster-wide repair parallelism is handled: is it fixed or configurable? 
Can a node run repair for multiple ranges in parallel? Perhaps we should have a 
{{node_repair_parallelism}} (default 1) and {{dc_repair_parallelism}} (default 
1) global config and reject starting repairs above those thresholds.
* For subrange repair, we could maybe have something similar to 
[reaper|https://github.com/spotify/cassandra-reaper]'s {{segmentCount}} option, 
but since this would add more complexity we could leave for a separate ticket.
* While pausing repair is a nice feature for user-based interruptions, we could 
probably embed system-known interruptions (such as when a bootstrap or upgrade 
is going on) in the default rejection logic.

Maybe the spotify reaper folks have something to add based on their experience 
with automatic repair scheduling (cc [~Bj0rn], [~zvo]).



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-04 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15132680#comment-15132680
 ] 

Yuki Morishita commented on CASSANDRA-10070:


[~molsson] Thanks for the write-up. I have a couple of questions:

* What would the "lock resource" be like for repair scheduling? I think the 
value controls the number of repair jobs running at a given time in the whole 
cluster, and we don't want to run too many repair jobs at once.
* It seems the scheduling only makes sense for repairing the primary range of 
the node ('nodetool -pr'), since we end up repairing all nodes eventually. Are 
you considering other options like subrange ('nodetool -st -et') repair?



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-02-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128529#comment-15128529
 ] 

Jonathan Ellis commented on CASSANDRA-10070:


[~devdazed], you had some great suggestions above.  Do you have time to look at 
the draft Marcus attached?



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-01-20 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108704#comment-15108704
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

I completely agree, I should create a document describing these things.

I've also thought about making a high-level document for the whole proposal, to 
see if everyone agrees that this is the way to go about the distributed 
scheduling. Then we can take it from there, revise the proposal, and hopefully 
later break the JIRA into several tasks to make this feature easier to review 
and develop.

I think this document should contain:
* High level description of proposal (flow charts, etc.)
* Problems that could occur and possible solutions

Any thoughts or ideas on this?



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2016-01-19 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107570#comment-15107570
 ] 

Jonathan Ellis commented on CASSANDRA-10070:


Marcus, I think Russell has made some very valuable suggestions as to the kind 
of complications we need to be thinking about here.

Before jumping back to another patch, I think it would be useful to put 
together a high level design document that thinks through these questions and 
proposes approaches to deal with them.  Then we can get feedback to you faster 
than at the level of actual code.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-12-10 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15050440#comment-15050440
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

{quote}
While it may intuitively seem like you want to kick off a repair as soon as a 
node comes back online, it can be very dangerous in a production environment.

Starting the most resource-intensive process on a node that is already 
problematic, in a cluster that is already having issues, can exacerbate the 
issue and lead to a longer outage, or degradation, than anticipated. 
{quote}
True, it should probably be a feature enabled by the user, maybe with a 
configurable delay before the repair is actually performed?

{quote}
Network reliability is also another aspect of this. Let's say you have 3 nodes, 
RF=3 and there is a partition dividing node A and node B. All nodes are still 
actually up, but in this case node A will start a repair on B and B will start 
a repair on A. Now 2/3 of your cluster is needlessly repairing, which can cause 
serious performance problems, especially when running a loaded cluster.
{quote}
The repairs are still executed with respect to the distributed locking, so 
there would only be one node running repair at a time. But they would send the 
job information to each other in parallel.

{quote}
Also:
Other times you might not want a repair automatically started:
* The cluster is in the middle of a rolling upgrade where streaming is broken 
between versions.
* Heavily loaded clusters during normal operation (some users schedule repairs 
at night to not affect performance during normal hours of operation)
* Clusters where the read-consistency is high enough to account for the hints 
beyond the window allowing the user to schedule the repair for a time that 
makes sense for their cluster and use-case.
{quote}
* This is something that the repair scheduler should handle either way, to 
avoid repairing when the cluster is unable to perform it (version 
incompatibility, nodes are down, etc.).
* There is a plug-in point for schedule policies that can be used to decide if 
repairs should run, so it would be possible to prevent repairs due to some 
condition(s). The conditions could be based on what the user wants, be it 
maintenance windows or resource usage. It would also be possible to prevent 
normally scheduled repairs during some hours but allow manually scheduled 
repairs at all times.
* This would be possible by making this feature optional.

---

{quote}
I don't know much about Cassandra internals, so one of the regular devs would 
know better, but my thought would be that during a restart, somewhere it 
figures out that it needs to replay part of the commit log to rebuild memtables 
that hadn't been flushed to disk. The timestamp of the last thing in the commit 
log might be a good estimate of when the node went down, and you could compare 
that to the current time to figure out how long the node was down.

I wouldn't worry about the second case since it would be hard to get that right.
{quote}
Looking at the commit log might be a good enough approach. I'll look into that.
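For instance, a crude approximation could use segment file modification times 
as a stand-in for "the timestamp of the last thing in the commit log". The 
directory path, file pattern and the heuristic itself are assumptions for 
illustration, not how Cassandra actually determines downtime:

```python
# Illustrative heuristic only: estimate downtime from the mtime of the
# newest commit log segment. Path and pattern are assumed defaults.
import glob
import os
import time

def estimated_downtime_seconds(commitlog_dir="/var/lib/cassandra/commitlog"):
    segments = glob.glob(os.path.join(commitlog_dir, "*.log"))
    if not segments:
        return None  # nothing to base an estimate on
    last_write = max(os.path.getmtime(p) for p in segments)
    return time.time() - last_write
```

The scheduler could then compare the estimate against the hint window to 
decide whether a post-restart repair is warranted.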

---

Overall I'd say that if this feature (repairing after the hint window has been 
exceeded) should exist, it should probably be something that is enabled per 
table, but disabled by default.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-12-07 Thread Jim Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15044921#comment-15044921
 ] 

Jim Meyer commented on CASSANDRA-10070:
---

Wouldn't it be safer if node A itself checked how long it had been down and 
scheduled its own repairs? Why have node B guess that node A was down? I've 
seen cases where nodes couldn't communicate, so each thinks the other node is 
down, when actually both nodes are up.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-12-07 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15045189#comment-15045189
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

I agree that it would probably be safer for node A to check how long it has 
been down itself, but I'm not sure how that can be done reliably. Also, if 
nodes A & B couldn't communicate for a period longer than the hint window, 
they will not have hints for each other, so in that case they should run a 
repair even if both were up the whole time.

Note that I'm not against having the check on the node that was down; it's just 
that I think both cases, a node being down and two nodes being unable to 
communicate, should require a repair. If the second case is not required, do 
you have any suggestions on how the self-check could be implemented?



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-12-07 Thread Russell Bradberry (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15045293#comment-15045293
 ] 

Russell Bradberry commented on CASSANDRA-10070:
---

While it may intuitively seem like you want to kick-off a repair as soon as a 
node comes back online, it can be very dangerous in a production environment. 

Starting the most resource intensive process on a node that is already 
problematic, in a cluster that is already having issues can exacerbate the 
issue and lead to a longer outage, or degradation, than anticipated.  

Network reliability is another aspect of this. Let's say you have 3 nodes, 
RF=3, and there is a partition dividing node A and node B. All nodes are 
actually still up, but in this case node A will start a repair on B and B will 
start a repair on A. Now 2/3 of your cluster is needlessly repairing, which can 
cause serious performance problems, especially when running a loaded cluster.

Other times you might not want a repair started automatically:

 - The cluster is in the middle of a rolling upgrade where streaming is broken 
between versions.
 - Heavily loaded clusters during normal operation (some users schedule repairs 
at night so as not to affect performance during normal hours of operation).
 - Clusters where the read consistency is high enough to account for hints 
beyond the window, allowing the user to schedule the repair for a time that 
makes sense for their cluster and use case.







[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-12-07 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15044637#comment-15044637
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

[~zemeyer] I've added the possibility to schedule a job remotely, so that one 
node can tell another node to run a certain job. Right now it's used when a 
node discovers that another node has been down longer than the hint window; it 
then tells that node to repair its ranges ASAP. The remote scheduling uses the 
distributed locking mechanism to prevent multiple nodes from telling the same 
node to run the repair at the same time.

So a simple flow could be:
Node A goes down at 12:00
Node B recognizes it and saves "Node A DOWN @ 12:00" locally
Node A comes back up at 16:00
Node B sees Node A as online again at 16:00 and notes that Node A has been 
down since 12:00, i.e. for 4 hours.
Node B sends a repair job to Node A for each table whose hint window is 4 hours 
or less.
Node A runs all repairs
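The flow above boils down to a small downtime check. Here is a minimal sketch 
in Python (not the actual patch; the table names and hint windows are purely 
illustrative):

```python
from datetime import datetime, timedelta

# Hypothetical per-table hint windows; in the real patch these would come
# from table configuration, not a hard-coded dict.
HINT_WINDOWS = {
    "keyspace1.events": timedelta(hours=3),
    "keyspace1.users": timedelta(hours=6),
}

def tables_needing_repair(down_since: datetime, back_up_at: datetime):
    """Return tables whose hint window is shorter than the observed downtime,
    i.e. tables for which hints may have been dropped while the node was down."""
    downtime = back_up_at - down_since
    return [t for t, window in HINT_WINDOWS.items() if window <= downtime]

# Node A down 12:00-16:00 -> 4 hours of downtime; only the table with a
# 3-hour hint window needs an immediate repair.
needs = tables_needing_repair(datetime(2015, 12, 7, 12, 0),
                              datetime(2015, 12, 7, 16, 0))
```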

---

I'll continue to work on the feature of pausing all repairs and also on the 
prevention mechanism. I've done some work on the prevention mechanism for jobs: 
it checks the job history for repairs and only reports that it *can* run a 
repair if some range hasn't been repaired within the hint window (it's still 
based on the interval though, so the repair shouldn't run more than once 
per interval in the normal case).

To the prevention mechanism I should probably add a way for it to avoid doing 
multiple repairs for a single node at the same time. After that I'll add the 
possibility to run parallel repair tasks over the cluster.

---

The git branch is [here|https://github.com/emolsson/cassandra/commits/10070].



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-12-07 Thread Jim Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15045336#comment-15045336
 ] 

Jim Meyer commented on CASSANDRA-10070:
---

I don't know much about Cassandra internals, so one of the regular devs would 
know better, but my thought is that during a restart the node figures out that 
it needs to replay part of the commit log to rebuild memtables that hadn't 
been flushed to disk.  The timestamp of the last entry in the commit log might 
be a good estimate of when the node went down, and you could compare that to 
the current time to figure out how long the node was down.

I wouldn't worry about the second case, since it would be hard to get that 
right.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-12-07 Thread Jim Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15045350#comment-15045350
 ] 

Jim Meyer commented on CASSANDRA-10070:
---

I think this is part of the motivation for building repair scheduling into 
Cassandra.  When we write an external repair scheduler, it has no idea what the 
state of the cluster is, so it just blindly issues repairs based on a time 
schedule.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-10-29 Thread Jon Haddad (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14980310#comment-14980310
 ] 

Jon Haddad commented on CASSANDRA-10070:


[~amandava] I just opened CASSANDRA-10619



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-10-29 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14980078#comment-14980078
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

Just to clarify, the automatic scheduling is done on a node level. It 
distributes work by having nodes "compete" over who has the highest need for a 
repair, and then uses a CAS lock to obtain the right to run one. So the repair 
process would continue during an upgrade, but I assume it would fail as things 
stand and the repair job would be retried. The problem here is that this job 
would keep trying until it succeeded, since it has the highest priority, even 
if there are other repair jobs that could run (e.g. if only part of the 
cluster was upgraded).
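A minimal sketch of the CAS-based locking idea (illustrative Python, not the 
actual implementation, which would presumably take the lock via a lightweight 
transaction such as INSERT ... IF NOT EXISTS on a Cassandra table):

```python
# In-memory stand-in for the distributed lock table; the real scheduler
# would use a compare-and-set write against Cassandra itself.
lock_table = {}

def try_acquire(resource: str, node: str) -> bool:
    """Compare-and-set: succeed only if nobody holds the lock yet."""
    if resource in lock_table:
        return False
    lock_table[resource] = node
    return True

# Two nodes compete for the same repair slot; only the first CAS wins,
# so at most one repair runs for that slot at a time.
first = try_acquire("repair-lock-1", "node-a")
second = try_acquire("repair-lock-1", "node-b")
```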

To allow repairs during an upgrade scenario I think we need to have both 
CASSANDRA-7530 & CASSANDRA-8110 in place.
Until then I see two options:
* Make it possible to "pause" all repair scheduling, e.g. during upgrade 
scenarios.
* Make the repair job recognize that it cannot run at this time and allow 
another repair job to run instead.

I wouldn't mind implementing both options, since there might be scenarios when 
both are needed, even if we can repair between versions.




[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-10-28 Thread Avinash Mandava (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978802#comment-14978802
 ] 

Avinash Mandava commented on CASSANDRA-10070:
-

Right now streaming between Cassandra versions isn't recommended, but I'm 
wondering how we would upgrade to a new version with automatic repairs running. 
If I'm doing a rolling upgrade, right now I have to stop the repair process to 
prevent streaming between nodes and then upgrade, and then resume the repair 
process. But if we are thinking of including automatic repairs, might it be 
valuable to allow people to keep the repair process going while they upgrade? I 
can see how an upgrade is infrequent enough for this suggestion to be overkill, 
but curious what people think.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-08-28 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718230#comment-14718230
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

This is an optional feature in the way it's implemented; you can disable it 
and do only manual/scripted repairs if you want. I think this is fundamental 
functionality that should be part of the codebase, otherwise the same argument 
could be made against compactions, which \*could\* also be handled by an 
external tool. Note that the actual repairs are always handled inside of C* 
and that this is just a way to schedule them.

I think that data consistency management should be part of a database's 
functionality. There are already hints and read repairs that are handled 
inside of C*, so why shouldn't repairs be handled that way as well?



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-08-28 Thread Malcolm (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718168#comment-14718168
 ] 

Malcolm commented on CASSANDRA-10070:
-

Operational simplicity is nice, however anything that increases the surface 
area of what Cassandra does increases the chances of bugs.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-08-28 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718206#comment-14718206
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

As it is right now, it would wait an hour after starting up (configurable) 
before starting to schedule repairs. At that point the repair priority would 
be based on when it last ran a repair and how often it can run (P = (H+1) * bP, 
described above). So it should have the highest priority if it was supposed to 
repair during the time it was down.

Right now there is no such option, so a manual repair would be required.
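For reference, the priority formula is trivial to sketch (assuming, per the 
description above, H is the number of hours the node is overdue and bP is a 
base priority):

```python
def repair_priority(hours_overdue: int, base_priority: int = 1) -> int:
    """P = (H + 1) * bP: the longer a node has gone past its repair
    interval, the higher its priority; +1 keeps a freshly eligible
    node (H = 0) from getting zero priority."""
    return (hours_overdue + 1) * base_priority

# A node 4 hours overdue outranks one that just became eligible.
assert repair_priority(4) > repair_priority(0)
```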



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-08-28 Thread Jim Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718522#comment-14718522
 ] 

Jim Meyer commented on CASSANDRA-10070:
---

Would it be difficult to add an option like that?  One of the advantages of 
building the scheduler into C* is that it could have insight into the state of 
the cluster and respond to node downtime.  It could reduce the consistency gap 
between the hint window being exceeded and the next regularly scheduled repair 
for a critical table.  Then one could set the hint window smaller, the regular 
schedule to once a week, and the recovery repair to queue a repair when 
downtime had exceeded the hint window.

Separate question, is the 'parallelism' attribute scalable for large clusters?  
If I have a 1000 node cluster and want to allow up to 10% of my nodes to run 
repairs at the same time, how would I specify that?  Would that be a system 
config param or a table level attribute?



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-08-28 Thread Jim Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718527#comment-14718527
 ] 

Jim Meyer commented on CASSANDRA-10070:
---

+1 to including.  This feature would help with multi-tenancy and extend the 
idea of tunable consistency since different use cases will have different 
repair requirements.  Individual applications could self serve their repair 
frequency via the table properties instead of having an administrator guess 
what frequency is needed.

It is a difficult and error-prone chore for an application developer to devise 
a reliable external mechanism for scheduling repairs.  It often ends up as a 
simple cron job that blindly repairs all keyspaces once a day.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-08-28 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14719321#comment-14719321
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

Good point, I can take a look into how difficult it would be to implement 
something like that.

The parallelism attribute is a mapping to the repair parallelism (parallel or 
sequential). I have thought about the possibility of running multiple repairs 
at once as well; I guess it would be a system-level configuration. I think the 
attribute should specify the number of parallel maintenance tasks to perform.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-08-27 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717170#comment-14717170
 ] 

sankalp kohli commented on CASSANDRA-10070:
---

[~malcolm] With CASSANDRA-6434, it becomes more important to run repairs. It is 
not just to keep things in sync but to drop tombstones. I would vote for 
keeping this in C*. I am not sure if it is possible to keep it on the side as 
C* is a single process. 



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-08-27 Thread Jon Haddad (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717204#comment-14717204
 ] 

Jon Haddad commented on CASSANDRA-10070:


As an operator, +1 to including.  Anything that reduces the surface area for 
user mistake is a good thing.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-08-27 Thread Malcolm (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14716248#comment-14716248
 ] 

Malcolm commented on CASSANDRA-10070:
-

Is there any strong reason to make this part of the Cassandra codebase?  All of 
this work can be expressed and handled in an external tool, keeping the 
Cassandra codebase focused more on storing data.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-08-26 Thread Jim Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14713496#comment-14713496
 ] 

Jim Meyer commented on CASSANDRA-10070:
---

This sounds like a very useful feature.  I'm wondering what the behavior will 
be when a node that has been down for a while comes back up.  I assume it would 
see that it is overdue for some repairs and schedule them in a load friendly 
manner.

Now suppose I have a table where consistency is very important.  Would I be 
able to set table attributes to schedule a high priority repair if the node had 
been down longer than max_hint_window_in_ms, so that it can be made consistent 
as soon as possible?  Or would that still need to be done manually?



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-08-25 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710966#comment-14710966
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

*An explanation of the patch so far*

*Tables*
Added a table property to specify that the table should be repaired 
automatically, the minimum delay between repairs, and some repair parameters.

{noformat}
CREATE TABLE example
(
  key text,
  value text,
  PRIMARY KEY(key)
) WITH repair_scheduling = {'enabled': true, 'min_delay': 86400, 'incremental': 
true, 'parallelism': 'sequential'};
{noformat}
This would create a table that would get repaired at most once a day (86400 
seconds between each repair) using incremental sequential repair.

I added a package, scheduling, for all maintenance-scheduling-related 
classes. The main class in this package is the ScheduleManager, which updates 
and runs the scheduled jobs. This package has two pluggable interfaces and some 
abstract classes. Which schedulers/policies to use is configurable in 
cassandra.yaml.

*Interfaces*
- *IScheduler* - Used to create new scheduled jobs in the background.
- *ISchedulePolicy* - Used to prevent jobs from running based on certain conditions.

*Implementations*
- *RepairScheduler* - Implements the IScheduler interface and is responsible 
for checking the tables for repair scheduling options. It also listens for 
schema changes and adds new ScheduledJobs for the ScheduleManager to run.
- *FileSchedulePolicy* - Implements the ISchedulePolicy interface and uses a 
configuration file to define when scheduled jobs should be prevented from 
running. This policy is not enabled by default.

*Abstract classes*
- *ScheduledJob* - The base class for all scheduled jobs, containing 
functionality to calculate priority, when to run next, etc. 
- *ScheduledTask* - The base class for all tasks.

The priority of the jobs is calculated as *P = (H + 1) \* bP* where *P* is the 
priority, *H* is the number of hours that have passed since the job could first 
have been executed (based on min_delay) and *bP* is the base priority.
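As a small illustration of the formula (class and method names here are 
invented for the example, not taken from the patch):

```java
public class PriorityCalc {
    // P = (H + 1) * bP, as described above: H is the number of hours the
    // job is overdue (past its min_delay window), bP is the base priority.
    // A job that just became runnable (H = 0) competes at its base priority;
    // the weight then grows linearly the longer it is starved.
    static int priority(int hoursOverdue, int basePriority) {
        return (hoursOverdue + 1) * basePriority;
    }

    public static void main(String[] args) {
        System.out.println(priority(0, 10)); // just runnable: 10
        System.out.println(priority(5, 10)); // 5 hours overdue: 60
    }
}
```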

*Other*
- *DistributedLock* - Used to get the run lock.
- *JobConfiguration* - Common configuration for the scheduled jobs, such as 
the minimum delay between jobs, the base priority, whether the job is enabled 
and whether it should run only once.
- *ScheduledJobQueue* - A queue of jobs, dynamically ordered by their 
priority.
- *ScheduledRepairJob* - Implementation of ScheduledJob that holds a list of 
repair tasks and repairs a single table.
- *ScheduledRepairTask* - Implementation of ScheduledTask used to repair a 
single range for a table.

The DistributedLock uses two tables in the system_distributed keyspace. One is 
for writing priorities, so that all nodes know about each other's priorities. 
The other is for the lock itself, which is accessed through LWT. A node first 
writes its own priority to the table, then reads all nodes' priorities. If it 
has the highest priority, it tries to take the lock.
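A rough sketch of what the two tables and the acquisition step might look like 
(table names, columns and values are illustrative guesses, not the actual 
schema from the patch):

{noformat}
-- Each node announces its current priority here, so every node
-- can read all priorities and see who should go first:
CREATE TABLE system_distributed.repair_priority (
  resource text,   -- what is being competed for, e.g. 'repair'
  node uuid,       -- host id of the competing node
  priority int,
  PRIMARY KEY (resource, node)
);

-- The lock itself, taken with a lightweight transaction:
CREATE TABLE system_distributed.repair_lock (
  resource text PRIMARY KEY,
  holder uuid
);

-- Only the node that saw itself as highest-priority attempts this;
-- the LWT ensures at most one node holds the lock:
INSERT INTO system_distributed.repair_lock (resource, holder)
VALUES ('repair', 123e4567-e89b-12d3-a456-426655440000)
IF NOT EXISTS;
{noformat}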

*Nodetool*
Added the \-s (\-\-scheduled) and \-sh (\-\-scheduled-high) flags to repair, 
indicating that it should be a scheduled (but only run once) repair and whether 
it should have the highest priority.

The reason for using interfaces and abstract classes is to give users the 
flexibility to add their own schedulers, and perhaps to make it possible to 
schedule e.g. cleanups through nodetool.

The branch is available here: 
https://github.com/emolsson/cassandra/commits/10070



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-08-20 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704457#comment-14704457
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

Working on a first draft of the scheduler, will hopefully have a patch set up 
in the next week.



[jira] [Commented] (CASSANDRA-10070) Automatic repair scheduling

2015-08-14 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696755#comment-14696755
 ] 

Marcus Olsson commented on CASSANDRA-10070:
---

I have previously done some work on something similar to this, although it was 
an external Java process that used JMX. It divides the tables into _Jobs_ that 
are prioritized against each other in a consistent manner. A Job is further 
divided into _Tasks_, where each task is responsible for repairing a certain 
range of that table. A Job can be seen as an atomic unit, where the success of 
the Job is based on the success of all its Tasks. It used LWT with a TTL to 
create a lock deciding which node has the right to run repair at any given 
moment. Since the lock had a TTL, it would disappear if the node holding it 
died. It also kept a repair history, so that it could continue from where it 
left off when restarted.
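As a hedged sketch of the LWT-with-TTL locking described above (the table and 
values are invented for illustration; the actual schema was not shown):

{noformat}
-- Try to take the lock; the TTL makes it expire automatically
-- if the holder dies before releasing or renewing it:
INSERT INTO repair_lock (resource, holder)
VALUES ('repair', 'node1')
IF NOT EXISTS
USING TTL 60;

-- While repairing, the holder periodically renews the lock:
UPDATE repair_lock USING TTL 60
SET holder = 'node1'
WHERE resource = 'repair'
IF holder = 'node1';
{noformat}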

---

*Some ideas for the implementation:*

*Core*
By reusing some of these concepts it would be possible to create a pluggable 
interface that can be used to prioritize these Jobs on a node level. By 
inserting this priority (a simple integer?) into a distributed table, it would 
be possible for other nodes to see which node has the highest priority to run a 
repair, to avoid starvation. Then, before running the repair, there could be 
another pluggable interface that can prevent repairs from starting under 
certain circumstances (e.g. node load). The automatic repair of a table could 
be enabled/disabled by a table property.
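Using the repair_scheduling table property sketched elsewhere in this ticket, 
toggling automatic repair per table might look like this (illustrative only; 
the final option name and shape were still under discussion):

{noformat}
ALTER TABLE example
WITH repair_scheduling = {'enabled': false};
{noformat}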

*Nodetool*
* Add the possibility to enable/disable all automatic repairs.
* Add the possibility to run a repair of a table when possible (using this 
distributed scheduling).
