[jira] [Commented] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh

2017-03-30 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950041#comment-15950041
 ] 

Blake Eggleston commented on CASSANDRA-13392:
-

This seems like a documentation problem. I don't think we should change the 
behavior of nodetool refresh; it should be treated as a shortcut for stopping a 
node, adding sstables, and restarting it. That's it. If an operator wants to 
add sstables to a live node, we should expect that they’re adding them in the 
correct state, and the nodetool docs should explicitly warn them to adjust the 
repaired status before adding the sstables to the data set. 

My reasoning here is:
* You each bring up valid use cases for nodetool refresh working one way or the 
other, so we can't really make assumptions about the operator's intentions.
* If a node unexpectedly dies after the sstables have been added to the 
dataset, it will load the sstables without adjusting any metadata when it 
restarts, so the failure recovery behavior would differ from the normal 
behavior pretty significantly.
* After (briefly) reviewing the CFS.loadNewSSTables code, along with the other 
addSSTable code, I’m not confident that removing the repaired status of 
manually added sstables wouldn’t also clear the repaired status of legit 
sstables inadvertently in some cases. Specifically, there doesn’t appear to be 
anything preventing streamed repaired tables from appearing on disk between 
when we get the initial set of sstables, and when we start scanning the files 
on disk for sstables to add.
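
The window described in the last bullet can be illustrated with a toy model. This is only a sketch under stated assumptions, not the actual CFS.loadNewSSTables code: the `SSTable` class, the `find_new_sstables` helper, and the file names are all hypothetical, and the real scan works on descriptors and files on disk rather than in-memory lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    name: str
    repaired_at: int  # 0 means unrepaired (hypothetical simplification)


def find_new_sstables(known_names, on_disk):
    """Return on-disk sstables whose names are absent from the earlier snapshot."""
    return [s for s in on_disk if s.name not in known_names]


def simulate_race():
    live = [SSTable("mc-1-big", repaired_at=100)]
    # The operator has already copied mc-2-big into the data directory.
    disk = live + [SSTable("mc-2-big", repaired_at=50)]

    # Step 1: refresh takes its initial snapshot of the live sstables.
    known_names = {s.name for s in live}

    # Step 2: a streamed, genuinely repaired sstable lands on disk
    # after the snapshot but before the directory scan.
    disk.append(SSTable("mc-3-big", repaired_at=200))

    # Step 3: the scan sees both files as "new" and cannot tell them apart.
    return find_new_sstables(known_names, disk)


candidates = simulate_race()
print([s.name for s in candidates])  # ['mc-2-big', 'mc-3-big']
```

In this model, clearing the repaired status on every candidate would also clear it on mc-3-big, which is the inadvertent case the bullet worries about.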




> Repaired status should be cleared on new sstables when issuing nodetool 
> refresh
> ---
>
> Key: CASSANDRA-13392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13392
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We can't assume that new sstables added when doing nodetool refresh 
> (ColumnFamilyStore#loadNewSSTables) are actually repaired if they have the 
> repairedAt flag set



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh

2017-03-30 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948933#comment-15948933
 ] 

Marcus Eriksson commented on CASSANDRA-13392:
-

bq. How would users know in which case to refresh or restart?
We would tell them in the error message if they try to refresh with repaired 
sstables?



[jira] [Commented] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh

2017-03-30 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948923#comment-15948923
 ] 

Stefan Podkowinski commented on CASSANDRA-13392:


bq. Well, if we are running nodetool refresh, we will want the data to reappear 
on replicas right? Someone copies in a bunch of sstables on one node, runs 
repair, that data should end up on all nodes right?

Shouldn't you use sstableloader in that case? I personally would never have 
thought of using nodetool refresh for this. For me it's simply a command to make 
copied sstables available in a running Cassandra process. I also don't 
understand why it should make a difference here whether you run refresh or 
restart the node. How would users know in which case to refresh or restart?





[jira] [Commented] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh

2017-03-30 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948909#comment-15948909
 ] 

Marcus Eriksson commented on CASSANDRA-13392:
-

Ok, so it is not safe to remove the repair flag and it is not safe to keep it. 
Maybe we should just fail the refresh if there is a repaired sstable, forcing 
the user to either mark it unrepaired using tools/bin/sstablerepairedset or 
restart the node to keep the flag?
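
A minimal sketch of what such a check could look like, assuming the candidate sstables are available as a mapping from name to repairedAt value (0 meaning unrepaired). The `RefreshRejected` exception, the `check_refresh_candidates` function, and the message wording are hypothetical and only illustrate the idea; nothing here is existing Cassandra code.

```python
class RefreshRejected(Exception):
    """Raised when a refresh would load sstables that carry a repaired flag."""


def check_refresh_candidates(candidates):
    """Return the sorted candidate names, or refuse if any is marked repaired.

    candidates: mapping of sstable name -> repairedAt (0 = unrepaired).
    """
    repaired = sorted(name for name, at in candidates.items() if at != 0)
    if repaired:
        raise RefreshRejected(
            "Refusing to refresh, repaired sstables found: "
            + ", ".join(repaired)
            + ". Mark them unrepaired with tools/bin/sstablerepairedset, "
            "or restart the node to keep the flag."
        )
    return sorted(candidates)


# Unrepaired sstables pass through; a repaired one aborts the whole refresh.
print(check_refresh_candidates({"mc-7-big": 0, "mc-5-big": 0}))
try:
    check_refresh_candidates({"mc-7-big": 0, "mc-8-big": 1490832000000})
except RefreshRejected as e:
    print(e)
```

Rejecting the whole batch, rather than loading the unrepaired subset, keeps the outcome all-or-nothing, so the operator never has to work out which files were picked up.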



[jira] [Commented] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh

2017-03-30 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948896#comment-15948896
 ] 

Stefan Podkowinski commented on CASSANDRA-13392:


Let's say one of my sstables got corrupted and removed/scrubbed manually 
afterwards. The node has been started again. Now the admin pulls the same 
sstable from yesterday's backup into the data dir and runs refresh. Having the 
"new" sstable replicated during the next repair would be rather unexpected and 
not safe. 



[jira] [Commented] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh

2017-03-30 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948837#comment-15948837
 ] 

Marcus Eriksson commented on CASSANDRA-13392:
-

Well, if we are running nodetool refresh, we will want the data to reappear on 
replicas right? Someone copies in a bunch of sstables on one node, runs repair, 
that data should end up on all nodes right?

We will not be moving data from repaired to unrepaired on any 'live' sstables 
on the node where we are running refresh - only on the sstables copied into the 
data directory. Or do you have some other use case where this is not true?



[jira] [Commented] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh

2017-03-30 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948820#comment-15948820
 ] 

Stefan Podkowinski commented on CASSANDRA-13392:


How would the repairedAt flag be set, if the sstables haven't been repaired 
before? Moving sstables from repaired to unrepaired again can resurrect data 
that has already been purged from the replicas, so it's not safe to assume that 
we can always drop sstables to unrepaired without consistency implications.
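
The resurrection risk can be shown with a toy model. This is a sketch under strong simplifying assumptions, not how Cassandra stores or repairs data: each replica is a plain dict of key to (timestamp, value), a value of None stands for a tombstone, and the hypothetical `merge` and `purge_tombstones` helpers stand in for repair convergence and for compaction after gc_grace.

```python
def merge(a, b):
    """Last-write-wins merge of two views, as a repair would converge them."""
    out = dict(a)
    for key, cell in b.items():
        if key not in out or cell[0] > out[key][0]:
            out[key] = cell
    return out


def purge_tombstones(replica):
    """Drop tombstones, as compaction may do once gc_grace has passed."""
    return {k: c for k, c in replica.items() if c[1] is not None}


# A row written at t=1 was deleted at t=2; both replicas hold the tombstone.
replica_a = {"k": (2, None)}
replica_b = {"k": (2, None)}

# The tombstone is purged everywhere.
replica_a = purge_tombstones(replica_a)
replica_b = purge_tombstones(replica_b)

# An old sstable holding the t=1 write is added to A and treated as unrepaired.
old_sstable = {"k": (1, "stale")}
replica_a = merge(replica_a, old_sstable)

# The next repair streams the "new" data to B: the deleted row is back.
replica_b = merge(replica_b, replica_a)
print(replica_b)  # {'k': (1, 'stale')}
```

With the tombstone gone, nothing on the replicas shadows the old write any more, so treating the old sstable as unrepaired lets repair spread it again.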
