[jira] [Comment Edited] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh

2017-03-30 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950041#comment-15950041
 ] 

Blake Eggleston edited comment on CASSANDRA-13392 at 3/30/17 11:13 PM:
---

This seems like a documentation problem. I don't think we should change the 
behavior of nodetool refresh; it should be treated as a shortcut for stopping a 
node, adding sstables, and restarting it. That's it. If an operator wants to 
add sstables to a live node we should expect that they’re adding them in the 
correct state, and the nodetool docs should explicitly warn them that they 
should modify repaired statuses prior to adding the sstables to the data 
directories. 

My reasoning here is:
* You each bring up valid use cases for nodetool refresh working one way or the 
other, so we can't really make assumptions about the operator's intentions.
* If a node unexpectedly dies after the sstables have been added to the data 
directories, it will load the sstables without adjusting any metadata when it 
restarts, so the failure recovery behavior would differ from the normal 
behavior pretty significantly.
* After (briefly) reviewing the CFS.loadNewSSTables code, along with the other 
addSSTable code, I’m not confident that removing the repaired status of 
manually added sstables wouldn’t also clear the repaired status of legit 
sstables inadvertently in some cases. Specifically, there doesn’t appear to be 
anything preventing streamed repaired tables from appearing on disk between 
when we get the initial set of sstables, and when we start scanning the files 
on disk for sstables to add.
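
The window described in the last point can be sketched with a toy model. This is not Cassandra code: the SSTable class, the refresh_clearing_repaired function, the origin field, and the file names are all hypothetical, and the only assumption is that a refresh first snapshots the sstables the node already tracks and then scans the directory, treating anything outside the snapshot as new.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    name: str
    repaired_at: int  # 0 means unrepaired in this toy model
    origin: str       # "live", "manual", or "streamed" (illustration only)


def refresh_clearing_repaired(on_disk, known_names, arrives_mid_refresh=()):
    """Toy refresh that resets repaired_at on every table it considers new."""
    # Step 1: snapshot the names the node already tracks.
    snapshot = set(known_names)
    # Window: other tables may land on disk before the scan starts.
    on_disk.extend(arrives_mid_refresh)
    # Step 2: scan the directory; anything outside the snapshot looks new.
    new_tables = [t for t in on_disk if t.name not in snapshot]
    for t in new_tables:
        t.repaired_at = 0
    return new_tables


live = SSTable("mc-1-big", repaired_at=1490000000, origin="live")
manual = SSTable("mc-2-big", repaired_at=1480000000, origin="manual")
streamed = SSTable("mc-3-big", repaired_at=1490900000, origin="streamed")

disk = [live, manual]
cleared = refresh_clearing_repaired(
    disk, known_names=[live.name], arrives_mid_refresh=[streamed]
)
wrongly_cleared = [t.name for t in cleared if t.origin == "streamed"]
print(wrongly_cleared)
```

In this model the streamed table is indistinguishable from the manually copied one at scan time, so its repaired status is cleared along with it. Whether the real code path has the same window would need to be confirmed against CFS.loadNewSSTables itself.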





was (Author: bdeggleston):
This seems like a documentation problem. I don't think we should change the 
behavior of nodetool refresh; it should be treated as a shortcut for stopping a 
node, adding sstables, and restarting it. That's it. If an operator wants to 
add sstables to a live node we should expect that they’re adding them in the 
correct state, and the nodetool docs should explicitly warn them that they 
should modify repaired statuses prior to adding the sstables to the data set. 

My reasoning here is:
* You each bring up valid use cases for nodetool refresh working one way or the 
other, so we can't really make assumptions about the operator's intentions.
* If a node unexpectedly dies after the sstables have been added to the 
dataset, it will load the sstables without adjusting any metadata when it 
restarts, so the failure recovery behavior would differ from the normal 
behavior pretty significantly.
* After (briefly) reviewing the CFS.loadNewSSTables code, along with the other 
addSSTable code, I’m not confident that removing the repaired status of 
manually added sstables wouldn’t also clear the repaired status of legit 
sstables inadvertently in some cases. Specifically, there doesn’t appear to be 
anything preventing streamed repaired tables from appearing on disk between 
when we get the initial set of sstables, and when we start scanning the files 
on disk for sstables to add.




> Repaired status should be cleared on new sstables when issuing nodetool 
> refresh
> ---
>
> Key: CASSANDRA-13392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13392
> Project: Cassandra
> Issue Type: Bug
> Reporter: Marcus Eriksson
> Assignee: Marcus Eriksson
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We can't assume that new sstables added when doing nodetool refresh 
> (ColumnFamilyStore#loadNewSSTables) are actually repaired if they have the 
> repairedAt flag set



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh

2017-03-30 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948837#comment-15948837
 ] 

Marcus Eriksson edited comment on CASSANDRA-13392 at 3/30/17 10:56 AM:
---

Well, if we are running nodetool refresh, we will want the data to reappear on 
replicas, right? If someone copies a bunch of sstables onto one node and runs 
repair, that data should end up on all nodes, right?

We will not be moving data from repaired to unrepaired on any 'live' sstables 
on the node where we are running refresh, only on the sstables copied into the 
data directory. Or do you have some other use case of 'nodetool refresh' where 
this is not true?


was (Author: krummas):
Well, if we are running nodetool refresh, we will want the data to reappear on 
replicas, right? If someone copies a bunch of sstables onto one node and runs 
repair, that data should end up on all nodes, right?

We will not be moving data from repaired to unrepaired on any 'live' sstables 
on the node where we are running refresh, only on the sstables copied into the 
data directory. Or do you have some other use case where this is not true?



