[jira] [Commented] (CASSANDRA-5426) Redesign repair messages
[ https://issues.apache.org/jira/browse/CASSANDRA-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688061#comment-13688061 ]

Jonathan Ellis commented on CASSANDRA-5426:
-------------------------------------------

Yuki confirms that https://github.com/yukim/cassandra/commits/5426-3 is ready for review.

> Redesign repair messages
> ------------------------
>
>                 Key: CASSANDRA-5426
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5426
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Yuki Morishita
>            Assignee: Yuki Morishita
>            Priority: Minor
>              Labels: repair
>             Fix For: 2.0
>
> Many people have been reporting a 'repair hang' when something goes wrong.
> The two major causes of a hang are 1) validation failure and 2) streaming failure.
> Currently, when those failures happen, the failed node does not respond back
> to the repair initiator.
> The goal of this ticket is to redesign the message flows around repair so that
> repair never hangs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-5426) Redesign repair messages
[ https://issues.apache.org/jira/browse/CASSANDRA-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673145#comment-13673145 ]

Yuki Morishita commented on CASSANDRA-5426:
-------------------------------------------

[~jasobrown] Actually, I think that try/catch block is redundant. Streaming does not run on the same thread as StreamingRepairTask, and exceptions should be handled in IStreamCallback's onError method (which is empty in current 1.2). I'm overhauling the streaming API for 2.0 (CASSANDRA-5286), and it should give more fine-grained control over streaming.
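A minimal sketch of the callback-based error handling described above. The interface shape and the notifyCoordinator helper below are hypothetical stand-ins, not Cassandra's actual streaming API; the point is only that an error raised on the streaming thread reaches the initiator through the callback instead of being dropped:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for IStreamCallback: onError was empty in 1.2,
// which is why streaming failures were silently swallowed.
interface StreamCallback
{
    void onSuccess();
    void onError(Throwable t);
}

public class StreamCallbackSketch
{
    static final List<String> sentMessages = new ArrayList<>();

    // Stand-in for replying to the repair coordinator (e.g. with SYNC_FAILED).
    static void notifyCoordinator(String message)
    {
        sentMessages.add(message);
    }

    // Runs a (fake) streaming task and reports the outcome through the
    // callback, so failures on the streaming thread can reach the initiator.
    static void runStreaming(boolean fail, StreamCallback callback)
    {
        try
        {
            if (fail)
                throw new RuntimeException("stream broken");
            callback.onSuccess();
        }
        catch (Throwable t)
        {
            callback.onError(t);
        }
    }

    public static void main(String[] args)
    {
        StreamCallback cb = new StreamCallback()
        {
            public void onSuccess() { notifyCoordinator("SYNC_COMPLETE"); }
            public void onError(Throwable t) { notifyCoordinator("SYNC_FAILED: " + t.getMessage()); }
        };
        runStreaming(false, cb);
        runStreaming(true, cb);
        System.out.println(sentMessages);
    }
}
```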
[jira] [Commented] (CASSANDRA-5426) Redesign repair messages
[ https://issues.apache.org/jira/browse/CASSANDRA-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673105#comment-13673105 ]

Jason Brown commented on CASSANDRA-5426:
----------------------------------------

In StreamingRepairTask.initiateStreaming(), there's this block:

{code}
try
{
    ...
    StreamOut.transferSSTables(outsession, sstables, request.ranges, OperationType.AES);
    // request ranges from the remote node
    StreamIn.requestRanges(request.dst, desc.keyspace, Collections.singleton(cfstore), request.ranges, this, OperationType.AES);
}
catch (Exception e)
...
{code}

Is there any value in putting the StreamIn.requestRanges() in a separate try block and not (immediately) failing if StreamOut has a problem? Then we could potentially make some forward progress (for the StreamIn) even if StreamOut fails. I'll note that 1.2 has the same try/catch as Yuki's new work, so it has not changed in that regard.
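One way to read the suggestion above, sketched with invented stand-in methods (transferOut/requestIn play the role of StreamOut.transferSSTables / StreamIn.requestRanges), would be giving each direction its own try block so an outbound failure doesn't abort the inbound request:

```java
import java.util.ArrayList;
import java.util.List;

public class SplitTrySketch
{
    static final List<String> log = new ArrayList<>();

    // Placeholders for StreamOut.transferSSTables / StreamIn.requestRanges.
    static void transferOut(boolean fail)
    {
        if (fail)
            throw new RuntimeException("outbound failed");
        log.add("out ok");
    }

    static void requestIn()
    {
        log.add("in requested");
    }

    // Each direction gets its own try block: an outbound failure is recorded,
    // but the inbound request still makes forward progress.
    static void initiateStreaming(boolean outFails)
    {
        try
        {
            transferOut(outFails);
        }
        catch (Exception e)
        {
            log.add("out error: " + e.getMessage());
        }
        try
        {
            requestIn();
        }
        catch (Exception e)
        {
            log.add("in error: " + e.getMessage());
        }
    }

    public static void main(String[] args)
    {
        initiateStreaming(true);
        System.out.println(log); // inbound ran despite the outbound failure
    }
}
```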
[jira] [Commented] (CASSANDRA-5426) Redesign repair messages
[ https://issues.apache.org/jira/browse/CASSANDRA-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671647#comment-13671647 ]

Yuki Morishita commented on CASSANDRA-5426:
-------------------------------------------

Pushed an update to: https://github.com/yukim/cassandra/commits/5426-3

Removed all the classes that were kept for backward compatibility.

bq. One thing I'm not sure of is that it seems that when we get an error, we log it but we don't error out the repair session itself. Maybe we should, otherwise I fear most people won't notice something went wrong.
bq. Also, when we fail, maybe we could send an error message (typically the exception message) for easier debugging/reporting.

The latest version notifies the user by throwing an exception that is filled in (RepairSession#exception) when the error occurs. Sending the exception back to the coordinator can be useful, but I'd rather take a different approach that uses the tracing CF (CASSANDRA-5483).

bq. I also wonder if maybe we should have more of a fail-fast policy when there are errors. For instance, if one node fails its validation phase, it might be worth failing right away and letting the user re-trigger a repair once he has fixed whatever was the source of the error, rather than still differencing/syncing the other nodes (but I admit that both solutions are possible).

I changed it to let the repair session fail when an error occurs, but I think it would be better to have a repair option (something like -k, --keep-going) to keep repair running and report failed sessions/jobs at the end. If you +1, I will do that in a separate ticket.

bq. Going a bit further, I think we should add 2 messages to interrupt the validation and sync phases. If only because that could be useful to users if they need to stop a repair for some reason, but also, if we get an error during validation from one node, we could use that to interrupt the other nodes and thus fail fast while minimizing the amount of work done uselessly. But anyway, I guess that part can be done in a follow-up ticket.

+1 on doing this in a separate ticket. We also need to add a way to abort streaming in order to interrupt syncing.

bq. In RepairMessageType, if gossip is any proof, then it could be wise to add more "FUTURE" types, say 4 or 5, "just in case".
bq. Do we really need RepairMessageHeader? What about making RepairMessage a RepairJobDesc, a RepairMessageType and a body, rather than creating yet another class?

For messages, I mimicked the way o.a.c.transport.messages does it.

bq. For the hashCode methods (Differencer, NodePair, RepairJobDesc, ...), I'd prefer using Guava's Objects.hashCode() (and Objects.equal() for equals() when there can be null).

Done, if I didn't miss anything.

bq. I would move the gossiper/failure registration into ARS.addToActiveSessions.

Done.

bq. I'd remove Validator.rangeToValidate and just inline desc.range.

Done.

bq. Out of curiosity, what do you mean by the TODO in the comment of Validator.add()?

That comment was from an ancient version. Removed, since it is no longer applicable.

bq. For MerkleTree.fullRange, maybe it's time to add it to the MT serializer rather than restoring it manually, which is ugly and error prone. Also, for the partitioner, let's maybe have MT use DatabaseDescriptor.getPartitioner() directly rather than restoring it manually in Differencer.run().

Yup, this is a good time to finally clean up MerkleTree serialization. Done.
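The -k/--keep-going option Yuki proposes above (hypothetical, not yet implemented per the comment) would amount to collecting per-session failures and reporting them at the end instead of aborting at the first one. A toy sketch of the two policies:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

public class KeepGoingSketch
{
    // Runs each (fake) repair session in order and returns the indexes of the
    // ones that failed. With keepGoing=false the first failure aborts the run
    // (fail-fast); with keepGoing=true all sessions run and failures are
    // reported together at the end.
    static List<Integer> runSessions(List<BooleanSupplier> sessions, boolean keepGoing)
    {
        List<Integer> failed = new ArrayList<>();
        for (int i = 0; i < sessions.size(); i++)
        {
            boolean ok = sessions.get(i).getAsBoolean();
            if (!ok)
            {
                failed.add(i);
                if (!keepGoing)
                    break; // fail-fast: stop at the first broken session
            }
        }
        return failed;
    }

    public static void main(String[] args)
    {
        List<BooleanSupplier> sessions = List.of(() -> true, () -> false, () -> true);
        System.out.println(runSessions(sessions, false)); // fail-fast stops early
        System.out.println(runSessions(sessions, true));  // keep-going runs them all
    }
}
```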
[jira] [Commented] (CASSANDRA-5426) Redesign repair messages
[ https://issues.apache.org/jira/browse/CASSANDRA-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646605#comment-13646605 ]

Sylvain Lebresne commented on CASSANDRA-5426:
---------------------------------------------

The approach looks good to me: I definitely like the idea of having a common message/header for all repair messages. Same for breaking down ARS into separate files.

One thing I'm not sure of is that it seems that when we get an error, we log it but we don't error out the repair session itself. Maybe we should, otherwise I fear most people won't notice something went wrong. Also, when we fail, maybe we could send an error message (typically the exception message) for easier debugging/reporting.

I also wonder if maybe we should have more of a fail-fast policy when there are errors. For instance, if one node fails its validation phase, it might be worth failing right away and letting the user re-trigger a repair once he has fixed whatever was the source of the error, rather than still differencing/syncing the other nodes (but I admit that both solutions are possible). Going a bit further, I think we should add 2 messages to interrupt the validation and sync phases. If only because that could be useful to users if they need to stop a repair for some reason, but also, if we get an error during validation from one node, we could use that to interrupt the other nodes and thus fail fast while minimizing the amount of work done uselessly. But anyway, I guess that part can be done in a follow-up ticket.

Other than that, a few remarks/nits on the refactor:
- In RepairMessageType, if gossip is any proof, then it could be wise to add more "FUTURE" types, say 4 or 5, "just in case". As an aside, I tend not to be a fan of relying on an enum ordinal for serialization since it's extra fragile (you should not reorder things, for instance, which could easily slip in by mistake imo). I personally prefer assigning the ordinal manually (like in transport.Message.Type, for instance) even if that's a bit more verbose. Anyway, if people like it the way it is, so be it, but I wanted to mention it nonetheless.
- For the hashCode methods (Differencer, NodePair, RepairJobDesc, ...), I'd prefer using Guava's Objects.hashCode() (and Objects.equal() for equals() when there can be null).
- Do we really need RepairMessageHeader? What about making RepairMessage a RepairJobDesc, a RepairMessageType and a body, rather than creating yet another class?
- In RepairMessage, I'm not sure it's a good idea to allow a {{null}} body, especially since RepairMessageVerbHandler doesn't really handle it. I'd rather assert it's not {{null}} and assert we always have a body serializer in the RepairMessage serializer (since it's really a programming error if we don't).
- The code to create the repair messages feels a bit verbose. What about adding a static helper in RepairMessage:
{noformat}
public static <T> MessageOut<RepairMessage<T>> createMessage(RepairJobDesc desc, RepairMessageType type, T body);
{noformat}
or maybe even one helper for each RepairMessageType?
- I would move the gossiper/failure registration into ARS.addToActiveSessions.
- I'd remove Validator.rangeToValidate and just inline desc.range.
- Out of curiosity, what do you mean by the TODO in the comment of Validator.add()? What is there to do, typically? Because MT has some notion of valid/invalid ranges, but that's historical and not used. Validator is really just an MT builder. So it feels to me that mentioning cases 2 and 4 only to later say we don't consider them will be more confusing than helpful for people looking at the code for the first time. As a side note, I think we could simplify the hell out of the MerkleTree class, but that's another story.
- For MerkleTree.fullRange, maybe it's time to add it to the MT serializer rather than restoring it manually, which is ugly and error prone. Also, for the partitioner, let's maybe have MT use DatabaseDescriptor.getPartitioner() directly rather than restoring it manually in Differencer.run().

I also noted that we can remove all the old compat stuff since we don't have backward compatibility issues with repair, but you already told me you had started doing it :).
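Sylvain's manual-ordinal suggestion might be sketched as follows. The message-type names come from the thread, but the class layout and the fromId helper are illustrative, and java.util.Objects stands in here for the Guava Objects helpers he mentions (same idea: null-safe equals and combined hashCode):

```java
import java.util.Objects;

public class ExplicitIdSketch
{
    // Hypothetical message-type enum carrying an explicit wire id per
    // constant, in the style of transport.Message.Type: reordering the
    // declarations can no longer silently change the serialized values.
    enum RepairMessageType
    {
        VALIDATION_REQUEST(0),
        VALIDATION_COMPLETE(1),
        VALIDATION_FAILED(2),
        SYNC_REQUEST(3),
        SYNC_COMPLETE(4),
        SYNC_FAILED(5);

        final int id;
        RepairMessageType(int id) { this.id = id; }

        static RepairMessageType fromId(int id)
        {
            for (RepairMessageType t : values())
                if (t.id == id)
                    return t;
            throw new IllegalArgumentException("unknown message type id: " + id);
        }
    }

    // Hypothetical NodePair-style value class: Objects.equals tolerates null
    // fields and Objects.hash combines them, as the Guava helpers would.
    static final class NodePair
    {
        final String left, right;
        NodePair(String left, String right) { this.left = left; this.right = right; }

        @Override
        public boolean equals(Object o)
        {
            if (!(o instanceof NodePair))
                return false;
            NodePair that = (NodePair) o;
            return Objects.equals(left, that.left) && Objects.equals(right, that.right);
        }

        @Override
        public int hashCode() { return Objects.hash(left, right); }
    }

    public static void main(String[] args)
    {
        System.out.println(RepairMessageType.fromId(3));
        System.out.println(new NodePair("a", null).equals(new NodePair("a", null)));
    }
}
```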
[jira] [Commented] (CASSANDRA-5426) Redesign repair messages
[ https://issues.apache.org/jira/browse/CASSANDRA-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642116#comment-13642116 ]

Yuki Morishita commented on CASSANDRA-5426:
-------------------------------------------

[~jjordan] For the streaming improvements, I'm working on CASSANDRA-5286.
[jira] [Commented] (CASSANDRA-5426) Redesign repair messages
[ https://issues.apache.org/jira/browse/CASSANDRA-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641899#comment-13641899 ]

Jeremiah Jordan commented on CASSANDRA-5426:
--------------------------------------------

Could any of this be added to other processes using streaming? Bootstrap/Decommission/Move/sstableloader?
[jira] [Commented] (CASSANDRA-5426) Redesign repair messages
[ https://issues.apache.org/jira/browse/CASSANDRA-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13631816#comment-13631816 ]

Yuki Morishita commented on CASSANDRA-5426:
-------------------------------------------

Pushed a completed version to: https://github.com/yukim/cassandra/commits/5426-2

This time, failure handling is implemented, and I added some unit tests for the new classes.
[jira] [Commented] (CASSANDRA-5426) Redesign repair messages
[ https://issues.apache.org/jira/browse/CASSANDRA-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621523#comment-13621523 ]

Yuki Morishita commented on CASSANDRA-5426:
-------------------------------------------

Work in progress is pushed to: https://github.com/yukim/cassandra/commits/5426-1

Only the normal (success) case is implemented so far.

First of all, ActiveRepairService is broken down into several classes placed into o.a.c.repair, to make my work easier.

The main design change around messages is that all repair-related messages are packed into a RepairMessage and handled in RepairMessageVerbHandler, which is executed in the ANTI_ENTROPY stage. A RepairMessage carries a RepairMessageHeader and its content (if any). The RepairMessageHeader basically indicates which repair job the message belongs to and specifies the content type. There are currently 6 content types defined in RepairMessageType: VALIDATION_REQUEST, VALIDATION_COMPLETE, VALIDATION_FAILED, SYNC_REQUEST, SYNC_COMPLETE, and SYNC_FAILED.

*VALIDATION_REQUEST*
VALIDATION_REQUEST is sent from the repair initiator (coordinator) to request a Merkle tree.

*VALIDATION_COMPLETE* / *VALIDATION_FAILED*
The calculated Merkle tree is sent back using a VALIDATION_COMPLETE message. A VALIDATION_FAILED message is used when something goes wrong on the remote node.

*SYNC_REQUEST*
SYNC_REQUEST is sent when we have to repair two remote nodes. This corresponds to the forwarded StreamingRepairTask we have today.

*SYNC_COMPLETE* / *SYNC_FAILED*
When there is no need to exchange data, or the exchange completed via streaming, the node (this includes the node that received SYNC_REQUEST) sends back SYNC_COMPLETE. If streaming data fails, it sends back SYNC_FAILED.

The whole repair process depends on asynchronous message exchange using MessagingService, so there is still a chance of a hang when a node fails to deliver a message (see CASSANDRA-5393).

Any feedback is appreciated.
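A toy sketch of the single-dispatch design described above: RepairMessageVerbHandler and the six message types are named in the comment, but the handler body here is an invented simplification that just maps each type to the action it triggers, illustrating how every repair message, including the failure replies, gets a guaranteed handling path:

```java
public class VerbHandlerSketch
{
    enum RepairMessageType
    {
        VALIDATION_REQUEST, VALIDATION_COMPLETE, VALIDATION_FAILED,
        SYNC_REQUEST, SYNC_COMPLETE, SYNC_FAILED
    }

    // Invented simplification of a RepairMessageVerbHandler: all repair
    // messages funnel through one switch, so VALIDATION_FAILED/SYNC_FAILED
    // are handled rather than dropped, which is what prevents the hang.
    static String handle(RepairMessageType type)
    {
        switch (type)
        {
            case VALIDATION_REQUEST:
                return "build Merkle tree, reply VALIDATION_COMPLETE or VALIDATION_FAILED";
            case VALIDATION_COMPLETE:
                return "register tree with repair session";
            case VALIDATION_FAILED:
                return "fail the repair job on the coordinator";
            case SYNC_REQUEST:
                return "start streaming between the two named nodes";
            case SYNC_COMPLETE:
                return "mark node pair in sync";
            case SYNC_FAILED:
                return "fail the sync task on the coordinator";
            default:
                throw new AssertionError("unhandled type: " + type);
        }
    }

    public static void main(String[] args)
    {
        for (RepairMessageType t : RepairMessageType.values())
            System.out.println(t + " -> " + handle(t));
    }
}
```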