[ https://issues.apache.org/jira/browse/CASSANDRA-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15140055#comment-15140055 ]
Aleksey Yeschenko commented on CASSANDRA-5902:
----------------------------------------------

Mostly what I had in mind. Pushed some commits on top [here|https://github.com/iamaleksey/cassandra/commits/5902-3.0]:

1. Factor out {{concat(getNaturalEndpoints(keyspaceName, token), tokenMetadata.pendingEndpointsFor(token, keyspaceName))}} into its own method instead of duplicating it further.
2. Move the local/remote/owner logic to the border of the system ({{HintVerbHandler::doVerb}}), where the rest of this logic lives, leaving {{Hint}} unaware and local-only.
3. Revert the {{HintsDispatcher}} changes, as the conversion doesn't really fit the surrounding code, and nulls for callback/action feel a bit messy to me. Add a very trivial {{convert}} method to {{HintsDispatchExecutor}} instead.

(2) means that the tests that rely on {{Hint::apply()}} need to be rewritten (using MS sinks); I haven't handled that yet.

> Dealing with hints after a topology change
> ------------------------------------------
>
>                 Key: CASSANDRA-5902
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5902
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Coordination
>            Reporter: Jonathan Ellis
>            Assignee: Branimir Lambov
>            Priority: Minor
>             Fix For: 3.0.x, 3.x
>
>
> Hints are stored and delivered by destination node id. This allows them to
> survive IP changes in the target, while making "scan all the hints for a
> given destination" an efficient operation. However, we do not detect and
> handle a new node assuming responsibility for the hinted row via bootstrap
> before the hint can be delivered.
> I think we have to take a performance hit in this case: we need to deliver
> such a hint to *all* replicas, since we don't know which is the "new" one.
> This happens infrequently enough, however (it requires first the target node
> to be down to create the hint, then the hint owner to be down long enough for
> the target to both recover and stream to a new node), that this should be
> okay.
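Point 1 above can be illustrated with a small runnable sketch. The endpoint lookups are hypothetical stand-ins (hard-coded here, not Cassandra's actual {{StorageService}}/{{TokenMetadata}} API); the point is only the shape of the factored-out helper that concatenates natural and pending endpoints in one place:

```java
import java.util.ArrayList;
import java.util.List;

public class Main {
    // Stand-in for StorageService.getNaturalEndpoints: hard-coded for the sketch.
    static List<String> getNaturalEndpoints(String keyspaceName, long token) {
        return List.of("10.0.0.1", "10.0.0.2"); // current replicas for the token
    }

    // Stand-in for TokenMetadata.pendingEndpointsFor: hard-coded for the sketch.
    static List<String> pendingEndpointsFor(long token, String keyspaceName) {
        return List.of("10.0.0.3"); // node bootstrapping into part of the range
    }

    // The factored-out helper: natural plus pending endpoints, so callers
    // no longer duplicate the concat expression at every call site.
    static List<String> getNaturalAndPendingEndpoints(String keyspaceName, long token) {
        List<String> all = new ArrayList<>(getNaturalEndpoints(keyspaceName, token));
        all.addAll(pendingEndpointsFor(token, keyspaceName));
        return all;
    }

    public static void main(String[] args) {
        System.out.println(getNaturalAndPendingEndpoints("ks1", 42L));
        // → [10.0.0.1, 10.0.0.2, 10.0.0.3]
    }
}
```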
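The fallback described in the issue can also be sketched. This is a hypothetical simplification (plain strings for host ids, a {{deliveryTargets}} helper that does not exist in Cassandra): a hint is keyed by its target's host id, but if that target is no longer among the replicas for the hinted row, the only safe option is to fan the hint out to all current replicas:

```java
import java.util.List;

public class Main {
    // If the hint's original target is still a replica, deliver to it alone;
    // otherwise a new node bootstrapped the range, and since we cannot tell
    // which replica is the "new" one, deliver to all of them.
    static List<String> deliveryTargets(String hintTarget, List<String> replicas) {
        if (replicas.contains(hintTarget))
            return List.of(hintTarget); // normal case: single known target
        return replicas;                // topology changed: fan out to everyone
    }

    public static void main(String[] args) {
        List<String> replicas = List.of("node-a", "node-b", "node-c");
        System.out.println(deliveryTargets("node-a", replicas)); // still a replica
        System.out.println(deliveryTargets("node-x", replicas)); // replaced by bootstrap
    }
}
```

The cost of the fan-out is bounded by how rarely this occurs, as the description notes: it needs the target down (to create the hint) and the hint owner down long enough for the target to recover and stream to a new node.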
-- This message was sent by Atlassian JIRA (v6.3.4#6332)