[ https://issues.apache.org/jira/browse/CASSANDRA-13064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benjamin Roth reassigned CASSANDRA-13064:
-----------------------------------------

    Assignee: Benjamin Roth

> Add stream type or purpose to stream plan / stream
> --------------------------------------------------
>
>                 Key: CASSANDRA-13064
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13064
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benjamin Roth
>            Assignee: Benjamin Roth
>
> It would be very useful to know the type or purpose of a given stream on the
> receiver side. It should be available in both a stream request and a stream
> task.
> Why?
> Knowing the purpose would allow streams and requests to be handled
> differently. Examples:
> - In a stream request, a global flush is done. This is not necessary for all
> types of streams. A repair stream (plan) does not require a flush, because
> one has been done shortly before in validation compaction, and only the
> sstables that have been validated have to be streamed.
> - In StreamReceiveTask, streams for MVs go through the regular write path.
> This is painfully slow, especially on bootstrap and decommission. For both
> bootstrap and decommission this is not necessary; sstables can be streamed
> down directly in this case. Handling bootstrap is no problem as it relies on
> local state, but during decommission the decom state is bound to the sender
> and not the receiver, so the receiver has to know that it is safe to stream
> that sstable directly, not through the write path. That's why we have to know
> the purpose of the stream.
> I'd love to implement this on my own, but I am not sure how to avoid breaking
> the streaming protocol for backwards compatibility, or whether it is OK to do
> so.
> Furthermore, I'd love to get some feedback on the idea and some proposals for
> which stream types to distinguish. I could imagine:
> - bootstrap
> - decommission
> - repair
> - replace node
> - remove node
> - range relocation
> Comments like this support my idea; knowing the purpose could avoid this.
> {quote}
> // TODO each call to transferRanges re-flushes, this is potentially a lot of waste
> streamPlan.transferRanges(newEndpoint, preferred, keyspaceName, ranges);
> {quote}
> An alternative to passing the purpose of the stream would be to pass flags
> like:
> - requiresFlush
> - requiresWritePathForMaterializedView
> ...
> I guess passing the purpose will make the streaming protocol more robust to
> future changes and leave decisions up to the receiver.
> But an additional "requiresFlush" would also avoid putting too much logic
> into the streaming code. The streaming code should not care about purposes;
> the caller or receiver should. So the decision whether a stream requires a
> flush before streaming should be up to the stream requester and the stream
> request receiver, depending on the purpose of the stream.
> I'm excited about your feedback :)
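
To make the two options above easier to compare, here is a rough sketch of how a purpose could map to the two flags mentioned in the description. Everything in it is hypothetical: `StreamPurpose`, its constants, and both method names are illustrative and are not part of Cassandra's existing streaming code. The only mappings taken from the description are that repair does not need a flush and that bootstrap and decommission do not need the MV write path; defaulting every other purpose to the current, more conservative behaviour is an assumption.

```java
/** Hypothetical sketch only; not an existing Cassandra type. */
enum StreamPurpose {
    BOOTSTRAP,
    DECOMMISSION,
    REPAIR,
    REPLACE_NODE,
    REMOVE_NODE,
    RANGE_RELOCATION;

    /**
     * Repair has already flushed during validation compaction, so a second
     * flush is redundant. Assumption: all other purposes keep flushing.
     */
    boolean requiresFlush() {
        return this != REPAIR;
    }

    /**
     * Bootstrap and decommission can take sstables directly. Assumption: all
     * other purposes keep sending MV data through the regular write path.
     */
    boolean requiresWritePathForMaterializedView() {
        return this != BOOTSTRAP && this != DECOMMISSION;
    }
}

/** Prints the derived flags for each purpose. */
class StreamPurposeDemo {
    public static void main(String[] args) {
        for (StreamPurpose purpose : StreamPurpose.values()) {
            System.out.println(purpose
                    + " flush=" + purpose.requiresFlush()
                    + " mvWritePath=" + purpose.requiresWritePathForMaterializedView());
        }
    }
}
```

With this shape, the sender would only transmit the purpose, and the requester or receiver would derive the flags from it, which keeps purpose-specific decisions out of the streaming code itself. How the purpose would be encoded without breaking older protocol versions is left open here, as in the description.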