[ 
https://issues.apache.org/jira/browse/CASSANDRA-13064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Roth reassigned CASSANDRA-13064:
-----------------------------------------

    Assignee: Benjamin Roth

> Add stream type or purpose to stream plan / stream
> --------------------------------------------------
>
>                 Key: CASSANDRA-13064
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13064
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benjamin Roth
>            Assignee: Benjamin Roth
>
> It would be very useful to know the type or purpose of a given stream on the
> receiver side. It should be available in both a stream request and a stream
> task.
> Why?
> It would be helpful to distinguish the purpose to allow different handling of 
> streams and requests. Examples:
> - On a stream request, a global flush is done. This is not necessary for all
> types of streams. A repair stream (plan) does not require a flush, as this
> has been done shortly before in validation compaction, and only the sstables
> that have been validated have to be streamed.
> - In StreamReceiveTask, streams for MVs go through the regular write path.
> This is painfully slow, especially on bootstrap and decommission, and for
> both of them it is not necessary: sstables can be streamed down directly in
> these cases. Handling bootstrap is no problem, as it relies on local state,
> but during decommission the decom-state is bound to the sender and not the
> receiver, so the receiver has to know that it is safe to stream that sstable
> directly rather than through the write path. That's why we have to know the
> purpose of the stream.
> I'd love to implement this myself, but I am not sure how to avoid breaking
> backwards compatibility of the streaming protocol, or whether it is OK to
> break it.
> Furthermore, I'd love to get some feedback on the idea and some proposals
> for which stream types to distinguish. I could imagine:
> - bootstrap
> - decommission
> - repair
> - replace node
> - remove node
> - range relocation
> Code comments like the following support my idea; knowing the purpose could
> avoid the repeated flush:
> {quote}
>                 // TODO each call to transferRanges re-flushes, this is potentially a lot of waste
>                 streamPlan.transferRanges(newEndpoint, preferred, keyspaceName, ranges);
> {quote}
> An alternative to passing the purpose of the stream would be to pass flags
> like:
> - requiresFlush
> - requiresWritePathForMaterializedView
> ...
> I guess passing the purpose will make the streaming protocol more robust to
> future changes and leave decisions up to the receiver.
> But an additional "requiresFlush" would also avoid putting too much logic
> into the streaming code. The streaming code should not care about purposes;
> the caller or receiver should. So the decision whether a stream requires a
> flush before streaming should be up to the stream requester and the stream
> request receiver, depending on the purpose of the stream.
> I'm excited about your feedback :)
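
To make the two options in the description concrete, here is a minimal Java
sketch of how a stream purpose could be mapped to the two proposed flags. It
is not Cassandra code: the class StreamPurposeSketch, the enum StreamPurpose
and both helper methods are hypothetical names invented for illustration.
The only rules encoded are the ones stated in the description (repair needs
no flush; bootstrap and decommission could stream MV sstables directly).
Treating every other purpose conservatively is an assumption of this sketch.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch; none of these names exist in Cassandra.
final class StreamPurposeSketch {

    // The stream purposes proposed in the ticket description.
    enum StreamPurpose {
        BOOTSTRAP, DECOMMISSION, REPAIR, REPLACE_NODE, REMOVE_NODE, RANGE_RELOCATION
    }

    // Per the description: a repair stream was flushed shortly before,
    // during validation compaction, so another flush is redundant.
    private static final Set<StreamPurpose> NO_FLUSH_NEEDED =
            EnumSet.of(StreamPurpose.REPAIR);

    // Per the description: on bootstrap and decommission, sstables for
    // tables with materialized views could be streamed down directly.
    private static final Set<StreamPurpose> DIRECT_SSTABLE_OK =
            EnumSet.of(StreamPurpose.BOOTSTRAP, StreamPurpose.DECOMMISSION);

    // Assumption: every purpose not listed above keeps today's behaviour
    // and flushes before streaming.
    static boolean requiresFlush(StreamPurpose purpose) {
        return !NO_FLUSH_NEEDED.contains(purpose);
    }

    // Assumption: the write path is only relevant when the table has views,
    // and is kept for every purpose not explicitly listed as safe.
    static boolean requiresWritePathForMaterializedView(StreamPurpose purpose,
                                                        boolean tableHasViews) {
        return tableHasViews && !DIRECT_SSTABLE_OK.contains(purpose);
    }

    public static void main(String[] args) {
        // Print the derived flags for each purpose on a table with views.
        for (StreamPurpose purpose : StreamPurpose.values()) {
            System.out.printf("%-16s requiresFlush=%b requiresWritePathForMV=%b%n",
                    purpose,
                    requiresFlush(purpose),
                    requiresWritePathForMaterializedView(purpose, true));
        }
    }
}
```

With a mapping like this, the sender would only transmit the purpose, and the
receiver (or caller) would derive the flags itself. That matches the
description's preference for keeping such decisions out of the streaming code.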



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
