[ https://issues.apache.org/jira/browse/CASSANDRA-11521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232701#comment-15232701 ]
Sylvain Lebresne commented on CASSANDRA-11521:
----------------------------------------------

The first thing that I think should be answered here is how we "expose" this externally. My initial thought was more or less what I think your proof of concept is doing, that is, having a different "paging mode" where the server sends pages "as fast as possible" rather than waiting for the client to ask for them. But I'm starting to wonder if that's the best approach, because one of the questions in that case is "how do we make sure we don't overwhelm the client?".

Taking a step back, I strongly suspect that by far the majority of the gain from "streaming" in the numbers on CASSANDRA-9259 comes from not having to re-start a new query server side for each page. Other than that, the difference between clients requesting pages as fast as they can versus the server sending them as fast as it can (without waiting on the client to ask) is really just the latency of 2 client-server messages per page, which should be fairly small (and probably not even noticeable if the server can send data faster than the client can process it).

So an alternative could be to not change how current paging works in general, but simply allow users to provide a "hint" when they know that they intend to consume the whole result set no matter what (and to do so rapidly). That hint would be used by the driver and the server to optimize based on that assumption: the driver would try to ask all pages from the same replica, and the server would, at CL.ONE at least, maintain the ongoing query iterator in memory. My reasoning is that this would trade a hopefully negligible amount of latency between pages for:
# a simple solution to the problem of rate limiting for the client's sake (since the client will still control how fast things come).
# almost no change to the native protocol. We only need to pass the new "hint" flag, which would really only mean "please optimize if you can".
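(As a side note on the "flags available" point: a minimal sketch of what passing such a hint could look like at the wire level. The known flag values below are from the native protocol v4 spec; the BULK_READ_HINT bit and its value are purely hypothetical, for illustration only.)

```python
# QUERY/EXECUTE <flags> bits defined in the native protocol v4 spec.
VALUES             = 0x01
SKIP_METADATA      = 0x02
PAGE_SIZE          = 0x04
WITH_PAGING_STATE  = 0x08
SERIAL_CONSISTENCY = 0x10
DEFAULT_TIMESTAMP  = 0x20
NAMES_FOR_VALUES   = 0x40

# Hypothetical: one of the remaining bits carries the "please optimize
# if you can" hint. The actual bit would be picked in the patch.
BULK_READ_HINT     = 0x80

def encode_flags(page_size_set: bool, bulk_hint: bool) -> int:
    """Build the <flags> byte for a QUERY message (subset of options)."""
    flags = 0
    if page_size_set:
        flags |= PAGE_SIZE
    if bulk_hint:
        flags |= BULK_READ_HINT
    return flags

# A server that doesn't implement the optimization can ignore the bit,
# since it is only a hint -- hence no protocol version bump is required.
assert encode_flags(page_size_set=True, bulk_hint=True) == 0x84
```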
In particular, we could actually introduce this _without_ a bump of the native protocol, since we have flags available for query/execute messages. Given that so far we have no plan to do protocol v5 before 4.0, this would let us deliver the feature earlier, which is nice.
# very little change for the drivers: all they probably have to do is make sure they reuse the same replica for all pages when the "hint" is set by users, and that should be pretty trivial to implement.
# it makes the question of which CLs are supported moot: the "hint" flag will be just that, a hint, so users will be able to use it whenever they want. It just happens that we'll only optimize CL.ONE initially.

Overall, assuming the loss in latency (compared to having the server send pages as fast as it can) is indeed very small (which we should certainly validate), this appears to be a pretty good tradeoff to me.

But anyway, that's my initial brain dump on that first question of "how do we expose this?". There are other questions too that need to be discussed (and the sooner the better). For instance, how do we concretely handle the long-running queries that this will allow? Holding an OpOrder for too long feels problematic, to name just one problem.

> Implement streaming for bulk read requests
> ------------------------------------------
>
>                 Key: CASSANDRA-11521
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11521
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local Write-Read Paths
>            Reporter: Stefania
>            Assignee: Stefania
>             Fix For: 3.x
>
>
> Allow clients to stream data from a C* host, bypassing the coordination layer
> and eliminating the need to query individual pages one by one.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
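(A back-of-envelope sketch of the latency claim in the comment above -- that request/response paging costs one round trip, i.e. 2 client-server messages, per page. All numbers here are hypothetical stand-ins; the real figures are exactly what the comment says should be validated.)

```python
import math

def paging_overhead_seconds(total_rows: int, page_size: int,
                            round_trip_s: float) -> float:
    """Extra wall-clock time spent on per-page request/response round
    trips, versus a server that pushes pages without being asked."""
    pages = math.ceil(total_rows / page_size)
    # One request + one response (i.e. one round trip) per page.
    return pages * round_trip_s

# Hypothetical: 1M rows, 5000-row pages, 0.5 ms in-datacenter RTT.
overhead = paging_overhead_seconds(1_000_000, 5_000, 0.0005)
# -> 0.1 s over the whole result set, which would indeed be small next
# to the cost of restarting the query server side for every page.
```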