[ https://issues.apache.org/jira/browse/CASSANDRA-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023166#comment-13023166 ]
Sylvain Lebresne commented on CASSANDRA-2540:
---------------------------------------------

I agree that the "digest reads but not your data read" problem is not too nice (let's still add for the record that you will potentially fail QUORUM reads only during the time it takes for a node to be marked down by the failure detector). For fixing that, I think 3) is a reasonable option.

bq. The outcome of data-reads-by-default should be significantly improved latency

I'm not so sure of that. Digest mismatches are supposed to be the exception, not the norm, and reads with no mismatch are supposed to be at least an order of magnitude more frequent than the ones with a mismatch. Obviously the exact details depend on the use case, and it would be easy enough to start recording such a metric (to verify assumptions, but perhaps also to help tweak the ratio between data reads and digest reads).

For the use cases where we do have very few mismatches (and those definitely exist; I would even guess this is the majority of use cases), using data reads all over the place may actually result in worse (average) latency: more things to transfer means more time to do it, and even though the requests are in parallel, it's still more chances for increased latency. And let's not forget that people having fat columns may see significant drawbacks to removing digest reads entirely.

Anyway, what I'm saying is that I'm very much opposed to removing digest reads altogether, at least for now. I am, however, in favor of making the ratio of digest/data reads configurable (at least in the code; how we expose it to users is another question), of adding a metric to count the ratio of mismatches to reads, and of at least exposing option 3) to the user.

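To make the suggested mismatch/reads metric a bit more concrete, here is a minimal sketch of such a counter. It is only an illustration in Python, not Cassandra code: the class `ReadMismatchStats` and its methods are hypothetical names, and where a real coordinator would hook the calls in is left open.

```python
import threading


class ReadMismatchStats:
    """Hypothetical counter for digest mismatches per read (not a Cassandra API)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.reads = 0        # total reads observed
        self.mismatches = 0   # reads whose digests did not match the data

    def record_read(self, mismatch: bool) -> None:
        """Record one completed read and whether it hit a digest mismatch."""
        with self._lock:
            self.reads += 1
            if mismatch:
                self.mismatches += 1

    def mismatch_ratio(self) -> float:
        """Fraction of reads that hit a mismatch; 0.0 before any read is recorded."""
        with self._lock:
            return self.mismatches / self.reads if self.reads else 0.0


if __name__ == "__main__":
    stats = ReadMismatchStats()
    for mismatch in (False, False, False, True):
        stats.record_read(mismatch)
    print(stats.mismatch_ratio())
```

A ratio that stays low would support keeping digest reads; a consistently high one would argue for shifting the digest/data ratio towards data reads.
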
> Data reads by default
> ---------------------
>
>                 Key: CASSANDRA-2540
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2540
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Stu Hood
>             Fix For: 0.8.0
>
>
> The intention of digest vs data reads is to save bandwidth in the read path
> at the cost of latency, but I expect that this has been a premature
> optimization.
> * Data requested by a read will often be within an order of magnitude of the
> digest size, and a failed digest means extra roundtrips, more bandwidth
> * The [digest reads but not your data
> read|https://issues.apache.org/jira/browse/CASSANDRA-2282?focusedCommentId=13004656&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13004656]
> problem means failing QUORUM reads because a single node is unavailable, and
> would require eagerly re-requesting at some fraction of your timeout
> * Saving bandwidth in cross datacenter use cases comes at huge cost to
> latency, but since both constraints change proportionally (enough), the
> tradeoff is not clear
> Some options:
> # Add an option to use digest reads
> # Remove digest reads entirely (and/or punt and make them a runtime
> optimization based on data size in the future)
> # Continue to use digest reads, but send them to {{N - R}} nodes for
> (somewhat) more predictable behavior with QUORUM
> \\
> The outcome of data-reads-by-default should be significantly improved
> latency, with a moderate increase in bandwidth usage for large reads.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira