[ https://issues.apache.org/jira/browse/CASSANDRA-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709111#comment-13709111 ]
Jonathan Ellis commented on CASSANDRA-4983:
-------------------------------------------

I'm not really a fan of making CFRR (and CqlPRR?) more complex to make a corner case slightly better. Remember, we'll have exactly one wrapping range per Task out of the 100s of splits.

On the bright side, the "real" CqlInputFormat (using server-side paging) will make this a non-issue in 2.0.

> Improve range wrap-around in CFIF: CFIF shouldn't produce input splits of very tiny size
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4983
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4983
>             Project: Cassandra
>          Issue Type: Improvement
>    Affects Versions: 1.1.6
>            Reporter: Piotr Kołaczkowski
>            Assignee: Piotr Kołaczkowski
>            Priority: Minor
>             Fix For: 1.2.7
>
>         Attachments: 0001-CASSANDRA-4983-CFRR-able-to-iterate-over-more-than-o.patch
>
>
> Currently CFIF splits the wrap-around split into two non-wrap-around splits.
> While it simplifies CFRR implementation, this approach has several minor
> downsides:
> * One of the splits can be extremely small. One of our (picky) customers
> suspected there must be a bug, because one of his map tasks executed in 1
> second, while all the rest executed in minutes. Also having a very small task
> is wasting resources - more resources go to launching the task than doing any
> real work.
> * The number of map tasks is always one more than the number of (expected
> rows / cassandra.input.split.size). The number of map tasks is always >= 2.
> This is confusing customers.
> * Progress reporting for the divided split parts is inaccurate - even if the
> splits are similar in size, the progress bar goes to about 50% and then
> immediately to 100%, because it is impossible to estimate their size properly
> (the size estimation is done before removing wrap-around).

--
This message is automatically generated by JIRA.
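
For readers unfamiliar with the wrap-around behaviour described in the quoted issue, here is a minimal sketch of how one wrapping token range can be cut into two non-wrapping pieces, and why one piece may end up tiny. It is only an illustration under stated assumptions: tokens are modelled as plain integers on a ring, the ring bounds are arbitrary, and `unwrap` and `width` are hypothetical names, not functions from Cassandra's CFIF or CFRR.

```python
# Simplified model of a token ring. All names and bounds are assumptions
# made for illustration; this is not Cassandra's actual code.
MIN_TOKEN = 0
MAX_TOKEN = 2**127  # assumed ring size


def unwrap(start, end):
    """Split a (start, end] token range into non-wrapping pieces.

    A range is treated as wrapping when start >= end: it runs from start
    to the ring maximum, then continues from the ring minimum to end.
    """
    if start < end:
        return [(start, end)]  # already non-wrapping, nothing to do
    pieces = []
    if start < MAX_TOKEN:
        pieces.append((start, MAX_TOKEN))
    if MIN_TOKEN < end:
        pieces.append((MIN_TOKEN, end))
    return pieces


def width(piece):
    """Number of tokens covered by a non-wrapping piece."""
    return piece[1] - piece[0]


# A wrapping range whose first part covers only 10 tokens.
pieces = unwrap(MAX_TOKEN - 10, 1000)
widths = [width(p) for p in pieces]
```

In this sketch the single wrapping range becomes two pieces of very different widths, which corresponds to the "one extra, possibly very small map task" effect the issue describes.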