[ https://issues.apache.org/jira/browse/CASSANDRA-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099124#comment-13099124 ]
Mck SembWever edited comment on CASSANDRA-3137 at 9/27/11 9:19 AM:
-------------------------------------------------------------------

Indeed. I could be using this ASAP. The use case is...

We're using a ByteOrderedPartitioner because we run incremental Hadoop jobs over one of our column families where "events" initially come in. This CF has RF=1 and time-based UUID keys that are manipulated so that their byte ordering is time-ordered (the timestamp is put up front). Each column has a TTL of 3 months. After 3 months of data we saw all data on one node. Now I understand why: the token range is the timestamp range, which runs from 1970 to 2270, so of course our 3-month period fell on one node (with a 3-node cluster even 100 years would fall on one node).

To properly manage this CF we need to either continuously move nodes around, a cumbersome operation, or change the key so it's prefixed with {{timestamp % 3months}}. This would allow 3 months of data to cycle over the whole cluster and wrap around again. Obviously we're leaning towards the latter solution as it simplifies operations. But it does require this patch.

(When CFIF supports IndexClause everything changes: we change our cluster to RandomPartitioner, use secondary indexes, and never look back...)

was (Author: michaelsembwever):

Indeed. I could be using this ASAP. The use case is...

We're using a ByteOrderedPartitioner because we run incremental Hadoop jobs over one of our column families where "events" initially come in. This CF has RF=1 and time-based UUID keys that are manipulated so that their byte ordering is time-ordered (the byte-unsigned timestamp is put up front). Each column has a TTL of 3 months. After 3 months of data we saw all data on one node. Now I understand why: the token range is the timestamp range, which runs from 1970 to 2270, so of course our 3-month period fell on one node (with a 3-node cluster even 100 years would fall on one node).

To properly manage this CF we need to either continuously move nodes around, a cumbersome operation, or change the key so it's prefixed with {{timestamp % 3months}}. This would allow 3 months of data to cycle over the whole cluster and wrap around again. Obviously we're leaning towards the latter solution as it simplifies operations. But it does require this patch.

(When CFIF supports IndexClause everything changes: we change our cluster to RandomPartitioner, use secondary indexes, and never look back...)

> Implement wrapping intersections for ConfigHelper's InputKeyRange
> -----------------------------------------------------------------
>
> Key: CASSANDRA-3137
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3137
> Project: Cassandra
> Issue Type: Improvement
> Components: Hadoop
> Affects Versions: 0.8.5
> Reporter: Mck SembWever
> Assignee: Mck SembWever
> Priority: Minor
> Fix For: 0.8.7
>
> Attachments: CASSANDRA-3137.patch, CASSANDRA-3137.patch
>
>
> Before, there was no support for multiple intersections between the split's
> range and the job's configured range.
> After CASSANDRA-3108 it is now possible.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
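
Editor's illustration of the comment above: a minimal Python sketch of why a {{timestamp % 3months}} key prefix leads to wrapping key ranges, and why a wrapping job range can intersect a split's range more than once. This is not the code from the attached patch and does not use Cassandra's ConfigHelper API. The names `PERIOD`, `cyclic_prefix`, `unwrap` and `intersections` are hypothetical, "3 months" is assumed to be 90 days in seconds, and the ring is simplified to integer positions with half-open ranges `[start, end)` in which `start >= end` means the range wraps.

```python
# Assumption: "3 months" approximated as 90 days, expressed in seconds.
PERIOD = 90 * 24 * 3600


def cyclic_prefix(timestamp_s, period=PERIOD):
    """Hypothetical key prefix: position of a timestamp within the cycle."""
    return timestamp_s % period


def unwrap(rng, ring):
    """Split a possibly wrapping range [start, end) into non-wrapping pieces."""
    start, end = rng
    if start < end:
        return [(start, end)]
    # start >= end: the range runs to the end of the ring and continues from 0.
    pieces = []
    if start < ring:
        pieces.append((start, ring))
    if end > 0:
        pieces.append((0, end))
    return pieces


def intersections(a, b, ring):
    """Return every intersection of two ranges on a ring of the given size."""
    found = []
    for a_lo, a_hi in unwrap(a, ring):
        for b_lo, b_hi in unwrap(b, ring):
            lo, hi = max(a_lo, b_lo), min(a_hi, b_hi)
            if lo < hi:
                found.append((lo, hi))
    found.sort()
    # Rejoin two pieces that are separate only because they were cut at the
    # ring boundary; together they form one wrapping range.
    if len(found) > 1 and found[0][0] == 0 and found[-1][1] == ring:
        first = found.pop(0)
        last = found.pop()
        found.append((last[0], first[1]))
    return found


# A job window that crosses a cycle boundary maps to a wrapping key range.
job_range = (cyclic_prefix(PERIOD - 3600), cyclic_prefix(PERIOD + 3600))
assert job_range[0] > job_range[1]

# On a toy ring of size 100, a wrapping job range meets one split twice.
print(intersections((80, 20), (10, 90), 100))  # [(10, 20), (80, 90)]
```

The toy ring of size 100 is only there to keep the numbers readable; the same functions apply with `ring=PERIOD`. Splitting each range at the ring boundary before comparing keeps the interval logic simple, at the cost of a final step that rejoins pieces cut at the boundary.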