[ 
https://issues.apache.org/jira/browse/CASSANDRA-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-6436:
-----------------------------------
    Component/s: Tools

> AbstractColumnFamilyInputFormat does not use start and end tokens configured 
> via ConfigHelper.setInputRange()
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6436
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6436
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>            Reporter: Paulo Motta
>            Assignee: Paulo Motta
>              Labels: hadoop, patch
>             Fix For: 2.0.7
>
>         Attachments: cassandra-1.2-6436.txt, cassandra-1.2-6436.txt
>
>
> ConfigHelper allows setting a token input range via the setInputRange(conf, 
> startToken, endToken) call (ConfigHelper:254).
> We used this feature to limit a Hadoop job's range to a single Cassandra 
> node's range, or even to a single row key, mostly for testing purposes. 
> This worked before the fix for CASSANDRA-5536 
> (https://github.com/apache/cassandra/commit/aaf18bd08af50bbaae0954d78d5e6cbb684aded9).
> Since that change, ColumnFamilyInputFormat never uses the value of 
> KeyRange.start_token when defining the input splits 
> (AbstractColumnFamilyInputFormat:142-160); it uses only KeyRange.start_key, 
> which needs an order-preserving partitioner to work.
> I propose the attached fix to allow defining Cassandra token ranges for a 
> given Hadoop job even when using a non-order-preserving partitioner.
> Example use of ConfigHelper.setInputRange(conf, startToken, endToken) to 
> limit the range to a single Cassandra Key with RandomPartitioner: 
> IPartitioner part = ConfigHelper.getInputPartitioner(job.getConfiguration());
> Token token = part.getToken(ByteBufferUtil.bytes("Cassandra Key"));
> BigInteger endToken = (BigInteger) new BigIntegerConverter().convert(
>         BigInteger.class, part.getTokenFactory().toString(token));
> BigInteger startToken = endToken.subtract(new BigInteger("1"));
> ConfigHelper.setInputRange(job.getConfiguration(),
>         startToken.toString(), endToken.toString());
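
The arithmetic behind the example above can be sketched in plain Java, without a Cassandra classpath. This is an illustrative sketch only and is not part of the attached patch. It assumes that RandomPartitioner derives a token as the absolute value of the key's MD5 digest read as a signed big-endian integer, and that a token range is start-exclusive and end-inclusive, so (token - 1, token] covers exactly one token. The class and method names (SingleKeyTokenRange, randomPartitionerToken, singleKeyRange) are hypothetical.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class SingleKeyTokenRange {

    /**
     * Assumed approximation of RandomPartitioner token derivation:
     * abs(MD5(key)), with the digest read as a signed big-endian integer.
     */
    static BigInteger randomPartitionerToken(byte[] key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(key);
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /**
     * Returns {startToken, endToken} as decimal strings, suitable in shape
     * for the two string arguments of ConfigHelper.setInputRange.
     * The range (token - 1, token] contains only the key's own token.
     */
    static String[] singleKeyRange(String rowKey) {
        BigInteger end = randomPartitionerToken(rowKey.getBytes(StandardCharsets.UTF_8));
        BigInteger start = end.subtract(BigInteger.ONE);
        return new String[] { start.toString(), end.toString() };
    }

    public static void main(String[] args) {
        String[] range = singleKeyRange("Cassandra Key");
        System.out.println("start=" + range[0] + " end=" + range[1]);
    }
}
```

In a real job the token should still come from the configured partitioner, as in the reporter's snippet, since other partitioners (for example Murmur3Partitioner) derive tokens differently.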



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
