[ https://issues.apache.org/jira/browse/CASSANDRA-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Francesco Piccinno updated CASSANDRA-6439: ------------------------------------------ Description: I am trying to achieve a linear scan on the data contained in a cassandra node by exploiting tokens. The idea behind my approach is to request through the pycassa SystemManager a list of tokens that the cluster is responsible of, and then for each token, issue a {{key_range}} command specifying {{start}} and {{end}} interval. The problem is that apparently some tokens returned by the server does not respect the property {{start}} < {{end}}. I think that the bug is due to the fact that {{Murmur3RandomPartitioner}} is used, but I am just guessing here. Anyway here are the steps to reproduce the bug: {code:title=Triggering the bug} $ pycassaShell -k KEYSPACE In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), SYSTEM_MANAGER.describe_ring('Crawler')) In [2]: len(filter(lambda x: x[0] < x[1], tokens)) Out[2]: 255 In [3]: len(filter(lambda x: x[0] > x[1], tokens)) Out[3]: 1 In [4]: filter(lambda x: x[0] > x[1], tokens) Out[4]: [(9207458196362321348, -9182599474778206823)] In [5]: for i in CF.get_range(start_token="9207458196362321348", finish_token="-9182599474778206823"): print i # ... # after some objects are printed # ... InvalidRequestException: InvalidRequestException(why="Start key's token sorts after end token") {code} was: I am trying to achieve a linear scan on the data contained in a cassandra node by exploiting tokens. The idea behind my approach is to request through the pycassa SystemManager a list of tokens that the cluster is responsible of, and then for each token, issue a {{key_range}} command specifying {{start}} and {{end}} interval. The problem is that apparently some tokens returned by the server does not respect the property {{start}} < {{end}}. I think that the bug is due to the fact that {{Murmur3RandomPartitioner}} is used, but I am just guessing here. Anyway here are the steps to reproduce the bug: {code:title=Triggering the bug} $ pycassaShell -k Crawler In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), SYSTEM_MANAGER.describe_ring('Crawler')) In [2]: len(filter(lambda x: x[0] < x[1], tokens)) Out[2]: 255 In [3]: len(filter(lambda x: x[0] > x[1], tokens)) Out[3]: 1 In [4]: filter(lambda x: x[0] > x[1], tokens) Out[4]: [(9207458196362321348, -9182599474778206823)] In [5]: for i in CF.get_range(start_token="9207458196362321348", finish_token="-9182599474778206823"): print i # ... # after some objects are printed # ... InvalidRequestException: InvalidRequestException(why="Start key's token sorts after end token") {code} > Token ranges are erroneously swapped > ------------------------------------ > > Key: CASSANDRA-6439 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6439 > Project: Cassandra > Issue Type: Bug > Components: Core > Reporter: Francesco Piccinno > Priority: Critical > Fix For: 2.0.0 > > > I am trying to achieve a linear scan on the data contained in a cassandra > node by exploiting tokens. The idea behind my approach is to request through > the pycassa SystemManager a list of tokens that the cluster is responsible > of, and then for each token, issue a {{key_range}} command specifying > {{start}} and {{end}} interval. The problem is that apparently some tokens > returned by the server does not respect the property {{start}} < {{end}}. I > think that the bug is due to the fact that {{Murmur3RandomPartitioner}} is > used, but I am just guessing here. > Anyway here are the steps to reproduce the bug: > {code:title=Triggering the bug} > $ pycassaShell -k KEYSPACE > In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), > SYSTEM_MANAGER.describe_ring('Crawler')) > In [2]: len(filter(lambda x: x[0] < x[1], tokens)) > Out[2]: 255 > In [3]: len(filter(lambda x: x[0] > x[1], tokens)) > Out[3]: 1 > In [4]: filter(lambda x: x[0] > x[1], tokens) > Out[4]: [(9207458196362321348, -9182599474778206823)] > In [5]: for i in CF.get_range(start_token="9207458196362321348", > finish_token="-9182599474778206823"): print i > # ... > # after some objects are printed > # ... > InvalidRequestException: InvalidRequestException(why="Start key's token sorts > after end token") > {code} -- This message was sent by Atlassian JIRA (v6.1#6144)