[ https://issues.apache.org/jira/browse/CASSANDRA-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sylvain Lebresne updated CASSANDRA-1600:
----------------------------------------
    Attachment: 0004-Update-cql-to-not-use-deprecated-index-scan-v3.patch
                0003-Allow-get_range_slices-to-apply-filter-to-a-sequenti-v3.patch
                0002-thrift-generated-code-changes-v3.patch
                0001-Add-optional-FilterClause-to-KeyRange-v3.patch

Attaching a rebased version (the so-called v3 patchset). For the most part this is a rebase of the preceding patches onto trunk. However, I've also slightly modified the AbstractScanIterator idea of the preceding patches, splitting it into two classes in the new patch: ExtendedFilter and AbstractScanIterator. It felt like a better separation of concerns for the current trunk.

The last patch is new and just changes CQL to use getRangeSlice with a filter rather than the "old" index scan. That last part is not well tested, apart from a stress test with '-o INDEXED_RANGE_SLICE' (and -L to enable CQL, obviously).

All the unit tests that are not already broken in trunk pass with this patch.
For the system tests (test/system/test_thrift_server.py), I got a weird thrift error:
{noformat}
======================================================================
ERROR: Test that column ttled expires from KEYS index
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/nose/case.py", line 187, in runTest
    self.test(*self.arg)
  File "/home/mcmanus/Git/cassandra/test/system/test_thrift_server.py", line 1911, in test_index_scan_expiring
    result = get_range_slice(client, cp, sp, '', '', ConsistencyLevel.ONE, clause)
  File "/home/mcmanus/Git/cassandra/test/system/test_thrift_server.py", line 218, in get_range_slice
    return client.get_range_slices(parent, predicate, kr, cl)
  File "/home/mcmanus/Git/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py", line 669, in get_range_slices
    self.send_get_range_slices(column_parent, predicate, range, consistency_level)
  File "/home/mcmanus/Git/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py", line 679, in send_get_range_slices
    args.write(self._oprot)
  File "/home/mcmanus/Git/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py", line 3619, in write
    oprot.writeI32(self.consistency_level)
  File "/usr/lib/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 110, in writeI32
    buff = pack("!i", i32)
AttributeError: FilterClause instance has no attribute '__trunc__'
{noformat}
After that one, thrift is in a bad state and the next test throws another completely wacko thrift exception during setup:
{noformat}
======================================================================
ERROR: system.test_thrift_server.TestMutations.test_index_scan_uuid_names
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/nose/case.py", line 371, in setUp
    try_run(self.inst, ('setup', 'setUp'))
  File "/usr/lib/pymodules/python2.7/nose/util.py", line 478, in try_run
    return func()
  File "/home/mcmanus/Git/cassandra/test/system/__init__.py", line 113, in setUp
    self.define_schema()
  File "/home/mcmanus/Git/cassandra/test/system/__init__.py", line 180, in define_schema
    self.client.system_add_keyspace(ks)
  File "/home/mcmanus/Git/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py", line 1373, in system_add_keyspace
    return self.recv_system_add_keyspace()
  File "/home/mcmanus/Git/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py", line 1389, in recv_system_add_keyspace
    raise x
TApplicationException: Required field 'consistency_level' was not present! Struct: get_range_slices_args(column_parent:ColumnParent(column_family:Indexed1), predicate:SlicePredicate(slice_range:SliceRange(start:80 01 00 01 00 00 00 10 67 65 74 5F 72 61 6E 67 65 5F 73 6C 69 63 65 73 00 00 00 00 0C 00 01 0B 00 03 00 00 00 08 49 6E 64 65 78 65 64 31 00 0C 00 02 0C 00 02 0B 00 01 00 00 00 00, finish:80 01 00 01 00 00 00 10 67 65 74 5F 72 61 6E 67 65 5F 73 6C 69 63 65 73 00 00 00 00 0C 00 01 0B 00 03 00 00 00 08 49 6E 64 65 78 65 64 31 00 0C 00 02 0C 00 02 0B 00 01 00 00 00 00 0B 00 02 00 00 00 00, reversed:false, count:100)), range:KeyRange(start_key:80 01 00 01 00 00 00 10 67 65 74 5F 72 61 6E 67 65 5F 73 6C 69 63 65 73 00 00 00 00 0C 00 01 0B 00 03 00 00 00 08 49 6E 64 65 78 65 64 31 00 0C 00 02 0C 00 02 0B 00 01 00 00 00 00 0B 00 02 00 00 00 00 02 00 03 00 08 00 04 00 00 00 64 00 00 0C 00 03 0B 00 01 00 00 00 00, end_key:80 01 00 01 00 00 00 10 67 65 74 5F 72 61 6E 67 65 5F 73 6C 69 63 65 73 00 00 00 00 0C 00 01 0B 00 03 00 00 00 08 49 6E 64 65 78 65 64 31 00 0C 00 02 0C 00 02 0B 00 01 00 00 00 00 0B 00 02 00 00 00 00 02 00 03 00 08 00 04 00 00 00 64 00 00 0C 00 03 0B 00 01 00 00 00 00 0B 00 02 00 00 00 00, count:1), consistency_level:null)
{noformat}
after which all remaining tests are just skipped because the process hasn't been cleanly shut down. I'm at a loss as to what is causing this.
I double-checked my thrift compiler version (even recompiling it from scratch, with no more success). I could use someone checking whether they get the same error, or just looking to see whether they spot something wrong in this test (I don't).

It should be noted that this lifts the limitation that an index expression must have at least one EQ clause on an indexed column, but in that case we do a sequential scan. In other words, if the expressions capture very few rows, this will be slooooooooow. It is worth asking whether allowing such potentially inefficient queries will add more confusion than usefulness.

> Merge get_indexed_slices with get_range_slices
> ----------------------------------------------
>
>                 Key: CASSANDRA-1600
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1600
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API
>            Reporter: Stu Hood
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1
>
>         Attachments: 0001-Add-optional-FilterClause-to-KeyRange-and-support-do-v2.patch,
> 0001-Add-optional-FilterClause-to-KeyRange-and-support-doin.txt,
> 0001-Add-optional-FilterClause-to-KeyRange-v3.patch,
> 0002-allow-get_range_slices-to-apply-filter-to-a-sequenti-v2.patch,
> 0002-allow-get_range_slices-to-apply-filter-to-a-sequential.txt,
> 0002-thrift-generated-code-changes-v3.patch,
> 0003-Allow-get_range_slices-to-apply-filter-to-a-sequenti-v3.patch,
> 0004-Update-cql-to-not-use-deprecated-index-scan-v3.patch
>
>
> From a comment on 1157:
> {quote}
> IndexClause only has a start key for get_indexed_slices, but it would seem
> that the reasoning behind using 'KeyRange' for get_range_slices applies there
> as well, since if you know the range you care about in the primary index, you
> don't want to continue scanning until you exhaust 'count' (or the cluster).
> Since it would appear that get_indexed_slices would benefit from a KeyRange,
> why not smash get_(range|indexed)_slices together, and make IndexClause an
> optional field on KeyRange?
> {quote}
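
To illustrate the sequential-scan concern raised in the comment above (a filter with no EQ clause on an indexed column), here is a minimal in-memory sketch. It is not Cassandra code: the function names, the dict-based "rows", the "birthdate" column and the hand-built index are all assumptions made up for illustration. It only shows why a filter that matches very few rows can still force every row in the key range to be examined, whereas an EQ lookup on an index touches only the matching entries.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, int]


def sequential_scan(rows: Dict[str, Row], start_key: str, end_key: str,
                    predicate: Callable[[Row], bool],
                    count: int) -> Tuple[List[str], int]:
    """Hypothetical stand-in for a range slice with a filter and no usable index.

    Returns up to `count` matching keys in [start_key, end_key] plus the number
    of rows examined. An empty end_key means "scan to the end".
    """
    matches: List[str] = []
    examined = 0
    for key in sorted(rows):
        if key < start_key or (end_key and key > end_key):
            continue
        examined += 1  # every row in the range is read and tested
        if predicate(rows[key]):
            matches.append(key)
            if len(matches) >= count:
                break
    return matches, examined


def indexed_eq_lookup(index: Dict[int, List[str]], value: int,
                      count: int) -> Tuple[List[str], int]:
    """Hypothetical stand-in for an EQ lookup on an indexed column."""
    keys = index.get(value, [])[:count]
    return keys, len(keys)  # only the matching index entries are touched


# Made-up data: 10,000 rows with one integer column.
rows = {f"key{i:05d}": {"birthdate": i} for i in range(10_000)}
birthdate_index: Dict[int, List[str]] = {}
for key, row in rows.items():
    birthdate_index.setdefault(row["birthdate"], []).append(key)

# A non-EQ filter that matches only two rows still examines the whole range.
scan_matches, scan_examined = sequential_scan(
    rows, "", "", lambda r: r["birthdate"] > 9_997, 100)

# An EQ clause on the indexed column examines only what it returns.
eq_matches, eq_examined = indexed_eq_lookup(birthdate_index, 9_998, 100)

print(scan_matches, scan_examined)
print(eq_matches, eq_examined)
```

In this toy setup, the requested count of 100 is never reached by the non-EQ filter, so the scan cannot stop early. That is the case the comment warns about.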