[jira] [Updated] (CASSANDRA-1600) Merge get_indexed_slices with get_range_slices

Sylvain Lebresne (Updated) (JIRA) Thu, 22 Dec 2011 01:21:01 -0800

     [ https://issues.apache.org/jira/browse/CASSANDRA-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Sylvain Lebresne updated CASSANDRA-1600:
----------------------------------------

    Attachment: 0004-Update-cql-to-not-use-deprecated-index-scan-v3.patch
                0003-Allow-get_range_slices-to-apply-filter-to-a-sequenti-v3.patch
                0002-thrift-generated-code-changes-v3.patch
                0001-Add-optional-FilterClause-to-KeyRange-v3.patch

Attaching a rebased version (the so-called v3 patchset).

For the most part this is a rebase of the preceding patches onto trunk. However, 
I've also slightly reworked the AbstractScanIterator idea from the preceding 
patches, splitting it into two classes in the new patch: ExtendedFilter and 
AbstractScanIterator. It felt like a better separation of concerns for the 
current trunk.

The last patch is new and simply changes CQL to use getRangeSlice with a filter 
rather than the "old" index scan. That last part is not well tested: I only ran 
a stress test with '-o INDEXED_RANGE_SLICE' (and -L to enable CQL, 
obviously).

All the unit tests that are not already broken in trunk pass with this patch.  
For the system tests (test/system/test_thrift_server.py), I get a weird 
Thrift error:
{noformat}
======================================================================
ERROR: Test that column ttled expires from KEYS index
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/nose/case.py", line 187, in runTest
    self.test(*self.arg)
  File "/home/mcmanus/Git/cassandra/test/system/test_thrift_server.py", line 1911, in test_index_scan_expiring
    result = get_range_slice(client, cp, sp, '', '', ConsistencyLevel.ONE, clause)
  File "/home/mcmanus/Git/cassandra/test/system/test_thrift_server.py", line 218, in get_range_slice
    return client.get_range_slices(parent, predicate, kr, cl)
  File "/home/mcmanus/Git/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py", line 669, in get_range_slices
    self.send_get_range_slices(column_parent, predicate, range, consistency_level)
  File "/home/mcmanus/Git/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py", line 679, in send_get_range_slices
    args.write(self._oprot)
  File "/home/mcmanus/Git/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py", line 3619, in write
    oprot.writeI32(self.consistency_level)
  File "/usr/lib/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 110, in writeI32
    buff = pack("!i", i32)
AttributeError: FilterClause instance has no attribute '__trunc__'
{noformat}
After that one, Thrift is left in a bad state and the next test throws another 
completely wacko Thrift exception during setup:
{noformat}
======================================================================
ERROR: system.test_thrift_server.TestMutations.test_index_scan_uuid_names
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/nose/case.py", line 371, in setUp
    try_run(self.inst, ('setup', 'setUp'))
  File "/usr/lib/pymodules/python2.7/nose/util.py", line 478, in try_run
    return func()
  File "/home/mcmanus/Git/cassandra/test/system/__init__.py", line 113, in setUp
    self.define_schema()
  File "/home/mcmanus/Git/cassandra/test/system/__init__.py", line 180, in define_schema
    self.client.system_add_keyspace(ks)
  File "/home/mcmanus/Git/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py", line 1373, in system_add_keyspace
    return self.recv_system_add_keyspace()
  File "/home/mcmanus/Git/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py", line 1389, in recv_system_add_keyspace
    raise x
TApplicationException: Required field 'consistency_level' was not present! 
Struct: 
get_range_slices_args(column_parent:ColumnParent(column_family:Indexed1), 
predicate:SlicePredicate(slice_range:SliceRange(start:80 01 00 01 00 00 00 10 
67 65 74 5F 72 61 6E 67 65 5F 73 6C 69 63 65 73 00 00 00 00 0C 00 01 0B 00 03 
00 00 00 08 49 6E 64 65 78 65 64 31 00 0C 00 02 0C 00 02 0B 00 01 00 00 00 00, 
finish:80 01 00 01 00 00 00 10 67 65 74 5F 72 61 6E 67 65 5F 73 6C 69 63 65 73 
00 00 00 00 0C 00 01 0B 00 03 00 00 00 08 49 6E 64 65 78 65 64 31 00 0C 00 02 
0C 00 02 0B 00 01 00 00 00 00 0B 00 02 00 00 00 00, reversed:false, 
count:100)), range:KeyRange(start_key:80 01 00 01 00 00 00 10 67 65 74 5F 72 61 
6E 67 65 5F 73 6C 69 63 65 73 00 00 00 00 0C 00 01 0B 00 03 00 00 00 08 49 6E 
64 65 78 65 64 31 00 0C 00 02 0C 00 02 0B 00 01 00 00 00 00 0B 00 02 00 00 00 
00 02 00 03 00 08 00 04 00 00 00 64 00 00 0C 00 03 0B 00 01 00 00 00 00, 
end_key:80 01 00 01 00 00 00 10 67 65 74 5F 72 61 6E 67 65 5F 73 6C 69 63 65 73 
00 00 00 00 0C 00 01 0B 00 03 00 00 00 08 49 6E 64 65 78 65 64 31 00 0C 00 02 
0C 00 02 0B 00 01 00 00 00 00 0B 00 02 00 00 00 00 02 00 03 00 08 00 04 00 00 
00 64 00 00 0C 00 03 0B 00 01 00 00 00 00 0B 00 02 00 00 00 00, count:1), 
consistency_level:null)
{noformat}
after which all remaining tests are just skipped because the process hasn't 
been cleanly shut down.
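
The hex blobs in the dump above decode as the start of a Thrift binary-protocol 
message rather than as real slice bounds or keys, which is consistent with the 
server reading the leftover bytes of the aborted get_range_slices call. A 
minimal sketch that decodes the leading bytes of the 'start' value (the helper 
names are made up for illustration; only the bytes come from the dump):

```python
import struct

# Leading bytes of the 'start' value from the server-side dump.
HEX_DUMP = (
    "80 01 00 01 00 00 00 10 "
    "67 65 74 5F 72 61 6E 67 65 5F 73 6C 69 63 65 73 "
    "00 00 00 00 0C 00 01 0B 00 03 00 00 00 08 "
    "49 6E 64 65 78 65 64 31"
)
data = bytes.fromhex(HEX_DUMP.replace(" ", ""))


def read_i32(buf, pos):
    """Read a big-endian signed 32-bit integer, return (value, new_pos)."""
    return struct.unpack("!i", buf[pos:pos + 4])[0], pos + 4


def read_string(buf, pos):
    """Read a length-prefixed ASCII string, return (text, new_pos)."""
    length, pos = read_i32(buf, pos)
    return buf[pos:pos + length].decode("ascii"), pos + length


def read_field_header(buf, pos):
    """Read a field header: 1-byte type id, 2-byte field id."""
    field_type, field_id = struct.unpack("!bh", buf[pos:pos + 3])
    return field_type, field_id, pos + 3


# Strict binary protocol header: version and message type in one 32-bit word,
# then the method name, then the sequence id.
version_and_type = struct.unpack("!I", data[0:4])[0]
message_type = version_and_type & 0xFF      # 1 is CALL
method, pos = read_string(data, 4)
seqid, pos = read_i32(data, pos)

# First argument: a struct field (type 12) holding a string field (type 11).
outer_type, outer_id, pos = read_field_header(data, pos)
inner_type, inner_id, pos = read_field_header(data, pos)
column_family, pos = read_string(data, pos)

print(method, message_type, seqid, (outer_type, outer_id),
      (inner_type, inner_id), column_family)
```

The decoded method name and column family match the request that failed in the 
previous test, so the second exception looks like fallout from the first one 
rather than an independent problem.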

I'm at a loss as to what is causing this. I double-checked my Thrift compiler 
version (even recompiling it from scratch, with no more success). I could use 
someone checking whether they get the same error, or spotting something wrong 
in this test (I don't see anything).
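
One reading of the first traceback, offered as a hypothesis rather than a 
confirmed diagnosis: the value that reaches writeI32 as consistency_level is a 
FilterClause instance, which is what a positional-argument shift in the test 
helper call would produce. The sketch below reproduces that kind of mix-up with 
stand-in names. The helper signature, the FilterClause class and the ONE 
constant are all assumptions, not the code from test_thrift_server.py.

```python
import struct


class FilterClause(object):
    """Hypothetical stand-in for the Thrift-generated FilterClause struct."""

    def __init__(self, expressions):
        self.expressions = expressions


ONE = 1  # stand-in for ConsistencyLevel.ONE


def write_i32(value):
    # Same packing call as the failing frame in TBinaryProtocol.writeI32.
    return struct.pack("!i", value)


def get_range_slice(start, end, count, cl, row_filter=None):
    # ASSUMED signature: 'count' sits before 'cl'. The real helper may differ.
    return {"count": count, "consistency_level": cl, "row_filter": row_filter}


def try_pack(value):
    """Return the exception raised while packing, or None on success."""
    try:
        write_i32(value)
        return None
    except (struct.error, AttributeError, TypeError) as exc:
        return exc


clause = FilterClause([])

# 'count' is omitted, so ONE binds to 'count' and clause binds to 'cl'.
shifted = get_range_slice('', '', ONE, clause)
shifted_error = try_pack(shifted["consistency_level"])

# With every positional argument supplied, the consistency level is an int.
correct = get_range_slice('', '', 100, ONE, clause)
correct_error = try_pack(correct["consistency_level"])

print(type(shifted["consistency_level"]).__name__, repr(shifted_error))
print(type(correct["consistency_level"]).__name__, repr(correct_error))
```

If the real helper takes a count before the consistency level, passing that 
count explicitly in the test call would be the thing to check first.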

It should be noted that this lifts the limitation that index expressions must 
have at least one EQ clause on an indexed column, but in that case we do a 
sequential scan. In other words, if the expressions match very few rows, this 
will be very slow, since the whole range has to be scanned to find them. It is 
worth asking whether allowing such potentially inefficient queries will add 
more confusion than usefulness.
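
To make the cost concern concrete, here is a toy model of a filtered sequential 
scan. It is not the code from the patch: the row layout, the operator table and 
the function name are assumptions chosen for illustration. The point is that 
the work done depends on the number of rows examined, not the number returned.

```python
import operator

# Hypothetical operator table; the names mirror common index operators.
OPS = {
    "EQ": operator.eq,
    "GT": operator.gt,
    "GTE": operator.ge,
    "LT": operator.lt,
    "LTE": operator.le,
}


def sequential_filter_scan(rows, expressions, count):
    """Scan rows in order and keep keys whose columns satisfy every expression.

    rows: iterable of (key, {column_name: value})
    expressions: list of (column_name, op_name, value)
    Returns (matching_keys, rows_examined).
    """
    matches = []
    examined = 0
    for key, columns in rows:
        examined += 1
        if all(col in columns and OPS[op](columns[col], value)
               for col, op, value in expressions):
            matches.append(key)
            if len(matches) >= count:
                break
    return matches, examined


# 10,000 rows, of which only the last two satisfy the expression.
rows = [("k%04d" % i, {"birthdate": i}) for i in range(10000)]
matches, examined = sequential_filter_scan(
    rows, [("birthdate", "GT", 9997)], count=100)
print(matches, examined)
```

In this toy run the scan has to read all 10,000 rows to return two keys, 
because it never reaches 'count' and cannot stop early.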

                
> Merge get_indexed_slices with get_range_slices
> ----------------------------------------------
>
>                 Key: CASSANDRA-1600
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1600
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API
>            Reporter: Stu Hood
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1
>
>         Attachments: 
> 0001-Add-optional-FilterClause-to-KeyRange-and-support-do-v2.patch, 
> 0001-Add-optional-FilterClause-to-KeyRange-and-support-doin.txt, 
> 0001-Add-optional-FilterClause-to-KeyRange-v3.patch, 
> 0002-allow-get_range_slices-to-apply-filter-to-a-sequenti-v2.patch, 
> 0002-allow-get_range_slices-to-apply-filter-to-a-sequential.txt, 
> 0002-thrift-generated-code-changes-v3.patch, 
> 0003-Allow-get_range_slices-to-apply-filter-to-a-sequenti-v3.patch, 
> 0004-Update-cql-to-not-use-deprecated-index-scan-v3.patch
>
>
> From a comment on 1157:
> {quote}
> IndexClause only has a start key for get_indexed_slices, but it would seem 
> that the reasoning behind using 'KeyRange' for get_range_slices applies there 
> as well, since if you know the range you care about in the primary index, you 
> don't want to continue scanning until you exhaust 'count' (or the cluster).
> Since it would appear that get_indexed_slices would benefit from a KeyRange, 
> why not smash get_(range|indexed)_slices together, and make IndexClause an 
> optional field on KeyRange?
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
