[ 
https://issues.apache.org/jira/browse/CASSANDRA-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659149#comment-13659149
 ] 

Mike Schrag edited comment on CASSANDRA-4421 at 5/16/13 2:27 AM:
-----------------------------------------------------------------

I've tracked down the bug. If the token of the last row of a page equals the 
end token of the split, the paging code still tries to fetch the next page 
using the query:

SELECT * FROM [cf] WHERE token(key) > token(?) AND token(key) <= ? LIMIT 1000 
ALLOW FILTERING

Fill in the values: assume your split is 1000-2000 and the last row of the 
page happens to have the max token value, 2000. The query becomes:

SELECT * FROM [cf] WHERE token(key) > 2000 AND token(key) <= 2000 LIMIT 1000 
ALLOW FILTERING

It looks like Cassandra chokes on the impossible predicate here: where it 
should return an empty result, it ACTUALLY returns bogus rows whose tokens fall 
outside the specified range. Once you get a token outside of the split range, 
paging can't recover, and everything goes off the rails.
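
To make the failure case concrete, here is a minimal sketch of a client-side guard that never issues the impossible query. It is written in Python for brevity and is not the patch attached to this ticket. The names `fetch_page` and `read_split` are hypothetical, and a sorted in-memory list of tokens stands in for the cluster, with the split treated as the range (start, end] to match the predicates in the query above.

```python
from bisect import bisect_right


def fetch_page(tokens, after, end, limit):
    """Stand-in for: token(key) > after AND token(key) <= end LIMIT limit.

    `tokens` is a sorted list of ints (an assumption made for this sketch).
    A correct server returns an empty page when after >= end.
    """
    start = bisect_right(tokens, after)
    stop = bisect_right(tokens, end)
    return tokens[start:stop][:limit]


def read_split(tokens, split_start, split_end, page_size):
    """Page through (split_start, split_end] and count the queries issued."""
    rows = []
    queries = 0
    last = split_start
    # Guard: stop as soon as the last token seen reaches the split end,
    # so "token > end AND token <= end" is never sent to the server.
    while last < split_end:
        page = fetch_page(tokens, last, split_end, page_size)
        queries += 1
        rows.extend(page)
        if len(page) < page_size:
            break  # short page: the split is exhausted
        last = page[-1]
    return rows, queries


if __name__ == "__main__":
    tokens = [900, 1100, 1500, 2000, 2100]
    # The first page is full and ends exactly on the split end (2000).
    print(read_split(tokens, 1000, 2000, page_size=3))
```

With a page size of 3, the first page ends exactly on token 2000, and the loop condition stops the reader before it would ask for `token(key) > 2000 AND token(key) <= 2000`.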

                
                  
> Support cql3 table definitions in Hadoop InputFormat
> ----------------------------------------------------
>
>                 Key: CASSANDRA-4421
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4421
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API
>    Affects Versions: 1.1.0
>         Environment: Debian Squeeze
>            Reporter: bert Passek
>              Labels: cql3
>             Fix For: 1.2.5
>
>         Attachments: 4421-1.txt, 4421-2.txt, 4421.txt
>
>
> Hello,
> I hit a bug when writing composite column values that are then validated 
> on the server side.
> This is the setup for reproduction:
> 1. create a keyspace
> create keyspace test with strategy_class = 'SimpleStrategy' and 
> strategy_options:replication_factor = 1;
> 2. create a cf via cql (3.0)
> create table test1 (
>     a int,
>     b int,
>     c int,
>     primary key (a, b)
> );
> When I look at the schema in the CLI, I notice that there is no column 
> metadata for columns that are not part of the primary key.
> create column family test1
>   with column_type = 'Standard'
>   and comparator = 
> 'CompositeType(org.apache.cassandra.db.marshal.Int32Type,org.apache.cassandra.db.marshal.UTF8Type)'
>   and default_validation_class = 'UTF8Type'
>   and key_validation_class = 'Int32Type'
>   and read_repair_chance = 0.1
>   and dclocal_read_repair_chance = 0.0
>   and gc_grace = 864000
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 32
>   and replicate_on_write = true
>   and compaction_strategy = 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>   and caching = 'KEYS_ONLY'
>   and compression_options = {'sstable_compression' : 
> 'org.apache.cassandra.io.compress.SnappyCompressor'};
> Please notice the default validation class: UTF8Type
> Now I would like to insert a value > 127 via the Cassandra client (no CQL, 
> part of MR jobs). Have a look at the attachment.
> Batch mutate fails:
> InvalidRequestException(why:(String didn't validate.) [test][test1][1:c] 
> failed validation)
> A validator for the column value is fetched in 
> ThriftValidation::validateColumnData, which always returns the default 
> validator, which is UTF8Type as described above (the ColumnDefinition for 
> the given column name "c" is always null).
> In UTF8Type there is a check for
> if (b > 127)
>    return false;
> Anyway, maybe I'm doing something wrong, but I used CQL 3.0 for table 
> creation. I assigned data types to all columns, but I cannot set values for 
> a composite column because the default validation class is used.
> I think the schema should know the correct validator even for composite 
> columns. The usage of the default validation class does not make sense.
> Best Regards 
> Bert Passek
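
The validation failure in the quoted report can be illustrated with a small sketch. It assumes the client serializes the int column value as a 4-byte big-endian integer, and it uses Python's strict UTF-8 decoder as a stand-in for Cassandra's UTF8Type validator (it is not the same code). The helper names `int32_bytes` and `validates_as_utf8` are hypothetical.

```python
import struct


def int32_bytes(value):
    """Encode an int as 4 big-endian bytes (assumed wire format for this sketch)."""
    return struct.pack(">i", value)


def validates_as_utf8(raw):
    """Return True if the bytes are well-formed UTF-8 (stand-in validator)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    for value in (127, 128, 200):
        raw = int32_bytes(value)
        print(value, raw, validates_as_utf8(raw))
```

The value 127 encodes to bytes that are all in the ASCII range, so it passes a UTF-8 check. The values 128 and 200 each end in a single byte above 127 with no valid multi-byte sequence around it, so a UTF-8 validator rejects them even though they are perfectly good integers.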

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
