[jira] [Updated] (CASSANDRA-5762) Lost row marker after TTL expires

2013-07-15 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-5762:


Affects Version/s: 1.2.0 (was: 1.2.6)

> Lost row marker after TTL expires
> -
>
> Key: CASSANDRA-5762
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5762
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 1.2.0
> Environment: Ubuntu 12.04
> Reporter: Taner Catakli
> Assignee: Sylvain Lebresne
> Priority: Critical
>
> I have the following table
> cqlsh:loginproject> DESCRIBE TABLE gameservers;
>  
> CREATE TABLE gameservers (
>   address inet PRIMARY KEY,
>   last_update timestamp,
>   regions blob,
>   server_status boolean
> ) WITH
>   bloom_filter_fp_chance=0.01 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.00 AND
>   gc_grace_seconds=864000 AND
>   read_repair_chance=0.10 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'SnappyCompressor'};
> after inserting a row and executing the following command:
> UPDATE gameservers USING TTL 10 SET server_status = true WHERE address = 
> '192.168.0.100'
> after waiting for the TTL to expire, the row will lose its row marker, making 
> "select address from gameservers" return 0 results although there are some.
> in cassandra-cli the table looks like this:
> [default@loginproject] list gameservers;
> Using default limit of 100
> Using default cell limit of 100
> ---
> RowKey: 192.168.0.100
> => (name=last_update, value=0017, timestamp=1373884433543000)
> => (name=regions, value=, timestamp=1373883701652000)
> 1 Row Returned.
> Elapsed time: 345 msec(s).
> [default@loginproject]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5762) Lost row marker after TTL expires

2013-07-16 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-5762:


Attachment: 0001-Always-do-slice-queries-for-CQL3-tables.txt

As much as this pains me, I don't see any easy way to make this work outside of 
doing a read-before-write (which is not acceptable).

It "might" be possible to make it work (without read-before-write) by 
specializing the row marker in the storage so that it tracks TTL to provide the 
desired behavior but at best that wouldn't be trivial and would probably make 
the row marker prohibitive in terms of storage (though something like 
CASSANDRA-4175 might help make it more reasonable). In any case, it's _at best_ 
a solution for 2.1 but not before that, and that's leaving aside the debate of 
whether the feature is worth the complexity.

In the meantime, the best workaround I can come up with would be to force 
SELECT queries to slice the whole CQL3 row even when only some columns are 
selected. That is, we would revert to what we did for selects before 
CASSANDRA-4361. Tbh, this probably wouldn't have much impact on performance 
since 1) CQL3 rows are bound to be relatively small and 2) we now optimize 
slice queries relatively well for that kind of case (partly in 1.2 with 
promoted index and even more in 2.0 with CASSANDRA-5514) so that queries by 
names probably don't have that much benefit anymore.

Doing that would fix the problem in most cases, including the one of the 
description since it'll basically relegate the row marker to only mark rows 
where only the PK is set. This does not fix it fully though, since if you do
{noformat}
CREATE TABLE test (k int PRIMARY KEY, a int, b int);
INSERT INTO test (k, a, b) VALUES (0, 1, 2);
UPDATE test USING TTL 1 SET b=3 WHERE k=0;
// wait 2 seconds
DELETE a FROM test WHERE k=0;
SELECT * FROM test WHERE k=0;
{noformat}
then the last select will return no results, even though it kind of should 
return one result (with {{a == null}} and {{b == null}}) since we haven't done 
a full row deletion. But then we could accept that as a whacky known special 
situation (don't get me wrong, I don't like it, it's just that "we have a 
problem and I don't have a better solution"). And to be fair, you would really 
have to try fairly hard to get bitten by this.
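
The difference between a query by names and a slice query, and the corner case 
that survives the workaround, can be illustrated with a small toy model. This 
is only a sketch and not Cassandra code: the class and method names are 
hypothetical, tombstones are not modelled, and it assumes (as the symptoms in 
the description suggest) that a write carrying a TTL rewrites the row marker 
with that same TTL.

```python
# Toy model, not Cassandra code: every name below is hypothetical.
MARKER = ""  # the row marker, modelled as a cell with an empty column name


class ToyRow:
    """Cells of one CQL3 row, stored as name -> (value, expiry time or None)."""

    def __init__(self):
        self.cells = {}

    def write(self, columns, now, ttl=None):
        # Assumption: each INSERT/UPDATE rewrites the marker with the statement's TTL.
        expiry = None if ttl is None else now + ttl
        for name, value in {MARKER: b"", **columns}.items():
            self.cells[name] = (value, expiry)

    def delete_column(self, name):
        # Tombstones are not modelled; the cell simply disappears.
        self.cells.pop(name, None)

    def live_names(self, now):
        return {n for n, (_, expiry) in self.cells.items() if expiry is None or now < expiry}

    def visible_by_names(self, selected, now):
        # Query by names: only the marker and the selected columns are read.
        return bool(self.live_names(now) & ({MARKER} | set(selected)))

    def visible_by_slice(self, now):
        # Slice query: the whole CQL3 row is read, so any live cell proves existence.
        return bool(self.live_names(now))


# Case from the description: INSERT, then UPDATE ... USING TTL 10.
reported = ToyRow()
reported.write({"last_update": 1373884433, "regions": b""}, now=0)
reported.write({"server_status": True}, now=0, ttl=10)
print(reported.visible_by_names([], now=11))   # SELECT address, read by names
print(reported.visible_by_slice(now=11))       # same SELECT, read as a slice

# Remaining corner case: the TTL'd UPDATE followed by DELETE a.
corner = ToyRow()
corner.write({"a": 1, "b": 2}, now=0)
corner.write({"b": 3}, now=0, ttl=1)
corner.delete_column("a")
print(corner.visible_by_slice(now=2))
```

Under these assumptions the script prints False, True and False: the reported 
row is invisible to a by-names read but visible to a slice, while the corner 
case stays invisible either way because no live cell is left.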

Attaching the patch that does what's above for info (IN queries on the last 
clustering column, which we support, make that slightly more annoying than one 
would hope, but it's not too much of a big deal either).

As mentioned above, another workaround could be to not let users get into that 
state by forcing all the (CQL3) columns of the (CQL3) row to be set in the 
statement if a TTL is used.

The (imho big) problem is that this is a breaking change. If someone is using 
different TTLs in the same CQL3 row (and their application does depend on it), 
they basically cannot upgrade (short of migrating data that have different TTLs 
into their own separate table, which is extremely painful). Part of me is also 
pretty convinced that the convenience of being able to set TTLs on individual 
columns outweighs the "not exactly right" behavior of the special case above 
(especially since only people that *need* per-column TTLs will ever run into 
that special case).
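
For comparison, the alternative workaround discussed above (rejecting a 
statement that uses a TTL unless it sets every column of the CQL3 row) would 
amount to a check along the following lines. This is a hypothetical sketch for 
illustration only; the function name and its parameters are assumptions and 
are not taken from Cassandra or from the attached patch.

```python
# Hypothetical validation sketch; not part of Cassandra or of the attached patch.
def check_ttl_update(table_columns, primary_key, set_columns, ttl):
    """Raise ValueError if a TTL'd statement leaves some non-PK column unset."""
    if ttl is None:
        return  # statements without a TTL are unaffected
    required = set(table_columns) - set(primary_key)
    missing = sorted(required - set(set_columns))
    if missing:
        raise ValueError("TTL requires all columns to be set; missing: " + ", ".join(missing))


columns = ["k", "a", "b"]
check_ttl_update(columns, ["k"], ["a", "b"], ttl=1)   # accepted: every column is set
check_ttl_update(columns, ["k"], ["b"], ttl=None)     # accepted: no TTL involved
try:
    check_ttl_update(columns, ["k"], ["b"], ttl=1)    # UPDATE test USING TTL 1 SET b=3
except ValueError as error:
    print(error)
```

The last call is exactly the kind of per-column TTL statement that existing 
applications may rely on, which is why such a check would be a breaking change.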



[jira] [Updated] (CASSANDRA-5762) Lost row marker after TTL expires

2013-07-16 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-5762:
-

Reviewer: iamaleksey

