[ https://issues.apache.org/jira/browse/CASSANDRA-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aaron Morton updated CASSANDRA-3510:
------------------------------------

    Attachment: 0002-3510-update-maxTimestamp-during-repair.patch
                0001-3510-ignore-maxTimestamp-if-Long.MIN_VALUE.patch

Patch 0001 is a proof-of-concept hack based on my first comment; it generated this output when run with the extra logging:

{noformat}
DEBUG [ReadStage:1] 2011-11-21 23:12:28,578 CollationController.java (line 77) collectTimeOrderedData
DEBUG [ReadStage:1] 2011-11-21 23:12:28,578 CollationController.java (line 111) Got sstable SSTableReader(path='/var/lib/cassandra/data/fmm/Users-hb-12348-Data.db') and max TS 1321824847534000
DEBUG [ReadStage:1] 2011-11-21 23:12:28,578 CollationController.java (line 111) Got sstable SSTableReader(path='/var/lib/cassandra/data/fmm/Users-hb-12346-Data.db') and max TS 1321813380793000
DEBUG [ReadStage:1] 2011-11-21 23:12:28,583 CollationController.java (line 111) Got sstable SSTableReader(path='/var/lib/cassandra/data/fmm/Users-hb-12330-Data.db') and max TS -9223372036854775808
DEBUG [ReadStage:1] 2011-11-21 23:12:28,584 CollationController.java (line 130) Add Sub Column password:false:8@1307352647576000
DEBUG [ReadStage:1] 2011-11-21 23:12:28,585 CollationController.java (line 111) Got sstable SSTableReader(path='/var/lib/cassandra/data/fmm/Users-hb-12325-Data.db') and max TS -9223372036854775808
DEBUG [ReadStage:1] 2011-11-21 23:12:28,587 CollationController.java (line 130) Add Sub Column password:false:8@1307352647576000
DEBUG [ReadStage:1] 2011-11-21 23:12:28,588 CollationController.java (line 111) Got sstable SSTableReader(path='/var/lib/cassandra/data/fmm/Users-hb-12327-Data.db') and max TS -9223372036854775808
DEBUG [ReadStage:1] 2011-11-21 23:12:28,590 CollationController.java (line 130) Add Sub Column password:false:8@1307352647576000
DEBUG [ReadStage:1] 2011-11-21 23:12:28,591 CollationController.java (line 111) Got sstable SSTableReader(path='/var/lib/cassandra/data/fmm/Users-hb-12328-Data.db') and max TS -9223372036854775808
DEBUG [ReadStage:1] 2011-11-21 23:12:28,592 CollationController.java (line 130) Add Sub Column password:false:8@1319563673493000
DEBUG [ReadStage:1] 2011-11-21 23:12:28,593 CollationController.java (line 111) Got sstable SSTableReader(path='/var/lib/cassandra/data/fmm/Users-hb-12326-Data.db') and max TS -9223372036854775808
DEBUG [ReadStage:1] 2011-11-21 23:12:28,595 CollationController.java (line 111) Got sstable SSTableReader(path='/var/lib/cassandra/data/fmm/Users-hb-12331-Data.db') and max TS -9223372036854775808
DEBUG [ReadStage:1] 2011-11-21 23:12:28,596 CollationController.java (line 130) Add Sub Column password:false:8@1307352647576000
DEBUG [ReadStage:1] 2011-11-21 23:12:28,597 CollationController.java (line 111) Got sstable SSTableReader(path='/var/lib/cassandra/data/fmm/Users-hb-12332-Data.db') and max TS -9223372036854775808
DEBUG [pool-2-thread-1] 2011-11-21 23:12:28,604 StorageProxy.java (line 694) Read: 26 ms.
{noformat}
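The idea behind patch 0001 is roughly the following (a sketch only, not the actual diff; it builds on the collectTimeOrderedData() loop quoted in the description below):

{code:java}
// Sketch: treat Long.MIN_VALUE as "maxTimestamp unknown" and skip the
// name-filter reduction for such sstables, so pre-1.0 files are always
// consulted instead of being filtered out on a bogus maxTimestamp.
long currentMaxTs = sstable.getMaxTimestamp();
if (currentMaxTs != Long.MIN_VALUE)
    reduceNameFilter(reducedFilter, container, currentMaxTs);
{code}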
Patch 0002 makes repair update the maxTimestamp; when I ran my test afterwards it created this output:

{noformat}
DEBUG [ReadStage:33] 2011-11-21 23:20:32,032 CollationController.java (line 77) collectTimeOrderedData
DEBUG [ReadStage:33] 2011-11-21 23:20:32,033 CollationController.java (line 111) Got sstable SSTableReader(path='/private/var/lib/cassandra/data/fmm/Users-hc-12357-Data.db') and max TS 1321824847534000
DEBUG [ReadStage:33] 2011-11-21 23:20:32,033 CollationController.java (line 111) Got sstable SSTableReader(path='/var/lib/cassandra/data/fmm/Users-hc-12352-Data.db') and max TS 1321813380793000
DEBUG [ReadStage:33] 2011-11-21 23:20:32,063 CollationController.java (line 111) Got sstable SSTableReader(path='/private/var/lib/cassandra/data/fmm/Users-hc-12356-Data.db') and max TS 1321560509938000
DEBUG [ReadStage:33] 2011-11-21 23:20:32,064 CollationController.java (line 111) Got sstable SSTableReader(path='/var/lib/cassandra/data/fmm/Users-hc-12349-Data.db') and max TS 1319813295567000
DEBUG [ReadStage:33] 2011-11-21 23:20:32,105 CollationController.java (line 130) Add Sub Column password:false:8@1319563673493000
DEBUG [ReadStage:33] 2011-11-21 23:20:32,105 CollationController.java (line 111) Got sstable SSTableReader(path='/var/lib/cassandra/data/fmm/Users-hc-12350-Data.db') and max TS 1318523190222000
DEBUG [pool-2-thread-2] 2011-11-21 23:20:32,106 StorageProxy.java (line 694) Read: 74 ms.
{noformat}
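In spirit, patch 0002 does something like the following (again a sketch with illustrative method names, not the real diff): as repair writes columns, it keeps the metadata collector's maxTimestamp current instead of trusting the value inherited from the source files.

{code:java}
// Sketch of the patch 0002 idea (method names are illustrative):
// fold the timestamps of the columns the repair stream actually writes
// into the metadata collector's maxTimestamp, rather than trusting the
// (possibly Long.MIN_VALUE) value carried over from the source
// sstables' metadata.
long maxTimestamp = Long.MIN_VALUE;
for (IColumn column : columnFamily.getSortedColumns())
    maxTimestamp = Math.max(maxTimestamp, column.timestamp());
metadataCollector.updateMaxTimestamp(maxTimestamp);
{code}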
> Incorrect query results due to invalid SSTable.maxTimestamp
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-3510
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3510
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.3
>            Reporter: Aaron Morton
>            Priority: Critical
>         Attachments: 0001-3510-ignore-maxTimestamp-if-Long.MIN_VALUE.patch, 0002-3510-update-maxTimestamp-during-repair.patch
>
>
> related to CASSANDRA-3446
> (sorry this is so long, took me a bit to work through it all and there is a lot of new code :) )
>
> h1. Summary
>
> SSTable.maxTimestamp for files created before 1.0 defaults to Long.MIN_VALUE, and this means the wrong data is returned from queries.
>
> h2. Details
>
> Noticed on a cluster that was upgraded from 0.8.X to 1.X; it then had trouble similar to CASSANDRA-3446. It was rolled back to 0.8 and then migrated to 1.0.3.
> 4-node cluster, all files upgraded to the "hb" format.
> In a super CF there are situations where a get for a sub column returns a different value than a get for the column, e.g.:
> {noformat}
> [default@XXX] get Users[ascii('username')]['meta']['password'];
> => (column=password, value=3130323130343130, timestamp=1307352647576000)
> [default@XX] get Users[ascii('username')]['meta'];
> (snip)
> => (column=password, value=3034323131303034, timestamp=1319563673493000)
> {noformat}
> The correct value is the second one.
> I added logging after line 109 in o.a.c.db.CollationController.collectTimeOrderedData() to log the sstable name and the file's max timestamp; this is what I got:
> {code:java}
> for (SSTableReader sstable : view.sstables)
> {
>     long currentMaxTs = sstable.getMaxTimestamp();
>     logger.debug(String.format("Got sstable %s and max TS %d", sstable, currentMaxTs));
>     reduceNameFilter(reducedFilter, container, currentMaxTs);
> {code}
> {noformat}
> DEBUG 14:08:46,012 Got sstable SSTableReader(path='/var/lib/cassandra/data/X/Users-hb-12348-Data.db') and max TS 1321824847534000
> DEBUG 14:08:47,231 Got sstable SSTableReader(path='/var/lib/cassandra/data/X/Users-hb-12346-Data.db') and max TS 1321813380793000
> DEBUG 14:08:49,879 Got sstable SSTableReader(path='/var/lib/cassandra/data/X/Users-hb-12330-Data.db') and max TS -9223372036854775808
> DEBUG 14:08:49,880 Got sstable SSTableReader(path='/var/lib/cassandra/data/X/Users-hb-12325-Data.db') and max TS -9223372036854775808
> {noformat}
> The key I was reading is present in files 12330 and 12325; the first contains the *old / wrong* value with timestamp 1307352647576000 above. The second contains the *new / correct* value with timestamp 1319563673493000.
> When CollationController.collectTimeOrderedData() processes the 12325 file (after processing the 12330 file) while looping over the sstables, the call to reduceNameFilter() removes the column from the filter, because the column read from the 12330 file has a timestamp of 1307352647576000 and the 12325 file incorrectly has a max timestamp of -9223372036854775808.
> SSTableMetadata is reading the max timestamp from the stats file, but it is Long.MIN_VALUE. I think this happens because scrub creates the SSTableWriter using cfs.createCompactionWriter(), which sets the maxTimestamp in the metadata collector according to the maxTimestamp in the metadata for the file(s) that will be scrubbed / compacted. But for pre-1.0 format files the default in SSTableMetadata is Long.MIN_VALUE (see SSTableMetadata.deserialize() and the ctor). So scrubbing a pre-1.0 file will write stats files that have maxTimestamp as Long.MIN_VALUE.
> During scrubbing the SSTableWriter does not update the maxTimestamp, because append(AbstractCompactedRow) is called, which expects that cfs.createCompactionWriter() was able to set the correct maxTimestamp on the metadata. Compaction also uses append(AbstractCompactedRow), so it may create an SSTable with an incorrect maxTimestamp if one of the input files started life as a pre-1.0 file and has a bad maxTimestamp.
> It looks like the only time the maxTimestamp is calculated is when the SSTable is originally written, so the error from the old files will be carried along.
> e.g. If the files a, b and c have the maxTimestamps 10, 100 and Long.MIN_VALUE, compaction will write an SSTable with maxTimestamp 100. However file c may actually contain columns with a timestamp > 100, which will end up in the compacted file.
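> To make the filter arithmetic concrete, here is a self-contained toy (my own illustration, not Cassandra code) that replays the two-file scenario: the file holding the newer value has the bogus maxTimestamp, so it is read last and gets filtered out.
> {code:java}
> // Toy replay of the scenario above (illustration only, not Cassandra code).
> // File A holds the old value and a valid maxTimestamp; file B holds the
> // newer value but a bogus maxTimestamp of Long.MIN_VALUE. Files are read
> // in maxTimestamp order, so B comes last.
> public class MaxTimestampDemo
> {
>     public static void main(String[] args)
>     {
>         long[][] sstables = {
>             // { file maxTimestamp, timestamp of the 'password' column it holds }
>             { 1321813380793000L, 1307352647576000L },   // A: old value, valid max TS
>             { Long.MIN_VALUE,    1319563673493000L },   // B: new value, bad max TS
>         };
>         Long best = null;
>         boolean wanted = true;  // 'password' still in the name filter
>         for (long[] s : sstables)
>         {
>             // reduceNameFilter analogue: a file whose maxTimestamp is older
>             // than a version we already hold cannot improve on it
>             if (best != null && best > s[0])
>                 wanted = false;
>             if (wanted)
>                 best = (best == null) ? s[1] : Math.max(best, s[1]);
>         }
>         System.out.println("returned timestamp: " + best);  // prints the OLD 1307352647576000
>     }
> }
> {code}
> With a guard like the one in patch 0001, B's bogus maxTimestamp would be ignored and the newer value would win.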
> h1. Reproduce
>
> 1. Start a clean 0.8.7
> 2. Add a schema (details of the schema do not matter):
> {noformat}
> [default@unknown] create keyspace dev;
> 5f834620-140b-11e1-0000-242d50cf1fdf
> Waiting for schema agreement...
> ... schemas agree across the cluster
> [default@unknown]
> [default@unknown] use dev;
> Authenticated to keyspace: dev
> [default@dev]
> [default@dev] create column family super_dev with column_type = 'Super'
> ... and key_validation_class = 'AsciiType' and comparator = 'AsciiType' and
> ... subcomparator = 'AsciiType' and default_validation_class = 'AsciiType';
> 60490720-140b-11e1-0000-242d50cf1fdf
> Waiting for schema agreement...
> ... schemas agree across the cluster
> {noformat}
> 3. Shutdown 0.8.7
> 4. Start 1.0.3 using the same data. Check the schema version loaded; the example below shows the wrong schema being loaded. I stepped through the code and the wrong value was read from Migration.getLastMigrationId() due to this bug.
> {noformat}
> INFO [main] 2011-11-21 19:39:08,546 DatabaseDescriptor.java (line 501) Loading schema version 5f834620-140b-11e1-0000-242d50cf1fdf
> {noformat}
> 5. Check the schema using the 1.0.3 CLI:
> {noformat}
> [default@unknown] use dev;
> Authenticated to keyspace: dev
> [default@dev] describe;
> Keyspace: dev:
>   Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
>   Durable Writes: true
>     Options: [datacenter1:1]
>   Column Families:
> [default@dev]
> {noformat}
> 6. I then did a 1.0.3 scrub and restarted. The correct schema version was read, but stepping through the code showed that both Schema SSTables had Long.MIN_VALUE as the maxTimestamp, so I think it was only the random order of the files that made it work.
> {noformat}
> DEBUG 19:52:30,744 Got sstable SSTableReader(path='/var/lib/cassandra/data/system/Schema-hb-4-Data.db') and max TS -9223372036854775808
> DEBUG 19:52:30,744 Got sstable SSTableReader(path='/var/lib/cassandra/data/system/Schema-hb-3-Data.db') and max TS -9223372036854775808
> {noformat}
>
> h1. Fixes
>
> Not sure (wanted to get the ticket opened and find out if I was imagining things), so guessing...
> Use Long.MIN_VALUE as a magic maxTimestamp that means the value is not known. This would not fix issues where the incorrect maxTimestamp has been included in a compaction.
> Looking at making scrub re-calculate the maxTimestamp.
> Also wondering if the maxTimestamp should default to Long.MAX_VALUE if read from a file format that does not support maxTimestamp?
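> For the scrub re-calculation idea, the shape of the fix might be something like this (a sketch only; row.maxTimestamp() and the collector call are my assumptions, not actual 1.0 API):
> {code:java}
> // Sketch: recompute maxTimestamp from the rows scrub actually writes,
> // instead of trusting the value cfs.createCompactionWriter() pre-set
> // from the (possibly Long.MIN_VALUE) metadata of the input files.
> long maxTimestamp = Long.MIN_VALUE;
> while (rows.hasNext())
> {
>     AbstractCompactedRow row = rows.next();
>     maxTimestamp = Math.max(maxTimestamp, row.maxTimestamp()); // assumed accessor
>     writer.append(row);
> }
> metadataCollector.updateMaxTimestamp(maxTimestamp); // assumed collector hook
> {code}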