Issue created.

I will attach debug logs as soon as possible.
CASSANDRA-4670 <https://issues.apache.org/jira/browse/CASSANDRA-4670>

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Monday, 17 September 2012 03:46
To: user@cassandra.apache.org
Subject: Re: secondery indexes TTL - strange issues

 Data gets inserted and is accessible via index query for some time. At some 
point in time the indexes are completely empty and start filling again (while 
new data enters the system).
If you can reproduce this please create a ticket on 
https://issues.apache.org/jira/browse/CASSANDRA .

If you can include DEBUG level logs that would be helpful.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/09/2012, at 10:08 PM, Roland Gude <roland.g...@ez.no> wrote:


I am not sure it is compacting an old file: the same thing happens every time I 
rebuild the index. New files appear, get compacted and vanish.

We have set up a new smaller cluster with fresh data. The same thing happens 
there as well. Data gets inserted and is accessible via index query for some 
time. At some point in time the indexes are completely empty and start filling 
again (while new data enters the system).

I am currently testing with SizeTiered on both the fresh set and the imported 
set.

For the fresh set (which is significantly smaller), first results suggest that 
the issue does not happen with SizeTieredCompaction. I have not yet tested 
everything that comes to mind and will update if something new comes up.

As for the failing query it is from the cli:
get EventsByItem where 00000003-0000-1000-0000-000000000000=utf8('someValue');
00000003-0000-1000-0000-000000000000 is a TUUID we use as a marker for a 
TimeSeries.
(and equivalent queries with astyanax and hector as well)

This is a cf with the issue:

create column family EventsByItem
  with column_type = 'Standard'
  and comparator = 'TimeUUIDType'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'BytesType'
  and read_repair_chance = 0.5
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
  and caching = 'NONE'
  and column_metadata = [
    {column_name : '00000000-0000-1000-0000-000000000000',
    validation_class : BytesType,
    index_name : 'ebi_mandatorIndex',
    index_type : 0},
    {column_name : '00000002-0000-1000-0000-000000000000',
    validation_class : BytesType,
    index_name : 'ebi_itemidIndex',
    index_type : 0},
    {column_name : '00000003-0000-1000-0000-000000000000',
    validation_class : BytesType,
    index_name : 'ebi_eventtypeIndex',
    index_type : 0}]
  and compression_options={sstable_compression:SnappyCompressor, chunk_length_kb:64};

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Friday, 14 September 2012 10:46
To: user@cassandra.apache.org
Subject: Re: secondery indexes TTL - strange issues

INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 CompactionTask.java (line 221) Compacted to [/var/lib/cassandra/data/Eventstore/EventsByItem/Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,].  78,623,000 to 373,348 (~0% of original) bytes for 83 keys at 0.000280MB/s.  Time: 1,272,883ms.
There are a lot of weird things here.
It could be levelled compaction compacting an older file for the first time. 
But that would be a guess.

Rebuilding the index gives us back the data for a couple of minutes - then it 
vanishes again.
Are you able to do a test with SizeTieredCompaction?

Are you able to replicate the problem with a fresh testing CF and some test 
data?

If it's only a problem with imported data, can you provide a sample of the 
failing query? And maybe the CF definition?

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/09/2012, at 2:46 AM, Roland Gude <roland.g...@ez.no> wrote:



Hi,

we have been running a system on Cassandra 0.7 heavily relying on secondary 
indexes for columns with TTL.
This has been working like a charm, but we are trying hard to move forward with 
Cassandra and are struggling at that point:

When we put our data into a new cluster (any 1.1.x version, currently 1.1.5), 
rebuild indexes and run our system, everything seems to work well, until at 
some point in time index queries no longer return any data at all (note that 
the TTL will not expire for several months).
Rebuilding the index gives us back the data for a couple of minutes - then it 
vanishes again.

What seems strange is that compaction apparently is very aggressive:

INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 CompactionTask.java (line 221) Compacted to [/var/lib/cassandra/data/Eventstore/EventsByItem/Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,].  78,623,000 to 373,348 (~0% of original) bytes for 83 keys at 0.000280MB/s.  Time: 1,272,883ms.
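
For reference, the figures in that log line can be checked with a short Python 
sketch. The regular expression and the names PATTERN, to_int and 
parse_compaction are illustrative assumptions, not part of Cassandra. The 
1 MB = 1024 * 1024 bytes convention is also an assumption, chosen because it 
reproduces the reported rate from the output size and the elapsed time.

```python
import re

# Log line copied from the message above, joined onto one logical line.
LOG_LINE = (
    "INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 "
    "CompactionTask.java (line 221) Compacted to "
    "[/var/lib/cassandra/data/Eventstore/EventsByItem/"
    "Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,].  "
    "78,623,000 to 373,348 (~0% of original) bytes for 83 keys "
    "at 0.000280MB/s.  Time: 1,272,883ms."
)

# Hypothetical pattern for the "N to M (...) bytes for K keys at R MB/s" tail.
PATTERN = re.compile(
    r"([\d,]+) to ([\d,]+) \(.*?\) bytes for ([\d,]+) keys "
    r"at ([\d.]+)MB/s\.\s+Time: ([\d,]+)ms"
)


def to_int(text):
    """Convert a number with thousands separators, e.g. '78,623,000'."""
    return int(text.replace(",", ""))


def parse_compaction(line):
    """Extract sizes, key count and timing from a 'Compacted to' log line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a 'Compacted to' log line")
    in_bytes, out_bytes, keys, reported_rate, millis = match.groups()
    in_bytes, out_bytes = to_int(in_bytes), to_int(out_bytes)
    seconds = to_int(millis) / 1000.0
    return {
        "in_bytes": in_bytes,
        "out_bytes": out_bytes,
        "keys": to_int(keys),
        "seconds": seconds,
        # Fraction of the input that survived compaction.
        "ratio": out_bytes / in_bytes,
        # Rate recomputed from output bytes (assumes 1 MB = 1024 * 1024 bytes).
        "mb_per_s": out_bytes / (1024 * 1024) / seconds,
        "reported_mb_per_s": float(reported_rate),
    }


if __name__ == "__main__":
    stats = parse_compaction(LOG_LINE)
    print("surviving fraction: {:.4%}".format(stats["ratio"]))
    print("elapsed minutes:    {:.1f}".format(stats["seconds"] / 60))
    print("recomputed MB/s:    {:.6f}".format(stats["mb_per_s"]))
```

With these numbers, less than half a percent of the index data survives the 
compaction, and the run takes a little over 21 minutes for 83 keys.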


Actually we have switched to LeveledCompaction. Could it be that leveled 
compaction does not play nice with indexes?


