Hi: We are currently using Cassandra 0.8.10 and have run into some strange issues when querying for a range of data.
I ran a couple of get statements via the Cassandra client and found some interesting results.

Consider the following Column Family Definition:

    ColumnFamily: events
      Key Validation Class: org.apache.cassandra.db.marshal.BytesType
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
      Row cache size / save period in seconds: 0.0/0
      Row Cache Provider: org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider
      Key cache size / save period in seconds: 200000.0/14400
      Memtable thresholds: 0.2953125/1440/63 (millions of ops/minutes/MB)
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Replicate on write: true
      Built indexes: [events.events_Firm_idx, events.events_OrdType_idx, events.events_OrderID_idx, events.events_OrderQty_idx, events.events_Price_idx, events.events_Symbol_idx, events.events_ds_timestamp_idx]
      Column Metadata:
        Column Name: Firm
          Validation Class: org.apache.cassandra.db.marshal.BytesType
          Index Name: events_Firm_idx
          Index Type: KEYS
        Column Name: OrdType
          Validation Class: org.apache.cassandra.db.marshal.BytesType
          Index Name: events_OrdType_idx
          Index Type: KEYS
        Column Name: OrderID
          Validation Class: org.apache.cassandra.db.marshal.BytesType
          Index Name: events_OrderID_idx
          Index Type: KEYS
        Column Name: OrderQty
          Validation Class: org.apache.cassandra.db.marshal.LongType
          Index Name: events_OrderQty_idx
          Index Type: KEYS
        Column Name: Price
          Validation Class: org.apache.cassandra.db.marshal.LongType
          Index Name: events_Price_idx
          Index Type: KEYS
        Column Name: Symbol
          Validation Class: org.apache.cassandra.db.marshal.BytesType
          Index Name: events_Symbol_idx
        Column Name: ds_timestamp
          Validation Class: org.apache.cassandra.db.marshal.LongType
          Index Name: events_ds_timestamp_idx
          Index Type: KEYS

    get events WHERE Firm=434550 AND ds_timestamp=1341955958200;

…and the results are pretty much instantaneous. 1 Row Returned.

    [default@FIX] get events WHERE Firm=434550 AND ds_timestamp=1341955958200;
    -------------------
    RowKey: 64326430363362302d636164362d313165312d626637622d333836303737306639303133
    => (column=ClOrdID, value=32323833, timestamp=1341955980651010)
    => (column=Firm, value=434550, timestamp=1341955980651026)
    => (column=OrdType, value=31, timestamp=1341955980651008)
    => (column=OrderQty, value=8200, timestamp=1341955980651013)
    => (column=Price, value=433561, timestamp=1341955980651019)
    => (column=Symbol, value=544e54, timestamp=1341955980651018)
    => (column=ds_timestamp, value=1341955958200, timestamp=1341955980651020)

If I run the following query:

    get events WHERE Firm=434550 AND ds_timestamp>=1341955958200 AND ds_timestamp<=1341955958200;

(which in theory should return the same 1 row result), it runs for around 12 seconds, and I get:

    TimedOutException()

If I run:

    get events WHERE Firm=434550 AND ds_timestamp>=1341955958200;

or

    get events WHERE Firm=434550 AND ds_timestamp<=1341955958200;

the results return quickly.

Curious, I also ran a similar set of queries against the Price field:

    get events WHERE Firm=434550 AND Price=433561;
    get events WHERE Firm=434550 AND Price>=433561;
    get events WHERE Firm=434550 AND Price<=433561;

These all work fine, while

    get events WHERE Firm=434550 AND Price=433561 AND Price <= 433561;

returns an IO Exception.

This feels like it’s attempting to do a full table scan here…

What is going on here? Am I doing something incorrectly?

We also see similar behavior when we submit the query through our app via the Thrift API.

Thanks,
JohnB
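
P.S. To make the expectation concrete, here is a minimal Python sketch of the semantics I am assuming. It is only an in-memory model and says nothing about how Cassandra actually evaluates the index clauses. The `rows` list and the `select` helper are hypothetical, and only the first row uses values from the output above; the other two rows are made up so that the filter has something to exclude.

```python
# Hypothetical in-memory stand-in for the events column family.
# Only the first row comes from the CLI output; the rest are invented.
rows = [
    {"Firm": "434550", "Price": 433561, "ds_timestamp": 1341955958200},
    {"Firm": "434550", "Price": 433600, "ds_timestamp": 1341955958300},
    {"Firm": "414243", "Price": 433561, "ds_timestamp": 1341955958200},
]


def select(rows, firm, column, low=None, high=None):
    """Return rows for one firm whose column value lies in [low, high].

    A bound left as None is treated as open on that side.
    """
    return [
        r for r in rows
        if r["Firm"] == firm
        and (low is None or r[column] >= low)
        and (high is None or r[column] <= high)
    ]


ts = 1341955958200

# Firm=434550 AND ds_timestamp=1341955958200
equality = [r for r in rows if r["Firm"] == "434550" and r["ds_timestamp"] == ts]

# Firm=434550 AND ds_timestamp>=1341955958200 AND ds_timestamp<=1341955958200
closed_range = select(rows, "434550", "ds_timestamp", low=ts, high=ts)

# Under these assumed semantics both queries pick out the same single row.
print(closed_range == equality)
```

In this model the closed range with identical bounds is just another way of writing the equality, which is why I expected the second query to be as fast as the first.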