[ https://issues.apache.org/jira/browse/CASSANDRA-10084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705971#comment-14705971 ]
Brent Haines commented on CASSANDRA-10084:
------------------------------------------

I did a lot of tuning with prefetching and threads per client, and added multithreading to our query collator. Performance has improved a lot, but it doesn't come close to what we had before we added the collection to the table.

Right now I have discovered a query for a specific index value that is particularly slow: 3 minutes for 10,000 records. At first it returned only about 1% of the data, without any error. I ran a repair on one of the nodes owning that partition key and the query seems to be working now, but it is very slow. I have attached stack dumps for every node, though I am not certain which one is doing the work at any given time.

Stupid question - is there a quick way to see which nodes own the key for a specific query? Today I turn tracing on and run the query a bunch of times until all three replicas have shown up (see the getendpoints and paging sketches after the quoted issue below). Please see the attached profiles for the 3 nodes.

We run an incremental repair nightly. The repairs usually finish, but sometimes nodes report *much* more storage than they actually own. Each node owns about 60 to 90 GB, yet after a repair some nodes will claim 2+ TB! Restarting reveals that they are way behind on compaction, and it takes about 2 hours to clear the backlog.

> Very slow performance streaming a large query from a single CF
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-10084
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10084
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Cassandra 2.1.8
>                      12GB EC2 instance
>                      12 node cluster
>                      32 concurrent reads
>                      32 concurrent writes
>                      6GB heap space
>            Reporter: Brent Haines
>         Attachments: cassandra.yaml
>
>
> We have a relatively simple column family that we use to track event data from different providers. We have been utilizing it for some time. Here is what it looks like:
> {code}
> CREATE TABLE data.stories_by_text (
>     ref_id timeuuid,
>     second_type text,
>     second_value text,
>     object_type text,
>     field_name text,
>     value text,
>     story_id timeuuid,
>     data map<text, text>,
>     PRIMARY KEY ((ref_id, second_type, second_value, object_type, field_name), value, story_id)
> ) WITH CLUSTERING ORDER BY (value ASC, story_id ASC)
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>     AND comment = 'Searchable fields and actions in a story are indexed by ref id which corresponds to a brand, app, app instance, or user.'
>     AND compaction = {'min_threshold': '4', 'cold_reads_to_omit': '0.0', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
>     AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99.0PERCENTILE';
> {code}
> We will, on a daily basis, pull a query of the complete data for a given index; it looks like this:
> {code}
> select * from stories_by_text where ref_id = f0124740-2f5a-11e5-a113-03cdf3f3c6dc and second_type = 'Day' and second_value = '20150812' and object_type = 'booshaka:user' and field_name = 'hashedEmail';
> {code}
> In the past, we have been able to pull millions of records out of the CF in a few seconds.
> We recently added the data column so that we could filter on event data and provide more detailed analysis of activity for our reports. The data map, declared as 'data map<text, text>', is very small; only 2 or 3 name/value pairs per row.
>
> Since we added this column, our streaming query performance has gone straight to hell. I just ran the above query and it took 46 minutes to read 86K rows, and then it timed out.
>
> I am uncertain what other data you need in order to diagnose this. We are using STCS and are considering a change to Leveled Compaction. The table is repaired nightly, and the updates, which come at a very fast clip, only touch today's partition key, while the queries are for previous days only.
>
> To my knowledge these queries never finish anymore. They time out, even though I set a 60-second read timeout on the cluster. I can watch the stream pause for 30 to 50 seconds many times during the read.
>
> Again, this only started happening when we added the data column.
>
> Please let me know what else you need from me. This is having a very big impact on our system.
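On the replica-ownership question above: nodetool getendpoints prints the nodes that own a given partition key, computed from ring metadata, so any node can answer and the trace-and-rerun loop isn't needed. A minimal sketch against the schema in the quoted issue, assuming a nodetool recent enough to parse a composite partition key as colon-separated components (the literal ':' in 'booshaka:user' is escaped with a backslash here; exact escaping may vary by version):

{code}
# Print the replicas that own the partition targeted by the daily query.
# Key components follow the partition key order:
#   (ref_id, second_type, second_value, object_type, field_name)
nodetool getendpoints data stories_by_text \
    'f0124740-2f5a-11e5-a113-03cdf3f3c6dc:Day:20150812:booshaka\:user:hashedEmail'
{code}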
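On the streaming stalls: the 30-to-50-second pauses look like individual page fetches brushing up against the read timeout, so one client-side experiment is to shrink the driver's page size and let the driver page through the partition transparently. A minimal sketch with the DataStax Python driver (the contact point, timeout, and fetch_size values are illustrative assumptions, not recommendations):

{code}
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# Assumed contact point; any node in the cluster will do.
cluster = Cluster(['10.0.0.1'])
session = cluster.connect('data')
session.default_timeout = 120  # client-side request timeout, in seconds

query = """
    SELECT * FROM stories_by_text
    WHERE ref_id = f0124740-2f5a-11e5-a113-03cdf3f3c6dc
      AND second_type = 'Day'
      AND second_value = '20150812'
      AND object_type = 'booshaka:user'
      AND field_name = 'hashedEmail'
"""

# fetch_size sets the server-side page size: smaller pages mean more
# round trips, but each one stays well under the read timeout.
statement = SimpleStatement(query, fetch_size=500)

rows = 0
for row in session.execute(statement):  # pages are fetched transparently
    rows += 1
print("read %d rows" % rows)

cluster.shutdown()
{code}

If a small page size smooths out the pauses, the per-page work on the replicas (each cell now carrying the map's extra overhead) is the likely bottleneck rather than client-side prefetching.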