[ https://issues.apache.org/jira/browse/CASSANDRA-9033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382084#comment-14382084 ]

Brent Haines commented on CASSANDRA-9033:
-----------------------------------------

If we run 2.1.3, our cluster *will* fail. We have started with upgraded data and 
we have started with an empty keyspace. Either way, the number of files for all 
of our LCS tables grows until the machines collapse. That seems like a blocker 
to me.

* I don't have a graph of the number of files. 
* We have observed this on nodes after a repair and during the day when no 
repairs have been run.
* Repair doesn't seem to actually work on 2.1.3. Downgrading back to 2.1.1 
restored the nodes' performance in every case and repair worked again. After a 
few hours, everything was back to normal.

Some more observations -

* There are something like 50K to 147K files on the nodes for this table right 
now. C* 2.1.1 works with them. 
* There is plenty of bandwidth for compactions. No compactions are pending.
* Downgrading from 2.1.3 to 2.1.1 kicked off all sorts of compactions and 
brought the file count back under 100K (it was so high before that I couldn't 
get an accurate count).
* Upgrading to 2.1.3 created issues immediately on some nodes, while other 
nodes performed fine until suddenly they didn't.
* We ran into issues on this cluster after successfully upgrading to 2.1.3. 
This is all log data for our app - it is the product of many Storm machines 
humming away on data, and we can rebuild it (although not without cost) - so 
the first time we had trouble, I just wiped the disks and started over on 
2.1.3. Within a week we had too many files and nodes were failing.
* We never had a complete 2.1.3 cluster. We never made it all the way through 
the upgrade before things went to shit, so it may be some interaction between 
2.1.1 and 2.1.3 (which seems unlikely).

Bottom line: if we use LCS with 2.1.3, we will fail.

I have experimented ad nauseam with the settings. I have added compactors, 
increased compaction throughput and reduced it, added RAM to the heap, added 
off-heap RAM for memtables, and toggled trickle_fsync on and off. This happens 
for us every time.
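
For reference, these are the knobs I have been varying. The values below are 
only illustrative, not our exact production settings - the attached 
cassandra.yaml and cassandra-env.sh have the real ones:

{code}
# cassandra.yaml
concurrent_compactors: 4                   # "added compactors"
compaction_throughput_mb_per_sec: 64       # raised and lowered
memtable_allocation_type: offheap_objects  # off-heap RAM for memtables
memtable_offheap_space_in_mb: 2048
trickle_fsync: true                        # toggled on and off

# cassandra-env.sh
MAX_HEAP_SIZE="8G"                         # added RAM to the heap
{code}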

# Are there settings I should look at specifically?
# I will start a task to log the file count for this table on all of the 2.1.1 
nodes; I don't expect it to grow there. A rough sketch of what I have in mind 
is below.
# Is there a way to force these files to be rolled up / compacted?
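
To be concrete about the file-count task, this is a minimal sketch of what I 
mean, assuming the default data directory layout and log path rather than our 
actual deployment:

{code}
#!/usr/bin/env python
# Append a timestamped SSTable count for data.stories to a log file.
# Assumptions: default data_file_directories and log location; run from cron
# on each node.
import glob
import time

DATA_DIR = "/var/lib/cassandra/data"   # assumption: default data directory
KEYSPACE = "data"
TABLE = "stories"

def sstable_count():
    # One *-Data.db component per SSTable; the table directory's id suffix is
    # matched with a glob.
    pattern = "%s/%s/%s-*/*-Data.db" % (DATA_DIR, KEYSPACE, TABLE)
    return len(glob.glob(pattern))

if __name__ == "__main__":
    with open("/var/log/cassandra/sstable-count.log", "a") as log:
        log.write("%s %d\n" % (time.strftime("%Y-%m-%dT%H:%M:%S"), sstable_count()))
{code}

Cron that every few minutes per node and the growth curve on 2.1.1 versus 2.1.3 
should be easy to compare.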


> Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes 
> unresponsive
> ---------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-9033
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9033
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: * Ubuntu 14.04.2 - Linux ip-10-0-2-122 3.13.0-46-generic 
> #79-Ubuntu SMP Tue Mar 10 20:06:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> * EC2 m2-xlarge instances [4cpu, 16GB RAM, 1TB storage on 3 platters]
> * 12 nodes running a mix of 2.1.1 and 2.1.3
> * 8GB heap with offheap objects
>            Reporter: Brent Haines
>            Assignee: Marcus Eriksson
>         Attachments: cassandra-env.sh, cassandra.yaml, system.log.1.zip
>
>
> We have an Event Log table using LCS that has grown fast. There are more than 
> 100K sstable files that are around 1KB each. Increasing compactors and raising 
> compaction throttling doesn't make a difference. It had been running great, 
> though, until we upgraded to 2.1.3. Those nodes needed more RAM for the heap 
> (12 GB) to even have a prayer of responding to queries. They bog down and 
> become unresponsive. There are no GC messages that I can see, and no 
> compaction either. 
> The only work-around I have found is to decommission, blow away the big CF, 
> and rejoin. That takes about 20 minutes and everything is freaking happy 
> again. The size of the files is more like what I'd expect as well. 
> Our schema: 
> {code}
> cqlsh> describe columnfamily data.stories
> CREATE TABLE data.stories (
>     id timeuuid PRIMARY KEY,
>     action_data timeuuid,
>     action_name text,
>     app_id timeuuid,
>     app_instance_id timeuuid,
>     data map<text, text>,
>     objects set<timeuuid>,
>     time_stamp timestamp,
>     user_id timeuuid
> ) WITH bloom_filter_fp_chance = 0.01
>     AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>     AND comment = 'Stories represent the timeline and are placed in the dashboard for the brand manager to see'
>     AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
>     AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99.0PERCENTILE';
> cqlsh> 
> {code}
> There were no log entries that stood out; it was pretty much "x is down" / 
> "x is up" repeated ad infinitum. I have attached the zipped system.log, which 
> covers the situation after the upgrade and then after I stopped the node, 
> removed system, system_traces, OpsCenter, and data/stories-/*, and restarted. 
> The node has rejoined the cluster now and is busy read-repairing to recover 
> its data.
> On another note, we see a lot of this during repair now (on all the nodes): 
> {code}
> ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,207 RepairSession.java:303 
> - [repair #c5043c40-d260-11e4-a2f2-8bb3e2bbdb35] session completed with the 
> following error
> java.io.IOException: Failed during snapshot creation.
>         at 
> org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344)
>  ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at 
> org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:146) 
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) 
> ~[guava-16.0.jar:na]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_55]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_55]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]
> ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,208 
> CassandraDaemon.java:167 - Exception in thread 
> Thread[AntiEntropySessions:5,5,RMI Runtime]
> java.lang.RuntimeException: java.io.IOException: Failed during snapshot 
> creation.
>         at com.google.common.base.Throwables.propagate(Throwables.java:160) 
> ~[guava-16.0.jar:na]
>         at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) 
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_55]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_55]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_55]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_55]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]
> Caused by: java.io.IOException: Failed during snapshot creation.
>         at 
> org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344)
>  ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at 
> org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:146) 
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) 
> ~[guava-16.0.jar:na]
>         ... 3 common frames omitted
> ERROR [RepairJobTask:2] 2015-03-24 20:03:20,227 RepairJob.java:145 - Error 
> occurred during snapshot phase
> java.lang.RuntimeException: Could not create snapshot at /10.0.2.144
>         at 
> org.apache.cassandra.repair.SnapshotTask$SnapshotCallback.onFailure(SnapshotTask.java:77)
>  ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at 
> org.apache.cassandra.net.MessagingService$5$1.run(MessagingService.java:349) 
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_55]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[na:1.7.0_55]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_55]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_55]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]
> {code} 
> I am thinking this means my work-around of blowing away and rebuilding the 
> CF may not work anymore. I don't know of another way to force LCS compaction, 
> and the node never seems to recover enough to compact on its own.


