[ https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-1434:
--------------------------------------

    Attachment: 1434-v3.txt

I squashed the patches and added code to keep CFRW (ColumnFamilyRecordWriter) from slamming Cassandra with spikes of load: it keeps a pooled connection and sends mutations one at a time over it. This adds only a trivial amount of overhead compared to using a large batch, since we're not reconnecting for each message. (The main advantage of a larger batch is that it gives you an idempotent group of work to replay if necessary, which doesn't matter here. Under the hood it takes the same code path.)

Also attempted to distinguish between recoverable and non-recoverable errors in the exception handling.

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 0003-Switch-RingCache-back-to-multimap.patch, 0004-Replace-Executor-with-map-of-threads.patch, 1434-v3.txt
>
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.
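
For readers who have not opened the patch, the approach described in the comment (one pooled connection, mutations sent one at a time, recoverable errors handled differently from non-recoverable ones) could look roughly like the sketch below. This is not the code in 1434-v3.txt: every name here (PooledWriterSketch, Connection, ConnectionFactory, maxAttempts) is hypothetical, mutations are modeled as plain strings, and treating IOException as "recoverable" and any RuntimeException as "non-recoverable" is an assumption made only for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from the Cassandra code base. */
class PooledWriterSketch {

    /** Stand-in for a client connection to one node. */
    interface Connection {
        void send(String mutation) throws IOException;
    }

    /** Stand-in for whatever opens a connection to a node. */
    interface ConnectionFactory {
        Connection open() throws IOException;
    }

    private final ConnectionFactory factory;
    private final int maxAttempts;
    private Connection pooled; // reused across writes, reopened only after a failure

    PooledWriterSketch(ConnectionFactory factory, int maxAttempts) {
        this.factory = factory;
        this.maxAttempts = maxAttempts;
    }

    /** Sends a single mutation over the pooled connection. */
    void write(String mutation) throws IOException {
        IOException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                if (pooled == null) {
                    pooled = factory.open();
                }
                pooled.send(mutation);
                return;
            } catch (IOException e) {
                // Assumed recoverable: discard the connection and retry on a fresh one.
                pooled = null;
                last = e;
            }
            // RuntimeExceptions are not caught: they are assumed non-recoverable
            // and propagate to the caller immediately.
        }
        throw new IOException("gave up after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws IOException {
        List<String> received = new ArrayList<>();
        // In-memory "connection" that just records what it is sent.
        PooledWriterSketch writer = new PooledWriterSketch(() -> m -> received.add(m), 3);
        for (String m : new String[] {"m1", "m2", "m3"}) {
            writer.write(m);
        }
        System.out.println("sent " + received.size() + " mutations");
    }
}
```

Because the connection is opened once and reused, the per-mutation cost is one request rather than one connect plus one request, which is the overhead argument made in the comment.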