[ 
https://issues.apache.org/jira/browse/CASSANDRA-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-1434:
--------------------------------------

    Attachment: 1434-v3.txt

I squashed the earlier patches and added code to keep CFRW 
(ColumnFamilyRecordWriter) from slamming Cassandra with spikes of load: it 
keeps a pooled connection and sends mutations one at a time over it.  This 
adds only trivial overhead compared to using a large batch, since we're not 
reconnecting for each message.  (The main advantage of a larger batch is that 
it gives you an idempotent group of work to replay if necessary, which doesn't 
matter here.  Under the hood, both approaches take the same code path.)
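
To illustrate the idea (this is not the code in 1434-v3.txt), here is a minimal 
sketch of a writer that opens one connection lazily and reuses it for every 
mutation.  Connection, CountingFactory and Writer are hypothetical stand-ins 
invented for this example, not Cassandra or Thrift classes, and mutations are 
modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

public class PooledWriterSketch {
    /** Hypothetical stand-in for a client connection; not a Cassandra API. */
    interface Connection {
        void send(String mutation);
    }

    /** Hypothetical factory that counts opens, to make reconnect cost visible. */
    static class CountingFactory {
        int opened = 0;
        final List<String> sent = new ArrayList<>();

        Connection open() {
            opened++;
            return sent::add; // "sending" just records the mutation
        }
    }

    /** Keeps one connection and sends each mutation as soon as it arrives. */
    static class Writer {
        private final CountingFactory factory;
        private Connection conn; // opened on first write, then reused

        Writer(CountingFactory factory) {
            this.factory = factory;
        }

        void write(String mutation) {
            if (conn == null) {
                conn = factory.open();
            }
            conn.send(mutation); // one mutation per call, no batching
        }
    }

    public static void main(String[] args) {
        CountingFactory factory = new CountingFactory();
        Writer writer = new Writer(factory);
        for (int i = 0; i < 5; i++) {
            writer.write("mutation-" + i);
        }
        // prints "1 connection(s), 5 mutation(s)"
        System.out.println(factory.opened + " connection(s), "
                + factory.sent.size() + " mutation(s)");
    }
}
```

Because the connection is opened once, the per-mutation cost is a single send 
rather than a connect plus a send, and the load arrives as a steady stream 
instead of one large burst.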

Also attempted to distinguish between recoverable and non-recoverable errors 
in the exception handling.
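
As a rough sketch of that distinction (again, not the patch itself), the 
helper below retries only when a task fails with an error marked as 
recoverable and lets everything else propagate immediately.  
RecoverableException and callWithRetry are hypothetical names made up for 
this example; which real errors count as recoverable is an assumption the 
caller would have to make.

```java
import java.util.function.Supplier;

public class RetrySketch {
    /** Hypothetical marker for errors worth retrying (e.g. a timeout). */
    static class RecoverableException extends RuntimeException {
        RecoverableException(String message) {
            super(message);
        }
    }

    /**
     * Runs the task, retrying only on RecoverableException.
     * Any other exception propagates on the first failure.
     */
    static <T> T callWithRetry(Supplier<T> task, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RecoverableException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RecoverableException e) {
                last = e; // treated as transient: try again
            }
        }
        throw last; // retries exhausted
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new RecoverableException("timed out");
            }
            return "ok";
        }, 5);
        // prints "ok after 3 attempts"
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```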

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-1434
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1434
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>
>         Attachments: 
> 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch, 
> 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 
> 0003-Switch-RingCache-back-to-multimap.patch, 
> 0004-Replace-Executor-with-map-of-threads.patch, 1434-v3.txt
>
>
> By default, ColumnFamilyOutputFormat batches 
> {{mapreduce.output.columnfamilyoutputformat.batch.threshold}} or 
> {{Long.MAX_VALUE}} mutations, and then performs a blocking write.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
