[ 
https://issues.apache.org/jira/browse/CASSANDRA-7776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161624#comment-14161624
 ] 

Piotr Kołaczkowski commented on CASSANDRA-7776:
-----------------------------------------------

+1

> Allow multiple MR jobs to concurrently write to the same column family from 
> the same node using CqlBulkOutputFormat
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7776
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7776
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Hadoop
>            Reporter: Paul Pak
>            Assignee: Paul Pak
>            Priority: Minor
>              Labels: cql3, hadoop
>         Attachments: trunk-7776-v1.txt
>
>
> After sstable files are written, all files in the specified output directory 
> are loaded (transferred) to the remote Cassandra cluster. If multiple jobs on 
> a node write to the same table (i.e. the same directory), each job's load 
> process transfers every sstable file found there, so the same files are sent 
> multiple times. 
> Furthermore, if directory cleanup of successful outputs is set to occur 
> ([CASSANDRA-7777|https://issues.apache.org/jira/browse/CASSANDRA-7777]), then 
> there could be errors caused by write/load contention.
> A simple remedy is to use a unique output directory for each MR job.
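
As an illustration of the remedy described in the issue, here is a minimal sketch of giving each job its own output directory. It is not taken from the attached patch: the class UniqueOutputDir and the method forJob are hypothetical names, and the layout <base>/<random UUID>/<keyspace>/<table> assumes the loader reads the keyspace and table from the last two path components.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

public class UniqueOutputDir {

    // Hypothetical helper, not part of Cassandra's API.
    // Builds <baseDir>/<random UUID>/<keyspace>/<table> and creates it on disk,
    // so two jobs writing to the same table never share a directory.
    static Path forJob(Path baseDir, String keyspace, String table) throws IOException {
        Path dir = baseDir
                .resolve(UUID.randomUUID().toString())
                .resolve(keyspace)
                .resolve(table);
        return Files.createDirectories(dir);
    }

    public static void main(String[] args) throws IOException {
        // A temp directory stands in for the configured bulk output location.
        Path base = Files.createTempDirectory("bulk-output");

        // Two jobs targeting the same table get separate directories.
        Path first = forJob(base, "ks", "events");
        Path second = forJob(base, "ks", "events");

        System.out.println(first);
        System.out.println(second);
    }
}
```

Because each load process then sees only the files its own job wrote, no sstable is transferred twice, and cleaning up one job's directory cannot interfere with another job that is still writing or loading.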



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
