[ 
https://issues.apache.org/jira/browse/CASSANDRA-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133671#comment-13133671
 ] 

Brandon Williams commented on CASSANDRA-3045:
---------------------------------------------

This isn't as easy as it seems.  Bulk loading this way requires becoming a fat 
client.  Since Hadoop is colocated with Cassandra, the fat client would share 
an IP with the local node, so we would have to divorce the "ip == node" 
marriage.  That means rewriting most of how gossip works, adding the port for 
the storage protocol (and thus allowing port divergence, an idea we have not 
been fond of in the past), and modifying MessagingService, 
Incoming/OutgoingTcpConnection, and probably other classes that are 
notoriously hairy.

That is a lot of work and very difficult to make backwards-compatible, and we 
really don't know what gains, if any, we'll see from this method afterwards.  
I'm personally very strongly -1 on making these changes to gossip, since I 
feel like it is finally fairly stable.

Even in a non-colocated setup, the task JVMs would still need to respect 
RING_DELAY, which in many scenarios might be enough to erode any gains this 
could provide.
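
To put rough numbers on the RING_DELAY concern, here is a back-of-the-envelope 
sketch.  It assumes a 30-second RING_DELAY (an assumed value, check your 
version) and a hypothetical per-task baseline write time; the class and method 
names are made up for illustration and are not part of Cassandra.

```java
// Illustrative arithmetic only; nothing here is Cassandra API.
final class RingDelayOverhead {
    // Assumption: a 30-second ring delay, paid once per task JVM at startup.
    static final long ASSUMED_RING_DELAY_MS = 30_000L;

    /** Fraction of a task's wall-clock time spent waiting out the ring delay. */
    public static double overheadFraction(long ringDelayMs, long workMs) {
        if (ringDelayMs < 0 || workMs < 0) {
            throw new IllegalArgumentException("durations must be non-negative");
        }
        long total = ringDelayMs + workMs;
        return total == 0 ? 0.0 : (double) ringDelayMs / total;
    }

    /**
     * Speedup the bulk path needs on the write itself just to break even:
     * baseline / s + delay <= baseline  =>  s >= baseline / (baseline - delay).
     * Returns infinity when the delay alone already exceeds the baseline.
     */
    public static double requiredSpeedup(long ringDelayMs, long baselineMs) {
        if (ringDelayMs < 0 || baselineMs < 0) {
            throw new IllegalArgumentException("durations must be non-negative");
        }
        if (baselineMs <= ringDelayMs) {
            return Double.POSITIVE_INFINITY;
        }
        return (double) baselineMs / (baselineMs - ringDelayMs);
    }

    public static void main(String[] args) {
        // Hypothetical baseline write times per task, in milliseconds.
        long[] baselines = {20_000L, 60_000L, 300_000L, 1_800_000L};
        for (long baseline : baselines) {
            System.out.printf("baseline=%ds requiredSpeedup=%.2f overhead=%.2f%n",
                    baseline / 1000,
                    requiredSpeedup(ASSUMED_RING_DELAY_MS, baseline),
                    overheadFraction(ASSUMED_RING_DELAY_MS, baseline));
        }
    }
}
```

Under these assumptions, a task whose writes take less than the ring delay can 
never come out ahead, and short tasks need a large speedup just to break even; 
only long-running tasks amortize the wait.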

One option might be to speak the storage protocol directly to the local C* 
instance, but add some kind of logic that says 'this is neither a node nor a 
fat client, just accept writes/reads from it and nothing else.'
                
> Update ColumnFamilyOutputFormat to use new bulkload API
> -------------------------------------------------------
>
>                 Key: CASSANDRA-3045
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3045
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Hadoop
>            Reporter: Jonathan Ellis
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 1.1
>
>
> The bulk loading interface added in CASSANDRA-1278 is a great fit for Hadoop 
> jobs.


        
