[jira] Issue Comment Edited: (CASSANDRA-1278) Make bulk loading into Cassandra less crappy, more pluggable

T Jake Luciani (JIRA) Fri, 28 Jan 2011 13:15:06 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988237#action_12988237
 ]


T Jake Luciani edited comment on CASSANDRA-1278 at 1/28/11 4:13 PM:
--------------------------------------------------------------------

I don't think we'd need to change the client-facing API; I was thinking of just 
moving more of the work to the client.
We could make a Java-based importtool that a user pipes CPT-format serialized 
data into, which in turn would encode it to BMT or serialized rows to be 
streamed to the nodes.

That way most of the work of encoding / decoding happens locally and users can 
write their loaders in any language.
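
A rough sketch of what the client-side half of such a tool might look like. 
Everything here is an assumption made for illustration: the class name 
ImportToolSketch, the tab-separated "rowKey, column, value" input lines, and the 
length-prefixed output layout are hypothetical stand-ins, not the actual CPT or 
BMT formats, and the step that streams the encoded rows to the nodes is left 
out.

```java
import java.io.BufferedReader;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: reads loader output from a pipe and encodes rows locally.
class ImportToolSketch {

    // Groups "rowKey<TAB>column<TAB>value" lines by row key, keeping input order.
    // Assumed input format; all rows are held in memory for simplicity.
    static Map<String, Map<String, String>> readRows(Reader in) throws IOException {
        Map<String, Map<String, String>> rows = new LinkedHashMap<>();
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = line.split("\t", 3);
            if (parts.length != 3) {
                throw new IOException("malformed line: " + line);
            }
            rows.computeIfAbsent(parts[0], k -> new LinkedHashMap<>())
                .put(parts[1], parts[2]);
        }
        return rows;
    }

    // Encodes one row as: column count (int), then length-prefixed
    // UTF-8 name/value pairs. Illustrative layout only.
    static byte[] encodeRow(Map<String, String> columns) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeInt(columns.size());
        for (Map.Entry<String, String> column : columns.entrySet()) {
            writeBytes(out, column.getKey());
            writeBytes(out, column.getValue());
        }
        out.flush();
        return buffer.toByteArray();
    }

    private static void writeBytes(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    public static void main(String[] args) throws IOException {
        Map<String, Map<String, String>> rows =
            readRows(new InputStreamReader(System.in, StandardCharsets.UTF_8));
        for (Map.Entry<String, Map<String, String>> row : rows.entrySet()) {
            byte[] encoded = encodeRow(row.getValue());
            // A real tool would stream the encoded row to the cluster here.
            System.out.println(row.getKey() + " -> " + encoded.length + " bytes");
        }
    }
}
```

A loader written in any language would only need to print lines in the agreed 
format to stdout and pipe them into the tool; the encoding work then happens 
in the local JVM rather than on the nodes.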



      was (Author: tjake):
    I don't think we'd need to change the client facing API, I was thinking of 
just moving more of the work to the client.
We could make a java based importtool that a user can pipe the CPT format 
serialized data into to which encodes it to data to be streamed to the nodes.

That way most of the work of encoding / decoding happens locally and users can 
write their loaders in any language.


  
> Make bulk loading into Cassandra less crappy, more pluggable
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-1278
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1278
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Jeremy Hanna
>            Assignee: Matthew F. Dennis
>             Fix For: 0.7.2
>
>         Attachments: 1278-cassandra-0.7.txt
>
>   Original Estimate: 40h
>          Time Spent: 40.67h
>  Remaining Estimate: 0h
>
> Currently bulk loading into Cassandra is a black art.  People are either 
> directed to just do it responsibly with thrift or a higher level client, or 
> they have to explore the contrib/bmt example - 
> http://wiki.apache.org/cassandra/BinaryMemtable  That contrib module requires 
> delving into the code to find out how it works and then applying it to the 
> given problem.  Using either method, the user also needs to keep in mind that 
> overloading the cluster is possible - which will hopefully be addressed in 
> CASSANDRA-685.
> This improvement would be to create a contrib module or set of documents 
> dealing with bulk loading.  Perhaps it could include code in the Core to make 
> it more pluggable for external clients of different types.
> It is just that this is something that many that are new to Cassandra need to 
> do - bulk load their data into Cassandra.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
