[ https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529253#comment-14529253 ]

Tyler Hobbs commented on CASSANDRA-9302:
----------------------------------------

We don't necessarily get token-aware routing (TAR) with the python driver because its [implementation of murmur3|https://github.com/datastax/python-driver/blob/b8c3f2ff4b92da4d7cc38b398a1b5a1c131e2c66/cassandra/murmur3.c] is in a C extension, and we don't compile that with the bundled driver.  Compiling it is not really a feasible option, so we would need to port the hash to pure python.
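
As a minimal sketch of the situation (the module path {{cassandra.murmur3}} is assumed from the linked source file; with the bundled, uncompiled driver the import is expected to fail), detecting whether the C extension is available and where a pure-python port would slot in:

{code:python}
# Sketch only: check whether the driver's murmur3 C extension was compiled.
# Module path is an assumption based on the linked cassandra/murmur3.c.
try:
    from cassandra.murmur3 import murmur3  # C extension: hashes a serialized partition key
except ImportError:
    murmur3 = None  # a pure-python port of the hash would go here

def token_aware_routing_possible():
    """TAR requires hashing the partition key client-side."""
    return murmur3 is not None
{code}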

For prepared statements, the main chunk of work is implementing 
{{from_string()}} for every type so that we can properly serialize the data.  
Batching by partition key can be done with or without prepared statements, so 
we may want to experiment with that first.
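
As a rough sketch of that experiment (names and parameters here are illustrative, not existing cqlsh code), grouping already-parsed CSV rows by their partition key value into single-partition UNLOGGED batches, using non-prepared {{SimpleStatement}}s so it does not depend on {{from_string()}}:

{code:python}
from collections import defaultdict
from cassandra.query import BatchStatement, BatchType, SimpleStatement

def batches_by_partition_key(rows, pk_index, insert_cql, max_batch_size=10):
    """Yield UNLOGGED batches where every batch targets a single partition.

    rows: parsed CSV rows as tuples of python values
    pk_index: position of the partition key column in each row (hypothetical)
    insert_cql: INSERT with %s placeholders, e.g.
                "INSERT INTO ks.tbl (pk, c1, c2) VALUES (%s, %s, %s)"
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[pk_index]].append(row)
    for group in groups.values():
        for i in range(0, len(group), max_batch_size):
            batch = BatchStatement(batch_type=BatchType.UNLOGGED)
            for row in group[i:i + max_batch_size]:
                # Non-prepared statement: the driver encodes the values
                # client-side, so no per-type string conversion is needed.
                batch.add(SimpleStatement(insert_cql), row)
            yield batch
{code}

Each yielded batch would be run with {{session.execute(batch)}}; whether single-partition unlogged batches actually beat individual or prepared inserts is exactly what the experiment would need to show.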

> Optimize cqlsh COPY FROM, part 3
> --------------------------------
>
>                 Key: CASSANDRA-9302
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9302
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Jonathan Ellis
>             Fix For: 2.1.x
>
>
> We've had some discussion about moving to Spark CSV import for bulk load in 
> 3.x, but people need a good bulk load tool now.  One option is to add a 
> separate Java bulk load tool (CASSANDRA-9048), but if we can match that 
> performance from cqlsh, I would prefer to leave COPY FROM as the preferred 
> option to which we point people, rather than adding more tools that need to 
> be supported indefinitely.
> Previous work on COPY FROM optimization was done in CASSANDRA-7405 and 
> CASSANDRA-8225.



