[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085347#comment-14085347
 ] 

Russell Alexander Spitzer commented on CASSANDRA-7631:
------------------------------------------------------

https://github.com/RussellSpitzer/cassandra/compare/CASSANDRA-7631

I'm going to have to leave this branch. I've fixed up as much as I could, but I 
realized that we won't be able to get this working across C* versions easily, and 
that was going to be our main use case. I can try to put more time into this 
offline, but I'm going to stop main work on this for now.

[~benedict] Feel free to take this in the meantime if you like. I've basically 
written all the code except for skipping tokens not valid on the node that 
stress is run from.

> Allow Stress to write directly to SSTables
> ------------------------------------------
>
>                 Key: CASSANDRA-7631
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7631
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Russell Alexander Spitzer
>            Assignee: Russell Alexander Spitzer
>
> One common difficulty with benchmarking machines is the amount of time it 
> takes to initially load data. For machines with a large amount of RAM this 
> becomes especially onerous, because a very large amount of data needs to be 
> placed on the machine before the page cache can be circumvented. 
> To remedy this, I suggest we add a top-level flag to cassandra-stress which 
> would cause the tool to write directly to SSTables rather than actually 
> performing CQL inserts. Internally this would use CQLSSTableWriter to write 
> directly to SSTables while skipping any keys which are not owned by the node 
> stress is running on. The same stress command run on each node in the cluster 
> would then write unique SSTables containing only data which that node is 
> responsible for. After this, no further network IO would be required to 
> distribute data, as it would all already be correctly in place.
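
The key-skipping step described above can be illustrated with a minimal Java sketch. It is not the code from the linked branch: the class OwnedKeyFilter, its methods, and the toy hash in tokenFor are all hypothetical, and the hash is a stand-in for a real partitioner, not Murmur3. The sketch only assumes the usual ring convention that a node owns the range from the previous node's token (exclusive) up to its own token (inclusive), with wrap-around.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: decides which generated keys the local node should write.
public class OwnedKeyFilter {
    // Ring of token -> node name. A node owns (previous token, its token].
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public void addNode(String node, long token) {
        ring.put(token, node);
    }

    // Toy stand-in for the partitioner hash (assumption: NOT Murmur3).
    static long tokenFor(String key) {
        long h = 1125899906842597L;
        for (int i = 0; i < key.length(); i++) {
            h = 31 * h + key.charAt(i);
        }
        return h;
    }

    // First node whose token is >= the given token, wrapping to the lowest token.
    public String ownerOf(long token) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("ring has no nodes");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(token);
        if (entry == null) {
            entry = ring.firstEntry();
        }
        return entry.getValue();
    }

    // Keep only the keys the given node owns; all other keys are skipped.
    public List<String> keysOwnedBy(String node, List<String> keys) {
        List<String> owned = new ArrayList<>();
        for (String key : keys) {
            if (node.equals(ownerOf(tokenFor(key)))) {
                owned.add(key);
            }
        }
        return owned;
    }
}
```

Running the same filter on every node with the same key sequence partitions the keys: each key is kept by exactly one node. In the proposed tool, the kept keys would then be handed to the SSTable writer instead of being sent as CQL inserts.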



--
This message was sent by Atlassian JIRA
(v6.2#6252)
