[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077020#comment-14077020
 ] 

Russell Alexander Spitzer edited comment on CASSANDRA-7631 at 7/28/14 10:32 PM:
--------------------------------------------------------------------------------

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/CQLSSTableWriter.java
 wraps SSTableSimpleUnsortedWriter, so I think we are OK there. The main reason
I would like this as part of stress is that we already have all the data
generation code written for arbitrary schemas. Thanks, [~tjake]! This way we
could prepare much faster for a test that writes a large amount of data and
then runs a mixed workload.


was (Author: rspitzer):
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/CQLSSTableWriter.java
 wraps SSTableSimpleUnsorted Writer so I think we are ok there. The main reason 
I would like this as part of stress is that we already have all the data 
generation code backed in for arbitrary schemas, Thanks [~tjake]! This way we 
could prepare for a test that uses a large amount of data and a mixed workload 
much faster. 

> Allow Stress to write directly to SSTables
> ------------------------------------------
>
>                 Key: CASSANDRA-7631
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7631
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Russell Alexander Spitzer
>            Assignee: Russell Alexander Spitzer
>
> One common difficulty with benchmarking machines is the time it takes to
> load the initial data. For machines with a large amount of RAM this becomes
> especially onerous, because a very large amount of data must be placed on
> the machine before the page cache can be circumvented.
> To remedy this, I suggest we add a top-level flag to cassandra-stress that
> causes the tool to write directly to SSTables rather than performing CQL
> inserts. Internally this would use CQLSSTableWriter to write the SSTables,
> skipping any keys that are not owned by the node stress is running on. The
> same stress command run on each node in the cluster would then write unique
> SSTables containing only the data that node is responsible for. After that,
> no further network I/O would be needed to distribute the data, since it
> would all already be in place.
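
The key-skipping step in the description can be illustrated with a toy token
ring. This is only a sketch under stated assumptions, not how stress or
CQLSSTableWriter does it: the class OwnedKeyFilter, its methods, and the node
names are hypothetical, CRC32 stands in for the real partitioner hash, and
replication is ignored (each key has exactly one owner). A node owns the range
from the previous node's token (exclusive) up to its own token (inclusive),
wrapping around at the end of the ring.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Toy token ring (hypothetical): decides which generated keys a node keeps. */
class OwnedKeyFilter {
    // Ring position -> node name. A real tool would read this from cluster metadata.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    /** Registers a node at the given ring position. */
    public void addNode(String node, long token) {
        ring.put(token, node);
    }

    /** Stand-in hash (CRC32, range 0..2^32-1); NOT Cassandra's partitioner. */
    static long token(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Owner is the first node whose token is >= the key's token, wrapping around. */
    public String ownerOf(String key) {
        Map.Entry<Long, String> e = ring.ceilingEntry(token(key));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    /** Keeps only the keys owned by the given node; all others are skipped. */
    public List<String> keysOwnedBy(String node, List<String> keys) {
        List<String> owned = new ArrayList<>();
        for (String key : keys) {
            if (node.equals(ownerOf(key))) {
                owned.add(key);
            }
        }
        return owned;
    }
}
```

If every node runs the same deterministic key generator and filters with its
own name, the per-node outputs are disjoint and together cover every key, which
is the property the proposal relies on. With a replication factor above one,
each node would also have to keep the keys it holds as a replica.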


