[ https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081184#comment-14081184 ]
Russell Alexander Spitzer commented on CASSANDRA-7631:
------------------------------------------------------

Looks like that is as fast as I can go; the CPU is pegged at max on my MBP. Let me clean up the code and I'll get a preview up.

I'm relying on CQLSSTableWriter to buffer and do the writes, which provides (3) for us but limits the program to 1 CQLSSTableWriter per process since it is not thread-safe. I think (4) and (5) could be very helpful, though, in giving that code an easier job.

(2) There is a lot of bad-argument handling I still need to add.

I'm trying to track down one bug at the moment, then I'll post a preliminary branch.

> Allow Stress to write directly to SSTables
> ------------------------------------------
>
>                 Key: CASSANDRA-7631
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7631
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Russell Alexander Spitzer
>            Assignee: Russell Alexander Spitzer
>
> One common difficulty with benchmarking machines is the amount of time it
> takes to initially load data. For machines with a large amount of RAM this
> becomes especially onerous, because a very large amount of data needs to be
> placed on the machine before the page cache can be circumvented.
> To remedy this, I suggest we add a top-level flag to Cassandra-Stress which
> would cause the tool to write directly to sstables rather than actually
> performing CQL inserts. Internally this would use CQLSSTableWriter to write
> directly to sstables while skipping any keys which are not owned by the node
> stress is running on. The same stress command run on each node in the cluster
> would then write unique sstables containing only the data which that node is
> responsible for. After this, no further network I/O would be required to
> distribute data, as it would all already be correctly in place.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
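
The per-node filtering proposed in the issue description (each node writes only the keys it owns) can be illustrated with a minimal sketch. This is not the stress tool's code and does not use CQLSSTableWriter. The ring layout, node names, and the functions `token_for`, `owner_of`, and `keys_for_node` are all hypothetical, the hash is an MD5-based stand-in rather than Cassandra's partitioner, and replication is ignored (only the primary owner is considered).

```python
import bisect
import hashlib

# Hypothetical three-node ring: sorted (token, node) pairs.
# Each node owns the range (previous token, its own token], wrapping around.
RING = [
    (-3_000_000_000_000_000_000, "node1"),
    (0, "node2"),
    (3_000_000_000_000_000_000, "node3"),
]


def token_for(key):
    """Map a key to a signed 64-bit token (stand-in hash, an assumption)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def owner_of(token, ring):
    """Return the node owning `token`: the first ring entry with token >= it."""
    tokens = [t for t, _ in ring]
    idx = bisect.bisect_left(tokens, token)
    # Tokens past the last entry wrap around to the first node.
    return ring[idx % len(ring)][1]


def keys_for_node(keys, ring, local_node):
    """Keep only the keys whose primary owner is `local_node`."""
    return [k for k in keys if owner_of(token_for(k), ring) == local_node]


if __name__ == "__main__":
    keys = [f"key{i}" for i in range(1000)]
    for _, node in RING:
        # Each node would run the same command and write only its own share.
        print(node, len(keys_for_node(keys, RING, node)))
```

Running the same key generator on every node and filtering this way yields disjoint key sets whose union is the full data set, which is the property the proposal relies on to avoid any network transfer afterwards.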