[jira] [Commented] (CASSANDRA-8986) Major cassandra-stress refactor

Jonathan Shook (JIRA) Wed, 18 Mar 2015 19:37:07 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14368394#comment-14368394 ]


Jonathan Shook commented on CASSANDRA-8986:
-------------------------------------------

It is good to see the discussion move in this direction.

[~benedict], All,
Nearly all of what you describe in the list of behaviors is on my list for 
another project as well. Although it's still a fairly new project, there have 
been some early successes with demos and training tools. Here is a link that 
explains the project and motives: 
https://github.com/jshook/metagener/blob/master/metagener-core/docs/README.md
I'd be happy to talk in more detail about it. It seems like we have lots of the 
same ideas about what is needed at the foundational level.

It's possible to achieve a drastic simplification of the user-facing part, but 
only if we are willing to revamp the notion of how we define test loads.

RE: distributing test loads: I have been thinking about how to distribute 
stress across multiple clients as well. The gist of it is that we can't get 
there without having a way to automatically partition the client workload 
across some spectrum. As follow-on work, I think it can be done. First we need 
a conceptually obvious and clean way to define whole test loads such that they 
can be partitioned compatibly with the behaviors described above.
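
To make the partitioning idea concrete, here is a minimal sketch in Python. It assumes a test load can be described as a range of operation indices and that each row is derived purely from a seed and its index. The names `partition_range` and `row_for` are hypothetical and are not part of cassandra-stress or metagener; this is only one possible way to split a workload, not a proposed design.

```python
import hashlib


def partition_range(total_ops, num_clients):
    """Split indices [0, total_ops) into contiguous, disjoint slices, one per client."""
    base, extra = divmod(total_ops, num_clients)
    slices = []
    start = 0
    for client in range(num_clients):
        # The first `extra` clients take one additional operation each.
        size = base + (1 if client < extra else 0)
        slices.append(range(start, start + size))
        start += size
    return slices


def row_for(index, seed=0):
    """Derive a row from (seed, index) only, so any client can reproduce it."""
    digest = hashlib.sha256(f"{seed}:{index}".encode()).hexdigest()
    return {"key": index, "value": digest[:16]}


if __name__ == "__main__":
    # Each client generates only its own slice of the whole test load.
    for client_id, ops in enumerate(partition_range(10, 3)):
        print(client_id, [row_for(i, seed=42)["key"] for i in ops])
```

Because each row depends only on the seed and the index, the slices can be handed to separate clients without any shared state, and a rerun with the same seed yields the same data.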

If I can help, given the other work I've been doing, let's keep the 
conversation going.


> Major cassandra-stress refactor
> -------------------------------
>
>                 Key: CASSANDRA-8986
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8986
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Benedict
>            Assignee: Benedict
>
> We need a tool for both stressing _and_ validating more complex workloads 
> than stress currently supports. Stress needs a raft of changes, and I think 
> it would be easier to deliver many of these as a single major endeavour which 
> I think is justifiable given its audience. The rough behaviours I want stress 
> to support are:
> * Ability to know exactly how many rows it will produce, for any clustering 
> prefix, without generating those prefixes
> * Ability to generate an amount of data proportional to the amount it will 
> produce to the server (or consume from the server), rather than proportional 
> to the variation in clustering columns
> * Ability to reliably produce near identical behaviour each run
> * Ability to understand complex overlays of operation types (LWT, Delete, 
> Expiry, although perhaps not all implemented immediately, the framework for 
> supporting them easily)
> * Ability to (with minimal internal state) understand the complete cluster 
> state through overlays of multiple procedural generations
> * Ability to understand the in-flight state of in-progress operations (i.e. 
> if we're applying a delete, understand that the delete may have been applied, 
> and may not have been, for potentially multiple conflicting in flight 
> operations)
> I think the necessary changes to support this would give us the _functional_ 
> base to support all the functionality I can currently envisage stress 
> needing. Before embarking on this (which I may attempt very soon), it would 
> be helpful to get input from others as to features missing from stress that I 
> haven't covered here that we will certainly want in the future, so that they 
> can be factored in to the overall design and hopefully avoid another refactor 
> one year from now, as its complexity is scaling each time, and each time it 
> is a higher sunk cost. [~jbellis] [~iamaleksey] [~slebresne] [~tjake] 
> [~enigmacurry] [~aweisberg] [~blambov] [~jshook] ... and @everyone else :) 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
