[jira] [Created] (CASSANDRA-7980) cassandra-stress should support partial clustering column generation

Benedict (JIRA) Thu, 18 Sep 2014 23:53:22 -0700

Benedict created CASSANDRA-7980:
-----------------------------------

             Summary: cassandra-stress should support partial clustering column generation
                 Key: CASSANDRA-7980
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7980
             Project: Cassandra
          Issue Type: Bug
            Reporter: Benedict
            Assignee: Branimir Lambov
            Priority: Minor



cassandra-stress generates its data randomly, in tiers, so that we can scroll 
through the partitions it generates without having to generate them in their 
entirety. The problem is that to support very large partitions (important for 
benchmarking certain cases, and for acceptance testing) we need a large number 
of clustering columns, generally more than we would otherwise have, which 
changes the performance characteristics. We should effectively split each 
clustering column into a number of byte-ranges that become tiers for 
visitation. The only real complexity here is in obeying the specified 
size/count distribution range, which would be difficult for exponential 
distributions. However, we could require the user to specify the ranges, and 
the distributions for each range, upfront. We could even treat them exactly 
like other column specifications, but as sub-specs within a given column in 
the yaml. Or we could simply accept that we follow the distribution 
imperfectly in these situations.
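
To make the idea concrete, here is a rough sketch of splitting one clustering 
column into byte-range tiers. It is not cassandra-stress code: the function 
names, the (width, count) tier format and the seeding scheme are all 
assumptions made for illustration, and each tier uses a fixed count rather 
than a distribution. Each tier's bytes are derived deterministically from the 
prefix above it, so values can be visited lazily and in sorted order without 
materialising the whole partition.

```python
import random
from itertools import islice


def tier_values(parent_seed, tier_index, width, count):
    """Derive `count` distinct, sorted byte strings of `width` bytes.

    Hypothetical helper: the seed string format is an assumption, chosen only
    so that the same parent prefix always yields the same children.
    """
    if count > 256 ** width:
        raise ValueError("tier cannot hold that many distinct values")
    rng = random.Random(f"{parent_seed}/{tier_index}")
    seen = set()
    while len(seen) < count:
        seen.add(rng.getrandbits(8 * width).to_bytes(width, "big"))
    return sorted(seen)


def clustering_values(partition_seed, tiers, prefix=b"", depth=0):
    """Lazily yield full clustering values for one column.

    `tiers` is a list of (width_in_bytes, count) pairs, one per byte-range.
    """
    if depth == len(tiers):
        yield prefix
        return
    width, count = tiers[depth]
    seed = f"{partition_seed}:{prefix.hex()}"
    for chunk in tier_values(seed, depth, width, count):
        yield from clustering_values(partition_seed, tiers, prefix + chunk, depth + 1)


# A 4-byte column split into two 2-byte tiers: 3 prefixes, 4 suffixes each.
first_two = list(islice(clustering_values("p0", [(2, 3), (2, 4)]), 2))
```

Here a single column yields 3 * 4 = 12 rows per partition, where a second 
clustering column would otherwise have been needed to get the second tier. 
Following a size/count distribution per tier, as discussed above, would mean 
drawing each `count` from that tier's distribution instead of fixing it.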



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
