I have raised https://issues.apache.org/jira/browse/CASSANDRA-11105.

Thanks!
Ralf

> On 01.02.2016, at 15:01, Jake Luciani <jak...@gmail.com> wrote:
> 
> Yeah that looks like a bug.  Can you open a JIRA and attach the full .yaml?
> 
> Thanks!
> 
> 
> On Mon, Feb 1, 2016 at 5:09 AM, Ralf Steppacher <ralf.viva...@gmail.com> wrote:
> I am using Cassandra 2.2.4 and I am struggling to get the cassandra-stress 
> tool to work for my test scenario. I have followed the example on 
> http://www.datastax.com/dev/blog/improved-cassandra-2-1-stress-tool-benchmark-any-schema
> to create a yaml file describing my test.
> 
> I am collecting events per user id (text, partition key). Events have a 
> session type (text), event type (text), and creation time (timestamp) 
> (clustering keys, in that order). Plus some more attributes required for 
> rendering the events in a UI. For testing purposes I ended up with the 
> following column spec and insert distribution:
> 
> columnspec:
>   - name: created_at
>     cluster: uniform(10..10000)
>   - name: event_type
>     size: uniform(5..10)
>     population: uniform(1..30)
>     cluster: uniform(1..30)
>   - name: session_type
>     size: fixed(5)
>     population: uniform(1..4)
>     cluster: uniform(1..4)
>   - name: user_id
>     size: fixed(15)
>     population: uniform(1..1000000)
>   - name: message
>     size: uniform(10..100)
>     population: uniform(1..100B)
> 
> insert:
>   partitions: fixed(1)
>   batchtype: UNLOGGED
>   select: fixed(1)/1200000
> 
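(For reference: the row bounds the stress tool reports below fall out of the three cluster distributions in the columnspec. A quick sketch in plain Python, not stress-tool code, of that arithmetic:)

```python
# Rows per partition = product of the cluster distributions of the
# clustering columns in the columnspec: created_at, event_type, session_type.
created_at_max = 10000   # cluster: uniform(10..10000)
event_type_max = 30      # cluster: uniform(1..30)
session_type_max = 4     # cluster: uniform(1..4)

max_rows = created_at_max * event_type_max * session_type_max
print(max_rows)  # 1200000 -- the upper bound stress reports

# Minimum follows from the distribution minima:
min_rows = 10 * 1 * 1
print(min_rows)  # 10 -- the lower bound stress reports
```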
> 
> Running stress tool for just the insert prints 
> 
> Generating batches with [1..1] partitions and [0..1] rows (of [10..1200000] 
> total rows in the partitions)
> 
> and then immediately starts flooding me with 
> "com.datastax.driver.core.exceptions.InvalidQueryException: Batch too large".
> 
> I do not understand why I should be exceeding the
> "batch_size_fail_threshold_in_kb: 50" setting in cassandra.yaml. My
> understanding is that the stress tool should generate one row per batch. The
> size of a single row should not exceed 8 + 10*3 + 5*3 + 15*3 + 100*3 = 398
> bytes, assuming a worst case of all text characters being 3-byte Unicode
> characters.
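(The 398-byte estimate works out like this; a plain-Python sketch of the arithmetic, assuming 3 bytes per text character and 8 bytes for the timestamp:)

```python
# Worst-case serialized size of one row, using the size maxima from the
# columnspec and assuming every text character takes 3 bytes.
created_at = 8            # timestamp
event_type = 10 * 3       # size: uniform(5..10), max 10 chars
session_type = 5 * 3      # size: fixed(5)
user_id = 15 * 3          # size: fixed(15)
message = 100 * 3         # size: uniform(10..100), max 100 chars

row_bytes = created_at + event_type + session_type + user_id + message
print(row_bytes)  # 398 -- orders of magnitude below the 50 KB threshold
```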
> 
> How come I end up with batches that exceed the 50 KB threshold? Am I missing
> the point of the "select" attribute?
> 
> 
> Thanks!
> Ralf
> 
> 
> 
> -- 
> http://twitter.com/tjake
