Thanks. I raised concurrent_writes to 128 and
set commitlog_sync_group_window to 20ms. This causes a single execute of a
BatchStatement containing 100 inserts to succeed. However, the throughput
I'm seeing is atrocious.

With these settings, I'm executing 10 BatchStatement concurrently at a time
using the semaphore + loop approach I showed in my first message. So as
requests complete, more are sent out such that there are 10 in-flight at a
time. Each BatchStatement has 100 individual inserts. I'm seeing only 730
inserts / second. Again, with periodic mode I see 38k / second and with
batch I see 14k / second. My expectation was that group commit mode
throughput would be somewhere between those two.

If I set commitlog_sync_group_window to 100ms, the throughput drops to 14 /
second.

If I set commitlog_sync_group_window to 10ms, the throughput increases to
1587 / second.

If I set commitlog_sync_group_window to 5ms, the throughput increases to
3200 / second.

If I set commitlog_sync_group_window to 1ms, the throughput increases to
13k / second, which is slightly less than batch commit mode.

Is group commit mode supposed to have better performance than batch mode?


On Tue, Apr 23, 2024 at 8:46 AM Bowen Song via user <
user@cassandra.apache.org> wrote:

> The default commitlog_sync_group_window is very long for SSDs. Try reduce
> it if you are using SSD-backed storage for the commit log. 10-15 ms is a
> good starting point. You may also want to increase the value of
> concurrent_writes, consider at least double or quadruple it from the
> default. You'll need even higher write concurrency for longer
> commitlog_sync_group_window.
>
> On 23/04/2024 19:26, Nathan Marz wrote:
>
> "batch" mode works fine. I'm having trouble with "group" mode. The only
> config for that is "commitlog_sync_group_window", and I have that set to
> the default 1000ms.
>
> On Tue, Apr 23, 2024 at 8:15 AM Bowen Song via user <
> user@cassandra.apache.org> wrote:
>
>> Why would you want to set commitlog_sync_batch_window to 1 second long
>> when commitlog_sync is set to batch mode? The documentation
>> <https://cassandra.apache.org/doc/stable/cassandra/architecture/storage_engine.html>
>> on this says:
>>
>> *This window should be kept short because the writer threads will be
>> unable to do extra work while waiting. You may need to increase
>> concurrent_writes for the same reason*
>>
>> If you want to use batch mode, at least ensure
>> commitlog_sync_batch_window is reasonably short. The default is 2
>> millisecond.
>>
>>
>> On 23/04/2024 18:32, Nathan Marz wrote:
>>
>> I'm doing some benchmarking of Cassandra on a single m6gd.large instance.
>> It works fine with periodic or batch commitlog_sync options, but I'm having
>> tons of issues when I change it to "group". I have
>> "commitlog_sync_group_window" set to 1000ms.
>>
>> My client is doing writes like this (pseudocode):
>>
>> Semaphore sem = new Semaphore(numTickets);
>> while(true) {
>>
>> sem.acquire();
>> session.executeAsync(insert.bind(genUUIDStr(), genUUIDStr(), genUUIDStr())
>>             .whenComplete((t, u) -> sem.release())
>>
>> }
>>
>> If I set numTickets higher than 20, I get tons of timeout errors.
>>
>> I've also tried doing single commands with BatchStatement with many
>> inserts at a time, and that fails with timeout when the batch size gets
>> more than 20.
>>
>> Increasing the write request timeout in cassandra.yaml makes it time out
>> at slightly higher numbers of concurrent requests.
>>
>> With periodic I'm able to get about 38k writes / second, and with batch
>> I'm able to get about 14k / second.
>>
>> Any tips on what I should be doing to get group commitlog_sync to work
>> properly? I didn't expect to have to do anything other than change the
>> config.
>>
>>

Reply via email to