Re: Trouble with using group commitlog_sync

2024-04-24 Thread Nathan Marz
I tried running two client processes in parallel and the numbers were
unchanged. The max throughput is still a single client doing 10 in-flight
BatchStatement containing 100 inserts.

On Tue, Apr 23, 2024 at 10:24 PM Bowen Song via user <
user@cassandra.apache.org> wrote:

> You might have run into the bottleneck of the driver's IO thread. Try
> increasing the driver's connections-per-server limit to 2 or 3 if you've only
> got 1 server in the cluster. Or alternatively, run two client processes in
> parallel.
>
>
> On 24/04/2024 07:19, Nathan Marz wrote:
>
> Tried it again with one more client thread, and that had no effect on
> performance. This is unsurprising as there are only 2 CPUs on this node and
> they were already at 100%. These were good ideas, but I'm still unable to
> even match the performance of batch commit mode with group commit mode.
>
> On Tue, Apr 23, 2024 at 12:46 PM Bowen Song via user <
> user@cassandra.apache.org> wrote:
>
>> To achieve 10k loop iterations per second, each iteration must take 0.1
>> milliseconds or less. Considering that each iteration needs to lock and
>> unlock the semaphore (two syscalls) and make network requests (more
>> syscalls), that's a lot of context switches. It may be a bit too much to ask
>> for a single thread. I would suggest trying multi-threading or
>> multi-processing, and see if the combined insert rate is higher.
>>
>> I should also note that executeAsync() also has implicit limits on the
>> number of in-flight requests, which default to 1024 requests per connection
>> and 1 connection per server. See
>> https://docs.datastax.com/en/developer/java-driver/4.17/manual/core/pooling/
>>
>>
>> On 23/04/2024 23:18, Nathan Marz wrote:
>>
>> It's using the async API, so why would it need multiple threads? Using
>> the exact same approach I'm able to get 38k / second with periodic
>> commitlog_sync. For what it's worth, I do see 100% CPU utilization in every
>> single one of these tests.
>>
>> On Tue, Apr 23, 2024 at 11:01 AM Bowen Song via user <
>> user@cassandra.apache.org> wrote:
>>
>>> Have you checked the thread CPU utilisation of the client side? You
>>> likely will need more than one thread to do insertion in a loop to achieve
>>> tens of thousands of inserts per second.
>>>
>>>
>>> On 23/04/2024 21:55, Nathan Marz wrote:
>>>
>>> Thanks for the explanation.
>>>
>>> I tried again with commitlog_sync_group_window at 2ms, concurrent_writes
>>> at 512, and doing 1000 individual inserts at a time with the same loop +
>>> semaphore approach. This only nets 9k / second.
>>>
>>> I got much higher throughput for the other modes with BatchStatement of
>>> 100 inserts rather than 100x more individual inserts.
>>>
>>> On Tue, Apr 23, 2024 at 10:45 AM Bowen Song via user <
>>> user@cassandra.apache.org> wrote:
>>>
>>>> I suspect you are abusing batch statements. Batch statements should
>>>> only be used where atomicity or isolation is needed. Using batch statements
>>>> won't make inserting multiple partitions faster. In fact, it often will
>>>> make that slower.
>>>>
>>>> Also, the linear relationship between commitlog_sync_group_window and
>>>> write throughput is expected. That's because the max number of uncompleted
>>>> writes is limited by the write concurrency, and a write is not considered
>>>> "complete" before it is synced to disk when commitlog sync is in group or
>>>> batch mode. That means within each interval, only a limited number of writes
>>>> can be done. The ways to increase that include: adding more nodes, syncing
>>>> the commitlog at shorter intervals, and allowing more concurrent writes.
>>>>
>>>>
>>>> On 23/04/2024 20:43, Nathan Marz wrote:
>>>>
>>>> Thanks. I raised concurrent_writes to 128 and
>>>> set commitlog_sync_group_window to 20ms. This causes a single execute of a
>>>> BatchStatement containing 100 inserts to succeed. However, the throughput
>>>> I'm seeing is atrocious.
>>>>
>>>> With these settings, I'm executing 10 BatchStatement concurrently at a
>>>> time using the semaphore + loop approach I showed in my first message. So
>>>> as requests complete, more are sent out such that there are 10 in-flight at
>>>> a time. Each BatchStatement has 100 individual inserts. I'm seeing only 730
>>>>

Re: Trouble with using group commitlog_sync

2024-04-23 Thread Nathan Marz
Tried it again with one more client thread, and that had no effect on
performance. This is unsurprising as there are only 2 CPUs on this node and
they were already at 100%. These were good ideas, but I'm still unable to
even match the performance of batch commit mode with group commit mode.

On Tue, Apr 23, 2024 at 12:46 PM Bowen Song via user <
user@cassandra.apache.org> wrote:

> To achieve 10k loop iterations per second, each iteration must take 0.1
> milliseconds or less. Considering that each iteration needs to lock and
> unlock the semaphore (two syscalls) and make network requests (more
> syscalls), that's a lot of context switches. It may be a bit too much to ask
> for a single thread. I would suggest trying multi-threading or
> multi-processing, and see if the combined insert rate is higher.
>
> I should also note that executeAsync() also has implicit limits on the
> number of in-flight requests, which default to 1024 requests per connection
> and 1 connection per server. See
> https://docs.datastax.com/en/developer/java-driver/4.17/manual/core/pooling/
>
>
> On 23/04/2024 23:18, Nathan Marz wrote:
>
> It's using the async API, so why would it need multiple threads? Using the
> exact same approach I'm able to get 38k / second with periodic
> commitlog_sync. For what it's worth, I do see 100% CPU utilization in every
> single one of these tests.
>
> On Tue, Apr 23, 2024 at 11:01 AM Bowen Song via user <
> user@cassandra.apache.org> wrote:
>
>> Have you checked the thread CPU utilisation of the client side? You
>> likely will need more than one thread to do insertion in a loop to achieve
>> tens of thousands of inserts per second.
>>
>>
>> On 23/04/2024 21:55, Nathan Marz wrote:
>>
>> Thanks for the explanation.
>>
>> I tried again with commitlog_sync_group_window at 2ms, concurrent_writes
>> at 512, and doing 1000 individual inserts at a time with the same loop +
>> semaphore approach. This only nets 9k / second.
>>
>> I got much higher throughput for the other modes with BatchStatement of
>> 100 inserts rather than 100x more individual inserts.
>>
>> On Tue, Apr 23, 2024 at 10:45 AM Bowen Song via user <
>> user@cassandra.apache.org> wrote:
>>
>>> I suspect you are abusing batch statements. Batch statements should only
>>> be used where atomicity or isolation is needed. Using batch statements
>>> won't make inserting multiple partitions faster. In fact, it often will
>>> make that slower.
>>>
>>> Also, the linear relationship between commitlog_sync_group_window and
>>> write throughput is expected. That's because the max number of uncompleted
>>> writes is limited by the write concurrency, and a write is not considered
>>> "complete" before it is synced to disk when commitlog sync is in group or
>>> batch mode. That means within each interval, only a limited number of writes
>>> can be done. The ways to increase that include: adding more nodes, syncing
>>> the commitlog at shorter intervals, and allowing more concurrent writes.
>>>
>>>
>>> On 23/04/2024 20:43, Nathan Marz wrote:
>>>
>>> Thanks. I raised concurrent_writes to 128 and
>>> set commitlog_sync_group_window to 20ms. This causes a single execute of a
>>> BatchStatement containing 100 inserts to succeed. However, the throughput
>>> I'm seeing is atrocious.
>>>
>>> With these settings, I'm executing 10 BatchStatement concurrently at a
>>> time using the semaphore + loop approach I showed in my first message. So
>>> as requests complete, more are sent out such that there are 10 in-flight at
>>> a time. Each BatchStatement has 100 individual inserts. I'm seeing only 730
>>> inserts / second. Again, with periodic mode I see 38k / second and with
>>> batch I see 14k / second. My expectation was that group commit mode
>>> throughput would be somewhere between those two.
>>>
>>> If I set commitlog_sync_group_window to 100ms, the throughput drops to
>>> 14 / second.
>>>
>>> If I set commitlog_sync_group_window to 10ms, the throughput increases
>>> to 1587 / second.
>>>
>>> If I set commitlog_sync_group_window to 5ms, the throughput increases to
>>> 3200 / second.
>>>
>>> If I set commitlog_sync_group_window to 1ms, the throughput increases to
>>> 13k / second, which is slightly less than batch commit mode.
>>>
>>> Is group commit mode supposed to have better performance than batch mode?
>>>
>>>
>>> On Tue, Ap

Re: Trouble with using group commitlog_sync

2024-04-23 Thread Nathan Marz
It's using the async API, so why would it need multiple threads? Using the
exact same approach I'm able to get 38k / second with periodic
commitlog_sync. For what it's worth, I do see 100% CPU utilization in every
single one of these tests.

On Tue, Apr 23, 2024 at 11:01 AM Bowen Song via user <
user@cassandra.apache.org> wrote:

> Have you checked the thread CPU utilisation of the client side? You likely
> will need more than one thread to do insertion in a loop to achieve tens of
> thousands of inserts per second.
>
>
> On 23/04/2024 21:55, Nathan Marz wrote:
>
> Thanks for the explanation.
>
> I tried again with commitlog_sync_group_window at 2ms, concurrent_writes
> at 512, and doing 1000 individual inserts at a time with the same loop +
> semaphore approach. This only nets 9k / second.
>
> I got much higher throughput for the other modes with BatchStatement of
> 100 inserts rather than 100x more individual inserts.
>
> On Tue, Apr 23, 2024 at 10:45 AM Bowen Song via user <
> user@cassandra.apache.org> wrote:
>
>> I suspect you are abusing batch statements. Batch statements should only
>> be used where atomicity or isolation is needed. Using batch statements
>> won't make inserting multiple partitions faster. In fact, it often will
>> make that slower.
>>
>> Also, the linear relationship between commitlog_sync_group_window and
>> write throughput is expected. That's because the max number of uncompleted
>> writes is limited by the write concurrency, and a write is not considered
>> "complete" before it is synced to disk when commitlog sync is in group or
>> batch mode. That means within each interval, only a limited number of writes
>> can be done. The ways to increase that include: adding more nodes, syncing
>> the commitlog at shorter intervals, and allowing more concurrent writes.
>>
>>
>> On 23/04/2024 20:43, Nathan Marz wrote:
>>
>> Thanks. I raised concurrent_writes to 128 and
>> set commitlog_sync_group_window to 20ms. This causes a single execute of a
>> BatchStatement containing 100 inserts to succeed. However, the throughput
>> I'm seeing is atrocious.
>>
>> With these settings, I'm executing 10 BatchStatement concurrently at a
>> time using the semaphore + loop approach I showed in my first message. So
>> as requests complete, more are sent out such that there are 10 in-flight at
>> a time. Each BatchStatement has 100 individual inserts. I'm seeing only 730
>> inserts / second. Again, with periodic mode I see 38k / second and with
>> batch I see 14k / second. My expectation was that group commit mode
>> throughput would be somewhere between those two.
>>
>> If I set commitlog_sync_group_window to 100ms, the throughput drops to 14
>> / second.
>>
>> If I set commitlog_sync_group_window to 10ms, the throughput increases to
>> 1587 / second.
>>
>> If I set commitlog_sync_group_window to 5ms, the throughput increases to
>> 3200 / second.
>>
>> If I set commitlog_sync_group_window to 1ms, the throughput increases to
>> 13k / second, which is slightly less than batch commit mode.
>>
>> Is group commit mode supposed to have better performance than batch mode?
>>
>>
>> On Tue, Apr 23, 2024 at 8:46 AM Bowen Song via user <
>> user@cassandra.apache.org> wrote:
>>
>>> The default commitlog_sync_group_window is very long for SSDs. Try
>>> reducing it if you are using SSD-backed storage for the commit log. 10-15 ms
>>> is a good starting point. You may also want to increase the value of
>>> concurrent_writes; consider at least doubling or quadrupling it from the
>>> default. You'll need even higher write concurrency for a longer
>>> commitlog_sync_group_window.
>>>
>>> On 23/04/2024 19:26, Nathan Marz wrote:
>>>
>>> "batch" mode works fine. I'm having trouble with "group" mode. The only
>>> config for that is "commitlog_sync_group_window", and I have that set to
>>> the default 1000ms.
>>>
>>> On Tue, Apr 23, 2024 at 8:15 AM Bowen Song via user <
>>> user@cassandra.apache.org> wrote:
>>>
>>>> Why would you want to set commitlog_sync_batch_window to 1 second long
>>>> when commitlog_sync is set to batch mode? The documentation
>>>> <https://cassandra.apache.org/doc/stable/cassandra/architecture/storage_engine.html>
>>>> on this says:
>>>>
>>>> *This window should be kept short because the writer threads will be
>>>> unable to do extra wo

Re: Trouble with using group commitlog_sync

2024-04-23 Thread Nathan Marz
Thanks for the explanation.

I tried again with commitlog_sync_group_window at 2ms, concurrent_writes at
512, and doing 1000 individual inserts at a time with the same loop +
semaphore approach. This only nets 9k / second.

I got much higher throughput for the other modes with BatchStatement of 100
inserts rather than 100x more individual inserts.
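
The explanation quoted below (outstanding writes are capped by the write concurrency, and each write completes only after a sync) can be turned into a rough order-of-magnitude estimate. This is a minimal sketch under a simplifying assumption that is not from the thread: every write waits for about one full sync window. The class and method names are illustrative only.

```java
class GroupWindowEstimate {
    // Rough estimate of writes per second, ASSUMING each write waits about one
    // full sync window and that outstanding writes are capped by whichever is
    // smaller: the client's in-flight limit or the server's concurrent_writes.
    static double estimatedWritesPerSecond(int clientInFlight, int concurrentWrites, double windowMs) {
        int outstanding = Math.min(clientInFlight, concurrentWrites);
        return outstanding * 1000.0 / windowMs;
    }

    public static void main(String[] args) {
        // Settings from the message above: 1000 in flight, concurrent_writes 512, 2 ms window.
        double estimate = estimatedWritesPerSecond(1000, 512, 2.0);
        System.out.printf("estimated ceiling: %.0f writes/s%n", estimate);
    }
}
```

Under that assumption the estimate for these settings is far above the 9k / second actually observed, which would suggest the limit in this test lies somewhere other than the sync window.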

On Tue, Apr 23, 2024 at 10:45 AM Bowen Song via user <
user@cassandra.apache.org> wrote:

> I suspect you are abusing batch statements. Batch statements should only
> be used where atomicity or isolation is needed. Using batch statements
> won't make inserting multiple partitions faster. In fact, it often will
> make that slower.
>
> Also, the linear relationship between commitlog_sync_group_window and
> write throughput is expected. That's because the max number of uncompleted
> writes is limited by the write concurrency, and a write is not considered
> "complete" before it is synced to disk when commitlog sync is in group or
> batch mode. That means within each interval, only a limited number of writes
> can be done. The ways to increase that include: adding more nodes, syncing
> the commitlog at shorter intervals, and allowing more concurrent writes.
>
>
> On 23/04/2024 20:43, Nathan Marz wrote:
>
> Thanks. I raised concurrent_writes to 128 and
> set commitlog_sync_group_window to 20ms. This causes a single execute of a
> BatchStatement containing 100 inserts to succeed. However, the throughput
> I'm seeing is atrocious.
>
> With these settings, I'm executing 10 BatchStatement concurrently at a
> time using the semaphore + loop approach I showed in my first message. So
> as requests complete, more are sent out such that there are 10 in-flight at
> a time. Each BatchStatement has 100 individual inserts. I'm seeing only 730
> inserts / second. Again, with periodic mode I see 38k / second and with
> batch I see 14k / second. My expectation was that group commit mode
> throughput would be somewhere between those two.
>
> If I set commitlog_sync_group_window to 100ms, the throughput drops to 14
> / second.
>
> If I set commitlog_sync_group_window to 10ms, the throughput increases to
> 1587 / second.
>
> If I set commitlog_sync_group_window to 5ms, the throughput increases to
> 3200 / second.
>
> If I set commitlog_sync_group_window to 1ms, the throughput increases to
> 13k / second, which is slightly less than batch commit mode.
>
> Is group commit mode supposed to have better performance than batch mode?
>
>
> On Tue, Apr 23, 2024 at 8:46 AM Bowen Song via user <
> user@cassandra.apache.org> wrote:
>
>> The default commitlog_sync_group_window is very long for SSDs. Try
>> reducing it if you are using SSD-backed storage for the commit log. 10-15 ms
>> is a good starting point. You may also want to increase the value of
>> concurrent_writes; consider at least doubling or quadrupling it from the
>> default. You'll need even higher write concurrency for a longer
>> commitlog_sync_group_window.
>>
>> On 23/04/2024 19:26, Nathan Marz wrote:
>>
>> "batch" mode works fine. I'm having trouble with "group" mode. The only
>> config for that is "commitlog_sync_group_window", and I have that set to
>> the default 1000ms.
>>
>> On Tue, Apr 23, 2024 at 8:15 AM Bowen Song via user <
>> user@cassandra.apache.org> wrote:
>>
>>> Why would you want to set commitlog_sync_batch_window to 1 second long
>>> when commitlog_sync is set to batch mode? The documentation
>>> <https://cassandra.apache.org/doc/stable/cassandra/architecture/storage_engine.html>
>>> on this says:
>>>
>>> *This window should be kept short because the writer threads will be
>>> unable to do extra work while waiting. You may need to increase
>>> concurrent_writes for the same reason*
>>>
>>> If you want to use batch mode, at least ensure
>>> commitlog_sync_batch_window is reasonably short. The default is 2
>>> milliseconds.
>>>
>>>
>>> On 23/04/2024 18:32, Nathan Marz wrote:
>>>
>>> I'm doing some benchmarking of Cassandra on a single m6gd.large
>>> instance. It works fine with periodic or batch commitlog_sync options, but
>>> I'm having tons of issues when I change it to "group". I have
>>> "commitlog_sync_group_window" set to 1000ms.
>>>
>>> My client is doing writes like this (pseudocode):
>>>
>>> Semaphore sem = new Semaphore(numTickets);
>>> while(true) {
>>>
>>> sem.acquire();
>>> session.executeAsync(insert.bind(genUUIDStr()

Re: Trouble with using group commitlog_sync

2024-04-23 Thread Nathan Marz
Thanks. I raised concurrent_writes to 128 and
set commitlog_sync_group_window to 20ms. This causes a single execute of a
BatchStatement containing 100 inserts to succeed. However, the throughput
I'm seeing is atrocious.

With these settings, I'm executing 10 BatchStatement concurrently at a time
using the semaphore + loop approach I showed in my first message. So as
requests complete, more are sent out such that there are 10 in-flight at a
time. Each BatchStatement has 100 individual inserts. I'm seeing only 730
inserts / second. Again, with periodic mode I see 38k / second and with
batch I see 14k / second. My expectation was that group commit mode
throughput would be somewhere between those two.

If I set commitlog_sync_group_window to 100ms, the throughput drops to 14 /
second.

If I set commitlog_sync_group_window to 10ms, the throughput increases to
1587 / second.

If I set commitlog_sync_group_window to 5ms, the throughput increases to
3200 / second.

If I set commitlog_sync_group_window to 1ms, the throughput increases to
13k / second, which is slightly less than batch commit mode.

Is group commit mode supposed to have better performance than batch mode?
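
One way to read the numbers above: if throughput scales inversely with the window, then throughput multiplied by the window length should stay roughly constant. The sketch below only does that arithmetic on the figures reported in this message; the class and method names are illustrative, and "13k" is taken as 13000.

```java
class WindowThroughput {
    // Inserts completed per sync window = throughput (inserts/s) * window (s).
    static double insertsPerWindow(double insertsPerSecond, double windowMs) {
        return insertsPerSecond * windowMs / 1000.0;
    }

    public static void main(String[] args) {
        // {window in ms, reported inserts per second}, copied from the message above.
        double[][] reported = { {100, 14}, {20, 730}, {10, 1587}, {5, 3200}, {1, 13000} };
        for (double[] row : reported) {
            System.out.printf("window=%4.0f ms  throughput=%6.0f/s  per-window=%.2f%n",
                    row[0], row[1], insertsPerWindow(row[1], row[0]));
        }
    }
}
```

For the 1 ms to 20 ms settings the product lands between about 13 and 16 inserts per window. The 100 ms figure as reported (14 / second) gives 1.4 and does not fit that pattern; the sketch does not try to explain why.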


On Tue, Apr 23, 2024 at 8:46 AM Bowen Song via user <
user@cassandra.apache.org> wrote:

> The default commitlog_sync_group_window is very long for SSDs. Try reducing
> it if you are using SSD-backed storage for the commit log. 10-15 ms is a
> good starting point. You may also want to increase the value of
> concurrent_writes; consider at least doubling or quadrupling it from the
> default. You'll need even higher write concurrency for a longer
> commitlog_sync_group_window.
>
> On 23/04/2024 19:26, Nathan Marz wrote:
>
> "batch" mode works fine. I'm having trouble with "group" mode. The only
> config for that is "commitlog_sync_group_window", and I have that set to
> the default 1000ms.
>
> On Tue, Apr 23, 2024 at 8:15 AM Bowen Song via user <
> user@cassandra.apache.org> wrote:
>
>> Why would you want to set commitlog_sync_batch_window to 1 second long
>> when commitlog_sync is set to batch mode? The documentation
>> <https://cassandra.apache.org/doc/stable/cassandra/architecture/storage_engine.html>
>> on this says:
>>
>> *This window should be kept short because the writer threads will be
>> unable to do extra work while waiting. You may need to increase
>> concurrent_writes for the same reason*
>>
>> If you want to use batch mode, at least ensure
>> commitlog_sync_batch_window is reasonably short. The default is 2
>> milliseconds.
>>
>>
>> On 23/04/2024 18:32, Nathan Marz wrote:
>>
>> I'm doing some benchmarking of Cassandra on a single m6gd.large instance.
>> It works fine with periodic or batch commitlog_sync options, but I'm having
>> tons of issues when I change it to "group". I have
>> "commitlog_sync_group_window" set to 1000ms.
>>
>> My client is doing writes like this (pseudocode):
>>
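>> Semaphore sem = new Semaphore(numTickets);
>> while (true) {
>>     // Block until one of the numTickets in-flight slots is free.
>>     sem.acquire();
>>     session.executeAsync(insert.bind(genUUIDStr(), genUUIDStr(), genUUIDStr()))
>>         // Free the slot whether the write succeeded or failed.
>>         .whenComplete((t, u) -> sem.release());
>> }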
>>
>> If I set numTickets higher than 20, I get tons of timeout errors.
>>
>> I've also tried doing single commands with BatchStatement with many
>> inserts at a time, and that fails with timeout when the batch size gets
>> more than 20.
>>
>> Increasing the write request timeout in cassandra.yaml makes it time out
>> at slightly higher numbers of concurrent requests.
>>
>> With periodic I'm able to get about 38k writes / second, and with batch
>> I'm able to get about 14k / second.
>>
>> Any tips on what I should be doing to get group commitlog_sync to work
>> properly? I didn't expect to have to do anything other than change the
>> config.
>>
>>


Re: Trouble with using group commitlog_sync

2024-04-23 Thread Nathan Marz
"batch" mode works fine. I'm having trouble with "group" mode. The only
config for that is "commitlog_sync_group_window", and I have that set to
the default 1000ms.

On Tue, Apr 23, 2024 at 8:15 AM Bowen Song via user <
user@cassandra.apache.org> wrote:

> Why would you want to set commitlog_sync_batch_window to 1 second long
> when commitlog_sync is set to batch mode? The documentation
> <https://cassandra.apache.org/doc/stable/cassandra/architecture/storage_engine.html>
> on this says:
>
> *This window should be kept short because the writer threads will be
> unable to do extra work while waiting. You may need to increase
> concurrent_writes for the same reason*
>
> If you want to use batch mode, at least ensure commitlog_sync_batch_window
> is reasonably short. The default is 2 milliseconds.
>
>
> On 23/04/2024 18:32, Nathan Marz wrote:
>
> I'm doing some benchmarking of Cassandra on a single m6gd.large instance.
> It works fine with periodic or batch commitlog_sync options, but I'm having
> tons of issues when I change it to "group". I have
> "commitlog_sync_group_window" set to 1000ms.
>
> My client is doing writes like this (pseudocode):
>
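> Semaphore sem = new Semaphore(numTickets);
> while (true) {
>     // Block until one of the numTickets in-flight slots is free.
>     sem.acquire();
>     session.executeAsync(insert.bind(genUUIDStr(), genUUIDStr(), genUUIDStr()))
>         // Free the slot whether the write succeeded or failed.
>         .whenComplete((t, u) -> sem.release());
> }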
>
> If I set numTickets higher than 20, I get tons of timeout errors.
>
> I've also tried doing single commands with BatchStatement with many
> inserts at a time, and that fails with timeout when the batch size gets
> more than 20.
>
> Increasing the write request timeout in cassandra.yaml makes it time out
> at slightly higher numbers of concurrent requests.
>
> With periodic I'm able to get about 38k writes / second, and with batch
> I'm able to get about 14k / second.
>
> Any tips on what I should be doing to get group commitlog_sync to work
> properly? I didn't expect to have to do anything other than change the
> config.
>
>


Trouble with using group commitlog_sync

2024-04-23 Thread Nathan Marz
I'm doing some benchmarking of Cassandra on a single m6gd.large instance.
It works fine with periodic or batch commitlog_sync options, but I'm having
tons of issues when I change it to "group". I have
"commitlog_sync_group_window" set to 1000ms.

My client is doing writes like this (pseudocode):

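Semaphore sem = new Semaphore(numTickets);
while (true) {
    // Block until one of the numTickets in-flight slots is free.
    sem.acquire();
    session.executeAsync(insert.bind(genUUIDStr(), genUUIDStr(), genUUIDStr()))
        // Free the slot whether the write succeeded or failed.
        .whenComplete((t, u) -> sem.release());
}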

If I set numTickets higher than 20, I get tons of timeout errors.

I've also tried doing single commands with BatchStatement with many inserts
at a time, and that fails with timeout when the batch size gets more than
20.

Increasing the write request timeout in cassandra.yaml makes it time out at
slightly higher numbers of concurrent requests.

With periodic I'm able to get about 38k writes / second, and with batch I'm
able to get about 14k / second.

Any tips on what I should be doing to get group commitlog_sync to work
properly? I didn't expect to have to do anything other than change the
config.
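
For readers who want to try the semaphore + loop pattern without a cluster, here is a minimal self-contained sketch. It is not the benchmark client from this thread: `fakeExecuteAsync` is a hypothetical stand-in for `session.executeAsync(...)` that simply completes after a fixed delay, and the class name, request count, and delay are assumptions chosen for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedAsyncLoop {
    // Hypothetical stand-in for session.executeAsync(...): completes after delayMs.
    static CompletableFuture<Void> fakeExecuteAsync(ScheduledExecutorService timer, long delayMs) {
        CompletableFuture<Void> done = new CompletableFuture<>();
        timer.schedule(() -> done.complete(null), delayMs, TimeUnit.MILLISECONDS);
        return done;
    }

    // Issues totalRequests with at most numTickets in flight at once and
    // returns the highest in-flight count that was observed.
    static int run(int numTickets, int totalRequests, long delayMs) {
        ScheduledExecutorService timer = Executors.newScheduledThreadPool(2);
        Semaphore sem = new Semaphore(numTickets);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            for (int i = 0; i < totalRequests; i++) {
                sem.acquireUninterruptibly();          // wait for a free slot
                peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                fakeExecuteAsync(timer, delayMs).whenComplete((result, error) -> {
                    inFlight.decrementAndGet();
                    sem.release();                     // free the slot on success or failure
                });
            }
            sem.acquireUninterruptibly(numTickets);    // wait for the remaining requests to finish
            return peak.get();
        } finally {
            timer.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("peak in-flight = " + run(10, 200, 5));
    }
}
```

The semaphore is the only thing limiting concurrency here, so the observed peak never exceeds `numTickets`. With a real driver session, the driver's own per-connection request limits would apply as well.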