Re: [VOTE] Release Apache Cassandra 4.0-beta4

2020-12-20 Thread Yuji Ito
+1 (non-binding)

Short Jepsen tests passed!
- https://github.com/scalar-labs/scalar-jepsen/cassandra

On Sun, Dec 20, 2020 at 8:00, Paulo Motta wrote:

> +1 (nb)
>
> - Tested binary distribution with read/write tlp-stress workload and
> incremental repair on a 6-node multi-dc ccm cluster.
>
> (Ran into https://issues.apache.org/jira/browse/CASSANDRA-16364 which is a
> known issue which I don't think should block this release)
>
> On Fri, Dec 18, 2020 at 16:17, Mick Semb Wever wrote:
>
> > Proposing the test build of Cassandra 4.0-beta4 for release.
> >
> > sha1: b0c50c10dbc443a05662b111a971a65cafa258d5
> > Git:
> >
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-beta4-tentative
> > Maven Artifacts:
> >
> >
> https://repository.apache.org/content/repositories/orgapachecassandra-1226/org/apache/cassandra/cassandra-all/4.0-beta4/
> >
> > The Source and Build Artifacts, and the Debian and RPM packages and
> > repositories, are available here:
> > https://dist.apache.org/repos/dist/dev/cassandra/4.0-beta4/
> >
> > The vote will be open for 72 hours (longer if needed). Everyone who has
> > tested the build is invited to vote. Votes by PMC members are considered
> > binding. A vote passes if there are at least three binding +1s and no
> -1's.
> >
> > [1]: CHANGES.txt:
> >
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-beta4-tentative
> > [2]: NEWS.txt:
> >
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-beta4-tentative
> >
>


Re: [VOTE] Release Apache Cassandra 4.0-beta3

2020-10-31 Thread Yuji Ito
+1 (non-binding)

Short Jepsen tests passed.
https://github.com/scalar-labs/scalar-jepsen

On Sun, Nov 1, 2020 at 10:23, Yifan Cai wrote:

> +1 nb
>
> 
> From: Scott Andreas 
> Sent: Saturday, October 31, 2020 5:44:55 PM
> To: dev@cassandra.apache.org 
> Subject: Re: [VOTE] Release Apache Cassandra 4.0-beta3
>
> +1 nb
>
> > On Oct 31, 2020, at 11:38 AM, Brandon Williams  wrote:
> >
> > +1
> >
> > Signatures and checksums match, source build works, as does
> dis/enablebinary.
> >
> > On Thu, Oct 29, 2020 at 7:30 AM Mick Semb Wever  wrote:
> >>
> >> Proposing the test build of Cassandra 4.0-beta3 for release.
> >>
> >> sha1: be716b46f2cb3b2d1f01dc225396c6284d5a35de
> >> Git:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-beta3-tentative
> >> Maven Artifacts:
> >>
> https://repository.apache.org/content/repositories/orgapachecassandra-1224/org/apache/cassandra/cassandra-all/4.0-beta3/
> >>
> >> The Source and Build Artifacts, and the Debian and RPM packages and
> >> repositories, are available here:
> >> https://dist.apache.org/repos/dist/dev/cassandra/4.0-beta3/
> >>
> >> The vote will be open for 72 hours (longer if needed). Everyone who
> >> has tested the build is invited to vote. Votes by PMC members are
> >> considered binding. A vote passes if there are at least three binding
> >> +1s and no -1's.
> >>
> >> [1]: CHANGES.txt:
> >>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-beta3-tentative
> >> [2]: NEWS.txt:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-beta3-tentative
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>
> >
> >
>
>
>
>


Re: [VOTE] Release Apache Cassandra 4.0-beta1

2020-07-15 Thread Yuji Ito
+1 (non-binding)

Short Jepsen tests with crash injection for map, set, counter, batch, and
LWT passed.
https://github.com/scalar-labs/scalar-jepsen

On Wed, Jul 15, 2020 at 8:06, Mick Semb Wever wrote:

> Proposing the test build of Cassandra 4.0-beta1 for release.
>
> sha1: 5e767711360ecc4bc05a7cd219f0e680bfada004
> Git:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-beta1-tentative
> Maven Artifacts:
>
> https://repository.apache.org/content/repositories/orgapachecassandra-1210/org/apache/cassandra/cassandra-all/4.0-beta1/
>
> The Source and Build Artifacts, and the Debian and RPM packages and
> repositories, are available here:
> https://dist.apache.org/repos/dist/dev/cassandra/4.0-beta1/
>
> The vote will be open for 72 hours (longer if needed). Everyone who has
> tested the build is invited to vote. Votes by PMC members are considered
> binding. A vote passes if there are at least three binding +1s and no -1s.
>
> Eventual publishing and announcement of the 4.0-beta1 release will be
> coordinated, as described in
>
> https://lists.apache.org/thread.html/r537fe799e7d5e6d72ac791fdbe9098ef0344c55400c7f68ff65abe51%40%3Cdev.cassandra.apache.org%3E
>
> [1]: CHANGES.txt:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-beta1-tentative
> [2]: NEWS.txt:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-beta1-tentative
>


Re: [VOTE] Release Apache Cassandra 4.0-alpha4

2020-04-11 Thread Yuji Ito
+1 (non-binding)

It has passed brief Jepsen tests.
https://github.com/scalar-labs/scalar-jepsen

On Sat, Apr 11, 2020 at 8:29, Mick Semb Wever wrote:

> Proposing the test build of Cassandra 4.0-alpha4 for release.
>
> sha1: d00c004cc10986fc41c2070f9c5d0007e03a45c3
> Git:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-alpha4-tentative
> Maven Artifacts:
>
> https://repository.apache.org/content/repositories/orgapachecassandra-1202/org/apache/cassandra/cassandra-all/4.0-alpha4/
>
> The Source and Build Artifacts, and the Debian and RPM packages and
> repositories, are available here:
> https://dist.apache.org/repos/dist/dev/cassandra/4.0-alpha4/
>
> The vote will be open for at least 96 hours (longer than normal,
> because of Easter holidays for many). Everyone who has tested the
> build is invited to vote. Votes by PMC members are considered binding.
> A vote passes if there are at least three binding +1s.
>
> [1]: CHANGES.txt:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-alpha4-tentative
> [2]: NEWS.txt:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-alpha4-tentative
>
>
> regards,
> Mick
>
>
>


Re: [VOTE] Release Apache Cassandra 4.0-alpha3

2020-01-31 Thread Yuji Ito
+1 (non-binding)

I've briefly tested the build with Jepsen.
https://github.com/scalar-labs/scalar-jepsen

On Fri, Jan 31, 2020 at 13:37, Anthony Grasso wrote:

> +1 (non-binding)
>
> On Fri, 31 Jan 2020 at 08:48, Joshua McKenzie 
> wrote:
>
> > +1
> >
> > On Thu, Jan 30, 2020 at 4:31 PM Brandon Williams 
> wrote:
> >
> > > +1
> > >
> > > On Thu, Jan 30, 2020 at 1:47 PM Mick Semb Wever 
> wrote:
> > > >
> > > > Proposing the test build of Cassandra 4.0-alpha3 for release.
> > > >
> > > > sha1: 5f7c88601c65cdf14ee68387ed68203f2603fc29
> > > > Git:
> > >
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-alpha3-tentative
> > > > Maven Artifacts:
> > >
> >
> https://repository.apache.org/content/repositories/orgapachecassandra-1189/org/apache/cassandra/apache-cassandra/4.0-alpha3/
> > > >
> > > > The Source and Build Artifacts, and the Debian and RPM packages are
> > > available here:
> > > https://dist.apache.org/repos/dist/dev/cassandra/4.0-alpha3/
> > > >
> > > > The vote will be open for 72 hours (longer if needed). Everyone who
> has
> > > tested the build is invited to vote. Votes by PMC members are
> considered
> > > binding. A vote passes if there are at least three binding +1s.
> > > >
> > > > ** Please note this is my first time as release manager, and the
> > release
> > > process has been improved to deal with sha256|512 checksums as well as
> to
> > > use the ASF dev dist staging location. So please be extra critical. **
> > > >
> > > >
> > > > [1]: CHANGES.txt:
> > >
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-alpha3-tentative
> > > > [2]: NEWS.txt:
> > >
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-alpha3-tentative
> > > >
> > > >
> > >
> > >
> > >
> >
>


Re: Inter-node messaging latency

2018-11-27 Thread Yuji Ito
Hi,

Thank you for the reply.
I've measured LWT throughput in 4.0.

I used the cassandra-stress tool to insert rows with LWT for 3 minutes on
i3.xlarge and i3.4xlarge instances.
For 3.11, I modified the tool to support LWT.
Before each measurement, I cleaned up all Cassandra data.

The throughput in 4.0 is about 3-5% higher than in 3.11.
The CPU load on i3.4xlarge (16 vCPUs) only reaches 75% in both versions,
and its throughput is less than 4 times that of i3.xlarge.
So I think the throughput isn't bounded by CPU in 4.0 either.

With non-LWT writes, the CPU load on i3.4xlarge reaches 80%.

I wonder what the bottleneck for writes is on a many-core machine if the
messaging issue has been resolved in 4.0.
Is there any parameter I can change so that inserts use all of the CPU?

# LWT insert
* Cassandra 3.11.3
| instance type | # of threads | concurrent_writes | Throughput [op/s] |
| i3.xlarge     |           64 |                32 |              2815 |
| i3.4xlarge    |          256 |               128 |              9506 |
| i3.4xlarge    |          512 |               256 |             10540 |

* Cassandra 4.0 (trunk)
| instance type | # of threads | concurrent_writes | Throughput [op/s] |
| i3.xlarge     |           64 |                32 |              2951 |
| i3.4xlarge    |          256 |               128 |              9816 |
| i3.4xlarge    |          512 |               256 |             11055 |

* Environment
- 3 node cluster
- Replication factor: 3
- Node instance: AWS EC2 i3.xlarge / i3.4xlarge

* C* configuration
- Apache Cassandra 3.11.3 / 4.0 (trunk)
- commitlog_sync: batch
- concurrent_writes: 32, 128, 256
- native_transport_max_threads: 128 (default), 256 (when concurrent_writes
is 256)
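
The percentages above can be re-derived from the two tables with a few
lines of Python. This is only a sketch for checking the arithmetic: the
dictionary layout and the helper names (`gain_percent`, `scaling`) are made
up for this illustration, and the numbers are copied from the tables.

```python
# Throughput [op/s] copied from the two LWT tables in this message.
# Key: (instance type, client threads, concurrent_writes).
LWT_3_11 = {
    ("i3.xlarge", 64, 32): 2815,
    ("i3.4xlarge", 256, 128): 9506,
    ("i3.4xlarge", 512, 256): 10540,
}
LWT_4_0 = {
    ("i3.xlarge", 64, 32): 2951,
    ("i3.4xlarge", 256, 128): 9816,
    ("i3.4xlarge", 512, 256): 11055,
}


def gain_percent(old, new):
    """Relative throughput change from old to new, in percent."""
    return (new - old) / old * 100.0


def scaling(table):
    """Best i3.4xlarge throughput divided by the i3.xlarge throughput."""
    small = table[("i3.xlarge", 64, 32)]
    large = max(v for k, v in table.items() if k[0] == "i3.4xlarge")
    return large / small


# Per-configuration gain of 4.0 (trunk) over 3.11.3.
gains = {key: gain_percent(LWT_3_11[key], LWT_4_0[key]) for key in LWT_3_11}

if __name__ == "__main__":
    for key, gain in gains.items():
        print(key, f"{gain:.1f}% higher in 4.0")
    print(f"3.11.3 scaling: {scaling(LWT_3_11):.2f}x")
    print(f"4.0 scaling:    {scaling(LWT_4_0):.2f}x")
```

Both scaling factors come out below 4, which is consistent with the
observation that the larger instance does not scale linearly.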

Thanks,
Yuji


On Mon, Nov 26, 2018 at 17:27, sankalp kohli wrote:

> Inter-node messaging is rewritten using Netty in 4.0. It will be better to
> test it using that as potential changes will mostly land on top of that.
>
> On Mon, Nov 26, 2018 at 7:39 AM Yuji Ito  wrote:
>
>> Hi,
>>
>> I'm investigating LWT performance with C* 3.11.3.
>> It looks that the performance is bounded by messaging latency when many
>> requests are issued concurrently.
>>
>> According to the source code, the number of messaging threads per node is
>> only 1 thread for incoming and 1 thread for outbound "small" message to
>> another node.
>>
>> I guess these threads are frequently interrupted because many threads are
>> executed when many requests are issued.
>> Especially, I think it affects the LWT performance when many LWT requests
>> which need lots of inter-node messaging are issued.
>>
>> I measured that latency. It took 2.5 ms in average to enqueue a message
>> at a node and to receive the message at the **same** node with 96
>> concurrent LWT writes.
>> Is it normal? I think it is too big latency, though a message was sent to
>> the same node.
>>
>> Decreasing numbers of other threads like `concurrent_counter_writes`,
>> `concurrent_materialized_view_writes` reduced a bit the latency.
>> Can I change any other parameter to reduce the latency?
>> I've tried using message coalescing, but they didn't reduce that.
>>
>> * Environment
>> - 3 node cluster
>> - Replication factor: 3
>> - Node instance: AWS EC2 i3.xlarge
>>
>> * C* configuration
>> - Apache Cassandra 3.11.3
>> - commitlog_sync: batch
>> - concurrent_reads: 32 (default)
>> - concurrent_writes: 32 (default)
>>
>> Thanks,
>> Yuji
>>
>>
>
>


Inter-node messaging latency

2018-11-25 Thread Yuji Ito
Hi,

I'm investigating LWT performance with C* 3.11.3.
It looks like the performance is bounded by messaging latency when many
requests are issued concurrently.

According to the source code, each node has only 1 messaging thread for
incoming messages and 1 thread for outbound "small" messages to another
node.

I guess these threads are frequently interrupted because many threads run
when many requests are issued.
I think this particularly affects LWT performance, because LWT requests
need a lot of inter-node messaging.

I measured that latency. With 96 concurrent LWT writes, it took 2.5 ms on
average from enqueuing a message at a node to receiving it at the **same**
node.
Is that normal? I think the latency is too high, given that the message was
sent to the same node.

Decreasing the number of other threads, such as `concurrent_counter_writes`
and `concurrent_materialized_view_writes`, reduced the latency a bit.
Can I change any other parameter to reduce the latency?
I've tried message coalescing, but it didn't reduce the latency.

* Environment
- 3 node cluster
- Replication factor: 3
- Node instance: AWS EC2 i3.xlarge

* C* configuration
- Apache Cassandra 3.11.3
- commitlog_sync: batch
- concurrent_reads: 32 (default)
- concurrent_writes: 32 (default)

Thanks,
Yuji


Re: Jepsen testing

2018-11-09 Thread Yuji Ito
Hi Alex,

Yes, Jepsen tests of Scalar DB run as a part of the daily testing.
Each of them runs for an hour every day.

However, some tests fail due to a client thread crash.
It seems to be unrelated to Cassandra and Scalar DB.
I think that my implementation might have a problem.
Now, I'm investigating that issue.

It would be great if our work were included in the regular testing drill.

Thanks,
Yuji

On Fri, Nov 9, 2018 at 17:27, Oleksandr Shulgin wrote:

> On Thu, Nov 8, 2018 at 10:42 PM Yuji Ito  wrote:
>
> >
> > We are working on Jepsen testing for Cassandra.
> > https://github.com/scalar-labs/jepsen/tree/cassandra/cassandra
> >
> > As you may know, Jepsen is a framework for distributed systems
> > verification.
> > It can inject network failure and so on and check data consistency.
> > https://github.com/jepsen-io/jepsen
> >
> > Our tests are based on riptano's great work.
> > https://github.com/riptano/jepsen/tree/cassandra/cassandra
> >
> > I refined it for the latest Jepsen and removed some tests.
> > Next, I'll fix clock-drift tests.
> >
> > I would like to get your feedback.
> >
>
> Cool stuff!  Do you have jepsen tests as part of regular testing in
> scalardb?  How long does it take to run all of them on average?
>
> I wonder if Apache Cassandra would be willing to include this as part of
> regular testing drill as well.
>
> Cheers,
> --
> Alex
>


Re: Jepsen testing

2018-11-08 Thread Yuji Ito
Thank you for the suggestion.

I haven't tried Cassandra 4.0 yet.
For now, the testing supports only released versions which are distributed
at http://www.us.apache.org/dist/cassandra/ .

I have to modify the installation code and also verify the wrapper.
It's a good next step.

Thanks,
Yuji

On Fri, Nov 9, 2018 at 7:47, sankalp kohli wrote:

> Should we use confluence page to sign them up for this testing?
>
> https://cwiki.apache.org/confluence/display/CASSANDRA/4.0+QA+Signup
>
> On Thu, Nov 8, 2018 at 2:07 PM Nate McCall  wrote:
>
> > [- cassandra-users]
> > Hi Yuji,
> > Thanks so much for working on this! Any fault injection testing is
> > certainly worth the effort.
> >
> > Thanks,
> > -Nate
> > On Thu, Nov 8, 2018 at 1:36 PM Yuji Ito  wrote:
> > >
> > > Hi,
> > >
> > > We are working on Jepsen testing for Cassandra.
> > > https://github.com/scalar-labs/jepsen/tree/cassandra/cassandra
> > >
> > > As you may know, Jepsen is a framework for distributed systems
> > verification.
> > > It can inject network failure and so on and check data consistency.
> > > https://github.com/jepsen-io/jepsen
> > >
> > > Our tests are based on riptano's great work.
> > > https://github.com/riptano/jepsen/tree/cassandra/cassandra
> > >
> > > I refined it for the latest Jepsen and removed some tests.
> > > Next, I'll fix clock-drift tests.
> > >
> > > I would like to get your feedback.
> > >
> > > Thanks,
> > > Yuji Ito
> >
> >
> >
>


TTL of paxos table

2018-08-29 Thread Yuji Ito
Hi,

I wonder why records in the system.paxos table aren't removed, even though
all records are written with a TTL (at least 3 hours).
That's because gc_grace_seconds of the system.paxos table is always 0 and
users can't change that value for the system keyspace!

Why is the TTL of paxos records invalidated?
Or am I misunderstanding something?

Related to:
https://issues.apache.org/jira/browse/CASSANDRA-5451
https://issues.apache.org/jira/browse/CASSANDRA-13548

Thanks,
Yuji


Re: Proposal - GroupCommitLogService

2017-05-13 Thread Yuji Ito
That's exactly the motivation of this proposal.

Batch can only group the writes that have not yet been persisted at that
moment; it doesn't keep them waiting.
In Batch, a commitlog write isn't kept waiting because the thread lock
(the semaphore in 2.2 and 3.0) for sync is released immediately.

So, the window size means 'the maximum length of time that queries may be
batched together for, not the minimum'.
The Batch window size has almost no effect on performance.
https://issues.apache.org/jira/browse/CASSANDRA-12864

I tested SELECT throughput with a 10 ms Batch window.
As expected, the result was the same as with a 2 ms Batch window.

SELECT / sec
| # of threads | Batch 2ms | Batch 10ms |
|            1 |       192 |        192 |
|            2 |       163 |        169 |
|            4 |       264 |        263 |
|            8 |       454 |        454 |
|           16 |       744 |        744 |
|           32 |      1151 |       1155 |
|           64 |      1767 |       1772 |
|          128 |      2949 |       2962 |
|          256 |      4723 |       4785 |
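
To put a number on "the same", the relative difference between the two
columns can be computed per thread count. This is a small sketch for
checking the arithmetic only; the numbers are copied from the table above
and the helper name `relative_diff_percent` is made up for this example.

```python
# SELECT throughput [ops/s] copied from the table above.
# threads -> (Batch with 2 ms window, Batch with 10 ms window)
SELECT_BATCH = {
    1: (192, 192),
    2: (163, 169),
    4: (264, 263),
    8: (454, 454),
    16: (744, 744),
    32: (1151, 1155),
    64: (1767, 1772),
    128: (2949, 2962),
    256: (4723, 4785),
}


def relative_diff_percent(a, b):
    """Absolute difference between a and b, relative to a, in percent."""
    return abs(b - a) / a * 100.0


diffs = {
    threads: relative_diff_percent(w2, w10)
    for threads, (w2, w10) in SELECT_BATCH.items()
}
# Thread count at which the two window sizes differ the most.
worst_threads = max(diffs, key=diffs.get)

if __name__ == "__main__":
    for threads, diff in diffs.items():
        print(f"{threads:>3} threads: {diff:.2f}% difference")
```

The largest difference stays below 4%, so the window size makes no
practical difference for Batch in this measurement.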


Yuji


On Sun, May 14, 2017 at 2:51 AM, Jonathan Ellis <jbel...@gmail.com> wrote:

> Does that mean that Batch is not working as designed?  If there are other
> pending writes, Batch should also group them together.  (Did you test with
> giving Batch the same window size as Group?)
>
> On Sat, May 13, 2017 at 10:08 AM, Yuji Ito <y...@imagine-orb.com> wrote:
>
>> Batch outperforms when there is no concurrency.
>> Because GroupCommit should wait the window time, the throughput and the
>> latency are worse in a single request.
>> GroupCommit can gather the commitlog writes which are requested in the
>> window time.
>> Actually, the throughput of a single thread was bounded by the window
>> time.
>>
>> Yuji
>>
>> On Sat, May 13, 2017 at 11:49 PM, Jonathan Ellis <jbel...@gmail.com>
>> wrote:
>>
>>> Can we replace Batch entirely with this, or are there situations where
>>> Batch would outperform (in latency, for instance)?
>>>
>>> On Sat, May 13, 2017 at 7:21 AM, Yuji Ito <y...@imagine-orb.com> wrote:
>>>
>>>> Hi dev,
>>>>
>>>> I propose a new CommitLogService, GroupCommitLogService, to improve the
>>>> throughput when lots of requests are received.
>>>> It improved the throughput by maximum 94%.
>>>> I'd like to discuss about this CommitLogService.
>>>>
>>>> Currently, we can select either 2 CommitLog services; Periodic and
>>>> Batch.
>>>> In Periodic, we might lose some commit log which hasn't written to the
>>>> disk.
>>>> In Batch, we can write commit log to the disk every time. The size of
>>>> commit log to write is too small (< 4KB). When high concurrency, these
>>>> writes are gathered and persisted to the disk at once. But, when
>>>> insufficient concurrency, many small writes are issued and the performance
>>>> decreases due to the latency of the disk. Even if you use SSD, processes of
>>>> many IO commands decrease the performance.
>>>>
>>>> GroupCommitLogService writes some commitlog to the disk at once.
>>>> The patch adds GroupCommitLogService (It is enabled by setting
>>>> `commitlog_sync` and `commitlog_sync_group_window_in_ms` in
>>>> cassandra.yaml).
>>>> The difference from Batch is just only waiting for the semaphore.
>>>> By waiting for the semaphore, some writes for commit logs are executed
>>>> at the same time.
>>>> In GroupCommitLogService, the latency becomes worse if the there is no
>>>> concurrency.
>>>>
>>>> I measured the performance with my microbench (MicroRequestThread.java)
>>>> by increasing the number of threads.The cluster has 3 nodes (Replication
>>>> factor: 3). Each nodes is AWS EC2 m4.large instance + 200IOPS io1 volume.
>>>> The result is as below. The GroupCommitLogService with 10ms window
>>>> improved update with Paxos by 94% and improved select with Paxos by 76%.
>>>>
>>>>  SELECT / sec 
>>>> # of threads Batch 2ms Group 10ms
>>>> 1 192 103
>>>> 2 163 212
>>>> 4 264 416
>>>> 8 454 800
>>>> 16 744 1311
>>>> 32 1151 1481
>>>> 64 1767 1844
>>>> 128 2949 3011
>>>> 256 4723 5000
>>>>
>>>>  UPDATE / sec 
>>>> # of threads Batch 2ms Group 10ms
>>>> 1 45 26
>>>> 2 39 51
>>>> 4 58 102
>>>> 8 102 198
>>>> 16 167 213
>>>> 32 289 295
>>>> 64 544 548
>>>> 128 1046 1058
>>>> 256 2020 2061
>>>>
>>>>
>>>> Thanks,
>>>> Yuji
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Jonathan Ellis
>>> co-founder, http://www.datastax.com
>>> @spyced
>>>
>>
>>
>
>
> --
> Jonathan Ellis
> co-founder, http://www.datastax.com
> @spyced
>


Re: Proposal - GroupCommitLogService

2017-05-13 Thread Yuji Ito
Batch outperforms Group when there is no concurrency.
Because GroupCommit has to wait for the window time, both the throughput
and the latency are worse for a single request.
GroupCommit gathers the commitlog writes that are requested within the
window time.
In fact, the throughput of a single thread was bounded by the window time.
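
As a rough back-of-the-envelope check of that bound (a sketch under
assumptions, not a measurement): if a single client thread issues one
request at a time and every request waits for one full group window before
its commitlog write is synced, it cannot complete many more than
1000 / window requests per second. The `syncs_per_op` parameter is a
hypothetical knob for operations that need more than one synced write; the
thread does not say how many each operation needs.

```python
def max_ops_per_sec(window_ms, syncs_per_op=1):
    """Approximate bound for one sequential client that waits a full window per sync."""
    return 1000.0 / (window_ms * syncs_per_op)


# Single-thread SELECT throughput measured with the 10 ms Group window,
# taken from the table in the original proposal.
MEASURED_SELECT_1_THREAD = 103

bound = max_ops_per_sec(window_ms=10)

if __name__ == "__main__":
    print(f"model: {bound:.0f} ops/s, measured: {MEASURED_SELECT_1_THREAD} ops/s")
```

With a 10 ms window the model gives 100 requests per second, close to the
103 SELECT/sec measured with one thread. The model is approximate, so a
small deviation from it is expected.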

Yuji

On Sat, May 13, 2017 at 11:49 PM, Jonathan Ellis <jbel...@gmail.com> wrote:

> Can we replace Batch entirely with this, or are there situations where
> Batch would outperform (in latency, for instance)?
>
> On Sat, May 13, 2017 at 7:21 AM, Yuji Ito <y...@imagine-orb.com> wrote:
>
>> Hi dev,
>>
>> I propose a new CommitLogService, GroupCommitLogService, to improve the
>> throughput when lots of requests are received.
>> It improved the throughput by maximum 94%.
>> I'd like to discuss about this CommitLogService.
>>
>> Currently, we can select either 2 CommitLog services; Periodic and Batch.
>> In Periodic, we might lose some commit log which hasn't written to the
>> disk.
>> In Batch, we can write commit log to the disk every time. The size of
>> commit log to write is too small (< 4KB). When high concurrency, these
>> writes are gathered and persisted to the disk at once. But, when
>> insufficient concurrency, many small writes are issued and the performance
>> decreases due to the latency of the disk. Even if you use SSD, processes of
>> many IO commands decrease the performance.
>>
>> GroupCommitLogService writes some commitlog to the disk at once.
>> The patch adds GroupCommitLogService (It is enabled by setting
>> `commitlog_sync` and `commitlog_sync_group_window_in_ms` in
>> cassandra.yaml).
>> The difference from Batch is just only waiting for the semaphore.
>> By waiting for the semaphore, some writes for commit logs are executed at
>> the same time.
>> In GroupCommitLogService, the latency becomes worse if the there is no
>> concurrency.
>>
>> I measured the performance with my microbench (MicroRequestThread.java)
>> by increasing the number of threads.The cluster has 3 nodes (Replication
>> factor: 3). Each nodes is AWS EC2 m4.large instance + 200IOPS io1 volume.
>> The result is as below. The GroupCommitLogService with 10ms window
>> improved update with Paxos by 94% and improved select with Paxos by 76%.
>>
>>  SELECT / sec 
>> # of threads Batch 2ms Group 10ms
>> 1 192 103
>> 2 163 212
>> 4 264 416
>> 8 454 800
>> 16 744 1311
>> 32 1151 1481
>> 64 1767 1844
>> 128 2949 3011
>> 256 4723 5000
>>
>>  UPDATE / sec 
>> # of threads Batch 2ms Group 10ms
>> 1 45 26
>> 2 39 51
>> 4 58 102
>> 8 102 198
>> 16 167 213
>> 32 289 295
>> 64 544 548
>> 128 1046 1058
>> 256 2020 2061
>>
>>
>> Thanks,
>> Yuji
>>
>>
>>
>>
>
>
>
> --
> Jonathan Ellis
> co-founder, http://www.datastax.com
> @spyced
>


Re: Proposal - GroupCommitLogService

2017-05-13 Thread Yuji Ito
Thanks Jeremiah,

I've opened a ticket on JIRA.

https://issues.apache.org/jira/browse/CASSANDRA-13530

Best,
Yuji


On Sat, May 13, 2017 at 9:38 PM, J. D. Jordan <jeremiah.jor...@gmail.com>
wrote:

> Sounds interesting. You should open a JIRA and attach your code for
> discussion of it.
>
> https://issues.apache.org/jira/browse/CASSANDRA/
> <https://issues.apache.org/jira/browse/CASSANDRA/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel>
>
> -Jeremiah
>
> On May 13, 2017, at 7:21 AM, Yuji Ito <y...@imagine-orb.com> wrote:
>
> Hi dev,
>
> I propose a new CommitLogService, GroupCommitLogService, to improve the
> throughput when lots of requests are received.
> It improved the throughput by maximum 94%.
> I'd like to discuss about this CommitLogService.
>
> Currently, we can select either 2 CommitLog services; Periodic and Batch.
> In Periodic, we might lose some commit log which hasn't written to the
> disk.
> In Batch, we can write commit log to the disk every time. The size of
> commit log to write is too small (< 4KB). When high concurrency, these
> writes are gathered and persisted to the disk at once. But, when
> insufficient concurrency, many small writes are issued and the performance
> decreases due to the latency of the disk. Even if you use SSD, processes of
> many IO commands decrease the performance.
>
> GroupCommitLogService writes some commitlog to the disk at once.
> The patch adds GroupCommitLogService (It is enabled by setting
> `commitlog_sync` and `commitlog_sync_group_window_in_ms` in
> cassandra.yaml).
> The difference from Batch is just only waiting for the semaphore.
> By waiting for the semaphore, some writes for commit logs are executed at
> the same time.
> In GroupCommitLogService, the latency becomes worse if the there is no
> concurrency.
>
> I measured the performance with my microbench (MicroRequestThread.java) by
> increasing the number of threads.The cluster has 3 nodes (Replication
> factor: 3). Each nodes is AWS EC2 m4.large instance + 200IOPS io1 volume.
> The result is as below. The GroupCommitLogService with 10ms window
> improved update with Paxos by 94% and improved select with Paxos by 76%.
>
>  SELECT / sec 
> # of threads Batch 2ms Group 10ms
> 1 192 103
> 2 163 212
> 4 264 416
> 8 454 800
> 16 744 1311
> 32 1151 1481
> 64 1767 1844
> 128 2949 3011
> 256 4723 5000
>
>  UPDATE / sec 
> # of threads Batch 2ms Group 10ms
> 1 45 26
> 2 39 51
> 4 58 102
> 8 102 198
> 16 167 213
> 32 289 295
> 64 544 548
> 128 1046 1058
> 256 2020 2061
>
>
> Thanks,
> Yuji
>
>
>
>


Proposal - GroupCommitLogService

2017-05-13 Thread Yuji Ito
Hi dev,

I propose a new CommitLogService, GroupCommitLogService, to improve the
throughput when lots of requests are received.
It improved the throughput by up to 94%.
I'd like to discuss this CommitLogService.

Currently, we can select one of 2 CommitLog services: Periodic and Batch.
In Periodic, we might lose commit log entries that haven't been written to
the disk yet.
In Batch, the commit log is written to the disk on every write, but each
commit log write is very small (< 4KB). Under high concurrency, these
writes are gathered and persisted to the disk at once. Under insufficient
concurrency, however, many small writes are issued and the performance
decreases due to disk latency. Even with an SSD, processing many I/O
commands decreases the performance.

GroupCommitLogService writes several commitlog entries to the disk at once.
The patch adds GroupCommitLogService (it is enabled by setting
`commitlog_sync` and `commitlog_sync_group_window_in_ms` in cassandra.yaml).
The only difference from Batch is that it waits for the semaphore.
By waiting for the semaphore, several commit log writes are executed at
the same time.
In GroupCommitLogService, the latency becomes worse if there is no
concurrency.
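
The effect of waiting can be illustrated with a toy model that only counts
disk syncs. This is a simplified sketch, not the code from the patch: the
function names, the arrival pattern, and the `sync_ms` parameter (the
assumed time one disk sync takes) are all made up for this illustration.

```python
def group_sync_count(arrivals_ms, window_ms):
    """Group model: one sync per window tick that has at least one pending write."""
    return len({t // window_ms for t in arrivals_ms})


def batch_sync_count(arrivals_ms, sync_ms):
    """Batch model: a sync starts as soon as a write is pending and the disk is idle."""
    arrivals = sorted(arrivals_ms)
    syncs = 0
    disk_free_at = 0
    i = 0
    while i < len(arrivals):
        start = max(arrivals[i], disk_free_at)
        # Every write that has arrived by the time the sync starts joins it.
        while i < len(arrivals) and arrivals[i] <= start:
            i += 1
        syncs += 1
        disk_free_at = start + sync_ms
    return syncs


# Hypothetical workload: one write per millisecond for 40 ms.
arrivals = list(range(40))
batch_fast_disk = batch_sync_count(arrivals, sync_ms=0.5)
batch_slow_disk = batch_sync_count(arrivals, sync_ms=5)
grouped = group_sync_count(arrivals, window_ms=10)

if __name__ == "__main__":
    print("Batch, 0.5 ms sync:", batch_fast_disk, "syncs")
    print("Batch, 5 ms sync:  ", batch_slow_disk, "syncs")
    print("Group, 10 ms window:", grouped, "syncs")
```

In this model Batch only gathers writes that arrive while a sync is already
in progress, so the number of small I/O commands depends on how long a sync
takes, while Group always issues at most one sync per window.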

I measured the performance with my microbench (MicroRequestThread.java)
while increasing the number of threads. The cluster has 3 nodes
(replication factor: 3). Each node is an AWS EC2 m4.large instance with a
200 IOPS io1 volume.
The results are below. GroupCommitLogService with a 10 ms window improved
update with Paxos by up to 94% and select with Paxos by up to 76%.

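SELECT / sec
| # of threads | Batch 2ms | Group 10ms |
|            1 |       192 |        103 |
|            2 |       163 |        212 |
|            4 |       264 |        416 |
|            8 |       454 |        800 |
|           16 |       744 |       1311 |
|           32 |      1151 |       1481 |
|           64 |      1767 |       1844 |
|          128 |      2949 |       3011 |
|          256 |      4723 |       5000 |

UPDATE / sec
| # of threads | Batch 2ms | Group 10ms |
|            1 |        45 |         26 |
|            2 |        39 |         51 |
|            4 |        58 |        102 |
|            8 |       102 |        198 |
|           16 |       167 |        213 |
|           32 |       289 |        295 |
|           64 |       544 |        548 |
|          128 |      1046 |       1058 |
|          256 |      2020 |       2061 |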
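
The 94% and 76% figures are the best-case rows of these tables. A short
sketch for re-deriving them follows; the numbers are copied from the tables
and the helper names (`improvement_percent`, `best_improvement`) are made
up for this example.

```python
# threads -> (Batch 2 ms, Group 10 ms), operations per second,
# copied from the SELECT and UPDATE tables above.
SELECT = {
    1: (192, 103), 2: (163, 212), 4: (264, 416),
    8: (454, 800), 16: (744, 1311), 32: (1151, 1481),
    64: (1767, 1844), 128: (2949, 3011), 256: (4723, 5000),
}
UPDATE = {
    1: (45, 26), 2: (39, 51), 4: (58, 102),
    8: (102, 198), 16: (167, 213), 32: (289, 295),
    64: (544, 548), 128: (1046, 1058), 256: (2020, 2061),
}


def improvement_percent(batch, group):
    """Throughput change of Group relative to Batch, in percent."""
    return (group - batch) / batch * 100.0


def best_improvement(table):
    """Largest improvement over all thread counts in the table."""
    return max(improvement_percent(b, g) for b, g in table.values())


if __name__ == "__main__":
    print(f"SELECT best case: {best_improvement(SELECT):.0f}%")
    print(f"UPDATE best case: {best_improvement(UPDATE):.0f}%")
    print(f"SELECT, 1 thread: {improvement_percent(*SELECT[1]):.0f}%")
```

The single-thread rows come out negative, which matches the statement that
latency and throughput get worse when there is no concurrency.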


Thanks,
Yuji
