[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-26 Thread Alex Petrov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624364#comment-17624364
 ] 

Alex Petrov commented on CASSANDRA-17240:
-

{quote}I don't think that attributing desires with expressions such as "you 
personally or members of your team" helps us in anything but creating conflict.
{quote}
Re-reading it now, my wording might have been suboptimal, so let me try to 
rephrase. What I meant was Harry adoption might have been seen as unnecessary, 
or there might be something that was preventing its adoption. There was no 
attempt of attributing desires, rather the opposite - stating there is no 
(visible) inclination for adoption. This was an unnecessary assumption on my 
part. Regardless, there definitely was no bad intention in what I was 
attempting to convey, on contrary - I've offered my help with Harry tests 
previously, and have repeated it in the last paragraph.
{quote}Are those tests publicly available?
{quote}
Since some of them are Transactional-Metadata specific, I haven't posted them 
just yet. I am working to make them available on trunk, which requires some 
minor changes to them, alongside with a simple, two-command stress-like tool 
with validation abilities.
{quote}However, eight months have passed and I can't find a single class 
extending that FuzzTestBase
{quote}
Since I was mostly working on Transactional Metadata all that time, my 
intention for pushing 16262 out was to help folks working on SAI to adopt it, 
as it was discussed in cassandra-sai slack channel. But most of the actual 
fuzz-testing was as simple as creating clusters and running Harry with 
different schemas and workloads. 
{quote}CASSANDRA-16262 was meant to add fuzz testing for coordination and 
replication. We had it as a blocker for 4.0 for some time, but we finally 
released without it.
{quote}
The code that got merged into Cassandra tree was intended to enable people to 
write new tests, such as bootstrap/decom, and others, in-tree. Fuzz testing 
itself was done by running Harry with different configurations against 
Cassandra clusters. Even though most of these tests did not run on Apache 
infrastructure, all issues found by it were published, and stability of 4.0 
can, in part, be attributed to it, since several issues in the storage engine 
might have not been triggered without it, or would've been harder to find. Even 
[Scylladb|https://github.com/apache/cassandra-harry/blob/trunk/scylla-usage.md] 
folks are using Harry for validation of foundational functionality. So saying 
that we have released 4.0 without it is a bit unfair.
{quote}Marking any new features as experimental until they are tested with 
Harry is mostly equivalent to force people to use it, isn't it?
{quote}
This heavily depends on the feature. With SAI - we would have to write several 
new models. I can certainly help to write them; we can collaborate on what's to 
be tested, and we find the best way to model SAI, which is not that hard. With 
features like memtables - again, I'd say only having a bake test and lengthy 
read/write workload would give us quite a bit of confidence already. 

Besides, I'm not saying it absolutely has to be Harry, I did mention 
"equivalent rigour" in my previous message: it could be any property-based 
model-supported integration testing tool, that tests the feature not in 
isolation, but tests database behaviour with this feature assumed. Since Harry 
is already available, I just think it makes sense to use it.
{quote}Maybe I'm missing some public, community-owned repo containing a 
gazillion tests using Harry.
{quote}
Thing is, using Harry for testing is really as easy as calling 
{{visitor.visit();}} in the loop, followed by {{model.validate();}} with any 
additional calls such as streaming, etc, you would like to do, in-between. And 
this test is, in itself, a gazillion tests, since {{validate}} tests paging, 
single partition reads, reverse reads, slices, ranges, and so on, while 
{{visit}} tests partition deletions, range tombstones, etc. 

In case of trie-based memtables, I think we just need to run bake tests for 
several hundred (or more) cluster-hours (which is of course parallelizable), 
and make sure we trigger conditions such as sstable/memtable merges, range 
tombstones, partition deletions, etc.

In order to test SAI we will actually require some new models, and same with 
transactions and transactional metadata, but for features as foundational as 
memtables you don't even need to write any new code.
{quote}Those wanting to use them just used it, and that led others by example. 
I'd suggest the same approach for Harry.
{quote}
Fair enough. My impression was that allowing people to _use_ the code would be 
something that would enable them to test it, but you're right maybe it didn't 
go far enough. I've put together a simple bake-tests for trie-based (or any 
other type of) 

[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-26 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624340#comment-17624340
 ] 

Andres de la Peña commented on CASSANDRA-17240:
---

{quote}I realise there might be no desire to adopt Harry by either you 
personally or by members of your team
{quote}
I have never said that. I'd like to use Harry, but that and marking any new 
features as experimental just because they haven't been tested with Harry are 
different things. As for members of my team, here we are all individuals 
forming the Cassandra community. We all try to achieve the same goal. I don't 
think that attributing desires with expressions such as "you personally or 
members of your team" helps us in anything but creating conflict.
{quote}I think with 4.0 release, we've been able to prove that fuzz testing is 
the only way to build confidence, since smaller and targeted use-cases often 
exercise only a fraction of our code.
{quote}
{quote}I'm using it for my development all the time, and it is proving itself 
useful every day.
{quote}
Are those tests publicly available?

CASSANDRA-16262 was meant to add fuzz testing for coordination and replication. 
We had it as a blocker for 4.0 for some time, but we finally released without 
it. That ticket was closed in February but we never got any new tests. Instead, 
the ticket added some tooling to use Harry with JVM dtests. Specifically, it 
added a new {{FuzzTestBase}} class that I understand should be extended by 
dtests using Harry. However, eight months have passed and I can't find a single 
class extending that {{{}FuzzTestBase{}}}. So far it seems dead code that we 
still have to maintain. For example, this very same patch has to make some 
(trivial) changes on {{{}SSTableGenerator{}}}. Is there some test extending 
{{FuzzTestBase}} ?

Maybe I'm missing some public, community-owned repo containing a gazillion 
tests using Harry. But if we don't have that, nor a single class extending 
{{{}FuzzTestBase{}}}, I'm inclined to think that Harry-based testing is itself 
experimental. Hence my reluctance to making it mandatory or blocking things on 
it.
{quote}I would not go as far as "forcing" anyone to use Harry.
{quote}
Marking any new features as experimental until they are tested with Harry is 
mostly equivalent to force people to use it, isn't it?
{quote}Similarly, we do have a strong preference (even though not a general 
rule) for in-jvm dtests over python dtests.
{quote}
In-JVM dtests are a good example of how to introduce a new testing tool. They 
have never been mandatory, but they have made their way on their own merits and 
people have gradually adopted them. The suite hasn't stopped growing since its 
introduction, and no enforcing policy has been needed. Those wanting to use 
them just used it, and that led others by example. I'd suggest the same 
approach for Harry.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Fix For: 4.2
>
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-26 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624308#comment-17624308
 ] 

Branimir Lambov commented on CASSANDRA-17240:
-

{quote}I strongly feel that we shouldn't depend on the config if we are adding 
a table param, the table param should be self contained similar to 
compression/compaction.
{quote}
The initial API proposal matches what you describe. There were valid objections 
against this very proposal during the CEP-19 discussion which were taken into 
account in CASSANDRA-17034. We want to be able to adjust configurations per 
node (for anything from gradual switching through fixing a broken node to 
setting memory allocation sizes), and we want per-node to always override the 
DDL setting. Keep in mind that this configuration is also expected to 
eventually subsume the legacy memtable settings (e.g. allocation type, size 
thresholds, forced flush frequency). Configuration complexity isn't something 
we want to ignore either — complex mechanisms of specifying what settings to 
override are not a good idea.

After the initial testing period I would expect a pair of configurations (e.g. 
{{trie}} and {{{}skiplist{}}}) to be supplied with the default YAML, which most 
users will simply reference, knowing that they are always present, and having 
the ability to modify them on problem nodes, should this need ever arise. DB 
infrastructure teams can set up additional configurations if they desire so; 
users cannot, and arguably should not, use anything that is not prescribed by 
them.
{quote}we could simplify by making it strongly typed
{quote}
I erred on the side of adhering to the existing configuration schemes (e.g. 
commitlog compression). There were also some practical issues with the 
unstructured option, such as reserved keywords and the YAML parser being unable 
to honour the value type of a nested map.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Fix For: 4.2
>
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-25 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17623988#comment-17623988
 ] 

David Capwell commented on CASSANDRA-17240:
---

Been talking in Slack, moving context here.

The config side of this patch is problematic, the table param depends on a 
local yaml being consistent (at least the name) cross the cluster, but our 
schema logic doesn't actually allow this which then causes a schema mismatch 
cross the cluster...  This will also cause the conflicting nodes to fail to 
re-boot (guess a slight positive?)...

I strongly feel that we shouldn't depend on the config if we are adding a table 
param, the table param should be self contained similar to 
compression/compaction; we should do something like this

{code}
WITH memtable = {'type': 'trie', 'shards': 42}
{code}

Now, this also gets to the question, what do you do if a node doesn't 
understand the type?  Simple example is rolling upgrades picking up a new 
implementation or custom one... in this case we should revert back to the 
default (coordinator should attempt to validate and reject unknown, rest should 
fall back).

Second comment is the yaml is very dense and hard to use, we could simplify by 
making it strongly typed; we should move away from

{code}
memtable:
  configurations:
node1:
  class_name: TrieMemtable
  parameters:
shards: 42
{code}

To 

{code}
memtable:
  type: TrieMemtable # or "trie", cool with alias or explicit
  shards: 42
{code}

If we want to override at the table level we could also add a

{code}
memtable_table_overrides:
  ks.table: # or how/ever we wish to call out the name
type: trie
shards: 4
{code}

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Fix For: 4.2
>
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-25 Thread Alex Petrov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17623987#comment-17623987
 ] 

Alex Petrov commented on CASSANDRA-17240:
-

> I don't think that we should default to consider experimental everything that 
> hasn't been tested with Harry, nor consider production-ready everything that 
> has been tested with it

[~adelapena] I agree about the second part (ie not considering something 
production ready if it has been tested with Harry), but former I tend to 
disagree with. I realise there might be no desire to adopt Harry by either you 
personally or by members of your team, but Harry has proven itself to find bugs 
other testing approaches do not, and it models Cassandra behaviour in a way 
that can reasonably only be matched by handcrafting a very large number of test 
cases, which is of course impossible. Unless you can provide an example of an 
integration testing tool that can match Harry in an ability to validate a 
feature, I'll be happy to check it out and compare approaches. 

If I'm not mistaken, there is no publicly available tooling that was used to 
validate _correctness_ of Cassandra as a whole with trie memtables enabled. I 
realize there might be some internal tooling you might have used for 
verification, but unless this tooling is available for others to build 
confidence and reproduce results, I'm afraid this is not enough, either.

Lastly, we had a discussion with [~blambov] during ApacheCon, and he has also 
agreed that we should not make this feature GA until we've exhaustively tested 
it with Harry.

 

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Fix For: 4.2
>
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-24 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17623191#comment-17623191
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17240:
-

Thanks! 

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Fix For: 4.2
>
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-24 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17623160#comment-17623160
 ] 

Andres de la Peña commented on CASSANDRA-17240:
---

Just created CASSANDRA-17989 for adding tries to Jenkins, and CASSANDRA-17987 
for running specialised unit tests with Java 11 on CircleCI.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Fix For: 4.2
>
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-23 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17622781#comment-17622781
 ] 

Josh McKenzie commented on CASSANDRA-17240:
---

bq. I don't think that we should default to consider experimental everything 
that hasn't been tested with Harry

bq. flag new features *_beyond a certain scope_*
Emphasis added. New features that are very large and have correctness / data 
loss implications, to be more specific. The reason I bring this up here is that 
I recall hearing that Branimir and Caleb were discussing the value in Harry 
testing the Trie memtables.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Fix For: 4.2
>
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-22 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17622688#comment-17622688
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17240:
-

My understanding is we need to be at least able to run all jobs in both 
CircleCI and Jenkins and all JDKs. Now whether mandatory or not I think it is a 
second question and of course important one.

I'd say at least adding them as opportunity to be able to run them in CircleCI 
J11 is a thing now. The variations of j8 unit tests are also not mandatory. The 
work in this ticket is quite big and adding j8 jobs here as a start is ok IMHO. 
I just think we need to ensure there is a follow up ticket to add those also in 
Jenkins (thanks for pointing it, I missed that) and also to JDK11. This on its 
own is also already a work that requires its time and effort too. I can totally 
see that. I'd suggest probably until we have those tests in Jenkins, to make 
them mandatory in CircleCI, same as we did with others like _system_keyspace_ 
and the _simulator_ ones so we do not see regressions in time (we did see it 
with those two mentioned suites and that was the reason we made them mandatory 
in CircleCI until they get added in Jenkins)

 

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Fix For: 4.2
>
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17622675#comment-17622675
 ] 

Andres de la Peña commented on CASSANDRA-17240:
---

{quote}I have a question, why utests_trie was not added also for JDK11? Is it 
because the other variations of the unit tests like compression, etc are also 
not added or anything else?

If it is the first, then we need to add them. There is  ticket CASSANDRA-17930 
and initiative that we need to add all missing tests to all workflows to bring 
CircleCI on par with Jenkins. This will give us the opportunity to be able to 
release based on CircleCI until we fix some of the infra issues in Jenkins
{quote}
Do we have an agreement on whether we should run specialized unit tests 
(compression, trie, etc.) with all Java versions, and whether those jobs should 
those jobs be mandatory? By mandatory I mean automatically started on the 
pre-commit workflow. There is also the ongoing discussion about renaming jobs. 

Adding new jobs for j11 and making the jobs mandatory on Circle is mostly 
trivial. Perhaps it would be a good idea to modify all the jobs for  
specialized unit tests on a single ticket once we have a clear agreement of 
what jobs we want on CircleCI. If we do all of them at once we can have a 
better overall vision and consistency, wdyt?

As for being able to release based on CircleCI, the new trie-based 
configuration for utests is not yet available on Jenkins. So in this case 
Circle doesn't test less things that Jenkins, but more. Of course we should add 
this type of config to Jenkins for having parity.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Fix For: 4.2
>
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-22 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17622667#comment-17622667
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17240:
-

Harry is great and we should definitely utilize it and encourage people to get 
advantage of it but I second[~adelapena] here regarding experimental flags and 
everything else he just said.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Fix For: 4.2
>
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17622665#comment-17622665
 ] 

Andres de la Peña commented on CASSANDRA-17240:
---

bq. To plant a seed (and maybe this is more appropriate to take to the dev ML 
shortly) - would it make sense to flag new features beyond a certain scope as 
experimental until we do some Harry / fuzz testing on them?

I'd say that what determines if a feature is experimental or not is whether it 
has been properly tested in general or, even better, seen some real usage. I 
don't think that we should default to consider experimental everything that 
hasn't been tested with Harry, nor consider production-ready everything that 
has been tested with it. Unless we are trying to force developers to use Harry 
if they want to move their features out of the experimental status.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Fix For: 4.2
>
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-22 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17622656#comment-17622656
 ] 

Josh McKenzie commented on CASSANDRA-17240:
---

bq.  the only larger question I have is whether we want to leverage the 
existing capabilities of Harry to fuzz test this at a high level w/ TieMemtable
To plant a seed (and maybe this is more appropriate to take to the dev ML 
shortly) - would it make sense to flag new features beyond a certain scope as 
experimental until we do some Harry / fuzz testing on them?

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Fix For: 4.2
>
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-21 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17622505#comment-17622505
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17240:
-

Congrats on completion of this work, I know it was a huge effort!

I have a question, why utests_trie was not added also for JDK11? Is it because 
the other variations of the unit tests like compression, etc are also not added 
or anything else?

If it is the first, then we need to add them. There is  ticket and initiative 
we need to add all missing tests to all workflows to bring CircleCI on par with 
Jenkins. 

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Fix For: 4.2
>
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-21 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17622180#comment-17622180
 ] 

Branimir Lambov commented on CASSANDRA-17240:
-

Committed as 
[49e0c61107005b1a83799f7f1e6c0a855d159c29|https://github.com/apache/cassandra/commit/49e0c61107005b1a83799f7f1e6c0a855d159c29],
 
[30641ea7b6b8253651562aeb0102778a0f9a405b|https://github.com/apache/cassandra/commit/30641ea7b6b8253651562aeb0102778a0f9a405b],
 
[562cb26010659830dd1192939ac815a0f6cb3502|https://github.com/apache/cassandra/commit/562cb26010659830dd1192939ac815a0f6cb3502],
 
[7c55c73825e341315e520381968338d57afbb67a|https://github.com/apache/cassandra/commit/7c55c73825e341315e520381968338d57afbb67a]
 and 
[9074ee7ef8e041e1b15116373be0df80b985e3d9|https://github.com/apache/cassandra/commit/9074ee7ef8e041e1b15116373be0df80b985e3d9].


> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-20 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621201#comment-17621201
 ] 

Caleb Rackliffe commented on CASSANDRA-17240:
-

bq. Is the answer above not satisfactory?

Apologies. I didn't see the second part of the comment above w/ your answer.

+1 on the {{InMemoryTrie}} rename. Thanks!

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-20 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17621038#comment-17621038
 ] 

Andres de la Peña commented on CASSANDRA-17240:
---

Here is a CI round for the branch with the rename and missed CirceCI changes:
||Branch||CI||
|[trunk|https://github.com/apache/cassandra/compare/trunk...adelapena:cassandra:17240-trunk-review-feedback]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2300/workflows/3f1de681-281a-4723-9715-77615a4fd4e1]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2300/workflows/0f2ce764-1da1-4a9f-9c26-152ad4da8525]|

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-20 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620982#comment-17620982
 ] 

Andres de la Peña commented on CASSANDRA-17240:
---

It seems there are a couple of missing details in the CircleCI:

* In {{.circleci/config-2_1.yml}}, adding the new {{test-trie}} Ant target to 
the {{run_repeated_utests}} command.
* In {{.circleci/generate.sh}}, removing the {{utests_trie_repeat}} job if 
there aren't utests to be repeated.

[This 
commit|https://github.com/adelapena/cassandra/commit/f23f64db3c2818bd3c2e693c4c67802f9dd7a195]
 contains fixes for those two things, and the regenerated default config.

The {{s/MemtableTrie/InMemoryTrie/}} rename looks good to me, indeed it was a 
bit tricky to distinguish between {{MemtableTrie}} and {{TrieMemtable}}.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-20 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620946#comment-17620946
 ] 

Branimir Lambov commented on CASSANDRA-17240:
-

Pull request renaming {{MemtableTrie}} to {{{}InMemoryTrie{}}}: 
[https://github.com/apache/cassandra/pull/1935]

Only the last commit is different from what will be merged with the main pull 
request.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-20 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620908#comment-17620908
 ] 

Branimir Lambov commented on CASSANDRA-17240:
-

Is the answer above not satisfactory?

The additional counted bytes are at most a {{ByteBuffer}} (32 or 48 bytes) per 
partition plus a reference (4 bytes) per column in the partition. Given that 
the total per-partition size is in the high hundreds of bytes, and that the 
cell instance and reference are at least 32 + 4, the increase in calculated 
size in the worst case (hundreds of columns in a key-value table) cannot be 
over 1/9 (assuming a compressed-pointer JVM), and will be much lower in 
practice.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-19 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620588#comment-17620588
 ] 

Caleb Rackliffe commented on CASSANDRA-17240:
-

I think my only obstacle to a final +1 would be my [question 
above|https://issues.apache.org/jira/browse/CASSANDRA-17240?focusedCommentId=17612376=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17612376]
 around flush frequency.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-19 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620586#comment-17620586
 ] 

Caleb Rackliffe commented on CASSANDRA-17240:
-

Tests also look pretty good, and the [failing 
bits|https://app.circleci.com/pipelines/github/blambov/cassandra/265/workflows/15104f84-9e13-4c5e-9310-6d567483e8c1/jobs/1212/tests]
 look environment-related.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-19 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620583#comment-17620583
 ] 

Caleb Rackliffe commented on CASSANDRA-17240:
-

NEWS.txt, CHANGES.txt and docs LGTM

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-19 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17620482#comment-17620482
 ] 

Branimir Lambov commented on CASSANDRA-17240:
-

The code has been rebased and squashed into 4 commits. Tests are running 
[here|[https://app.circleci.com/pipelines/github/blambov/cassandra/265/workflows/15104f84-9e13-4c5e-9310-6d567483e8c1].]

Can you please take a quick look at the last commit, with added NEWS.txt, 
CHANGES.txt and docs?

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17619706#comment-17619706
 ] 

Andres de la Peña commented on CASSANDRA-17240:
---

{quote}The flakes in the memtable size test are now fixed. [~adelapena], you 
can rerun the above if you wish; I have already run both flaky ones, 
{{MemtableSizeOffheapObjectsTest}} and {{{}MemtableSizeOffheapBuffersTest{}}}, 
thousands of times.
{quote}
CASSANDRA-17939 has just been merged. It will create conflicts with the new job 
for tries that this patch adds. The changes on CircleCI config should be 
adapted to also add an associated job for repeatedly run any new, modified or 
manually specify unit tests with memtable tries. [The previously mentioned 
commit|https://github.com/adelapena/cassandra/commit/23fe8743f185ad1b93cce37383d6d4807ea14e36]
 does exactly that. It was meant to help with the rebase pains.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-18 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17619692#comment-17619692
 ] 

Branimir Lambov commented on CASSANDRA-17240:
-

{quote}It has been recently agreed that new or modified tests should be run 500 
times before commit...
{quote}
The flakes in the memtable size test are now fixed. [~adelapena], you can rerun 
the above if you wish; I have already run both flaky ones, 
{{MemtableSizeOffheapObjectsTest}} and {{{}MemtableSizeOffheapBuffersTest{}}}, 
thousands of times.
{quote}It seems like there are things that we're now ignoring (Owner, 
MemtableAllocator, and initial comparator in AbstractAllocatorMemtable; empty 
clusterings; and the measureDeepOmitShared()) and would push flush frequency 
down a bit, but then the changes in Columns potentially pushing flush frequency 
up. Do we have an idea of where that nets out?
{quote}
The {{Unmetered}} annotations only really affect the calculation the memtable 
size test does. The flush frequency can increase due to the columns objects, 
possibly by up to 10% in key-value workloads with a small amount of data per 
column.

I will leave the memtable precision parts in a separate commit so that, if we 
are unhappy with the effect, they can be independently rolled back and held off 
until a major release.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-17 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17619064#comment-17619064
 ] 

Caleb Rackliffe commented on CASSANDRA-17240:
-

[~blambov] The latest post-review-feedback commits LGTM. Thanks!

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-17 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17618953#comment-17618953
 ] 

Andres de la Peña commented on CASSANDRA-17240:
---

It has been [recently 
agreed|https://www.mail-archive.com/dev@cassandra.apache.org/msg19579.html] 
that new or modified tests should be run 500 times before commit. The current 
test multiplexer only allows to run tests one by one, which is a pain for the 
29 tests that this patch touches.

CASSANDRA-17939 should allow to automatically detect and repeatedly run new and 
modified tests, so we don't have to do 29 pushes with different CircleCI 
configs. That ticket is almost ready but not yet merged, so I have cherry 
picked it on top of this patch ([this 
commit|https://github.com/adelapena/cassandra/commit/23fe8743f185ad1b93cce37383d6d4807ea14e36])
 and [run it with 
MIDRES|https://github.com/adelapena/cassandra/commit/dc9e28e7026a08501336a90ebf31f0ae15ca59f8]
 for j8: 
[https://app.circleci.com/pipelines/github/adelapena/cassandra/2284/workflows/3d25a156-34e1-4abf-a787-de57eef175f9]

That run uses 500 iterations per job setup, so 500 iterations for j8, 500 for 
j11, 500 for {{{}utests_trie{}}}, 500 for {{{}utests_compression{}}}, etc. 
Given the high number of tests, probably we can save some resources if next 
time we give it just 100 runs to each setup and assume that all together they 
sum up the required 500 iterations:
{code:java}
.circleci/generate.sh -m \
  -e REPEATED_TESTS_STOP_ON_FAILURE=true \
  -e REPEATED_UTESTS_COUNT=100
{code}

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 12h 50m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-10-03 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17612376#comment-17612376
 ] 

Caleb Rackliffe commented on CASSANDRA-17240:
-

Ah, and I almost forgot...

Prepended to the actual trie stuff is a commit that changes some of the 
memtable heap space accounting. It seems like there are things that we're now 
ignoring ({{Owner}}, {{MemtableAllocator}}, and initial comparator in 
{{AbstractAllocatorMemtable}}; empty clusterings; and the 
{{measureDeepOmitShared()}}) and would push flush frequency down a bit, but 
then the changes in {{Columns}} potentially pushing flush frequency up. Do we 
have an idea of where that nets out?

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-09-29 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17611278#comment-17611278
 ] 

Caleb Rackliffe commented on CASSANDRA-17240:
-

Apologies for the delay. I've just finished my review. My main concerns were, 
as you would expect, looking at test coverage, making sure the functionality 
we've added is disabled by default, and making sure there isn't too much risk 
to the existing Memtable infrastructure. I've left a raft of comments inline in 
the PR, none of which are particularly alarming.

I've had a brief conversation w/ [~ifesdjeen] about this, and I anticipate he 
might be able to sync up w/ [~blambov] at ApacheCon net week, but the only 
larger question I have is whether we want to leverage the existing capabilities 
of Harry to fuzz test this at a high level w/ {{TieMemtable}}. (I imagine that 
might be a better tool to de-risk things like merging Memtable and SSTable 
contents for client queries, RR, etc.)

+1 Otherwise

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-09-25 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17609228#comment-17609228
 ] 

Andres de la Peña commented on CASSANDRA-17240:
---

Sure, there are some pending test issues and we'll probably be OOO for a couple 
of days next week, so there's still time if you need to oversee this.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-09-23 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17608896#comment-17608896
 ] 

Caleb Rackliffe commented on CASSANDRA-17240:
-

[~adelapena] Do you mind giving me until the end of next week to do a quick 
review?

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-09-23 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17608685#comment-17608685
 ] 

Andres de la Peña commented on CASSANDRA-17240:
---

There are some CI results so far:

||PR||CI||
|​[trunk|https://github.com/apache/cassandra/pull/1723]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2088/workflows/3394ef04-27f0-410f-90d6-2de41e904401]
 
[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/2088/workflows/64a33d2f-b5e5-4245-9216-b435fa77b159]​|

There are still a bunch of new test failures, especially in JVM upgrade tests. 
A rebase could help us to avoid hitting the failures that have been recently 
fixed.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-09-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17603181#comment-17603181
 ] 

Andres de la Peña commented on CASSANDRA-17240:
---

[~blambov] I'm starting to review this, sorry for the delay. Do we have CI runs 
for it? Also, would it make sense to have new CI jobs running any of the suites 
(utest/dtests) using trie memtables?

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-07-12 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565895#comment-17565895
 ] 

Branimir Lambov commented on CASSANDRA-17240:
-

The branch is already rebased on top of the CASSANDRA-6936 commit.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-07-11 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565179#comment-17565179
 ] 

Caleb Rackliffe commented on CASSANDRA-17240:
-

Minor note, but with CASSANDRA-6936 now merged to trunk, this could be 
rebased/simplified.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png, 
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-02-10 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17490045#comment-17490045
 ] 

Branimir Lambov commented on CASSANDRA-17240:
-

The test also includes latency stages. Because of the incomplete compaction 
this patch's latency results (which can be found in 
[^density_test_with_sharding.html.gz]) can be misleading, thus I'm only showing 
details and two graphs from the DataStax branch (full data in 
[^density_SG.html.gz]):
{code:java}
 SG-density-1TB-100byte 9fe6f629 2021-12-15 16:44:11 Note: 
Trie/big/UCS-30threads-F16...latency_1_1:latency_1_1.result-success
---
Total Operations   : 49994
Op Rate: 107119 op/sec
Min Latency: 0.143 ms
Avg Latency: 1.930 ms
Median Latency : 0.996 ms
95th Latency   : 4.281 ms
99th Latency   : 10.438 ms
99.9th Latency : 227.140 ms
Max Latency: 524.632 ms

 SG-density-1TB-100byte b90ea1ad 2021-12-15 16:15:56 Note: 
SkipList/big/UCS-30threads-F16...latency_1_1.result-success
---
Total Operations   : 49994
Op Rate: 105537 op/sec
Min Latency: 0.152 ms
Avg Latency: 2.738 ms
Median Latency : 1.370 ms
95th Latency   : 6.302 ms
99th Latency   : 14.265 ms
99.9th Latency : 293.650 ms
Max Latency: 567.837 ms
{code}
!latency-1_1-95.png!
{code:java}
 SG-density-1TB-100byte 9fe6f629 2021-12-15 16:44:11 Note: 
Trie/big/UCS-30threads-F16...latency_9_1:latency_9_1.result-success
---
Total Operations   : 49940
Op Rate: 107616 op/sec
Min Latency: 0.138 ms
Avg Latency: 1.430 ms
Median Latency : 0.715 ms
95th Latency   : 2.224 ms
99th Latency   : 7.820 ms
99.9th Latency : 270.483 ms
Max Latency: 515.391 ms

 SG-density-1TB-100byte b90ea1ad 2021-12-15 16:15:56 Note: 
SkipList/big/UCS-30threads-F16...latency_9_1:latency_9_1.result-success
---
Total Operations   : 49940
Op Rate: 105789 op/sec
Min Latency: 0.148 ms
Avg Latency: 2.399 ms
Median Latency : 1.007 ms
95th Latency   : 5.244 ms
99th Latency   : 11.947 ms
99.9th Latency : 328.532 ms
Max Latency: 639.730 ms
{code}
!latency-9_1-95.png!
{code:java}
 SG-density-1TB-100byte 9fe6f629 2021-12-15 16:44:11 Note: 
Trie/big/UCS-30threads-F16...throughput:throughput.result-success
---
Total Operations   : 10
Op Rate: 245111 op/sec
Min Latency: 0.229 ms
Avg Latency: 8.854 ms
Median Latency : 7.139 ms
95th Latency   : 17.499 ms
99th Latency   : 28.672 ms
99.9th Latency : 274.366 ms
Max Latency: 1187.840 ms
 
 SG-density-1TB-100byte b90ea1ad 2021-12-15 16:15:56 Note: 
SkipList/big/UCS-30threads-F16...throughput:throughput.result-success
---
Total Operations   : 997880
Op Rate: 107947 op/sec
Min Latency: 0.372 ms
Avg Latency: 20.273 ms
Median Latency : 16.723 ms
95th Latency   : 31.178 ms
99th Latency   : 105.595 ms
99.9th Latency : 337.592 ms
Max Latency: 18652.070 ms
{code}
Additionally, below are some JMH microbenchmarks of performing database reads 
off the memtable:
{code:java}
SkipListMemtable in heap_buffers mode: 1000 ops written in 89.949s, 
5.700GiB (97%) on-heap, 0.000KiB (0%) off-heap
TrieMemtable in heap_buffers mode: 1000 ops written in 35.515s, 
4.568GiB (78%) on-heap, 0.000KiB (0%) off-heap

Benchmark   (BATCH)   (count)  (flush)   
(memtableClass)  (threadCount)  (useNet)  Mode  Cnt  Score   Error  Units
ReadTestSmallPartitions.readFixed  1000  1000INMEM  
SkipListMemtable  1 false  avgt   10  3.610 ± 0.046  ms/op
ReadTestSmallPartitions.readFixed  1000  1000INMEM  
TrieMemtable  1 false  avgt   10  3.625 ± 0.061  ms/op
ReadTestSmallPartitions.readOutside1000  1000INMEM  
SkipListMemtable  1 false  avgt   10  2.754 ± 0.027  ms/op
ReadTestSmallPartitions.readOutside1000  1000INMEM  
TrieMemtable  1 false  avgt   10  2.306 ± 0.042  ms/op
ReadTestSmallPartitions.readRandomInside   1000  1000INMEM  
SkipListMemtable   

[jira] [Commented] (CASSANDRA-17240) CEP-19: Trie memtable implementation

2022-02-07 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17488240#comment-17488240
 ] 

Branimir Lambov commented on CASSANDRA-17240:
-

Attached some performance data comparing the new trie memtable with the legacy 
skip list one. The test we ran is a density test which runs a 90:10 write:read 
workload with 100-byte payloads to over 1TB of data on an {{i3.4xlarge}} 
instance with the following settings to remove some of the biggest throughput 
bottlenecks:
{code:java}
memtable_allocation_type: offheap_objects
memtable_flush_writers: 8
memtable_heap_space_in_mb: 16384
memtable_offheap_space_in_mb: 16384
concurrent_reads: 256
concurrent_writes: 256
commitlog_total_space_in_mb: 51200
commitlog_segment_size_in_mb: 320
commitlog_compression:
class_name: LZ4Compressor
disk_access_mode: mmap_index_only
file_cache_size_in_mb: 8192
compaction_throughput_mb_per_sec: 0
concurrent_compactors: 30
{code}
The test is meant to measure sustained throughput, and with the current C* code 
is quickly limited by the performance of compaction (compaction cannot keep up, 
sstables accumulate, and reads start dominating the time). The throughput stage 
graph looks like this:

!throughput_apache.png!

{{TrieMemtable}} (in red) starts off with double the performance of the legacy 
{{SkipListMemtable}} (in orange), and maintains a significant lead throughout 
the test. We have previously seen a significant improvement in throughput when 
memtables are sharded, thus we also tested two sharded variations of the skip 
list solution, with and without locking. Both versions lead over the unsharded 
skip-list, but are far from the performance of the new solution. (Note: the 
locking version (in green), which gives compaction threads more chances to run, 
meets its performance towards the end of the test when it is completely 
dominated by the effects of compaction.)

With improved and tuned compaction (using further improvements we intend to 
port C*), the trie memtable maintains ~2.3x better throughput:

!throughput_SG.png!

One interesting aspect of the comparison is the heap behavior, especially old 
generation sizes, during the throughput stage.

{{{}SkipListMemtable{}}}:
!SkipListMemtable-OSS.png! 
vs. {{{}TrieMemtable{}}}:
!TrieMemtable-OSS.png! 
The total garbage collection time through all stages of the test is more than 
halved.

Additionally, the new memtable is able to accept more data for the same memory 
allocation, which results in 30% bigger L0 sstables, reducing the number of 
sstables and the need for compaction and further improving performance.

> CEP-19: Trie memtable implementation
> 
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable
>Reporter: Branimir Lambov
>Priority: Normal
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png, 
> density_SG.html.gz, density_test_with_sharding.html.gz, throughput_SG.png, 
> throughput_apache.png
>
>
> Trie-based memtable implementation as described in CEP-19, built on top of 
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this 
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org