Re: [DISCUSS] CEP-10: Cluster and Code Simulations

Benjamin Lerer Tue, 13 Jul 2021 05:31:04 -0700

> In my limited view of the proposal, a major refactor of internal
> concurrency APIs to support the testing facility potentially risks the
> stability of a minor release, something we've been wanting to avoid with
> our focus on stability. So I'd prefer this to go in trunk/4.1, otherwise
> we will create a precedent for including non-bugfix changes in minor
> versions, something I think we should avoid.



I share Paulo's concern.

On Tue, 13 Jul 2021 at 14:21, Paulo Motta <pauloricard...@gmail.com> wrote:

> > No, in my opinion the target should be 4.0.x. We are reaching for a
> > shippable trunk and this has no public API impacts. This work is IMO
> > central to achieving a shippable trunk, either way. The only reason I do
> > not target 3.x is that it would be too burdensome.
>
> In my limited view of the proposal, a major refactor of internal
> concurrency APIs to support the testing facility potentially risks the
> stability of a minor release, something we've been wanting to avoid with
> our focus on stability. So I'd prefer this to go in trunk/4.1, otherwise
> we will create a precedent for including non-bugfix changes in minor
> versions, something I think we should avoid.
>
> In the past we've been lenient about including seemingly harmless internal
> changes that caused client impact and we should be careful to avoid this in
> the future. To prevent this I think we should take a strict approach and
> only accept bug fixes in minor (i.e. 4.0.x) versions moving forward.
>
> I'd go one step further and propose that any CEPs, which are generally
> about new features, major API changes or internal refactorings, should only
> be allowed in subsequent major versions, unless an explicit exception is
> granted.
>
> On Tue, 13 Jul 2021 at 07:11, bened...@apache.org <bened...@apache.org> wrote:
>
> > Perhaps it’s worth looking forward at the roadmap that we plan to develop,
> > and consider whether such a facility would be welcome for proving their
> > safety, and we can then worry about evolving the specifics of any API(s)
> > together as we deploy the capability? Looking ahead, there are very few
> > major features I wouldn’t want to see exercised with this approach, given
> > the choice.
> >
> > The LWT Verifier by itself is an integration test that covers many of the
> > affected subsystems, including sstables, memtables and repair. But we will
> > have the ability to introduce dedicated verification for each of these
> > features and systems, and we will necessarily produce more robust code
> > (repair is a great example of a brittle system that would be impossible to
> > produce with such an adversarial test system).
> >
> >
> > *Query side improvements:*
> >
> >   * Storage Attached Index or SAI. The CEP can be found at
> > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-7%3A+Storage+Attached+Index
> >   * Add support for OR predicates in the CQL where clause
> >   * Allow aggregation by time intervals (CASSANDRA-11871) and allow UDFs
> > in the GROUP BY clause
> >   * Ability to read the TTL and WRITE TIME of an element in a collection
> > (CASSANDRA-8877)
> >   * Multi-Partition LWTs
> >   * Materialized views hardening: Addressing the different Materialized
> > Views issues (see CASSANDRA-15921 and [1] for some of the work involved)
> >
> > *Security improvements:*
> >
> >   * SSTables encryption (CASSANDRA-9633)
> >   * Add support for Dynamic Data Masking (CEP pending)
> >   * Allow the creation of roles that have the ability to assign arbitrary
> > privileges, or scoped privileges without also granting those roles access
> > to database objects.
> >   * Filter rows from system and system_schema based on user permissions
> > (CASSANDRA-15871)
> >
> > *Performance improvements:*
> >
> >   * Trie-based index format (CEP pending)
> >   * Trie-based memtables (CEP pending)
> >   * Paxos improvements: Paxos / LWT implementation that would enable the
> > database to serve serial writes with two round-trips and serial reads with
> > one round-trip in the uncontended case
> >
> > *Safety/Usability improvements:*
> >
> >   * Guardrails. The CEP can be found at
> > https://cwiki.apache.org/confluence/display/CASSANDRA/%28DRAFT%29+-+CEP-3%3A+Guardrails
> >   * Add ability to track state in repair (CASSANDRA-15399)
> >   * Repair coordinator improvements (CASSANDRA-15399)
> >   * Make incremental backup configurable per keyspace and table
> > (CASSANDRA-15402)
> >   * Add ability to blacklist a CQL partition so all requests are ignored
> > (CASSANDRA-12106)
> >   * Add default and required keyspace replication options
> > (CASSANDRA-14557)
> >   * Transactional Cluster Metadata: Use of transactions to propagate
> > cluster metadata
> >   * Downgrade-ability: Ability to downgrade in the event that a serious
> > issue has been identified
> >
> > *Pluggability improvements:*
> >
> >   * Pluggable schema manager (CEP pending)
> >   * Pluggable filesystem (CEP pending)
> >   * Pluggable authenticator for CQLSH (CASSANDRA-16456). A CEP draft can be
> > found at
> > https://docs.google.com/document/d/1_G-OZCAEmDyuQuAN2wQUYUtZBEJpMkHWnkYELLhqvKc/edit
> >   * Memtable API (CEP pending). The goal being to allow improvements such
> > as CASSANDRA-13981 to be easily plugged into Cassandra
> >
> > *Memtable pluggable implementation:*
> >
> >   * Enable Cassandra for Persistent Memory (CASSANDRA-13981)
> >
> >
> >
> >
> > From: bened...@apache.org <bened...@apache.org>
> > Date: Tuesday, 13 July 2021 at 10:51
> > To: dev@cassandra.apache.org <dev@cassandra.apache.org>
> > Subject: Re: [DISCUSS] CEP-10: Cluster and Code Simulations
> > Ach, editing code in the email editor isn’t smart when editors all have
> > different meanings for key combinations (accidentally hit send), but you
> > get the idea. The simulator would intercept these thread executions and
> > the memory accesses for the annotated field, and evaluate them so that in
> > some cases the assertion would fail.
> >
> > This is obviously a toy example that is not very interesting, but the main
> > real example we have is too complicated to produce a snippet to
> > demonstrate. In my view, the long term outcome of this work is likely the
> > enablement of many unit tests that are a little more complicated than this,
> > on less obvious code.
> >
> > But the headline goal of the CEP is not. By itself, the LWT Verifier
> > demonstrates the power and utility of the work. I don’t believe it is
> > terribly helpful to focus on secondary justifications like the example I
> > gave. For me, the _ability_ to prove the correctness of difficult but
> > critical systems is justification enough, whether or not we deliver a
> > simple API as part of the CEP.
> >
> >
> >
> > From: bened...@apache.org <bened...@apache.org>
> > Date: Tuesday, 13 July 2021 at 10:43
> > To: dev@cassandra.apache.org <dev@cassandra.apache.org>
> > Subject: Re: [DISCUSS] CEP-10: Cluster and Code Simulations
> > > Should the target release be 4.1 (not 4.0.x)?
> >
> >
> >
> > No, in my opinion the target should be 4.0.x. We are reaching for a
> > shippable trunk and this has no public API impacts. This work is IMO
> > central to achieving a shippable trunk, either way. The only reason I do
> > not target 3.x is that it would be too burdensome.
> >
> > > My concern is that changing code and tests at the same time risks
> > > regressions…
> >
> >
> >
> > I’ve never heard this position before. Would you care to elaborate? It is
> > quite normal for us to update tests alongside changes to the code.
> >
> > > And seconding Benjamin's comments… some documentation on how to write a
> > > test, and a simple test example, that this CEP then allows us to write
> > > would help a lot (a la "working backwards").
> >
> > 1) This work is to _enable_ the development of tests, with the only test
> > originally planned to arrive alongside it being the fairly sophisticated
> > LWT Verifier. This is something we have sorely needed as a project, as we
> > have had serious correctness violations for multiple years. This broad
> > category of integrated test for verifying correctness is the main goal of
> > the work and is not easily condensed into an example snippet.
> > 2) It is _possible_ that some simple and fluid APIs will be introduced in
> > a later phase of this work, but they haven’t been designed yet, so I cannot
> > share snippets.
> >
> > In principle, however, you would be able to do something like:
> >
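> > @Nemesis volatile int x = 0;
> > int foo() {
> >     x = x + 1; // read-modify-write: not atomic, even on a volatile field
> >     return x;
> > }
> >
> > @Test
> > void test() throws Exception {
> >     // Typed futures so that get() yields an Integer and == compares values
> >     Future<Integer> f1 = executor.submit(() -> foo());
> >     Future<Integer> f2 = executor.submit(() -> foo());
> >     Assert.assertTrue(f1.get() == 1 || f2.get() == 1);
> > }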
> >
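To make the failure mode in the snippet above concrete, here is a minimal plain-Java sketch that enumerates every interleaving of the two foo() calls by hand and checks the same assertion for each one. It is not the CEP's simulator and does not use its (as yet undesigned) API; the class InterleavingSketch, the Task helper and the split of foo() into three memory accesses are all assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: a hand-rolled enumeration of schedules,
// standing in for what an interception-based simulator might explore.
public class InterleavingSketch {
    // Shared state standing in for the annotated field `x`.
    static int x;

    // One simulated thread running foo(), split into its three memory accesses.
    static final class Task {
        int step = 0;   // index of the next access to perform
        int tmp;        // value read while evaluating `x + 1`
        int result;     // value returned by foo()

        void advance() {
            switch (step++) {
                case 0: tmp = x; break;       // read x
                case 1: x = tmp + 1; break;   // write x
                case 2: result = x; break;    // read x again for `return x`
                default: throw new IllegalStateException("task already finished");
            }
        }
    }

    // Runs one schedule; schedule[i] says which task (0 or 1) takes the i-th step.
    static boolean assertionHolds(int[] schedule) {
        x = 0;
        Task[] tasks = { new Task(), new Task() };
        for (int t : schedule) {
            tasks[t].advance();
        }
        return tasks[0].result == 1 || tasks[1].result == 1;
    }

    // Collects every schedule in which each task takes exactly three steps.
    static void enumerate(int[] schedule, int pos, int used0, int used1, List<int[]> out) {
        if (pos == schedule.length) {
            out.add(schedule.clone());
            return;
        }
        if (used0 < 3) {
            schedule[pos] = 0;
            enumerate(schedule, pos + 1, used0 + 1, used1, out);
        }
        if (used1 < 3) {
            schedule[pos] = 1;
            enumerate(schedule, pos + 1, used0, used1 + 1, out);
        }
    }

    public static void main(String[] args) {
        List<int[]> schedules = new ArrayList<>();
        enumerate(new int[6], 0, 0, 0, schedules);
        int failing = 0;
        for (int[] s : schedules) {
            if (!assertionHolds(s)) {
                failing++;
            }
        }
        // prints "20 schedules, 4 violate the assertion"
        System.out.println(schedules.size() + " schedules, " + failing + " violate the assertion");
    }
}
```

The violating schedules are those where one call completes its write before the other call reads x, and the first call only performs its final read after the second write, so both calls return 2. A test run on real threads would hit such a schedule only occasionally, which is presumably why controlling the interleaving is attractive.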
> >
> > From: Mick Semb Wever <m...@apache.org>
> > Date: Tuesday, 13 July 2021 at 10:28
> > To: dev@cassandra.apache.org <dev@cassandra.apache.org>
> > Subject: Re: [DISCUSS] CEP-10: Cluster and Code Simulations
> > >
> > > To achieve this, significant modifications will be required to the
> > > codebase, mostly cleaning up existing abstractions. Specifically, we will
> > > need to be able to mock executors, any blocking concurrency primitives,
> > > time, filesystem access and internode streaming.
> > >
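As a rough idea of what "being able to mock time" can look like in general, the sketch below injects a java.time.Clock into a component and drives it from a test with a manually advanced clock. This is only an assumption about the general technique, not a description of the abstractions the CEP introduces; MockableTimeSketch, Lease and ManualClock are hypothetical names invented for this example.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical illustration only: time is read through an injected Clock,
// so a test (or a simulator) can decide when and how far time moves.
public class MockableTimeSketch {
    // A component that expires after a TTL and never calls the system clock directly.
    static final class Lease {
        private final Clock clock;
        private final Instant expiresAt;

        Lease(Clock clock, Duration ttl) {
            this.clock = clock;
            this.expiresAt = clock.instant().plus(ttl);
        }

        boolean isExpired() {
            // Expired once the current instant is at or past the deadline.
            return !clock.instant().isBefore(expiresAt);
        }
    }

    // A test clock whose time only moves when advance() is called.
    static final class ManualClock extends Clock {
        private Instant now;

        ManualClock(Instant start) {
            this.now = start;
        }

        void advance(Duration d) {
            now = now.plus(d);
        }

        @Override public Instant instant() { return now; }
        @Override public ZoneId getZone() { return ZoneOffset.UTC; }
        @Override public Clock withZone(ZoneId zone) { return this; } // zone is irrelevant here
    }

    public static void main(String[] args) {
        ManualClock clock = new ManualClock(Instant.EPOCH);
        Lease lease = new Lease(clock, Duration.ofSeconds(30));
        System.out.println(lease.isExpired()); // prints "false"
        clock.advance(Duration.ofSeconds(30));
        System.out.println(lease.isExpired()); // prints "true"
    }
}
```

In production code the same Lease would simply be constructed with Clock.systemUTC(); the point of the indirection is that nothing in the component has to change for time to be placed under test control.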
> > > The work is – in large part – already complete, with JIRA and PRs to
> > > follow in the coming weeks. Of course, the work is subject to the usual
> > > community input and review, so this does not preclude changes to the work
> > > (even significant ones, if they are warranted). I know a lot of incoming
> > > CEPs are likely to be backed up by significant off-list development as a
> > > result of the focus on a shippable 4.0. Hopefully this is just a temporary
> > > growing pain, particularly as we move towards a shippable trunk.
> > >
> > > I hope this work will be of huge value to the project, particularly as
> > > we race to catch up on years of limited feature development.
> > >
> > > JIRA and PRs will follow, but I wanted to kick-off discussion in advance.
> > >
> >
> >
> >
> > Should the target release be 4.1 (not 4.0.x)?
> >
> > I'd be interested in seeing a rough timeline/plan of how the proposed
> > changes are to be defined in JIRAs and ordered.
> >
> > I'd like to hear a bit more about the test plan. Not so much about how
> > the CEP itself improves testability of the project, but for example
> > the testing required to be in place to introduce the changes of the
> > CEP (and if it already exists, where). My concern is that changing
> > code and tests at the same time risks regressions…
> >
> > And seconding Benjamin's comments… some documentation on how to write
> > a test, and a simple test example, that this CEP then allows us to
> > write would help a lot (a la "working backwards").
> >
>
