[DISCUSS] CEP-17: Add support for PEM based key material for SSL

2021-10-10 Thread Maulin Vasavada
Hi all,

I would like to start this discussion thread for CEP-17.
I think it would be a great addition to support a commonly used format for
private keys and trusted certificates for SSL configurations.

Thank you.
Maulin


Re: [DISCUSS] Cleaning up docs, completing CASSANDRA-16763

2021-10-10 Thread Anthony Grasso
Hi Stefan,

I agree with your thoughts around grouping together changes touching a
subsystem. This is exactly how I started doing the backporting work a few
weeks ago. For example, I started by looking at all the changes in the
'doc/source/architecture' folder of the rST docs, and backported only
those.

I propose every subsection (child folder in doc/source/; e.g.
'architecture', 'configuration', 'cql') that has rST doc changes dating
back to June 2020 gets its own ticket. Each ticket would list the commit
hashes that need to be backported. For a commit hash that spans multiple
subsections, we pick a single ticket for that hash to be done under. This
will allow us to divide up the work fairly easily with minimal conflicts
when merging.
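
As a rough sketch of the grouping described above, the Python below maps each
commit to exactly one subsection ticket. The function names, sample hashes and
file paths are hypothetical, and the rule for commits spanning several
subsections (most files touched, then alphabetical order) is only an
assumption for the example; the proposal does not say how that single ticket
is chosen.

```python
from collections import Counter, defaultdict


def subsection(path):
    """Return the child folder of doc/source/ a path belongs to, or None."""
    parts = path.split("/")
    if len(parts) >= 4 and parts[0] == "doc" and parts[1] == "source":
        return parts[2]
    return None


def assign_commits(commits):
    """Map each subsection to the commit hashes its ticket should list.

    commits: dict of commit hash -> list of changed file paths.
    """
    tickets = defaultdict(list)
    for commit_hash, paths in commits.items():
        counts = Counter(s for s in map(subsection, paths) if s is not None)
        if not counts:
            continue  # commit does not touch any doc subsection
        # Assumed rule: the subsection with the most touched files owns the
        # commit; ties are broken alphabetically.
        owner = min(counts, key=lambda s: (-counts[s], s))
        tickets[owner].append(commit_hash)
    return dict(tickets)


# Hypothetical input, for illustration only.
example = {
    "aaa111": ["doc/source/architecture/overview.rst"],
    "bbb222": [
        "doc/source/cql/ddl.rst",
        "doc/source/cql/dml.rst",
        "doc/source/configuration/index.rst",
    ],
    "ccc333": ["src/java/Foo.java"],
}
print(assign_commits(example))
```

In practice the input could be built from the changed paths reported by
version control for each commit since June 2020; that step is left out here.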

This process would need to be done for both Cassandra 3.11 and 4.0/trunk.

We can use CASSANDRA-16761 as the umbrella ticket for these changes. This
epic was opened to capture all the work related to migrating from the old
website and rST docs to the new website and AsciiDoc. It is the ideal
location for the tickets which will capture the backporting work.

Kind regards,
Anthony



On Sat, 9 Oct 2021 at 04:02, Ekaterina Dimitrova wrote:

> Hey Stefan,
>
> Thank you for your response.
>
> “If it was feasible to gather all related changes touching a subsystem
> under one umbrella ticket, that would be very nice but I do not know
> if that makes sense from your point of view (what workflow you have).”
>
> It is up to us to decide what would be the most efficient way to move
> forward as a community, so any ideas are appreciated. I think Anthony had
> a similar idea to what you said. Probably he can share more details.
>
> Best regards,
> Ekaterina
>
> On Thu, 7 Oct 2021 at 3:32, Stefan Miklosovic <
> stefan.mikloso...@instaclustr.com> wrote:
>
> > Hi Lorina, Ekaterina,
> >
> > In general your approach sounds good to me. I am also +1 on not
> > creating too many tickets as I can see it will be easy to get lost in.
> >
> > If it was feasible to gather all related changes touching a subsystem
> > under one umbrella ticket, that would be very nice but I do not know
> > if that makes sense from your point of view (what workflow you have).
> >
> > Regards
> >
> > On Wed, 6 Oct 2021 at 23:56, Ekaterina Dimitrova wrote:
> > >
> > > Hey Lorina,
> > >
> > > First of all - thank you so much for all the work done by you and the
> > > rest of the people! The website and the docs are our front door as a
> > > project!
> > >
> > > +1 on your proposal. My understanding is we need 1)+5) and then
> > > everything else will be able to roll out and more people will be able
> > > to join the efforts so we can knock out 2) which seems the biggest
> > > work here, did I get it correct?
> > >
> > > My only comment is about the tickets we will have to open. I can
> > > suggest we don’t do 1:1 ticket for every small backport ticket/change
> > > but 1:1 for bigger bodies of work and 1:many where we see we can
> > > combine a few smaller changes so we don’t deal with too many tickets.
> > > Does this sound reasonable?
> > > Is there a different suggestion or plan?
> > >
> > > Thank you one more time. I will be happy to help with what I can do in
> > > order to bring this to the finish line. I am sure others will do too,
> > > even with a ticket or two :-) In OSS every single contribution matters,
> > > right?
> > >
> > > Best regards,
> > > Ekaterina
> > >
> > > On Wed, 6 Oct 2021 at 8:22, Benjamin Lerer wrote:
> > >
> > > > Thanks Lorina for all your work.
> > > >
> > > > +1 Your proposal makes sense to me.
> > > >
> > > > On Wed, 6 Oct 2021 at 00:34, Lorina Poland wrote:
> > > >
> > > > > This is a discussion about how to tackle getting the docs “fixed”.
> > > > >
> > > > > > As many of you know, I started months ago to convert the Apache
> > > > > > Cassandra in-tree docs from reStructuredText (rST) to AsciiDoc. [1]
> > > > > > The conversion required not only the docs source files to be
> > > > > > converted, but also the cassandra-website source to be updated, to
> > > > > > build the docs from AsciiDoc. [2] You all have seen the results of
> > > > > > that conversion + the beautiful new design work accomplished.
> > > > > > When Apache Cassandra 4.0 was ready to GA, we used my private repo
> > > > > > (polandll/cassandra) to build the docs for publication. (The new
> > > > > > cassandra-website procedure allows for any repo to be used to
> > > > > > build.)
> > > > > > Due to a series of interferences with virtually all the people on
> > > > > > the project (myself, Anthony Grasso, Mick Semb Wever, Paul Lau) in
> > > > > > the time leading up to the GA or right after, we have never gotten
> > > > > > my repo work committed and merged to the official source
> > > > > > (apache/cassandra).
> > > > > So, here is the proposal for a plan of action:
> > > > >
> > > > > (1) Anthony and 

Re: Tradeoffs for Cassandra transaction management

2021-10-10 Thread bened...@apache.org
Hi Jonathan,

I will summarise my position below, that I have outlined at various points in 
the other thread, and then I would be interested to hear how you propose we 
move forwards. I will commit to responding the same day to any email I receive 
before 7pm GMT, and to engaging with each of your points. I would appreciate it 
if you could make similar commitments so that we may conclude this discussion 
in a reasonable time frame and conduct a vote on CEP-15.

I also reiterate my standing invitation to an open video chat, to discuss 
anything you like, for as long as you like. Please nominate a suitable time and 
day.

==TL;DR==
CEP-15 does not narrow our future options, it only broadens them. Accord is a 
distributed consensus protocol, so these techniques may build upon it without 
penalty. Alternatively, these approaches may simply live alongside Accord.

Since these alternative approaches do not achieve the goals of the CEP, and 
this CEP only enhances your ability to pursue them, it seems hard to conclude 
it should not proceed.

==Goals==
Our goals are first-order principles: we want strict serializable cross-shard 
isolation that is highly available and can be scaled while maintaining optimal 
and predictable latency. Anything less, and the CEP is not achieved.

As outlined already (except SLOG, which I address below), these alternative 
approaches do not achieve these goals.

==Compatibility with other approaches==
0. In general, research systems are not irreducible - they are an assembly of 
ideas that can be mixed together. Accord is a distributed consensus protocol. 
These other protocols may utilise it without penalty for consensus, in many 
cases obtaining improved characteristics. Conversely, Accord may itself 
directly integrate some of these ideas.

1. Cockroach, YugaByte, Dynamo et al. utilize read and write intents, the same 
technique that was outlined for interactive transactions with Accord. They 
manage these in a distributed state machine with per-shard consensus, 
permitting them to achieve serializable isolation. This same technique can be 
used with Accord, with the advantage that strict serializable isolation would 
be achievable. For simple transactions we would be able to execute with “pure” 
Accord and retain its execution advantage. Accord does not disadvantage this 
approach, it is only enhanced and made easier.

2. Calvin: Accord is broadly functionally equivalent, only leaderless, thereby 
achieving better global latency properties.

3. SLOG: This is essentially Calvin. The main modification is that we may 
assign data a home region, so that transactions may be faster if they 
participate in just one region, and slower if they involve multiple regions. 
Note that this protocol does not achieve global serializability without either 
losing consistency or availability under network partition or paying a WAN cost.

In its consistent mode SLOG therefore remains slower than Accord for both 
single-home and multi-home transactions. Accord requires one WAN penalty for 
linearizing a transaction (competing transactions pay this cost simultaneously, 
as with SLOG), however this is achieved for global clients, whereas SLOG must 
cross the WAN multiple times for transactions initiated from outside their 
home, and for all multi-home transactions.

As discussed elsewhere, a future optimisation with Accord is to temporarily 
“home” competing transactions for execution only, so that there is no 
additional WAN penalty when executing competing transactions. This would confer 
the same performance advantages as SLOG, without any of its penalties for 
multi-home transactions or heterogeneous latency characteristics, nor any of 
the complexities of re-homing data, thus avoiding these unpredictable 
performance characteristics.

For those use cases that do not require high availability, it would be possible 
to implement a “home” region setup with Accord, as with SLOG. This is not an 
idea that is exclusive to this particular system. We even discussed this 
briefly in the call, as some use cases do indeed prefer this trade-off.

SLOG additionally offers a kind of “home group” multi-home optimisation for 
clusters with many regions, that accept availability loss if fewer than half of 
their regions fail (e.g. in the paper 6 regions in pairs of 2 for 
availability). This is also exploitable by Accord, and something we can pursue 
as a future optimisation, as users explore such topologies in the real world.
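
To make the single-home versus multi-home distinction in the SLOG discussion
above concrete, here is a minimal Python sketch. It only classifies a
transaction by the home regions of the keys it touches; it does not model
consensus, ordering or latency. The function name, the key names and the
region names are all hypothetical and chosen purely for illustration.

```python
def classify(txn_keys, home_region):
    """Classify a transaction by the home regions of the keys it touches.

    txn_keys: iterable of keys read or written by the transaction.
    home_region: dict mapping each key to its assigned home region.
    Returns ("single-home", region) or ("multi-home", sorted list of regions).
    """
    regions = {home_region[key] for key in txn_keys}
    if not regions:
        raise ValueError("transaction touches no keys")
    if len(regions) == 1:
        return "single-home", next(iter(regions))
    return "multi-home", sorted(regions)


# Hypothetical placement of three keys across two regions.
homes = {"a": "us", "b": "us", "c": "eu"}
print(classify(["a", "b"], homes))  # all keys homed in one region
print(classify(["a", "c"], homes))  # keys homed in two regions
```

Under the description above, the first transaction would be eligible for the
faster single-region path, while the second would take the slower multi-home
path.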

==Responding to specific points==

>because it was asserted in the CEP-15 thread that Accord could support SQL by 
>applying known techniques on top. This is mistaken. Deterministic systems like 
>Calvin or SLOG or Accord can support queries where the rows affected are not 
>known in advance using a technique that Abadi calls OLLP

Language is hard and it is easy to conflate things. Here you seem to be 
discussing abort-free interactive transactions, not SQL. SQL does not 
necessitate