There are some completely valid points here, but you are conflating internode 
messaging with native (aka client) protocol somewhat. Since native protocol V5 
in C* 4.0, there is a degree of code re-use between the two, native protocol 
having adopted the framing model and resource allocation from internode at this 
point. So the FrameDecoder/Encoder and AbstractMessageHandler concepts are 
shared, but the two protocols are distinct and there is very little overlap in 
the message formats or versioning schemes. The handshake protocol that you 
referenced is only part of the internode protocol; client/server negotiation is 
handled differently and the Initiate message is not part of that. 

I absolutely agree that we (ab)use the internode messaging version in several 
places, the issue of storage compatibility being a great example. In CEP-21 we 
do a lot of serialisation both on the wire and into persistent storage of 
metadata components and the operations which modify them. We try to avoid 
perpetuating this coupling by having a distinct versioning scheme for metadata 
so that we can evolve those serialisations independently of message format. 
Whilst this isn't a perfect solution, it has worked fairly well so far. Perhaps 
it's worth doing something similar in other cases, where a component/domain 
specific versioning scheme could be appropriate (e.g. commitlog/sstable/etc). 
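
To illustrate the idea of a component-specific versioning scheme, here is a 
minimal sketch in which a serialised payload carries its own version rather 
than relying on the messaging version. This is not the CEP-21 code: every name 
in it (VersionedMetadataSketch, MetadataVersion, Entry) is hypothetical and the 
"metadata" is a toy value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch: none of these names exist in Cassandra.
final class VersionedMetadataSketch
{
    // Versions of the metadata serialisation only; unrelated to any messaging version.
    enum MetadataVersion
    {
        V0(0), V1(1);

        final int code;

        MetadataVersion(int code) { this.code = code; }

        static MetadataVersion fromCode(int code)
        {
            for (MetadataVersion v : values())
                if (v.code == code)
                    return v;
            throw new IllegalArgumentException("Unknown metadata version: " + code);
        }
    }

    // Toy metadata component: an epoch, plus a label that only exists from V1 onwards.
    static final class Entry
    {
        final long epoch;
        final String label;

        Entry(long epoch, String label) { this.epoch = epoch; this.label = label; }
    }

    static byte[] serialize(Entry entry, MetadataVersion version)
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes))
        {
            out.writeByte(version.code); // the payload records its own version
            out.writeLong(entry.epoch);
            if (version.code >= MetadataVersion.V1.code)
                out.writeUTF(entry.label);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Entry deserialize(byte[] payload)
    {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload)))
        {
            // The reader needs no knowledge of the transport or file format version.
            MetadataVersion version = MetadataVersion.fromCode(in.readUnsignedByte());
            long epoch = in.readLong();
            String label = version.code >= MetadataVersion.V1.code ? in.readUTF() : "";
            return new Entry(epoch, label);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the version travels with the payload, the same bytes can be sent in a 
message or written to disk, and the metadata format can move from V0 to V1 
without a messaging version bump.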

Another issue with evolving native protocol is the beta qualifier, which in my 
experience doesn't really work in its current form. V5, for instance, has been 
"supported" with the beta flag since C* 3.10, but the framing implementation 
was never backported to 3.x for obvious reasons. So a client which implements 
"final" V5 can't use V5-beta in pre-4.0 versions. Your proposal to negotiate 
features separately from the protocol version sounds more sustainable, but a 
concern might be that soon enough a new feature will also require protocol 
changes, so clients are then required to do both, e.g. when a new data or 
message type is required to make a particular feature viable. 
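
The concern can be sketched as follows: features are negotiated as a bitmask, 
but each feature also declares the lowest protocol version able to carry it, 
so a version-dependent feature is dropped on an older connection. Everything 
here is an assumption made for illustration: the feature names, bit positions 
and version requirements are invented, and nothing is implied about where in 
the client/server exchange such a mask would actually be sent.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch: these features, bits and version requirements are invented.
final class FeatureNegotiationSketch
{
    enum Feature
    {
        ALT_CHECKSUM(0, 5),   // assumed to fit within the existing v5 framing
        NEW_DATA_TYPE(1, 6);  // assumed to need a type only a later version defines

        final int bit;
        final int minProtocolVersion;

        Feature(int bit, int minProtocolVersion)
        {
            this.bit = bit;
            this.minProtocolVersion = minProtocolVersion;
        }
    }

    // Encode a set of features as a bitmask, one bit per feature.
    static int toMask(Set<Feature> features)
    {
        int mask = 0;
        for (Feature f : features)
            mask |= 1 << f.bit;
        return mask;
    }

    // Decode a bitmask; bits that match no known feature are ignored.
    static EnumSet<Feature> fromMask(int mask)
    {
        EnumSet<Feature> features = EnumSet.noneOf(Feature.class);
        for (Feature f : Feature.values())
            if ((mask & (1 << f.bit)) != 0)
                features.add(f);
        return features;
    }

    // Server side: intersect the client's request with what the server supports,
    // then drop anything the negotiated protocol version cannot carry.
    static EnumSet<Feature> negotiate(int clientMask, Set<Feature> serverSupported, int protocolVersion)
    {
        EnumSet<Feature> agreed = fromMask(clientMask);
        agreed.retainAll(serverSupported);
        agreed.removeIf(f -> f.minProtocolVersion > protocolVersion);
        return agreed;
    }
}
```

With these made-up values, a client asking for both features on a v5 
connection would only be granted ALT_CHECKSUM; to get NEW_DATA_TYPE it would 
still have to implement the newer protocol version as well as the feature, 
which is exactly the "do both" situation described above.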


> On 29 Sep 2023, at 20:22, Maxim Muzafarov <mmu...@apache.org> wrote:
> 
> Hello everyone,
> 
> 
> The problem that I'm struggling with is not directly related to the
> topic I'm about to discuss now, but it probably illustrates the
> greater complexity of backwards compatibility with the drivers we now
> support. For instance, I want to replace the algorithm that is used to
> calculate a CRC on the message payload with a new one, and since the
> v5 of the native protocol has already been fixed there is no way to do
> this without bumping the protocol version up to v6, which, in turn,
> seems like too big a leap for such a small change. Right? Correct me
> if I'm wrong.
> 
> From a broader perspective, we use native protocol versioning to
> provide backwards compatibility not only for the protocol changes
> themselves, but also for the internal features that do not appear to
> be directly dependent on the protocol specification as well. I
> would say a good example of this is the dependency of the
> MessagingService version on the storage compatibility mode [2], which
> makes these two subcomponents tightly coupled.
> 
> Another thing worth mentioning here is the number of Cassandra drivers
> [1] that we have, which have to implement a monotonically growing
> version of the native protocol in order to support new features. The
> main problem with a monotonically growing version is that a driver
> (and a driver's developer) can skip v6 if they are only interested in
> a new feature that only appears in v7, without having fully
> implemented v6. This is probably not a problem for the java, or python
> drivers which always get a lot of attention, but could be a problem
> for others. The next example here is an urgent fix, that might be
> blocked by a heavyweight feature which is difficult to implement in a
> particular driver.
> 
> 
> = Proposal =
> 
> I think we could take a step aside and take a slightly different
> approach here to addressing the same backward compatibility issues,
> rather than bumping up the native protocol version every time. A
> driver could send a bitmask to a server on a connection handshake with
> "features" that it is interested in, and the server could then respond
> to that driver with the features it supports from that list. I have
> checked the handshake protocol and it seems that we have some bits in
> reserve [3] in the Initiate message to allow this.
> 
> I see the following advantages:
> 
> - drivers will have enough flexibility to implement new features they
> want, especially those drivers that have a lack of maintainers (not in
> the order the native protocol specification grows up);
> - it gives security plugins the flexibility to enable/disable features
> they want on both the client and server sides;
> - we decouple internal components and their internal versions from each other;
> - allows us to push out urgent fixes or tuning of internal components,
> e.g. tuning FrameEncoders/FrameDecoders in a way that we need;
> 
> 
> Any thoughts?
> 
> 
> [1] 
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-8%3A+DataStax+Drivers+Donation#CEP8:DataStaxDriversDonation-Goals
> [2] 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessagingService.java#L223
> [3] 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/HandshakeProtocol.java#L74
