** we can dump v3 **DRIVER** compatibility, since later 4.X Java drivers are backward compatible with Cassandra 3
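
To make the driver-vs-protocol distinction concrete, here is a minimal sketch of what "processors only see a NiFi-side interface" could look like. Everything in it is an assumption for illustration: `CqlFacade`, `RecordingFacade` and `PutRowExample` are hypothetical names, not classes from NiFi or from either branch mentioned in this thread. The stand-in implementation just records statements so the sketch runs without a cluster or a driver jar. A real implementation would presumably wrap the 4.x `com.datastax.oss.driver.api.core.session.Session`, and that would be the only class importing driver packages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical driver-neutral interface that processors would code against.
// The name and methods are illustrative only.
interface CqlFacade extends AutoCloseable {
    List<Map<String, Object>> execute(String cql, Object... params);

    @Override
    void close();
}

// Stand-in implementation so the sketch runs without a cluster or driver jar.
// A real one would delegate to the 4.x driver session instead of recording.
final class RecordingFacade implements CqlFacade {
    final List<String> executed = new ArrayList<>();
    boolean closed = false;

    @Override
    public List<Map<String, Object>> execute(String cql, Object... params) {
        executed.add(cql);
        Map<String, Object> row = Map.of("cql", cql, "paramCount", params.length);
        return List.of(row);
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Processor-side code sees only CqlFacade; no driver import appears here.
final class PutRowExample {
    static int insertUser(CqlFacade facade, String id, String name) {
        return facade.execute("INSERT INTO users (id, name) VALUES (?, ?)", id, name).size();
    }

    public static void main(String[] args) {
        try (RecordingFacade facade = new RecordingFacade()) {
            System.out.println(insertUser(facade, "42", "example"));
        }
    }
}
```

With this shape, a driver package rename like the 3.x to 4.x one would touch the single implementation class and leave processor code alone. Whether that is worth the extra layer is exactly what is being debated below.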

On Tue, Mar 19, 2024 at 2:43 PM Mike Thomsen <[email protected]> wrote:

> David,
>
> Before we proceed, I think we should make sure we're all understanding the
> same problem here. Starting with this:
>
> > I believe the CQL protocol is backwards compatible but the Java API is not.
> > For example "com.datastax.driver.core.Session" is now
> > "com.datastax.oss.driver.api.core.session.Session" and there is no more
> > "Cluster" class. Might be fairly trivial to fix though, if that's the path
> > of least resistance.
>
> From what I've learned using Cassandra 3 and 4 in my day job and reading
> up on this stuff for the sake of discussion, that all tracks. We used the
> ~4.11 driver in Spring Boot on both v3 and v4 clusters without issue during
> an upgrade. So I don't see any reason to factor in the "changes from
> DataStax 3 to 4" since the changes were likely a one-off decision meant to
> position the driver for better future support and stability.
>
> TL;DR, we can dump v3 compatibility and the only thing our users will
> notice is if we make the controller service totally incompatible with the
> one they're already using which is something we can actively avoid.
>
> On Tue, Mar 19, 2024 at 2:00 PM David Handermann <[email protected]> wrote:
>
>> All,
>>
>> I support a Controller Service API abstraction around the Cassandra
>> Driver. The changes from DataStax 3 to 4 already highlight the need
>> for that abstraction. The donation of the DataStax Java driver to
>> Apache [1] also shows the value of providing some level of isolation,
>> if at all possible.
>>
>> I have not taken a close look at the Matt's branch, and the details of
>> the abstraction are important, but having the abstraction can be
>> useful to avoid getting back to this same situation.
>>
>> Regards,
>> David Handermann
>>
>> [1] https://github.com/apache/cassandra-java-driver/
>>
>> On Tue, Mar 19, 2024 at 12:37 PM Mike Thomsen <[email protected]> wrote:
>> >
>> > Matt,
>> >
>> > I got that. My point was that the Java changes appear to be a one time
>> > thing that DataStax did to make a better driver with a much more
>> > future-proof API. Since Scylla tracks them as closely as possible, I
>> > suspect that we don't need to plan for a bunch of abstraction to isolate
>> > Java changes.
>> >
>> > On Tue, Mar 19, 2024 at 11:07 AM Steven Matison <[email protected]> wrote:
>> >
>> > > That was kinda where i got stuck and fell out on my branch/jira. Mike and
>> > > I wanted to make a new controller service , without backward compatibility;
>> > > and remove the duplicate driver/connection properties found in some of the
>> > > processors.
>> > >
>> > > I agree taking out all old stuff and making new controller service makes
>> > > most sense. 4.x and 5.x should be mostly backwards compatible to 2&3.x
>> > > with how it’s used within current processors.
>> > >
>> > > On Tue, Mar 19, 2024 at 10:49 AM Matt Burgess <[email protected]> wrote:
>> > >
>> > > > The abstraction is to isolate Java API changes, not protocol compatibility
>> > > > Changing to the java-driver comes with a number of changes to the code (see
>> > > > Steven's and my branches), if we can abstract that API it should lead to
>> > > > more maintainable code in the future by not having to change any
>> > > > processors, just the controller service implementation.
>> > > >
>> > > > On Tue, Mar 19, 2024 at 10:14 AM Mike Thomsen <[email protected]> wrote:
>> > > >
>> > > > > https://opensource.docs.scylladb.com/stable/using-scylla/drivers/cql-drivers/scylla-java-driver.html
>> > > > >
>> > > > > Directly quoting Scylla docs here:
>> > > > >
>> > > > > > The Scylla Java Driver is a drop-in replacement for the DataStax Java
>> > > > > > Driver. As such, no code changes are needed to use this driver.
>> > > > >
>> > > > > On Tue, Mar 19, 2024 at 10:13 AM Mike Thomsen <[email protected]> wrote:
>> > > > >
>> > > > > > Matt,
>> > > > > >
>> > > > > > I don't think we need to really "abstract above" the drivers because the
>> > > > > > Java DataStax driver appears to support 4.X all the way back to 2.X, as
>> > > > > > well as the enterprise versions from DataStax
>> > > > > >
>> > > > > > https://docs.datastax.com/en/driver-matrix/docs/java-drivers.html
>> > > > > >
>> > > > > > Similar situation with Scylla. When I looked at the driver, it appeared to
>> > > > > > copy verbatim the entire public API of that driver. So I think before we
>> > > > > > dive into abstractions, it's worth doing a bit more validation of these
>> > > > > > details. IMHO, this might be a much lighter lift than anticipated.
>> > > > > >
>> > > > > > On Mon, Mar 18, 2024 at 4:30 PM Matt Burgess <[email protected]> wrote:
>> > > > > >
>> > > > > >> Totally agree, that's what my branch does (see link in previous email).
>> > > > > >> The more I work with it, the more I think I can abstract it further from
>> > > > > >> their JDBC-like API but I started with a bunch of delegate classes then I
>> > > > > >> figure I'll see where I can consolidate to more abstract concepts. If I
>> > > > > >> don't have to support Cassandra 3 with the new API, so much the better.
>> > > > > >>
>> > > > > >> Regards,
>> > > > > >> Matt
>> > > > > >>
>> > > > > >> On Mon, Mar 18, 2024 at 4:14 PM David Handermann <[email protected]> wrote:
>> > > > > >>
>> > > > > >> > Matt et al,
>> > > > > >> >
>> > > > > >> > It is good to see the background effort on moving Cassandra
>> > > > > >> > capabilities in a supportable direction.
>> > > > > >> >
>> > > > > >> > I think new Cassandra components will require a significant departure
>> > > > > >> > from current Controller Service abstractions. Right now, the existing
>> > > > > >> > service interface does not provide a clean abstraction from the
>> > > > > >> > Cassandra library, which is part of the reason for the current
>> > > > > >> > coupling to the legacy driver version.
>> > > > > >> >
>> > > > > >> > Following up from Joe's comments, it seems like the cleanest way
>> > > > > >> > forward is to deprecate the current bundle on the 1.x branch, and
>> > > > > >> > remove the current bundle from the main branch. That will provide a
>> > > > > >> > clean slate for new Service and Processor implementations, without
>> > > > > >> > concern for uncertain compatibility questions.
>> > > > > >> >
>> > > > > >> > Regards,
>> > > > > >> > David Handermann
>> > > > > >> >
>> > > > > >> > On Mon, Mar 18, 2024 at 2:35 PM Matt Burgess <[email protected]> wrote:
>> > > > > >> > >
>> > > > > >> > > What do y'all think about removing the individual connection properties
>> > > > > >> > > from the Cassandra processors for NiFi 2.0 and requiring a
>> > > > > >> > > CassandraSessionProvider instead? I think we started doing that elsewhere
>> > > > > >> > > (Elasticsearch maybe?), I noticed duplicate code in the
>> > > > > >> > > CassandraSessionProvider and AbstractCassandraProcessor, if we keep those
>> > > > > >> > > properties I can refactor them into a utility class.
>> > > > > >> > >
>> > > > > >> > > Thanks,
>> > > > > >> > > Matt
>> > > > > >> > >
>> > > > > >> > > On Fri, Mar 15, 2024 at 2:44 PM Steven Matison <[email protected]> wrote:
>> > > > > >> > >
>> > > > > >> > > > I got through quite a bit of work to enable 4.x…
>> > > > > >> > > >
>> > > > > >> > > > The 3.x pieces that were not backwards compatible is very edge use case and
>> > > > > >> > > > could have been done slightly differently but with work around.
>> > > > > >> > > >
>> > > > > >> > > > https://github.com/steven-matison/nifi/tree/nifi-10120-1
>> > > > > >> > > >
>> > > > > >> > > > On Fri, Mar 15, 2024 at 2:30 PM Matt Burgess <[email protected]> wrote:
>> > > > > >> > > >
>> > > > > >> > > > > Oops used the wrong email address so if there have been responses to the
>> > > > > >> > > > > Cassandra thread since mine I missed them, my bad!
>> > > > > >> > > > >
>> > > > > >> > > > > On Fri, Mar 15, 2024 at 2:00 PM Matt Burgess <[email protected]> wrote:
>> > > > > >> > > > >
>> > > > > >> > > > > > I believe the CQL protocol is backwards compatible but the Java API is
>> > > > > >> > > > > > not. For example "com.datastax.driver.core.Session" is now
>> > > > > >> > > > > > "com.datastax.oss.driver.api.core.session.Session" and there is no more
>> > > > > >> > > > > > "Cluster" class. Might be fairly trivial to fix though, if that's the
>> > > > > >> > > > > > path of least resistance.
>> > > > > >> > > > > >
>> > > > > >> > > > > > On Fri, Mar 15, 2024 at 1:40 PM Joe Witt <[email protected]> wrote:
>> > > > > >> > > > > >
>> > > > > >> > > > > >> Matt
>> > > > > >> > > > > >>
>> > > > > >> > > > > >> I dont know a ton about Cassandra but when I looked at client/driver notes
>> > > > > >> > > > > >> for 4+ it said it was compatible all the way back to 3.x. Not sure what
>> > > > > >> > > > > >> that means but it surely seems worth exploring. Also I dont know if the
>> > > > > >> > > > > >> 4.x drivers get rid of the vulnerable bits.
>> > > > > >> > > > > >>
>> > > > > >> > > > > >> Thanks
>> > > > > >> > > > > >>
>> > > > > >> > > > > >> On Fri, Mar 15, 2024 at 10:39 AM Matt Burgess <[email protected]> wrote:
>> > > > > >> > > > > >>
>> > > > > >> > > > > >> > At the very least we should upgrade to Cassandra 3.11.6:
>> > > > > >> > > > > >> >
>> > > > > >> > > > > >> > https://github.com/apache/cassandra/blob/cassandra-3.11.16/CHANGES.txt
>> > > > > >> > > > > >> >
>> > > > > >> > > > > >> > On Fri, Mar 15, 2024 at 1:31 PM Matt Burgess <[email protected]> wrote:
>> > > > > >> > > > > >> >
>> > > > > >> > > > > >> > > If the community agrees to get rid of Cassandra 3 that'll save me effort
>> > > > > >> > > > > >> > > on the refactor after I add Cassandra 4 :) Otherwise those
>> > > > > >> > > > > >> > > vulnerabilities would only be in a "new" Cassandra 3 services NAR that
>> > > > > >> > > > > >> > > would not be included in the convenience binary.
>> > > > > >> > > > > >> > >
>> > > > > >> > > > > >> > > On Fri, Mar 15, 2024 at 1:28 PM Joe Witt <[email protected]> wrote:
>> > > > > >> > > > > >> > >
>> > > > > >> > > > > >> > >> Mike, Matt,
>> > > > > >> > > > > >> > >>
>> > > > > >> > > > > >> > >> Happy to hear you both have active efforts or are interested in doing so.
>> > > > > >> > > > > >> > >> Can you help me understand more specifically what that means for the
>> > > > > >> > > > > >> > >> current set of components?
>> > > > > >> > > > > >> > >>
>> > > > > >> > > > > >> > >> The CVE hits are concerning and long standing. Supporting Cassandra 3
>> > > > > >> > > > > >> > >> implies the current set of dependencies would remain too right?
>> > > > > >> > > > > >> > >>
>> > > > > >> > > > > >> > >> Is the current set of components we have ones we want to retain? We
>> > > > > >> > > > > >> > >> certainly need Cassandra components - but are the ones we have now the
>> > > > > >> > > > > >> > >> right ones?
>> > > > > >> > > > > >> > >>
>> > > > > >> > > > > >> > >> Thanks
>> > > > > >> > > > > >> > >> Joe
>> > > > > >> > > > > >> > >>
>> > > > > >> > > > > >> > >> On Fri, Mar 15, 2024 at 10:25 AM Matt Burgess <[email protected]> wrote:
>> > > > > >> > > > > >> > >>
>> > > > > >> > > > > >> > >> > I'm actively working this, I pushed my branch up in case anyone wants to
>> > > > > >> > > > > >> > >> > take a look [1]. The idea is to abstract the Cassandra API "up a couple
>> > > > > >> > > > > >> > >> > levels" and provide implementations for Cassandra 3, 4, and eventually 5.
>> > > > > >> > > > > >> > >> > For JDBC-like interfaces this is a PITA because of the API (Statement,
>> > > > > >> > > > > >> > >> > PreparedStatement, BoundStatement, ResultSet, etc.) but I'm hoping we can
>> > > > > >> > > > > >> > >> > find a common pattern for abstracting the third-party library
>> > > > > >> > > > > >> > >> > implementation and API from the NiFi component (Processor,
>> > > > > >> > > > > >> > >> > ControllerService, etc.) API. I think we're doing something similar for
>> > > > > >> > > > > >> > >> > Kafka?
>> > > > > >> > > > > >> > >> >
>> > > > > >> > > > > >> > >> > Regards,
>> > > > > >> > > > > >> > >> > Matt
>> > > > > >> > > > > >> > >> >
>> > > > > >> > > > > >> > >> > [1] https://github.com/mattyb149/nifi/tree/cassy4
>> > > > > >> > > > > >> > >> >
>> > > > > >> > > > > >> > >> > On Fri, Mar 15, 2024 at 8:43 AM Mike Thomsen <[email protected]> wrote:
>> > > > > >> > > > > >> > >> >
>> > > > > >> > > > > >> > >> > > That’s been on my todo list for a little while but things kept coming up.
>> > > > > >> > > > > >> > >> > > I think I could get started on that now. Based on my initial research it
>> > > > > >> > > > > >> > >> > > appears that scylla uses the exact same api as datastax so supporting both
>> > > > > >> > > > > >> > >> > > in a cql bundle should theoretically be fairly easy.
>> > > > > >> > > > > >> > >> > >
>> > > > > >> > > > > >> > >> > > Sent from my iPhone
>> > > > > >> > > > > >> > >> > >
>> > > > > >> > > > > >> > >> > > > On Mar 14, 2024, at 6:18 PM, Joe Witt <[email protected]> wrote:
>> > > > > >> > > > > >> > >> > > >
>> > > > > >> > > > > >> > >> > > > Team,
>> > > > > >> > > > > >> > >> > > >
>> > > > > >> > > > > >> > >> > > > Cassandra remains a really important system to be able to send data to.
>> > > > > >> > > > > >> > >> > > > However, it seems like we've not maintained these well. We have what
>> > > > > >> > > > > >> > >> > > > appears to be at least a full generation behind on client versions (we are
>> > > > > >> > > > > >> > >> > > > on 3x vs 4x which is the latest stable with 5x apparently coming shortly).
>> > > > > >> > > > > >> > >> > > >
>> > > > > >> > > > > >> > >> > > > We have components to send data, query data, and use Cassandra as a cache
>> > > > > >> > > > > >> > >> > > > store. We have older mechanisms for json/avro and publish mechanisms for
>> > > > > >> > > > > >> > >> > > > records.
>> > > > > >> > > > > >> > >> > > >
>> > > > > >> > > > > >> > >> > > > The libraries we do have depend on outdated versions of Guava and result in
>> > > > > >> > > > > >> > >> > > > many CVE hits.
>> > > > > >> > > > > >> > >> > > >
>> > > > > >> > > > > >> > >> > > > I am inclined to think we should deprecate the 1.x components and remove
>> > > > > >> > > > > >> > >> > > > them as-is from the 2.x line. Then re-introduce them with record only
>> > > > > >> > > > > >> > >> > > > interfaces and built against the latest stable Cassandra/Datastax/ScyllaDB
>> > > > > >> > > > > >> > >> > > > interfaces.
>> > > > > >> > > > > >> > >> > > >
>> > > > > >> > > > > >> > >> > > > I'd love to hear thoughts from those closer to this space both as a user
>> > > > > >> > > > > >> > >> > > > and developer so we can make good next steps.
>> > > > > >> > > > > >> > >> > > >
>> > > > > >> > > > > >> > >> > > > Thanks
