Work so far: https://github.com/MikeThomsen/nifi/tree/cql-changes
On Thu, Mar 21, 2024 at 9:52 AM Mike Thomsen <mikerthom...@gmail.com> wrote:

> Matt/David,
>
> By this evening, I should be at a point where I can share my branch. It should be far enough along that y'all can see what I mean about how most of the changes really weren't that complicated. My sense is that if we collaborate on it, we can probably get it ready for a PR within a week or two.
>
> It would probably be a good idea to plan to revisit the Cassandra DMC's design and make it more flexible.
>
> One nice thing about the new DataStax driver is that it supports configuration by a very detailed configuration file format, so we can give users that option + combine it with EL/parameters (I envision an option where the user puts EL in the file, we load the file, preprocess the EL and load that into the driver)
>
> On Wed, Mar 20, 2024 at 4:01 PM Mike Thomsen <mikerthom...@gmail.com> wrote:
>
>> If it were that simple, they would probably have just gone with that solution. That said, the API is functionally vendor agnostic at this point at the Java API level. So I see no need to add abstraction above that. I've got probably 2/3 of nifi-cassandra-bundle converted. Hitting a few pain points where I'm having to dig deep into the docs to make progress, but so far, so good.
>>
>> On Wed, Mar 20, 2024 at 2:38 PM Matt Burgess <mattyb...@apache.org> wrote:
>>
>>> It would be interesting to see if you exclude the Scylla API JAR from the Scylla implementation and instead include DataStax's, if that works. However I'm still leaning towards a vendor-agnostic API.
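The configuration-file idea in the newest message above (the user puts EL in the driver config file, the file is preprocessed, and the result is loaded into the driver) could be sketched roughly as follows. This is only an illustration under stated assumptions: `ConfigTemplate` and `resolve` are hypothetical names, and plain `${name}` substitution stands in for NiFi's real Expression Language evaluation, which is far richer. Handing the resolved text to the driver's own config loader is not shown.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical helper: a simplified stand-in for Expression Language evaluation.
 * It only replaces ${name} tokens with values from a map.
 */
class ConfigTemplate {
    private static final Pattern TOKEN = Pattern.compile("\\$\\{([A-Za-z0-9_.-]+)\\}");

    /** Returns the config text with every ${name} token replaced; unknown names fail fast. */
    static String resolve(String template, Map<String, String> values) {
        Matcher matcher = TOKEN.matcher(template);
        StringBuilder resolved = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = values.get(name);
            if (value == null) {
                throw new IllegalArgumentException("No value for ${" + name + "}");
            }
            // quoteReplacement stops '$' and '\' in the value from being read as group references
            matcher.appendReplacement(resolved, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(resolved);
        return resolved.toString();
    }
}
```

For example, resolving the template `contact-points = ["${cassandra.host}:${cassandra.port}"]` with `cassandra.host=db1` and `cassandra.port=9042` yields `contact-points = ["db1:9042"]`. One thing a real implementation would probably need to settle: as far as I know, the driver's configuration format has its own `${...}` substitution syntax, so the EL delimiter or an escaping rule would have to be chosen with that in mind.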
>>>
>>> On Wed, Mar 20, 2024 at 11:26 AM Mike Thomsen <mikerthom...@gmail.com> wrote:
>>>
>>>> At first glance, the package names look identical to me:
>>>>
>>>> https://java-driver.docs.scylladb.com/scylla-4.15.0.x/api/index.html
>>>>
>>>> So I see no reason to not take them at their word that it's drop-in
>>>>
>>>> On Wed, Mar 20, 2024 at 11:04 AM David Handermann <exceptionfact...@apache.org> wrote:
>>>>
>>>>> Mike,
>>>>>
>>>>> One important thing to mention about the DataStax vs ScyllaDB driver is that the Maven coordinates are different, and managing the dependencies correctly will make or break the implementation.
>>>>>
>>>>> In other words, if it is possible to use the DataStax 4 core JAR in the Controller Service API, but use the ScyllaDB 3 query JAR in the ScyllaDB implementation, then that could avoid the need for additional abstraction. Without taking a closer look, however, I would be surprised if this worked.
>>>>>
>>>>> Although ScyllaDB highlights their forked driver as a drop-in replacement for the DataStax version, and maintains the same Java package names, there is a difference between a complete replacement and a shared API JAR. Without a common API JAR, that both implementations can use, it will be necessary to provide an abstraction in NiFi that avoids depending on either library at the Controller Service API level.
>>>>>
>>>>> Regards,
>>>>> David Handermann
>>>>>
>>>>> On Wed, Mar 20, 2024 at 8:25 AM Mike Thomsen <mikerthom...@gmail.com> wrote:
>>>>>
>>>>>> Matt/David,
>>>>>>
>>>>>> I got some time this morning to take a crack at directly migrating it over to the DataStax 4.17 driver. Definitely got a lot of work to do, but so far I haven't hit any real snags.
>>>>>> This is a branch that reverts the commit to remove the cassandra bundle and reuses the existing features as a foundation. From what I'm seeing so far (feels like I'm about 25% done) it should be doable to reuse the existing bundle, but rename it to the "CQL Bundle" and just add a second controller service for Scylla that is otherwise 100% the same codewise.
>>>>>>
>>>>>> On Tue, Mar 19, 2024 at 6:41 PM Mike Thomsen <mikerthom...@gmail.com> wrote:
>>>>>>
>>>>>>> A cursory look at the Cassandra 5 stuff didn’t indicate any incompatibility. So yeah, I think we are likely pretty safe to use the 4.17 driver
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>>
>>>>>>>> On Mar 19, 2024, at 3:35 PM, Matt Burgess <mattyb...@apache.org> wrote:
>>>>>>>>
>>>>>>>> Is it likely now (due to the refactor) that we will simply be able to upgrade the driver when Cassandra 5 is GA? Also does anyone use Netflix's Astyanax [1]?
>>>>>>>>
>>>>>>>> [1] https://cassandra.apache.org/doc/stable/cassandra/getting_started/drivers.html#java
>>>>>>>>
>>>>>>>>> On Tue, Mar 19, 2024 at 3:10 PM Mike Thomsen <mikerthom...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Realistically, I think we are only likely to see two drivers:
>>>>>>>>>
>>>>>>>>> * DataStax
>>>>>>>>> * ScyllaDB
>>>>>>>>>
>>>>>>>>> The latter makes a selling point of being a binary compatible, drop-in replacement for the former.
>>>>>>>>>
>>>>>>>>> That's why I don't see a need to have an abstraction layer per se.
>>>>>>>>> I think we only need "DataStaxConnectionProviderImpl" and "ScyllaDBConnectionProviderImpl" with the difference being which jar is imported by maven.
>>>>>>>>>
>>>>>>>>> On Tue, Mar 19, 2024 at 2:59 PM David Handermann <exceptionfact...@apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> Mike,
>>>>>>>>>>
>>>>>>>>>> Thanks for the reply and clarification.
>>>>>>>>>>
>>>>>>>>>> I agree there is no need to maintain support for the DataStax 3 driver and Java API, any new components should be built on the latest version of the driver.
>>>>>>>>>>
>>>>>>>>>> What we do need going forward is to avoid, if at all possible, having a DataStax 4 dependency in the Controller Service API.
>>>>>>>>>>
>>>>>>>>>> One example of this is the WebClientServiceProvider interface. That Controller Service API does not have any third-party dependencies. The Controller Service implementation, StandardWebClientServiceProvider, has a dependency on OkHttp to implement HTTP communication. That is the kind of abstraction that would be ideal, and I believe that also aligns with what Matt has described.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> David Handermann
>>>>>>>>>>
>>>>>>>>>> On Tue, Mar 19, 2024 at 1:45 PM Mike Thomsen <mikerthom...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> ** we can dump v3 **DRIVER** compatibility, since later 4.X Java drivers are backward compatible with Cassandra 3
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Mar 19, 2024 at 2:43 PM Mike Thomsen <mikerthom...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> David,
>>>>>>>>>>>>
>>>>>>>>>>>> Before we proceed, I think we should make sure we're all understanding the same problem here. Starting with this:
>>>>>>>>>>>>
>>>>>>>>>>>>> I believe the CQL protocol is backwards compatible but the Java API is not. For example "com.datastax.driver.core.Session" is now "com.datastax.oss.driver.api.core.session.Session" and there is no more "Cluster" class. Might be fairly trivial to fix though, if that's the path of least resistance.
>>>>>>>>>>>>
>>>>>>>>>>>> From what I've learned using Cassandra 3 and 4 in my day job and reading up on this stuff for the sake of discussion, that all tracks. We used the ~4.11 driver in Spring Boot on both v3 and v4 clusters without issue during an upgrade. So I don't see any reason to factor in the "changes from DataStax 3 to 4" since the changes were likely a one-off decision meant to position the driver for better future support and stability.
>>>>>>>>>>>>
>>>>>>>>>>>> TL;DR, we can dump v3 compatibility and the only thing our users will notice is if we make the controller service totally incompatible with the one they're already using which is something we can actively avoid.
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Mar 19, 2024 at 2:00 PM David Handermann <exceptionfact...@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> All,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I support a Controller Service API abstraction around the Cassandra Driver. The changes from DataStax 3 to 4 already highlight the need for that abstraction. The donation of the DataStax Java driver to Apache [1] also shows the value of providing some level of isolation, if at all possible.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have not taken a close look at Matt's branch, and the details of the abstraction are important, but having the abstraction can be useful to avoid getting back to this same situation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> David Handermann
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1] https://github.com/apache/cassandra-java-driver/
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Mar 19, 2024 at 12:37 PM Mike Thomsen <mikerthom...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Matt,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I got that. My point was that the Java changes appear to be a one time thing that DataStax did to make a better driver with a much more future-proof API.
>>>>>>>>>>>>>> Since Scylla tracks them as closely as possible, I suspect that we don't need to plan for a bunch of abstraction to isolate Java changes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Mar 19, 2024 at 11:07 AM Steven Matison <steven.mati...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That was kinda where I got stuck and fell out on my branch/jira. Mike and I wanted to make a new controller service, without backward compatibility, and remove the duplicate driver/connection properties found in some of the processors.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I agree taking out all old stuff and making new controller service makes most sense. 4.x and 5.x should be mostly backwards compatible to 2&3.x with how it’s used within current processors.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Mar 19, 2024 at 10:49 AM Matt Burgess <mattyb...@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The abstraction is to isolate Java API changes, not protocol compatibility. Changing to the java-driver comes with a number of changes to the code (see Steven's and my branches), if we can abstract that API it should lead to more maintainable code in the future by not having to change any processors, just the controller service implementation.
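The point above about changing only the controller service implementation, and never the processors, can be illustrated with a small sketch. All names here (`CqlQueryService`, `RecordingCqlQueryService`, `UserWriter`) are hypothetical and are not taken from any of the branches mentioned in this thread. The recording class stands in for a driver-backed implementation so that the example runs without a Cassandra or Scylla JAR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical service API: it uses only JDK types, so no driver JAR is needed to compile against it. */
interface CqlQueryService {
    /** Runs a CQL statement with positional parameters and returns rows as column-name to value maps. */
    List<Map<String, Object>> execute(String cql, List<Object> parameters);
}

/**
 * Stand-in implementation used here instead of a real driver-backed one.
 * A DataStax- or ScyllaDB-backed class would implement the same interface
 * and keep every driver import inside its own module.
 */
class RecordingCqlQueryService implements CqlQueryService {
    final List<String> executed = new ArrayList<>();

    @Override
    public List<Map<String, Object>> execute(String cql, List<Object> parameters) {
        executed.add(cql + " " + parameters);
        return List.of(Map.of("applied", true));
    }
}

/** Processor-like caller: it depends on the interface only, never on a driver class. */
class UserWriter {
    private final CqlQueryService service;

    UserWriter(CqlQueryService service) {
        this.service = service;
    }

    boolean insertUser(int id, String name) {
        List<Map<String, Object>> rows =
                service.execute("INSERT INTO users (id, name) VALUES (?, ?)", List.of(id, name));
        return Boolean.TRUE.equals(rows.get(0).get("applied"));
    }
}
```

Swapping drivers in this arrangement means supplying a different `CqlQueryService` implementation; `UserWriter` is untouched. The cost, as the thread notes for JDBC-like APIs, is that the interface has to re-express whatever statement and result types the processors need.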
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Mar 19, 2024 at 10:14 AM Mike Thomsen <mikerthom...@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://opensource.docs.scylladb.com/stable/using-scylla/drivers/cql-drivers/scylla-java-driver.html
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Directly quoting Scylla docs here:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The Scylla Java Driver is a drop-in replacement for the DataStax Java Driver. As such, no code changes are needed to use this driver.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Mar 19, 2024 at 10:13 AM Mike Thomsen <mikerthom...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Matt,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I don't think we need to really "abstract above" the drivers because the Java DataStax driver appears to support 4.X all the way back to 2.X, as well as the enterprise versions from DataStax
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> https://docs.datastax.com/en/driver-matrix/docs/java-drivers.html
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Similar situation with Scylla. When I looked at the driver, it appeared to copy verbatim the entire public API of that driver.
>>>>>>>>>>>>>>>>>> So I think before we dive into abstractions, it's worth doing a bit more validation of these details. IMHO, this might be a much lighter lift than anticipated.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Mar 18, 2024 at 4:30 PM Matt Burgess <mattyb...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Totally agree, that's what my branch does (see link in previous email). The more I work with it, the more I think I can abstract it further from their JDBC-like API but I started with a bunch of delegate classes then I figure I'll see where I can consolidate to more abstract concepts. If I don't have to support Cassandra 3 with the new API, so much the better.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>> Matt
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Mar 18, 2024 at 4:14 PM David Handermann <exceptionfact...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Matt et al,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> It is good to see the background effort on moving Cassandra capabilities in a supportable direction.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I think new Cassandra components will require a significant departure from current Controller Service abstractions. Right now, the existing service interface does not provide a clean abstraction from the Cassandra library, which is part of the reason for the current coupling to the legacy driver version.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Following up from Joe's comments, it seems like the cleanest way forward is to deprecate the current bundle on the 1.x branch, and remove the current bundle from the main branch. That will provide a clean slate for new Service and Processor implementations, without concern for uncertain compatibility questions.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>> David Handermann
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Mar 18, 2024 at 2:35 PM Matt Burgess <mattyb...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> What do y'all think about removing the individual connection properties from the Cassandra processors for NiFi 2.0 and requiring a CassandraSessionProvider instead?
>>>>>>>>>>>>>>>>>>>>> I think we started doing that elsewhere (Elasticsearch maybe?), I noticed duplicate code in the CassandraSessionProvider and AbstractCassandraProcessor, if we keep those properties I can refactor them into a utility class.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> Matt
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 2:44 PM Steven Matison <steven.mati...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I got through quite a bit of work to enable 4.x…
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> The 3.x pieces that were not backwards compatible are a very edge use case and could have been done slightly differently, but with a workaround.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> https://github.com/steven-matison/nifi/tree/nifi-10120-1
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 2:30 PM Matt Burgess <mattyb...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Oops used the wrong email address so if there have been responses to the Cassandra thread since mine I missed them, my bad!
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 2:00 PM Matt Burgess <mattyb...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I believe the CQL protocol is backwards compatible but the Java API is not. For example "com.datastax.driver.core.Session" is now "com.datastax.oss.driver.api.core.session.Session" and there is no more "Cluster" class. Might be fairly trivial to fix though, if that's the path of least resistance.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 1:40 PM Joe Witt <joe.w...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Matt
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I don't know a ton about Cassandra but when I looked at client/driver notes for 4+ it said it was compatible all the way back to 3.x. Not sure what that means but it surely seems worth exploring. Also I don't know if the 4.x drivers get rid of the vulnerable bits.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 10:39 AM Matt Burgess <mattyb...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> At the very least we should upgrade to Cassandra 3.11.16:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/cassandra/blob/cassandra-3.11.16/CHANGES.txt
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 1:31 PM Matt Burgess <mattyb...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> If the community agrees to get rid of
>>>>>>>>>>>>>>>>>>>>>>>>>>> Cassandra 3 that'll save me effort on the refactor after I add Cassandra 4 :) Otherwise those vulnerabilities would only be in a "new" Cassandra 3 services NAR that would not be included in the convenience binary.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 1:28 PM Joe Witt <joe.w...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mike, Matt,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Happy to hear you both have active efforts or are interested in doing so. Can you help me understand more specifically what that means for the current set of components?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> The CVE hits are concerning and long standing. Supporting Cassandra 3 implies the current set of dependencies would remain too right?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Is the current set of components we have ones we want to retain? We certainly need Cassandra components - but are the ones we have now the right ones?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Joe
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 10:25 AM Matt Burgess <mattyb...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm actively working this, I pushed my branch up in case anyone wants to take a look [1]. The idea is to abstract the Cassandra API "up a couple levels" and provide implementations for Cassandra 3, 4, and eventually 5.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For JDBC-like interfaces this is a PITA because of the API (Statement, PreparedStatement, BoundStatement, ResultSet, etc.) but I'm hoping we can find a common pattern for abstracting the third-party library implementation and API from the NiFi component (Processor, ControllerService, etc.) API. I think we're doing something similar for Kafka?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Matt
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] https://github.com/mattyb149/nifi/tree/cassy4
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 8:43 AM Mike Thomsen <mikerthom...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> That’s been on my todo list for a little while but things kept coming up.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think I could get started on that now. Based on my initial research it appears that scylla uses the exact same api as datastax so supporting both in a cql bundle should theoretically be fairly easy.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mar 14, 2024, at 6:18 PM, Joe Witt <joew...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Team,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cassandra remains a really important system to be able to send data to. However, it seems like we've not maintained these well.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> We appear to be at least a full generation behind on client versions (we are on 3x vs 4x which is the latest stable with 5x apparently coming shortly).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> We have components to send data, query data, and use Cassandra as a cache store. We have older mechanisms for json/avro and publish mechanisms for records.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The libraries we do have depend on outdated versions of Guava and result in many CVE hits.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am inclined to think we should deprecate the 1.x components and remove them as-is from the 2.x line. Then re-introduce them with record only interfaces and built against the latest stable Cassandra/Datastax/ScyllaDB interfaces.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'd love to hear thoughts from those closer to this space both as a user and developer so we can make good next steps.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks