Work so far: https://github.com/MikeThomsen/nifi/tree/cql-changes

On Thu, Mar 21, 2024 at 9:52 AM Mike Thomsen <mikerthom...@gmail.com> wrote:

> Matt/David,
>
> By this evening, I should be at a point where I can share my branch. It
> should be far enough along that y'all can see what I mean about how most of
> the changes really weren't that complicated. My sense is that if we
> collaborate on it, we can probably get it ready for a PR within a week or
> two.
>
> It would probably be a good idea to plan to revisit the Cassandra DMC's
> design and make it more flexible.
>
> One nice thing about the new DataStax driver is that it supports
> configuration by a very detailed configuration file format, so we can give
> users that option + combine it with EL/parameters (I envision an option
> where the user puts EL in the file, we load the file, preprocess the EL, and
> load the result into the driver).
>
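
A minimal sketch of the preprocessing step described above, assuming only simple `${name}` references. The class name `DriverConfigPreprocessor` and the `resolve` method are hypothetical; real NiFi Expression Language is far richer (functions, nesting) and would go through NiFi's own evaluation, which is not shown here.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of NiFi or the DataStax driver.
final class DriverConfigPreprocessor {
    // Matches plain ${name} references only (letters, digits, '_', '.', '-').
    private static final Pattern REFERENCE =
            Pattern.compile("\\$\\{([A-Za-z0-9_.-]+)\\}");

    private DriverConfigPreprocessor() {}

    // Returns configText with every ${name} replaced by values.get(name).
    // Fails fast on an unknown name rather than leaving a half-resolved file.
    static String resolve(String configText, Map<String, String> values) {
        Matcher matcher = REFERENCE.matcher(configText);
        // StringBuffer keeps appendReplacement usable on older Java versions.
        StringBuffer resolved = new StringBuffer();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = values.get(name);
            if (value == null) {
                throw new IllegalArgumentException("No value for reference: " + name);
            }
            // quoteReplacement stops '$' or '\' in the value being treated as syntax.
            matcher.appendReplacement(resolved, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(resolved);
        return resolved.toString();
    }
}
```

The resolved text would then be handed to the driver's configuration loader (I believe `DriverConfigLoader.fromString` in the 4.x driver, but that wiring is not shown or verified here). One design point to settle: the driver's file format already gives `${...}` its own meaning outside quoted strings, so the two substitution syntaxes would need to be kept apart.
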
> On Wed, Mar 20, 2024 at 4:01 PM Mike Thomsen <mikerthom...@gmail.com>
> wrote:
>
>> If it were that simple, they would probably have just gone with that
>> solution. That said, the API is functionally vendor agnostic at this point
>> at the Java API level. So I see no need to add abstraction above that. I've
>> got probably 2/3 of nifi-cassandra-bundle converted. Hitting a few pain
>> points where I'm having to dig deep into the docs to make progress, but so
>> far, so good.
>>
>> On Wed, Mar 20, 2024 at 2:38 PM Matt Burgess <mattyb...@apache.org>
>> wrote:
>>
>>> It would be interesting to see if you exclude the Scylla API JAR from the
>>> Scylla implementation and instead include DataStax's, if that works.
>>> However I'm still leaning towards a vendor-agnostic API.
>>>
>>> On Wed, Mar 20, 2024 at 11:26 AM Mike Thomsen <mikerthom...@gmail.com>
>>> wrote:
>>>
>>> > At first glance, the package names look identical to me:
>>> >
>>> > https://java-driver.docs.scylladb.com/scylla-4.15.0.x/api/index.html
>>> >
>>> > So I see no reason not to take them at their word that it's drop-in.
>>> >
>>> > On Wed, Mar 20, 2024 at 11:04 AM David Handermann <
>>> > exceptionfact...@apache.org> wrote:
>>> >
>>> > > Mike,
>>> > >
>>> > > One important thing to mention about the DataStax vs ScyllaDB driver
>>> > > is that the Maven coordinates are different, and managing the
>>> > > dependencies correctly will make or break the implementation.
>>> > >
>>> > > In other words, if it is possible to use the DataStax 4 core JAR in
>>> > > the Controller Service API, but use the ScyllaDB 3 query JAR in the
>>> > > ScyllaDB implementation, then that could avoid the need for
>>> additional
>>> > > abstraction. Without taking a closer look, however, I would be
>>> > > surprised if this worked.
>>> > >
>>> > > Although ScyllaDB highlights their forked driver as a drop-in
>>> > > replacement for the DataStax version, and maintains the same Java
>>> > > package names, there is a difference between a complete replacement
>>> > > and a shared API JAR. Without a common API JAR that both
>>> > > implementations can use, it will be necessary to provide an
>>> > > abstraction in NiFi that avoids depending on either library at the
>>> > > Controller Service API level.
>>> > >
>>> > > Regards,
>>> > > David Handermann
>>> > >
>>> > > On Wed, Mar 20, 2024 at 8:25 AM Mike Thomsen <mikerthom...@gmail.com
>>> >
>>> > > wrote:
>>> > > >
>>> > > > Matt/David,
>>> > > >
>>> > > > I got some time this morning to take a crack at directly migrating
>>> it
>>> > > over
>>> > > > to the DataStax 4.17 driver. Definitely got a lot of work to do,
>>> but so
>>> > > far
>>> > > > I haven't hit any real snags. This is a branch that reverts the
>>> commit
>>> > to
>>> > > > remove the cassandra bundle and reuses the existing features as a
>>> > > > foundation. From what I'm seeing so far (feels like I'm about 25%
>>> done)
>>> > > it
>>> > > > should be doable to reuse the existing bundle, but rename it to the
>>> > "CQL
>>> > > > Bundle" and just add a second controller service for Scylla that is
>>> > > > otherwise 100% the same codewise.
>>> > > >
>>> > > > On Tue, Mar 19, 2024 at 6:41 PM Mike Thomsen <
>>> mikerthom...@gmail.com>
>>> > > wrote:
>>> > > >
>>> > > > > A cursory look at the Cassandra 5 stuff didn’t indicate any
>>> > > > > incompatibility. So yeah, I think we are likely pretty safe to
>>> use
>>> > the
>>> > > 4.17
>>> > > > > driver.
>>> > > > > Sent from my iPhone
>>> > > > >
>>> > > > > > On Mar 19, 2024, at 3:35 PM, Matt Burgess <
>>> mattyb...@apache.org>
>>> > > wrote:
>>> > > > > >
>>> > > > > > Is it likely now (due to the refactor) that we will simply be
>>> able
>>> > > to
>>> > > > > > upgrade the driver when Cassandra 5 is GA? Also does anyone use
>>> > > Netflix's
>>> > > > > > Astyanax [1]?
>>> > > > > >
>>> > > > > > [1]
>>> > > > > >
>>> > > > >
>>> > >
>>> >
>>> https://cassandra.apache.org/doc/stable/cassandra/getting_started/drivers.html#java
>>> > > > > >
>>> > > > > >> On Tue, Mar 19, 2024 at 3:10 PM Mike Thomsen <
>>> > > mikerthom...@gmail.com>
>>> > > > > wrote:
>>> > > > > >>
>>> > > > > >> Realistically, I think we are only likely to see two drivers:
>>> > > > > >>
>>> > > > > >> * DataStax
>>> > > > > >> * ScyllaDB
>>> > > > > >>
>>> > > > > >> The latter makes a selling point of being a binary compatible,
>>> > > drop-in
>>> > > > > >> replacement for the former.
>>> > > > > >>
>>> > > > > >> That's why I don't see a need to have an abstraction layer per
>>> > se. I
>>> > > > > think
>>> > > > > >> we only need "DataStaxConnectionProviderImpl" and
>>> > > > > >> "ScyllaDBConnectionProviderImpl" with the difference being
>>> which
>>> > > jar is
>>> > > > > >> imported by maven.
>>> > > > > >>
>>> > > > > >> On Tue, Mar 19, 2024 at 2:59 PM David Handermann <
>>> > > > > >> exceptionfact...@apache.org> wrote:
>>> > > > > >>
>>> > > > > >>> Mike,
>>> > > > > >>>
>>> > > > > >>> Thanks for the reply and clarification.
>>> > > > > >>>
>>> > > > > >>> I agree there is no need to maintain support for the
>>> DataStax 3
>>> > > driver
>>> > > > > >>> and Java API, any new components should be built on the
>>> latest
>>> > > version
>>> > > > > >>> of the driver.
>>> > > > > >>>
>>> > > > > >>> What we do need going forward is to avoid, if at all
>>> possible,
>>> > > having
>>> > > > > >>> a DataStax 4 dependency in the Controller Service API.
>>> > > > > >>>
>>> > > > > >>> One example of this is the WebClientServiceProvider
>>> interface.
>>> > That
>>> > > > > >>> Controller Service API does not have any third-party
>>> > dependencies.
>>> > > The
>>> > > > > >>> Controller Service implementation,
>>> > > StandardWebClientServiceProvider,
>>> > > > > >>> has a dependency on OkHttp to implement HTTP communication.
>>> That
>>> > is
>>> > > > > >>> the kind of abstraction that would be ideal, and I believe
>>> that
>>> > > also
>>> > > > > >>> aligns with what Matt has described.
>>> > > > > >>>
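
A minimal sketch of the kind of API boundary described in the quoted message above, by analogy with WebClientServiceProvider: the interface uses only JDK types, and the driver stays behind the implementation. All names here (`CqlQueryService`, `InMemoryCqlQueryService`, `RowCounter`) are hypothetical and do not come from any branch in this thread; the in-memory class stands in for an implementation that would wrap a vendor driver session.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical Controller Service API: only JDK types appear in the signature,
// so code compiled against it needs no driver JAR on its classpath.
interface CqlQueryService {
    List<Map<String, Object>> query(String cql, List<Object> parameters);
}

// Stand-in implementation for illustration. A real implementation would hold
// the vendor driver's session and translate its rows into plain maps.
final class InMemoryCqlQueryService implements CqlQueryService {
    private final List<Map<String, Object>> rows;
    private final List<String> executed = new ArrayList<>();

    InMemoryCqlQueryService(List<Map<String, Object>> rows) {
        this.rows = new ArrayList<>(rows);
    }

    @Override
    public List<Map<String, Object>> query(String cql, List<Object> parameters) {
        executed.add(cql); // record the statement; parameters are ignored here
        return Collections.unmodifiableList(rows);
    }

    List<String> executedStatements() {
        return Collections.unmodifiableList(executed);
    }
}

// Processor-side code depends on the interface alone.
final class RowCounter {
    private RowCounter() {}

    static int countRows(CqlQueryService service, String cql) {
        return service.query(cql, Collections.emptyList()).size();
    }
}
```

With this shape, swapping the driver would mean replacing only the class behind `CqlQueryService`. The cost is that statement, paging, and type-mapping features of the driver have to be re-expressed in the neutral interface, which is the trade-off being debated in this thread.
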
>>> > > > > >>> Regards,
>>> > > > > >>> David Handermann
>>> > > > > >>>
>>> > > > > >>> On Tue, Mar 19, 2024 at 1:45 PM Mike Thomsen <
>>> > > mikerthom...@gmail.com>
>>> > > > > >>> wrote:
>>> > > > > >>>>
>>> > > > > >>>> ** we can dump v3 **DRIVER** compatibility, since later 4.X
>>> Java
>>> > > > > >> drivers
>>> > > > > >>>> are backward compatible with Cassandra 3
>>> > > > > >>>>
>>> > > > > >>>> On Tue, Mar 19, 2024 at 2:43 PM Mike Thomsen <
>>> > > mikerthom...@gmail.com>
>>> > > > > >>> wrote:
>>> > > > > >>>>
>>> > > > > >>>>> David,
>>> > > > > >>>>>
>>> > > > > >>>>> Before we proceed, I think we should make sure we're all
>>> > > > > >> understanding
>>> > > > > >>> the
>>> > > > > >>>>> same problem here. Starting with this:
>>> > > > > >>>>>
>>> > > > > >>>>>> I believe the CQL protocol is backwards compatible but the
>>> > Java
>>> > > API
>>> > > > > >>> is
>>> > > > > >>>>> not.
>>> > > > > >>>>>> For example "com.datastax.driver.core.Session" is now
>>> > > > > >>>>>> "com.datastax.oss.driver.api.core.session.Session" and
>>> there
>>> > is
>>> > > no
>>> > > > > >>> more
>>> > > > > >>>>>> "Cluster" class. Might be fairly trivial to fix though, if
>>> > > that's
>>> > > > > >> the
>>> > > > > >>>>> path
>>> > > > > >>>>>> of least resistance.
>>> > > > > >>>>>
>>> > > > > >>>>> From what I've learned using Cassandra 3 and 4 in my day
>>> job
>>> > and
>>> > > > > >>> reading
>>> > > > > >>>>> up on this stuff for the sake of discussion, that all
>>> tracks.
>>> > We
>>> > > used
>>> > > > > >>> the
>>> > > > > >>>>> ~4.11 driver in Spring Boot on both v3 and v4 clusters
>>> without
>>> > > issue
>>> > > > > >>> during
>>> > > > > >>>>> an upgrade. So I don't see any reason to factor in the
>>> "changes
>>> > > from
>>> > > > > >>>>> DataStax 3 to 4" since the changes were likely a one-off
>>> > decision
>>> > > > > >>> meant to
>>> > > > > >>>>> position the driver for better future support and
>>> stability.
>>> > > > > >>>>>
>>> > > > > >>>>> TL;DR, we can dump v3 compatibility and the only thing our
>>> > users
>>> > > will
>>> > > > > >>>>> notice is if we make the controller service totally
>>> > incompatible
>>> > > with
>>> > > > > >>> the
>>> > > > > >>>>> one they're already using which is something we can
>>> actively
>>> > > avoid.
>>> > > > > >>>>>
>>> > > > > >>>>> On Tue, Mar 19, 2024 at 2:00 PM David Handermann <
>>> > > > > >>>>> exceptionfact...@apache.org> wrote:
>>> > > > > >>>>>
>>> > > > > >>>>>> All,
>>> > > > > >>>>>>
>>> > > > > >>>>>> I support a Controller Service API abstraction around the
>>> > > Cassandra
>>> > > > > >>>>>> Driver. The changes from DataStax 3 to 4 already
>>> highlight the
>>> > > need
>>> > > > > >>>>>> for that abstraction. The donation of the DataStax Java
>>> driver
>>> > > to
>>> > > > > >>>>>> Apache [1] also shows the value of providing some level of
>>> > > > > >> isolation,
>>> > > > > >>>>>> if at all possible.
>>> > > > > >>>>>>
>>> > > > > >>>>>> I have not taken a close look at Matt's branch, and
>>> the
>>> > > details
>>> > > > > >> of
>>> > > > > >>>>>> the abstraction are important, but having the abstraction
>>> can
>>> > be
>>> > > > > >>>>>> useful to avoid getting back to this same situation.
>>> > > > > >>>>>>
>>> > > > > >>>>>> Regards,
>>> > > > > >>>>>> David Handermann
>>> > > > > >>>>>>
>>> > > > > >>>>>> [1] https://github.com/apache/cassandra-java-driver/
>>> > > > > >>>>>>
>>> > > > > >>>>>> On Tue, Mar 19, 2024 at 12:37 PM Mike Thomsen <
>>> > > > > >> mikerthom...@gmail.com
>>> > > > > >>>>
>>> > > > > >>>>>> wrote:
>>> > > > > >>>>>>>
>>> > > > > >>>>>>> Matt,
>>> > > > > >>>>>>>
>>> > > > > >>>>>>> I got that. My point was that the Java changes appear to
>>> be a
>>> > > one
>>> > > > > >>> time
>>> > > > > >>>>>>> thing that DataStax did to make a better driver with a
>>> much
>>> > > more
>>> > > > > >>>>>>> future-proof API. Since Scylla tracks them as closely as
>>> > > > > >> possible, I
>>> > > > > >>>>>>> suspect that we don't need to plan for a bunch of
>>> abstraction
>>> > > to
>>> > > > > >>> isolate
>>> > > > > >>>>>>> Java changes.
>>> > > > > >>>>>>>
>>> > > > > >>>>>>> On Tue, Mar 19, 2024 at 11:07 AM Steven Matison <
>>> > > > > >>>>>> steven.mati...@gmail.com>
>>> > > > > >>>>>>> wrote:
>>> > > > > >>>>>>>
>>> > > > > >>>>>>>> That was kinda where I got stuck and fell out on my
>>> > > branch/jira.
>>> > > > > >>>>>> Mike and
>>> > > > > >>>>>>>> I wanted to make a new controller service , without
>>> backward
>>> > > > > >>>>>> compatibility;
>>> > > > > >>>>>>>> and remove the duplicate driver/connection properties
>>> found
>>> > in
>>> > > > > >>> some
>>> > > > > >>>>>> of the
>>> > > > > >>>>>>>> processors.
>>> > > > > >>>>>>>>
>>> > > > > >>>>>>>> I agree taking out all old stuff and making new
>>> controller
>>> > > > > >> service
>>> > > > > >>>>>> makes
>>> > > > > >>>>>>>> most sense.  4.x and 5.x should be mostly backwards
>>> > compatible
>>> > > > > >> to
>>> > > > > >>>>>> 2&3.x
>>> > > > > >>>>>>>> with how it’s used within current processors.
>>> > > > > >>>>>>>>
>>> > > > > >>>>>>>>
>>> > > > > >>>>>>>>
>>> > > > > >>>>>>>> On Tue, Mar 19, 2024 at 10:49 AM Matt Burgess <
>>> > > > > >>> mattyb...@apache.org>
>>> > > > > >>>>>>>> wrote:
>>> > > > > >>>>>>>>
>>> > > > > >>>>>>>>> The abstraction is to isolate Java API changes, not
>>> > protocol
>>> > > > > >>>>>>>> compatibility.
>>> > > > > >>>>>>>>> Changing to the java-driver comes with a number of
>>> changes
>>> > to
>>> > > > > >>> the
>>> > > > > >>>>>> code
>>> > > > > >>>>>>>> (see
>>> > > > > >>>>>>>>> Steven's and my branches), if we can abstract that API
>>> it
>>> > > > > >> should
>>> > > > > >>>>>> lead to
>>> > > > > >>>>>>>>> more maintainable code in the future by not having to
>>> > change
>>> > > > > >> any
>>> > > > > >>>>>>>>> processors, just the controller service implementation.
>>> > > > > >>>>>>>>>
>>> > > > > >>>>>>>>>
>>> > > > > >>>>>>>>> On Tue, Mar 19, 2024 at 10:14 AM Mike Thomsen <
>>> > > > > >>>>>> mikerthom...@gmail.com>
>>> > > > > >>>>>>>>> wrote:
>>> > > > > >>>>>>>>>
>>> > > > > >>>>>>>>>>
>>> > > > > >>>>>>>>>>
>>> > > > > >>>>>>>>>
>>> > > > > >>>>>>>>
>>> > > > > >>>>>>
>>> > > > > >>>
>>> > > > > >>
>>> > > > >
>>> > >
>>> >
>>> https://opensource.docs.scylladb.com/stable/using-scylla/drivers/cql-drivers/scylla-java-driver.html
>>> > > > > >>>>>>>>>>
>>> > > > > >>>>>>>>>> Directly quoting Scylla docs here:
>>> > > > > >>>>>>>>>>
>>> > > > > >>>>>>>>>>> The Scylla Java Driver is a drop-in replacement for
>>> the
>>> > > > > >>>>>> DataStax Java
>>> > > > > >>>>>>>>>> Driver. As such, no code changes are needed to use
>>> this
>>> > > > > >>> driver.
>>> > > > > >>>>>>>>>>
>>> > > > > >>>>>>>>>> On Tue, Mar 19, 2024 at 10:13 AM Mike Thomsen <
>>> > > > > >>>>>> mikerthom...@gmail.com>
>>> > > > > >>>>>>>>>> wrote:
>>> > > > > >>>>>>>>>>
>>> > > > > >>>>>>>>>>> Matt,
>>> > > > > >>>>>>>>>>>
>>> > > > > >>>>>>>>>>> I don't think we need to really "abstract above" the
>>> > > > > >> drivers
>>> > > > > >>>>>> because
>>> > > > > >>>>>>>>> the
>>> > > > > >>>>>>>>>>> Java DataStax driver appears to support 4.X all the
>>> way
>>> > > > > >>> back to
>>> > > > > >>>>>> 2.X,
>>> > > > > >>>>>>>> as
>>> > > > > >>>>>>>>>>> well as the enterprise versions from DataStax
>>> > > > > >>>>>>>>>>>
>>> > > > > >>>>>>>>>>>
>>> > > > > >>>>>>
>>> > > https://docs.datastax.com/en/driver-matrix/docs/java-drivers.html
>>> > > > > >>>>>>>>>>>
>>> > > > > >>>>>>>>>>> Similar situation with Scylla. When I looked at the
>>> > > > > >> driver,
>>> > > > > >>> it
>>> > > > > >>>>>>>> appeared
>>> > > > > >>>>>>>>>> to
>>> > > > > >>>>>>>>>>> copy verbatim the entire public API of that driver.
>>> So I
>>> > > > > >>> think
>>> > > > > >>>>>> before
>>> > > > > >>>>>>>>> we
>>> > > > > >>>>>>>>>>> dive into abstractions, it's worth doing a bit more
>>> > > > > >>> validation
>>> > > > > >>>>>> of
>>> > > > > >>>>>>>> these
>>> > > > > >>>>>>>>>>> details. IMHO, this might be a much lighter lift than
>>> > > > > >>>>>> anticipated.
>>> > > > > >>>>>>>>>>>
>>> > > > > >>>>>>>>>>>
>>> > > > > >>>>>>>>>>> On Mon, Mar 18, 2024 at 4:30 PM Matt Burgess <
>>> > > > > >>>>>> mattyb...@gmail.com>
>>> > > > > >>>>>>>>>> wrote:
>>> > > > > >>>>>>>>>>>
>>> > > > > >>>>>>>>>>>> Totally agree, that's what my branch does (see link
>>> in
>>> > > > > >>> previous
>>> > > > > >>>>>>>>> email).
>>> > > > > >>>>>>>>>>>> The
>>> > > > > >>>>>>>>>>>> more I work with it, the more I think I can
>>> abstract it
>>> > > > > >>>>>> further from
>>> > > > > >>>>>>>>>> their
>>> > > > > >>>>>>>>>>>> JDBC-like API but I started with a bunch of delegate
>>> > > > > >>> classes
>>> > > > > >>>>>> then I
>>> > > > > >>>>>>>>>> figure
>>> > > > > >>>>>>>>>>>> I'll see where I can consolidate to more abstract
>>> > > > > >>> concepts. If
>>> > > > > >>>>>> I
>>> > > > > >>>>>>>> don't
>>> > > > > >>>>>>>>>>>> have
>>> > > > > >>>>>>>>>>>> to support Cassandra 3 with the new API, so much the
>>> > > > > >>> better.
>>> > > > > >>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>> Regards,
>>> > > > > >>>>>>>>>>>> Matt
>>> > > > > >>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>> On Mon, Mar 18, 2024 at 4:14 PM David Handermann <
>>> > > > > >>>>>>>>>>>> exceptionfact...@apache.org> wrote:
>>> > > > > >>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>> Matt et al,
>>> > > > > >>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>> It is good to see the background effort on moving
>>> > > > > >>> Cassandra
>>> > > > > >>>>>>>>>>>>> capabilities in a supportable direction.
>>> > > > > >>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>> I think new Cassandra components will require a
>>> > > > > >>> significant
>>> > > > > >>>>>>>>> departure
>>> > > > > >>>>>>>>>>>>> from current Controller Service abstractions. Right
>>> > > > > >> now,
>>> > > > > >>> the
>>> > > > > >>>>>>>>> existing
>>> > > > > >>>>>>>>>>>>> service interface does not provide a clean
>>> abstraction
>>> > > > > >>> from
>>> > > > > >>>>>> the
>>> > > > > >>>>>>>>>>>>> Cassandra library, which is part of the reason for
>>> the
>>> > > > > >>>>>> current
>>> > > > > >>>>>>>>>>>>> coupling to the legacy driver version.
>>> > > > > >>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>> Following up from Joe's comments, it seems like the
>>> > > > > >>> cleanest
>>> > > > > >>>>>> way
>>> > > > > >>>>>>>>>>>>> forward is to deprecate the current bundle on the
>>> 1.x
>>> > > > > >>>>>> branch, and
>>> > > > > >>>>>>>>>>>>> remove the current bundle from the main branch.
>>> That
>>> > > > > >> will
>>> > > > > >>>>>> provide
>>> > > > > >>>>>>>> a
>>> > > > > >>>>>>>>>>>>> clean slate for new Service and Processor
>>> > > > > >>> implementations,
>>> > > > > >>>>>> without
>>> > > > > >>>>>>>>>>>>> concern for uncertain compatibility questions.
>>> > > > > >>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>> Regards,
>>> > > > > >>>>>>>>>>>>> David Handermann
>>> > > > > >>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>> On Mon, Mar 18, 2024 at 2:35 PM Matt Burgess <
>>> > > > > >>>>>>>> mattyb...@apache.org>
>>> > > > > >>>>>>>>>>>> wrote:
>>> > > > > >>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>> What do y'all think about removing the individual
>>> > > > > >>>>>> connection
>>> > > > > >>>>>>>>>>>> properties
>>> > > > > >>>>>>>>>>>>>> from the Cassandra processors for NiFi 2.0 and
>>> > > > > >>> requiring a
>>> > > > > >>>>>>>>>>>>>> CassandraSessionProvider instead? I think we
>>> started
>>> > > > > >>> doing
>>> > > > > >>>>>> that
>>> > > > > >>>>>>>>>>>> elsewhere
>>> > > > > >>>>>>>>>>>>>> (Elasticsearch maybe?), I noticed duplicate code
>>> in
>>> > > > > >> the
>>> > > > > >>>>>>>>>>>>>> CassandraSessionProvider and
>>> > > > > >>> AbstractCassandraProcessor,
>>> > > > > >>>>>> if we
>>> > > > > >>>>>>>>> keep
>>> > > > > >>>>>>>>>>>> those
>>> > > > > >>>>>>>>>>>>>> properties I can refactor them into a utility
>>> class.
>>> > > > > >>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>> Thanks,
>>> > > > > >>>>>>>>>>>>>> Matt
>>> > > > > >>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 2:44 PM Steven Matison <
>>> > > > > >>>>>>>>>>>> steven.mati...@gmail.com
>>> > > > > >>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>> wrote:
>>> > > > > >>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>> I got through quite a bit of work to enable 4.x…
>>> > > > > >>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>> The 3.x pieces that were not backwards compatible
>>> > > > > >>>>>>>>>>>>>>> are a very edge use case and could have been done
>>> > > > > >>>>>>>>>>>>>>> slightly differently, but with a workaround.
>>> > > > > >>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>
>>> > > > > >>> https://github.com/steven-matison/nifi/tree/nifi-10120-1
>>> > > > > >>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 2:30 PM Matt Burgess <
>>> > > > > >>>>>>>>>> mattyb...@apache.org>
>>> > > > > >>>>>>>>>>>>> wrote:
>>> > > > > >>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>> Oops used the wrong email address so if there
>>> > > > > >> have
>>> > > > > >>> been
>>> > > > > >>>>>>>>>> responses
>>> > > > > >>>>>>>>>>>> to
>>> > > > > >>>>>>>>>>>>> the
>>> > > > > >>>>>>>>>>>>>>>> Cassandra thread since mine I missed them, my
>>> > > > > >> bad!
>>> > > > > >>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 2:00 PM Matt Burgess <
>>> > > > > >>>>>>>>>> mattyb...@gmail.com
>>> > > > > >>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>> wrote:
>>> > > > > >>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>> I believe the CQL protocol is backwards
>>> > > > > >>> compatible
>>> > > > > >>>>>> but the
>>> > > > > >>>>>>>>>> Java
>>> > > > > >>>>>>>>>>>>> API is
>>> > > > > >>>>>>>>>>>>>>>>> not. For example
>>> > > > > >>> "com.datastax.driver.core.Session"
>>> > > > > >>>>>> is now
>>> > > > > >>>>>>>>>>>>>>>>>
>>> > > > > >>> "com.datastax.oss.driver.api.core.session.Session"
>>> > > > > >>>>>> and
>>> > > > > >>>>>>>> there
>>> > > > > >>>>>>>>>> is
>>> > > > > >>>>>>>>>>>> no
>>> > > > > >>>>>>>>>>>>> more
>>> > > > > >>>>>>>>>>>>>>>>> "Cluster" class. Might be fairly trivial to fix
>>> > > > > >>>>>> though, if
>>> > > > > >>>>>>>>>>>> that's
>>> > > > > >>>>>>>>>>>>> the
>>> > > > > >>>>>>>>>>>>>>>> path
>>> > > > > >>>>>>>>>>>>>>>>> of least resistance.
>>> > > > > >>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 1:40 PM Joe Witt <
>>> > > > > >>>>>>>>> joe.w...@gmail.com>
>>> > > > > >>>>>>>>>>>>> wrote:
>>> > > > > >>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>> Matt
>>> > > > > >>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>> I don't know a ton about Cassandra but when I
>>> > > > > >>> looked
>>> > > > > >>>>>> at
>>> > > > > >>>>>>>>>>>>> client/driver
>>> > > > > >>>>>>>>>>>>>>>> notes
>>> > > > > >>>>>>>>>>>>>>>>>> for 4+ it said it was compatible all the way
>>> > > > > >>> back
>>> > > > > >>>>>> to 3.x.
>>> > > > > >>>>>>>>>> Not
>>> > > > > >>>>>>>>>>>>> sure
>>> > > > > >>>>>>>>>>>>>>>> what
>>> > > > > >>>>>>>>>>>>>>>>>> that means but it surely seems worth
>>> > > > > >> exploring.
>>> > > > > >>>>>> Also I
>>> > > > > >>>>>>>>> don't
>>> > > > > >>>>>>>>>>>> know
>>> > > > > >>>>>>>>>>>>> if
>>> > > > > >>>>>>>>>>>>>>> the
>>> > > > > >>>>>>>>>>>>>>>>>> 4.x drivers get rid of the vulnerable bits.
>>> > > > > >>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>> Thanks
>>> > > > > >>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 10:39 AM Matt Burgess
>>> > > > > >> <
>>> > > > > >>>>>>>>>>>>> mattyb...@apache.org>
>>> > > > > >>>>>>>>>>>>>>>>>> wrote:
>>> > > > > >>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>> At the very least we should upgrade to
>>> > > > > >>> Cassandra
>>> > > > > >>>>>>>> 3.11.16:
>>> > > > > >>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>
>>> > > > > >>>>>>
>>> > > > > >>>
>>> > >
>>> https://github.com/apache/cassandra/blob/cassandra-3.11.16/CHANGES.txt
>>> > > > > >>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 1:31 PM Matt
>>> > > > > >> Burgess <
>>> > > > > >>>>>>>>>>>>> mattyb...@apache.org>
>>> > > > > >>>>>>>>>>>>>>>>>> wrote:
>>> > > > > >>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>> If the community agrees to get rid of
>>> > > > > >>> Cassandra
>>> > > > > >>>>>> 3
>>> > > > > >>>>>>>>> that'll
>>> > > > > >>>>>>>>>>>>> save me
>>> > > > > >>>>>>>>>>>>>>>>>> effort
>>> > > > > >>>>>>>>>>>>>>>>>>>> on the refactor after I add Cassandra 4 :)
>>> > > > > >>>>>> Otherwise
>>> > > > > >>>>>>>>>> those
>>> > > > > >>>>>>>>>>>>>>>>>>>> vulnerabilities would only be in a "new"
>>> > > > > >>>>>> Cassandra 3
>>> > > > > >>>>>>>>>>>> services
>>> > > > > >>>>>>>>>>>>> NAR
>>> > > > > >>>>>>>>>>>>>>>> that
>>> > > > > >>>>>>>>>>>>>>>>>>>> would not be included in the convenience
>>> > > > > >>> binary.
>>> > > > > >>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 1:28 PM Joe Witt <
>>> > > > > >>>>>>>>>>>> joe.w...@gmail.com>
>>> > > > > >>>>>>>>>>>>>>>> wrote:
>>> > > > > >>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>> Mike, Matt,
>>> > > > > >>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>> Happy to hear you both have active
>>> > > > > >> efforts
>>> > > > > >>> or
>>> > > > > >>>>>> are
>>> > > > > >>>>>>>>>>>> interested
>>> > > > > >>>>>>>>>>>>> in
>>> > > > > >>>>>>>>>>>>>>>> doing
>>> > > > > >>>>>>>>>>>>>>>>>>> so.
>>> > > > > >>>>>>>>>>>>>>>>>>>>> Can you help me understand more
>>> > > > > >>> specifically
>>> > > > > >>>>>> what
>>> > > > > >>>>>>>> that
>>> > > > > >>>>>>>>>>>> means
>>> > > > > >>>>>>>>>>>>> for
>>> > > > > >>>>>>>>>>>>>>>> the
>>> > > > > >>>>>>>>>>>>>>>>>>>>> current set of components?
>>> > > > > >>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>> The CVE hits are concerning and long
>>> > > > > >>> standing.
>>> > > > > >>>>>>>>>> Supporting
>>> > > > > >>>>>>>>>>>>>>>> Cassandra
>>> > > > > >>>>>>>>>>>>>>>>>> 3
>>> > > > > >>>>>>>>>>>>>>>>>>>>> implies the current set of dependencies
>>> > > > > >>> would
>>> > > > > >>>>>> remain
>>> > > > > >>>>>>>>> too
>>> > > > > >>>>>>>>>>>>> right?
>>> > > > > >>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>> Is the current set of components we have
>>> > > > > >>> ones
>>> > > > > >>>>>> we
>>> > > > > >>>>>>>> want
>>> > > > > >>>>>>>>> to
>>> > > > > >>>>>>>>>>>>> retain?
>>> > > > > >>>>>>>>>>>>>>>> We
>>> > > > > >>>>>>>>>>>>>>>>>>>>> certainly need Cassandra components - but
>>> > > > > >>> are
>>> > > > > >>>>>> the
>>> > > > > >>>>>>>> ones
>>> > > > > >>>>>>>>>> we
>>> > > > > >>>>>>>>>>>>> have
>>> > > > > >>>>>>>>>>>>>>> now
>>> > > > > >>>>>>>>>>>>>>>>>> the
>>> > > > > >>>>>>>>>>>>>>>>>>>>> right ones?
>>> > > > > >>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>> Thanks
>>> > > > > >>>>>>>>>>>>>>>>>>>>> Joe
>>> > > > > >>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 10:25 AM Matt
>>> > > > > >>> Burgess <
>>> > > > > >>>>>>>>>>>>>>>> mattyb...@apache.org>
>>> > > > > >>>>>>>>>>>>>>>>>>>>> wrote:
>>> > > > > >>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> I'm actively working this, I pushed my
>>> > > > > >>>>>> branch up
>>> > > > > >>>>>>>> in
>>> > > > > >>>>>>>>>> case
>>> > > > > >>>>>>>>>>>>> anyone
>>> > > > > >>>>>>>>>>>>>>>>>> wants
>>> > > > > >>>>>>>>>>>>>>>>>>> to
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> take a look [1]. The idea is to
>>> > > > > >> abstract
>>> > > > > >>> the
>>> > > > > >>>>>>>>> Cassandra
>>> > > > > >>>>>>>>>>>> API
>>> > > > > >>>>>>>>>>>>> "up
>>> > > > > >>>>>>>>>>>>>>> a
>>> > > > > >>>>>>>>>>>>>>>>>>> couple
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> levels" and provide implementations for
>>> > > > > >>>>>> Cassandra
>>> > > > > >>>>>>>> 3,
>>> > > > > >>>>>>>>>> 4,
>>> > > > > >>>>>>>>>>>> and
>>> > > > > >>>>>>>>>>>>>>>>>> eventually
>>> > > > > >>>>>>>>>>>>>>>>>>>>> 5.
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> For JDBC-like interfaces this is a PITA
>>> > > > > >>>>>> because of
>>> > > > > >>>>>>>>> the
>>> > > > > >>>>>>>>>>>> API
>>> > > > > >>>>>>>>>>>>>>>>>> (Statement,
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> PreparedStatement, BoundStatement,
>>> > > > > >>> ResultSet,
>>> > > > > >>>>>>>> etc.)
>>> > > > > >>>>>>>>>> but
>>> > > > > >>>>>>>>>>>> I'm
>>> > > > > >>>>>>>>>>>>>>>> hoping
>>> > > > > >>>>>>>>>>>>>>>>>> we
>>> > > > > >>>>>>>>>>>>>>>>>>>>> can
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> find a common pattern for abstracting
>>> > > > > >> the
>>> > > > > >>>>>>>>> third-party
>>> > > > > >>>>>>>>>>>>> library
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> implementation and API from the NiFi
>>> > > > > >>>>>> component
>>> > > > > >>>>>>>>>>>> (Processor,
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> ControllerService, etc.) API. I think
>>> > > > > >>> we're
>>> > > > > >>>>>> doing
>>> > > > > >>>>>>>>>>>> something
>>> > > > > >>>>>>>>>>>>>>>> similar
>>> > > > > >>>>>>>>>>>>>>>>>>> for
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> Kafka?
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> Regards,
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> Matt
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> [1]
>>> > > > > >>>>>> https://github.com/mattyb149/nifi/tree/cassy4
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> On Fri, Mar 15, 2024 at 8:43 AM Mike
>>> > > > > >>> Thomsen
>>> > > > > >>>>>> <
>>> > > > > >>>>>>>>>>>>>>>>>> mikerthom...@gmail.com>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> wrote:
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>> That’s been on my todo list for a
>>> > > > > >>> little
>>> > > > > >>>>>> while
>>> > > > > >>>>>>>> but
>>> > > > > >>>>>>>>>>>> things
>>> > > > > >>>>>>>>>>>>>>> kept
>>> > > > > >>>>>>>>>>>>>>>>>>> coming
>>> > > > > >>>>>>>>>>>>>>>>>>>>> up.
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>> I think I could get started on that
>>> > > > > >>> now.
>>> > > > > >>>>>> Based
>>> > > > > >>>>>>>> on
>>> > > > > >>>>>>>>> my
>>> > > > > >>>>>>>>>>>>> initial
>>> > > > > >>>>>>>>>>>>>>>>>>> research
>>> > > > > >>>>>>>>>>>>>>>>>>>>> it
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>> appears that scylla uses the exact
>>> > > > > >> same
>>> > > > > >>>>>> api as
>>> > > > > >>>>>>>>>>>> datastax
>>> > > > > >>>>>>>>>>>>> so
>>> > > > > >>>>>>>>>>>>>>>>>>> supporting
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> both
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>> in a cql bundle should theoretically
>>> > > > > >> be
>>> > > > > >>>>>> fairly
>>> > > > > >>>>>>>>> easy.
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>> Sent from my iPhone
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> On Mar 14, 2024, at 6:18 PM, Joe
>>> > > > > >>> Witt <
>>> > > > > >>>>>>>>>>>>> joew...@apache.org>
>>> > > > > >>>>>>>>>>>>>>>>>> wrote:
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> Team,
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> Cassandra remains a really
>>> > > > > >> important
>>> > > > > >>>>>> system to
>>> > > > > >>>>>>>>> be
>>> > > > > >>>>>>>>>>>> able
>>> > > > > >>>>>>>>>>>>> to
>>> > > > > >>>>>>>>>>>>>>>> send
>>> > > > > >>>>>>>>>>>>>>>>>>> data
>>> > > > > >>>>>>>>>>>>>>>>>>>>> to.
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> However, it seems like we've not
>>> > > > > >>>>>> maintained
>>> > > > > >>>>>>>>> these
>>> > > > > >>>>>>>>>>>>> well.  We
>>> > > > > >>>>>>>>>>>>>>>>>> have
>>> > > > > >>>>>>>>>>>>>>>>>>>>> what
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> appears to be at least a full
>>> > > > > >>> generation
>>> > > > > >>>>>>>> behind
>>> > > > > >>>>>>>>> on
>>> > > > > >>>>>>>>>>>>> client
>>> > > > > >>>>>>>>>>>>>>>>>> versions
>>> > > > > >>>>>>>>>>>>>>>>>>>>> (we
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>> are
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> on 3x vs 4x which is the latest
>>> > > > > >>> stable
>>> > > > > >>>>>> with 5x
>>> > > > > >>>>>>>>>>>>> apparently
>>> > > > > >>>>>>>>>>>>>>>>>> coming
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>> shortly).
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> We have components to send data,
>>> > > > > >>> query
>>> > > > > >>>>>> data,
>>> > > > > >>>>>>>> and
>>> > > > > >>>>>>>>>> use
>>> > > > > >>>>>>>>>>>>>>>> Cassandra
>>> > > > > >>>>>>>>>>>>>>>>>> as
>>> > > > > >>>>>>>>>>>>>>>>>>> a
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> cache
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> store.  We have older mechanisms
>>> > > > > >> for
>>> > > > > >>>>>> json/avro
>>> > > > > >>>>>>>>> and
>>> > > > > >>>>>>>>>>>>> publish
>>> > > > > >>>>>>>>>>>>>>>>>>>>> mechanisms
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> for
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> records.
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> The libraries we do have depend on
>>> > > > > >>>>>> outdated
>>> > > > > >>>>>>>>>>>> versions of
>>> > > > > >>>>>>>>>>>>>>> Guava
>>> > > > > >>>>>>>>>>>>>>>>>> and
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> result
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>> in
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> many CVE hits.
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> I am inclined to think we should
>>> > > > > >>>>>> deprecate the
>>> > > > > >>>>>>>>> 1.x
>>> > > > > >>>>>>>>>>>>>>> components
>>> > > > > >>>>>>>>>>>>>>>>>> and
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> remove
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> them as-is from the 2.x line.  Then
>>> > > > > >>>>>>>> re-introduce
>>> > > > > >>>>>>>>>>>> them
>>> > > > > >>>>>>>>>>>>> with
>>> > > > > >>>>>>>>>>>>>>>>>> record
>>> > > > > >>>>>>>>>>>>>>>>>>>>> only
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> interfaces and built against the
>>> > > > > >>> latest
>>> > > > > >>>>>> stable
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>> Cassandra/Datastax/ScyllaDB
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> interfaces.
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> I'd love to hear thoughts from
>>> > > > > >> those
>>> > > > > >>>>>> closer to
>>> > > > > >>>>>>>>>> this
>>> > > > > >>>>>>>>>>>>> space
>>> > > > > >>>>>>>>>>>>>>>> both
>>> > > > > >>>>>>>>>>>>>>>>>> as
>>> > > > > >>>>>>>>>>>>>>>>>>> a
>>> > > > > >>>>>>>>>>>>>>>>>>>>>> user
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> and developer so we can make good
>>> > > > > >>> next
>>> > > > > >>>>>> steps.
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>> Thanks
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>>
>>> > > > > >>>>>>>>>>>
>>> > > > > >>>>>>>>>>
>>> > > > > >>>>>>>>>
>>> > > > > >>>>>>>>
>>> > > > > >>>>>>
>>> > > > > >>>>>
>>> > > > > >>>
>>> > > > > >>
>>> > > > >
>>> > >
>>> >
>>>
>>
