Matt et al,

It is good to see the background effort on moving Cassandra
capabilities in a supportable direction.

I think new Cassandra components will require a significant departure
from current Controller Service abstractions. Right now, the existing
service interface does not provide a clean abstraction from the
Cassandra library, which is part of the reason for the current
coupling to the legacy driver version.
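
A minimal sketch of what a driver-neutral service contract could look like is below. All names (CqlExecutionService, RecordingCqlExecutionService, ExampleComponent) are hypothetical and are not part of NiFi or of any Cassandra driver. The only point is that no driver type appears in the interface, so the driver version could change behind it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical driver-neutral contract: no com.datastax types appear in any signature.
interface CqlExecutionService {
    List<Map<String, Object>> query(String cql, List<Object> parameters);
}

// Stand-in implementation for illustration only; a real one would wrap a driver session.
class RecordingCqlExecutionService implements CqlExecutionService {
    final List<String> executed = new ArrayList<>();

    @Override
    public List<Map<String, Object>> query(String cql, List<Object> parameters) {
        // Record the statement and return a single fake row describing the call.
        executed.add(cql);
        Map<String, Object> row = Map.of("parameterCount", parameters.size());
        return List.of(row);
    }
}

// Component-side code depends only on the interface, never on a driver class.
class ExampleComponent {
    private final CqlExecutionService service;

    ExampleComponent(CqlExecutionService service) {
        this.service = service;
    }

    int countRows(String table) {
        return service.query("SELECT * FROM " + table, List.of()).size();
    }
}
```

With this shape, separate service implementations could be built against different driver generations while the component code stays unchanged.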

Following up from Joe's comments, it seems like the cleanest way
forward is to deprecate the current bundle on the 1.x branch, and
remove the current bundle from the main branch. That will provide a
clean slate for new Service and Processor implementations, free of
unresolved compatibility questions.

Regards,
David Handermann

On Mon, Mar 18, 2024 at 2:35 PM Matt Burgess <mattyb...@apache.org> wrote:
>
> What do y'all think about removing the individual connection properties
> from the Cassandra processors for NiFi 2.0 and requiring a
> CassandraSessionProvider instead? I think we started doing that elsewhere
> (Elasticsearch maybe?). I also noticed duplicate code in the
> CassandraSessionProvider and AbstractCassandraProcessor; if we keep those
> properties I can refactor them into a utility class.
>
> Thanks,
> Matt
>
>
> On Fri, Mar 15, 2024 at 2:44 PM Steven Matison <steven.mati...@gmail.com>
> wrote:
>
> > I got through quite a bit of work to enable 4.x…
> >
> > The 3.x pieces that were not backwards compatible are very much edge use
> > cases and could have been done slightly differently, with a workaround.
> >
> > https://github.com/steven-matison/nifi/tree/nifi-10120-1
> >
> >
> >
> >
> >
> >
> > On Fri, Mar 15, 2024 at 2:30 PM Matt Burgess <mattyb...@apache.org> wrote:
> >
> > > Oops used the wrong email address so if there have been responses to the
> > > Cassandra thread since mine I missed them, my bad!
> > >
> > > On Fri, Mar 15, 2024 at 2:00 PM Matt Burgess <mattyb...@gmail.com>
> > wrote:
> > >
> > > > I believe the CQL protocol is backwards compatible but the Java API is
> > > > not. For example "com.datastax.driver.core.Session" is now
> > > > "com.datastax.oss.driver.api.core.session.Session" and there is no more
> > > > "Cluster" class. Might be fairly trivial to fix though, if that's the
> > > path
> > > > of least resistance.
> > > >
> > > > On Fri, Mar 15, 2024 at 1:40 PM Joe Witt <joe.w...@gmail.com> wrote:
> > > >
> > > >> Matt
> > > >>
> > > > >> I don't know a ton about Cassandra but when I looked at client/driver
> > > > >> notes for 4+ it said it was compatible all the way back to 3.x.  Not
> > > > >> sure what that means but it surely seems worth exploring.  Also I
> > > > >> don't know if the 4.x drivers get rid of the vulnerable bits.
> > > >>
> > > >> Thanks
> > > >>
> > > >> On Fri, Mar 15, 2024 at 10:39 AM Matt Burgess <mattyb...@apache.org>
> > > >> wrote:
> > > >>
> > > >> > At the very least we should upgrade to Cassandra 3.11.16:
> > > >> >
> > > https://github.com/apache/cassandra/blob/cassandra-3.11.16/CHANGES.txt
> > > >> >
> > > >> > On Fri, Mar 15, 2024 at 1:31 PM Matt Burgess <mattyb...@apache.org>
> > > >> wrote:
> > > >> >
> > > >> > > If the community agrees to get rid of Cassandra 3 that'll save me
> > > >> effort
> > > >> > > on the refactor after I add Cassandra 4 :) Otherwise those
> > > >> > > vulnerabilities would only be in a "new" Cassandra 3 services NAR
> > > that
> > > >> > > would not be included in the convenience binary.
> > > >> > >
> > > >> > > On Fri, Mar 15, 2024 at 1:28 PM Joe Witt <joe.w...@gmail.com>
> > > wrote:
> > > >> > >
> > > >> > >> Mike, Matt,
> > > >> > >>
> > > >> > >> Happy to hear you both have active efforts or are interested in
> > > doing
> > > >> > so.
> > > >> > >> Can you help me understand more specifically what that means for
> > > the
> > > >> > >> current set of components?
> > > >> > >>
> > > >> > >> The CVE hits are concerning and long standing.  Supporting
> > > Cassandra
> > > >> 3
> > > >> > >> implies the current set of dependencies would remain too right?
> > > >> > >>
> > > >> > >> Is the current set of components we have ones we want to retain?
> > > We
> > > >> > >> certainly need Cassandra components - but are the ones we have
> > now
> > > >> the
> > > >> > >> right ones?
> > > >> > >>
> > > >> > >> Thanks
> > > >> > >> Joe
> > > >> > >>
> > > >> > >> On Fri, Mar 15, 2024 at 10:25 AM Matt Burgess <
> > > mattyb...@apache.org>
> > > >> > >> wrote:
> > > >> > >>
> > > >> > >> > I'm actively working this, I pushed my branch up in case anyone
> > > >> wants
> > > >> > to
> > > >> > >> > take a look [1]. The idea is to abstract the Cassandra API "up
> > a
> > > >> > couple
> > > >> > >> > levels" and provide implementations for Cassandra 3, 4, and
> > > >> eventually
> > > >> > >> 5.
> > > >> > >> > For JDBC-like interfaces this is a PITA because of the API
> > > >> (Statement,
> > > >> > >> > PreparedStatement, BoundStatement, ResultSet, etc.) but I'm
> > > hoping
> > > >> we
> > > >> > >> can
> > > >> > >> > find a common pattern for abstracting the third-party library
> > > >> > >> > implementation and API from the NiFi component (Processor,
> > > >> > >> > ControllerService, etc.) API. I think we're doing something
> > > similar
> > > >> > for
> > > >> > >> > Kafka?
> > > >> > >> >
> > > >> > >> > Regards,
> > > >> > >> > Matt
> > > >> > >> >
> > > >> > >> > [1] https://github.com/mattyb149/nifi/tree/cassy4
> > > >> > >> >
> > > >> > >> >
> > > >> > >> > On Fri, Mar 15, 2024 at 8:43 AM Mike Thomsen <
> > > >> mikerthom...@gmail.com>
> > > >> > >> > wrote:
> > > >> > >> >
> > > >> > >> > > That’s been on my todo list for a little while but things
> > kept
> > > >> > coming
> > > >> > >> up.
> > > >> > >> > > I think I could get started on that now. Based on my initial
> > > >> > research
> > > >> > >> it
> > > >> > >> > > appears that scylla uses the exact same api as datastax so
> > > >> > supporting
> > > >> > >> > both
> > > >> > >> > > in a cql bundle should theoretically be fairly easy.
> > > >> > >> > >
> > > >> > >> > >
> > > >> > >> > > Sent from my iPhone
> > > >> > >> > >
> > > >> > >> > > > On Mar 14, 2024, at 6:18 PM, Joe Witt <joew...@apache.org>
> > > >> wrote:
> > > >> > >> > > >
> > > >> > >> > > > Team,
> > > >> > >> > > >
> > > >> > >> > > > Cassandra remains a really important system to be able to
> > > send
> > > >> > data
> > > >> > >> to.
> > > >> > >> > > > However, it seems like we've not maintained these well.  We
> > > >> have
> > > >> > >> what
> > > >> > >> > > > appears to be at least a full generation behind on client
> > > >> versions
> > > >> > >> (we
> > > >> > >> > > are
> > > >> > >> > > > on 3x vs 4x which is the latest stable with 5x apparently
> > > >> coming
> > > >> > >> > > shortly).
> > > >> > >> > > >
> > > >> > >> > > > We have components to send data, query data, and use
> > > Cassandra
> > > >> as
> > > >> > a
> > > >> > >> > cache
> > > >> > >> > > > store.  We have older mechanisms for json/avro and publish
> > > >> > >> mechanisms
> > > >> > >> > for
> > > >> > >> > > > records.
> > > >> > >> > > >
> > > >> > >> > > > The libraries we do have depend on outdated versions of
> > Guava
> > > >> and
> > > >> > >> > result
> > > >> > >> > > in
> > > >> > >> > > > many CVE hits.
> > > >> > >> > > >
> > > >> > >> > > > I am inclined to think we should deprecate the 1.x
> > components
> > > >> and
> > > >> > >> > remove
> > > >> > >> > > > them as-is from the 2.x line.  Then re-introduce them with
> > > >> record
> > > >> > >> only
> > > >> > >> > > > interfaces and built against the latest stable
> > > >> > >> > > Cassandra/Datastax/ScyllaDB
> > > >> > >> > > > interfaces.
> > > >> > >> > > >
> > > >> > >> > > > I'd love to hear thoughts from those closer to this space
> > > both
> > > >> as
> > > >> > a
> > > >> > >> > user
> > > >> > >> > > > and developer so we can make good next steps.
> > > >> > >> > > >
> > > >> > >> > > > Thanks
> > > >> > >> > >
> > > >> > >> >
> > > >> > >>
> > > >> > >
> > > >> >
> > > >>
> > > >
> > >
> >
