Team,

For the entire time NiFi has been in Apache we've built it with support
for various Hadoop ecosystem components such as HDFS, Hive, HBase, and
others, and more recently for the formats/serialization modes needed
for Parquet, ORC, Iceberg, etc.

All of these, however, present endless compatibility challenges across
different versions (Hive being the most difficult by far) and vendors
(Hadoop vendors, cloud vendors, etc.).  Also notable is the incredible
number of dependencies, dependency conflicts, inclusions/exclusions,
old log libs, vulnerability updates, etc.  And last but certainly not
least, they are a big reason why our build has grown so much.

We have a few options:
1. Deprecate these components in NiFi 1.x and drop them entirely in
NiFi 2.x.  Leave this as a problem for vendors to deal with.  NiFi
users interacting with such components are nearly exclusively doing so
with vendors anyway.

2. Remove the components from the main NiFi code line and create a
separate repo for 'nifi-hadoop-extensions'.  We manage those
independently and release them periodically.  They would be available
for people to grab the NARs if they want to use them.  We include none
of them in the convenience binary by default going forward.

3. Change nothing.  Continue to battle with the items listed above.
This is admittedly a bit of a non-option.  We can't keep spending the
same time/energy on these that we have been.  It is a very small
number of people that fight this battle.

Look forward to hearing thoughts on these options or others we might consider.

Thanks
