There’s also some code that exists only to maintain backward compatibility in the repositories. I would very much like the repositories to contain no unnecessary code. The swap file format still supports really old formats. And there are the old impls of the repositories themselves, like PersistentProvRepo instead of WriteAheadProvRepo, etc. Lots of stuff there that can be removed. There are also some methods in ProcessSession that are never used by any processor in the codebase but exist in the public API, so they can’t be removed till 2.0.
I think this is also a great time to clean up the “standard nar.” At this point, it’s something like 70 MB. And many of the components there are not really “standard” - things like connecting to FTP & SFTP servers, XML processing, Jolt transform, etc. could potentially be moved into other nars. The nifi-standard-content-viewer-1.15.0-SNAPSHOT.war is 6.9 MB and is not necessary for stateless or minifi java. Lots of things probably to reconsider within the standard nar.

I definitely think this is a reasonable approach, to allow for a 2.0 that is not a huge feature release but allows the project to be simpler and more nimble.

Thanks
-Mark

On Jul 24, 2021, at 10:59 AM, Mike Thomsen <mikerthom...@gmail.com> wrote:

Russell,

AFAICT from looking at Elastic's repos, the low level REST client is still fine.

https://github.com/elastic/elasticsearch/blob/e5518e07f13701e3bb3dcc6842b9023966752497/client/rest/src/main/java/org/elasticsearch/client/RestClient.java

Our Elasticsearch support is spread over two NARs at present. One uses OkHttp and the other uses that low level Elastic REST client. Therefore, I think we're fine on licensing for the moment.

Mike

On Fri, Jul 23, 2021 at 1:10 PM Russell Bateman <r...@windofkeltia.com> wrote:

Bringing up Elastic also reminds me that the Elastic framework has just recently transitioned out of Open Source, so to acknowledge that, maybe some effort toward OpenSearch--I say this not understanding exactly how this sort of thing is considered in a large-scale, world-class software project like Apache NiFi. (I'm not a contributor, just a grateful consumer.)

Russ

On 7/23/21 10:28 AM, Matt Burgess wrote:

Along with the itemized list for ancient components we should look at updating versions of drivers, SDKs, etc. for external systems such as Elasticsearch, Cassandra, etc.
There may be breaking changes but 2.0 is probably the right time to get things up to date to make them more useful to more people.

On Fri, Jul 23, 2021 at 12:21 PM Nathan Gough <thena...@gmail.com> wrote:

I'm a +1 for removing pretty much all of this stuff. There are security implications to keeping old dependencies around, so the more old code we can remove the better.

I agree that eventually we need to move to supporting only Java 11+, and as our next release will probably be about 4 - 6 months from now that doesn't seem too soon. We could potentially break this in two and remove the deprecated processors and leave 1.x on Java 8, and finally start on 2.x which would support only Java 11. I'm unsure of what implications changing the date and time handling would have - for running systems that use long term historical logs, unexpected impacts to time logging could be a problem.

As Joe says I think feature work will have to be dedicated to 2.x and we could support 1.x for security fixes for some period of time. 2.x seems like a gargantuan task but it's probably time to get started. Not sure how we handle all open PRs and the transition between 1.x and 2.x.

On Fri, Jul 23, 2021 at 10:57 AM Joe Witt <joe.w...@gmail.com> wrote:

Jon

You're right we have to be careful and you're right there are still significant Java 8 users out there. But we also have to be careful about security and sustainability of the codebase. If we had talked about this last year when that article came out I'd have agreed it is too early. Interestingly that link seems to get updated and I tried [1] and found more recent data (not sure how recent). Anyway it suggests Java 8 is still the top dog but we see good growth on 11. In my $dayjob this aligns to what I'm seeing too. Customers didn't seem to care about Java 11 until the latter half of last year and now suddenly it is all over the place.
I think once we put out a NiFi 2.0 release we'd see a rapid decrease in work on the 1.x line, just being blunt. We did this many years ago with 0.x to 1.x and we stood behind 0.x for a while (maybe a year or so) but it was purely bug fix/security related bits. We would need to do something similar. But feature work would almost certainly go to the 2.x line. Maybe there are other workable models but my instinct suggests this is likely to follow a similar path.

...anyway I agree it isn't that easy of a call to dump Java 8. We need to make the call in both the interests of the user base and the contributor base of the community.

[1] https://www.jetbrains.com/lp/devecosystem-2021/java/

Thanks
Joe

On Fri, Jul 23, 2021 at 7:46 AM Joe Witt <joe.w...@gmail.com> wrote:

Russ

Yeah the flow registry is a key part of it. But also now you can download the flow definition in JSON (upload I think is there now too).

Templates posed a series of challenges; for example, we store them in the flow definition, which has made flows massive in an unintended way, which isn't fun for cluster behavior.

We have a couple cases where we headed down a particular concept and came up with better approaches later. We need to reconcile these with the benefit of hindsight, and while being careful not to be overly disruptive to existing users, to reduce the codebase/maintenance burden and allow continued evolution of the project.

Thanks

On Fri, Jul 23, 2021 at 7:43 AM Russell Bateman <r...@windofkeltia.com> wrote:

Joe,

I apologize for the off-topic intrusion, but what replaces templates? The Registry? Templates rocked and we have used them since 0.5.x.

Russ

On 7/23/21 8:31 AM, Joe Witt wrote:

David,

I think this is a highly reasonable approach and such a focus will greatly help make a 2.0 release far more approachable to knock out.
Not only that but tech debt reduction would help make work towards major features we'd think about in a 'major release' sense more approachable.

We should remove all deprecated things (as well as verify we have the right list).

We should remove/consider removal of deprecated concepts like templates.

We should consider whether we can resolve the various ways we've handled what are now parameters down to one clean approach.

We should remove options in the nifi.properties which turn out to never be used quite right (if there are any).

There is quite a bit we can do purely in the name of tech debt reduction.

Lots to consider here but I think this is the right discussion.

Thanks

On Fri, Jul 23, 2021 at 7:26 AM Bryan Bende <bbe...@gmail.com> wrote:

I'm a +1 for this... Not sure if this falls under "Removing Deprecated Components", but I think we should also look at anything that has been marked as deprecated throughout the code base as a candidate for removal. There are quite a few classes, methods, properties, etc that have been waiting for a chance to be removed.

On Fri, Jul 23, 2021 at 10:13 AM David Handermann <exceptionfact...@apache.org> wrote:

Team,

With all of the excellent work that many have contributed to NiFi over the years, the code base has also accumulated some amount of technical debt. A handful of components have been marked as deprecated, and some components remain in the code base to support integration with old versions of various products. Following the principles of semantic versioning, introducing a major release would provide the opportunity to remove these deprecated and unsupported components.

Rather than focusing the next major release on new features, what do you think about focusing on technical debt removal? This approach would not make for the most interesting release, but it provides the opportunity to clean up elements that involve breaking changes.
Focusing on technical debt, at least three primary goals come to mind for the next major release:

1. Removal of deprecated and unmaintained components
2. Require Java 11 as the minimum supported version
3. Transition internal date and time handling to JSR 310 java.time components

*Removing Deprecated Components*

Removing support for older and deprecated components provides a great opportunity to improve the overall security posture when it comes to maintaining dependencies. The OWASP dependency plugin report currently generates 50 MB of HTML for questionable dependencies, many of which are related to old versions of various libraries.

As a starting point, here are a handful of components and extension modules that could be targeted for removal in a major version:

- PostHTTP and GetHTTP
- ListenLumberjack and the entire nifi-lumberjack-bundle
- ListenBeats and the entire nifi-beats-bundle
- Elasticsearch 5 components
- Hive 1 and 2 components

*Requiring Java 11*

Java 8 is now over seven years old, and NiFi has supported general compatibility with Java 11 for several years. NiFi 1.14.0 incorporated internal improvements specifically related to TLS 1.3, which allowed closing out the long-running Java 11 compatibility epic NIFI-5174. Making Java 11 the minimum required version provides the opportunity to address any lingering edge cases and put NiFi in a better position to support current Java versions.

*JSR 310 for Date and Time Handling*

Without making the scope too broad, transitioning internal date and time handling to use DateTimeFormatter instead of SimpleDateFormat would provide a number of advantages. The Java Time components provide much better clarity when it comes to handling localized date and time representations, and also avoid the inherent confusion of java.sql.Date extending java.util.Date. Many internal components, specifically Record-oriented processors and services, rely on date parsing, leading to confusion and various workarounds.
The pattern formats of SimpleDateFormat and DateTimeFormatter are very similar, but there are a few subtle differences. Making this transition would provide a much better foundation going forward.

*Conclusion*

Thanks for giving this proposal some consideration. Many of you have been developing NiFi for years and I look forward to your feedback. I would be glad to put together a more formalized recommendation on Confluence and write up Jira epics if this general approach sounds agreeable to the community.

Regards,
David Handermann
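
For readers following the JSR 310 point in David's proposal, here is a minimal Java sketch of the kind of "subtle differences" between SimpleDateFormat and DateTimeFormatter that a migration would need to watch for. It is an illustration only, not NiFi code: the class name DateFormatDifferences and its helper methods are hypothetical, and the choice of ResolverStyle.STRICT for the java.time side is an assumption, not something the thread specifies. It shows three behaviors of the standard JDK classes: the pattern letter "u" means day-of-week number in SimpleDateFormat but year in DateTimeFormatter, SimpleDateFormat is lenient about impossible dates by default, and SimpleDateFormat.parse(String) silently ignores trailing text.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class for illustration; not part of the NiFi code base.
final class DateFormatDifferences {

    // Legacy formatter pinned to UTC and a fixed locale. A new instance is
    // created per call because SimpleDateFormat is not thread-safe.
    static SimpleDateFormat legacy(String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format;
    }

    // java.time formatter; STRICT resolution is an assumption made for this sketch.
    static DateTimeFormatter modern(String pattern) {
        return DateTimeFormatter.ofPattern(pattern, Locale.ROOT)
                .withResolverStyle(ResolverStyle.STRICT);
    }

    // Formats a calendar date with the legacy API, treating it as midnight UTC.
    static String legacyFormat(String pattern, LocalDate date) {
        Date legacyDate = Date.from(date.atStartOfDay(ZoneOffset.UTC).toInstant());
        return legacy(pattern).format(legacyDate);
    }

    // Parses with the legacy API and prints back what it actually understood.
    static String legacyRoundTrip(String text) {
        SimpleDateFormat format = legacy("yyyy-MM-dd");
        try {
            return format.format(format.parse(text));
        } catch (ParseException e) {
            return "unparseable";
        }
    }

    // Reports whether the strict java.time formatter accepts the text as a date.
    static boolean modernAccepts(String text) {
        try {
            LocalDate.parse(text, modern("uuuu-MM-dd"));
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        LocalDate friday = LocalDate.of(2021, 7, 23);

        // 1. Same pattern letter, different meaning.
        System.out.println(legacyFormat("u", friday));   // 5 (day number of week, Monday = 1)
        System.out.println(modern("u").format(friday));  // 2021 (year)

        // 2. Lenient legacy parsing rolls an impossible date forward.
        System.out.println(legacyRoundTrip("2021-02-30")); // 2021-03-02
        System.out.println(modernAccepts("2021-02-30"));   // false

        // 3. Legacy parsing ignores trailing text; java.time rejects it.
        System.out.println(legacyRoundTrip("2021-07-23T10:13")); // 2021-07-23
        System.out.println(modernAccepts("2021-07-23T10:13"));   // false
    }
}
```

A practical consequence is that existing user-supplied patterns cannot simply be passed through unchanged: a pattern or input that SimpleDateFormat accepted quietly may format differently or be rejected after the switch, which is one reason the concern raised earlier in the thread about unexpected impacts on running systems seems worth testing for explicitly.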