Sure. The JAR is not actually scanned. You have to use dynamic properties to map a schema name to a particular fully qualified class. I would assume that most people using it are just taking a JAR that was generated or written by hand to be a client library with only (or mainly) POJOs representing the model. You can see a simple example here:
https://github.com/MikeThomsen/nifi-pojo-schema-repository-bundle/tree/main/test-pojos/src/main/java/org/apache/nifi/pojo/complex

Nothing fancy per se. It's just POJOs with some standard annotations from the Avro lib. I would imagine this repo would be a big help for teams that can't/won't commit to a contract-first design but have to work with NiFi and other big data systems.

On Thu, Mar 7, 2024 at 5:39 PM David Handermann <[email protected]> wrote:

> Mike,
>
> Thanks for the reply. I agree that file and property-based registries
> are useful, so the main question seems to be a compiled-code-derived
> registry as you have described.
>
> It seems that the general use case could still be supported through
> file-backed registry, but without requiring the dynamic class loading
> associated with a custom JAR.
>
> Loading code from a JAR also presents greater security risks than
> loading schema files, so if this were to be supported, it would
> require additional permission restrictions.
>
> To help think through this a bit more, can you describe the use case a
> bit more? How would someone prepare a JAR for referencing in this
> proposed registry?
>
> Regards,
> David Handermann
>
> On Thu, Mar 7, 2024 at 4:30 PM Mike Thomsen <[email protected]> wrote:
> >
> > You raise some good points, but I think there's still ample room for
> > file-based schema registries within NiFi. With regard to the edge
> > cases with schema generation, I think an argument can also be made
> > for "not letting the perfect be the enemy of the good."
> >
> > On Wed, Mar 6, 2024 at 9:34 AM David Handermann <[email protected]> wrote:
> >
> > > Mike,
> > >
> > > Thanks for raising this question, and providing the example repository.
> > >
> > > Although it sounds like a POJO-based repository could be useful in
> > > some cases, it does not seem like something that should be included
> > > for community support.
> > >
> > > Part of the value of a Schema Registry is a shared location for data
> > > description. Although supporting property or file-based Schema
> > > Registries is useful in NiFi itself, the general pattern is
> > > externalized storage and maintenance of schema definitions.
> > >
> > > From another angle, this could be similar to code-first versus
> > > contract-first API development. Each approach has its positives and
> > > negatives. When it comes to a Schema Registry, however, it seems like
> > > the contract needs to be defined outside of code.
> > >
> > > Introspecting JAR files also raises questions about what to include or
> > > exclude, and how to handle edge cases for certain class definitions.
> > > This seems like the more significant problem. For this reason, it
> > > seems better to rely on external operations to produce Avro schema
> > > definitions, rather than supporting that in NiFi itself.
> > >
> > > Those are my initial thoughts, perhaps others can provide additional
> > > perspective.
> > >
> > > Regards,
> > > David Handermann
> > >
> > > On Sat, Mar 2, 2024 at 9:18 AM Mike Thomsen <[email protected]> wrote:
> > > >
> > > > I've had this project on the back burner for a while and wanted to
> > > > share it with the team. It's a schema repository implementation that
> > > > is designed to take a JAR file with POJOs and use Jackson's schema
> > > > generation API to generate Avro schemas from those on startup. It
> > > > also uses (via Jackson) Avro annotations to help specify particular
> > > > implementation details where necessary. The code can be found here.
> > > > Haven't worked on it lately, but it should easily run on 1.25:
> > > >
> > > > https://github.com/MikeThomsen/nifi-pojo-schema-repository-bundle
> > > >
> > > > I am planning to get the repo ready for a PR unless someone raises
> > > > reasons why including it might be a poor fit.
> > > > I think for a lot of teams this might be a killer feature because
> > > > it would allow them to use Avro with existing enterprise POJOs and
> > > > stuff like that without having to write them by hand.
> > > >
> > > > Thoughts?
> > > >
