I agree that moving forward with the upgrade is the right thing to do, the example Mike pointed out is invalid according to the 1.8.1 spec [1] even though it was mistakenly allowed by Avro itself. Since we don't infer default values I think we can correctly require that hand-written schemas conform to the spec, especially with respect to default values, and our inferred schemas will adhere to the spec.
I also agree we should mention this in the Migration Guidance doc as it is a requirement that is now being enforced when it wasn't before. There might be an issue with Hive 1 but I don't think we're on the hook for that, if Hive has issues with nullable types and/or default values of null, it should obey the spec the same as we will. Regards, Matt [1] https://avro.apache.org/docs/1.8.1/spec.html#Unions On Thu, Apr 7, 2022 at 9:20 AM David Handermann <exceptionfact...@apache.org> wrote: > > Mike, > > Thanks for raising this issue for discussion, and thanks Isha and Edward > for the comments. It is helpful to get an understanding of the potential > impact. > > In general, it seems like moving forward with the upgrade is the best idea, > given the issues with older versions. For integration with other products > outside of NiFi, using the latest version of Avro should provide better > compatibility. > > That being said, it is important to consider the impact on existing > deployments. At minimum, this change is something that should be described > in the NiFi Migration Guidance associated with the released version. As far > as supporting different versions, it is possible to run older versions of > the various components, although this is more complicated given the scope > of the upgrade. This seems like a case where providing more detailed > migration steps would be helpful, to include issues related to cached > schemas or inferred schemas used as the basis for customization. > > Regards, > David Handermann > > On Thu, Apr 7, 2022 at 6:08 AM Edward Armes <edward.ar...@gmail.com> wrote: > > > I've had a quick look in JIRA and it looks like this might have happened as > > a side effect of AVRO-1544. > > > > I think it is worth upgrading especially given that it looks like few of > > the changes refer to updating against newer bundled dependencies some of > > which seem to have CVE's against them. > > > > Depending on peoples feelings would it be wroth creating 2 versions one > > using Avro 1.8 and one using Avro 1.11.0 and then removing the 1.8 version > > in a later release? > > > > On an additional note in cases where people are writing schemas manually I > > suspect they are probably going to be validating against the later versions > > of Avro using the projects tooling and that may create issues further down > > the line. > > > > Edward > > > > On Thu, Apr 7, 2022 at 11:52 AM Isha Lamboo < > > isha.lam...@virtualsciences.nl> > > wrote: > > > > > Hi Mike, > > > > > > The "Infer schema" functionality in NiFi currently generates schemas with > > > the order that will be invalid under Avro 1.9+. I noticed because I've > > been > > > using that to copy-paste schemas that were "almost right" so I could > > > manually fix them. > > > > > > I guess that inferred schemas should be fine if the inferring logic is > > > also upgraded by the dependency, but for any cached schemas and my own > > > manually saved schemas this will be somewhat painful. > > > My use case for manually saving inferred schemas is mostly database > > > migration where some inferred "choice" fields are not supported for the > > > target database. > > > > > > Hope this helps, > > > > > > Isha > > > > > > -----Oorspronkelijk bericht----- > > > Van: Mike Thomsen <mikerthom...@gmail.com> > > > Verzonden: donderdag 7 april 2022 12:11 > > > Aan: dev@nifi.apache.org > > > Onderwerp: Need some feedback on how upgrading Avro might cause problems > > > > > > Thread is here for full details: > > > > > > > > https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fnifi%2Fpull%2F5900%23pullrequestreview-922490039&data=04%7C01%7Cisha.lamboo%40virtualsciences.nl%7C8c89ae3e621c4a255c3308da187eea09%7C21429da9e4ad45f99a6fcd126a64274b%7C0%7C0%7C637849231546931433%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=ztcxOWXBkpZFqEh%2Bu%2BG0du5BLUPyZ3WaMxqpeqn%2FexI%3D&reserved=0 > > > > > > It looks like Avro 1.8's schema parser may have been more lenient (or > > > buggy) in enforcing the specification with respect to the ordering of a > > > union for a nullable type. 1.9.X and higher are definitely more > > opinionated > > > and throw exceptions on schemas that used to work on 1.8.X. For example, > > > this used to be valid: > > > > > > { > > > "name": "message", > > > "type": [ "null", "string" ], > > > "default": "Hello, world" > > > } > > > > > > Now Avro **correctly** throws an exception per the specification. > > > Under 1.8 it did not, and as you can see here, I had to change numerous > > > test schemas in order to make them work under 1.9 to 1.11.0: > > > > > > > > > > > https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fnifi%2Fpull%2F5900%2Ffiles%23r835954170&data=04%7C01%7Cisha.lamboo%40virtualsciences.nl%7C8c89ae3e621c4a255c3308da187eea09%7C21429da9e4ad45f99a6fcd126a64274b%7C0%7C0%7C637849231546931433%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=iBGW6sYZUdxvAADYIB5L2t94RBZH3A5%2BPJMhxuGv8q8%3D&reserved=0 > > > > > > As I said to Matt, I think we're in a "damned if you do, damned if you > > > don't" position here. > > > > > > Thoughts? > > > > >