I'm running into issues with schema evolution when using JSON Schema, and
I was wondering whether there are any recommended best practices.

In particular, I'm using the following code generator:

- https://github.com/joelittlejohn/jsonschema2pojo

The main gotchas so far relate to the `additionalProperties` field. When
it is set to `true`, the resulting class is not a valid POJO according to
Flink's rules, because the generated getter/setter methods don't follow the
JavaBeans naming conventions; see, e.g.:

- https://github.com/joelittlejohn/jsonschema2pojo/issues/1589

This means that the Kryo fallback is used for serialization purposes, which
is not only bad for performance but also breaks state schema evolution.
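To make the accessor mismatch concrete, here is a minimal, stdlib-only
sketch. `GeneratedEvent` is a hypothetical class that mirrors the shape I
see in the generated code (a `getAdditionalProperties()` getter paired with
a two-argument `setAdditionalProperty(name, value)` method), and
`fieldsWithoutBeanAccessors` is only a rough approximation of the
getter/setter part of Flink's POJO rules, not Flink's actual type
extraction logic:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PojoAccessorCheck {

    /** Hypothetical stand-in for a class generated with additionalProperties = true. */
    public static class GeneratedEvent {
        private String id;
        private Map<String, Object> additionalProperties = new LinkedHashMap<>();

        public String getId() { return id; }
        public void setId(String id) { this.id = id; }

        public Map<String, Object> getAdditionalProperties() { return additionalProperties; }

        // Singular name and two parameters: not a JavaBeans setter for the field.
        public void setAdditionalProperty(String name, Object value) {
            additionalProperties.put(name, value);
        }
    }

    /** Hypothetical stand-in for a class generated without the extra map. */
    public static class StrictEvent {
        private String id;

        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
    }

    /** Returns the non-public instance fields that lack a JavaBeans getter or setter. */
    public static List<String> fieldsWithoutBeanAccessors(Class<?> type) {
        List<String> missing = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            int mods = f.getModifiers();
            // Skip fields that need no accessors under this approximation.
            if (Modifier.isStatic(mods) || Modifier.isTransient(mods)
                    || Modifier.isPublic(mods) || f.isSynthetic()) {
                continue;
            }
            String name = f.getName();
            String suffix = Character.toUpperCase(name.charAt(0)) + name.substring(1);
            if (!hasGetter(type, suffix, f.getType()) || !hasSetter(type, suffix, f.getType())) {
                missing.add(name);
            }
        }
        return missing;
    }

    // A getter is getX()/isX() with no parameters, returning the field's type.
    private static boolean hasGetter(Class<?> type, String suffix, Class<?> fieldType) {
        for (Method m : type.getMethods()) {
            boolean nameMatches = m.getName().equals("get" + suffix)
                    || m.getName().equals("is" + suffix);
            if (nameMatches && m.getParameterCount() == 0
                    && m.getReturnType().equals(fieldType)) {
                return true;
            }
        }
        return false;
    }

    // A setter is setX(value) with exactly one parameter of the field's type.
    private static boolean hasSetter(Class<?> type, String suffix, Class<?> fieldType) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals("set" + suffix) && m.getParameterCount() == 1
                    && m.getParameterTypes()[0].equals(fieldType)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(fieldsWithoutBeanAccessors(GeneratedEvent.class)); // [additionalProperties]
        System.out.println(fieldsWithoutBeanAccessors(StrictEvent.class));    // []
    }
}
```

A check along these lines could run as a unit test over the generated
classes, so that a schema change which would push a type onto the Kryo
fallback shows up at build time rather than in a running job.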

Because of that, setting `additionalProperties` to `false` looks like a
good idea, but then your job will break if an upstream/producer service adds
a property to the messages you are reading. To solve this problem, the
POJOs for your job (as a reader) can be generated to ignore the
`additionalProperties` field (via the `@JsonIgnore` Jackson annotation).
This seems to be a good overall solution, but it looks a bit convoluted to
me and didn't come without some trial & error (= pain & frustration).

Is there anyone here facing similar issues? It would be good to hear your
thoughts on this!

BTW, this is a very interesting article that touches on the above-mentioned
difficulties:

- https://www.creekservice.org/articles/2024/01/09/json-schema-evolution-part-2.html