On Wed, Jan 28, 2026 at 8:31 AM Simon Cockx <[email protected]> wrote:
> At REGnosys we are running into fundamental limitations of Jackson's
> support for XML. I would like to know whether these limitations are
> deliberate trade-offs, or changeable design decisions that could be fixed.
> Based on that we are considering whether we can either *extend* Jackson
> in our codebase, *contribute* to Jackson directly, or *move away* from
> Jackson if it doesn't fit at all.
>

Hi! Yes, this makes sense. I am not sure what the ultimate answer is (it is
obviously up to you), but I can try to address more specific
questions/concerns.

>
> First of all: why Jackson?
> Saying that we just want to ingest XML based on an XSD is somewhat
> hand-wavy - the JAXB project exists exactly for that use case. So maybe the
> question is better stated: why not JAXB? In short: the XSD is not our
> source of truth, our domain specific language is.
>
> At REGnosys we maintain the open-source Rune DSL
> <https://github.com/finos/rune-dsl>, a language specifically designed for
> modelling processes in the financial industry. One important component of
> the language is *ingestion*: the process of reading serial data (JSON,
> XML, CSV, ...) in various financial standard formats and representing it in
> a uniform way in our DSL. Many of these formats are XML-based and formally
> defined as multiple XSD files, such as FpML <https://www.fpml.org/>. To
> support ingesting of these data standards, we use the following steps.
>
>    1. Transform the XSD into Rune types. (similar to how JAXB transforms
>    XSD to Java classes)
>    2. Annotate the Rune types and fields with additional serialization
>    information. (similar to what both Jackson and JAXB do/support)
>    3. From this Rune model, generate Java code with custom annotations.
>    4. Using a custom Jackson annotation processor, deserialize using a
>    Jackson object mapper.
>
> Note that steps 2 to 4 are independent of the exact serial format: we
> don't just support XML, we also support JSON and CSV, and want to stay
> extensible for any future formats. That is exactly the attractiveness of
> Jackson and where we lose interest in JAXB: Jackson's design principles
> align perfectly with this goal of agnostic deserialisation and
> serialisation.
>

Agreed. Thank you for explaining the background -- I think it does align
with Jackson goals at a high level.

>
> Issues with Jackson XML
> Most of our issues come down to the way bean properties are represented.
> Their identity is purely based on the local name of the property being
> deserialized, but doesn't take into account surrounding context such as
> ordering, namespaces, or representation (e.g., XML attribute versus XML
> element).
>

Right: XML is probably THE trickiest format for Jackson to support (of ~10
supported formats). And most name mapping being namespace-unaware is
problematic; I'd have guessed it is the number one problem.

So as you say, these are known, unsolved problems. In a way you could say
Jackson supports XML-specific aspects (namespaces, attribute-vs-element,
ordering dependency) on the serialization side but not well on
deserialization -- on deserialization these aspects are essentially ignored.

> Examples of problems we run into:
>
>    1. Having XML elements and XML attributes with the same name is
>    unsupported.
>    Issue also described here: https://stackoverflow.com/q/47199799/3083982
>    E.g., <foo id="my-id"><id>MyElementId</id></foo>
>    2. The @JsonUnwrapped annotation breaks some XML features.
>    Fundamentally this is because it replaces the `FromXmlParser` instance
>    with a `TokenBuffer`-based parser, which breaks assumptions for some
>    XML-related features. One example is described here:
>    https://github.com/FasterXML/jackson-dataformat-xml/issues/762
>    3. Jackson does not support XSD substitution groups, i.e., having a
>    single property with multiple potential names, depending on which a
>    specific subtype deserializer is used. Turns out that this is not a
>    fundamental issue: we have already extended Jackson to support it in the
>    open-source Rune Common <https://github.com/finos/rune-common> project.
>    See issue ticket here:
>    https://github.com/FasterXML/jackson-dataformat-xml/issues/679
>    4. Having XML elements with the same local name, but a different
>    namespace, is unsupported. See long-standing issue ticket here:
>    https://github.com/FasterXML/jackson-dataformat-xml/issues/65
>    5. Having XML elements with the same local name, but with a different
>    order, is unsupported. I don't see a direct issue open for this, but it
>    is related to this comment:
>    https://github.com/FasterXML/jackson-dataformat-xml/issues/676#issuecomment-2438049500
>    E.g., deserializing A1 and A2 to two distinct properties:
>    <foo><a>A1</a><b/><a>A2</a></foo>
>
> While we have ideas of how to approach this, I am definitely not saying we
> have a perfect solution in mind yet. We are mostly looking to answer the
> question if it is worth looking for a solution in the first place, or if
> this is just a fundamental limitation of Jackson.
>

Of these, (4) could be supported if databind used the full `PropertyName`
(which has a "simple" and a "namespace" part), so conceptually that is
achievable, but the implementation would be quite involved. Ideally there'd
be no overhead for other formats, which would probably require more
extensibility for the XML backend to override handling (lookups).

(1) is sort of related but trickier: XML "attributeness" handling is
contained within the XML components, and only used on serialization (I think).

(3) would be generally useful and ideally would be implemented -- not sure
of all the complexities due to the "flattening" of layers Jackson otherwise
adds. I think it is doable, but like all of these, non-trivial.

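To make (1) and (4) concrete, here is a minimal sketch of the difference between keying properties by local name only and keying them by namespace plus attribute-vs-element kind. It is not Jackson code: it uses only the JDK's StAX API, and the class `PropertyKeyDemo`, its methods, and the `@{namespace}local` key format are all hypothetical names chosen for illustration. It assumes the root element's children contain text only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Illustration only: PropertyKeyDemo and its methods are hypothetical, not Jackson API.
final class PropertyKeyDemo {

    // Builds "{namespace}local" for elements and "@{namespace}local" for attributes.
    private static String key(boolean attribute, String ns, String local) {
        return (attribute ? "@" : "") + "{" + (ns == null ? "" : ns) + "}" + local;
    }

    // Reads the root element's attributes and its text-only child elements,
    // grouping values by a key that keeps namespace and attribute-vs-element kind.
    static Map<String, List<String>> byFullKey(String xml) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        try {
            XMLStreamReader r = XMLInputFactory.newFactory()
                    .createXMLStreamReader(new StringReader(xml));
            int depth = 0;
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                    if (depth == 1) {
                        // Attributes of the root element.
                        for (int i = 0; i < r.getAttributeCount(); i++) {
                            String k = key(true, r.getAttributeNamespace(i),
                                    r.getAttributeLocalName(i));
                            out.computeIfAbsent(k, x -> new ArrayList<>())
                                    .add(r.getAttributeValue(i));
                        }
                    } else if (depth == 2) {
                        // Direct child element of the root.
                        String k = key(false, r.getNamespaceURI(), r.getLocalName());
                        // getElementText() also consumes the matching END_ELEMENT.
                        out.computeIfAbsent(k, x -> new ArrayList<>())
                                .add(r.getElementText());
                        depth--;
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    depth--;
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return out;
    }

    // Same data, but keyed by local name only: namespace and kind are dropped,
    // so distinct XML constructs end up under one key.
    static Map<String, List<String>> byLocalName(String xml) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        byFullKey(xml).forEach((k, values) -> out
                .computeIfAbsent(k.substring(k.indexOf('}') + 1), x -> new ArrayList<>())
                .addAll(values));
        return out;
    }
}
```

With the input from (1), `<foo id="my-id"><id>MyElementId</id></foo>`, `byLocalName` puts both values under the single key `id`, while `byFullKey` keeps the attribute and the element apart. The same holds for two child elements that share a local name but live in different namespaces, as in (4).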
For (2) some support was added to allow format backends to substitute their
own `TokenBuffer` subtypes, but that's as far as that goes. Buffering is
also problematic in some cases where @JsonCreator induces buffering, wrt
`Collection` deserialization.

(5) is probably the trickiest. I am not familiar with that yet, would need
to dig deeper.

Currently there isn't a ton of progress towards any of these (esp. as all
are hard problems). But there are no fundamental blockers, I think.

This is probably a bit awkward wrt defining which path to take. I am happy
to try to help in addressing these, for what that is worth.

>
> I'm happy to discuss here, but if possible, I would also be very happy to
> jump on a call sometime to talk through this. Whatever works best.
>

I think discussing this here is good -- I will be out until next week now
but wanted to send a quick response before that. Alternatively GitHub
Discussions on https://github.com/FasterXML/jackson-dataformat-xml/discussions
would also work.

> Thanks in advance.
>

Thank you,

-+ Tatu +-

> --
> You received this message because you are subscribed to the Google Groups
> "jackson-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion visit
> https://groups.google.com/d/msgid/jackson-dev/474eea22-e935-4386-b2f3-1f1adfe65d06n%40googlegroups.com
> .
