Perhaps we should just have a policy that it is a schema bug if data parses
but does not unparse, regardless of whether the infoset created by the parse
is valid or not.

This of course assumes a schema designed to support both parse and unparse.
Some schemas will intentionally be just for parse, but in the Cyberian use
case we've only seen parse+unparse schemas.
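
As a rough sketch of that policy (Python, with hypothetical names throughout;
none of this is Daffodil API, and the toy parse/unparse functions only stand
in for a real schema), a round-trip check could report validity separately
while letting only the unparse outcome decide whether the schema is at fault:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional, Tuple


class UnparseError(Exception):
    """Hypothetical error raised when an infoset cannot be unparsed."""


class Verdict(Enum):
    PARSE_FAILED = "parse failed"
    ROUND_TRIP_OK = "parsed and unparsed"
    SCHEMA_BUG = "parsed but did not unparse"


@dataclass
class ParseOutcome:
    # infoset is None when the parse itself failed (a true parse error).
    infoset: Optional[str]
    # Facet violations and similar; an empty list means the infoset is valid.
    validation_errors: List[str] = field(default_factory=list)


def round_trip_verdict(
    data: bytes,
    parse: Callable[[bytes], ParseOutcome],
    unparse: Callable[[str], bytes],
) -> Tuple[Verdict, Optional[bool]]:
    """Return (verdict, valid). Validity is reported but never changes the verdict."""
    outcome = parse(data)
    if outcome.infoset is None:
        return Verdict.PARSE_FAILED, None
    valid = not outcome.validation_errors
    try:
        unparse(outcome.infoset)
    except UnparseError:
        # Proposed policy: a schema bug whether or not the infoset was valid.
        return Verdict.SCHEMA_BUG, valid
    return Verdict.ROUND_TRIP_OK, valid


def toy_parse(data: bytes) -> ParseOutcome:
    # Toy stand-in: accepts any digit string; the "facet" is value <= 9.
    text = data.decode("ascii")
    if not text.isdigit():
        return ParseOutcome(infoset=None)
    errors = [] if int(text) <= 9 else ["value exceeds maxInclusive 9"]
    return ParseOutcome(infoset=text, validation_errors=errors)


def toy_unparse(infoset: str) -> bytes:
    # Toy stand-in for an unparser that only handles one-digit values.
    if len(infoset) != 1:
        raise UnparseError("cannot unparse: " + infoset)
    return infoset.encode("ascii")
```

With these toy functions, the input b"42" parses into a well-formed but
invalid infoset and then fails to unparse, so it is flagged as a schema bug
with valid reported as False.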


On Tue, Sep 16, 2025 at 3:09 PM Steve Lawrence <[email protected]> wrote:

> I haven't tested it, but looking at the code I think this is already the
> case.
>
> The CLI exits with a non-zero exit code for parse or validation errors:
>
>
> https://github.com/apache/daffodil/blob/main/daffodil-cli/src/main/scala/org/apache/daffodil/cli/Main.scala#L1295-L1298
>
> The ParseResult.isError API returns true for either parse or validation
> errors:
>
>
> https://github.com/apache/daffodil/blob/main/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/iapi/DFDLParserUnparser.scala#L217
>
> It looks like this behavior goes back at least to Daffodil 3.0.0, so any
> modern version should have this behavior. If users are running into this, it
> might mean they aren't checking the CLI exit code, or are using the API and
> explicitly testing ParseResult.isProcessorError instead of isError, or there
> is a bug in Daffodil.
>
> Or maybe they are parsing without validation enabled? In which case maybe we
> just need better documentation somewhere? Some schemas might not unparse
> well-formed but invalid data. In these cases, it might be important to parse
> with validation enabled and check for validation errors; otherwise it could
> lead to unparse failures.
>
>
> On 2025-09-16 02:45 PM, Mike Beckerle wrote:
> > It seems like many people are surprised when data that successfully
> > parses does not successfully unparse.
> >
> > This happens when the parsed data is well-formed, but values do not obey
> > the XSD facets. That is, the XML result from parsing is created, but it is
> > invalid.
> >
> > Depending on the DFDL schema, such data may not unparse successfully.
> >
> > Users who test schemas with a simple parse -> unparse process and test data
> > that has this well-formed-but-invalid behavior may get the impression that
> > there is a problem with the schema, but really it is just that validation
> > errors coming out of the parse are not being escalated into true errors.
> > This behavior of Daffodil holds regardless of whether Daffodil is
> > configured to do validation or not, as validation errors are never parse
> > errors. They are effectively just warnings. I think this is unintuitive to
> > many users, who expect that a DFDL parse cannot produce invalid XML.
> >
> > A test process that does parse -> XSD Validate -> unparse is correct. The
> > XSD Validate step in the middle would block such messages as invalid, so
> > they would never reach the unparser and could not fail there.
> >
> > With that background, should we have an option where at the end of a
> > Daffodil parse, if there are validation errors we can cause the entire
> > parse to be considered a failure? This is not the same as escalating
> > individual validation errors into parse errors as that would affect
> > backtracking behavior. This is a separate final check once the DFDL Infoset
> > has been created.
> >
> > API users of Daffodil can of course inspect output for validation errors
> > and do this themselves. I just think they are not aware that this is needed.
> >
> > Thoughts?
> >
> >
> >
> > Mike Beckerle
> > Apache Daffodil PMC | daffodil.apache.org
> > OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
> > Owl Cyber Defense | www.owlcyberdefense.com
> >
>
>
