It seems that many people are surprised when data that parses successfully does not unparse successfully.
This happens when the parsed data is well-formed, but values do not obey the XSD facets. That is, the XML result from parsing is created, but it is invalid. Depending on the DFDL schema, such data may not unparse successfully.

Users who test schemas with a simple parse -> unparse process, using test data that is well-formed but invalid, may get the impression that there is a problem with the schema. Really it is just that validation errors coming out of the parse are not being escalated into true errors.

This behavior holds regardless of whether Daffodil is configured to do validation or not, as validation errors are never parse errors. They are effectively just warnings. I think this is unintuitive to many users, who expect that a DFDL parse cannot produce invalid XML.

A test process that does parse -> XSD validate -> unparse is correct. The XSD validate step in the middle would reject such messages as invalid, so they would never reach the unparser and could not fail there.

With that background, should we have an option where, at the end of a Daffodil parse, any validation errors cause the entire parse to be considered a failure?

This is not the same as escalating individual validation errors into parse errors, as that would affect backtracking behavior. This is a separate final check once the DFDL Infoset has been created.

API users of Daffodil can of course inspect the output for validation errors and do this themselves. I just think they are not aware that this is needed.

Thoughts?

Mike Beckerle
Apache Daffodil PMC | daffodil.apache.org
OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
Owl Cyber Defense | www.owlcyberdefense.com
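
P.S. A minimal sketch of the proposed final check, in Python purely for illustration. It models the idea only and is not the Daffodil API: ParseOutcome, parse_succeeded, and fail_on_invalid are hypothetical names, and the sample message and error text are made up. The point it tries to show is that validation errors are consulted once, after the infoset is complete, so nothing about backtracking during the parse changes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ParseOutcome:
    """Hypothetical stand-in for the result of a parse (not a Daffodil class)."""
    infoset_xml: str
    processing_errors: List[str] = field(default_factory=list)
    validation_errors: List[str] = field(default_factory=list)


def parse_succeeded(outcome: ParseOutcome, fail_on_invalid: bool) -> bool:
    """Decide overall success once, after the infoset has been built.

    Validation errors are only examined here, as a final check. They are
    never turned into parse errors while the parse is still running.
    """
    if outcome.processing_errors:
        return False
    if fail_on_invalid and outcome.validation_errors:
        return False
    return True


# Well-formed but invalid: the infoset exists, yet a facet was violated.
well_formed_but_invalid = ParseOutcome(
    infoset_xml="<msg><len>300</len></msg>",
    validation_errors=["len: value 300 violates maxInclusive 255"],
)

# Behavior as described above: validation errors act like warnings.
print(parse_succeeded(well_formed_but_invalid, fail_on_invalid=False))  # True

# Proposed option: the same parse is reported as a failure.
print(parse_succeeded(well_formed_but_invalid, fail_on_invalid=True))   # False
```

With the option off, the invalid infoset would flow on to the unparser, which is where users see the surprising failure. With it on, the failure is reported at the end of the parse instead.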
