Re: DFDL infoset allows characters the XML infoset doesn't ... how can that be?

2019-01-16 Thread Beckerle, Mike
Not quite. The parsing is done on the data stream, and the data stream contains a NUL delimiter, so you have to say to look for a NUL, not a U+E000. So you want dfdl:terminator="%NUL;" or dfdl:terminator="%#x0;". This is using DFDL's string literal syntax, which uses "%" to introduce DFDL

RE: DFDL infoset allows characters the XML infoset doesn't ... how can that be?

2019-01-16 Thread Costello, Roger L.
Hi Mike, Thanks for the pointer. If I understand correctly, in my DFDL schema I can specify that a string is terminated by an XML-illegal character such as a NUL character (hex 0) by creating an XML character entity with a hex value in the Private Use Area: E000 + 0 = E000 ... then create the
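
The arithmetic in this message (E000 + 0 = E000) can be sketched in a few lines of Python. This is only an illustration, under the assumption that XML-illegal C0 control characters found in a parsed string value are shifted into the Private Use Area by adding 0xE000 when the infoset is written as XML; the function names are hypothetical and are not part of Daffodil's API. As the reply in this thread points out, the remapping does not change how the delimiter is written in the schema: the terminator is still specified as the NUL that actually appears in the data stream.

```python
# Assumed offset, taken from the "E000 + 0 = E000" arithmetic in the thread.
PUA_OFFSET = 0xE000


def is_xml_illegal_control(code_point: int) -> bool:
    """True for C0 controls that XML 1.0 forbids (all except tab, LF, CR)."""
    return code_point < 0x20 and code_point not in (0x09, 0x0A, 0x0D)


def to_xml_safe(text: str) -> str:
    """Shift XML-illegal control characters into the Private Use Area."""
    return "".join(
        chr(ord(ch) + PUA_OFFSET) if is_xml_illegal_control(ord(ch)) else ch
        for ch in text
    )


def from_xml_safe(text: str) -> str:
    """Reverse the shift. A genuine U+E000..U+E01F character in the
    original data would be indistinguishable from a remapped control."""
    out = []
    for ch in text:
        candidate = ord(ch) - PUA_OFFSET
        if candidate >= 0 and is_xml_illegal_control(candidate):
            out.append(chr(candidate))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    value = "abc\x00"                    # string value containing a NUL
    safe = to_xml_safe(value)
    print([hex(ord(ch)) for ch in safe])
    print(from_xml_safe(safe) == value)
```

In this sketch, tab, line feed, and carriage return pass through unchanged because XML 1.0 permits them; the infoset page linked in this thread is the place to check the exact set of characters Daffodil remaps.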

Re: DFDL infoset allows characters the XML infoset doesn't ... how can that be?

2019-01-16 Thread Beckerle, Mike
See section on XML Illegal characters on this page: https://daffodil.apache.org/infoset/

Re: What word do you use for the document generated from unparsing?

2019-01-16 Thread Beckerle, Mike
The "data stream" is typically what we call the output of the unparser, or the input to the parser. The "physical representation" is another term. The fact that it is intended to be in some way related to the input is an artifact of a specific use case which is ripping data apart, validating,

DFDL infoset allows characters the XML infoset doesn't ... how can that be?

2019-01-16 Thread Costello, Roger L.
Hello DFDL community, Someone told me this: DFDL's infoset allows characters the XML infoset doesn't. What characters? How can it be? After all, well-formed XML is generated. And, the DFDL schema is well-formed XML. Right? /Roger

What word do you use for the document generated from unparsing?

2019-01-16 Thread Costello, Roger L.
Hello DFDL community, So, we parse an input to generate XML. Then, we unparse the XML to generate what? What do you call the document that results from unparsing? I call it the "reconstituted input document". Is "reconstituted" a good name? What do you call it? /Roger