Yes, this language is problematic, because

dfdl:initiator="%ES; %ES;"

satisfies the letter of the law, but is clearly nonsense. There is no rule 
preventing repeating the same delimiter multiple times in the list of 
delimiters. In that case %ES; is not "alone in the list". It has a twin 
sibling! But this is equivalent to dfdl:initiator="%ES;" which is always 
nonsense.

We need some language pointing out that %ES; cannot be adjacent to any other 
characters of a delimiter, because it is meaningless to have it there.
E.g., a%ES;a is the same as aa. The only use of %ES; is alone as one of the 
delimiters in a whitespace-separated list of delimiters. (This stipulation may 
in fact be in the DFDL spec somewhere. I have just forgotten.)

Furthermore, it is meaningless and should be disallowed to have %ES; appear 
more than once in the list.

Somewhat analogously, %WSP*;%WSP*; means the same thing as %WSP*;, so having 
them adjacent should be disallowed. %WSP*;%WSP; means the same as %WSP+; but is 
innocent enough. %WSP*;%WSP+; means the same as %WSP+; so that combination 
could also be disallowed.
And %WSP*; should not be able to appear alone in the list of delimiters more 
than once.
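
To make the proposed general rules concrete, here is a rough Python sketch of a checker for a whitespace-separated delimiter list. It only illustrates the rules suggested in this note; it is not taken from the DFDL spec or from any existing implementation, and the function name check_delimiter_list and the message texts are made up for illustration.

```python
ES = "%ES;"
WSP_STAR = "%WSP*;"
WSP_PLUS = "%WSP+;"


def check_delimiter_list(value: str) -> list[str]:
    """Return violations of the rules proposed in this note (hypothetical helper)."""
    problems = []
    literals = value.split()  # delimiters are separated by whitespace

    for lit in literals:
        # %ES; is only meaningful as a whole literal, e.g. a%ES;a is just aa.
        if ES in lit and lit != ES:
            problems.append(f"{lit}: %ES; is adjacent to other characters")
        # %WSP*;%WSP*; means the same as %WSP*;
        if WSP_STAR + WSP_STAR in lit:
            problems.append(f"{lit}: adjacent %WSP*;%WSP*; is redundant")
        # %WSP*;%WSP+; means the same as %WSP+;
        if WSP_STAR + WSP_PLUS in lit:
            problems.append(f"{lit}: %WSP*;%WSP+; is redundant")

    # Neither %ES; nor %WSP*; should appear alone in the list more than once.
    if literals.count(ES) > 1:
        problems.append("%ES; appears more than once in the list")
    if literals.count(WSP_STAR) > 1:
        problems.append("%WSP*; appears alone more than once in the list")

    return problems
```

Under these assumptions, "%ES; %ES;" and "a%ES;a" are each flagged, while "%ES; *" and the innocent "%WSP*;%WSP;" pass.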

Having specified those general rules for %ES; and %WSP*;, the constraints 
on their usage in dfdl:initiator boil down to understanding initiators and 
the dfdl:initiatedContent property.

The rest of the language there is correct; it's just not well motivated.

An initiator is a delimiter, but it is not a "terminating delimiter". Hence, 
one can imagine there are many formats where the dfdl:initiator property is 
used to specify optional stuff that might be found before an element/sequence.

Optional whitespace before elements/sequences can be absorbed by 
dfdl:initiator="%WSP*;"

An optional "*" initiator can be expressed by:
dfdl:initiator="%ES; *"

But when dfdl:initiatedContent="yes", then the initiator is never optional. It 
must be detectable and not of zero length, as this property indicates it will 
be used as a discriminator for recognizing a branch of a choice, or presence of 
optional elements.

In that case, the dfdl:initiator has to be specified as something that cannot 
be empty string, so neither %WSP*; nor %ES; can be a member of the list of 
initiators.

Note that dfdl:initiator="%ES;" is always nonsense. But if dfdl:initiator="{... 
some expression...}" and the expression returns "%ES;" then that should be 
allowed (so long as dfdl:initiatedContent="no") because the expression could be 
conditional on when an initiator is needed in the data, and when it is 
prohibited in the data.

If dfdl:initiatedContent="yes" is set, then some actual detectable initiator 
must be defined; hence dfdl:initiator="%WSP*;" can't be allowed, nor can a list 
of initiators containing "%ES;" be allowed, because initiatedContent requires 
there to be something detectable in the data stream to find as the initiator. 
This applies regardless of whether the initiator is explicit in the schema or 
computed from a runtime expression.
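
The interaction with dfdl:initiatedContent can be sketched the same way. This is a minimal Python illustration of the reasoning above, not the behavior of any real DFDL processor; check_initiator and its from_expression flag are hypothetical names introduced only for this example.

```python
ES = "%ES;"
WSP_STAR = "%WSP*;"


def check_initiator(initiator, initiated_content, from_expression=False):
    """Return problems with an initiator list under the rules in this note.

    initiator         -- whitespace-separated list of DFDL string literals
    initiated_content -- value of dfdl:initiatedContent, "yes" or "no"
    from_expression   -- True if the list was computed by a runtime expression
    """
    literals = initiator.split()
    problems = []

    # A static initiator made up only of %ES; is what this note calls
    # "always nonsense"; a runtime expression may legitimately return it.
    if not from_expression and literals and all(lit == ES for lit in literals):
        problems.append("static initiator consists only of %ES;")

    # With initiatedContent="yes", every alternative must be detectable,
    # so nothing in the list may be able to match the empty string.
    if initiated_content == "yes":
        for lit in literals:
            if lit in (ES, WSP_STAR):
                problems.append(
                    f"{lit} can match the empty string, "
                    "so initiatedContent must be no"
                )

    return problems
```

With these assumptions, "%ES; *" is accepted when initiatedContent is "no" and rejected when it is "yes", and an expression result of "%ES;" is accepted only when initiatedContent is "no".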




________________________________
From: Costello, Roger L. <[email protected]>
Sent: Friday, September 6, 2019 2:51 PM
To: [email protected] <[email protected]>
Subject: Can ES be alone in an initiator?


Table 15 says this about initiators:



• ES must not appear as the only DFDL string literal in the property. It 
can only appear as a member of a list.

• If the ES entity or the WSP* entity appear alone as one of the string 
literals in the list, then dfdl:initiatedContent must be "no".



That is wicked confusing. The first one says that ES cannot be alone in an 
initiator list. The second one says ES can be alone in an initiator list as 
long as initiatedContent is no.



Huh? I’m confused.



/Roger
