Yes, this language is problematic, because
dfdl:initiator="%ES; %ES;"
satisfies the letter of the law, but is clearly nonsense. There is no rule
against repeating the same delimiter multiple times in the list of
delimiters. In that case %ES; is not "alone in the list". It has a twin
sibling! But this is equivalent to dfdl:initiator="%ES;" which is always
nonsense.
We need some language pointing out that %ES; cannot be adjacent to any other
characters of a delimiter, because it is meaningless to have it there.
E.g., a%ES;a is the same as aa. The only use of %ES; is alone as one of the
delimiters in a whitespace-separated list of delimiters. (This stipulation may
in fact be in the DFDL spec somewhere. I have just forgotten.)
Furthermore, having %ES; appear more than once in the list is meaningless and
should be disallowed.
Somewhat analogously, %WSP*;%WSP*; means the same thing as %WSP*;, so having
them adjacent should be disallowed. %WSP*;%WSP; means the same as %WSP+; but is
innocent enough. %WSP*;%WSP+; means the same as %WSP+; so that combination
could also be disallowed.
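
To see why those combinations are redundant, here is a rough Python sketch that
models the entities as regular expressions. The mapping is a simplifying
assumption for illustration only: %WSP; is treated as one space or tab, %WSP*;
as zero or more, and %WSP+; as one or more. The real DFDL whitespace set is
larger, and the names to_regex and same_matches are made up for this sketch.

```python
import re

# Assumed, simplified regex stand-ins for the DFDL entities (illustration only).
WSP = r"[ \t]"
ENTITY_REGEX = {"%WSP;": WSP, "%WSP*;": WSP + "*", "%WSP+;": WSP + "+"}


def to_regex(delimiter):
    """Translate a delimiter made only of WSP entities into a regex string."""
    tokens = re.findall(r"%WSP[*+]?;", delimiter)
    return "".join(ENTITY_REGEX[t] for t in tokens)


def same_matches(a, b, samples):
    """True if delimiters a and b fully match exactly the same sample strings."""
    ra, rb = re.compile(to_regex(a)), re.compile(to_regex(b))
    return all(bool(ra.fullmatch(s)) == bool(rb.fullmatch(s)) for s in samples)


# A few whitespace-only strings, including the empty string.
SAMPLES = ["", " ", "  ", "\t \t", "   \t"]
```

On these samples, same_matches("%WSP*;%WSP*;", "%WSP*;", SAMPLES),
same_matches("%WSP*;%WSP;", "%WSP+;", SAMPLES) and
same_matches("%WSP*;%WSP+;", "%WSP+;", SAMPLES) all hold. That is a spot check
on a handful of strings, not a proof.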
And %WSP*; should not be able to appear alone in the list of delimiters more
than once.
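
The general rules proposed above could be checked mechanically. What follows is
only a sketch of what I am proposing, not anything taken from the DFDL spec or
from an existing implementation. The function name check_delimiter_list is
hypothetical.

```python
def check_delimiter_list(value):
    """Return a list of violations of the proposed rules for a
    whitespace-separated list of delimiters (hypothetical checker)."""
    problems = []
    literals = value.split()
    for lit in literals:
        # %ES; only makes sense as a whole literal, never next to other characters.
        if "%ES;" in lit and lit != "%ES;":
            problems.append(f"%ES; is adjacent to other characters in {lit!r}")
        # Adjacent whitespace entities that collapse to a single entity.
        for pair in ("%WSP*;%WSP*;", "%WSP*;%WSP+;"):
            if pair in lit:
                problems.append(f"redundant {pair} in {lit!r}")
    # Neither %ES; nor %WSP*; should appear alone in the list more than once.
    for entity in ("%ES;", "%WSP*;"):
        if literals.count(entity) > 1:
            problems.append(f"{entity} appears alone more than once")
    return problems
```

With this, "%ES; *" and "%WSP*;%WSP;" pass with no problems reported, while
"%ES; %ES;", "a%ES;a" and "%WSP*;%WSP+;" are each flagged.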
Having specified those general rules for %ES; and %WSP*;, the constraints
on their usage in dfdl:initiator boil down to understanding initiators and
the dfdl:initiatedContent property.
The rest of the language there is correct; it's just not well motivated.
An initiator is a delimiter, but it is not a "terminating delimiter". Hence,
one can imagine there are many formats where the dfdl:initiator property is
used to specify optional stuff that might be found before an element/sequence.
Optional whitespace before elements/sequences can be absorbed by
dfdl:initiator="%WSP*;"
An optional "*" initiator can be expressed by:
dfdl:initiator="%ES; *"
But when dfdl:initiatedContent="yes", then the initiator is never optional. It
must be detectable and not of zero length, as this property indicates it will
be used as a discriminator for recognizing a branch of a choice, or presence of
optional elements.
In that case, the dfdl:initiator has to be specified as something that cannot
match the empty string, so neither %WSP*; nor %ES; can be a member of the list
of initiators.
Note that dfdl:initiator="%ES;" is always nonsense. But if dfdl:initiator="{...
some expression...}" and the expression returns "%ES;" then that should be
allowed (so long as dfdl:initiatedContent="no") because the expression could be
conditional on when an initiator is needed in the data, and when it is
prohibited in the data.
If dfdl:initiatedContent="yes" is set, then some actual detectable initiator
must be defined; hence dfdl:initiator="%WSP*;" can't be allowed, nor can a list
of initiators containing "%ES;" be allowed, because initiatedContent requires
there to be something detectable in the data stream to find as the initiator.
This applies regardless of whether the initiator is explicit in the schema or
computed from a runtime expression.
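
Putting the initiator-specific part together, a check might look roughly like
the Python below. This is a sketch of the reasoning in this message only. The
function name initiator_problems and the from_expression flag are hypothetical,
and the initiator is assumed to be the already-evaluated string value.

```python
# Entities that can match zero characters of data.
ZERO_LENGTH_CAPABLE = ("%ES;", "%WSP*;")


def initiator_problems(initiator, initiated_content, from_expression=False):
    """Return problems with a dfdl:initiator value (hypothetical checker).

    initiator         -- whitespace-separated list of string literals
    initiated_content -- "yes" or "no", the dfdl:initiatedContent value
    from_expression   -- True if the value was computed by a runtime expression
    """
    literals = initiator.split()
    problems = []
    # A literal %ES; alone is nonsense; the same value coming from an
    # expression is fine, since the expression may be conditional.
    if literals == ["%ES;"] and not from_expression:
        problems.append("a literal %ES; alone is never meaningful")
    # With initiatedContent="yes" every initiator must be detectable in the
    # data, whether it is explicit in the schema or computed at runtime.
    if initiated_content == "yes":
        for lit in literals:
            if lit in ZERO_LENGTH_CAPABLE:
                problems.append(f"{lit} can match nothing, so it cannot discriminate")
    return problems
```

So "%ES; *" and "%WSP*;" are accepted when initiatedContent is "no" and
rejected when it is "yes", and a computed "%ES;" is accepted only when
initiatedContent is "no".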
________________________________
From: Costello, Roger L. <[email protected]>
Sent: Friday, September 6, 2019 2:51 PM
To: [email protected] <[email protected]>
Subject: Can ES be alone in an initiator?
Table 15 says this about initiators:
• ES must not appear as the only DFDL string literal in the property. It
can only appear as a member of a list.
• If the ES entity or the WSP* entity appear alone as one of the string
literals in the list, then dfdl:initiatedContent must be "no".
That is wicked confusing. The first one says that ES cannot be alone in an
initiator list. The second one says ES can be alone in an initiator list as
long as initiatedContent is no.
Huh? I’m confused.
/Roger