Thank you Mike. That is very helpful.
I made a slight modification to your example: now the input is a series of
comma-separated names. To prohibit consecutive commas we wrap each name in
parentheses and specify emptyValueDelimiterPolicy=both:
<xs:sequence dfdl:separator="," dfdl:separatorPosition="infix">
    <xs:element name="name" type="xs:string" maxOccurs="unbounded"
                dfdl:initiator="(" dfdl:terminator=")"
                dfdl:emptyValueDelimiterPolicy="both" />
</xs:sequence>
So, this is how the input would look when there is no value for the second name:
(John),(),(Bill),(Linda)
That works great.
Next, suppose that when there is no value for a name, we don't want the
initiator or terminator (consecutive commas are okay, we decide). We would
specify emptyValueDelimiterPolicy=none, right? The input should look like this:
(John),,(Bill),(Linda)
Right?
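For reference, a sketch of the schema variant being described, assuming the
only change from the earlier sequence is the policy value and that all other
format properties come from the surrounding schema:

```xml
<!-- Same sequence as before; only emptyValueDelimiterPolicy differs. -->
<xs:sequence dfdl:separator="," dfdl:separatorPosition="infix">
    <xs:element name="name" type="xs:string" maxOccurs="unbounded"
                dfdl:initiator="(" dfdl:terminator=")"
                dfdl:emptyValueDelimiterPolicy="none" />
</xs:sequence>
```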
I tried that but got this message:
[warning] Left over data. Consumed 48 bit(s) with at least 128 bit(s) remaining.
This is the output I got:
<input>
<name>John</name>
</input>
Why is this happening? What happened to the other names?
/Roger
From: Beckerle, Mike <[email protected]>
Sent: Tuesday, February 18, 2020 2:40 PM
To: [email protected]
Subject: [EXT] Re: Need an example of using emptyValueDelimiterPolicy
emptyValueDelimiterPolicy is certainly a squirrelly area of DFDL and Daffodil,
made more complicated by the fact that default values aren't fully implemented
in either Daffodil or IBM DFDL.
What you've expressed thus far doesn't motivate any need for
emptyValueDelimiterPolicy.
Your element is an integer and has no default value. So there is nothing to
create if an "empty" syntax (which would be "()" for your case) is detected.
Hence, empty isn't allowed, which is why you get the message about
emptyValueDelimiterPolicy being ignored.
Furthermore, your element has minOccurs 0, so it is "optional" and so no
defaulting would ever be done anyway. Instead nothing would be added to the
infoset on parsing. But that's only applicable if "empty" is even a concept for
your element type. In the case of an integer, it either needs to find "(8)"
with some value like 8, or it needs to find nothing at all (the next separator
perhaps). Finding "()" should cause an error that the integer can't be parsed
from empty string.
For emptyValueDelimiterPolicy to be useful on a numeric type, the element must
be required (so scalar element or minOccurs >= 1 with appropriate
occursCountKind), and must have a default value or be nillable and have
dfdl:useNilAsDefault="true" which makes being nilled the default value.
Daffodil's support for default values is also only partial. I am not sure
whether the changes above, such as making your integer nillable, would also
result in diagnostic messages about things not being supported.
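As a rough sketch of the first option (a required scalar element with a
default value), the declaration might look something like the following. It
assumes the initiator and terminator are placed on the element itself rather
than on the enclosing sequence, and that the remaining format properties come
from the surrounding schema. The default of 0 is purely illustrative, and given
the partial default-value support this may well produce "not supported"
diagnostics rather than a defaulted value.

```xml
<!-- Untested sketch: required scalar integer with a default value. -->
<!-- Per the DFDL spec, "()" would then be the empty representation -->
<!-- that triggers the default. The value 0 is an assumption.       -->
<xs:sequence>
    <xs:element name="num" type="xs:integer" default="0"
                dfdl:initiator="(" dfdl:terminator=")"
                dfdl:emptyValueDelimiterPolicy="both" />
</xs:sequence>
```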
The type xs:string however, is fully supported, because well, empty strings are
a legitimate value for strings. So you can use emptyValueDelimiterPolicy to
control whether for example you want explicit indications that the string value
is to be empty string, or not.
E.g., suppose you have a comma-separated format with 4 scalar elements, each of
which is a choice of either an integer or a string.
For example, here's some data:
1,2,foo,bar
What if we want the strings to be allowed to be empty strings, but we don't
want this allowed:
1,2,,bar
because we consider those two adjacent commas to be an evil confusing thing. We
want you to have to put something in the field.
So we can instead require the strings to have initiators and terminators so:
1,2,(foo),(bar)
but... depending on emptyValueDelimiterPolicy the evil adjacent commas might
still be allowed. If we want to disallow such evil, we must choose
emptyValueDelimiterPolicy='both'
so that
1,2,(),(bar)
is what is required to get an empty string value for the 3rd element.
Not sure that helps, but this is the sort of thing emptyValueDelimiterPolicy is
for.
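A schema for that 4-field example might be sketched roughly as follows. This is
untested, the names (IntOrString, f1 through f4, int, str) are made up for
illustration, and it assumes no target namespace and that the other format
properties (text representation, delimited lengths, and so on) come from the
surrounding schema.

```xml
<!-- Hypothetical type: each field is either a bare integer or a -->
<!-- parenthesized string.                                       -->
<xs:complexType name="IntOrString">
    <xs:choice>
        <!-- First branch: an integer such as 1 or 2 -->
        <xs:element name="int" type="xs:integer" />
        <!-- Second branch: a string wrapped in ( and ); with policy -->
        <!-- "both", an empty string has to be written as ()         -->
        <xs:element name="str" type="xs:string"
                    dfdl:initiator="(" dfdl:terminator=")"
                    dfdl:emptyValueDelimiterPolicy="both" />
    </xs:choice>
</xs:complexType>

<!-- Four required scalar fields separated by commas -->
<xs:sequence dfdl:separator="," dfdl:separatorPosition="infix">
    <xs:element name="f1" type="IntOrString" />
    <xs:element name="f2" type="IntOrString" />
    <xs:element name="f3" type="IntOrString" />
    <xs:element name="f4" type="IntOrString" />
</xs:sequence>
```

With something like this, 1,2,(),(bar) should yield an empty string for the
third field, while 1,2,,bar should be rejected because neither branch of the
choice accepts zero-length data.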
________________________________
From: Costello, Roger L. <[email protected]>
Sent: Tuesday, February 18, 2020 12:17 PM
To: [email protected]
Subject: Need an example of using emptyValueDelimiterPolicy
Hi Folks,
Suppose the input is an integer that is initiated by a left parenthesis and
terminated by a right parenthesis, e.g.,
(44)
I thought that I would use emptyValueDelimiterPolicy for that input, using this
schema:
<xs:sequence dfdl:initiator="(" dfdl:terminator=")" >
    <xs:element name="num"
                type="xs:integer"
                minOccurs="0"
                dfdl:emptyValueDelimiterPolicy="both" />
</xs:sequence>
Question #1: Is that a legitimate scenario for using emptyValueDelimiterPolicy?
Question #2: Does Daffodil support emptyValueDelimiterPolicy? This message
seems to suggest that Daffodil does not support it:
[warning] Schema Definition Warning: DFDL property was ignored:
emptyValueDelimiterPolicy="both"
/Roger