If you can send your entire schema I can reproduce this, and given that we're exploring variations on the theme, that would help there also.

From what I see, I agree that this array should not be ending at those adjacent commas, but should be constructing an empty-string element value.

There are some more properties that could be playing a role here:

One is dfdl:separatorSuppressionPolicy. I am curious if you have this as "anyEmpty" or "trailingEmpty" or "trailingEmptyStrict". I am not sure this should matter for parsing your example however.

Another property is this new dfdlx:emptyElementParsePolicy. In Daffodil, this defaults to "treatAsEmpty" for the time being. It can be set to "treatAsAbsent" in case people really hate empty elements and want those always treated as absent.

Property Name: emptyElementParsePolicy

Description: Enum. Valid values are "treatAsAbsent" or "treatAsEmpty".

This property describes the behavior of the DFDL processor for occurrences of elements of any type that have the empty representation.

When 'treatAsEmpty', if an occurrence of an element has the empty representation when parsed, the behaviour is as stated in section 9 for an occurrence with empty representation. Consequently, default values or empty strings may be added to the infoset.

When 'treatAsAbsent', if an occurrence of an element has the empty representation when parsed, the behaviour is as stated in section 9 for an absent occurrence. Consequently, default values or empty strings are never added to the infoset.

Annotation: dfdl:element, dfdl:simpleType

________________________________
From: Costello, Roger L. <[email protected]>
Sent: Tuesday, February 18, 2020 3:05 PM
To: [email protected]
Subject: Re: Need an example of using emptyValueDelimiterPolicy

Thank you Mike. That is very helpful.

I made a slight modification to your example: now the input is a series of comma-separated names.

To prohibit consecutive commas we wrap each name in parentheses and specify emptyValueDelimiterPolicy=both:

    <xs:sequence dfdl:separator="," dfdl:separatorPosition="infix">
        <xs:element name="name" type="xs:string" maxOccurs="unbounded"
                    dfdl:initiator="(" dfdl:terminator=")"
                    dfdl:emptyValueDelimiterPolicy="both" />
    </xs:sequence>

So, this is how the input would look when there is no value for the second name:

    (John),(),(Bill),(Linda)

That works great.

Next, suppose that when there is no value for a name, we don’t want the initiator or terminator (consecutive commas are okay, we decide). We would specify emptyValueDelimiterPolicy=none, right? The input should look like this:

    (John),,(Bill),(Linda)

Right?

I tried that but got this message:

    [warning] Left over data. Consumed 48 bit(s) with at least 128 bit(s) remaining.

This is the output I got:

    <input>
        <name>John</name>
    </input>

Why is this happening? What happened to the other names?

/Roger

From: Beckerle, Mike <[email protected]>
Sent: Tuesday, February 18, 2020 2:40 PM
To: [email protected]
Subject: [EXT] Re: Need an example of using emptyValueDelimiterPolicy

emptyValueDelimiterPolicy is certainly a squirrelly area of DFDL and Daffodil. It is made more complicated by the fact that default values aren't fully implemented in either Daffodil or IBM DFDL.

What you've expressed thus far doesn't motivate any need for emptyValueDelimiterPolicy.

Your element is an integer and has no default value. So there is nothing to create if an "empty" syntax (which would be "()" for your case) is detected. Hence, empty isn't allowed, and you get the message about emptyValueDelimiterPolicy being ignored.

Furthermore, your element has minOccurs 0, so it is "optional" and so no defaulting would ever be done anyway. Instead nothing would be added to the infoset on parsing. But that's only applicable if "empty" is even a concept for your element type.
In the case of an integer, it either needs to find "(8)" with some value like 8, or it needs to find nothing at all (the next separator perhaps). Finding "()" should cause an error that the integer can't be parsed from the empty string.

For emptyValueDelimiterPolicy to be useful on a numeric type, the element must be required (so a scalar element, or minOccurs >= 1 with an appropriate occursCountKind), and must either have a default value or be nillable and have dfdl:useNilForDefault="yes", which makes being nilled the default value.

Daffodil support for default values is only partial, also. I am not sure that the above, such as making your integer nillable, would not also result in diagnostic messages about things not being supported.

The type xs:string, however, is fully supported because, well, empty strings are a legitimate value for strings. So you can use emptyValueDelimiterPolicy to control whether, for example, you want explicit indications that the string value is to be the empty string, or not.

E.g., suppose you have a format which is comma separated, and each of the 4 elements, which are just scalars, is a choice of either an integer or a string. For example, here's some data:

    1,2,foo,bar

What if we want the strings to be allowed to be empty strings, but we don't want this allowed:

    1,2,,bar

because we consider those two adjacent commas to be an evil, confusing thing. We want you to have to put something in the field.

So we can instead require the strings to have initiators and terminators, like so:

    1,2,(foo),(bar)

but... depending on emptyValueDelimiterPolicy, the evil adjacent commas might still be allowed. If we want to disallow such evil, we must choose emptyValueDelimiterPolicy='both' so that

    1,2,(),(bar)

is what is required to get an empty string value for the 3rd element.

Not sure that helps, but this is the sort of thing emptyValueDelimiterPolicy is for.

________________________________
From: Costello, Roger L. <[email protected]>
Sent: Tuesday, February 18, 2020 12:17 PM
To: [email protected]
Subject: Need an example of using emptyValueDelimiterPolicy

Hi Folks,

Suppose the input is an integer that is initiated by a left parenthesis and terminated by a right parenthesis, e.g.,

    (44)

I thought that I would use emptyValueDelimiterPolicy for that input, using this schema:

    <xs:sequence dfdl:initiator="(" dfdl:terminator=")">
        <xs:element name="num" type="xs:integer" minOccurs="0"
                    dfdl:emptyValueDelimiterPolicy="both" />
    </xs:sequence>

Question #1: Is that a legitimate scenario for using emptyValueDelimiterPolicy?

Question #2: Does Daffodil support emptyValueDelimiterPolicy? This message seems to suggest that Daffodil does not support it:

    [warning] Schema Definition Warning: DFDL property was ignored: emptyValueDelimiterPolicy="both"

/Roger
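
For reproducing the (John),,(Bill),(Linda) case from the later messages, a rough and untested sketch of the names sequence with the two extra properties written out explicitly might look like the fragment below. The values given for dfdl:separatorSuppressionPolicy and dfdlx:emptyElementParsePolicy are assumptions chosen for illustration, since the thread never says what the actual schema uses. Placing dfdlx:emptyElementParsePolicy as a short-form attribute on the element is also an assumption, and the xs, dfdl and dfdlx prefixes are assumed to be declared on the enclosing schema.

```xml
<!-- Untested sketch. Assumes the xs, dfdl and dfdlx prefixes are bound on the enclosing xs:schema. -->
<!-- separatorSuppressionPolicy value is an assumption: one of the candidates asked about in the thread. -->
<xs:sequence dfdl:separator="," dfdl:separatorPosition="infix"
             dfdl:separatorSuppressionPolicy="anyEmpty">
  <!-- emptyValueDelimiterPolicy "none": an empty name carries no "(" or ")", so adjacent commas mark it. -->
  <!-- emptyElementParsePolicy value is an assumption: the Daffodil default named in the thread. -->
  <xs:element name="name" type="xs:string" maxOccurs="unbounded"
              dfdl:initiator="(" dfdl:terminator=")"
              dfdl:emptyValueDelimiterPolicy="none"
              dfdlx:emptyElementParsePolicy="treatAsEmpty" />
</xs:sequence>
```

Changing dfdl:emptyValueDelimiterPolicy back to "both" gives the variant in which () is required for an empty name. Whether either of the added properties changes the "Left over data" result is the open question in the thread, so this fragment is only a starting point for experiments.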
