Roger,
The issue is that DFDL uses the first occurrence of the terminator, regardless of padding or trim options. This means that if you have two NUL characters in a row, the first one terminates the string, and the second one terminates a different (zero-length) string.

If you had a reasonable upper limit on how many NUL characters can appear in a row, you could do something like:

    dfdl:terminator="%NUL; %NUL;%NUL; %NUL;%NUL;%NUL;"

which indicates that a string can be terminated by a sequence of 1, 2, or 3 NUL characters. DFDL will still terminate at the earliest possible position, but once it finds that position, it uses the longest terminator that matches there.

Unfortunately, there is no way to specify a terminator of "1 or more NULs". There is %WSP+; which means "1 or more whitespace", and %WSP*; which means "0 or more whitespace", but these are specific to whitespace and are not a general construct you can use with other characters.

________________________________
From: Costello, Roger L. <[email protected]>
Sent: Monday, April 1, 2019 2:47:33 PM
To: [email protected]
Subject: Question about parsing binary input containing strings separated by nulls

Hello DFDL community,

My binary input file contains:

    string null(s) string null(s) ….

The following DFDL schema correctly parses the input file:

    <xs:element name="input">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="string" type="xs:string" maxOccurs="unbounded"
                    dfdl:lengthKind="pattern"
                    dfdl:lengthPattern="[\x00-\xFF]+?(?=\x00([^\x00]|$))"
                    dfdl:representation="text"
                    dfdl:encoding="ISO-8859-1"
                    dfdl:textTrimKind="padChar"
                    dfdl:textStringPadCharacter="%NUL;"
                    dfdl:textStringJustification="left"
                    dfdl:terminator="%NUL;"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>

But why do I need dfdl:lengthPattern? Why can’t I simply state this: the input contains an unbounded number of strings, each string is padded by one or more nulls or ends at the end-of-file.
Why can’t I throw out dfdl:lengthPattern and set dfdl:lengthKind to “delimited”? Why doesn’t the following work correctly?

    <xs:element name="input">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="string" type="xs:string" maxOccurs="unbounded"
                    dfdl:lengthKind="delimited"
                    dfdl:representation="text"
                    dfdl:encoding="ISO-8859-1"
                    dfdl:textTrimKind="padChar"
                    dfdl:textStringPadCharacter="%NUL;"
                    dfdl:textStringJustification="left"
                    dfdl:terminator="%NUL;"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>

/Roger
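
The matching rule described in the reply at the top of this thread (stop at the earliest position where any terminator matches, then consume the longest terminator that matches at that position) can be illustrated with a small Python sketch. This is only a toy model under stated assumptions, not how Daffodil or any DFDL processor is actually implemented: the function name `split_on_terminators` is hypothetical, and the model ignores padding, trimming, encodings, and everything else a real parser does.

```python
def split_on_terminators(data, terminators):
    """Toy model (an assumption, not a real DFDL API): split `data` by
    scanning for the earliest position where any terminator matches,
    then consuming the longest terminator that matches at that position."""
    # Try longer terminators first so the longest one wins at a given position.
    ordered = sorted(terminators, key=len, reverse=True)
    fields = []
    start = 0  # start of the field currently being scanned
    pos = 0    # current scan position
    while pos < len(data):
        match = next((t for t in ordered if data.startswith(t, pos)), None)
        if match is None:
            pos += 1
            continue
        fields.append(data[start:pos])  # may be empty (zero-length string)
        pos += len(match)
        start = pos
    if start < len(data):
        fields.append(data[start:])  # trailing data with no terminator
    return fields


NUL = b"\x00"
data = b"ab\x00\x00cd\x00"

# A single-NUL terminator: the second NUL terminates a zero-length string.
print(split_on_terminators(data, [NUL]))

# Terminators of 1, 2, or 3 NULs: the run of two NULs is consumed as one terminator.
print(split_on_terminators(data, [NUL, NUL * 2, NUL * 3]))

# Four NULs in a row exceed the longest listed terminator,
# so a zero-length string appears again.
print(split_on_terminators(b"ab\x00\x00\x00\x00cd\x00", [NUL, NUL * 2, NUL * 3]))
```

In this model the first call yields an empty field between "ab" and "cd", the second does not, and the third shows why the listed terminators only help up to the chosen upper limit on consecutive NULs.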
