The issue here is that the dot character doesn't match newlines. So your
expression is essentially just looking for one or more non-newline
characters up until the end of the data. Your field contains a newline,
so the regular expression cannot reach the end of the data, fails to
match, and results in a zero length string.

If you want dot to match a newline, you can put the "(?s)" flag before
the regex.
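
As a rough illustration of the difference, here is a small sketch using
Python's re module as a stand-in. That is an assumption for demonstration
only: DFDL patterns are not evaluated by Python, but the dot, the "(?s)"
flag, and "$" behave the same way for this input. The variable names are
made up for the example.

```python
import re

# The data left over when element C is parsed (taken from the question).
rest = "Broccoli\n3ABC"

# Original pattern: the dot stops at the newline, and "$" does not match
# there because more data follows, so nothing matches at this position.
original = re.match(r".+?(?=$)", rest)

# Same pattern with the (?s) flag: the dot may now cross the newline,
# so the match can run to the end of the data.
dotall = re.match(r"(?s).+?(?=$)", rest)

print(original)
print(repr(dotall.group(0)))
```

The first pattern finds no match, which corresponds to the zero length
result described above; the second captures both lines.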

You can also simplify the expression a bit. You don't need to make the
dot non-greedy, and the $ doesn't need to be inside a lookahead. So the
following should work and is a bit more compact:

 dfdl:lengthPattern="(?s).+$"

That will match one or more characters (including newlines) up until the
end of the data.
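
To check that the compact form captures the same text as the original
form with "(?s)" added, here is a quick sketch, again using Python's re
module purely as a stand-in (an assumption, since DFDL patterns are not
run by Python). The variable names are hypothetical.

```python
import re

sample = "Broccoli\n3ABC"

# Non-greedy dot with a lookahead versus the compact greedy form.
verbose = re.match(r"(?s).+?(?=$)", sample).group(0)
compact = re.match(r"(?s).+$", sample).group(0)
print(verbose == compact)

# Caveat, at least in Python's re: "$" also matches just before a
# newline that ends the data, so the two forms differ when the data
# ends in a newline. The non-greedy form stops before it; the greedy
# form includes it.
with_newline = sample + "\n"
lazy_end = re.match(r"(?s).+?(?=$)", with_newline).group(0)
greedy_end = re.match(r"(?s).+$", with_newline).group(0)
print(repr(lazy_end))
print(repr(greedy_end))
```

For the sample in the question, which has no trailing newline, both
forms capture the same two lines.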



On 8/27/19 12:43 PM, Costello, Roger L. wrote:
> Hello DFDL community,
> 
> My input is this:
> 
> Hello, World Blah
> Broccoli
> 3ABC
> 
> I want it parsed to this:
> 
> <input>
> <A>Hello, World</A>
> <B>Blah</B>
> <C>Broccoli
> 3ABC</C>
> </input>
> 
> That is, the first field is exactly 12 characters. The second field extends
> up to the newline. The third field is the rest.
> 
> Below is my DFDL schema. It produces this result:
> 
> <input>
> <A>Hello, World</A>
> <B>Blah</B>
> <C></C>
> </input>
> 
> along with a warning message saying that a bunch of bytes remain.
> 
> Why do I get that result instead of the desired result?  /Roger
> 
> <xs:element name="input">
>   <xs:complexType>
>     <xs:sequence>
>       <xs:element name="A" type="xs:string"
>                   dfdl:lengthKind="explicit"
>                   dfdl:length="12"
>                   dfdl:lengthUnits="characters"/>
>       <xs:element name="B" type="xs:string"
>                   dfdl:lengthKind="delimited"
>                   dfdl:terminator="%NL;"/>
>       <xs:element name="C" type="xs:string"
>                   dfdl:lengthKind="pattern"
>                   dfdl:lengthPattern=".+?(?=$)"/>
>     </xs:sequence>
>   </xs:complexType>
> </xs:element>
> 
