Ok, if you need to use lengthKind="pattern" you can do the following:
<xs:element name="RunwayIdentifier"
dfdl:lengthKind="pattern"
dfdl:lengthPattern="[0-9]{2,2}(L|R){0,1}"
dfdl:textPadKind="padChar"
dfdl:textStringJustification="left"
dfdl:textStringPadCharacter="%SP;">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:minLength value="3"/>
<xs:maxLength value="3"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Note that I dropped the optional space from the regex. This should produce the
result you are looking for. According to the spec, when unparsing, DFDL pads the
output up to the given xs:minLength if necessary.
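To see concretely what dropping the optional space changes, here is a minimal
sketch in Python. It assumes Python's re module behaves like the regex engine
Daffodil uses for patterns this simple; the helper name matched_prefix is made
up for illustration and is not part of Daffodil.

```python
import re

# The two lengthPattern values discussed in this thread.
WITH_SPACE = r"[0-9]{2,2}(L|R){0,1}[ ]{0,1}"
WITHOUT_SPACE = r"[0-9]{2,2}(L|R){0,1}"


def matched_prefix(pattern: str, data: str) -> str:
    """Return the text the pattern matches at the start of data ('' if none)."""
    m = re.match(pattern, data)  # re.match anchors at the start of data only
    return m.group(0) if m else ""


# Each input is the field followed by the '/' that separates fields.
for data in ("23L/", "23 /"):
    for label, pattern in (("with space", WITH_SPACE),
                           ("without space", WITHOUT_SPACE)):
        print(f"{data!r} {label}: {matched_prefix(pattern, data)!r}")
```

For the input "23 /", the pattern with the optional space matches three
characters ("23 "), while the pattern without it matches only two ("23"). The
match length is what lengthKind="pattern" uses as the element's length.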
Josh Adams
________________________________
From: Roger L Costello <[email protected]>
Sent: Thursday, July 29, 2021 3:21 PM
To: [email protected] <[email protected]>
Subject: Re: I am confused about what Daffodil does, in a fixed field, with the
data following the data that matches a regex
Hello Josh,
No doubt you are correct that a better solution is to do as you describe.
However, for reasons that I cannot reveal, I must use regexes. Thus my question.
/Roger
From: Adams, Joshua <[email protected]>
Sent: Thursday, July 29, 2021 3:14 PM
To: [email protected]
Subject: [EXT] Re: I am confused about what Daffodil does, in a fixed field,
with the data following the data that matches a regex
Hello Roger,
In this case, where the field is always exactly 3 characters long, your best
bet would be to use lengthKind="explicit" and specify the appropriate padding:
<xs:element name="RunwayIdentifier" type="xs:string"
dfdl:lengthKind="explicit"
dfdl:length="3"
dfdl:textTrimKind="padChar"
dfdl:textPadKind="padChar"
dfdl:textStringJustification="left"
dfdl:textStringPadCharacter="%SP;" />
When parsing, DFDL produces an infoset value without the trailing space; when
unparsing, it adds the space back if the value is shorter than the field.
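The round trip can be sketched in a few lines of Python. This is only a model
of the behaviour described above, assuming a 3-character field, a space as the
pad character, left justification, and trimming of the pad character enabled
on parse; parse_field and unparse_field are hypothetical names, not Daffodil
APIs.

```python
FIELD_LENGTH = 3  # fixed field width given by the data format
PAD_CHAR = " "    # the character that %SP; stands for


def parse_field(raw: str) -> str:
    """Trim trailing pad characters from a fixed-length, left-justified field."""
    if len(raw) != FIELD_LENGTH:
        raise ValueError(f"expected {FIELD_LENGTH} characters, got {len(raw)}")
    return raw.rstrip(PAD_CHAR)


def unparse_field(value: str) -> str:
    """Pad a left-justified value on the right up to the fixed field length."""
    if len(value) > FIELD_LENGTH:
        raise ValueError(f"{value!r} is longer than {FIELD_LENGTH} characters")
    return value.ljust(FIELD_LENGTH, PAD_CHAR)


# The infoset value carries no padding; the data on the wire does.
for raw in ("23L", "23 "):
    value = parse_field(raw)
    print(repr(raw), "->", repr(value), "->", repr(unparse_field(value)))
```

Both "23L" and "23 " survive the round trip unchanged, and the infoset value
for the second one is "23" with no trailing space.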
If you want to further validate the data, you could add an assert to validate
against the pattern:
<xs:element name="RunwayIdentifier" type="xs:string"
dfdl:lengthKind="explicit"
dfdl:length="3"
dfdl:textTrimKind="padChar"
dfdl:textPadKind="padChar"
dfdl:textStringJustification="left"
dfdl:textStringPadCharacter="%SP;" >
<xs:annotation>
<xs:appinfo source="http://www.ogf.org/dfdl/">
<dfdl:assert testKind="pattern" testPattern="[0-9]{2,2}(L|R){0,1}[ ]{0,1}" />
</xs:appinfo>
</xs:annotation>
</xs:element>
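A rough Python model of what such a pattern test checks, under two
assumptions: that the pattern is applied at the start of the element's data and
passes when it matches with non-zero length (that is my reading of the spec),
and that Python's re module agrees with Daffodil's regex engine on this
pattern. The function names are hypothetical.

```python
import re

ASSERT_PATTERN = r"[0-9]{2,2}(L|R){0,1}[ ]{0,1}"


def assert_passes(field: str) -> bool:
    """Prefix semantics: the pattern must match a non-empty prefix of field."""
    m = re.match(ASSERT_PATTERN, field)
    return m is not None and m.end() > 0


def whole_field_ok(field: str) -> bool:
    """Stricter check for comparison: all 3 characters must fit the pattern."""
    return len(field) == 3 and re.fullmatch(ASSERT_PATTERN, field) is not None


for field in ("23L", "23 ", "23X", "2L "):
    print(repr(field), assert_passes(field), whole_field_ok(field))
```

Under prefix semantics "23X" passes, because "23" on its own satisfies the
pattern; only the stricter whole-field check rejects it. If the third character
matters, the pattern would need to account for it.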
Josh Adams
________________________________
From: Roger L Costello <[email protected]>
Sent: Thursday, July 29, 2021 2:31 PM
To: [email protected] <[email protected]>
Subject: I am confused about what Daffodil does, in a fixed field, with the
data following the data that matches a regex
Hi Folks,
I have a data item that identifies an airport runway, e.g. 23L (runway 23 left).
The data format specifies that the length of the field for the data item is
exactly 3 characters.
A runway identifier does not need to have the indication of whether it is left
or right; it's acceptable to just provide a two digit identifier, e.g., 23. So,
L (and R) is optional.
Here's the DFDL Schema for the data item:
<xs:element name="RunwayIdentifier" type="xs:string"
dfdl:lengthKind="pattern"
dfdl:lengthPattern="[0-9]{2,2}(L|R){0,1}[ ]{0,1}"/>
Notice that the end of the regex says that there is an optional space (which is
needed in the event that only the two digit identifier is used, without L or R).
If this is the input:
/23L/
(Slashes separate the data format's fields.)
Then this is the generated XML:
<RunwayIdentifier>23L</RunwayIdentifier>
If this is the input:
/23 /
(One space after the two digit runway identifier.)
Then this is the generated XML:
<RunwayIdentifier>23 </RunwayIdentifier>
Notice the space after the two digit runway identifier.
Interestingly, I have found that in the regex I can omit [ ]{0,1}
If the schema has this:
<xs:element name="RunwayIdentifier" type="xs:string"
dfdl:lengthKind="pattern"
dfdl:lengthPattern="[0-9]{2,2}(L|R){0,1}"/>
Notice that I omitted [ ]{0,1}
Then with this input:
/23 /
(One space after the two digit runway identifier.)
This is the generated XML:
<RunwayIdentifier>23</RunwayIdentifier>
Notice that there is no space after the two digit runway identifier.
Observe the different outputs:
<RunwayIdentifier>23 </RunwayIdentifier>
<RunwayIdentifier>23</RunwayIdentifier>
I get the former when the regex specifies [ ]{0,1} and I get the latter when I
omit it.
I don't understand what's happening here. The field has a fixed length - 3
characters. In the second case, the regex specifies two digits followed
optionally by L or R and it does not specify an optional space. But the input
has a space. Apparently Daffodil gobbles up the input per the regex pattern
(i.e., it gobbles up 23) and then it does what with the space? Discard it? How
can Daffodil simply discard data? I'm confused.
/Roger