Hi Folks,
Below is my writeup of the first category. Is the writeup clear and easy to
understand? Does it have any typos? Is it correct and complete? (Thanks, Davin,
for your feedback from earlier today! I've incorporated your comments into my
writeup.)
-------------------------------------------------------------------------------------
1. Field with fixed length, nillable, not composite, no choice
Scenario: A data format has a field that has a fixed length of 3. If no data is
available to populate the field, the field is to be populated with a hyphen.
The hyphen may occur anywhere within the field.
The field has these requirements:
>> Fixed length (3)
>> Nillable: hyphen is the nil value, and it may be positioned anywhere within the 3-character field
>> Not composite, i.e., values are atomic
>> Values must be left-justified
>> Values shorter than 3 characters must be padded with spaces
Here is an XML Schema (XSD) for the field:
<xs:element name="Example" nillable="true">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="ABC"/>
<xs:enumeration value="DEF"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
To specify that the field is fixed length, add these two DFDL properties to the
XSD:
dfdl:lengthKind="explicit"
dfdl:length="3"
To specify that the field is nillable with hyphen as the nil value, add these
two DFDL properties:
dfdl:nilKind="literalValue"
dfdl:nilValue="%WSP*;-%WSP*;"
The value of nilValue (%WSP*;-%WSP*;) deserves scrutiny: it matches zero or
more whitespace characters, a hyphen, and zero or more whitespace characters.
That is how to specify that the hyphen may be positioned anywhere within the
3-character field, with whitespace filling the remaining positions.
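As a rough illustration of what that nil value accepts, here is a small Python sketch. It is not DFDL and not how a DFDL processor is implemented; it only models the matching rule. It assumes that Python's \s is a close enough stand-in for the %WSP; character class, and the names NIL_PATTERN and is_nil are made up for this example.

```python
import re

# Assumption: Python's \s approximates DFDL's %WSP; character class.
# The regex mirrors %WSP*;-%WSP*; (whitespace*, hyphen, whitespace*).
NIL_PATTERN = re.compile(r"\s*-\s*")

FIELD_LENGTH = 3  # the fixed length from dfdl:length="3"


def is_nil(field: str) -> bool:
    """Return True if the fixed-length field holds only a hyphen plus whitespace."""
    return len(field) == FIELD_LENGTH and NIL_PATTERN.fullmatch(field) is not None


# The hyphen may sit in any of the three positions.
for candidate in ["-  ", " - ", "  -", "ABC", "- -"]:
    print(repr(candidate), is_nil(candidate))
```

The first three candidates are treated as nil; "ABC" and "- -" are not.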
To specify that field values must be left-justified and values shorter than 3
characters must be padded with spaces, add these DFDL properties:
dfdl:textPadKind="padChar"
dfdl:textStringPadCharacter="%SP;"
dfdl:textStringJustification="left"
If an input contains a value that is shorter than 3 characters, the value will
have been padded on the right with spaces to fill the field. When parsed,
however, we want just the value without the padding. Use this DFDL property to
direct the parser to remove padding:
dfdl:textTrimKind="padChar"
To specify that there is a single value and not a choice of values, use the XSD
simpleType, as shown above.
Here is the element declaration after adding the DFDL properties:
<xs:element name="Example" nillable="true"
dfdl:nilKind="literalValue"
dfdl:nilValue="%WSP*;-%WSP*;"
dfdl:lengthKind="explicit"
dfdl:length="3"
dfdl:textTrimKind="padChar"
dfdl:textPadKind="padChar"
dfdl:textStringPadCharacter="%SP;"
dfdl:textStringJustification="left">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="ABC"/>
<xs:enumeration value="DEF"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
In this case all the enumeration values are of the required length (3). Suppose
some values were shorter (such as "XY"); would you need to pad them with spaces
in the enumeration? No, there is no need to pad enumeration values. On parsing,
dfdl:length="3" means that exactly 3 characters are read from the input and
dfdl:textTrimKind="padChar" removes the trailing spaces, so the value that is
compared against the enumeration is "XY". On unparsing,
dfdl:textPadKind="padChar" and dfdl:textStringPadCharacter="%SP;" pad a value
shorter than 3 characters with spaces to fill the field.
The dfdl:textStringJustification="left" property specifies that values must be
left-justified. This means that the following input is okay (assume the field
is one field within a series of fields that are delimited by slashes):
.../XY /...
but this is not:
.../ XY/...
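To see why the second input is rejected, here is a small Python sketch of the idea. It is only a model, not a DFDL processor: it assumes that with left justification only trailing pad characters are trimmed, and that the trimmed value is then checked against the enumeration (that is, validation is turned on). The enumeration value "XY" is hypothetical, added for the sake of the example, and the function names are made up.

```python
# Hypothetical enumeration: "XY" is added here only to illustrate a short value.
ALLOWED = {"ABC", "DEF", "XY"}

FIELD_LENGTH = 3


def trim_left_justified(field: str) -> str:
    """With left justification, only trailing pad characters (spaces) are removed."""
    return field.rstrip(" ")


def is_valid(field: str) -> bool:
    """Check the fixed length, trim the padding, then compare with the enumeration."""
    return len(field) == FIELD_LENGTH and trim_left_justified(field) in ALLOWED


for raw in ["XY ", " XY"]:
    print(repr(raw), "->", repr(trim_left_justified(raw)), is_valid(raw))
```

In this model "XY " trims to "XY" and is accepted, while " XY" keeps its leading space and matches none of the enumeration values.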
If there is no input data available to populate the field, a hyphen is to be
inserted. In other words, hyphen is the nil value. Of course, even with a nil
value the field is still required to have length 3, so the hyphen must be
padded with spaces. dfdl:nilValue="%WSP*;-%WSP*;" specifies that the hyphen may
be positioned anywhere within the 3-character field.
Let's see how a DFDL processor parses the element. With the following input
(note the spaces around the hyphen):
.../ - /...
parsing produces this output:
<Example xsi:nil="true"></Example>
and unparsing produces this output:
.../-  /...
Notice that unparsing results in moving the hyphen to the left side of the
field.
With this input:
.../ABC/...
parsing produces this output:
<Example>ABC</Example>
and unparsing produces this output:
.../ABC/...
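The two round trips above can be mimicked with a short Python sketch. This is only a simplified model of the behavior described in this writeup, not an actual DFDL processor: None stands in for xsi:nil="true", Python's \s approximates %WSP;, validation is folded into parsing for brevity, and the names parse_field and unparse_field are invented for the example.

```python
import re
from typing import Optional

FIELD_LENGTH = 3                       # dfdl:length="3"
PAD_CHAR = " "                         # dfdl:textStringPadCharacter="%SP;"
NIL_PATTERN = re.compile(r"\s*-\s*")   # approximates dfdl:nilValue="%WSP*;-%WSP*;"
ALLOWED = {"ABC", "DEF"}               # the enumeration facet


def parse_field(field: str) -> Optional[str]:
    """Return None for the nil value, otherwise the trimmed, validated value."""
    if len(field) != FIELD_LENGTH:
        raise ValueError(f"field must be {FIELD_LENGTH} characters: {field!r}")
    if NIL_PATTERN.fullmatch(field):
        return None
    value = field.rstrip(PAD_CHAR)     # left-justified, so trim on the right only
    if value not in ALLOWED:
        raise ValueError(f"value not in enumeration: {value!r}")
    return value


def unparse_field(value: Optional[str]) -> str:
    """Write the hyphen for nil, then pad on the right to the fixed length."""
    text = "-" if value is None else value
    return text.ljust(FIELD_LENGTH, PAD_CHAR)


for raw in [" - ", "ABC"]:
    value = parse_field(raw)
    print(repr(raw), "->", value, "->", repr(unparse_field(value)))
```

In this model the nil input " - " comes back out as "-  ", which matches the observation that unparsing moves the hyphen to the left side of the field.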
If the XSD had used a pattern facet instead of the enumeration facet:
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="ABC|DEF" />
</xs:restriction>
</xs:simpleType>
everything would work the same. That is, the same set of DFDL properties is
used and we get the same outputs on parsing and unparsing.
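As a quick sanity check that the two facets accept the same values, here is a Python sketch. It relies on the fact that an XSD pattern must match the whole value, which is modeled here with re.fullmatch. The Python regex engine is only an approximation of the XSD regex dialect (adequate for a pattern this simple), and the helper names are made up.

```python
import re

ENUMERATION = {"ABC", "DEF"}        # the enumeration facet
PATTERN = re.compile(r"ABC|DEF")    # the pattern facet


def valid_by_enumeration(value: str) -> bool:
    """Enumeration facet: the value must equal one of the listed values."""
    return value in ENUMERATION


def valid_by_pattern(value: str) -> bool:
    """Pattern facet: XSD patterns are implicitly anchored, hence fullmatch."""
    return PATTERN.fullmatch(value) is not None


for candidate in ["ABC", "DEF", "XY", "ABCD", ""]:
    print(repr(candidate), valid_by_enumeration(candidate), valid_by_pattern(candidate))
```

For every candidate, the two checks give the same answer.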