You said the length is 100, so that should be the dfdl:length, with
lengthKind 'explicit'.
What about keeping your regex, but applying it via a pattern facet instead
of lengthPattern?
<element name="Foo" dfdl:lengthKind='explicit' dfdl:length='100'>
  <simpleType>
    <restriction base="xs:string">
      <pattern value="[A-Z]{100}|[ ]*-[ ]*"/>
    </restriction>
  </simpleType>
</element>
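To sanity-check the regex outside of DFDL, here is a minimal Python sketch of what the explicit length plus the pattern facet are meant to accept. It is only an illustration, not how a DFDL processor is implemented. The names FOO_PATTERN, FIELD_LENGTH, and check_untrimmed are made up for this example. An XSD pattern facet has to match the entire value, so re.fullmatch is the closest Python analogue.

```python
import re

# Hypothetical names for illustration; not part of DFDL or any DFDL processor.
FIELD_LENGTH = 100  # mirrors dfdl:length='100'
FOO_PATTERN = re.compile(r"[A-Z]{100}|[ ]*-[ ]*")  # same regex as the pattern facet


def check_untrimmed(field: str) -> bool:
    """Return True if a raw, untrimmed field has the fixed length and fits the pattern."""
    if len(field) != FIELD_LENGTH:
        # The explicit length, not the regex, is what fixes the width at 100.
        return False
    # fullmatch mimics the whole-value matching of an XSD pattern facet.
    return FOO_PATTERN.fullmatch(field) is not None
```

Because the length is fixed separately, the right-hand alternative no longer has to enumerate hyphen positions: any 100-character string made of spaces and exactly one hyphen fits.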
You should also be able to trim the spaces, so that the string's value is
either 100 characters of A-Z or a single "-" character.
Note that in this case your regex is simpler. The two "[ ]*" are gone
because the spaces will be trimmed from both ends of the string.
<element name="Foo" dfdl:lengthKind='explicit' dfdl:length='100'
         dfdl:textTrimKind='padChar'
         dfdl:textStringPadCharacter='%SP;'
         dfdl:textPadKind='padChar'
         dfdl:textStringJustification="center">
  <simpleType>
    <restriction base="xs:string">
      <pattern value="[A-Z]{100}|-"/>
    </restriction>
  </simpleType>
</element>
I did not run this DFDL, but this sort of thing is typical of fixed length
data.
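To make the intended parse behavior of the trimming variant concrete, here is a rough Python sketch: take the fixed 100 characters, trim the pad character from both ends (which is what center justification implies), then check the result against the simpler regex. This is an illustration under those assumptions, not output from a DFDL processor. The names parse_foo, PAD_CHAR, and TRIMMED_PATTERN are hypothetical.

```python
import re

# Hypothetical names for illustration; not part of DFDL or any DFDL processor.
FIELD_LENGTH = 100  # mirrors dfdl:length='100'
PAD_CHAR = " "      # mirrors dfdl:textStringPadCharacter='%SP;'
TRIMMED_PATTERN = re.compile(r"[A-Z]{100}|-")  # same regex as the pattern facet


def parse_foo(raw: str) -> str:
    """Trim padding from both ends of a fixed-width field, then validate the value."""
    if len(raw) != FIELD_LENGTH:
        raise ValueError(f"expected {FIELD_LENGTH} characters, got {len(raw)}")
    # Center justification means pad characters are removed on the left and the right.
    value = raw.strip(PAD_CHAR)
    # fullmatch mimics the whole-value matching of an XSD pattern facet.
    if TRIMMED_PATTERN.fullmatch(value) is None:
        raise ValueError(f"value does not match the pattern: {value!r}")
    return value
```

With this, a field holding a lone hyphen anywhere among the spaces comes out as "-", and a field of 100 uppercase letters comes out unchanged.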
On Thu, Jul 28, 2022 at 8:52 AM Roger L Costello <[email protected]> wrote:
> Hi Folks,
>
> The text data format that I am writing a DFDL schema for has a field
> (let's name it "Foo") with a fixed width. Let's say the width is 100
> characters. The content of the field is uppercase letters. If there is no
> data available to populate the field, it must be populated with a single
> hyphen (surrounded by spaces to ensure the field has a width of 100). The
> hyphen may be in any position within the field. For reasons I will not
> share, I must specify the field's content using a regex:
>
> lengthKind=pattern
> lengthPattern=[A-Z]{100}
>
> However, that lengthPattern doesn't take into account the hyphen that is
> needed when there is no data. So I updated the regex like this:
>
> lengthPattern=[A-Z]{100}|[ ]*-[ ]*
>
> However, the right-hand side of that regex (which deals with the hyphen)
> doesn't constrain the length of the field. Recall the hyphen may be
> positioned anywhere within the 100 character field. Writing a regex that
> specifies all possible positions of the hyphen, while ensuring the field is
> 100 characters, is not reasonable.
>
> So it would seem that I need to specify length=100 on the element
> declaration:
>
> lengthKind=explicit
> length=100
>
> But now I have conflicting requirements:
>
> 1. The element declaration needs to specify lengthKind=pattern for the
> regex
>
> 2. The element declaration needs to specify lengthKind=explicit for the
> field length
>
> That's a problem. That's not legal.
>
> In other words, I need this illegal DFDL:
>
> <xs:element name="Foo"
>             nillable="true"
>             dfdl:nilValue="-"
>             dfdl:lengthKind="explicit"
>             dfdl:length="100"
>             dfdl:lengthUnits="characters"
>             dfdl:lengthKind="pattern"
>             dfdl:lengthPattern="[A-Z]{100}|[ ]*-[ ]*">
>   <xs:simpleType>
>     <xs:annotation>
>       <xs:appinfo source="http://www.ogf.org/dfdl/">
>         <dfdl:assert test="{ (fn:nilled(.)) or (. ne '') }"/>
>       </xs:appinfo>
>     </xs:annotation>
>     <xs:restriction base="xs:string"/>
>   </xs:simpleType>
> </xs:element>
>
> Is there a solution to this problem? If not, is there a workaround?
>
> /Roger
>