I would suggest this sort of thing.

<xs:choice>
    <xs:sequence>
        <xs:element name="uri" type="tns:upaDummy"/>
        <xs:element name="name" type="tns:nameType" dfdl:terminator=":>"/>
        <xs:element name="value" type="tns:URLType" dfdl:terminator="%NL;"/>
    </xs:sequence>
    <xs:sequence>
        <xs:element name="b64" type="tns:upaDummy"/>
        <xs:element name="name" type="tns:nameType" dfdl:terminator="::"/>
        <xs:element name="value" type="tns:Base64Type"
                    dfdl:terminator="... whatever defines end of base64 ..."/>
    </xs:sequence>
    <xs:sequence>
        <xs:element name="str" type="tns:upaDummy"/>
        <xs:element name="name" type="tns:nameType" dfdl:terminator=":"/>
        <xs:element name="value" type="xs:string" dfdl:terminator="%NL;"/>
    </xs:sequence>
</xs:choice>


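The uri, b64, and str elements are dummy flag elements that work around XSD's 
Unique Particle Attribution (UPA) rule. Unfortunately they are unavoidable: 
without them, every branch of the choice would begin with an element named 
"name", which XSD rejects as ambiguous.

The type upaDummy should define them as fixed-length strings of length zero.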
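
As a rough illustration of what such a zero-length type might look like, here 
is a minimal sketch. It is an assumption, not a tested definition: it presumes 
a default dfdl:format is already in scope that supplies the remaining required 
properties (encoding, representation, and so on), and the choice of explicit 
length with lengthUnits="characters" is just one plausible way to express 
"fixed length zero".

```xml
<!-- Hypothetical definition of the upaDummy type referenced in the choice.
     Assumes a default dfdl:format in scope provides all other properties. -->
<xs:simpleType name="upaDummy"
               dfdl:lengthKind="explicit"
               dfdl:length="0"
               dfdl:lengthUnits="characters">
  <!-- A plain string that consumes no data, so it always parses. -->
  <xs:restriction base="xs:string"/>
</xs:simpleType>
```

Because the element consumes nothing, it should not affect how the name and 
value elements that follow it are parsed; it only gives each branch of the 
choice a distinct first element.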


Converting the separators to terminators here is not arbitrary. When parsing a 
base64 value, the above will work even if "::" were legal base64 syntax, 
because no separator is in scope around the base64 value element.


...mike beckerle

Tresys



________________________________
From: Costello, Roger L. <[email protected]>
Sent: Monday, March 26, 2018 11:05:18 AM
To: [email protected]
Subject: RE: How to parse a line that is delimited by a colon but sometimes has 
two colons?

I would like to generalize my question a bit.

Not only can there be two consecutive colons:

name:: value

(the second colon indicates the value is base64 text)

But there can be colons within value, e.g.

name:> file:///usr/local/directory/photos/fiona.jpg

(the > symbol indicates the value is a url, and the url may contain a colon)

So, how to express this in DFDL?

/Roger

-----Original Message-----
From: Costello, Roger L.
Sent: Monday, March 26, 2018 10:42 AM
To: [email protected]
Subject: How to parse a line that is delimited by a colon but sometimes has two 
colons?

Hello DFDL experts!

I am using DFDL to parse lines that look like this:

name: value

I am using this DFDL code to parse the lines:

<xs:sequence dfdl:separator=":" dfdl:separatorPosition="infix">
    <xs:element name="name" type="xs:string" />
    <xs:element name="value" type="xs:string" />
</xs:sequence>

If the value is base64 text, then a double colon is used:

name:: base64-value

The above DFDL code doesn't seem to work in this situation. What's the correct 
way to write DFDL code which can handle lines with a single colon as well as 
lines with a double colon?

/Roger
