Another option is something like the schema below. Unlike Mike's snippet,
it doesn't always put the value in a "value" element regardless of type,
but it does avoid some duplication.

  <xs:sequence>
    <xs:element name="name" type="xs:string" dfdl:terminator=":" />
    <xs:choice>
      <xs:choice dfdl:initiatedContent="yes">
        <xs:element name="base64" dfdl:initiator=":" ... />
        <xs:element name="uri" dfdl:initiator="&gt;" ... />
      </xs:choice>
      <xs:element name="string" type="xs:string" ... />
    </xs:choice>
  </xs:sequence>

This uses a terminator to find the end of name, then uses nested choices
with initiators to determine whether the next thing is base64 content or
a uri, defaulting to string if neither initiator is present. As in Mike's
snippet, this also changes the separator to a terminator so the colon is
not in scope when parsing the values.
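
To make the decision order concrete, here is a rough Python sketch of the same logic. It is only an illustration, not DFDL and not how a DFDL processor is implemented; the function name parse_line and the stripping of whitespace around the value are my own assumptions.

```python
def parse_line(line):
    """Mimic the nested-choice logic: read name up to the first ':',
    then branch on what immediately follows (hypothetical helper)."""
    # The ':' acts as a terminator on name, so it is consumed here and
    # is no longer a delimiter while the value is being read.
    name, colon, rest = line.partition(":")
    if not colon:
        raise ValueError("no ':' terminator found after name")

    if rest.startswith(":"):      # second colon: base64 branch
        kind, value = "base64", rest[1:]
    elif rest.startswith(">"):    # '>' initiator: uri branch
        kind, value = "uri", rest[1:]
    else:                         # no initiator: default string branch
        kind, value = "string", rest

    # Assumption: surrounding whitespace is not part of the value.
    return name, kind, value.strip()
```

For example, parse_line("name:> file:///usr/local/directory/photos/fiona.jpg") takes the uri branch, and the colon inside the URL causes no trouble because only the first colon is treated as the end of name.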

- Steve


On 03/26/2018 12:40 PM, Mike Beckerle wrote:
> I would suggest this sort of thing.
> 
> 
> <xs:choice>
> 
>      <xs:sequence>
> 
>           <xs:element name="uri" type="tns:upaDummy"/>
> 
>           <xs:element name="name" type="tns:nameType" dfdl:terminator=":>" />
> 
>           <xs:element name="value" type="tns:URLType" dfdl:terminator="%NL;" />
> 
>      </xs:sequence>
> 
>     <xs:sequence>
> 
>           <xs:element name="b64" type="tns:upaDummy"/>
> 
>           <xs:element name="name" type="tns:nameType" dfdl:terminator="::"/>
> 
>           <xs:element name="value" type="tns:Base64Type"
>                       dfdl:terminator="... whatever defines end of base64 ..."/>
> 
>      </xs:sequence>
> 
>     <xs:sequence>
>           <xs:element name="str" type="tns:upaDummy"/>
>           <xs:element name="name" type="tns:nameType" dfdl:terminator=":"/>
> 
>           <xs:element name="value" type="xs:string" dfdl:terminator="%NL;"/>
> 
>    </xs:sequence>
> 
> </xs:choice>
> 
> 
> The uri, b64, and str are flag UPA dummy elements which are unfortunately 
> unavoidable due to XSD restrictions.
> 
> The type upaDummy should define them to be fixed length zero-length strings.
> 
> 
> Conversion of separators to terminators here is not arbitrary. When parsing a 
> base64, the above will work even if "::" was legal base64 syntax, because 
> there's no separator in scope surrounding the base64 value element.
> 
> 
> ...mike beckerle
> 
> Tresys
> 
> 
> 
> 
> --------------------------------------------------------------------------------
> *From:* Costello, Roger L. <[email protected]>
> *Sent:* Monday, March 26, 2018 11:05:18 AM
> *To:* [email protected]
> *Subject:* RE: How to parse a line that is delimited by a colon but sometimes 
> has two colons?
> I would like to generalize my question a bit.
> 
> Not only can there be two consecutive colons:
> 
> name:: value
> 
> (the second colon indicates the value is base64 text)
> 
> But there can be colons within value, e.g.
> 
> name:> file:///usr/local/directory/photos/fiona.jpg
> 
> (the > symbol indicates the value is a url, and the url may contain a colon)
> 
> So, how to express this in DFDL?
> 
> /Roger
> 
> -----Original Message-----
> From: Costello, Roger L.
> Sent: Monday, March 26, 2018 10:42 AM
> To: [email protected]
> Subject: How to parse a line that is delimited by a colon but sometimes has
> two colons?
> 
> Hello DFDL experts!
> 
> I am using DFDL to parse lines that look like this:
> 
> name: value
> 
> I am using this DFDL code to parse the lines:
> 
> <xs:sequence dfdl:separator=":" dfdl:separatorPosition="infix">
>      <xs:element name="name" type="xs:string" />
>      <xs:element name="value" type="xs:string" /> </xs:sequence>
> 
> If the value is base64 text, then a double colon is used:
> 
> name:: base64-value
> 
> The above DFDL code doesn't seem to work in this situation. What's the
> correct way to write DFDL code which can handle lines with a single colon
> as well as lines with a double colon?
> 
> /Roger
> 
