Hmmm. I feel like I may be missing something here.
It seems to me this would work.
<!-- array of CmdLine, each terminated by NL -->
<xs:element name="CmdLine" maxOccurs="unbounded" dfdl:terminator="%NL;">
  <xs:complexType>
    <xs:sequence dfdl:separator=";" dfdl:separatorPosition="infix">
      <!-- array of parts separated by ";" -->
      <xs:element name="part" maxOccurs="unbounded">
        <xs:complexType>
          <xs:sequence dfdl:separator="%SP;">
            <!-- required token, separated from any following param by a space -->
            <xs:element name="token" type="xs:string"
                        dfdl:lengthKind="delimited"/>
            <!-- array of zero or more params separated by spaces -->
            <xs:element name="param" type="xs:string"
                        minOccurs="0" maxOccurs="unbounded"
                        dfdl:lengthKind="delimited"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
I think that would give you:
<CmdLine><!-- first line -->
  <part>
    <token>AAAA</token>
    <param>BB</param>
    <param>C</param>
  </part>
  <part>
    <token>DDDD</token>
    <param>E</param>
    <param>FFF</param>
  </part>
</CmdLine>
<CmdLine><!-- second line -->
  <part>
    <token>GGGG</token>
  </part>
  <part>
    <token>HHHH:III</token>
  </part>
</CmdLine>
<CmdLine><!-- third line -->
  <part>
    <token>JJJJ</token>
  </part>
</CmdLine>
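If separately named elements like the parm1/parm2 in your example are preferred over an array, one possible variation of the part element is sketched below. It is untested and assumes at most two parameters per command. The names parm1 and parm2 come from your example, and the separatorSuppressionPolicy setting is an assumption about what the format in scope needs so that absent trailing parameters (and their separators) are accepted.

```xml
<!-- Hypothetical variant of "part": optional parm1/parm2 instead of a param array.
     Assumes at most two parameters follow the token. -->
<xs:element name="part" maxOccurs="unbounded">
  <xs:complexType>
    <!-- Assumption: anyEmpty lets missing optional elements omit their separators;
         it may already be the default in the format you include. -->
    <xs:sequence dfdl:separator="%SP;" dfdl:separatorPosition="infix"
                 dfdl:separatorSuppressionPolicy="anyEmpty">
      <!-- required token -->
      <xs:element name="token" type="xs:string"
                  dfdl:lengthKind="delimited"/>
      <!-- optional first and second parameters -->
      <xs:element name="parm1" type="xs:string" minOccurs="0"
                  dfdl:lengthKind="delimited"/>
      <xs:element name="parm2" type="xs:string" minOccurs="0"
                  dfdl:lengthKind="delimited"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

A command with more than two parameters would not fit this variant, which is why the array form is the more general choice.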
On Wed, Mar 6, 2024 at 10:19 AM Larry Barber <[email protected]>
wrote:
> Trying to parse out some data that looks like this:
>
> AAAA BB C;DDDD E FFF
> GGGG;HHHH:III
> JJJJ
>
> I’m able to break out the various elements using this code:
>
> <xs:element name="CmdLine" maxOccurs="unbounded" dfdl:terminator="%NL;">
>   <xs:complexType>
>     <xs:sequence dfdl:separator=";" dfdl:separatorPosition="infix">
>       <xs:element name="Cmd" type="xs:string" maxOccurs="unbounded"
>                   dfdl:lengthKind="delimited"/>
>     </xs:sequence>
>   </xs:complexType>
> </xs:element>
>
> Which gives me output like this:
>
> <CmdLine>
>   <Cmd>AAAA BB C</Cmd>
>   <Cmd>DDDD E FFF</Cmd>
> </CmdLine>
> <CmdLine>
>   <Cmd>GGGG</Cmd>
>   <Cmd>HHHH:III</Cmd>
> </CmdLine>
> <CmdLine>
>   <Cmd>JJJJ</Cmd>
> </CmdLine>
>
> Is there a technique that I could use to parse the Cmd element further so
> that I could get something like this:
>
> <CmdLine>
>   <Cmd>
>     <token>AAAA</token>
>     <parm1>BB</parm1>
>     <parm2>C</parm2>
>   </Cmd>
>   <Cmd>
>     <token>DDDD</token>
>     <parm1>E</parm1>
>     <parm2>FFF</parm2>
>   </Cmd>
>   <Cmd>
>     <token>GGGG</token>
>   </Cmd>
>   …
> </CmdLine>