Steve Lawrence created DAFFODIL-1967:
----------------------------------------

             Summary: Support --stream option for CLI unparse and performance 
subcommands
                 Key: DAFFODIL-1967
                 URL: https://issues.apache.org/jira/browse/DAFFODIL-1967
             Project: Daffodil
          Issue Type: Bug
          Components: CLI
    Affects Versions: 2.2.0
            Reporter: Steve Lawrence
             Fix For: 2.2.0


The --stream option was only implemented for the parse CLI subcommand. Ideally, 
this would work for both unparse and performance subcommands as well. The 
issues with each are:

*Unparse:*

The --stream option for the parse subcommand currently outputs repetitions of 
XML data, e.g.
{code:xml}
<foo>...</foo>
<foo>...</foo>
<foo>...</foo>
{code}

Since there is no root element, this is not a well-formed XML document and the
libraries we use to parse the XML string throw an error. So, in order to support
streaming XML data into unparse, we need to manually split the XML before giving
it to the XML parsing libraries. Parsing the XML ourselves and ignoring the
missing root element is an option, but might be more effort than it is worth.
Another option is to output some sort of delimiter between each XML document and
split the data on that. Extra care needs to be taken to ensure that we do not
split when the XML content itself contains that delimiter.
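
A rough sketch of the delimiter approach, under one assumption that is not a
decided design: the parse side would write a NUL character (U+0000) after each
infoset. XML 1.0 does not permit that character anywhere in a document, so it
cannot appear in well-formed XML content and splitting on it is safe. The class
and method names here are hypothetical and not part of Daffodil.
{code:java}
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

public class InfosetSplitter {
    // Assumption: each infoset is followed by a NUL character, which XML 1.0
    // forbids inside a document, so it cannot collide with XML content.
    static final String DELIMITER = "\u0000";

    // Reads the stream incrementally and returns one string per infoset.
    static List<String> split(InputStream in) {
        List<String> docs = new ArrayList<>();
        try (Scanner scanner = new Scanner(in, "UTF-8")) {
            scanner.useDelimiter(Pattern.compile(Pattern.quote(DELIMITER)));
            while (scanner.hasNext()) {
                String doc = scanner.next().trim();
                // Skip whitespace-only chunks, e.g. a trailing newline.
                if (!doc.isEmpty()) {
                    docs.add(doc);
                }
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        String data = "<foo>1</foo>\u0000\n<foo>2</foo>\u0000\n";
        InputStream in =
            new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8));
        // Each element could now be handed to the XML library on its own.
        for (String doc : split(in)) {
            System.out.println(doc);
        }
    }
}
{code}
A real implementation would probably hand each chunk to unparse as soon as it
is read instead of collecting them all in a list first.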

*Performance:*

The performance subcommand currently works by creating a ByteBuffer and
repeatedly calling parse on that. In order to test streaming performance we
would need to create an InputStream and continuously provide data to it,
perhaps via a PipedInputStream/PipedOutputStream pair or something similar.
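
A minimal sketch of the piped-stream idea, using only java.io. The class and
method names are hypothetical, and the drain method is just a stand-in for
wherever the streaming parse would consume the InputStream. A producer thread
writes the same payload repeatedly while the caller reads from the other end.
{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

public class StreamingFeed {
    // Returns an InputStream that yields the payload the given number of
    // times, written by a background thread.
    static InputStream repeat(byte[] payload, int repetitions)
            throws IOException {
        PipedInputStream in = new PipedInputStream(64 * 1024);
        PipedOutputStream out = new PipedOutputStream(in);
        Thread producer = new Thread(() -> {
            // Closing the output signals end-of-stream to the reader.
            try (PipedOutputStream o = out) {
                for (int i = 0; i < repetitions; i++) {
                    o.write(payload);
                }
            } catch (IOException e) {
                // The reader closed the pipe early; stop producing.
            }
        }, "stream-feed");
        producer.setDaemon(true);
        producer.start();
        return in;
    }

    // Stand-in consumer: counts bytes until end-of-stream.
    static long drain(InputStream in) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "example record\n".getBytes(StandardCharsets.UTF_8);
        long start = System.nanoTime();
        long bytes = drain(repeat(payload, 100_000));
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(bytes + " bytes in " + millis + " ms");
    }
}
{code}
Piped streams add their own synchronization cost between the two threads, so
any timings taken this way would include that overhead.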



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
