[ 
https://issues.apache.org/jira/browse/UIMA-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011668#comment-13011668
 ] 

Steven Bethard commented on UIMA-2101:
--------------------------------------

Perhaps I should clarify my whole use case here. I'm taking the XML output by 
CasToInlineXml and transforming the element names and attributes so that they 
match the ISO-TimeML XML format. I'm not doing this for fun - I'm doing it 
because I have to give someone files in ISO-TimeML and they will only give me 
ISO-TimeML files back.

So, no, I don't have control over which elements go where or what they 
contain, and no, I can't live in XMI land. I have to be able to communicate 
with the world outside of UIMA. ;-)

All I need is the ability to turn off the extra whitespace to accomplish this.
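
To illustrate what "no extra whitespace" means here, the following is a rough 
sketch that uses the JDK's own SAX serializer (a TransformerHandler) as a 
stand-in for UIMA's XMLSerializer. The class name UnindentedInlineXml, the 
write helper, and the Annotation element are made up for this example; it is 
not the CasToInlineXml code, only an assumption about the kind of output I am 
after.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical stand-in: JDK serializer, not UIMA's XMLSerializer.
class UnindentedInlineXml {

    // Wraps the whole text in one annotation element and serializes it
    // with indentation switched off.
    static String write(String text) {
        try {
            SAXTransformerFactory factory =
                (SAXTransformerFactory) SAXTransformerFactory.newInstance();
            TransformerHandler handler = factory.newTransformerHandler();
            // Set output properties before attaching the result.
            handler.getTransformer().setOutputProperty(OutputKeys.INDENT, "no");
            handler.getTransformer()
                   .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            handler.setResult(new StreamResult(out));

            // begin/end are character offsets into the document text.
            AttributesImpl span = new AttributesImpl();
            span.addAttribute("", "begin", "begin", "CDATA", "0");
            span.addAttribute("", "end", "end", "CDATA",
                              String.valueOf(text.length()));

            handler.startDocument();
            handler.startElement("", "Document", "Document", new AttributesImpl());
            handler.startElement("", "Annotation", "Annotation", span);
            handler.characters(text.toCharArray(), 0, text.length());
            handler.endElement("", "Annotation", "Annotation");
            handler.endElement("", "Document", "Document");
            handler.endDocument();
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With indentation off, the only character data between the Document tags is 
the document text itself, which is what a downstream element-renaming step 
needs.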

> CasToInlineXml adds whitespace
> ------------------------------
>
>                 Key: UIMA-2101
>                 URL: https://issues.apache.org/jira/browse/UIMA-2101
>             Project: UIMA
>          Issue Type: Bug
>    Affects Versions: 2.3.1SDK
>            Reporter: Steven Bethard
>
> CasToInlineXml adds indentation between adjacent XML elements. E.g. for a 
> single character document with a single annotation covering that one 
> character, it will write:
> {noformat}
> <?xml version="1.0" encoding="UTF-8"?>
> <Document>
>     <uima.tcas.DocumentAnnotation sofa="Sofa" begin="0" end="1" language="x-unspecified">
>         <uima.tcas.Annotation sofa="Sofa" begin="0" end="1"> </uima.tcas.Annotation>
>     </uima.tcas.DocumentAnnotation>
> </Document>
> {noformat}
> I think it should instead write everything in a single line, that is:
> {noformat}
> <?xml version="1.0" encoding="UTF-8"?>
> <Document><uima.tcas.DocumentAnnotation sofa="Sofa" begin="0" end="1" language="x-unspecified"><uima.tcas.Annotation sofa="Sofa" begin="0" end="1"> </uima.tcas.Annotation></uima.tcas.DocumentAnnotation></Document>
> {noformat}
> I believe this could be fixed by replacing the line:
> {noformat}
> XMLSerializer sax2xml = new XMLSerializer(byteArrayOutputStream);
> {noformat}
> with the line:
> {noformat}
> XMLSerializer sax2xml = new XMLSerializer(byteArrayOutputStream, false);
> {noformat}
> I think it's a bug that CasToInlineXml changes the character offsets, but 
> I would also be happy if there were an alternate constructor or a method on 
> CasToInlineXml that allowed disabling the formatting.
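
The offset concern in the issue description can be checked mechanically: if 
the inline XML preserves offsets, deleting every tag should give back exactly 
the original document text. The sketch below is one way to test that. The 
class and method names are hypothetical, and it assumes the XML declaration 
line has already been removed and that no attribute value contains a literal 
">" character.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of UIMA.
class InlineXmlOffsets {

    // Matches any single tag; assumes no '>' inside attribute values.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Returns the character data left after deleting every tag.
    // Entity references such as &amp; are left undecoded.
    static String stripTags(String inlineXml) {
        return TAG.matcher(inlineXml).replaceAll("");
    }

    // True when the text between the tags is exactly the document text,
    // so begin/end offsets still point at the same characters.
    static boolean preservesOffsets(String inlineXml, String documentText) {
        return stripTags(inlineXml).equals(documentText);
    }
}
```

For a one-character document "x", the single-line form passes this check, 
while an indented form fails it because the newlines and indentation become 
part of the text.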

