[ 
https://issues.apache.org/jira/browse/UIMA-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17016776#comment-17016776
 ] 

Mario Juric commented on UIMA-6128:
-----------------------------------

I don't know how many users are still on UIMA v2. We expect to move to UIMA 
v3 soon, although that move has been delayed several times for various 
reasons, so from a purely selfish perspective having this in both versions 
would be a good idea. More broadly, I think we can expect many v2 components 
to remain in active use for some time, so the feature would also benefit the 
v2 deployments that face the same issues. I therefore think we should support 
both versions. Porting an implementation from one version to the other should 
not add much overhead unless the XMI serialization APIs differ substantially, 
which I admittedly haven't checked yet.

> Allow XMI to be optionally serialized with XML 1.1 instead of only 1.0
> ----------------------------------------------------------------------
>
>                 Key: UIMA-6128
>                 URL: https://issues.apache.org/jira/browse/UIMA-6128
>             Project: UIMA
>          Issue Type: New Feature
>          Components: UIMA
>            Reporter: Mario Juric
>            Assignee: Marshall Schor
>            Priority: Major
>             Fix For: 3.2.0SDK
>
>         Attachments: OddFeatureText.java, SimpleTypeSystem_TS.xml
>
>
> Some Unicode characters cannot be represented in XML 1.0, so serializing a 
> CAS to XMI can require normalization or cleanup first. However, 
> requirements may not allow all such characters to be removed from the CAS. 
> It can also be impossible to do such normalization/cleanup without a full 
> reprocess when converting data already stored as compressed binaries to XMI. 
> Being able to optionally select XML 1.1 instead of the default XML 1.0 would 
> be an easier way for some users to bypass many of those Unicode issues.
> See also discussion on the UIMA mailing list:
> https://lists.apache.org/thread.html/7f8124b7be9ea20ab21dc616243e5661a0b7668a856532031fda71e3@%3Cuser.uima.apache.org%3E
> This feature request proposes introducing an additional SerialFormat, 
> e.g. XMI_1_1, that can be selected as the format parameter in the 
> CasIOUtils.save methods.
>  
>  
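
To illustrate which characters the description above is about, here is a 
minimal sketch that classifies code points against the Char productions of 
the XML 1.0 and XML 1.1 specifications. It is not UIMA code: the class and 
method names (XmlCharCheck, isXml10Char, isXml11Char, needsXml11) are 
hypothetical and only use the JDK.

```java
// Sketch only: hypothetical helper, not part of UIMA.
final class XmlCharCheck {

    // XML 1.0 Char: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // XML 1.1 Char: [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXml11Char(int cp) {
        return (cp >= 0x1 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // True if every code point is allowed in XML 1.1 and at least one
    // of them is not allowed in XML 1.0.
    static boolean needsXml11(String text) {
        boolean outsideXml10 = false;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isXml11Char(cp)) {
                return false; // e.g. NUL or an unpaired surrogate: invalid in both versions
            }
            if (!isXml10Char(cp)) {
                outsideXml10 = true;
            }
            i += Character.charCount(cp);
        }
        return outsideXml10;
    }

    public static void main(String[] args) {
        System.out.println(needsXml11("plain text"));  // false
        System.out.println(needsXml11("form\ffeed"));  // true: U+000C is not an XML 1.0 Char
        System.out.println(needsXml11("nul\0char"));   // false: U+0000 is invalid in XML 1.1 too
    }
}
```

Note that XML 1.1 does not accept everything: NUL stays invalid, and the 
control characters it newly admits must be written as character references 
in the serialized document.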



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
