[ 
https://issues.apache.org/jira/browse/UIMA-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190293#comment-16190293
 ] 

Richard Eckart de Castilho commented on UIMA-5601:
--------------------------------------------------

Peter keeps telling me that DKPro Core should stop using a custom 
subclass of DocumentAnnotation. At some point, that may happen. How soon may 
well depend on the conclusions we reach here. However, right now the 
fact that it uses one helps detect some changes going from v2 to v3. What I 
am trying to say is: an elaborate engineering solution might go a bit too far.

I believe the "normal" way to use a custom document annotation is to replace 
the JAR file that contains the default UIMA DocumentAnnotation with an 
alternative one. However, that is IMHO quite uncomfortable.

Indeed, having multiple subtypes of DocumentAnnotation (or even multiple 
instances of it) in a CAS/view is likely an error. In the long term, it might 
not be a bad idea to make DocumentAnnotation final such that no subclasses are 
allowed - that would make it much easier to handle but it would likely break 
stuff for existing users. 

That said, here is what happens in DKPro Core:

* The DocumentMetaData JCas class has a static "create(JCas)" method which 
checks if there is already a default UIMA DocumentAnnotation and if so, it 
copies the information contained in it, deletes it, and then adds a new 
DocumentMetaData, setting the previously copied information
* All reader components have some logic to create a DKPro Core DocumentMetaData 
annotation
* When DKPro Core uses the CasCopier, the target CAS is initialized with a 
DKPro Core DocumentMetaData before annotations are copied over
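
The first bullet can be illustrated with a minimal sketch. It is a plain-Java model, not the UIMA or DKPro Core API: View, DocAnnotation and DocMetaData are hypothetical stand-ins for a CAS view, the default DocumentAnnotation and DocumentMetaData, and the only information copied here is an assumed "language" field.

```java
import java.util.ArrayList;
import java.util.List;

public class DocumentMetaDataSketch {
    /** Hypothetical stand-in for the default UIMA DocumentAnnotation. */
    static class DocAnnotation {
        String language;
        DocAnnotation(String language) { this.language = language; }
    }

    /** Hypothetical stand-in for the DKPro Core DocumentMetaData subclass. */
    static class DocMetaData extends DocAnnotation {
        DocMetaData(String language) { super(language); }
    }

    /** Hypothetical stand-in for a CAS view: a list of indexed annotations. */
    static class View {
        final List<DocAnnotation> index = new ArrayList<>();
    }

    /** Replaces a default document annotation, if present, with a DocMetaData. */
    static DocMetaData create(View view) {
        // Look for an annotation of exactly the default type (not the subclass).
        DocAnnotation existing = null;
        for (DocAnnotation a : view.index) {
            if (a.getClass() == DocAnnotation.class) {
                existing = a;
                break;
            }
        }
        String language = null;
        if (existing != null) {
            language = existing.language;   // copy the information
            view.index.remove(existing);    // delete the default annotation
        }
        DocMetaData meta = new DocMetaData(language);
        view.index.add(meta);               // add the replacement
        return meta;
    }

    public static void main(String[] args) {
        View view = new View();
        view.index.add(new DocAnnotation("en"));
        DocMetaData meta = create(view);
        System.out.println(view.index.size() + " " + meta.language); // prints "1 en"
    }
}
```

The point of the copy-delete-add sequence is that the view ends up with exactly one document annotation, which is the invariant discussed above.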



> uv3: CasCopier problems with custom subclasses of DocumentAnnotation
> --------------------------------------------------------------------
>
>                 Key: UIMA-5601
>                 URL: https://issues.apache.org/jira/browse/UIMA-5601
>             Project: UIMA
>          Issue Type: Bug
>          Components: Core Java Framework
>    Affects Versions: 3.0.0SDK-beta
>            Reporter: Richard Eckart de Castilho
>
> It seems as if there may be a bug in the way that CasCopier handles the 
> document annotation. 
> Specifically, it seems as if the CasCopier incorrectly handles the case where 
> the target CAS already contains a document annotation. In my case, I do:
> * create the target CAS
> * add a document annotation (DocumentMetaData extends DocumentAnnotation) to 
> the target CAS
> * create the CasCopier with the source and target CAS
> * copy several FSes but *not* the document annotation
> Expected:
> * target CAS contains 1 DocumentMetaData annotation
> Actual:
> * target CAS contains 2 DocumentMetaData annotations
> Also, it seems that `isDocumentAnnotation` may not be able to handle it if a CAS 
> uses a custom subclass of DocumentAnnotation:
> {noformat}
>   private <T extends FeatureStructure> boolean isDocumentAnnotation(T aFS) {
>     if (((TOP)aFS)._getTypeCode() != TypeSystemConstants.docTypeCode) {
>       return false;
>     }
>     if (srcCasDocumentAnnotation == null) {
>       srcCasDocumentAnnotation = srcCasViewImpl.getDocumentAnnotationNoCreate();
>     }
>     return aFS == srcCasDocumentAnnotation;
>   }
> {noformat}
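
The suspicion about the type-code comparison in the quoted method can be illustrated with a plain-Java analogy. This is only a sketch and does not use the UIMA API: DocAnnotation and DocMetaData are hypothetical stand-ins, an exact class comparison plays the role of the equality check against docTypeCode, and instanceof plays the role of a subtype-aware check.

```java
public class TypeCheckSketch {
    /** Hypothetical stand-in for the default document annotation type. */
    static class DocAnnotation {}

    /** Hypothetical stand-in for a custom subclass such as DocumentMetaData. */
    static class DocMetaData extends DocAnnotation {}

    /** Analogous to comparing the type code for equality: subtypes do not match. */
    static boolean exactTypeMatch(Object fs) {
        return fs.getClass() == DocAnnotation.class;
    }

    /** Subtype-aware alternative: the base type and all of its subtypes match. */
    static boolean subtypeMatch(Object fs) {
        return fs instanceof DocAnnotation;
    }

    public static void main(String[] args) {
        Object custom = new DocMetaData();
        System.out.println(exactTypeMatch(custom)); // prints "false"
        System.out.println(subtypeMatch(custom));   // prints "true"
    }
}
```

If the real check behaves like exactTypeMatch, a custom subclass instance would never be recognized as the document annotation, which would be consistent with the duplicate seen in the target CAS.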



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
