Hi everybody,

Sorry if this is a slightly complicated mail, but I am considering adding
stricter checking of API misuse to UIMA and would appreciate it if anybody
could take a moment to follow the argument and provide any kind of
feedback.

Consider the case where you have two CAS instances (cas1 and cas2) with the
same type system.

In this case, you can fetch a Type object from cas1 and use it to create
an annotation in cas2:

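// The Type object is fetched from cas1 ...
Type t = cas1.getTypeSystem().getType("my.Type");
// ... but the annotation is created in and indexed into cas2.
// createAnnotation also takes begin/end offsets; 0, 0 is just a placeholder.
AnnotationFS fs = cas2.createAnnotation(t, 0, 0);
cas2.addFsToIndexes(fs);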


If the type systems in cas1 and cas2 are different, then this can of course
be problematic.

UIMAv3 tries to ensure that two CASes which share a semantically equivalent
type system actually use the same TypeSystemImpl instance. This feature of
UIMAv3 is called "type system consolidation". So normally, if the type systems
are equivalent (even though the CASes may have been created using different
XML descriptor files with the same content), we see that the following holds:

cas1.getTypeSystem().getType("my.Type") == cas2.getTypeSystem().getType("my.Type")
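To make the idea of consolidation concrete, here is a minimal sketch of how
equivalent definitions can be mapped to one shared instance. It is only an
illustration, not UIMA's actual implementation: ConsolidationSketch,
SketchTypeSystem and consolidated() are hypothetical names, and "semantically
equivalent" is reduced here to "same set of type names".

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ConsolidationSketch {
    /** Hypothetical stand-in for a type system; this is NOT UIMA's TypeSystemImpl. */
    static final class SketchTypeSystem {
        final Set<String> typeNames;

        SketchTypeSystem(Set<String> typeNames) {
            this.typeNames = typeNames;
        }
    }

    // One canonical instance per distinct definition (assumption: a definition
    // is fully described by its set of type names).
    private static final Map<Set<String>, SketchTypeSystem> CANONICAL =
            new ConcurrentHashMap<>();

    /** Returns the shared instance for an equivalent definition, creating it once. */
    static SketchTypeSystem consolidated(Set<String> typeNames) {
        Set<String> key = Set.copyOf(typeNames); // immutable copy used as map key
        // computeIfAbsent is atomic on ConcurrentHashMap, so concurrent callers
        // with equal keys all receive the same instance.
        return CANONICAL.computeIfAbsent(key, SketchTypeSystem::new);
    }

    public static void main(String[] args) {
        SketchTypeSystem a = consolidated(Set.of("my.Type"));
        SketchTypeSystem b = consolidated(Set.of("my.Type"));
        System.out.println(a == b); // prints true
    }
}
```

With consolidation in place, identity comparison (==) is enough to tell
whether two type systems are the same.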


In principle, even without type system consolidation (if the type instances
are not the same), the code usually works if the type definitions are
semantically equivalent. But the user may not get notified about the problem
or may just experience weird behavior.

Incidentally, when an annotation is *removed* from the CAS, UIMAv3 actually
verifies whether the type of the annotation being removed matches the CAS type
system, and if this is not the case, it throws an exception.

UIMA 3.1.1 has a bug where type system consolidation may fail in a
multi-threaded environment. This can later cause an exception to be thrown
when an affected annotation is *removed* again from the CAS (because that is
currently the only place where this is checked).

A fix for that multi-threading bug is en route. 

Now, I would like to extend the check, which is currently made in only 1 of
the 2 possible cases of annotation removal, to:

- both removal cases
- the case of adding an annotation to the CAS
- the case of creating an annotation in the CAS
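
The extended checks listed above can be sketched as follows. This is only a
toy model under stated assumptions, not the UIMA code: all classes
(SketchCas, SketchType, SketchTypeSystem, SketchAnnotation) are hypothetical
stand-ins, and the check is a plain identity comparison between the type's
owning type system and the type system of the CAS.

```java
import java.util.ArrayList;
import java.util.List;

class OwnershipCheckSketch {
    /** Hypothetical stand-in for a type system. */
    static final class SketchTypeSystem { }

    /** Hypothetical type that remembers which type system it belongs to. */
    static final class SketchType {
        final String name;
        final SketchTypeSystem owner;

        SketchType(String name, SketchTypeSystem owner) {
            this.name = name;
            this.owner = owner;
        }
    }

    /** Hypothetical annotation carrying only its type. */
    static final class SketchAnnotation {
        final SketchType type;

        SketchAnnotation(SketchType type) {
            this.type = type;
        }
    }

    /** Hypothetical CAS that runs the same check on create, add and remove. */
    static final class SketchCas {
        private final SketchTypeSystem typeSystem;
        private final List<SketchAnnotation> index = new ArrayList<>();

        SketchCas(SketchTypeSystem typeSystem) {
            this.typeSystem = typeSystem;
        }

        SketchAnnotation createAnnotation(SketchType type) {
            checkOwnership(type);
            return new SketchAnnotation(type);
        }

        void addFsToIndexes(SketchAnnotation fs) {
            checkOwnership(fs.type);
            index.add(fs);
        }

        void removeFsFromIndexes(SketchAnnotation fs) {
            checkOwnership(fs.type);
            index.remove(fs);
        }

        int indexSize() {
            return index.size();
        }

        // Identity comparison: the type must come from this CAS's type system.
        private void checkOwnership(SketchType type) {
            if (type.owner != typeSystem) {
                throw new IllegalArgumentException(
                        "Type " + type.name + " belongs to a different type system");
            }
        }
    }

    public static void main(String[] args) {
        SketchTypeSystem ts1 = new SketchTypeSystem();
        SketchTypeSystem ts2 = new SketchTypeSystem();
        SketchCas cas2 = new SketchCas(ts2);
        try {
            // Type taken from ts1 but used with cas2: rejected at creation time.
            cas2.createAnnotation(new SketchType("my.Type", ts1));
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

The point of checking at creation and on adding is that the mistake surfaces
where it is made, rather than only later on removal.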

In principle, that could break user code that is currently working by chance.
For that reason, I would add a system property by which the check can be
DISABLED if necessary until the problematic user code is fixed.

An alternative could be to not check by default and to provide a system
property to ENABLE the check, at the risk that user code exhibits odd behavior
without the user being quickly informed about the problem.

... actually, thinking about it after writing this longish mail, I'll make it
so that the checks are present, but instead of failing hard by default,
they'll log warnings. That would alert users to the issue without breaking
their code.
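
As a rough illustration of what "warn by default, configurable via a system
property" could look like, here is a minimal sketch. The property name
sketch.typecheck.mode, the Mode values and the class CheckModeSketch are all
made up for this example; the real property name and behavior are not decided
here.

```java
import java.util.Locale;
import java.util.logging.Logger;

class CheckModeSketch {
    enum Mode { OFF, WARN, FAIL }

    // Hypothetical property name, used only in this sketch.
    static final String MODE_PROPERTY = "sketch.typecheck.mode";

    private static final Logger LOG = Logger.getLogger(CheckModeSketch.class.getName());

    /** Reads the mode from the system property; missing or unknown values mean WARN. */
    static Mode currentMode() {
        String raw = System.getProperty(MODE_PROPERTY, "warn");
        try {
            return Mode.valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return Mode.WARN;
        }
    }

    /**
     * Returns true if the type's owner is the CAS's type system (identity check).
     * On a mismatch: throws in FAIL mode, logs in WARN mode, stays silent in OFF
     * mode; in the latter two cases it returns false.
     */
    static boolean check(Object typeOwner, Object casTypeSystem) {
        if (typeOwner == casTypeSystem) {
            return true;
        }
        String message = "Type does not belong to the type system of this CAS";
        switch (currentMode()) {
            case FAIL:
                throw new IllegalStateException(message);
            case WARN:
                LOG.warning(message);
                break;
            case OFF:
                break;
        }
        return false;
    }

    public static void main(String[] args) {
        // No property set: the mismatch is only logged and execution continues.
        boolean ok = check(new Object(), new Object());
        System.out.println("check passed: " + ok);
    }
}
```

Running with -Dsketch.typecheck.mode=fail would turn the warning into an
exception in this sketch, and "off" would silence it.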

If you have any additional feedback, please let me know :)

Cheers,

-- Richard
