[ 
https://issues.apache.org/jira/browse/UIMA-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16161351#comment-16161351
 ] 

Marshall Schor commented on UIMA-5554:
--------------------------------------

Thanks for the great questions, Richard :-)

I think most of these questions revolve around JCas and non-JCas usage in UIMA. 
 
UIMA originally had only non-JCas usage (JCas was added later).  The non-JCas 
model of access to the CAS is useful today when you're writing "generic" 
annotators that need to work with multiple type systems, where you don't know 
the type(s) and feature(s) ahead of time.

It's somewhat like Java's "reflection" approach - you make some calls in the 
typeSystemInit method to look up Type and Feature objects and save them in 
fields, which you then use when accessing UIMA types and features.  
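
To make the reflection analogy concrete, here is a minimal plain-Java sketch. 
It is not UIMA code: the class ReflectiveAccessSketch, the Token class and its 
"pos" field are all made up for illustration.  It only shows the general 
pattern of resolving a handle by name once, and then reusing that handle for 
every access.

```java
import java.lang.reflect.Field;

final class ReflectiveAccessSketch {
    // Hypothetical stand-in for a type the generic code does not know at compile time.
    static final class Token {
        public String pos = "NN";
    }

    // Handle resolved once and cached (the role the fields filled in typeSystemInit play).
    private Field feature;

    // One-time lookup by name; analogous in spirit to looking up a feature by name.
    void init(Class<?> type, String featureName) {
        try {
            feature = type.getField(featureName);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("no such feature: " + featureName, e);
        }
    }

    // Per-instance access reuses the cached handle instead of looking it up again.
    Object read(Object instance) {
        try {
            return feature.get(instance);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ReflectiveAccessSketch sketch = new ReflectiveAccessSketch();
        sketch.init(Token.class, "pos");
        System.out.println(sketch.read(new Token())); // prints NN
    }
}
```

The generic code never mentions Token.pos directly, which is what lets the 
same code work with types it did not know about when it was compiled.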

The JCas, in contrast, adds classes having class names equivalent to the UIMA 
types, and method names corresponding to the features, and you use these in 
your code.  But, of course, using these means your code is specific to those 
types and features.

The bridge that exists between these is somewhat flexible.  Even when JCas is 
being used, the non-JCas access capabilities continue to exist alongside it.  
So, in a particular pipeline / CAS, there may be UIMA types for which no JCas 
classes exist, or UIMA types that have additional "features" for which no JCas 
getters/setters exist; these can be accessed (if needed) using the non-JCas 
approach.

A use case which motivates this scenario is the type "merging" that UIMA does 
when given type system descriptors coming from annotator descriptors - that 
merging might "add" additional features to a type (say, needed by a particular 
annotator you're including in the pipeline), or even add additional Types.   
That Annotator might be a non-JCas annotator that doesn't define any JCas 
classes.  So there might not be any getter/setter for these additional features 
or types.
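
As a rough illustration of how merging can "add" features and types, here is 
a toy sketch in plain Java.  It is an assumption-laden model, not UIMA's actual 
merge code: the class TypeMergeSketch is made up, each descriptor is reduced 
to a map from type name to feature names, and real concerns such as checking 
for conflicting feature ranges are ignored.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TypeMergeSketch {
    // Merge "type name -> feature names" maps by taking the union per type.
    @SafeVarargs
    static Map<String, Set<String>> merge(Map<String, Set<String>>... descriptors) {
        Map<String, Set<String>> merged = new LinkedHashMap<>();
        for (Map<String, Set<String>> descriptor : descriptors) {
            for (Map.Entry<String, Set<String>> e : descriptor.entrySet()) {
                // A type seen for the first time is added; otherwise its features are extended.
                merged.computeIfAbsent(e.getKey(), k -> new TreeSet<>()).addAll(e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical descriptors: the second one adds a feature to Foo and a new type Bar.
        Map<String, Set<String>> jcasAnnotator = Map.of("Foo", Set.of("begin", "end"));
        Map<String, Set<String>> genericAnnotator =
                Map.of("Foo", Set.of("score"), "Bar", Set.of("label"));
        System.out.println(merge(jcasAnnotator, genericAnnotator));
    }
}
```

In this toy example, a JCas class written against the first descriptor would 
have no getter for "score" and no class for Bar, which is the situation 
described above.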

That is the state of things in both V2 and V3.

The implementation details in V3 differ, because the actual Feature Structure 
instances are represented as instances of some JCas class.  In the case where 
the style is non-JCas, there still are the "built-in" JCas classes, like TOP 
and Annotation. When you define a UIMA type, say Foo, it always inherits from 
some supertype (TOP, if none other).  If no JCas class exists (in v3) for an 
instance's type, the JCas class of its most specific supertype that has one is 
used to instantiate it.
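
The supertype fallback just described can be sketched roughly as follows.  
This is a toy model in plain Java, not the V3 implementation: the class 
SupertypeFallbackSketch and its two maps are made up, and type names are 
shortened (no namespaces).

```java
import java.util.HashMap;
import java.util.Map;

final class SupertypeFallbackSketch {
    // Hypothetical hierarchy: type name -> supertype name (TOP has no entry).
    private final Map<String, String> supertypeOf = new HashMap<>();
    // Hypothetical registry: type name -> name of the Java class available for it.
    private final Map<String, String> jcasClassOf = new HashMap<>();

    SupertypeFallbackSketch() {
        // Built-in types always have a class in this model.
        supertypeOf.put("AnnotationBase", "TOP");
        supertypeOf.put("Annotation", "AnnotationBase");
        jcasClassOf.put("TOP", "TOP");
        jcasClassOf.put("AnnotationBase", "AnnotationBase");
        jcasClassOf.put("Annotation", "Annotation");
    }

    // Register a user-defined type that has no Java class of its own.
    void defineType(String name, String supertype) {
        supertypeOf.put(name, supertype);
    }

    // Walk up the hierarchy until a type with an available class is found.
    String classFor(String typeName) {
        String t = typeName;
        while (t != null && !jcasClassOf.containsKey(t)) {
            t = supertypeOf.get(t);
        }
        if (t == null) {
            throw new IllegalArgumentException("unknown type: " + typeName);
        }
        return jcasClassOf.get(t);
    }

    public static void main(String[] args) {
        SupertypeFallbackSketch ts = new SupertypeFallbackSketch();
        ts.defineType("Foo", "Annotation");
        System.out.println(ts.classFor("Foo")); // prints Annotation
    }
}
```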

With that background, let's address the questions, maybe in reverse order....

1) Why are the types not simply "installed" when the JCas class is loaded and 
initialized?  (Installed means the corresponding UIMA types are installed).  

JCas classes are normally loaded and initialized as part of the type system 
commit step, once the type system has been committed.
The exception, of course, is that any user code running before type system 
commit might make a reference to a JCas type; the first such reference would 
cause Java to load and initialize the JCas class.  Also, a user might write 
code like Class.forName(...) to force loading of a class.  V3 reports errors if 
these other kinds of loading/initializing are done before type system commit.

The reason that UIMA types are not "installed" when a JCas type is loaded in 
the two exception cases above is that the details of the UIMA types are not 
available at that time (because the type system hasn't yet been gathered from 
all the annotators in the pipeline, merged, and committed). The UIMA types 
could be supersets of what this particular JCas implementation defines (see, 
for example, the use case above where some non-JCas Annotator used additional 
features).  
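
The load-versus-initialize distinction behind this answer can be shown in 
plain Java.  The sketch below is not UIMA code: InitOrderSketch, its 
"committed" flag, and the EarlyType / LateType classes are made up to model a 
class whose static initializer needs information that only exists after a 
commit step.  The Java facts it relies on are that Class.forName(String) 
initializes the class, that a failing static initializer surfaces as 
ExceptionInInitializerError, and that the three-argument Class.forName with 
initialize=false returns the Class object without running static initializers. 
Whether deferring initialization this way is enough for a given UIMA V3 use 
case is an assumption I have not verified here.

```java
final class InitOrderSketch {
    // Hypothetical stand-in for "the type system has been committed".
    static volatile boolean committed = false;

    // Models a feature offset that can only be computed once the type system is final.
    static int offsetFor(String typeName) {
        if (!committed) {
            throw new IllegalStateException(typeName + " initialized before commit");
        }
        return 0;
    }

    // Two identical toy classes, so one can be initialized too early and one late.
    static final class EarlyType { static final int OFFSET = offsetFor("EarlyType"); }
    static final class LateType  { static final int OFFSET = offsetFor("LateType"); }

    private static String binaryName(String nested) {
        return InitOrderSketch.class.getName() + "$" + nested;
    }

    // Class.forName(String) loads AND initializes, so the static initializer runs here.
    static boolean loadAndInit(String nested) {
        try {
            Class.forName(binaryName(nested));
            return true;
        } catch (LinkageError e) {
            // ExceptionInInitializerError on the first failed attempt.
            return false;
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // initialize=false: the Class object is returned, static initializers do not run.
    static Class<?> loadOnly(String nested) {
        try {
            return Class.forName(binaryName(nested), false,
                    InitOrderSketch.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(loadAndInit("EarlyType"));             // false: too early
        System.out.println(loadOnly("LateType").getSimpleName()); // LateType: not initialized
        committed = true;
        System.out.println(loadAndInit("LateType"));              // true: after commit
    }
}
```

Note that in Java a class whose static initializer has failed stays unusable 
in that class loader, which is one reason the ordering matters.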

2) Is there a new concept of installing/committing types in V3?  

In both v2 and v3, type systems need to be assembled from annotator descriptors 
in a pipeline, merged, and committed, before being used.  UIMA relies on this 
to make access efficient.  This is, admittedly, a trade-off versus a more 
dynamic approach (looking up more information on each access), but this 
trade-off was made early in the design of UIMA.

In V3, an additional "ordering" requirement is present, requiring that the UIMA 
type system be assembled, merged, and committed, before any JCas classes are 
*initialized*.  This, again, is an efficiency trade-off, and enables feature 
access to be compiled into very efficient code that works well with the cache 
behavior of modern CPUs.  New error messages were added to detect when this 
constraint is being violated.

3) In v2 it was possible to have any number of type systems and different CASes 
initialized with different type systems - and if different classloaders were 
used, even with different JCas classes.  

This is also the case in V3, as long as the merged/committed type system is 
available before any JCas classes are installed.  If you are using JCas with 
different class loaders, they can have different associated type systems.    
This was done to support, for example, running "servlets" which each have their 
own isolating class loaders, each servlet running perhaps a different UIMA 
pipeline.

4) It is possible to reinitialize a CAS with a new type system even after one 
has already been committed.  

This is also true in V3.  Users of this typically are using the non-JCas APIs, 
because the reinitialization could install any type system.  The V3 
implementation ensures that the built-in types and their JCas implementations 
are always available, and have the same feature offsets.  So, it is expected 
that most use cases doing this kind of thing will continue to work.

5) The type system for a CAS is committed, per CAS/classloader.  

This should still be true.  

I hope this explains things a bit better.  

> Strange exception when trying to get JCas FS class through reflection
> ---------------------------------------------------------------------
>
>                 Key: UIMA-5554
>                 URL: https://issues.apache.org/jira/browse/UIMA-5554
>             Project: UIMA
>          Issue Type: Bug
>          Components: Core Java Framework
>    Affects Versions: 3.0.0SDK-beta
>            Reporter: Richard Eckart de Castilho
>
> I am trying to get a class object for a JCas FS type using reflection:
> {noformat}
> Class.forName(typeName);
> {noformat}
> However, it produces this strange error.
> {noformat}
> java.lang.ExceptionInInitializerError
>       at java.lang.Class.forName0(Native Method)
>       at java.lang.Class.forName(Class.java:264)
> ...
> Caused by: org.apache.uima.cas.CASRuntimeException: A JCas class field "sofa" 
> is being initialized by non-framework (user) code before Type System Commit 
> for a type system with a corresponding type. Either change the user load code 
> to not do initialize, or to defer it until after the type system commit.
>       at 
> org.apache.uima.cas.impl.TypeSystemImpl.getAdjustedFeatureOffset(TypeSystemImpl.java:2575)
>       at 
> org.apache.uima.jcas.cas.AnnotationBase.<clinit>(AnnotationBase.java:71)
>       ... 27 more
> {noformat}
> Is it considered harmful to try getting a class object for a JCas FS class?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
