[ https://issues.apache.org/jira/browse/UIMA-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16161351#comment-16161351 ]
Marshall Schor commented on UIMA-5554:
--------------------------------------

Thanks for the great questions, Richard :-)

I think most of these questions revolve around JCas and non-JCas usage in UIMA.

UIMA originally had only non-JCas usage (JCas was added later). The non-JCas model of access to the CAS is useful today when you're writing "generic" annotators that need to work with multiple type systems, where you don't know the type(s) and feature(s) ahead of time. It's somewhat like Java's "reflection" approach: you make some calls in the typeSystemInit method to get some special values into local fields, which you then use when accessing UIMA types and features.

The JCas, in contrast, adds classes whose names are equivalent to the UIMA types, with method names corresponding to the features, and you use these in your code. But, of course, using these means your code is specific to those types and features.

The bridge that exists between these is somewhat flexible. Even when JCas is being used, the non-JCas access capabilities continue to exist alongside it. So, in a particular pipeline / CAS, there may be UIMA types for which no JCas classes exist, or UIMA types that have additional "features" for which no JCas getters/setters exist; these can be accessed (if needed) using the non-JCas approach.

A use case which motivates this scenario is the type "merging" that UIMA does when given type system descriptors coming from annotator descriptors. That merging might "add" additional features to a type (say, needed by a particular annotator you're including in the pipeline), or even add additional types. That annotator might be a non-JCas annotator that doesn't define any JCas classes, so there might not be any getter/setter for these additional fields or types.

That is the state of things in both V2 and V3. The implementation details in V3 differ, because the actual Feature Structure instances are represented as instances of some JCas class.
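To make the reflection analogy above a bit more concrete, here is a minimal plain-Java sketch. It is only an analogy and does not use any UIMA API: `Token`, `GenericReader`, and the "lemma" feature are hypothetical names invented for illustration. The "generic" reader resolves an accessor by name once, up front (loosely comparable to what typeSystemInit does), and then reuses that handle, while the type-specific style simply calls the getter.

```java
import java.lang.reflect.Method;

public class AccessStyles {

    // Hypothetical stand-in for a type-specific class with a getter/setter
    // per feature (analogous to the JCas style; not a real JCas class).
    public static class Token {
        private String lemma;
        public String getLemma() { return lemma; }
        public void setLemma(String value) { lemma = value; }
    }

    // Generic style: the accessor is looked up by name once and cached,
    // so the reading code needs no compile-time knowledge of the type.
    public static class GenericReader {
        private final Method getter;

        public GenericReader(Class<?> type, String featureName) throws NoSuchMethodException {
            String getterName = "get"
                    + Character.toUpperCase(featureName.charAt(0))
                    + featureName.substring(1);
            this.getter = type.getMethod(getterName);
        }

        public Object read(Object instance) throws ReflectiveOperationException {
            return getter.invoke(instance);
        }
    }

    public static void main(String[] args) throws Exception {
        Token token = new Token();
        token.setLemma("run");

        // Type-specific access: short, but tied to Token at compile time.
        System.out.println(token.getLemma());

        // Generic access: works for any class that has a matching getter.
        GenericReader reader = new GenericReader(Token.class, "lemma");
        System.out.println(reader.read(token));
    }
}
```

The trade-off is the one described above: the generic reader works with types it has never seen, at the price of a lookup step and untyped results.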
In the case where the style is non-JCas, there still are the "built-in" JCas classes, like TOP and Annotation. When you define a UIMA type, say Foo, it always inherits from some supertype (TOP, if none other). If no JCas definition exists (in V3) for an instance, its most specific supertype JCas class is used to instantiate it.

With that background, let's address the questions, maybe in reverse order....

1) Why are the types not simply "installed" when the JCas class is loaded and initialized? (Installed means the corresponding UIMA types are installed.)

JCas classes are normally loaded and initialized as part of type system commit, once the type system itself has been committed. The exception, of course, is that any user code running before type system commit might make a reference to a JCas type; the first such reference would cause Java to load and initialize the JCas class. Also, a user might write code like Class.forName(...) to force loading of a class. V3 reports errors if these other kinds of loading/initializing are done before type system commit.

The reason that UIMA types are not "installed" when a JCas type is loaded in the two exception cases above is that the details of the UIMA types are not available at that time (because the type system hasn't yet been gathered from all the annotators in the pipeline, merged, and committed). The UIMA types could be supersets of what this particular JCas implementation defines (see, for example, the use case above where some non-JCas annotator used additional fields).

2) Is there a new concept of installing/committing types in V3?

In both V2 and V3, type systems need to be assembled from annotator descriptors in a pipeline, merged, and committed before being used. UIMA uses this concept to allow efficient access. This is, admittedly, a trade-off versus a more dynamic approach (looking up more information on each access), but this trade-off was made early in the design of UIMA.
In V3, an additional "ordering" requirement is present: the UIMA type system must be assembled, merged, and committed before any JCas classes are *initialized*. This, again, is an efficiency trade-off; it enables feature access to be compiled into very efficient code that works well with the cache behavior of modern CPUs. New error messages were added to detect when this constraint is being violated.

3) In V2 it was possible to have any number of type systems and different CASes initialized with different type systems - and if different classloaders were used, even with different JCas classes.

This is also the case in V3, as long as the merged/committed type system is available before any JCas classes are installed. If you are using JCas with different class loaders, they can have different associated type systems. This was done to support, for example, running "servlets" which each have their own isolating class loader, each servlet perhaps running a different UIMA pipeline.

4) It is possible to re-initialize a CAS with a new type system even after one has already been committed.

This is also true in V3. Users of this are typically using the non-JCas APIs, because the reinitialization could install any type system. The V3 implementation ensures that the built-in types and their JCas implementations are always available and have the same feature offsets. So, it is expected that most use cases doing this kind of thing will continue to work.

5) The type system for a CAS is committed per CAS/classloader.

This should still be true.

I hope this explains this a bit better.
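As a footnote to the distinction between loading and initializing a class: the error message in the quoted issue suggests changing the user code "to not do initialize". In plain Java, the three-argument form of Class.forName can return a Class object without running the static initializer. The sketch below shows only that standard Java behavior; `Guarded` is a hypothetical stand-in for a class whose initializer must not run early, and whether this is enough for a particular JCas use case in V3 is an assumption, not something the discussion above confirms.

```java
public class LoadWithoutInit {

    // Flipped to true by Guarded's static initializer, so we can observe
    // exactly when initialization happens.
    public static boolean guardedInitialized = false;

    // Hypothetical stand-in for a class with a sensitive static initializer.
    static class Guarded {
        static {
            LoadWithoutInit.guardedInitialized = true;
        }
    }

    // Loads the named class but asks Java NOT to initialize it
    // (second argument = false), so its static initializer does not run.
    public static Class<?> loadOnly(String className) throws ClassNotFoundException {
        ClassLoader loader = LoadWithoutInit.class.getClassLoader();
        return Class.forName(className, false, loader);
    }

    public static void main(String[] args) throws Exception {
        String name = LoadWithoutInit.class.getName() + "$Guarded";

        Class<?> loaded = loadOnly(name);
        System.out.println(loaded.getSimpleName()
                + " loaded, initialized=" + guardedInitialized);

        // The one-argument form both loads and initializes the class.
        Class.forName(name);
        System.out.println("after Class.forName(name), initialized=" + guardedInitialized);
    }
}
```

Note that the class still gets initialized on first real use (creating an instance, calling a static method, and so on), so this only defers initialization; it does not remove the need for the type system to be committed first.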
> Strange exception when trying to get JCas FS class through reflection
> ---------------------------------------------------------------------
>
>                 Key: UIMA-5554
>                 URL: https://issues.apache.org/jira/browse/UIMA-5554
>             Project: UIMA
>          Issue Type: Bug
>          Components: Core Java Framework
>    Affects Versions: 3.0.0SDK-beta
>            Reporter: Richard Eckart de Castilho
>
> I am trying to get a class object for a JCas FS type using reflection:
> {noformat}
> Class.forName(typeName);
> {noformat}
> However, it produces this strange error.
> {noformat}
> java.lang.ExceptionInInitializerError
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:264)
>     ...
> Caused by: org.apache.uima.cas.CASRuntimeException: A JCas class field "sofa"
> is being initialized by non-framework (user) code before Type System Commit
> for a type system with a corresponding type. Either change the user load code
> to not do initialize, or to defer it until after the type system commit.
>     at org.apache.uima.cas.impl.TypeSystemImpl.getAdjustedFeatureOffset(TypeSystemImpl.java:2575)
>     at org.apache.uima.jcas.cas.AnnotationBase.<clinit>(AnnotationBase.java:71)
>     ... 27 more
> {noformat}
> Is it considered harmful to try getting a class object for a JCas FS class?

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)