On 08.01.2018, at 16:16, Marshall Schor <m...@schor.com> wrote:
>
> After a lot of thought, here's a proposal, along the lines Richard suggests:
>
> The basic idea is to have the JCas classes, if they exist for some type,
> augment that type with features defined only in the JCas class.
>
> This augmentation would be done at type system commit time, and would really
> modify the type system being committed to have the extra features. Because
> the type system would be modified to include these extra features, the
> Feature Structures made with these "augmented" types would be larger
> (because they would have slots for these features). This insures that
> subtypes' features won't overlap / collide with the expanded features.
>
> I'll work out the details, and see if I can make this change.
After some thought, I believe the problem with the availability and ordering of features can be sidestepped if we consider the JCas classes as a canonical source for type system definitions.

JCas classes represent a pretty strong and rigid contract on the type system, and there can only be one set of them available through the classloader at any given time. XML TSDs, on the other hand, are comparatively flexible and a dime a dozen. Arbitrary numbers of them can be merged and used to initialize a CAS.

So my suggestion would be: when using the JCas API, JCas classes are treated as the canonical source for the type system definition. They define which types exist, which parent types they have, and what the order of the features is. If a user provides additional TSDs when initializing a CAS, these are merged on top of the definitions sourced from the JCas classes.

In this way, features defined in JCas classes can never be missing and they always have a defined order, irrespective of the presence of any other TSDs. If any additional features are defined in TSDs, they need to be accessed through the CAS API anyway. I believe there would also be no issues with subtypes in this "JCas first" scenario. This approach would also avoid errors when accessing features that are defined in a JCas class but not in an XML TSD, since the features are defined via their presence in the JCas class.

A potential downside is that users who initialize a CAS with a small XML TSD but have rich JCas classes on the classpath might end up with more memory usage than they asked for - I assume that would rarely happen. This could be mitigated by only initializing JCas classes if their types are actually defined in the user-provided TSD at initialization time.
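To make the "JCas first" merge order concrete, here is a rough sketch in plain Java. It does not use the UIMA API at all: a type system is modelled as a map from type name to an ordered list of feature names, and the class name `JCasFirstMerge` and its `merge` method are made up for illustration only. The point is just the ordering rule: JCas-defined features come first in their JCas order, and anything only a TSD declares is appended afterwards.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, NOT the UIMA API: a type system is a map from
// type name to an ordered list of feature names.
class JCasFirstMerge {

    // Merge TSD-declared features on top of JCas-declared ones.
    // JCas features keep their positions; TSD-only features are appended.
    static Map<String, List<String>> merge(Map<String, List<String>> jcasTypes,
                                           Map<String, List<String>> tsdTypes) {
        Map<String, List<String>> merged = new LinkedHashMap<>();

        // 1. Seed the result with the definitions sourced from the JCas classes.
        for (Map.Entry<String, List<String>> e : jcasTypes.entrySet()) {
            merged.put(e.getKey(), new ArrayList<>(e.getValue()));
        }

        // 2. Layer the user-provided TSDs on top, skipping features
        //    that the JCas class already defines.
        for (Map.Entry<String, List<String>> e : tsdTypes.entrySet()) {
            List<String> features =
                merged.computeIfAbsent(e.getKey(), k -> new ArrayList<>());
            for (String feature : e.getValue()) {
                if (!features.contains(feature)) {
                    features.add(feature);
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, List<String>> jcas = new LinkedHashMap<>();
        jcas.put("Token", List.of("begin", "end", "pos"));

        Map<String, List<String>> tsd = new LinkedHashMap<>();
        tsd.put("Token", List.of("lemma", "begin"));   // "begin" is already known
        tsd.put("Sentence", List.of("begin", "end"));  // type only in the TSD

        System.out.println(merge(jcas, tsd));
    }
}
```

In this toy model, the mitigation mentioned above would amount to filtering `jcasTypes` down to the type names that also occur in `tsdTypes` before calling `merge`.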
Finally, users who really do not want any JCas classes to affect their CASes could maybe disable JCas entirely for a given CAS instance - I thought I had seen an option for that somewhere years ago, but I can't find it at the moment.

What do you think?

Cheers,

-- Richard