Re: Design choices for changing type systems with loaded JCas classes [was Re: UIMAv3 & WebAnno]

Marshall Schor Tue, 09 Jan 2018 13:54:24 -0800

I did an initial implementation, ignoring Pear files.

I think the "feature expansion" can't reasonably be done for JCas classes loaded
from a PEAR-specified classpath (because by the time you lazily get around to
loading these, the type system is already committed).


So, I plan to have the pear loading path operate like before, with no feature
expansion.

I kind of doubt this will be a real issue in actual practice (he said hopefully
:-) ).

Still need to fix up some test cases, but it's looking promising...
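
To make the non-PEAR path concrete, here is a minimal sketch of the idea in
plain Java. It is a toy model under stated assumptions, not the actual UIMA
code: ToyType, augment, allFeatures and commit are hypothetical names invented
for this illustration, and features are reduced to bare names. It shows the
extra JCas-only feature being added to the type before commit, so that a
subtype's slots are laid out after the expanded ones.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these names are UIMA classes or UIMA API.
public class CommitTimeAugmentation {

    /** Hypothetical stand-in for a type: a name, an optional supertype, its own features. */
    static final class ToyType {
        final String name;
        final ToyType parent;                        // null for a root type
        final List<String> ownFeatures = new ArrayList<>();

        ToyType(String name, ToyType parent, String... features) {
            this.name = name;
            this.parent = parent;
            this.ownFeatures.addAll(List.of(features));
        }
    }

    /** Done before commit: add features that only the JCas class declares. */
    static void augment(ToyType type, List<String> jcasClassFeatures) {
        for (String f : jcasClassFeatures) {
            if (!allFeatures(type).contains(f)) {
                type.ownFeatures.add(f);
            }
        }
    }

    /** Inherited features first, then the type's own features. */
    static List<String> allFeatures(ToyType type) {
        List<String> result = type.parent == null
                ? new ArrayList<>()
                : allFeatures(type.parent);
        result.addAll(type.ownFeatures);
        return result;
    }

    /** "Commit": give every feature a slot index, in declaration order. */
    static Map<String, Integer> commit(ToyType type) {
        Map<String, Integer> slots = new LinkedHashMap<>();
        for (String f : allFeatures(type)) {
            slots.put(f, slots.size());
        }
        return slots;
    }

    public static void main(String[] args) {
        ToyType token = new ToyType("Token", null, "begin", "end");
        ToyType subToken = new ToyType("SubToken", token, "pos");

        // Pretend the JCas class for Token also declares "lemma".
        augment(token, List.of("begin", "end", "lemma"));

        System.out.println(commit(token));    // {begin=0, end=1, lemma=2}
        System.out.println(commit(subToken)); // {begin=0, end=1, lemma=2, pos=3}
    }
}
```

Because "lemma" is added before slots are assigned, the subtype's "pos" lands
after it. A class that is only loaded lazily, after commit, has no such
opportunity, which is the PEAR problem in a nutshell.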

-Marshall


On 1/8/2018 2:47 PM, Marshall Schor wrote:
> In working out the details, the following difficulty emerges:
>
> In the general case, a pipeline is associated with a class loader (used to
> load JCas classes).
> When the pipeline contains "PEARs", each pear can specify its own class
> loader, and therefore, its own set of JCas classes.
>
> So, at type system commit time, with this proposal, it would be necessary to
> find all of the class loaders that Pears might be using.  This unfortunately
> is not possible in general, because the Pears are associated with a particular
> pipeline, and you can load a type system and create a CAS without referring to
> a particular pipeline.
>
> In the current implementation, the presence of a Pear in the pipeline is
> discovered (if and) when the pear is entered for the first time, and at that
> time (lazily) the loading of that Pear's JCas classes happens.
>
> Various limitations are possible, I suppose (e.g., not allowing a Pear version
> of a JCas class to have new features).
>
> Still thinking about this...
>
> -Marshall
>
>
> On 1/8/2018 10:16 AM, Marshall Schor wrote:
>> After a lot of thought, here's a proposal, along the lines Richard suggests:
>>
>> The basic idea is to have the JCas classes, if they exist for some type,
>> augment that type with features defined only in the JCas class.
>>
>> This augmentation would be done at type system commit time, and would really
>> modify the type system being committed to have the extra features.  Because
>> the type system would be modified to include these extra features, the Feature
>> Structures made with these "augmented" types would be larger (because they
>> would have slots for these features).  This ensures that subtypes' features
>> won't overlap / collide with the expanded features.
>>
>> I'll work out the details, and see if I can make this change.
>>
>> -Marshall
>>
>>
>> On 1/5/2018 2:05 PM, Richard Eckart de Castilho wrote:
>>> On 05.01.2018, at 17:16, Marshall Schor wrote:
>>>> Based on Web Annot's use case, I'm thinking through alternatives.
>>> "WebAnno" ;)
>>>
>>>> One way to support this would be to have the user code tell the UIMA
>>>> framework that no reachable instances of JCas classes exist; the user would
>>>> be responsible for guaranteeing this.
>>> There may be no way for the user code to know if this is the case or not or
>>> to enforce this to be the case.
>>>
>>>> The other choice would be to not support this (because of the inherent
>>>> dangers) and instead require users having multiple type systems, with JCas
>>>> classes specifying features only in some versions of those type systems, to
>>>> first load the JCas classes with the feature-maximal versions of the types.
>>>>
>>>> I think I favor the 2nd approach, as it is much safer. 
>>>>
>>>> What do others think we should do?
>>> The current line of thinking seems to assume that:
>>>
>>> 1) a type system definition is loaded (maybe from an XML file)
>>> 2) a CAS is created using the TSD
>>> 3) the JCas classes are loaded and are initialized according to the TSD
>>>
>>> The suggestion to "first load a feature-maximal version of the types" seems
>>> to be following that line. I.e. the TSD loaded in 1) should cover all
>>> the features also covered by the JCas classes.
>>>
>>> How about a slightly different approach:
>>>
>>> 1) a type system definition is loaded (maybe from an XML file)
>>> 1a) the JCas classes are loaded and their definitions are merged with the
>>>     TSD
>>> 2) a CAS is created using the merged TSD
>>> 3) the JCas classes are initialized with the now feature-maximal type system
>>>
>>> An error would/should be thrown if in step 1a the JCas classes
>>> and the TSD are inherently incompatible. 
>>>
>>> In this case, the JCas classes would be an additional source of type system
>>> information. Thinking this further, one could even initialize a CAS without
>>> providing any TSD, simply by having UIMA inspect the available JCas classes
>>> (e.g. through classpath scanning or by providing the framework with a list
>>> of classes). To complete this, the JCas classes could be enhanced with
>>> Java annotations to carry any information included in TSDs which is currently
>>> not included in a machine-readable way in the JCas classes, e.g. type and
>>> feature description text. As such, a set of suitably annotated JCas classes
>>> could be converted to a TSD XML and vice versa.
>>>
>>> The above assumes that JCas classes are loaded and initialized eagerly, but 
>>> probably it could be adapted to a situation where the classes are loaded 
>>> lazily.
>>>
>>> Cheers,
>>>
>>> -- Richard
>>>
>>>
>
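
For reference, the merge step 1a in Richard's quoted ordering (load the TSD,
merge in the JCas class definitions, then create the CAS) might look roughly
like the sketch below. This is an illustration under assumptions, not UIMA API:
TsdMerge and merge are hypothetical names, and a plain map from feature name to
range type name stands in both for the TSD and for what a JCas class declares.
The only incompatibility checked here is a feature declared on both sides with
different range types.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: maps of feature name -> range type name stand in for a TSD
// and for the features a JCas class declares. Not UIMA API.
public class TsdMerge {

    /**
     * Step 1a: merge the JCas class's features into the TSD's features.
     * Throws if both sides declare the same feature with different range types.
     */
    static Map<String, String> merge(Map<String, String> tsdFeatures,
                                     Map<String, String> jcasFeatures) {
        Map<String, String> merged = new LinkedHashMap<>(tsdFeatures);
        for (Map.Entry<String, String> e : jcasFeatures.entrySet()) {
            // putIfAbsent returns the existing range type, or null if it was added.
            String existing = merged.putIfAbsent(e.getKey(), e.getValue());
            if (existing != null && !existing.equals(e.getValue())) {
                throw new IllegalStateException(
                        "Incompatible range for feature '" + e.getKey()
                        + "': TSD says " + existing
                        + ", JCas class says " + e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> tsd = new LinkedHashMap<>();
        tsd.put("begin", "uima.cas.Integer");
        tsd.put("end", "uima.cas.Integer");

        // The JCas class knows the TSD features plus one extra.
        Map<String, String> jcas = new LinkedHashMap<>(tsd);
        jcas.put("lemma", "uima.cas.String");

        // Step 2 would create the CAS from this merged, feature-maximal set.
        System.out.println(merge(tsd, jcas).keySet()); // [begin, end, lemma]
    }
}
```

The merged map plays the role of the "feature-maximal" type system that steps 2
and 3 would then use.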
