Re: "Standard" UIMA typesystem

Richard Eckart de Castilho Fri, 09 Sep 2016 06:42:36 -0700

In my experience, 95% of what a UIMA component class does is data
conversion, namely between the data model/format of some wrapped
non-UIMA tool and the type system that is used in the pipeline.
The other 5% is passing along parameters and configuring resources.

In some cases, I have factored out the data conversion from the
process method of the UIMA component, leaving basically this:

  process(CAS) {
    DataModel data = convertToDataModel(CAS);
    runWrappedTool(data);
    convertToCas(data, CAS);
  }

UIMA allows me to implement that conversion quickly and in a rather
streamlined way. If that conversion needs to be made more flexible
to support different type system designs, it would IMHO introduce
unnecessary and annoying complexity.

That said...

=> I could imagine that some minimal support for "type mapping"
directly in the CAS could help in certain situations, e.g. when
types/features get renamed as part of evolving a type system

We already have a view mapping, i.e. when a component accesses view X,
it may be mapped to view Y in the CAS. The same could be done for type
names and for features. Eventually, I would like to change the type
names of the DKPro Core type system, and then this would come in very
handy.
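
A minimal sketch of what such a name mapping could look like, by analogy
with view mapping. This is not an existing UIMA API: the class NameMapping,
its methods, and the type/feature names in the usage note are hypothetical
and only illustrate the idea of resolving a requested name to the name that
is actually present in the type system.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper, NOT part of the UIMA API: resolves the type or
 * feature name a component asks for to the name actually used in the CAS.
 */
class NameMapping {
    // requested (old) type name -> actual (new) type name
    private final Map<String, String> typeNames = new HashMap<>();
    // "requestedType:requestedFeature" -> actual feature base name
    private final Map<String, String> featureNames = new HashMap<>();

    NameMapping mapType(String requested, String actual) {
        typeNames.put(requested, actual);
        return this;
    }

    NameMapping mapFeature(String requestedType, String requestedFeature,
                           String actualFeature) {
        featureNames.put(requestedType + ":" + requestedFeature, actualFeature);
        return this;
    }

    /** Returns the mapped type name, or the requested name if unmapped. */
    String resolveType(String requested) {
        return typeNames.getOrDefault(requested, requested);
    }

    /** Returns the mapped feature base name, or the requested one if unmapped. */
    String resolveFeature(String requestedType, String requestedFeature) {
        return featureNames.getOrDefault(
                requestedType + ":" + requestedFeature, requestedFeature);
    }
}
```

A component coded against an old name such as "old.example.Token" (a
made-up name) would pass it through resolveType() before looking the type
up in the type system, so a renamed type stays reachable without touching
the component code. A purely name-based mapping like this covers renames
only, not structural changes.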

=> the ability to extend the type system at runtime (via CAS API, ignoring JCas)

For frameworks like Ruta, it would be nice if types and features could
be added after the CAS has been initialized. Past discussions about this
can be found elsethread.

However, beyond that, I presently find it hard to imagine any sensible
framework support. If the structural design of a type system is changed,
then the kind of mapping that can be done declaratively is usually not
sufficient.

Btw. nobody forces you to use the JCas API if you don't like it. Just
use the CAS API if that provides you with more of the flexibility that
you would like to have. You can happily mix components coded against CAS
and JCas in the same pipeline. I personally use the JCas whenever possible
and CAS whenever necessary (and LowLevelCas in a few cases as well ;) ).

Cheers,

-- Richard

> On 09.09.2016, at 15:11, Joern Kottmann <[email protected]> wrote:
> 
> A very good reason to use a framework like UIMA is that we can reuse
> components and don't have to build everything from scratch (if I have
> to do that I don't need UIMA these days).
> 
> To be able to reuse a component it must work with multiple type systems
> or be easy to adapt to a custom type system.
> 
> I personally think the convenience the JCas brings is outweighed many
> times over by all the complexity and disadvantages which come with it,
> e.g. the code generation step, having extra special classes, and the
> written code being mostly impossible to reuse.
> 
> Jörn
> 
> On Fri, Sep 9, 2016 at 2:37 PM, Peter Klügl <[email protected]>
> wrote:
> 
>> How should this be solved/improved? I do not see it.
>> 
>> You have either generic analysis engines with parameters for the types,
>> or the analysis engine knows the types and depends on them, regardless of
>> whether you use CAS or JCas.
>> 
>> Isn't that the thing with statically typed feature structures? If you have
>> Java code that depends on a class hierarchy, you are stuck with that
>> hierarchy. (I hope this discussion won't go in the direction that
>> dynamically typed programming languages are better.)
>> 
>> 
>> I probably do not understand the motivation. Can you give me an example?
>> 
>> 
>> Best,
>> 
>> 
>> Peter
