Re: Using PEAR in an application based on the UIMA framework

2013-05-01 Thread swirl
Richard Eckart de Castilho  writes:

>
> > Erhmmm, has anybody done something like this before?
> > I am really interested to know how you can do it.
> >
> > To clarify, I am very interested in how you can mix and match different
> > PEARs, possibly from different open source projects, with different type
> > systems, and run them in a pipeline as a coherent whole.
> >
> > How do you resolve the issue that all their type systems are of different
> > Java types, and still let them use each other's analysis results in the
> > pipeline?
>
> We generally stick to one type system and one collection of components, and
> include components from other collections only if they are configurable with
> respect to type or if they are type agnostic. E.g. we combine the linguistic
> pre-processing components from DKPro Core [1] with the machine learning
> components from ClearTK [2].
>
> There is also the uima-type-mapper [3] that tries to convert between type
> systems. Maybe you want to try that.
>
> Cheers,
>

Thanks Richard for your informative reply.
You mentioned that you are able to use DKPro Core and ClearTK together.
a. Do you use them as PEARs, or do you simply download the JARs (manually or
via Maven) and use the annotators like a library?
b. Your post implies that DKPro Core and ClearTK have some sort of
configurable type system or are type agnostic. So far I have seen that the
OpenNLP annotators can work with whatever types the application defines, via
configuration parameters such as "opennlp.uima.SentenceType". Are DKPro Core
and ClearTK similar to the OpenNLP annotators in this respect?
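
To make the question concrete, here is a minimal, library-free Python sketch of what "configurable with respect to type" could mean: the component receives the names of the sentence and token types as parameters instead of hard-coding classes from one type system. Everything in it (the `Annotation` tuple, `tokens_per_sentence`, the `a.` and `b.` type names) is hypothetical and not part of UIMA, OpenNLP, DKPro Core or ClearTK; it only mirrors the idea behind a parameter such as "opennlp.uima.SentenceType".

```python
from collections import namedtuple

# Hypothetical stand-in for a CAS annotation: a type name plus character offsets.
Annotation = namedtuple("Annotation", ["type_name", "begin", "end"])


def covered(annotations, type_name, begin, end):
    """Return annotations of the given type that lie inside [begin, end)."""
    return [a for a in annotations
            if a.type_name == type_name and begin <= a.begin and a.end <= end]


def tokens_per_sentence(annotations, sentence_type, token_type):
    """Count tokens in each sentence; type names are parameters, not hard-coded."""
    sentences = sorted((a for a in annotations if a.type_name == sentence_type),
                       key=lambda a: a.begin)
    return [len(covered(annotations, token_type, s.begin, s.end))
            for s in sentences]


# Two hypothetical type systems describing the same text "Hi there. Bye."
system_a = [Annotation("a.Sentence", 0, 9), Annotation("a.Sentence", 10, 14),
            Annotation("a.Token", 0, 2), Annotation("a.Token", 3, 8),
            Annotation("a.Token", 8, 9), Annotation("a.Token", 10, 13),
            Annotation("a.Token", 13, 14)]
system_b = [Annotation(a.type_name.replace("a.", "b."), a.begin, a.end)
            for a in system_a]

# The same component code runs against either type system.
counts_a = tokens_per_sentence(system_a, "a.Sentence", "a.Token")
counts_b = tokens_per_sentence(system_b, "b.Sentence", "b.Token")
```

The component never names a concrete class, so switching type systems only means changing two configuration strings.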

Thanks. 




RE: Textmarker: delete contained annotations

2013-05-01 Thread William Karl Thompson
Thanks Peter! I have read the documentation, but forgot about that condition.

From: Peter Klügl [pklu...@uni-wuerzburg.de]
Sent: Wednesday, May 01, 2013 6:18 AM
To: user@uima.apache.org
Subject: Re: Textmarker: delete contained annotations

Hi,

On 01.05.2013 04:57, William Karl Thompson wrote:
> Hi,
>
> I'm working with the cTAKES pipeline to annotate some clinical text. The 
> cTAKES syntactic chunker generates overlapping and nested annotations with 
> the same syntactic type. For example:
>
> [NP ascending [NP colon polyps]]
>
> What I would like to do is to use TextMarker rules to eliminate nested 
> annotations, so that in the end I just have the following:
>
> [NP ascending colon polyps]
>
> I've tried to use UNMARK, but the following two rules appear to remove all 
> NPs starting at the first match, even the containing annotation:
>
> NP{PARTOF(NP)->UNMARK(NP)};
> NP{CONTAINS(NP)->UNMARK(NP)};
>
> Is there a way to accomplish this that I'm missing? Using loops perhaps?

The simplest solution in your situation would be:

NP{PARTOFNEQ(NP) -> UNMARK(NP)};

(PARTOFNEQ: part of, but not equal)
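
As a rough illustration of the effect this rule is meant to have (not of how TextMarker evaluates it), the Python sketch below drops every span that lies inside another span of the same type without being identical to it. The `(begin, end)` tuples and the function name `drop_nested` are hypothetical, and the offsets assume the text "ascending colon polyps".

```python
def drop_nested(spans):
    """Keep only spans that are not strictly contained in another span.

    A span is "part of, but not equal to" another when it lies within the
    other span's offsets and the two offset pairs differ.
    """
    def inside(inner, outer):
        return outer[0] <= inner[0] and inner[1] <= outer[1] and inner != outer

    return [s for s in spans if not any(inside(s, other) for other in spans)]


# "ascending colon polyps": the outer NP covers offsets 0-22,
# the nested NP "colon polyps" covers offsets 10-22.
nps = [(0, 22), (10, 22)]
remaining = drop_nested(nps)
```

In this sketch spans with identical offsets are both kept, and partially overlapping spans are untouched; only strictly contained spans are removed.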

If it gets more complicated, then something with a loop and/or
additional types is necessary:

BLOCK(forEach) Test{} {
 ANY+? Test{-> UNMARK(Test)};
 Test{-> UNMARK(Test)} ANY;
}


Best,

Peter

> Thanks,
>
> Will
>



Re: Textmarker: delete contained annotations

2013-05-01 Thread Peter Klügl

Hi,

On 01.05.2013 04:57, William Karl Thompson wrote:

> Hi,
>
> I'm working with the cTAKES pipeline to annotate some clinical text. The
> cTAKES syntactic chunker generates overlapping and nested annotations with
> the same syntactic type. For example:
>
> [NP ascending [NP colon polyps]]
>
> What I would like to do is to use TextMarker rules to eliminate nested
> annotations, so that in the end I just have the following:
>
> [NP ascending colon polyps]
>
> I've tried to use UNMARK, but the following two rules appear to remove all
> NPs starting at the first match, even the containing annotation:
>
> NP{PARTOF(NP)->UNMARK(NP)};
> NP{CONTAINS(NP)->UNMARK(NP)};
>
> Is there a way to accomplish this that I'm missing? Using loops perhaps?


The simplest solution in your situation would be:

NP{PARTOFNEQ(NP) -> UNMARK(NP)};

(PARTOFNEQ: part of, but not equal)

If it gets more complicated, then something with a loop and/or 
additional types is necessary:


BLOCK(forEach) Test{} {
 ANY+? Test{-> UNMARK(Test)};
 Test{-> UNMARK(Test)} ANY;
}


Best,

Peter


> Thanks,
>
> Will