Re: Using PEAR in a application based on Uima framework
Richard Eckart de Castilho writes:
> > > Erhmmm, has anybody done something like this before?
> > I really am interested to know how you can do it.
> >
> > To clarify, I am very interested in how you can mix and match different PEARs,
> > possibly from different open source projects, with different type systems,
> > and run them in a pipeline as a coherent whole.
> >
> > How do you resolve the issue that all their type systems are of different
> > Java types and still use each other's analysis results in the pipeline?
>
> We generally stick to one type system and one collection of components and
> include components from other collections only if they are configurable with
> respect to type or if they are type agnostic. E.g. we combine the linguistic
> pre-processing components from DKPro Core [1] with the machine learning
> components from ClearTK [2].
>
> There is also the uima-type-mapper [3] that tries to convert between systems. Maybe
> you want to try that.
>
> Cheers,
>

Thanks Richard for your informative reply.

You mentioned that you are able to use DKPro Core and ClearTK together.

a. Do you use them as PEARs, or do you simply download the JARs (manually or via Maven) and use the annotators like a library?

b. Your post implies that DKPro Core and ClearTK have some sort of configurable type system or are type agnostic. So far I have seen that the OpenNLP annotators can work with whatever type the application uses, via configuration parameters such as "opennlp.uima.SentenceType". Are DKPro Core and ClearTK similar to the OpenNLP annotators in this respect?

Thanks.
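To make "configurable with respect to type" concrete, here is a minimal sketch of the idea in plain Java. It does not use the UIMA, OpenNLP, DKPro Core or ClearTK APIs. The class name, the parameter name "example.SentenceType" and the type name in the usage note are all made up for illustration; only the general pattern (the component is told which type name to produce through a configuration parameter) is taken from the discussion above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustration only: plain Java, not the UIMA, OpenNLP, DKPro Core or ClearTK API. */
final class ConfigurableSentenceSplitter {

    /** A typed span; the type is a plain string standing in for a type-system type name. */
    record Span(String type, int begin, int end) {}

    /** Hypothetical parameter name, modelled on the style of "opennlp.uima.SentenceType". */
    static final String PARAM_SENTENCE_TYPE = "example.SentenceType";

    private final String sentenceType;

    /** Reads the output type name from the configuration instead of hard-coding it. */
    ConfigurableSentenceSplitter(Map<String, String> config) {
        String configured = config.get(PARAM_SENTENCE_TYPE);
        if (configured == null) {
            throw new IllegalArgumentException("Missing parameter " + PARAM_SENTENCE_TYPE);
        }
        this.sentenceType = configured;
    }

    /** Naive splitter: a sentence ends at '.', '!' or '?'. */
    List<Span> split(String text) {
        List<Span> sentences = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '.' || c == '!' || c == '?') {
                sentences.add(new Span(sentenceType, start, i + 1));
                start = i + 1;
                // Skip whitespace so the next sentence starts at its first character.
                while (start < text.length() && Character.isWhitespace(text.charAt(start))) {
                    start++;
                }
                i = start - 1;
            }
        }
        if (start < text.length()) {
            sentences.add(new Span(sentenceType, start, text.length()));
        }
        return sentences;
    }
}
```

Constructing it with Map.of("example.SentenceType", "org.example.Sentence") makes every span it returns carry that type name, so the same component can feed pipelines built on different type systems. Whether a given DKPro Core or ClearTK component works this way is exactly the open question in (b).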
RE: Textmarker: delete contained annotations
Thanks Peter! I have read the documentation, but forgot about that condition.

From: Peter Klügl [pklu...@uni-wuerzburg.de]
Sent: Wednesday, May 01, 2013 6:18 AM
To: user@uima.apache.org
Subject: Re: Textmarker: delete contained annotations

Hi,

On 01.05.2013 04:57, William Karl Thompson wrote:
> Hi,
>
> I'm working with the cTAKES pipeline to annotate some clinical text. The
> cTAKES syntactic chunker generates overlapping and nested annotations with
> the same syntactic type. For example:
>
>     [NP ascending [NP colon polyps]]
>
> What I would like to do is to use TextMarker rules to eliminate nested
> annotations, so at the end of the day just have the following:
>
>     [NP ascending colon polyps]
>
> I've tried to use UNMARK, but the following two rules appear to remove all
> NPs starting at the first match, even the containing annotation:
>
>     NP{PARTOF(NP)->UNMARK(NP)};
>     NP{CONTAINS(NP)->UNMARK(NP)};
>
> Is there a way to accomplish this that I'm missing? Using loops perhaps?

The simplest solution in your situation would be:

    NP{PARTOFNEQ(NP) -> UNMARK(NP)};

(PARTOFNEQ: part of, but not equal)

If it gets more complicated, then something with a loop and/or additional
types is necessary:

    BLOCK(forEach) Test{} {
        ANY+? Test{-> UNMARK(Test)};
        Test{-> UNMARK(Test)} ANY;
    }

Best,

Peter

> Thanks,
>
> Will
>
Re: Textmarker: delete contained annotations
Hi,

On 01.05.2013 04:57, William Karl Thompson wrote:
> Hi,
>
> I'm working with the cTAKES pipeline to annotate some clinical text. The
> cTAKES syntactic chunker generates overlapping and nested annotations with
> the same syntactic type. For example:
>
>     [NP ascending [NP colon polyps]]
>
> What I would like to do is to use TextMarker rules to eliminate nested
> annotations, so at the end of the day just have the following:
>
>     [NP ascending colon polyps]
>
> I've tried to use UNMARK, but the following two rules appear to remove all
> NPs starting at the first match, even the containing annotation:
>
>     NP{PARTOF(NP)->UNMARK(NP)};
>     NP{CONTAINS(NP)->UNMARK(NP)};
>
> Is there a way to accomplish this that I'm missing? Using loops perhaps?

The simplest solution in your situation would be:

    NP{PARTOFNEQ(NP) -> UNMARK(NP)};

(PARTOFNEQ: part of, but not equal)

If it gets more complicated, then something with a loop and/or additional
types is necessary:

    BLOCK(forEach) Test{} {
        ANY+? Test{-> UNMARK(Test)};
        Test{-> UNMARK(Test)} ANY;
    }

Best,

Peter

> Thanks,
>
> Will
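For readers who want to see the intended effect of the PARTOFNEQ rule outside of TextMarker, here is a minimal sketch in plain Java. It is not the TextMarker or UIMA API and says nothing about how the rule engine evaluates the condition. The class and method names are invented for illustration, and reading "part of, but not equal" as "covered by another annotation of the same type whose offsets are not identical" is an assumption based on the one-line gloss above.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: plain Java, not the UIMA or TextMarker API. */
final class NestedSpanFilter {

    /** A typed text span with begin (inclusive) and end (exclusive) offsets. */
    record Span(String type, int begin, int end) {}

    /** True if inner lies within outer but does not have identical offsets. */
    static boolean isPartOfNotEqual(Span inner, Span outer) {
        boolean covered = outer.begin() <= inner.begin() && inner.end() <= outer.end();
        boolean sameOffsets = outer.begin() == inner.begin() && outer.end() == inner.end();
        return covered && !sameOffsets;
    }

    /** Returns the spans that are not nested inside another span of the same type. */
    static List<Span> dropNested(List<Span> spans) {
        List<Span> kept = new ArrayList<>();
        for (Span candidate : spans) {
            boolean nested = false;
            for (Span other : spans) {
                if (other.type().equals(candidate.type()) && isPartOfNotEqual(candidate, other)) {
                    nested = true;
                    break;
                }
            }
            if (!nested) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // "ascending colon polyps": the outer NP covers 0..22, the inner NP covers 10..22.
        List<Span> spans = List.of(new Span("NP", 0, 22), new Span("NP", 10, 22));
        System.out.println(dropNested(spans));
    }
}
```

With the two NPs from the example, only the outer one survives, because the inner one is covered by it and the outer one is covered by nothing but itself. Under the same assumption, two annotations with identical offsets are both kept. This also suggests why the plain PARTOF rule removed everything: if an annotation counts as part of itself, every NP matches.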