Re: UIMA FIT Pipeline with OpenNLP tokeniser

Rui Lopes Tue, 19 Apr 2016 14:15:52 -0700

Thank you, Raj!

I tried it, but without success: I still get only one annotation.
Could it be related to the type system?


Cheers,

/rp
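
Raj's point in the quoted reply below, that the tokenizer only works inside sentence annotations, can be illustrated with a toy model in plain Java. This is only a minimal sketch and not OpenNLP's code: the class SentenceScopedTokenizerSketch, its two methods, and the full-stop sentence splitter are hypothetical stand-ins, and the real SimpleTokenizer may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not OpenNLP code) of a tokenizer that only looks inside sentence spans. */
final class SentenceScopedTokenizerSketch {

    /** Hypothetical stand-in for a sentence detector: one [begin, end) span per full stop. */
    static List<int[]> detectSentences(String text) {
        List<int[]> spans = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '.') {
                spans.add(new int[] {start, i + 1});
                start = i + 1;
            }
        }
        // Keep any trailing text that does not end with a full stop.
        if (start < text.length() && !text.substring(start).trim().isEmpty()) {
            spans.add(new int[] {start, text.length()});
        }
        return spans;
    }

    /** Emits whitespace-separated tokens, but only inside the given sentence spans. */
    static List<String> tokenize(String text, List<int[]> sentences) {
        List<String> tokens = new ArrayList<>();
        for (int[] span : sentences) {
            String sentence = text.substring(span[0], span[1]).trim();
            for (String token : sentence.split("\\s+")) {
                if (!token.isEmpty()) {
                    tokens.add(token);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Main objective is clear. Results point to one procedure.";
        // No sentence spans: the outer loop never runs, so no tokens are produced.
        System.out.println(tokenize(text, new ArrayList<>()).size());
        // With sentence spans available, the same tokenizer produces tokens.
        System.out.println(tokenize(text, detectSentences(text)).size());
    }
}
```

In the pipeline from the original message, the corresponding change is the one Raj suggests: a sentence detection engine placed ahead of the tokenizer in the SimplePipeline.runPipeline(...) call.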


> On 19 Apr 2016, at 17:38, Raj kiran <[email protected]> wrote:
> 
> I believe you are missing the SentenceDetector engine in the pipeline. It
> should be added before SimpleTokenizer.
> 
> SimpleTokenizer iterates over the sentences in the text/document, and in the
> absence of sentence annotations the tokenizer adds no tokens to the CAS.
> 
> Hope it helps.
> 
> Regards,
> Raj
> 
> On Tue, Apr 19, 2016 at 7:48 PM, Richard Eckart de Castilho <[email protected]>
> wrote:
> 
>> Short answer: no :)
>> 
>> Longer answer: You don't seem to be using the actual OpenNLP UIMA
>> components.
>> 
>> If you want an example (in Groovy, but it should be trivial to port to Java)
>> of how to use the OpenNLP UIMA components with uimaFIT, see here:
>> 
>>  https://cwiki.apache.org/confluence/display/UIMA/uimaFIT+and+Groovy
>> 
>> Cheers,
>> 
>> -- Richard
>> 
>>> On 19.04.2016, at 16:07, Rui Lopes <[email protected]> wrote:
>>> 
>>> Hi all,
>>> 
>>> I’m trying to use the OpenNLP UIMA components to build a very simple pipeline:
>>> 
>>> CollectionReaderDescription reader = CollectionReaderFactory
>>>         .createReaderDescription(AbstractCollectionReader.class,
>>>                 AbstractCollectionReader.PARAM_VALUE, 33);
>>> 
>>> AnalysisEngineDescription tokenizer =
>>>         AnalysisEngineFactory.createEngineDescription(SimpleTokenizer.class,
>>>                 "opennlp.uima.SentenceType", "pt.ipb.pos.type.Sentence",
>>>                 "opennlp.uima.TokenType", "pt.ipb.pos.type.Token");
>>> 
>>> AnalysisEngineDescription ae =
>>>         AnalysisEngineFactory.createEngineDescription(GetStartedQuickAE.class);
>>> 
>>> SimplePipeline.runPipeline(reader, tokenizer, ae);
>>> 
>>> 
>>> ------
>>> The GetStartedQuickAE just prints the Annotations:
>>> 
>>>      @Override
>>>      public void process(JCas jCas) throws AnalysisEngineProcessException {
>>>              System.out.println(jCas.getDocumentText());
>>> 
>>>              for (Annotation a : jCas.getAnnotationIndex()) {
>>>                      System.out.println(a);
>>>              }
>>> 
>>>              System.out.println("Done");
>>>      }
>>> 
>>> 
>>> ------
>>> The output is:
>>> 
>>> 
>>> Apr 19, 2016 3:04:46 PM opennlp.uima.tokenize.AbstractTokenizer initialize(71)
>>> INFO: Initializing the OpenNLP Simple Tokenizer annotator.
>>> Apr 19, 2016 3:04:46 PM opennlp.uima.util.AnnotatorUtil getOptionalParameter(440)
>>> INFO: opennlp.uima.IsRemoveExistingAnnotations = not set
>>> Apr 19, 2016 3:04:46 PM opennlp.uima.util.AnnotatorUtil getOptionalParameter(440)
>>> INFO: opennlp.uima.SentenceType = pt.ipb.pos.type.Sentence
>>> Apr 19, 2016 3:04:46 PM opennlp.uima.util.AnnotatorUtil getOptionalParameter(440)
>>> INFO: opennlp.uima.TokenType = pt.ipb.pos.type.Token
>>> This article aims to observe the didactic action and its epistemological
>>> insertion in education trends as well as its role as a medium capable of
>>> causing changes in this alignment. Main objective is the need to
>>> consciously integrate between epistemology and education trends didactic
>>> application. The methodological procedure trend the application relied on
>>> observations from years in which the subjects were given Cytology and
>>> Histology in undergraduate courses. The results of observations point to a
>>> single procedure, with little clarity regarding the alignment epistemology,
>>> educational trends, teaching action. Associate art practice can provide a
>>> biological alternative capable of generating a position and "profitable
>>> shifts" in epistemological and pedagogical articulating. Different
>>> strategies need to be created to establish conditions that allow the
>>> configuration of knowledge as a whole, while respecting cultural diversity
>>> in which knowledge is configured.
>>> DocumentAnnotation
>>>  sofa: _InitialView
>>>  begin: 0
>>>  end: 969
>>>  language: "x-unspecified"
>>> 
>>> Done
>>> 
>>> 
>>> There is only one annotation. Does anyone know why?
>>> 
>>> Thanks for any feedback!
>>> 
>>> All the best,
>>> 
>>> Rui Lopes
>>> 
>> 
>>
