I believe you are missing the SentenceDetector engine in the pipeline. It should be added before SimpleTokenizer.
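To show why the order matters, here is a minimal, self-contained Java sketch. It is a toy model, not the OpenNLP UIMA API: the class SentenceFirstDemo, the Span type, and both methods are hypothetical names made up for illustration, and the sentence splitting is deliberately naive. The only point is that a tokenizer which walks over sentence spans produces nothing when no sentence spans exist yet.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these names come from OpenNLP or UIMA.
public class SentenceFirstDemo {

    /** A typed character span, standing in for an annotation in the CAS. */
    public static final class Span {
        public final String type;
        public final int begin;
        public final int end;

        public Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Naive sentence detector: a sentence ends at '.', '!' or '?'. */
    public static List<Span> detectSentences(String text) {
        List<Span> sentences = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '.' || c == '!' || c == '?') {
                sentences.add(new Span("Sentence", start, i + 1));
                start = i + 1;
                // Skip the whitespace that separates two sentences.
                while (start < text.length()
                        && Character.isWhitespace(text.charAt(start))) {
                    start++;
                }
            }
        }
        if (start < text.length()) {
            // Trailing text without a terminator still counts as a sentence.
            sentences.add(new Span("Sentence", start, text.length()));
        }
        return sentences;
    }

    /** Whitespace tokenizer that only looks inside the given sentences. */
    public static List<Span> tokenize(String text, List<Span> sentences) {
        List<Span> tokens = new ArrayList<>();
        for (Span s : sentences) {
            int i = s.begin;
            while (i < s.end) {
                while (i < s.end && Character.isWhitespace(text.charAt(i))) {
                    i++;
                }
                int begin = i;
                while (i < s.end && !Character.isWhitespace(text.charAt(i))) {
                    i++;
                }
                if (i > begin) {
                    tokens.add(new Span("Token", begin, i));
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Short answer. Longer answer.";

        // Tokenizer alone: no sentences to iterate over, so no tokens.
        List<Span> withoutSentences = tokenize(text, new ArrayList<>());
        System.out.println("tokens without sentences: " + withoutSentences.size());

        // Sentence detector first, tokenizer second.
        List<Span> withSentences = tokenize(text, detectSentences(text));
        System.out.println("tokens with sentences: " + withSentences.size());
    }
}
```

In the real pipeline the same idea would mean passing a sentence detector description to SimplePipeline.runPipeline ahead of the tokenizer, configured with the same sentence type that the tokenizer is told to read.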
SimpleTokenizer iterates over the sentence annotations in the document. When there are no sentence annotations, the tokenizer adds no tokens to the CAS.

Hope it helps.

Regards,
Raj

On Tue, Apr 19, 2016 at 7:48 PM, Richard Eckart de Castilho <[email protected]> wrote:

> Short answer: no :)
>
> Longer answer: You don't seem to be using the actual OpenNLP UIMA
> components.
>
> If you want an example (in Groovy, but should be trivial to transfer to
> Java) on how to use the OpenNLP UIMA components with uimaFIT, see here:
>
> https://cwiki.apache.org/confluence/display/UIMA/uimaFIT+and+Groovy
>
> Cheers,
>
> -- Richard
>
> > On 19.04.2016, at 16:07, Rui Lopes <[email protected]> wrote:
> >
> > Hi all,
> >
> > I'm trying to use OpenNLP UIMA to build a very simple pipeline:
> >
> > CollectionReaderDescription reader = CollectionReaderFactory
> >     .createReaderDescription(AbstractCollectionReader.class,
> >         AbstractCollectionReader.PARAM_VALUE, 33);
> >
> > AnalysisEngineDescription tokenizer =
> >     AnalysisEngineFactory.createEngineDescription(SimpleTokenizer.class,
> >         "opennlp.uima.SentenceType", "pt.ipb.pos.type.Sentence",
> >         "opennlp.uima.TokenType", "pt.ipb.pos.type.Token");
> >
> > AnalysisEngineDescription ae =
> >     AnalysisEngineFactory.createEngineDescription(GetStartedQuickAE.class);
> >
> > SimplePipeline.runPipeline(reader, tokenizer, ae);
> >
> > ------
> > The GetStartedQuickAE just prints the Annotations:
> >
> > @Override
> > public void process(JCas jCas) throws AnalysisEngineProcessException {
> >     System.out.println(jCas.getDocumentText());
> >
> >     for (Annotation a : jCas.getAnnotationIndex()) {
> >         System.out.println(a);
> >     }
> >
> >     System.out.println("Done");
> > }
> >
> > ------
> > The output is:
> >
> > Apr 19, 2016 3:04:46 PM opennlp.uima.tokenize.AbstractTokenizer initialize(71)
> > INFO: Initializing the OpenNLP Simple Tokenizer annotator.
> > Apr 19, 2016 3:04:46 PM opennlp.uima.util.AnnotatorUtil getOptionalParameter(440)
> > INFO: opennlp.uima.IsRemoveExistingAnnotations = not set
> > Apr 19, 2016 3:04:46 PM opennlp.uima.util.AnnotatorUtil getOptionalParameter(440)
> > INFO: opennlp.uima.SentenceType = pt.ipb.pos.type.Sentence
> > Apr 19, 2016 3:04:46 PM opennlp.uima.util.AnnotatorUtil getOptionalParameter(440)
> > INFO: opennlp.uima.TokenType = pt.ipb.pos.type.Token
> > This article aims to observe the didactic action and its
> > epistemological insertion in education trends as well as its role as
> > a medium capable of causing changes in this alignment. Main objective
> > is the need to consciously integrate between epistemology and
> > education trends didactic application. The methodological procedure
> > trend the application relied on observations from years in which the
> > subjects were given Cytology and Histology in undergraduate courses.
> > The results of observations point to a single procedure, with little
> > clarity regarding the alignment epistemology, educational trends,
> > teaching action. Associate art practice can provide a biological
> > alternative capable of generating a position and "profitable shifts"
> > in epistemological and pedagogical articulating. Different strategies
> > need to be created to establish conditions that allow the
> > configuration of knowledge as a whole, while respecting cultural
> > diversity in which knowledge is configured.
> > DocumentAnnotation
> >    sofa: _InitialView
> >    begin: 0
> >    end: 969
> >    language: "x-unspecified"
> >
> > Done
> >
> > There is only one Annotation? Does anyone know why?
> >
> > Thanks for any feedback!
> >
> > All the best,
> >
> > Rui Lopes
