Hi Carlos,

It is exactly the same. You can create a training corpus like this:

Sometimes for length I might have <START:length> 35+81' <END> which means <START:length> 3500 + 81 3581' <END>
<START:pressure> 977 psig <END>

Notice that the corpus should have one tokenized sentence per line.

You could also check whether the regular expression Name Finder implementation would be a better fit for your task:

http://svn.apache.org/viewvc/opennlp/trunk/opennlp-tools/src/main/java/opennlp/tools/namefind/RegexNameFinder.java?view=markup

Regards,
William

On Sun, Jul 8, 2012 at 6:43 AM, Carlos Scheidecker <[email protected]> wrote:

> Hello all,
>
> I would like to train the system to identify pressure and length entities
> on a document.
>
> So for instance, if I have 39481.8' 10.750" x .25"
>
> 977 psig
>
> Sometimes for length I might have 35+81' which means 3500 + 81 3581'
>
> Is there any examples on how to train entities on OpenNLP?
>
> On the manual it has this
>
> http://opennlp.apache.org/documentation/1.5.2-incubating/manual/opennlp.html#tools.namefind.training
>
> But then, I wonder how would that work or if I should use the examples on
> percentage or money entities.
>
> Thanks again.
>
> Carlos.
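
To give a feel for the regular-expression route mentioned above, here is a minimal sketch of what pattern-based matching for these two entity types could look like. It uses plain java.util.regex rather than the OpenNLP API, and both patterns are my own assumptions based only on the examples in this thread (a number or a "35+81" style value followed by a foot mark for length, a number followed by "psig" for pressure). The class name MeasureRegexSketch is hypothetical. Real documents will probably need broader patterns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MeasureRegexSketch {

    // Assumed patterns, derived only from the examples in the thread:
    //   length:   digits, optionally ".digits" or "+digits", then a foot mark (39481.8' or 35+81')
    //   pressure: digits, optionally ".digits", optional whitespace, then "psig" (977 psig)
    // The lookbehinds stop a match from starting in the middle of a number.
    private static final Pattern MEASURE = Pattern.compile(
        "(?<length>(?<![\\d.+])\\d+(?:\\+\\d+|\\.\\d+)?')"
        + "|(?<pressure>(?<![\\d.])\\d+(?:\\.\\d+)?\\s*psig\\b)");

    // Returns each hit as "type:matched text", in order of appearance.
    static List<String> find(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = MEASURE.matcher(text);
        while (m.find()) {
            // Exactly one of the two named groups is non-null for any match.
            String type = m.group("length") != null ? "length" : "pressure";
            found.add(type + ":" + m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        for (String hit : find("39481.8' 10.750\" x .25\" 977 psig 35+81'")) {
            System.out.println(hit);
        }
    }
}
```

Values in inches such as 10.750" are deliberately not matched here, since only the foot mark is treated as a length unit. If the formats turn out to be too varied for a handful of patterns, training a model on an annotated corpus is likely the better option.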
