On Thu, Feb 9, 2012 at 1:16 AM, [email protected] < [email protected]> wrote:
> Yes, maybe we should make the module dependencies clear to understand,
> maybe with some diagrams.
> The Tokenizer Tools documentation includes a passage explaining how to work
> with raw text. Maybe it would be helpful to add something like that in the
> Name Finder Tools documentation, since it is the most popular module.
>
> http://incubator.apache.org/opennlp/documentation/1.5.2-incubating/manual/opennlp.html#tools.tokenizer.cmdline
>

+1, that is something everyone needs to know, and I guess it is also something
which is often not done entirely correctly.

It is easy to get it wrong because the name finder will still work even when a
user gives it a paragraph or a whole document. The results are just not good,
but many don't know what "good" looks like anyway when they try it out the
first time.

Jörn
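
To illustrate the point about raw text, here is a minimal sketch of the preprocessing that should happen before name finding: split the raw text into sentences, tokenize each sentence, and hand the name finder one token array per sentence rather than a whole paragraph or document. The class and method names (NameFinderInputSketch, splitSentences, tokenize, prepare) are made up for this example. The JDK's BreakIterator and a whitespace split are only stand-ins so that the sketch runs without model files; in a real OpenNLP pipeline those two steps would presumably be done by trained components such as SentenceDetectorME and TokenizerME, and the resulting arrays passed to NameFinderME.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, not part of OpenNLP.
public final class NameFinderInputSketch {

    // Stand-in for a trained sentence detector: uses the JDK's
    // rule-based sentence BreakIterator.
    static List<String> splitSentences(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        List<String> sentences = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    // Stand-in for a trained tokenizer: splits on whitespace only,
    // so punctuation stays attached to the neighbouring word.
    static String[] tokenize(String sentence) {
        return sentence.trim().split("\\s+");
    }

    // Turns raw text into one token array per sentence. Each array is
    // what a name finder should receive in a single call.
    static List<String[]> prepare(String rawText) {
        List<String[]> tokenizedSentences = new ArrayList<>();
        for (String sentence : splitSentences(rawText)) {
            tokenizedSentences.add(tokenize(sentence));
        }
        return tokenizedSentences;
    }
}
```

Calling NameFinderInputSketch.prepare(rawText) yields a list of token arrays. The name finder would then be invoked once per array, which is the step that is easy to skip because passing the whole document at once does not fail, it just gives worse results.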
