Hi James,

Thank you for your great response!
1. I already used the command (as described in the documentation) and got some nice results. The only problem I've found is with the NameFinder: it didn't recognize a number of names. Can you please explain how I can use the trainer to make it recognize more names (people, places, etc.)?

2. By linked documents I mean related articles. For an example (in GATE), see http://gate.ac.uk/biz/customers.html and read the first paragraph under "media".

3. In addition, I have access to lots of texts/books written in Hebrew. How can I use them to train the NameFinder (I will contribute it back)?

And again, thank you very much!

On Sun, Jun 5, 2011 at 2:04 AM, James Kosin <[email protected]> wrote:
> Eldad,
>
> It is possible.
> 1) This is easy enough with the current architecture and models.
> Basically, you have to pass in the document or paragraphs and parse into
> sentences using the SentenceDetector, which detects the sentences in the
> paragraph and returns a String array of sentences. Next the output from
> the sentence detector needs to be put through the Tokenizer, which takes
> the sentences and tokenizes into smaller parts. Usually words, but it
> also moves punctuation away from the words as well. This is done for
> each sentence and returns a string list of tokens. From here you have
> the raw data needed for most of the other models. From your
> description, you will want to use the NameFinder and the supporting
> models to tag the people, locations, and organizations and the like.
>
> 2) Not sure what you mean by link documents to others....
>
> 3) We don't yet support all languages at the moment. Mostly because
> training and test data need to be collected over many months and parsed
> to be trained. Many groups have already done some work; unfortunately,
> most is copyrighted and difficult for everyone to get in some cases.
>
> This should get you started.
> http://incubator.apache.org/opennlp/documentation/manual/opennlp.html
>
> Download the release here... Don't forget the models toward the bottom.
> http://incubator.apache.org/opennlp/download.cgi
>
> Let us know if you need anything else.
>
> James
>
>
> On 6/4/2011 12:30 PM, Eldad Yamin wrote:
> > Hello everyone,
> > After researching NLP, I have found OpenNLP to be one of the most
> > promising solutions at the moment.
> > However, I'm still looking for instructions on how to make OpenNLP fit
> > my needs.
> >
> > I need OpenNLP to:
> > 1. take a sentence/paragraph as input and return IE, annotation, named
> > entities (people, locations, organizations) and (numbers, dates, etc.).
> > 2. link documents to others.
> > 3. support multiple languages.
> >
> > Please advise,
> > Eldad.
