On 1/11/11 2:21 PM, Olivier Grisel wrote:
2011/1/4 Olivier Grisel<[email protected]>:
I plan to give more details in a blog post soon (tm).
Here it is:

   
http://blogs.nuxeo.com/dev/2011/01/mining-wikipedia-with-hadoop-and-pig-for-natural-language-processing.html

It gives a bit more context and some additional results and clues for
improvements and potential new usages.

I have now read this post too, and it sounds very interesting.

What is the biggest training file for the name finder you can generate with this method?

I think we need MapReduce training support for OpenNLP. That is already on my to-do list, but I am currently still busy with the Apache migration and the next release. Anyway, I hope we can get that done, at least partially, for the name finder this year.

Jörn
