On 1/11/11 2:21 PM, Olivier Grisel wrote:
> 2011/1/4 Olivier Grisel <[email protected]>:
>> I plan to give more details in a blog post soon (tm).
>
> Here it is:
>
> http://blogs.nuxeo.com/dev/2011/01/mining-wikipedia-with-hadoop-and-pig-for-natural-language-processing.html
>
> It gives a bit more context and some additional results and clues for
> improvements and potential new usages.

I have now read this post too; it sounds very interesting.
What is the biggest training file for the name finder you can generate
with this method?
I think we need MapReduce training support for OpenNLP. Actually, that
is already on my todo list, but currently I am still busy with the
Apache migration and the next release.
Anyway, I hope we can get that done, at least partially, for the name
finder this year.
Jörn