+1 from me, sounds like a useful contribution. Sorry for my late reply,
I have been away for two weeks.
Jörn
On 09/19/2012 01:52 AM, William Colen wrote:
Hi, Lance Norskog,
I am sure that it would be a great contribution to have it in OpenNLP! I am
reading your blog to check how it works and it looks very nice.
Congratulations!
William
On Sun, Sep 16, 2012 at 1:03 AM, Lance Norskog wrote:
I wrote an LSA toolkit and used it in a document summarizer for Solr.
The toolkit includes all of the common conditioning algorithms for
term-document matrices, and I did an exhaustive bake-off of algorithms
using the first Reuters corpus.
http://ultrawhizbang.blogspot.com/2012/09/document-summarization-with-lsa-1.html
What does this have to do with OpenNLP? When I used part-of-speech tags to
keep only nouns and verbs, all of the algorithms were 10-15% better.
Every single one. Also, would the toolkit be useful for this project?
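
For illustration only, here is a minimal sketch of the noun/verb filtering step described above. It is not taken from the toolkit: the function names are hypothetical, and it assumes the tokens have already been tagged with Penn Treebank-style tags (noun tags starting with NN, verb tags starting with VB) by some POS tagger. It builds plain term counts; any conditioning of the matrix would happen afterwards.

```python
from collections import Counter

# Assumption: Penn Treebank-style tags, where nouns start with "NN"
# and verbs start with "VB".
KEEP_PREFIXES = ("NN", "VB")


def keep_nouns_and_verbs(tagged_tokens):
    """Return the lowercased tokens whose POS tag marks a noun or a verb."""
    return [
        token.lower()
        for token, tag in tagged_tokens
        if tag.startswith(KEEP_PREFIXES)
    ]


def term_document_matrix(tagged_docs):
    """Build (vocab, rows), where rows[i][j] counts term i in document j."""
    counts = [Counter(keep_nouns_and_verbs(doc)) for doc in tagged_docs]
    vocab = sorted(set().union(*counts))
    rows = [[c[term] for c in counts] for term in vocab]
    return vocab, rows


# Two tiny, hand-tagged example documents (hypothetical data).
docs = [
    [("The", "DT"), ("market", "NN"), ("fell", "VBD"), ("sharply", "RB")],
    [("Markets", "NNS"), ("fell", "VBD"), ("again", "RB")],
]
vocab, rows = term_document_matrix(docs)
```

With the two example documents, the determiner and the adverbs are dropped, so only "fell", "market" and "markets" remain as rows of the matrix.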
--
Lance Norskog