[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785043#comment-13785043 ] Zack Zullick commented on LUCENE-2899: -- I have seen this behavior before (look up to previous comments, especially from user Em and his previous fix) and I am experiencing similar results with the latest patch uploaded (Jun-16-2013) on 4.4/branch_4x. In my case, the OpenNLP system is only working when indexing the first document then no longer working thereafter. It seems you are having a similar issue, with the exception that yours is happening on the query end rather than the indexing. I sent out an email to Lance to see if he has any advice for us. > Add OpenNLP Analysis capabilities as a module > - > > Key: LUCENE-2899 > URL: https://issues.apache.org/jira/browse/LUCENE-2899 > Project: Lucene - Core > Issue Type: New Feature > Components: modules/analysis >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Fix For: 5.0, 4.5 > > Attachments: LUCENE-2899-current.patch, LUCENE-2899.patch, > LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, > LUCENE-2899.patch, LUCENE-2899-RJN.patch, LUCENE-2899-x.patch, > LUCENE-2899-x.patch, LUCENE-2899-x.patch, OpenNLPFilter.java, > OpenNLPFilter.java, OpenNLPTokenizer.java, opennlp_trunk.patch > > > Now that OpenNLP is an ASF project and has a nice license, it would be nice > to have a submodule (under analysis) that exposed capabilities for it. Drew > Farris, Tom Morton and I have code that does: > * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it > would have to change slightly to buffer tokens) > * NamedEntity recognition as a TokenFilter > We are also planning a Tokenizer/TokenFilter that can put parts of speech as > either payloads (PartOfSpeechAttribute?) on a token or at the same position. > I'd propose it go under: > modules/analysis/opennlp -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641739#comment-13641739 ] Zack Zullick commented on LUCENE-2899: -- Some information for those wanting to try this after fighting it for a day: the latest patch posted, LUCENE-2899-RJN.patch for 4.1 does not have Em's OpenNLPFilter.java and OpenNLPTokenizer.java fixed applied. So after applying the patch, make sure to replace those classes with Em's version or the bug that causes the NLP system to only be utilized on the first request will still be present. I was also able to successfully apply this patch to 4.2.1 with minor modification (mostly to the build/ivy xml files). > Add OpenNLP Analysis capabilities as a module > - > > Key: LUCENE-2899 > URL: https://issues.apache.org/jira/browse/LUCENE-2899 > Project: Lucene - Core > Issue Type: New Feature > Components: modules/analysis >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Fix For: 4.3 > > Attachments: LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, > LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, > LUCENE-2899-RJN.patch, OpenNLPFilter.java, OpenNLPTokenizer.java, > opennlp_trunk.patch > > > Now that OpenNLP is an ASF project and has a nice license, it would be nice > to have a submodule (under analysis) that exposed capabilities for it. Drew > Farris, Tom Morton and I have code that does: > * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it > would have to change slightly to buffer tokens) > * NamedEntity recognition as a TokenFilter > We are also planning a Tokenizer/TokenFilter that can put parts of speech as > either payloads (PartOfSpeechAttribute?) on a token or at the same position. > I'd propose it go under: > modules/analysis/opennlp -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org