[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007147#comment-14007147 ]

rashi gandhi commented on LUCENE-2899:
--------------------------------------

Hi,

I have one running Solr core, with some data indexed, deployed on Tomcat. This core is designed to provide OpenNLP functionality for indexing and searching, so I have kept the following binary models at this location:

\apache-tomcat-7.0.53\solr\collection1\conf\opennlp

• en-sent.bin
• en-token.bin
• en-pos-maxent.bin
• en-ner-person.bin
• en-ner-location.bin

My problem is: when I unload the running core and try to delete the conf directory from it, the delete is refused with a prompt that en-sent.bin and en-token.bin are in use. All other files in the conf directory are deleted; only en-sent.bin and en-token.bin remain.

If I have unloaded the core, why are its files still locked? Is this a known issue with the OpenNLP binaries? How can I release the connection between the unloaded core and the conf directory (especially the binary models)?

Please provide me some pointers on this.

Thanks in advance

> Add OpenNLP Analysis capabilities as a module
> ---------------------------------------------
>
> Key: LUCENE-2899
> URL: https://issues.apache.org/jira/browse/LUCENE-2899
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/analysis
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Priority: Minor
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-2899-RJN.patch, LUCENE-2899.patch, OpenNLPFilter.java, OpenNLPTokenizer.java
>
>
> Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposed capabilities for it. Drew Farris, Tom Morton and I have code that does:
> * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens)
> * NamedEntity recognition as a TokenFilter
> We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position.
> I'd propose it go under:
> modules/analysis/opennlp

--
This message was sent by Atlassian JIRA
(v6.2#6252)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
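The thread does not say why the two model files stay locked. One possible cause, offered here only as an assumption, is that a stream opened on a model file is never closed, so the process keeps a handle on it after the core is unloaded. The sketch below uses only the JDK and shows one way to avoid holding such a handle: copy the file into memory and hand out an in-memory stream. The class and method names (`ModelBytes`, `openDetached`, `deletableAfterLoad`) are hypothetical and are not part of the patch.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not taken from LUCENE-2899.patch.
class ModelBytes {

    /**
     * Reads the whole file into memory and returns a stream over the copy.
     * Files.readAllBytes closes the file before it returns, so no handle
     * on the file stays open after this call.
     */
    static InputStream openDetached(Path modelFile) {
        try {
            return new ByteArrayInputStream(Files.readAllBytes(modelFile));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Small self-check with a temporary stand-in file (not a real model):
     * load it, delete it, and confirm the in-memory copy is still readable.
     */
    static boolean deletableAfterLoad() {
        try {
            Path tmp = Files.createTempFile("en-sent", ".bin");
            Files.write(tmp, new byte[] {1, 2, 3});
            InputStream in = openDetached(tmp);
            Files.delete(tmp); // succeeds because no handle on the file remains open
            return in.read() == 1 && !Files.exists(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("deletable after load: " + deletableAfterLoad());
    }
}
```

The returned stream could then be passed to a model constructor that accepts an `InputStream`, for example `opennlp.tools.sentdetect.SentenceModel`. Whether the patch's resource loading actually leaves streams open is not established in this thread, so this is a pattern to try rather than a confirmed fix.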
[jira] [Comment Edited] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859480#comment-13859480 ]

rashi gandhi edited comment on LUCENE-2899 at 12/31/13 1:07 PM:
----------------------------------------------------------------

OK, thanks Lance.

One more question: I want to design an analyzer that can support location containment relationships, for example Europe->France->Paris.

My requirement is: when a user searches for a country, the results must include documents that mention that country as well as documents that mention states and cities within that country. Documents with the country name itself must have higher relevancy. The containment relationship must hold for up to 4 levels, i.e. Continent->Country->State->City.

Is there any way in OpenNLP to support this type of search? Can the location tagger model be used for this?

Please provide me some pointers to move ahead.

Thanks in advance

> Add OpenNLP Analysis capabilities as a module
> ---------------------------------------------
>
> Key: LUCENE-2899
> URL: https://issues.apache.org/jira/browse/LUCENE-2899
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/analysis
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Priority: Minor
> Fix For: 4.7
>
> Attachments: LUCENE-2899-RJN.patch, LUCENE-2899.patch, OpenNLPFilter.java, OpenNLPTokenizer.java
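The OpenNLP name finder models tag spans of text as locations; the containment hierarchy itself would have to come from some other source. A minimal sketch of the requirement described above, assuming a hand-built child-to-parent map and a score that halves with each level of distance (the class name, the sample place names, and the halving rule are all illustrative assumptions, not something provided by OpenNLP or the patch):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of Continent->Country->State->City containment.
class LocationContainment {

    // child -> parent, filled from a hand-built gazetteer (assumed data source)
    private final Map<String, String> parentOf = new HashMap<>();

    /** Registers one path from the broadest place to the narrowest. */
    void addPath(String... path) {
        for (int i = 1; i < path.length; i++) {
            parentOf.put(path[i], path[i - 1]);
        }
    }

    /** Returns the location followed by its ancestors, nearest first. */
    List<String> selfAndAncestors(String location) {
        List<String> chain = new ArrayList<>();
        for (String cur = location; cur != null; cur = parentOf.get(cur)) {
            chain.add(cur);
        }
        return chain;
    }

    /**
     * Score for a document that mentions docLocation when the user searches
     * for queryLocation: 1.0 for the same place, halved for every level the
     * document's place sits below the query, 0.0 if it is not contained.
     */
    double score(String queryLocation, String docLocation) {
        int depth = selfAndAncestors(docLocation).indexOf(queryLocation);
        return depth < 0 ? 0.0 : 1.0 / (1 << depth);
    }

    public static void main(String[] args) {
        LocationContainment g = new LocationContainment();
        g.addPath("Europe", "France", "Ile-de-France", "Paris");
        for (String doc : new String[] {"France", "Ile-de-France", "Paris", "Europe"}) {
            System.out.println("query=France doc=" + doc + " score=" + g.score("France", doc));
        }
    }
}
```

In a Solr setup the same idea could be applied at index time, by adding a document's ancestor places to a separate field and boosting the exact-match field higher at query time. That mapping to schema.xml is a design suggestion only and is not described anywhere in this thread.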
[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859480#comment-13859480 ]

rashi gandhi commented on LUCENE-2899:
--------------------------------------

OK, thanks Lance.

One more question: I want to design an analyzer that can support location containment relationships, for example Europe->France->Paris.

My requirement is: when a user searches for a country, the results must include documents that mention that country as well as documents that mention states and cities within that country. Documents with the country name itself must have higher relevancy. The containment relationship must hold for up to 4 levels, i.e. Continent->Country->State->City.

Is there any way in OpenNLP to support this type of search? Can the location tagger model be used for this?

Please provide me some pointers to move ahead.

Thanks in advance

> Add OpenNLP Analysis capabilities as a module
> ---------------------------------------------
>
> Key: LUCENE-2899
> URL: https://issues.apache.org/jira/browse/LUCENE-2899
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/analysis
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Priority: Minor
> Fix For: 4.7
>
> Attachments: LUCENE-2899-RJN.patch, LUCENE-2899.patch, OpenNLPFilter.java, OpenNLPTokenizer.java
[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856194#comment-13856194 ]

rashi gandhi commented on LUCENE-2899:
--------------------------------------

Hi,

I have successfully applied LUCENE-2899.patch to Solr 4.5.1 and it is working properly.

Now my requirement is to combine OpenNLP with jwnl. Is it possible to combine OpenNLP with jwnl, and what changes are required in the Solr schema.xml to do so?

Kindly provide some pointers to move ahead.

Thanks in advance

> Add OpenNLP Analysis capabilities as a module
> ---------------------------------------------
>
> Key: LUCENE-2899
> URL: https://issues.apache.org/jira/browse/LUCENE-2899
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/analysis
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Priority: Minor
> Fix For: 4.7
>
> Attachments: LUCENE-2899-RJN.patch, LUCENE-2899.patch, OpenNLPFilter.java, OpenNLPTokenizer.java
[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785216#comment-13785216 ]

rashi gandhi commented on LUCENE-2899:
--------------------------------------

Thanks Zack.

Waiting for a reply from Lance :)

> Add OpenNLP Analysis capabilities as a module
> ---------------------------------------------
>
> Key: LUCENE-2899
> URL: https://issues.apache.org/jira/browse/LUCENE-2899
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/analysis
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Priority: Minor
> Fix For: 5.0, 4.5
>
> Attachments: LUCENE-2899-current.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899-RJN.patch, LUCENE-2899-x.patch, LUCENE-2899-x.patch, LUCENE-2899-x.patch, OpenNLPFilter.java, OpenNLPFilter.java, OpenNLPTokenizer.java, opennlp_trunk.patch
[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784869#comment-13784869 ]

rashi gandhi commented on LUCENE-2899:
--------------------------------------

Hi,

I designed an analyzer using the OpenNLP filters and indexed some data with it.

My problem is: while searching, Solr sometimes returns results and sometimes does not (but the documents are there). For example, if I search for Detail_Nvf:brett, it returns a document, and if I fire the same query again after some time, it returns zero documents.

I do not understand why the Solr results are unstable. Please help me with this.

Thanks in advance

> Add OpenNLP Analysis capabilities as a module
> ---------------------------------------------
>
> Key: LUCENE-2899
> URL: https://issues.apache.org/jira/browse/LUCENE-2899
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/analysis
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Priority: Minor
> Fix For: 5.0, 4.5
>
> Attachments: LUCENE-2899-current.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899-RJN.patch, LUCENE-2899-x.patch, LUCENE-2899-x.patch, LUCENE-2899-x.patch, OpenNLPFilter.java, OpenNLPFilter.java, OpenNLPTokenizer.java, opennlp_trunk.patch
[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782654#comment-13782654 ]

rashi gandhi commented on LUCENE-2899:
--------------------------------------

Hi,

I have applied this patch successfully on the latest Solr 4.x branch, but now I do not understand how to perform contextual searches on the data I have. I need to search on a text field using some NLP process.

I am new to NLP, so I need some help on how to proceed. How do I train a model using this integrated Solr? Do I need to study something else before moving ahead with this?

I designed an analyzer and tried indexing data, but the results are weird and inconsistent.

Kindly provide some pointers to move ahead.

Thanks in advance.

> Add OpenNLP Analysis capabilities as a module
> ---------------------------------------------
>
> Key: LUCENE-2899
> URL: https://issues.apache.org/jira/browse/LUCENE-2899
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/analysis
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Priority: Minor
> Fix For: 5.0, 4.5
>
> Attachments: LUCENE-2899-current.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899-RJN.patch, LUCENE-2899-x.patch, LUCENE-2899-x.patch, LUCENE-2899-x.patch, OpenNLPFilter.java, OpenNLPFilter.java, OpenNLPTokenizer.java, opennlp_trunk.patch