Re: Semantic indexing in Lucene
Diego,

The semanticvectors project has a mailing list, and its author, Dominic Widdows, responds actively there.

Paul

On 24 May 2011, at 02:34, Diego Cavalcanti wrote:
> Sorry, I thought the blog was yours! I will read the post and see if it
> helps me. Thank you!
>
> About the Semantic Vectors project, I certainly know how to get its source
> code. What I said is that I cannot use it through its API alone, because
> the Javadoc does not show all methods. I really do not want to change the
> project's source code. Well... this is not important for this list!
>
> If anyone has another idea about how to implement semantic indexing in
> Lucene, I would be grateful!
>
> []s,
> --
> Diego

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
Re: Semantic indexing in Lucene
I meant to check out the Semantic Vectors project, but never got around to it, so there is nothing on the blog (sujitpal.blogspot.com) that talks about semantic vectors at the moment. It's on my (rather long) todo list, though... Sorry about that...

-sujit

On Mon, 2011-05-23 at 21:22 -0300, Diego Cavalcanti wrote:
> Hi Yiannis,
>
> Thank you for your reply.
>
> Yes, I'm referring to the Semantic Vectors project. Before sending the
> previous email, I read the project API and noticed that most of its
> classes don't contain public methods, so we cannot use the project
> programmatically (only from the command line).
>
> I've seen your blog, but I haven't found any post about semantic indexing
> in Lucene. Can you point it out for me, please?
>
> Thanks,
> --
> Diego
Re: Semantic indexing in Lucene
Sorry, I thought the blog was yours! I will read the post and see if it helps me. Thank you!

About the Semantic Vectors project, I certainly know how to get its source code. What I said is that I cannot use it through its API alone, because the Javadoc does not show all methods. I really do not want to change the project's source code. Well... this is not important for this list!

If anyone has another idea about how to implement semantic indexing in Lucene, I would be grateful!

[]s,
--
Diego

On Mon, May 23, 2011 at 21:30, Yiannis Gkoufas wrote:
> It's not my blog! :D
> I used some of the ideas in this article
> http://sujitpal.blogspot.com/2009/03/vector-space-classifier-using-lucene.html
> in order to perform classification with Lucene for my tasks.
> You can get full access to the source code of the project by typing on the
> command line:
>
> svn checkout http://semanticvectors.googlecode.com/svn/trunk/ semanticvectors-read-only
>
> Or you can access the trunk directly at the URL
> http://semanticvectors.googlecode.com/svn/trunk/
Re: Semantic indexing in Lucene
It's not my blog! :D I used some of the ideas in this article:

http://sujitpal.blogspot.com/2009/03/vector-space-classifier-using-lucene.html

in order to perform classification with Lucene for my tasks.

You can get full access to the source code of the project by typing on the command line:

svn checkout http://semanticvectors.googlecode.com/svn/trunk/ semanticvectors-read-only

Or you can access the trunk directly at the URL http://semanticvectors.googlecode.com/svn/trunk/

On Tue, May 24, 2011 at 3:22 AM, Diego Cavalcanti wrote:
> Hi Yiannis,
>
> Thank you for your reply.
>
> Yes, I'm referring to the Semantic Vectors project. Before sending the
> previous email, I read the project API and noticed that most of its
> classes don't contain public methods, so we cannot use the project
> programmatically (only from the command line).
>
> I've seen your blog, but I haven't found any post about semantic indexing
> in Lucene. Can you point it out for me, please?
>
> Thanks,
> --
> Diego
Re: Semantic indexing in Lucene
Hi Yiannis,

Thank you for your reply.

Yes, I'm referring to the Semantic Vectors project. Before sending the previous email, I read the project API and noticed that most of its classes don't contain public methods, so we cannot use the project programmatically (only from the command line).

I've seen your blog, but I haven't found any post about semantic indexing in Lucene. Can you point it out for me, please?

Thanks,
--
Diego

On Mon, May 23, 2011 at 21:17, Yiannis Gkoufas wrote:
> Hi Diego,
>
> Are you referring to this project: http://code.google.com/p/semanticvectors/ ?
> If yes, then documentation exists here:
> http://semanticvectors.googlecode.com/svn/javadoc/latest-stable/index.html
> Also, I think this blog might interest you: http://sujitpal.blogspot.com/
> and the project related to it: http://jtmt.sf.net/
>
> BR,
> Yiannis
Re: Semantic indexing in Lucene
Hi Diego,

Are you referring to this project: http://code.google.com/p/semanticvectors/ ? If yes, then documentation exists here: http://semanticvectors.googlecode.com/svn/javadoc/latest-stable/index.html

Also, I think this blog might interest you: http://sujitpal.blogspot.com/ and the project related to it: http://jtmt.sf.net/

BR,
Yiannis

On Tue, May 24, 2011 at 3:09 AM, Diego Cavalcanti wrote:
> Hello,
>
> I have a project which indexes and scores documents using Lucene. However,
> I'd like to do that using semantic indexing (LSI, LSA, or Semantic Vectors).
>
> I've read old posts, and some people said that Semantic Vectors plays well
> with Lucene. However, I noticed that its classes are usable only from the
> command line (through a main method) instead of through an API.
>
> So, I'd like to know if anyone can suggest another approach so that I
> could use semantic indexing in Lucene.
>
> Thanks,
> Diego
Semantic indexing in Lucene
Hello,

I have a project which indexes and scores documents using Lucene. However, I'd like to do that using semantic indexing (LSI, LSA, or Semantic Vectors).

I've read old posts, and some people said that Semantic Vectors plays well with Lucene. However, I noticed that its classes are usable only from the command line (through a main method) instead of through an API.

So, I'd like to know if anyone can suggest another approach so that I could use semantic indexing in Lucene.

Thanks,
Diego
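For readers landing on this thread: before reaching for LSI or LSA, it helps to see the plain vector-space model those techniques build on. The following is a minimal, self-contained Java sketch (not Lucene or Semantic Vectors code; the class and method names are made up for illustration) that scores two texts by cosine similarity over term-frequency vectors, the same similarity measure LSI applies after dimensionality reduction.

```java
import java.util.HashMap;
import java.util.Map;

public class CosineSketch {

    // Build a sparse term-frequency vector from whitespace-tokenized text.
    static Map<String, Integer> tf(String text) {
        Map<String, Integer> v = new HashMap<>();
        for (String t : text.toLowerCase().split("\\s+")) {
            v.merge(t, 1, Integer::sum);
        }
        return v;
    }

    // Cosine similarity between two sparse term vectors:
    // dot(a, b) / (|a| * |b|), iterating only over the non-zero entries.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, na = 0, nb = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer w = b.get(e.getKey());
            if (w != null) dot += e.getValue() * w;
            na += e.getValue() * e.getValue();
        }
        for (int w : b.values()) nb += (double) w * w;
        return dot == 0 ? 0 : dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    public static void main(String[] args) {
        System.out.println(cosine(tf("lucene semantic indexing"),
                                  tf("semantic indexing with lucene")));
    }
}
```

LSI then differs only in that it first projects these term vectors into a lower-dimensional "concept" space via SVD before taking the cosine.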
FastVectorHighlighter - can FieldFragList expose fragInfo?
Hello,

My version: Lucene 3.1.0

I've had to customize the snippet for highlighting based on our application requirements. Specifically, instead of the snippet being a set of relevant fragments from the text, I need it to be the first sentence where a match occurs, with a fixed size from the beginning of the sentence.

For this, I built (in my application code, using the Lucene jars) a custom FragmentsBuilder, subclassing SimpleFragmentsBuilder and overriding createFragment(IndexReader reader, int docId, String fieldName, FieldFragList fieldFragList). However, FieldFragList does not allow access to its List member variable. I changed this locally to be public so my subclass can access it, i.e.:

public List fragInfos = new ArrayList();

Once this is done, my createFragment method can get at the fragInfos from the passed-in fieldFragList and iterate through its WeightedFragInfo.SubInfo.Toffs to get the term offsets, which I then use to calculate and highlight my snippet. (I can provide the code if it makes things clearer, but that's the gist.)

So my question is: would it be feasible to make the FieldFragList.fragInfos variable public in a future release? If not, is there some other way to do what I need to do?

Thanks very much,
Sujit

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
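The snippet logic described above can be sketched in plain Java, independent of the Lucene types involved. This is a hypothetical illustration (the class name, method, and the matchStart parameter, which stands in for an offset pulled out of WeightedFragInfo.SubInfo.Toffs, are all made up): given the field text and the start offset of the first match, it backs up to the start of the containing sentence and returns a fixed-size snippet from there.

```java
public class SentenceSnippet {

    // Return a snippet that starts at the beginning of the sentence
    // containing the match at matchStart and runs for at most maxLen chars.
    static String snippet(String text, int matchStart, int maxLen) {
        // Scan backwards from the match for the previous sentence terminator.
        int sentStart = 0;
        for (int i = matchStart - 1; i >= 0; i--) {
            char c = text.charAt(i);
            if (c == '.' || c == '!' || c == '?') {
                sentStart = i + 1;
                break;
            }
        }
        // Skip the whitespace that follows the terminator.
        while (sentStart < text.length()
                && Character.isWhitespace(text.charAt(sentStart))) {
            sentStart++;
        }
        int end = Math.min(text.length(), sentStart + maxLen);
        return text.substring(sentStart, end);
    }
}
```

In the real FragmentsBuilder subclass, matchStart would come from the first term offset in the FieldFragList, which is exactly the access the poster is asking for.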
Re: QueryParser/StopAnalyzer question
Hi Erick,

I think the answer to this question depends on which hat you put on. If you put on the search-engine hat (or do similar things in, e.g., Google), the results will be the same as what Lucene does at the moment. And that's fair enough: in the search-engine world, getting more results is almost always better than getting fewer. Even if a bunch of slightly irrelevant results is returned, nobody cares.

But if you put on a database hat, the world view suddenly changes. I am sure there are plenty of people who use Lucene in situations where they need exact matches and any excess results are not desirable.

The root of the evil here comes from the fact that stopwords are not indexed, so reasonable defaults have to be assumed in different situations. Thinking about it, returning all data for a stopword-only query would probably be least expected, and I don't disagree with your argument about the mixed case either.

This probably leaves me with a single option, which is not to use stopwords at all, allowing me to get the best of both worlds. Does anyone have any experience of roughly how much of an increase in index size I can expect?

Regards,
Mindaugas

On Mon, May 23, 2011 at 3:13 PM, Erick Erickson wrote:
> Hmmm, somehow I missed this days ago...
>
> Anyway, the Lucene query parsing process isn't quite Boolean logic.
> I encourage you to think in terms of "required", "optional", and
> "prohibited".
>
> Anyway, I don't think you really want this behavior in the
> stopword-removal case. If you can post some use cases where this
> would be desirable, maybe we can noodle about a solution.
>
> Best,
> Erick
Re: QueryParser/StopAnalyzer question
Hmmm, somehow I missed this days ago...

Anyway, the Lucene query parsing process isn't quite Boolean logic. I encourage you to think in terms of "required", "optional", and "prohibited".

Both queries are equivalent; to see this, try attaching &debugQuery=on to your URL and look at the "parsed query" in the debug info.

Anyway, to your question:

+foo:bar +baz:"there is"

reads that "bar" must appear in the field "foo". So far so good. But it's also required that baz contain the empty clause, which is different from saying baz must be empty. One can argue that any field contains, by definition, nothing.

But imagine the impact of what you're requesting. If all stopwords get removed, then no query would ever match yours. Which would be very counter-intuitive, IMO. Your users have no clue that you've removed stopwords, so they'll sit there saying "Look, I KNOW that 'bar' was in foo and I KNOW that 'there is' was in baz, why the heck didn't this cursed system find my doc?"

Anyway, I don't think you really want this behavior in the stopword-removal case. If you can post some use cases where this would be desirable, maybe we can noodle about a solution.

Best,
Erick

2011/5/23 Mindaugas Žakšauskas:
> Not much luck so far :(
>
> Just in case anyone wants to earn some virtual dosh, I have added
> some 50 bonus points to this question on StackOverflow:
>
> http://stackoverflow.com/questions/6044061/lucene-query-parsing-behaviour-joining-query-parts-with-and
>
> I also promise to post a solution here if anything satisfactory turns up.
>
> m.
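The clause-dropping behavior discussed in this thread can be illustrated with a small stand-alone sketch (a hypothetical class, not Lucene code): when every term of a required sub-query is a stopword, analysis leaves no terms, so the parser contributes no clause for it at all, and the remaining required clauses decide the result on their own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class ClausePruning {

    // A tiny stopword set standing in for StopAnalyzer.ENGLISH_STOP_WORDS_SET.
    static final Set<String> STOPWORDS = Set.of("there", "is", "the", "a");

    // Analyze a phrase: lowercase, split on whitespace, drop stopwords,
    // as a stopword-filtering analyzer would do at parse time.
    static List<String> analyze(String phrase) {
        List<String> terms = new ArrayList<>();
        for (String t : phrase.toLowerCase().split("\\s+")) {
            if (!STOPWORDS.contains(t)) terms.add(t);
        }
        return terms;
    }

    // Mimic BooleanQuery construction: a required sub-query whose analysis
    // produced no terms contributes no clause at all, so it cannot veto
    // matches. +foo:bar +baz:"there is" therefore degenerates to +foo:bar.
    static List<List<String>> buildRequiredClauses(String... phrases) {
        List<List<String>> clauses = new ArrayList<>();
        for (String p : phrases) {
            List<String> terms = analyze(p);
            if (!terms.isEmpty()) clauses.add(terms); // empty clause silently dropped
        }
        return clauses;
    }
}
```

The "database hat" semantics the original poster expected would instead keep the empty clause and let it match nothing, failing the whole conjunction.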
Re: # search in Query
Are you sure that it isn't working? If you use the same analyzer at both indexing and query time, you should end up with consistent results. Read up on exactly what your analyzer is doing by looking at the javadocs. Google will find you lots of info on analysis, or get hold of a copy of Lucene in Action, 2nd edition, to learn all about Lucene. And use Luke to see what is being indexed.

--
Ian.

On Mon, May 23, 2011 at 12:44 PM, Yogesh Dabhi wrote:
> I have values like the following in a Lucene index field:
>
> 1#abcd
> 2#test wer
> 3# testing rty
>
> I write the query like this:
>
> +fieldname:1#
>
> After the query parser, I see the query string has become:
>
> +fieldname:1
>
> Is there a way to search for the given string?
>
> Thanks & Regards,
> Yogesh
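Ian's point about analyzer choice is the crux here: the "#" disappears because a standard-style analyzer discards punctuation, while a whitespace-based one keeps it. This can be shown with a self-contained Java sketch (the class and methods are hypothetical approximations, not Lucene's actual analyzers):

```java
import java.util.ArrayList;
import java.util.List;

public class TokenizerContrast {

    // A StandardAnalyzer-like pass: split on anything that is not a letter
    // or digit, so '#' vanishes and "1#" can never exist as an indexed term.
    static List<String> alphanumericTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }

    // A WhitespaceAnalyzer-like pass: split only on whitespace, so '#'
    // survives inside tokens and "1#abcd" is indexed as a single term.
    static List<String> whitespaceTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.split("\\s+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }
}
```

So if "#" must be searchable, the field has to be indexed (and queried) with an analyzer from the second family, applied consistently on both sides.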
Re: FastVectorHighlighter StringIndexOutofBounds bug
(11/05/23 14:36), Weiwei Wang wrote:
> 1. source string: 7
> 2. WhitespaceTokenizer + EGramTokenFilter
> 3. FastVectorHighlighter
> 4. debug info: subInfos=(777((8,11))777((5,8))777((2,5)))/3.0(2,102),
> srcIndex is not correctly computed for the second loop of the outer for-loop

What does your query look like? And what is EGramTokenFilter? Is it NGramTokenFilter? If so, what are the min and max gram sizes? Note that FVH has a restriction: the min and max gram sizes must be equal (i.e. min=1 and max=3 cannot be supported by FVH).

koji
--
http://www.rondhuit.com/en/
# search in Query
I have values like the following in a Lucene index field:

1#abcd
2#test wer
3# testing rty

I write the query like this:

+fieldname:1#

After the query parser, I see the query string has become:

+fieldname:1

Is there a way to search for the given string?

Thanks & Regards,
Yogesh
Re: QueryParser/StopAnalyzer question
Not much luck so far :(

Just in case anyone wants to earn some virtual dosh, I have added some 50 bonus points to this question on StackOverflow:

http://stackoverflow.com/questions/6044061/lucene-query-parsing-behaviour-joining-query-parts-with-and

I also promise to post a solution here if anything satisfactory turns up.

m.

2011/5/17 Mindaugas Žakšauskas:
> Hi,
>
> Let's say we have an index with a few documents indexed using
> StopAnalyzer.ENGLISH_STOP_WORDS_SET. The user issues two queries:
> 1) foo:bar
> 2) baz:"there is"
>
> Let's assume that the first query yields some results because there
> are documents matching that query.
>
> The second query contains two stopwords ("there" and "is") and yields
> 0 results. The reason is that when baz:"there is" is parsed, it ends up
> as a void query, since both "there" and "is" are stopwords (technically
> speaking, it is converted to an empty BooleanQuery having no clauses).
> So far so good.
>
> However, either of the following combined queries
>
> +foo:bar +baz:"there is"
> foo:bar AND baz:"there is"
>
> behaves exactly the same way as the query +foo:bar, that is, it brings
> back some results. The second AND part, which is supposed to yield no
> results, is completely ignored.
>
> One might argue that when ANDing, both conditions have to be met; that
> is, documents having foo=bar and baz being empty have to be retrieved,
> since, when issued separately, baz:"there is" yields 0 results.
>
> It seems contradictory that an atomic query component has a different
> impact on the overall query depending on the context. Is there any
> logical explanation for this? Can this be addressed in any way,
> preferably without writing my own QueryAnalyzer?
>
> If this makes any difference, the observed behaviour happens under
> Lucene v3.0.2.
>
> Regards,
> Mindaugas
Re: stop the search
Thanks a lot.

I tried to debug a long query to see when it gets to the collector. I thought it would be better to catch the "stop" action in the search itself and not in the top-doc collector, as I would assume the search action takes a long time to finish, and once we get to the top-doc collector it returns immediately (I take only the top 100 results).

I saw that it gets there after a long time; it first "gets stuck" in a wait function. I use MultiSearcher; any idea why that happens?

Many thanks,
Liat

On 23 May 2011 02:48, Simon Willnauer wrote:
> The simplest way would be a CollectorDelegate that wraps an existing
> collector and checks a boolean before calling the delegate's collect
> method.
>
> simon
>
> On Mon, May 23, 2011 at 8:09 AM, liat oren wrote:
> > Thank you very much.
> >
> > So the best solution would be to implement the collector with a stop
> > function. Do you happen to have an example of that?
> >
> > Many thanks,
> > Liat
> >
> > On 22 May 2011 13:19, Simon Willnauer wrote:
> >> On Sun, May 22, 2011 at 4:48 PM, Devon H. O'Dell wrote:
> >> > I have my own collector, but implemented this functionality by running
> >> > the search in a thread pool and terminating the FutureTask running the
> >> > job if it took longer than some configurable amount of time. That
> >> > seemed to do the trick for me. (In my case, the IndexReader is
> >> > explicitly opened read-only, so I'm not too worried about it.)
> >>
> >> This can be super dangerous if you use Future.cancel(), i.e.
> >> Thread.interrupt(). If the interrupt happens while you are reading
> >> from an NIO FileDescriptor, the channel will be closed, and Lucene
> >> cannot recover from that state if the file has already been merged
> >> away. Your Reader will get ChannelAlreadyClosed exceptions for any
> >> subsequent access. You should prevent this.
> >> See the FSDirectory Javadoc:
> >> http://lucene.apache.org/java/3_1_0/api/core/org/apache/lucene/store/FSDirectory.html
> >>
> >> simon
> >>
> >> > 2011/5/22 Simon Willnauer:
> >> >> You can implement your own collector and notify the collector to
> >> >> stop if you need to.
> >> >> simon
> >> >>
> >> >> On Sun, May 22, 2011 at 12:06 PM, liat oren wrote:
> >> >>> Hi Everyone,
> >> >>>
> >> >>> Is there a way to stop a multi search in the middle?
> >> >>>
> >> >>> Thanks a lot,
> >> >>> Liat
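The delegating-collector idea Simon suggests in the quoted thread can be sketched in a few lines of plain Java. This is a toy stand-in, not Lucene code: the Collector interface below is a minimal made-up version of Lucene's, and the stop flag unwinds the search loop by throwing, which the caller is expected to catch.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class StoppableCollector {

    // Minimal stand-in for Lucene's Collector: one callback per matching doc.
    interface Collector {
        void collect(int doc);
    }

    // Thrown to abort collection once the stop flag is set; the code that
    // drives the search catches this and returns whatever was collected.
    static class StopException extends RuntimeException {}

    // Wrap a collector so that every collect() call first checks a shared
    // stop flag, which another thread may flip at any time.
    static Collector stoppable(Collector delegate, AtomicBoolean stop) {
        return doc -> {
            if (stop.get()) throw new StopException();
            delegate.collect(doc);
        };
    }
}
```

A search driven through the wrapped collector then stops at the next hit after the flag is set, which avoids the Thread.interrupt() danger Simon describes for NIO-backed directories.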