Re: search match documents and pagination in lucene3.x

2011-09-21 Thread Ian Lea
> I want to implement pagination like Google's search result page in my project. We use Lucene 3.3. Here are 2 issues:
> 1. How can I get the number of matched documents?
TopDocs.totalHits
> 2. What is the best practice for Lucene search results pagination?
http://wiki.apache.org/lucene-java/Lu

Re: Re: search match documents and pagination in lucene3.x

2011-09-21 Thread janwen
Thanks, Ian. The number of results matching the query is the return value of TopDocs search(Query query, int n); the n argument passed to the method means that Lucene only searches for the top n results. But I want the number of matches across the whole index, not the n I pass into the TopDocs

Re: search match documents and pagination in lucene3.x

2011-09-21 Thread Manish Bafna
You can use SOLR.
On Wed, Sep 21, 2011 at 1:37 PM, Ian Lea wrote:
>> I want to implement pagination like Google's search result page in my project. We use Lucene 3.3. Here are 2 issues:
>> 1. How can I get the number of matched documents
> TopDocs.totalHits
>> 2. What is the best practice

Re: Re: search match documents and pagination in lucene3.x

2011-09-21 Thread janwen
That is impossible. We only use Lucene in my project at the moment. Maybe we will switch to Solr in the future. Anyway, thanks for your advice.
2011-09-21 janwen | China website: http://www.qianpin.com/
From: Ma

Re: Re: search match documents and pagination in lucene3.x

2011-09-21 Thread Ian Lea
The n in search(Query query, int n) specifies the number of hits (docs) you want returned. There is no point in lucene accumulating 10,000 if you only want 10 for the first page. TopDocs.totalHits tells you the total number of hits for the query. If you actually want all the hits, see the thread
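The paging arithmetic implied by this reply can be sketched in plain Java. This is only an illustration under stated assumptions: `PageWindow`, `hitsToRequest`, and `of` are hypothetical names, not Lucene classes. The idea is to request just enough hits to cover the current page (pass the result of `hitsToRequest` as n), read the overall count from TopDocs.totalHits, and display only the slice of scoreDocs that belongs to the page.

```java
// Hypothetical helper, not part of Lucene: computes how many hits to ask for
// and which slice of the returned hits belongs to the requested page.
final class PageWindow {
    final int from;       // index of the first hit on the page (inclusive)
    final int to;         // index just past the last hit on the page (exclusive)
    final int totalPages; // number of pages needed to show every hit

    private PageWindow(int from, int to, int totalPages) {
        this.from = from;
        this.to = to;
        this.totalPages = totalPages;
    }

    // Value to pass as n in search(Query query, int n) for a 1-based page number.
    static int hitsToRequest(int page, int pageSize) {
        if (page < 1 || pageSize < 1) {
            throw new IllegalArgumentException("page and pageSize must be >= 1");
        }
        return Math.multiplyExact(page, pageSize);
    }

    // totalHits is the value reported by the search (TopDocs.totalHits in Lucene).
    static PageWindow of(int page, int pageSize, int totalHits) {
        if (totalHits < 0) {
            throw new IllegalArgumentException("totalHits must be >= 0");
        }
        int end = hitsToRequest(page, pageSize);
        int from = Math.min(end - pageSize, totalHits);
        int to = Math.min(end, totalHits);
        int totalPages = (int) (((long) totalHits + pageSize - 1) / pageSize);
        return new PageWindow(from, to, totalPages);
    }
}
```

With this sketch, page 3 at 10 hits per page would request 30 hits and show indexes 20 up to (but not including) `to`, while the "About N results" line comes from the total count rather than from n.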

Re: Re: Re: search match documents and pagination in lucene3.x

2011-09-21 Thread janwen
Google's results page shows this info: About 439,000,000 results (0.21 seconds), so I think any application will show the total number of matched results to the user. Does Lucene need to think about implementing this function? Haha, thanks Ian. I will search the mail archive.
2011-09-21 janwen | China website

Re: Re: Re: search match documents and pagination in lucene3.x

2011-09-21 Thread Mihai Caraman
totalHits = searcher.search(query, searcher.maxDoc()).scoreDocs.length;

Higher rank for closer matches

2011-09-21 Thread Akos Tajti
Dear List, for multi-term expressions I'd like to rank documents higher when the matches are closer to each other. For example, for the search term "like eating", the string "I like eating" should come before "I like some eating". Is this possible? Thanks in advance, Ákos Tajti

Missing Facet link

2011-09-21 Thread Mihai Caraman
On the documentation page http://lucene.apache.org/java/3_4_0/ there's no link in the contrib section to http://lucene.apache.org/java/3_4_0/api/contrib-facet/index.html

Re: Missing Facet link

2011-09-21 Thread Shai Erera
You're right Mihai. I noticed that only after the release and fixed it yesterday. The link will still be missing from the release page though ... unless someone knows how to change it now.
Shai
On Wed, Sep 21, 2011 at 2:54 PM, Mihai Caraman wrote:
> at the documentation page http://lucene.apache

Re: Higher rank for closer matches

2011-09-21 Thread Em
Ákos,
have a look at SpanNearQuery. This is what you want. If you own the 2nd Edition of Lucene in Action, have a look at its examples. They illustrate how to combine span queries with the classical queries.
Regards, Em
On 21.09.2011 13:46, Akos Tajti wrote:
> Dear List,
> for multi term expression

Re: Higher rank for closer matches

2011-09-21 Thread Akos Tajti
Thanks, I will check SpanNearQuery!
Regards, Ákos
On Wed, Sep 21, 2011 at 2:20 PM, Em wrote:
> Ákos,
> have a look at SpanNearQuery. This is what you want.
> If you own the 2nd Edition of Lucene in Action have a look at their examples. It illustrates how to combine them with the classica

Re: Higher rank for closer matches

2011-09-21 Thread Erik Hatcher
PhraseQuery suffices for the stated requirement of boosting when query terms are closer. A common technique is to incorporate a PhraseQuery with a large slop factor of the query terms into the query automatically, which implicitly boosts matching documents when the query terms are closer. A Sp
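Why a sloppy phrase implicitly favours closer terms can be sketched in plain Java. As far as I know, Lucene 3.x's DefaultSimilarity weights a sloppy phrase match by 1 / (distance + 1); the rest of this sketch is an assumption made for illustration: `ProximityWeight`, `minGap`, and `weight` are hypothetical names, tokens are split on whitespace, and only in-order matches are considered (a real sloppy PhraseQuery can also match reordered terms).

```java
import java.util.Locale;

// Hypothetical illustration, not Lucene code: closer term pairs get a larger weight.
final class ProximityWeight {

    // Same shape as the sloppy-frequency formula: 1 / (distance + 1).
    static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    // Smallest number of tokens sitting between `first` and a later `second`,
    // or -1 if `second` never appears after `first`.
    static int minGap(String text, String first, String second) {
        String[] tokens = text.toLowerCase(Locale.ROOT).split("\\s+");
        int best = -1;
        int lastFirst = -1;
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(first)) {
                lastFirst = i;
            } else if (tokens[i].equals(second) && lastFirst >= 0) {
                int gap = i - lastFirst - 1;
                if (best < 0 || gap < best) {
                    best = gap;
                }
            }
        }
        return best;
    }

    // Weight of the best in-order match within `slop`, or 0 if there is none.
    static float weight(String text, String first, String second, int slop) {
        int gap = minGap(text, first, second);
        return (gap < 0 || gap > slop) ? 0f : sloppyFreq(gap);
    }
}
```

Under these assumptions, "i like eating" gets a larger weight for the terms "like" and "eating" than "I like some eating", which is the ordering asked for at the start of the thread.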

Custom Score-Similarity

2011-09-21 Thread Darshan . Pandit
Hi, I need some insight into how the document matching actually happens and how the score is generated. The basic need driving this desire is to search on multiple fields using boolean queries, and computing similarity score, for each field for each hit document so as to devise a method to smar

FacetedSearch DrillDown

2011-09-21 Thread Mihai Caraman
Hello gurus,
Cutting to the chase, I index this: CategoryPath(lvl1,lvl2,lvl3)
I want to group things as deep as lvl3. Which should be more efficient:
*search for categoryPath(lvl1) to get lvl2 results: search lvl2 number of times for categoryPath(lvl1,lvl2) to get lvl3 results* ?
or
*search drilldo

Problem with BooleanQuery

2011-09-21 Thread Peyman Faratin
Hi
The problem I would like to solve is determining the Lucene score of a word in _a particular_ given document. The 2 candidates I have been trying are
- QueryWrapperFilter
- BooleanQuery
Both are to restrict search within a search space. But according to Doug Cutting QueryWrapperFilter opti

Re: Problem with BooleanQuery

2011-09-21 Thread Ian Lea
How is the "title" field indexed? Seems likely it is analyzed in which case a TermQuery won't match because "list of newspapers in New York" would be analyzed into terms "list", "newspapers", "new", "york" assuming things were lowercased, stop words removed etc. Maybe you need your "word" as Term
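The mismatch described here can be sketched without Lucene. The analysis below is a deliberately simplified stand-in (lowercasing, splitting on non-alphanumerics, and a small made-up stop list); `SimpleAnalysis`, `analyze`, and `exactTermMatches` are hypothetical names, and a real analyzer's output may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of why a single whole-title term finds nothing
// in an analyzed field. This is not a Lucene analyzer.
final class SimpleAnalysis {
    // Made-up stop list, for illustration only.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("of", "in", "the", "a", "an"));

    // Lowercase, split on anything that is not a letter or digit, drop stop words.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // A term-style lookup compares the query term to indexed terms exactly.
    static boolean exactTermMatches(List<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm);
    }
}
```

In this sketch the analyzed title yields the terms list, newspapers, new, york, so an exact lookup for the whole title string fails while a lookup for "york" succeeds. If the field were stored as one unanalyzed term, the whole title would be the only term that matches.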

Re: Higher rank for closer matches

2011-09-21 Thread Em
Hi Erik,
could you explain why PhraseQuery performs better than SpanNearQuery? Some time has passed since I read about it, however I think it was exactly the other way round.
Thanks! Em
On 21.09.2011 15:56, Erik Hatcher wrote:
> PhraseQuery suffices for the stated requirement of boosting whe

Re: Higher rank for closer matches

2011-09-21 Thread Erik Hatcher
SpanNearQuery does more work than PhraseQuery - it keeps track of all matching spans, whereas PhraseQuery does not. Whether the performance difference will be relevant depends on your environment and data - so it may not be a big deal at all.
Erik
On Sep 21, 2011, at 10:44, Em wrote

Re: Higher rank for closer matches

2011-09-21 Thread Em
Thanks, Erik. If PhraseQuery does not keep track of all matching spans, how does it do its work (in comparison to SpanNearQuery)?
Regards, Em
On 21.09.2011 19:52, Erik Hatcher wrote:
> SpanNearQuery does more work than PhraseQuery - it keeps track of all matching spans, whereas PhraseQuery d

Re: Higher rank for closer matches

2011-09-21 Thread Tajti Ákos
Thanks for the hints, guys. I implemented the SpanNearQuery approach because we might need the matching spans information.
Regards, Ákos
On 2011.09.21., at 20:08, Em wrote:
> Thanks, Erik.
> If PhraseQuery does not keep track of all matching spans, how does it do its work (in comparison to

Re: FacetedSearch DrillDown

2011-09-21 Thread Shai Erera
Can you please clarify the question? What do you mean "up to lvl3"? Let me try with an example: if you index two documents, one with category [l1, l2, l3] and one with [l1, l2], and you ask to count "l1", you will get 2. If you ask to count [l1, l2, l3] you will get 1, as only one document is asso
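The counting behaviour in this example can be sketched in plain Java, assuming that a document is counted for a category when its path starts with that category's path. `FacetCounts` and `count` are hypothetical names used for illustration and are not the facet module's API.

```java
import java.util.List;

// Hypothetical illustration of prefix-based category counting.
final class FacetCounts {

    // Counts documents whose category path starts with the given prefix.
    static int count(List<List<String>> docPaths, List<String> prefix) {
        int matches = 0;
        for (List<String> path : docPaths) {
            boolean longEnough = path.size() >= prefix.size();
            if (longEnough && path.subList(0, prefix.size()).equals(prefix)) {
                matches++;
            }
        }
        return matches;
    }
}
```

For the two documents in the example, counting [l1] covers both documents and counting [l1, l2, l3] covers only the first, matching the numbers given in the message.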

Re: Problem with BooleanQuery

2011-09-21 Thread Peyman Faratin
Hi Ian,
I am not analyzing the title:
Field titleField = new Field("title", article.getTitle(), Field.Store.YES, Field.Index.NOT_ANALYZED);
Do you think BooleanQuery is the right approach for solving the problem (finding the Lucene score of a word or a phrase in _a_ particular document)? thanks for

Re: FacetedSearch DrillDown

2011-09-21 Thread Mihai Caraman
2011/9/21 Shai Erera
> What do you mean "up to lvl3"?
"as *deep* as lvl3" :P
In this example, let's look at these levels as a tree (like an n-ary tree) with its root in a unique value at (the top) lvl 1
> ..one with category [l1, l2, l3] and one with [l1, l2],
All documents have the same depth (of categor

Re: FacetedSearch DrillDown

2011-09-21 Thread Em
Hi Mihai, what about having an extra field per level? doc1: [day:monday], [hour:11pm], [minute:22], [second:00], [year:2011], [month:October], [calendar day:11]... This way you do not need to hack and you can easily extend your format if you want to add new dimensions in future. I did not work

Re: Higher rank for closer matches

2011-09-21 Thread Erik Hatcher
SpanNearQuery is a different kind of beast than PhraseQuery... it matches when its nested SpanQuery clauses are in proximity. So it is like multiple PhraseQuery instances and checking proximities of those with one another... or proximity with any other type of SpanQuery.
On Sep 21, 2011, at 11:08 , Em wro

MoreLikeThis Interface changes

2011-09-21 Thread Scott Smith
I'm updating my lucene code from 3.0 to 3.4. There's a change in the MLT interface I'm confused about. I used the MLT.like(InputStream) method. It now appears I should change to the MLT.like(InputStreamReader, fieldname) method. Easy enough to create an InputStreamReader from an InputStream.
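The wrapping step mentioned here can be sketched with the standard library alone. The helper names `StreamToReader`, `asReader`, and `readAll` are hypothetical, and UTF-8 is an assumption: the charset should be whatever the stream was actually written in, since relying on the platform default can silently change the text that reaches the analyzer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: turns a byte stream into a character stream.
final class StreamToReader {

    // Assumes the bytes are UTF-8; substitute the real charset of the stream.
    static Reader asReader(InputStream in) {
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // Drains a reader into a String, mainly to check the decoding.
    static String readAll(Reader reader) {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        try {
            int read;
            while ((read = reader.read(buffer)) != -1) {
                text.append(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }
}
```

The reader returned by `asReader` is what would be handed to a Reader-based method, together with the field name that the message says the new signature takes.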

Re: MoreLikeThis Interface changes

2011-09-21 Thread Robert Muir
On Wed, Sep 21, 2011 at 5:17 PM, Scott Smith wrote:
> I'm updating my lucene code from 3.0 to 3.4. There's a change in the MLT interface I'm confused about. I used the MLT.like(InputStream) method. It now appears I should change to the MLT.like(InputStreamReader, fieldname) method. Ea