looking for a BooleanMatcher instead of BooleanScorer

Li Li Fri, 01 Jun 2012 00:37:08 -0700

hi all,
    I am looking for a 'BooleanMatcher' in Lucene. For many
applications, we don't need to order matched documents by relevance score;
we just want the boolean query semantics. But BooleanScorer/BooleanScorer2
is a little heavy for that, since both are built for relevance scoring.
    One use case: we have some fields with a very small number
of tokens (usually only one word), such as id, tag or something similar.
    But we need queries like this: id in (1,3,5.....). If we use a
BooleanQuery (id:1 id:3 id:5 ...), BooleanScorer can only apply to 31
terms, and BooleanScorer2 uses a priority queue to track how many terms are
matched (Coord).
    Filters may help, but the query can be very complicated (or else the
filter itself still uses BooleanQuery, so the problem is recursive).
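
    As a rough sketch of the "id in (...)" case (this is not Lucene
code: the in-memory postings map and the matchAny name are made up for
illustration), a pure matcher only needs to OR the doc ids of every term
into a bit set. There is no score, no coord and no priority queue, and
the number of terms is not limited:

```java
import java.util.BitSet;
import java.util.List;
import java.util.Map;

class TermsInSetSketch {
    // Hypothetical toy postings: term -> doc ids that contain the term.
    // A real index would read these from disk; this is only an assumption
    // to keep the example self-contained.
    static BitSet matchAny(Map<String, int[]> postings, List<String> terms, int maxDoc) {
        BitSet hits = new BitSet(maxDoc);
        for (String term : terms) {
            int[] docs = postings.get(term);
            if (docs == null) {
                continue; // term not in the index, nothing to match
            }
            for (int doc : docs) {
                hits.set(doc); // record the match only, no scoring work
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, int[]> idPostings = Map.of(
            "1", new int[] {0},
            "3", new int[] {2},
            "5", new int[] {4},
            "7", new int[] {6});
        // id in (1, 3, 5, 9): "9" is not indexed and is simply skipped
        BitSet hits = matchAny(idPostings, List.of("1", "3", "5", "9"), 8);
        System.out.println(hits); // prints {0, 2, 4}
    }
}
```

    The cost is one pass over each term's doc list plus a bit set of
maxDoc bits, which is roughly how a Filter-style implementation behaves.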


    We could split the current BooleanScorer into a BooleanMatcher and a
Ranker. If we need to score the hit docs, we ask the BooleanScorer for
not only the hit doc ids but also tf/idf, coord or anything else we need for
ranking. But sometimes we only need docIds; then the BooleanMatcher
can optimize its implementation. For the case of many disjunction
terms, it can work like a Filter or BooleanScorer instead of
BooleanScorer2.
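
    A minimal sketch of such a split, under the assumption that matches
fit in a bit set. The BooleanMatcher and Ranker interfaces and the
term/or/and/search helpers below are hypothetical names for
illustration, not existing Lucene classes:

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class MatcherRankerSketch {
    /** Hypothetical: produces matching doc ids only, keeps no scoring state. */
    interface BooleanMatcher {
        BitSet match();
    }

    /** Hypothetical: scores a doc that is already known to match. */
    interface Ranker {
        float score(int doc);
    }

    // Leaf clause: the docs of a single term (toy stand-in for a postings list).
    static BooleanMatcher term(int... docs) {
        return () -> {
            BitSet bits = new BitSet();
            for (int doc : docs) {
                bits.set(doc);
            }
            return bits;
        };
    }

    // Disjunction: union of all clauses.
    static BooleanMatcher or(BooleanMatcher... clauses) {
        return () -> {
            BitSet out = new BitSet();
            for (BooleanMatcher clause : clauses) {
                out.or(clause.match());
            }
            return out;
        };
    }

    // Conjunction: intersection of all clauses (expects at least one clause).
    static BooleanMatcher and(BooleanMatcher... clauses) {
        return () -> {
            BitSet out = (BitSet) clauses[0].match().clone();
            for (int i = 1; i < clauses.length; i++) {
                out.and(clauses[i].match());
            }
            return out;
        };
    }

    // The ranker is optional: pass null when only doc ids are needed.
    static List<String> search(BooleanMatcher matcher, Ranker rankerOrNull) {
        List<String> out = new ArrayList<>();
        BitSet hits = matcher.match();
        for (int doc = hits.nextSetBit(0); doc >= 0; doc = hits.nextSetBit(doc + 1)) {
            out.add(rankerOrNull == null
                ? "doc=" + doc
                : "doc=" + doc + " score=" + rankerOrNull.score(doc));
        }
        return out;
    }

    public static void main(String[] args) {
        // (a1 or a2) and (b1 or b2)
        BooleanMatcher query = and(or(term(1, 2), term(3)), or(term(2, 3), term(5)));
        System.out.println(search(query, null));        // prints [doc=2, doc=3]
        System.out.println(search(query, doc -> 1.0f)); // prints [doc=2 score=1.0, doc=3 score=1.0]
    }
}
```

    With this shape, the scoring cost is paid only when a Ranker is
passed in; the matching path stays the same in both cases.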

    is it possible?

    Following are some user requests I found on the mailing lists. The
first one is my own requirement.

    1. https://github.com/neo4j/community/issues/494

    2. mail to lucene

qibaoy...@126.com via lucene.apache.org, May 6, to lucene
Hi,
      I have a problem searching for many keywords in about
5,000,000 documents. For example, the query may be like "(a1 or a2 or a3
....a200) and (b1 or b2 or b3 or b4 ..... b400)". I found it takes a
very long time (40 seconds) to get the answer on only one field (the title
field), and the JVM throws an OutOfMemory error with more fields (title field
plus content field). Any suggestions or good ideas for solving this
problem? Thanks in advance.


    3. mail to lucene

Chris Book chrisb...@gmail.com via lucene.apache.org, Apr 11, to solr-user
Hello, I have a solr index running that is working very well as a search.
 But I want to add the ability (if possible) to use it to do matching.  The
problem is that by default it is only looking for all the input terms to be
present, and it doesn't give me any indication as to how many terms in the
target field were not specified by the input.

For example, if I'm trying to match to the song title "dust in the wind",
I'm correctly getting a match if the input query is "dust in wind".  But I
don't want to get a match if the input is just "dust".  Although as a
search "dust" should return this result, I'm looking for some way to filter
this out based on some indication that the input isn't close enough to the
output.  Perhaps if I could get information that the number of input
terms is much less than the number of terms in the field.  Or something
else along those lines?

I realize that this isn't the typical use case for a search, but I'm just
looking for some suggestions as to how I could improve the above example a
bit.

Thanks,
Chris

