I've about exhausted my expertise in Lucene filters, but I believe you can wrap a query as a filter and also wrap a filter as a query. So, for IndexSearcher.search, you could take a filter and wrap it in a ConstantScoreQuery. If a BooleanQuery were wrapped as a filter, that filter could in turn be wrapped as a CSQ for the search, so that no scoring would be done.

-- Jack Krupansky
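
To make the two-way wrapping in the message above concrete, here is a minimal plain-Java sketch of the idea. It deliberately does not use Lucene: SimpleFilter, SimpleQuery, asFilter and asConstantScoreQuery are hypothetical stand-ins invented for illustration. In Lucene 4.x the corresponding classes are, as far as I know, QueryWrapperFilter (query to filter) and ConstantScoreQuery (filter to query), but check the javadoc for your version before relying on that.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

class WrapSketch {
    // Hypothetical stand-ins for illustration only, NOT Lucene's classes.
    interface SimpleFilter { BitSet matchingDocs(); }
    interface SimpleQuery { Map<Integer, Float> scoredDocs(); }

    // Query -> filter: remember which docs matched, discard the scores.
    static SimpleFilter asFilter(SimpleQuery query) {
        return () -> {
            BitSet bits = new BitSet();
            for (int doc : query.scoredDocs().keySet()) {
                bits.set(doc);
            }
            return bits;
        };
    }

    // Filter -> query: every matching doc gets the same constant score.
    static SimpleQuery asConstantScoreQuery(SimpleFilter filter) {
        return () -> {
            Map<Integer, Float> scores = new LinkedHashMap<>();
            BitSet bits = filter.matchingDocs();
            for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
                scores.put(doc, 1.0f);
            }
            return scores;
        };
    }

    public static void main(String[] args) {
        // A made-up scored query that matches docs 3 and 8.
        SimpleQuery scored = () -> {
            Map<Integer, Float> m = new LinkedHashMap<>();
            m.put(3, 2.5f);
            m.put(8, 0.7f);
            return m;
        };
        // Round trip: the same docs come back, all with the constant score.
        System.out.println(asConstantScoreQuery(asFilter(scored)).scoredDocs());
    }
}
```

The point of the round trip is only that the set of matching documents survives while the per-document scores do not. Whether the real classes also skip the scoring work internally is something to verify against Lucene itself.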

-----Original Message----- From: Sriram Sankar
Sent: Wednesday, July 24, 2013 3:58 PM
To: java-user@lucene.apache.org
Subject: Re: Performance measurements

On Wed, Jul 24, 2013 at 10:24 AM, Jack Krupansky <j...@basetechnology.com> wrote:

> Unicorn sounds like it was optimized for graph search. Specialized search
> engines can in fact beat out generalized search engines for specific use
> cases.


Yes and no (I worked on it).  Yes, there are many aspects of Unicorn that
have been optimized for graph search.  But the tests I am running have very
little to do with those optimizations.  I am still learning about Lucene
and suspect that the scoring framework (which has to be very general)
may be contributing to the performance issues.  With Unicorn, we made a
decision to do all scoring after retrieval and not during retrieval.



> Scoring has been a major focus of Lucene. Non-scored filters are also
> available, but the query parsers are focused (exclusively) on scored search.


When you say "filter", do you mean a step performed after retrieval?  Or is
it yet another retrieval operation?



> As Adrien indicates, try using raw Lucene filters and you should get much
> better results. Whether even that will compete with a use-case-specific
> (graph) search engine remains to be seen.


Thanks (I will study this more).

Sriram.





-- Jack Krupansky

-----Original Message----- From: Sriram Sankar
Sent: Wednesday, July 24, 2013 1:03 PM
To: java-user@lucene.apache.org
Subject: Re: Performance measurements


No, I do not need scoring.  This is a pure retrieval query, which matches
what we used to do with Unicorn at Facebook - something like:

(name:sriram AND (friend:1 OR friend:2 ...))

This automatically gives us second degree.

With Unicorn, we would always get sub-millisecond performance even for
n>500.

Should I assume that Lucene is that much worse - or is it that this use
case has not been optimized?

Sriram.



On Wed, Jul 24, 2013 at 9:59 AM, Adrien Grand <jpou...@gmail.com> wrote:

Hi,

On Wed, Jul 24, 2013 at 6:11 PM, Sriram Sankar <san...@gmail.com> wrote:
> termA AND (termB1 OR termB2 OR ... OR termBn)

Maybe this comment is not appropriate for your use-case, but if you
don't actually need scoring from the disjunction on the right of the
query, a TermsFilter will be faster when n gets large.

--
Adrien
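
One way to picture why a terms-style filter can win for large n in a query like termA AND (termB1 OR ... OR termBn): the disjunction only has to record which documents matched, so its postings can be OR-ed into a bit set once, with no per-document scoring, and then used as a cheap membership check against termA's postings. The sketch below is plain Java over a made-up in-memory index. INDEX, unionOf and matchWithFilter are hypothetical names and this is not how Lucene's TermsFilter is actually implemented; it only illustrates the general technique under those assumptions.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TermsFilterSketch {
    // Hypothetical in-memory inverted index: term -> sorted doc IDs (made-up data).
    static final Map<String, int[]> INDEX = new HashMap<>();
    static {
        INDEX.put("name:sriram", new int[] {1, 4, 7, 9});
        INDEX.put("friend:1", new int[] {2, 4, 8});
        INDEX.put("friend:2", new int[] {4, 9, 10});
        INDEX.put("friend:3", new int[] {5});
    }

    // OR the postings of all terms into one bit set; no scores are computed.
    static BitSet unionOf(List<String> terms) {
        BitSet bits = new BitSet();
        for (String term : terms) {
            int[] postings = INDEX.get(term);
            if (postings == null) {
                continue; // unknown term contributes nothing
            }
            for (int doc : postings) {
                bits.set(doc);
            }
        }
        return bits;
    }

    // Keep only the docs of requiredTerm whose bit is set in the filter.
    static BitSet matchWithFilter(String requiredTerm, BitSet filter) {
        BitSet result = new BitSet();
        int[] postings = INDEX.get(requiredTerm);
        if (postings != null) {
            for (int doc : postings) {
                if (filter.get(doc)) {
                    result.set(doc);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        BitSet friends = unionOf(Arrays.asList("friend:1", "friend:2", "friend:3"));
        // name:sriram AND (friend:1 OR friend:2 OR friend:3)
        System.out.println(matchWithFilter("name:sriram", friends)); // prints {4, 9}
    }
}
```

With this shape, the cost of the right-hand side is one pass over the n postings lists, and the conjunction itself is a single pass over termA's postings with constant-time bit lookups.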

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org