Re: Performance measurements

Jack Krupansky Wed, 24 Jul 2013 15:07:07 -0700

I think I've exhausted my expertise in Lucene filters, but I think you can wrap a query with a filter and also wrap a filter with a query. So, for IndexSearcher.search, you could take a filter and wrap it with ConstantScoreQuery. So, if a BooleanQuery got wrapped as a filter, it could be wrapped as a CSQ for search so that no scoring would be done.
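
To make the two wrapping directions concrete, here is a minimal sketch in plain Java. It does not use Lucene: DocFilter, ScoredQuery, constantScore and asFilter are hypothetical stand-ins invented for illustration. In Lucene 4.x the corresponding classes are ConstantScoreQuery (filter to query) and QueryWrapperFilter (query to filter). The sketch models only the shape of the results, not the cost savings from skipping scoring.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of filter/query wrapping; all names are hypothetical, not Lucene API.
final class WrapSketch {
    // Stand-in for a filter: yields the matching doc ids and nothing else.
    interface DocFilter { BitSet match(); }

    // Stand-in for a query: yields doc id -> relevance score.
    interface ScoredQuery { Map<Integer, Float> search(); }

    // Filter -> query: every matching doc gets the same constant score.
    static ScoredQuery constantScore(DocFilter filter) {
        return () -> {
            Map<Integer, Float> hits = new LinkedHashMap<>();
            BitSet docs = filter.match();
            for (int d = docs.nextSetBit(0); d >= 0; d = docs.nextSetBit(d + 1)) {
                hits.put(d, 1.0f);
            }
            return hits;
        };
    }

    // Query -> filter: keep the matching doc ids and drop the scores.
    static DocFilter asFilter(ScoredQuery query) {
        return () -> {
            BitSet docs = new BitSet();
            for (int d : query.search().keySet()) {
                docs.set(d);
            }
            return docs;
        };
    }

    public static void main(String[] args) {
        // A scored query that matches docs 4 and 9 with different scores.
        ScoredQuery scored = () -> {
            Map<Integer, Float> hits = new LinkedHashMap<>();
            hits.put(4, 2.5f);
            hits.put(9, 0.3f);
            return hits;
        };
        // Round trip: the same docs come back, but the scores are flattened.
        System.out.println(constantScore(asFilter(scored)).search());
    }
}
```

After the round trip the same documents match, but their original scores are gone. That is the property the suggestion above relies on.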


-- Jack Krupansky

-----Original Message----- From: Sriram Sankar
Sent: Wednesday, July 24, 2013 3:58 PM
To: java-user@lucene.apache.org
Subject: Re: Performance measurements

On Wed, Jul 24, 2013 at 10:24 AM, Jack Krupansky <j...@basetechnology.com> wrote:

Unicorn sounds like it was optimized for graph search. Specialized search
engines can in fact beat out generalized search engines for specific use
cases.


Yes and no (I worked on it).  Yes, there are many aspects of Unicorn that
have been optimized for graph search.  But the tests I am running have very
little to do with those optimizations.  I am still learning about Lucene
and have suspected that the scoring framework (that has to be very general)
may be contributing to the performance issues.  With Unicorn, we made a
decision to do all scoring after retrieval and not during retrieval.

Scoring has been a major focus of Lucene. Non-scored filters are also
available, but the query parsers are focused (exclusively) on scored search.


When you say "filter" do you mean a step performed after retrieval?  Or is
it yet another retrieval operation?


As Adrien indicates, try using raw Lucene filters and you should get much
better results. Whether even that will compete with a use-case-specific
(graph) search engine remains to be seen.



Thanks (I will study this more).

Sriram.

-- Jack Krupansky

-----Original Message----- From: Sriram Sankar
Sent: Wednesday, July 24, 2013 1:03 PM
To: java-user@lucene.apache.org
Subject: Re: Performance measurements

No I do not need scoring.  This is a pure retrieval query - which matches
what we used to do with Unicorn in Facebook - something like:

(name:sriram AND (friend:1 OR friend:2 ...))

This automatically gives us second degree.

With Unicorn, we would always get sub-millisecond performance even for
n>500.

Should I assume that Lucene is that much worse - or is it that this use
case has not been optimized?

Sriram.

On Wed, Jul 24, 2013 at 9:59 AM, Adrien Grand <jpou...@gmail.com> wrote:

 Hi,

On Wed, Jul 24, 2013 at 6:11 PM, Sriram Sankar <san...@gmail.com> wrote:
> termA AND (termB1 OR termB2 OR ... OR termBn)

Maybe this comment is not appropriate for your use-case, but if you
don't actually need scoring from the disjunction on the right of the
query, a TermsFilter will be faster when n gets large.
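
As a rough illustration of what match-only evaluation of this query shape involves, here is a toy inverted index in plain Java. It is a sketch under stated assumptions, not how Lucene or TermsFilter is implemented: ToyIndex, add and retrieve are hypothetical names, and the document ids and terms are made up. The disjunction is collapsed into one bit set, which is then intersected with the required term's postings, and no score is ever computed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory inverted index; every name here is hypothetical, not Lucene API.
final class ToyIndex {
    // term -> set of doc ids containing that term
    private final Map<String, BitSet> postings = new HashMap<>();

    // Record that document docId contains the given term (e.g. "friend:1").
    void add(int docId, String term) {
        postings.computeIfAbsent(term, t -> new BitSet()).set(docId);
    }

    // Evaluate "required AND (any[0] OR any[1] OR ...)" as pure matching.
    List<Integer> retrieve(String required, List<String> any) {
        // Union of the OR terms, built once as a bit set.
        BitSet union = new BitSet();
        for (String term : any) {
            BitSet docs = postings.get(term);
            if (docs != null) {
                union.or(docs);
            }
        }
        // Intersect with the required term; a missing term yields no matches.
        BitSet result = new BitSet();
        BitSet must = postings.get(required);
        if (must != null) {
            result.or(must);
            result.and(union);
        }
        List<Integer> ids = new ArrayList<>();
        for (int d = result.nextSetBit(0); d >= 0; d = result.nextSetBit(d + 1)) {
            ids.add(d);
        }
        return ids;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add(0, "name:sriram"); index.add(0, "friend:1");
        index.add(1, "name:sriram"); index.add(1, "friend:7");
        index.add(2, "name:adrien"); index.add(2, "friend:2");
        index.add(3, "name:sriram"); index.add(3, "friend:2");
        // name:sriram AND (friend:1 OR friend:2)
        System.out.println(index.retrieve("name:sriram", Arrays.asList("friend:1", "friend:2")));
    }
}
```

With the sample data in main, documents 0 and 3 match: both contain name:sriram and at least one of the listed friend terms.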

--
Adrien



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
