I had a look to this but didn't find anything that correspond to my problem.
Apparently there is a bug in Hibernate Search. If I use the same load test on the same index with the same data with a direct access to the lucene API, I get much better performance (and no deadlock on SegmentReader). I will report the problem there. Thanks, Stéphane On Thu, May 1, 2008 at 11:15 AM, mark harwood <[EMAIL PROTECTED]> wrote: > The issue here is a general one of trying to perform an efficient join > between an external resource (rdbms) and Lucene. > This experiment may be of interest: > http://issues.apache.org/jira/browse/LUCENE-434 > > KeyMap.java embodies the core service which translates from lucene doc ids > to DB primary keys or vice versa. > There are a couple of implementations of KeyMap that are not optimal (they > pre-date Lucene's FieldCache) but it may give you food for thought. > > Cheers > Mark > > > > > ----- Original Message ---- > From: Stephane Nicoll <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Sent: Thursday, 1 May, 2008 9:00:33 AM > Subject: hybrid query (lucene + db) > > Hi there, > > We're using lucene with Hibernate search and we're very happy so far > with the performance and the usability of lucene. We have however a > specific use cases that prevent us to use only lucene: spatial > queries. I already sent a mail on this list a while back about the > problem and we started investigating multiple solutions. > > When the user selects a geographic area and some keywords we do the > following: > > * Perform a search on the lucene index for the keywords with a > projection that returns only the primaryKey of the element sorted by > primary key > * Perform a search on the database with other criterias and a > projection that returns only the primary key of the elements > * Iterate on both list to find N matching IDs, optionally with paging > (some from X to X + N where X is the first result of the page) > * Run a query on the database to return the actual objects (select a > from MyClass a where a.id IN (the list of matching IDs) ) We limit the > page to 1000 results > > We have searched a way to optimize the queries and to avoid to consume > too much memory, knowing that we must support paging. > > With a single user a search by kewyords takes 30msec to complete, a > search by box takes 45msec. With both (keywords + spatial area) it > takes 300msec > > With 10 concurrent users, a search by keywords takes 150msec/user but > for both it takes 3 sec/user !!! > > I had the profiler running on this scenario and I've found that *all* > threads are waiting on org.apache.lucene.index.SegmentReader. I then > configured Hibernate Search to use a separate index reader per thread. > The deadlocks disappeared but it's still very slow (2.8sec). > > Some questions: > > * Does anyone knows where the deadlocks on SegmentReader are coming from? > * Is the sorting on the primary keys a bad idea regarding performance > and memory usage? > * Does anyone has an idea to perform this kind of hybrid query in an > efficient way? > > I am using lucene 2.3.1 and Hibernate Search 3.0.1. I already ask for > support on the Hibernate Search forum but did not get any answer so > far. > > Thanks, > Stéphane > > -- > Large Systems Suck: This rule is 100% transitive. If you build one, > you suck" -- S.Yegge > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > > > > > > > __________________________________________________________ > Sent from Yahoo! Mail. > A Smarter Email http://uk.docs.yahoo.com/nowyoucan.html > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > -- Large Systems Suck: This rule is 100% transitive. If you build one, you suck" -- S.Yegge --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]