Also, how long does it take Luke to run a search against the same index?

That way you can rule out any time that your application is adding to
the mix.

If Luke doesn't take the 8-second minimum, then you know the issue (or
at least a large part of it) is in your app.
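
The same comparison can be made inside the application by timing the
search call and the per-hit document loading separately. Below is only a
rough sketch in plain Java, not Lucene API: SearchTiming, timePhases, and
the two lambdas are hypothetical stand-ins for whatever runs the query
and whatever loads the stored fields for each hit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical helper (not part of Lucene): times the two phases separately.
final class SearchTiming {

    // Wall-clock time of each phase in nanoseconds, plus the loaded documents.
    static final class Result<D> {
        final long searchNanos;
        final long fetchNanos;
        final List<D> docs;

        Result(long searchNanos, long fetchNanos, List<D> docs) {
            this.searchNanos = searchNanos;
            this.fetchNanos = fetchNanos;
            this.docs = docs;
        }
    }

    // search: runs the query and returns hit handles only.
    // fetch:  loads the stored data for a single hit.
    static <H, D> Result<D> timePhases(Supplier<List<H>> search, Function<H, D> fetch) {
        long start = System.nanoTime();
        List<H> hits = search.get();
        long afterSearch = System.nanoTime();

        List<D> docs = new ArrayList<>(hits.size());
        for (H hit : hits) {
            docs.add(fetch.apply(hit));
        }
        long afterFetch = System.nanoTime();

        return new Result<>(afterSearch - start, afterFetch - afterSearch, docs);
    }

    public static void main(String[] args) {
        // Stand-in lambdas; a real application would call its searcher here.
        Result<String> r = SearchTiming.<Integer, String>timePhases(
                () -> Arrays.asList(1, 2, 3),
                id -> "doc-" + id);
        System.out.println("search ns: " + r.searchNanos);
        System.out.println("fetch ns:  " + r.fetchNanos);
        System.out.println("docs:      " + r.docs);
    }
}
```

If the search phase alone is fast and the fetch phase dominates, the
time is going into retrieving data for the hits rather than into the
query itself.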

Matt

Ian Lea wrote:
Still surprising that your searches are taking so long.

Have you worked through everything on
http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, suggested by
someone earlier in this thread?  Are you sure that the problem is
really with Lucene? Is it the search itself that takes a long time, or
retrieving data for the hits?  What does query.toString() look like?
How many hits does a search typically match?  Is a search on document
id effectively instant?

You have to supply more detail if you want better answers.


--
Ian.


On Tue, Aug 4, 2009 at 12:21 PM, prashant ullegaddi <prashullega...@gmail.com> wrote:
Shashi,

Our queries are free-text queries, but they get expanded into
multi-field Boolean queries. We also expand the original query using
Lucene's SynExpand. A simple query can expand to, say, a page-sized
query.

We are not storing any fields other than the key (document ID), target
URL, and title.

Prashant.

On Tue, Aug 4, 2009 at 1:31 PM, Shashi Kant <shashi....@gmail.com> wrote:

Prashant, I have had better luck with even larger indices on similar
platforms. Could you elaborate on what types of queries you are
running: multi-field? Boolean? Combinations? Also, you might want to
remove unnecessary stored fields from the index and move them to a
relational DB to squeeze out better performance.


Shashi


On Tue, Aug 4, 2009 at 3:18 AM, prashant ullegaddi <prashullega...@gmail.com> wrote:
I tried that as well. We actually had 32 indexes initially, and
searching them was even worse.
After that I merged them into 4 indexes and searched those. No gain!

Then I ended up merging all 32 indexes into one.

On Tue, Aug 4, 2009 at 10:48 AM, Anshum <ansh...@gmail.com> wrote:

Hi Prashant,
8 seconds as the minimum time is a little too much, though considering
you're using just 4GB of RAM it's still OK.
I would advise you to break your index into smaller indexes, perhaps
selectively query the indexes (if that's possible for your application),
and use a ParallelMultiSearcher. It's just something that you might try
and like.
All said and done, parallelizing would only get you a bell-curve-like
performance graph, so you'd have to figure out the sweet spot there.
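
The fan-out idea above can be sketched in plain Java as follows. This is
not Lucene's ParallelMultiSearcher: Shard, Hit, and searchAll are
hypothetical names, and the sketch assumes each smaller index returns
its own top hits with scores that are comparable across indexes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: query several smaller indexes in parallel, merge the top hits.
final class ShardedSearch {

    // A scored hit; natural ordering is highest score first.
    static final class Hit implements Comparable<Hit> {
        final String docId;
        final float score;

        Hit(String docId, float score) {
            this.docId = docId;
            this.score = score;
        }

        @Override
        public int compareTo(Hit other) {
            return Float.compare(other.score, this.score);
        }
    }

    // Stand-in for "search one of the smaller indexes".
    interface Shard {
        List<Hit> search(String query, int topK) throws Exception;
    }

    // Runs the query on every shard using `threads` workers and keeps the best topK overall.
    static List<Hit> searchAll(List<Shard> shards, String query, int topK, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<List<Hit>>> tasks = new ArrayList<>();
            for (Shard shard : shards) {
                tasks.add(() -> shard.search(query, topK));
            }
            List<Hit> merged = new ArrayList<>();
            for (Future<List<Hit>> result : pool.invokeAll(tasks)) {
                merged.addAll(result.get());
            }
            Collections.sort(merged); // highest score first
            return new ArrayList<>(merged.subList(0, Math.min(topK, merged.size())));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while searching shards", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a shard search failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The threads argument is the knob to vary when looking for the sweet
spot: measure with a few different values rather than assuming that
more threads is always faster.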

--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw.


On Tue, Aug 4, 2009 at 10:08 AM, prashant ullegaddi <prashullega...@gmail.com> wrote:

I'm running it on a quad-core machine (2.4GHz per core) with 4GB of RAM.

Prashant.

On Tue, Aug 4, 2009 at 8:38 AM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
With such a large index be prepared to put it on a server with lots of
RAM (even if you follow all the tips from the Wiki).
When reporting performance numbers, you really ought to tell us about
your hardware, types of queries, etc.

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
From: prashant ullegaddi <prashullega...@gmail.com>
To: java-user@lucene.apache.org
Sent: Monday, August 3, 2009 12:33:46 AM
Subject: How to improve search time?

Hi,

I have a single 87GB index containing around 50M documents. For any
query, the best search time I have observed is 8 seconds. And when the
query is expanded with synonyms, the search takes minutes (~2-3 min).
Is there a better way to search so that the overall search time goes
down?

Thanks,
Prashant.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



--
Matthew Hall
Software Engineer
Mouse Genome Informatics
mh...@informatics.jax.org
(207) 288-6012


