It is still surprising that your searches are taking so long. Have you worked through everything on http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, which someone suggested earlier in this thread? Are you sure that the problem is really with Lucene? Is it the search itself that takes a long time, or retrieving the data for the hits? What does query.toString() look like? How many hits does a search typically match? Is a search on document ID effectively instant?
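To answer the search-versus-retrieval question, it can help to time the two phases separately. What follows is a minimal sketch in plain Java, not Lucene code: SearchTiming, time, and the stand-in search and load lambdas are hypothetical names, and you would substitute your real search call and your per-hit document load for them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

public class SearchTiming {

    /** Elapsed time of each phase in nanoseconds, plus the number of hits loaded. */
    public static final class Result {
        public final long searchNanos;
        public final long retrieveNanos;
        public final int hitCount;

        Result(long searchNanos, long retrieveNanos, int hitCount) {
            this.searchNanos = searchNanos;
            this.retrieveNanos = retrieveNanos;
            this.hitCount = hitCount;
        }
    }

    /**
     * Runs the search phase, then loads every hit, timing each phase on its own.
     * Both arguments are placeholders for application code (assumption).
     */
    public static <H, D> Result time(Supplier<List<H>> search, Function<H, D> load) {
        long t0 = System.nanoTime();
        List<H> hits = search.get();           // phase 1: find matching hits
        long t1 = System.nanoTime();

        List<D> docs = new ArrayList<>(hits.size());
        for (H hit : hits) {
            docs.add(load.apply(hit));         // phase 2: fetch stored data per hit
        }
        long t2 = System.nanoTime();

        return new Result(t1 - t0, t2 - t1, docs.size());
    }

    public static void main(String[] args) {
        // Stand-ins only: replace with the real search and the real document load.
        Supplier<List<Integer>> search = () -> Arrays.asList(1, 2, 3);
        Function<Integer, String> load = id -> "doc-" + id;

        Result r = time(search, load);
        System.out.printf("search: %d ns, retrieve: %d ns, hits: %d%n",
                r.searchNanos, r.retrieveNanos, r.hitCount);
    }
}
```

If the first number dominates, the query itself is the likely culprit; if the second does, look at how many hits you load and which stored fields you read. Run each measurement several times, since the first run after startup also pays for cold disk caches.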
You have to supply more detail if you want better answers.

--
Ian.

On Tue, Aug 4, 2009 at 12:21 PM, prashant ullegaddi<prashullega...@gmail.com> wrote:
> Shashi,
>
> Our queries are free-text queries, but they are expanded into MultiField and Boolean queries.
> We are also expanding the original query using SynExpand from Lucene. A simple query
> gets expanded to, say, a query the size of a page.
>
> And we are not storing any fields other than the key (document ID), target URL and title.
>
> Prashant.
>
> On Tue, Aug 4, 2009 at 1:31 PM, Shashi Kant <shashi....@gmail.com> wrote:
>
>> Prashant, I have had better luck with even larger indices on
>> similar platforms. Could you elaborate on what types of queries you are
>> running? MultiField? Boolean? Combinations? Also, you might want
>> to remove unnecessary stored fields from the index and move them to a
>> relational DB to squeeze out better performance.
>>
>> Shashi
>>
>> On Tue, Aug 4, 2009 at 3:18 AM, prashant ullegaddi<prashullega...@gmail.com> wrote:
>> > I did that as well. Actually, we had 32 indexes initially. We searched them.
>> > It was even worse.
>> > After that I merged them into 4 indexes and did the same. No gain!
>> >
>> > Then I had to merge the 32 indexes into one.
>> >
>> > On Tue, Aug 4, 2009 at 10:48 AM, Anshum <ansh...@gmail.com> wrote:
>> >
>> >> Hi Prashant,
>> >> 8 seconds as the minimum time is a little too much, though considering
>> >> you're using just 4GB of RAM it's still OK.
>> >> I would advise you to break your index into smaller indexes, perhaps
>> >> selectively query the indexes (if that's possible for your application), and
>> >> use a ParallelMultiSearcher. It's just something that you might try and
>> >> like.
>> >> All said and done, parallelizing would only get you a bell-curve-like
>> >> performance graph, so you'd have to figure out the sweet spot there.
>> >>
>> >> --
>> >> Anshum Gupta
>> >> Naukri Labs!
>> >> http://ai-cafe.blogspot.com
>> >>
>> >> The facts expressed here belong to everybody, the opinions to me. The
>> >> distinction is yours to draw............
>> >>
>> >> On Tue, Aug 4, 2009 at 10:08 AM, prashant ullegaddi <prashullega...@gmail.com> wrote:
>> >>
>> >> > I'm running it on a quad-core machine, 2.4GHz per core, with 4GB RAM.
>> >> >
>> >> > Prashant.
>> >> >
>> >> > On Tue, Aug 4, 2009 at 8:38 AM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
>> >> >
>> >> > > With such a large index, be prepared to put it on a server with lots of
>> >> > > RAM (even if you follow all the tips from the Wiki).
>> >> > > When reporting performance numbers, you really ought to tell us about
>> >> > > your hardware, types of queries, etc.
>> >> > >
>> >> > > Otis
>> >> > > --
>> >> > > Sematext is hiring -- http://sematext.com/about/jobs.html?mls
>> >> > > Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>> >> > >
>> >> > > ----- Original Message ----
>> >> > > > From: prashant ullegaddi <prashullega...@gmail.com>
>> >> > > > To: java-user@lucene.apache.org
>> >> > > > Sent: Monday, August 3, 2009 12:33:46 AM
>> >> > > > Subject: How to improve search time?
>> >> > > >
>> >> > > > Hi,
>> >> > > >
>> >> > > > I have a single index of size 87GB containing around 50M documents.
>> >> > > > For any query, the best search time I have observed is 8 seconds.
>> >> > > > When the query is expanded with synonyms, the search takes
>> >> > > > minutes (about 2-3 min). Is there a better way to search so that the
>> >> > > > overall search time is reduced?
>> >> > > >
>> >> > > > Thanks,
>> >> > > > Prashant.
>> >> > >
>> >> > > ---------------------------------------------------------------------
>> >> > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> >> > > For additional commands, e-mail: java-user-h...@lucene.apache.org