Shahi, Our queries are free text queries. But they will be expanded into: Multifield, Boolean. We are also expanding the original query using SynExpand of lucene. A simple query gets expanded to say a query of page size.
And we are not storing any other fields except key (document IDs), target URLs and titles. Prashant. On Tue, Aug 4, 2009 at 1:31 PM, Shashi Kant <shashi....@gmail.com> wrote: > Prashant, I have had better luck with even larger sized indices on > similar platforms. Could you elaborate what types of queries you are > running, Multifield? Boolean? combinations? etc. Also you might want > to remove unnecessary stored fields from the index and move them to a > relational db to squeeze out better performance. > > > Shashi > > > On Tue, Aug 4, 2009 at 3:18 AM, prashant > ullegaddi<prashullega...@gmail.com> wrote: > > I did that as well. Actually, we had 32 indexes initially. We searched > them. > > It was even horrible. > > After that I merged them into 4 indexes. And did the same. No gain! > > > > Then, I had to merge 32 indexes into one. > > > > On Tue, Aug 4, 2009 at 10:48 AM, Anshum <ansh...@gmail.com> wrote: > > > >> Hi Prashant, > >> 8 seconds as the minimum time is a little too much, though considering > >> you're using just 4G of RAM its still ok. > >> I would advice you to break your index into smaller indexes, perhaps > >> selectively query the indexes (if that's possible for your application) > and > >> use a parallelmultisearcher. Its just something that you might try and > >> like. > >> All said and done, parallelizing would only get you a bell-curve like > >> performance graph, so you'd have to figure out the sweet spot there. > >> > >> -- > >> Anshum Gupta > >> Naukri Labs! > >> http://ai-cafe.blogspot.com > >> > >> The facts expressed here belong to everybody, the opinions to me. The > >> distinction is yours to draw............ > >> > >> > >> On Tue, Aug 4, 2009 at 10:08 AM, prashant ullegaddi < > >> prashullega...@gmail.com> wrote: > >> > >> > I'm running it on Quadcore, 2.4GHz each, 4GB RAM. > >> > > >> > Prashant. > >> > > >> > On Tue, Aug 4, 2009 at 8:38 AM, Otis Gospodnetic < > >> > otis_gospodne...@yahoo.com > >> > > wrote: > >> > > >> > > With such a large index be prepared to put it on a server with lots > of > >> > RAM > >> > > (even if you follow all the tips from the Wiki). > >> > > When reporting performance numbers, you really ought to tell us > about > >> > your > >> > > hardware, types of queries, etc. > >> > > > >> > > Otis > >> > > -- > >> > > Sematext is hiring -- http://sematext.com/about/jobs.html?mls > >> > > Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR > >> > > > >> > > > >> > > > >> > > ----- Original Message ---- > >> > > > From: prashant ullegaddi <prashullega...@gmail.com> > >> > > > To: java-user@lucene.apache.org > >> > > > Sent: Monday, August 3, 2009 12:33:46 AM > >> > > > Subject: How to improve search time? > >> > > > > >> > > > Hi, > >> > > > > >> > > > I've a single index of size 87GB containing around 50M documents. > >> When > >> > I > >> > > > search for any query, > >> > > > best search time I observed was 8sec. And when query is expanded > with > >> > > > synonyms, search takes > >> > > > minutes (~ 2-3min). Is there a better way to search so that > overall > >> > > search > >> > > > time reduces? > >> > > > > >> > > > Thanks, > >> > > > Prashant. > >> > > > >> > > > >> > > > --------------------------------------------------------------------- > >> > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > >> > > For additional commands, e-mail: java-user-h...@lucene.apache.org > >> > > > >> > > > >> > > >> > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >