Shashi,

Our queries are free-text queries, but they get expanded into multifield
Boolean queries. We also expand the original query with synonyms using
Lucene's SynExpand. A simple query can grow to roughly a page in size.
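
To get a rough feel for why the expanded query is so much heavier than the
original one, here is a minimal sketch of the arithmetic. It assumes that every
original term and each of its synonyms becomes one term clause per searched
field. The function name and the numbers are hypothetical; they are not
measured from this index.

```python
def estimate_clauses(num_terms, synonyms_per_term, num_fields):
    """Rough count of term clauses in the expanded Boolean query.

    Assumption: each original term plus each of its synonyms is searched
    in every field, giving one clause per (term variant, field) pair.
    """
    return num_terms * (1 + synonyms_per_term) * num_fields


# Hypothetical example: a 3-word query, 5 synonyms per word, 2 fields.
print(estimate_clauses(num_terms=3, synonyms_per_term=5, num_fields=2))  # prints 36
```

Under these assumptions the clause count grows multiplicatively, so capping the
number of synonyms per term or the number of fields searched may be worth
trying before anything else.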

We are not storing any fields other than the key (document ID), the target
URL and the title.

Prashant.

On Tue, Aug 4, 2009 at 1:31 PM, Shashi Kant <shashi....@gmail.com> wrote:

> Prashant, I have had better luck with even larger sized indices on
> similar platforms. Could you elaborate what types of queries you are
> running, Multifield? Boolean? combinations? etc. Also you might want
> to remove unnecessary stored fields from the index and move them to a
> relational db to squeeze out better performance.
>
>
> Shashi
>
>
> On Tue, Aug 4, 2009 at 3:18 AM, prashant ullegaddi
> <prashullega...@gmail.com> wrote:
> > I did that as well. Actually, we had 32 indexes initially. We searched
> > them, and it was even worse.
> > After that I merged them into 4 indexes and did the same. No gain!
> >
> > Then, I had to merge the 32 indexes into one.
> >
> > On Tue, Aug 4, 2009 at 10:48 AM, Anshum <ansh...@gmail.com> wrote:
> >
> >> Hi Prashant,
> >> 8 seconds as the minimum time is a little too much, though considering
> >> you're using just 4G of RAM it's still OK.
> >> I would advise you to break your index into smaller indexes, perhaps
> >> selectively query the indexes (if that's possible for your application),
> >> and use a ParallelMultiSearcher. It's just something that you might try
> >> and like.
> >> All said and done, parallelizing would only get you a bell-curve like
> >> performance graph, so you'd have to figure out the sweet spot there.
> >>
> >> --
> >> Anshum Gupta
> >> Naukri Labs!
> >> http://ai-cafe.blogspot.com
> >>
> >> The facts expressed here belong to everybody, the opinions to me. The
> >> distinction is yours to draw............
> >>
> >>
> >> On Tue, Aug 4, 2009 at 10:08 AM, prashant ullegaddi <
> >> prashullega...@gmail.com> wrote:
> >>
> >> > I'm running it on a quad-core machine, 2.4GHz per core, with 4GB RAM.
> >> >
> >> > Prashant.
> >> >
> >> > On Tue, Aug 4, 2009 at 8:38 AM, Otis Gospodnetic
> >> > <otis_gospodne...@yahoo.com> wrote:
> >> >
> >> > > With such a large index be prepared to put it on a server with lots of
> >> > > RAM (even if you follow all the tips from the Wiki).
> >> > > When reporting performance numbers, you really ought to tell us about
> >> > > your hardware, types of queries, etc.
> >> > >
> >> > > Otis
> >> > > --
> >> > > Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> >> > > Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
> >> > >
> >> > >
> >> > >
> >> > > ----- Original Message ----
> >> > > > From: prashant ullegaddi <prashullega...@gmail.com>
> >> > > > To: java-user@lucene.apache.org
> >> > > > Sent: Monday, August 3, 2009 12:33:46 AM
> >> > > > Subject: How to improve search time?
> >> > > >
> >> > > > Hi,
> >> > > >
> >> > > > I have a single index of size 87GB containing around 50M documents.
> >> > > > When I search for any query, the best search time I observed was
> >> > > > 8 sec. And when the query is expanded with synonyms, the search takes
> >> > > > minutes (~2-3 min). Is there a better way to search so that the
> >> > > > overall search time is reduced?
> >> > > >
> >> > > > Thanks,
> >> > > > Prashant.
> >> > >
> >> > >
> >> > >
> >> > > ---------------------------------------------------------------------
> >> > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> >> > > For additional commands, e-mail: java-user-h...@lucene.apache.org
> >> > >
> >> > >
> >> >
> >>
> >
>
>
>
