The total number of bytes of input data is a more useful number than the
number of documents.

Our peak at my last job was 400 million documents. Each had maybe 300-500
bytes of text and took 1k of disk space in the index, so the index was
400 gigabytes.  The problems were:

1) System administration: the logistics of the index were a nightmare.
Optimize took 14 hours, and a full copy to the query servers took half an
hour. Optimize also needs twice the index size free in the same partition.
2) Sorting creates an array with one element for every document. We needed
32G of RAM in a server to allow sorted results.
3) Faceting on some fields was likewise impossible, since faceting builds
an array of facet values. Faceting on timestamps was a no-no.
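
To make the arithmetic behind these three points concrete, here is a rough
back-of-envelope sketch. The document count and the 1k per document are the
figures from this message; the 4 bytes per sort-array entry is only an
assumption for illustration (one int per document), and the function names
are made up for this example. It does not count the field values
themselves, so real memory use for string sorts will be higher.

```python
# Back-of-envelope sizing. Function names are hypothetical helpers for this
# example, not part of Solr or Lucene.

GB = 10**9  # decimal gigabytes, to match the round numbers in the message


def index_size_bytes(num_docs, disk_bytes_per_doc):
    """On-disk index size if every document costs the same amount of disk."""
    return num_docs * disk_bytes_per_doc


def optimize_disk_bytes(index_bytes):
    """Disk needed in the index partition during an optimize.

    The old and the rewritten index coexist until the optimize finishes,
    hence twice the index size.
    """
    return 2 * index_bytes


def sort_array_bytes(num_docs, bytes_per_entry):
    """Memory for one per-document array, as used for sorting on one field."""
    return num_docs * bytes_per_entry


num_docs = 400_000_000                          # from the message
index_bytes = index_size_bytes(num_docs, 1_000)  # 1k of disk per document

print("index size (GB):     ", index_bytes // GB)
print("optimize needs (GB): ", optimize_disk_bytes(index_bytes) // GB)
# ASSUMPTION: 4 bytes per entry (one int per document) for one sort field.
print("one sort array (GB): ", sort_array_bytes(num_docs, 4) / GB)
```

Each additional sorted field adds another array of that size, which is
how the memory requirement climbs so quickly at this document count.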

The servers were Dell 2950s with 2 or 4 processors, 32G of RAM, and six
300GB high-speed SATA drives in RAID-5 for 1.2 terabytes of space.

Basic searching was a little slower than on a smaller index, but still
50ms for pre-cached queries.

On Fri, Jun 26, 2009 at 8:28 AM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

>
> Hi Daniel,
>
> How much Solr can handle really depends on the hardware you run it on, the
> type of document you index in it, and the query rate and type.
>
> 10M doesn't sound like a large number even for an average server today
> (e.g. 4 GB of RAM, 1-2 cores), web-page sized documents, and a query rate of
> a few dozen simple keyword, boolean, or phrase queries a second.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> ----- Original Message ----
> > From: Daniel Löfquist <daniel.lofqu...@it.cdon.com>
> > To: solr-user@lucene.apache.org
> > Sent: Friday, June 26, 2009 7:27:45 AM
> > Subject: How much data can Solr handle?
> >
> > We're looking to build a search solution that can contain as many as 10
> > million different items and I was wondering if Solr could handle that
> > kind of data amount or not?
> >
> > Has anybody done any testing or published any kind of results for a
> > Solr-installation working on huge amounts of data like this?
> >
> > //Daniel
> >
> > --
> > Daniel Löfquist
> > Software Engineer
> >
> > CDON.COM
> > Bergsgatan 20, Box 385, SE 201 23 Malmö, Sweden
> >
> > Office: +46 40 601 61 00
> > Direct: +46 40 601 61 16
> > Fax: +46 40 601 61 20
> > E-mail: daniel.lofqu...@it.cdon.com
> >
> > Confidentiality
> > Information contained in this e-mail is intended for the use of the
> > addressee only, and is confidential. Any dissemination, distribution,
> > copying or use of this communication without prior permission of
> > the addressee is strictly prohibited. If you are not the intended
> > addressee you must delete this e-mail and its attachments.
>
>


-- 
Lance Norskog
goks...@gmail.com
650-922-8831 (US)
