The total number of bytes of input data is a more useful number than the number of documents.

Our peak at my last job was 400 million documents. Each was maybe 300-500 bytes of text, for about 1k of disk space per document, so the index was 400 gigabytes. The problems were:

1) System administration: the logistics of the index were a nightmare. Optimize took 14 hours, and a full copy to the query servers took half an hour. Optimize needs twice the index size available in the same partition.
2) Sorting creates an array with one element for every document. We needed 32G of RAM in a server to allow sorted results.
3) Faceting on some fields was likewise impossible, since faceting makes an array of facet values. Faceting on timestamps was a no-no.

The servers were Dell 2950s, 2 or 4 processor, 32G RAM, with 6 300GB high-speed SATA disks in RAID-5 for 1.2 terabytes of space. Basic searching was a little slower than on the smaller index, but still 50ms for pre-cached queries.

On Fri, Jun 26, 2009 at 8:28 AM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:

> Hi Daniel,
>
> How much Solr can handle really depends on the hardware you run it on, the
> type of document you index in it, and the query rate and type.
>
> 10M doesn't sound like a large number even for an average server today
> (e.g. 4 GB of RAM, 1-2 cores), web-page sized documents, and a query rate of
> a few dozen a second simple keyword, boolean, or phrase queries
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> > From: Daniel Löfquist <daniel.lofqu...@it.cdon.com>
> > To: solr-user@lucene.apache.org
> > Sent: Friday, June 26, 2009 7:27:45 AM
> > Subject: How much data can Solr handle?
> >
> > We're looking to build a search solution that can contain as many as 10 million
> > different items and I was wondering if Solr could handle that kind of data
> > amount or not?
> >
> > Has anybody done any testing or published any kind of results for a
> > Solr-installation working on huge amounts of data like this?
> >
> > //Daniel
> >
> > --
> > Daniel Löfquist
> > Software Engineer
> >
> > CDON.COM
> > Bergsgatan 20, Box 385, SE 201 23 Malmö, Sweden
> >
> > Office: +46 40 601 61 00
> > Direct: +46 40 601 61 16
> > Fax: +46 40 601 61 20
> > E-mail: daniel.lofqu...@it.cdon.com

--
Lance Norskog
goks...@gmail.com
650-922-8831 (US)
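
P.S. For anyone who wants to redo the sizing arithmetic from my numbers above with their own figures, here is a rough back-of-envelope sketch in Python. The function names are made up for illustration. The 4 bytes per array entry is my assumption (one int per document for a sorted field); the real per-entry cost depends on the field type and the Solr/Lucene version, so treat the result as a lower bound per field, not a measurement.

```python
# Back-of-envelope index sizing. All names here are hypothetical helpers,
# not part of Solr or Lucene.

def index_size_bytes(num_docs, disk_bytes_per_doc):
    """Approximate on-disk index size."""
    return num_docs * disk_bytes_per_doc

def optimize_space_bytes(index_bytes):
    """Space needed in the index partition during optimize:
    twice the index size, per the experience described above."""
    return 2 * index_bytes

def per_doc_array_bytes(num_docs, bytes_per_entry=4):
    """Memory for one array with one element per document.
    bytes_per_entry=4 is an assumption (one int per document)."""
    return num_docs * bytes_per_entry

if __name__ == "__main__":
    docs = 400_000_000   # peak document count
    per_doc = 1000       # roughly 1k of disk per document

    size = index_size_bytes(docs, per_doc)
    print(f"index size:      {size / 1e9:.1f} GB")
    print(f"optimize needs:  {optimize_space_bytes(size) / 1e9:.1f} GB")
    print(f"one sort array:  {per_doc_array_bytes(docs) / 1e9:.1f} GB")
```

Each additional sorted or faceted field adds at least one more such array, which is why the document count matters for RAM even when the documents themselves are small.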