On 3/27/07, Kevin Osborn <[EMAIL PROTECTED]> wrote:
I know there are a bunch of variables here (RAM, number of fields, hits, etc.), but I am trying to get a sense of how big an index, in terms of number of documents, Solr can reasonably handle. I have heard of indexes of 3-4 million documents running fine. But I have no idea what a reasonable upper limit might be.
People have constructed (lucene) indices with over a billion documents. But if "reasonable" means something like "<1s query time for a medium-complexity query on non-astronomical hardware", I wouldn't go much higher than the figure you quote.
I have a large number of documents, and about 200-300 customers would have access to varying subsets of those documents. So, one possible strategy is to have everything in one large index, but duplicate each document for every customer that has access to it. But that would really make the total number of documents huge. So, I am trying to get a sense of how big is too big. Each document will probably have about 30 fields. Most of them will be strings, but there will be some text, ints, and floats.
If you are going to store a document for each customer then some field must indicate to which customer the document instance belongs. In that case, why not index a single copy of each document, with a field containing a list of customers having access?

-Mike
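Mike's suggestion could be sketched roughly as follows (field and customer names here are illustrative, not from the thread). The schema declares a multi-valued string field listing the customers with access, and each document is indexed exactly once:

```
<!-- schema.xml: multi-valued field holding the customers allowed to see a doc -->
<field name="customer" type="string" indexed="true" stored="false" multiValued="true"/>

<!-- update message: one copy of the document, tagged with each customer -->
<add>
  <doc>
    <field name="id">doc-1234</field>
    <field name="title">Quarterly report</field>
    <field name="customer">acme</field>
    <field name="customer">globex</field>
  </doc>
</add>
```

At query time each customer's view would be restricted with a filter query, e.g. `select?q=report&fq=customer:acme`; Solr caches filter queries separately from the main query, so the per-customer restriction is cheap to reuse across searches.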