On Mon, 2011-07-04 at 13:51 +0200, Jame Vaalet wrote:
> What would be the maximum size of a single SOLR index file for resulting in
> optimum search time ?
There is no clear answer. It depends on the number of (unique) terms, the number of documents, bytes on storage, storage speed, query complexity, faceting, the number of concurrent users and a lot of other factors.

> In case I have got to index all the documents in my repository (which is in
> TB size) what would be the ideal architecture to follow , distributed SOLR ?

A TB of source documents might very well end up as a simple, single-machine index of 100GB or less. It depends on the amount of search-relevant information in the documents, rather than their size in bytes.

If your sources are Word documents or a similar format with a relatively large amount of stuffing, and your searches are mostly simple "the user enters 2-5 words and hits enter", my guess is that you don't need to worry about distribution yet.

Make a pilot. Most of the work you'll have to do for a single-machine test can be reused for a distributed production setup.
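To make the pilot numbers concrete, here is a back-of-the-envelope sketch of how one might extrapolate from a pilot index to the full repository. The function name and the linear-scaling assumption are mine, not from this thread; index size rarely scales perfectly linearly (the term dictionary grows sub-linearly), so treat the result as a ballpark only.

```python
# Rough extrapolation from a pilot index to the full corpus.
# ASSUMPTION: index size grows roughly linearly with source size once
# the term dictionary stabilises -- a simplification, so the result is
# a ballpark figure, not a capacity plan.

def estimate_index_size(pilot_source_bytes, pilot_index_bytes, full_source_bytes):
    """Scale the pilot's source-to-index ratio up to the full repository."""
    ratio = pilot_index_bytes / pilot_source_bytes
    return full_source_bytes * ratio

# Example: a 10 GB pilot of Word documents yields a 1 GB index,
# so 2 TB of similar sources would land around 200 GB.
pilot_source = 10 * 1024**3   # 10 GB of sample documents
pilot_index = 1 * 1024**3     # resulting index size on disk
full_source = 2 * 1024**4     # 2 TB repository

estimate = estimate_index_size(pilot_source, pilot_index, full_source)
print(estimate / 1024**3)     # -> 204.8 (GB)
```

If the pilot ratio lands well under 10%, a single machine is probably fine for a first production iteration, which matches the advice above.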