Thanks, that's a really useful article. I have a question about this statement in it:

"Keep in mind that Solr does not calculate universal term/doc frequencies. At a large scale, it's not likely to matter that tf/idf is calculated at the shard level - however, if your collection is heavily skewed in its distribution across servers, you might take issue with the relevance results. It's probably best to randomly distribute documents to your shards."

So if no universal tf/idf is kept, how does Solr determine the relative rank of two documents that came from different shards in a distributed search query?

Regards,
Abhishek

Juan Pedro Danculovic-2 wrote:
> To scale Solr, take a look at this article:
> http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr
>
> Juan Pedro Danculovic
> CTO - www.linebee.com
>
> On Thu, Feb 11, 2010 at 4:12 AM, abhishes <abhis...@gmail.com> wrote:
>> Suppose I am indexing very large data (5 billion rows in a database).
>>
>> Now I want to use the Solr Core feature to split the index into
>> manageable chunks.
>>
>> However, I have two questions:
>>
>> 1. Can Cores reside on different physical servers?
>> 2. When a query comes in, will it be answered by the index in one core,
>>    or will it be sent to all the cores?
>>
>> My desire is to have a system which from outside appears as a single
>> large index... but inside it is multiple small indexes running on
>> different hardware machines.