Thanks, that is a really useful article.

I am wondering about this statement in the article:

"Keep in mind that Solr does not calculate universal term/doc frequencies.
At a large scale, its not likely  to matter that tf/idf is calculated at the
shard level - however, if your collection is heavily skewed in its
distribution across servers, you might take issue with the relevance
results. Its probably best to randomly distribute documents to your shards"

So if no universal tf/idf is kept, how does Solr determine the relative
rank of two documents that came from different shards in a distributed
search query?
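
To make the concern concrete, here is a rough sketch with made-up numbers.
It is not Solr's code: the scoring formula is a simplified Lucene-style
tf/idf (no norms or boosts), the shard statistics are hypothetical, and the
merge step assumes the coordinator simply sorts hits by the score each
shard computed locally.

```python
import math

def idf(num_docs, doc_freq):
    # Simplified Lucene-style inverse document frequency.
    return 1.0 + math.log(num_docs / (doc_freq + 1))

def score(term_freq, num_docs, doc_freq):
    # Simplified tf/idf: sqrt(tf) * idf^2, ignoring norms and boosts.
    return math.sqrt(term_freq) * idf(num_docs, doc_freq) ** 2

# Hypothetical, heavily skewed shards: the query term is common on
# shard A and rare on shard B.
shard_a = {"num_docs": 1000, "doc_freq": 900}
shard_b = {"num_docs": 1000, "doc_freq": 10}

# Two documents with the same term frequency, one per shard,
# each scored with its own shard's statistics.
local_a = score(3, **shard_a)
local_b = score(3, **shard_b)

# What both documents would score with collection-wide statistics.
global_score = score(
    3,
    shard_a["num_docs"] + shard_b["num_docs"],
    shard_a["doc_freq"] + shard_b["doc_freq"],
)

def merge(results):
    # Assumed coordinator behaviour: sort all hits by their local score.
    return sorted(results, key=lambda hit: hit[1], reverse=True)

merged = merge([("doc_on_a", local_a), ("doc_on_b", local_b)])
for doc_id, s in merged:
    print(doc_id, round(s, 3))
print("global score for either doc:", round(global_score, 3))
```

With these numbers the document on shard B outranks an otherwise identical
document on shard A, purely because the term is rarer on shard B. With
universal statistics the two would tie.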

Regards,
Abhishek





Juan Pedro Danculovic-2 wrote:
> 
> To scale Solr, take a look at this article:
> 
> http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr
> 
> 
> 
> Juan Pedro Danculovic
> CTO - www.linebee.com
> 
> 
> On Thu, Feb 11, 2010 at 4:12 AM, abhishes <abhis...@gmail.com> wrote:
> 
>>
>> Suppose I am indexing very large data (5 billion rows in a database)
>>
>> Now I want to use the Solr Core feature to split the index into
>> manageable chunks.
>>
>> However I have two questions
>>
>>
>> 1. Can Cores reside on different physical servers?
>>
>> 2. When a query comes in, will it be answered by the index in one core,
>> or will it be sent to all the cores?
>>
>> My desire is to have a system which from outside appears as a single
>> large index... but inside it is multiple small indexes running on
>> different hardware machines.
>> --
>> View this message in context:
>> http://old.nabble.com/Question-on-Solr-Scalability-tp27543068p27543068.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 

