Good morning,

here is the link to the patch: https://issues.apache.org/jira/browse/SOLR-1632
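
For reference, the idea from the quoted thread (summing each node's per-term counts and then computing one shared idf) could look roughly like the sketch below. This is plain Python, not Solr code, and it is not taken from the SOLR-1632 patch. The names `merge_doc_freqs`, `idf` and `global_idf` and the sample shard data are made up for illustration. The idf formula is assumed to be the classic Lucene-style 1 + ln(numDocs / (docFreq + 1)). Summing the counts also assumes every document lives on exactly one shard.

```python
import math
from collections import Counter


def merge_doc_freqs(shard_doc_freqs):
    """Sum per-term document frequencies over all shards (the 'reduce' step)."""
    total = Counter()
    for doc_freqs in shard_doc_freqs:
        total.update(doc_freqs)  # Counter.update adds counts, it does not overwrite
    return dict(total)


def idf(doc_freq, num_docs):
    """Assumed classic Lucene-style idf: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def global_idf(shard_doc_freqs, shard_num_docs):
    """Build a term -> idf table from the merged statistics of all shards."""
    num_docs = sum(shard_num_docs)
    merged = merge_doc_freqs(shard_doc_freqs)
    return {term: idf(df, num_docs) for term, df in merged.items()}


# Hypothetical per-shard term statistics (term -> number of docs containing it).
shard_a = {"solr": 90, "hadoop": 2}
shard_b = {"solr": 10, "hadoop": 40}

# Every node would cache this table and use it instead of its local statistics.
idf_table = global_idf([shard_a, shard_b], [100, 100])
```

With purely local statistics, `idf(2, 100)` on the first shard and `idf(40, 100)` on the second differ a lot for "hadoop", which is why scores from different shards are not directly comparable without shared values.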
- Mitch

Li Li wrote:
>
> where is the link of this patch?
>
> 2010/7/24 Yonik Seeley <yo...@lucidimagination.com>:
>> On Fri, Jul 23, 2010 at 2:23 PM, MitchK <mitc...@web.de> wrote:
>>> why do we do not send the output of TermsComponent of every node in the
>>> cluster to a Hadoop instance?
>>> Since TermsComponent does the map-part of the map-reduce concept, Hadoop
>>> only needs to reduce the stuff. Maybe we even do not need Hadoop for
>>> this.
>>> After reducing, every node in the cluster gets the current values to
>>> compute the idf.
>>> We can store this information in a HashMap-based SolrCache (or something
>>> like that) to provide constant-time access. To keep the values up to
>>> date, we can repeat that after every x minutes.
>>
>> There's already a patch in JIRA that does distributed IDF.
>> Hadoop wouldn't be the right tool for that anyway... it's for batch
>> oriented systems, not low-latency queries.
>>
>>> If we got that, it does not care whereas we use doc_X from shard_A or
>>> shard_B, since they will all have got the same scores.
>>
>> That only works if the docs are exactly the same - they may not be.
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>

--
View this message in context: http://lucene.472066.n3.nabble.com/a-bug-of-solr-distributed-search-tp983533p995407.html
Sent from the Solr - User mailing list archive at Nabble.com.