Yes, this problem has been solved, though not completely: there is still a refresh problem. To eliminate duplicate documents with the same unique id during an update, you need to set <maxBufferedDeleteTerms>1</maxBufferedDeleteTerms>. This makes the most recently updated document searchable and removes the older documents.

There is a catch, though. If an update changes some of the fields in a document, older content may still show up in the results even though the query matches the most recent content. For example, if the old doc was <doc><afield>xyz</afield></doc> and the updated doc is <doc><afield>abc</afield></doc>, then at query time q=afield:abc matches, but the results may show <doc><afield>xyz</afield></doc>. I am still researching this.
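
If you want to check for both symptoms from the client side, something like the following may help. It is only a minimal sketch: it assumes the search results have already been parsed into a list of dicts (for example from a JSON response), and the field names "id" and "afield" and both function names are placeholders, not part of Solr or Solr-RA.

```python
from collections import Counter


def find_duplicate_ids(docs, id_field="id"):
    """Return the unique ids that occur more than once in a result list."""
    counts = Counter(doc[id_field] for doc in docs)
    return sorted(doc_id for doc_id, n in counts.items() if n > 1)


def find_stale_hits(docs, field, expected):
    """Return docs whose stored field value differs from the queried value.

    For a query such as q=afield:abc, every returned doc should carry
    afield == "abc"; anything else points to old content being shown.
    """
    return [doc for doc in docs if doc.get(field) != expected]


if __name__ == "__main__":
    # Hypothetical results for q=afield:abc (made-up data, not real output).
    results = [
        {"id": "1", "afield": "xyz"},  # old version still visible
        {"id": "1", "afield": "abc"},  # most recent version
        {"id": "2", "afield": "abc"},
    ]
    print("duplicate ids:", find_duplicate_ids(results))
    print("stale hits:", find_stale_hits(results, "afield", "abc"))
```

An empty list from both functions would suggest that neither the duplicate problem nor the stale-content problem is visible for that query.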

You can get more information about the performance and known issues here:
http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver_3.x

Regards,

- Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org



On 7/19/2011 1:21 AM, Andy wrote:
Nagendra,

In another email you mentioned there's a problem where if an existing document 
is updated both the old and new version will show up in search results.

Has that been solved in Solr-RA 3.3?

--- On Mon, 7/18/11, Nagendra Nagarajayya<nnagaraja...@transaxtions.com>  wrote:

From: Nagendra Nagarajayya<nnagaraja...@transaxtions.com>
Subject: [Announce] Solr 3.3 with RankingAlgorithm NRT capability, very high 
performance 10000 tps
To: solr-user@lucene.apache.org
Date: Monday, July 18, 2011, 10:43 AM
Hi!

I would like to announce the availability of Solr 3.3 with
RankingAlgorithm and Near Real Time (NRT) search capability
now. The NRT performance is very high, 10,000 documents/sec
with the MBArtists 390k index. The NRT functionality allows
you to add documents without the IndexSearchers being closed
or caches being cleared. A commit is also not needed with
the document update. Searches can run concurrently with
document updates. No changes are needed except for enabling
the NRT through solrconfig.xml.

RankingAlgorithm query performance is now 3x faster
than before and is exposed as the Lucene API. This release
also adds support for the last document with a unique id to
be searchable and visible in search results in case of
multiple updates of the document.

I have a wiki page that describes NRT performance in detail
and can be accessed from here:

http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver3.x

You can download Solr 3.3 with RankingAlgorithm (NRT
version) from here:

http://solr-ra.tgels.org

I would like to invite you to give this version a try as
the performance is very high.

Regards,

- Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org





