Update and add are basically the same thing if there's an existing document.
There will be some performance cost, since Solr has to fetch the stored
fields on the server rather than receiving the full input from the external
source. However, I know of at least one situation where the atomic update
rate is sky-high and it works, so I wouldn't worry about it unless and until
I saw a problem.
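
For the first option Shawn mentions below (sending a new indextimestamp
value along with the atomic update), here is a rough sketch of what the
client side might build. It assumes the unique key field is called "id",
that indextimestamp is a date field, and that "title" stands in for
whatever field actually changed. The function name is made up for
illustration and is not part of any Solr client.

```python
import json
from datetime import datetime, timezone


def build_atomic_update(doc_id, changed_fields, now=None):
    """Build one atomic-update document that also refreshes indextimestamp.

    Assumptions (not from the thread): the unique key field is "id" and
    indextimestamp is a Solr date field.
    """
    # Solr date fields expect UTC in ISO 8601 form with a trailing "Z".
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)

    doc = {"id": doc_id}
    for name, value in changed_fields.items():
        # "set" replaces the stored value of the field.
        doc[name] = {"set": value}

    # Explicitly overwrite the timestamp so a later purge by age sees
    # this document as fresh.
    doc["indextimestamp"] = {"set": now.strftime("%Y-%m-%dT%H:%M:%SZ")}
    return doc


if __name__ == "__main__":
    # "doc-1" and "title" are placeholder values.
    body = [build_atomic_update("doc-1", {"title": "New title"})]
    print(json.dumps(body, indent=2))
```

The resulting JSON array is what would be sent to the update handler as
JSON. Whether the timestamp comes from the client like this or from an
update processor on the server is a design choice. The client-side version
just needs no Solr configuration change.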

Best,
Erick


On Tue, Feb 11, 2014 at 3:03 PM, Shawn Heisey <s...@elyograg.org> wrote:

> On 2/11/2014 2:37 PM, shamik wrote:
>
>> Erick,
>>
>>    Thanks for your reply. I should have given better context. I'm
>> currently running a daily incremental crawl on this particular source and
>> indexing the documents. The incremental crawl looks for any change since
>> the last crawl date, based on the document publish date. But there's no
>> way for me to know if a document has been deleted. To handle that, I run a
>> full crawl on the weekend, which basically re-indexes the entire content.
>> After the full index is over, I call a purge script, which deletes any
>> content that is more than 24 hours old, based on the indextimestamp field.
>>
>> The issue with atomic update is that it doesn't alter the indextimestamp
>> field. So even if I run a full crawl with atomic updates, the timestamp
>> will stick to its old value. Unfortunately, I can't rely on another date
>> field coming from the source, as those are not consistent. That means I
>> can't remove stale content.
>>
>
> One possibility is this: When you send the atomic update to Solr, include
> a new value for the indextimestamp field.
>
> Another option: You can write a custom update processor plugin for Solr.
> When the custom code is used, it will be executed on each incoming
> document. Depending on what it finds in the update request, it can make
> appropriate changes, like updating indextimestamp. You can do pretty much
> anything.
>
> http://wiki.apache.org/solr/UpdateRequestProcessor
>
> Writing an update processor in Java typically gives the best results in
> terms of flexibility and performance, but there is also a way to use other
> programming languages:
>
> http://wiki.apache.org/solr/ScriptUpdateProcessor
>
> Thanks,
> Shawn
>
>
