Re: Partial update vs full update performance

adfel70 Wed, 12 Jun 2013 08:55:10 -0700

Yes, it is.
But in my case these are metadata fields, and I need them to be searchable,
facetable, and sortable in the context of the main text fields.
Will I be able to achieve that if I index them in another core?



Upayavira wrote
> My question would be, why are you updating 10m documents? Is it because
> of denormalised fields? E.g. one system I have needs to reindex all data
> for a publication when that publication switches between active and
> inactive. 
> 
> If this is the case, you can perhaps achieve the same using joins. Store
> the publications, and their status, in another core. Then, to find
> documents for active publications could be:
> 
> q=harry potter&fq={!join fromIndex=pubs from=pubID to=pubID}status:active
> 
> This would find documents containing the terms 'harry potter' which are
> associated with active publications.
> 
> Changing the status of a publication would require a single document in
> the 'pubs' core to be changed, rather than re-indexing all documents.
> 
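For reference, a minimal sketch of how a client might assemble the join filter quoted above into request parameters. The core name `pubs`, the field `pubID`, and `status:active` come from the example; the helper name `build_join_params` is hypothetical, and the sketch only builds the parameter string without contacting Solr.

```python
from urllib.parse import parse_qs, urlencode


def build_join_params(text, from_index="pubs", join_field="pubID", status="active"):
    """Build URL-encoded parameters for a main query plus a cross-core join filter."""
    # The filter runs status:<status> against the other core (fromIndex), then
    # maps the matching join_field values onto documents in the queried core.
    fq = "{{!join fromIndex={0} from={1} to={1}}}status:{2}".format(
        from_index, join_field, status
    )
    return urlencode({"q": text, "fq": fq})


params = build_join_params("harry potter")
decoded = parse_qs(params)  # round-trip to confirm the encoding is reversible
print(params)
print(decoded["fq"][0])
```

The spaces and braces in the filter need URL-encoding when sent over HTTP, which is why the sketch goes through `urlencode` rather than concatenating strings.
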
> Does this hit what you are trying to achieve?
> 
> Upayavira
> 
> 
> On Wed, Jun 12, 2013, at 03:51 PM, Jack Krupansky wrote:
>> Correct.
>> 
>> Generally, I think most apps will benefit from partial update, especially
>> if they have a lot of fields. Otherwise, they will have two round trip
>> requests rather than one. Solr does the reading of existing document
>> values more efficiently, under the hood, with no need to format for the
>> response and parse the incoming (redundant) values.
>> 
>> OTOH, if the client has all the data anyway (maybe because it wants to 
>> display the data before update), it may be easier to do a full update.
>> 
>> You could do an actual performance test, but I would suggest that 
>> (generally) partial update will be more efficient than a full update.
>> 
>> And Lucene can do add and delete rather quickly, so that should not be a
>> concern for modest to medium size documents, but clearly would be an
>> issue for large and very large documents (hundreds of fields or large
>> field values.)
>> 
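To make the comparison above concrete, here is a rough sketch of the two request bodies in Python. It assumes a unique key field named `id` and Solr's JSON update format with the `set` modifier for partial updates; the document contents and the helper names are hypothetical. As the original question notes, Solr rebuilds the document from stored fields, so the partial form only works when the other fields are stored. Posting the body to the core's update handler is left out.

```python
import json


def full_update_doc(doc_id, fields):
    """Full update: the client resends every field of the document."""
    doc = {"id": doc_id}
    doc.update(fields)
    return doc


def atomic_update_doc(doc_id, field, value):
    """Partial update: send only the key plus a "set" modifier for one field."""
    return {"id": doc_id, field: {"set": value}}


# Hypothetical document; the field names are illustrative only.
existing = {"title": "Some title", "body": "Long text " * 50, "status": "active"}

full_body = json.dumps([full_update_doc("doc-1", dict(existing, status="inactive"))])
atomic_body = json.dumps([atomic_update_doc("doc-1", "status", "inactive")])

# Compare the payload sizes the client would have to send in each case.
print(len(full_body), len(atomic_body))
print(atomic_body)
```

The partial body stays the same size no matter how large the document is, which is where the saving on the client side comes from. The delete and add inside Lucene still happen either way.
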
>> -- Jack Krupansky
>> 
>> -----Original Message----- 
>> From: adfel70
>> Sent: Wednesday, June 12, 2013 10:40 AM
>> To: solr-user@.apache
>> Subject: Partial update vs full update performance
>> 
>> Hi
>> As I understand it, even if I use partial update, Lucene can't really
>> update documents. Solr will use the stored fields to pass the values to
>> Lucene, and delete and add operations will still be performed.
>> 
>> If this is the case, is there a performance issue when comparing partial
>> update to full update?
>> 
>> My documents have dozens of fields, most of them are not stored.
>> I sometimes need to go through a portion of the documents and modify a
>> single field.
>> What I do right now is delete the portion I want to update and re-add
>> those documents with the updated field.
>> This of course takes a lot of time (I'm talking about tens of millions of
>> documents).
>> 
>> Should I move to using partial update? Will it improve the indexing time
>> at all? Will it improve the indexing time to such an extent that I would
>> be better off storing the fields I don't otherwise need stored, just for
>> the partial update feature?
>> 
>> thanks
>> 
>> --
>> View this message in context: 
>> http://lucene.472066.n3.nabble.com/Partial-update-vs-full-update-performance-tp4069948.html
>> Sent from the Solr - User mailing list archive at Nabble.com. 
>>





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Partial-update-vs-full-update-performance-tp4069948p4069974.html
Sent from the Solr - User mailing list archive at Nabble.com.
