Re: update some fields vs replace the whole document

2013-03-10 Thread Upayavira
In terms of the impact upon the index, there is no difference: both do
the same thing, marking the previous doc deleted and inserting another. As
Jack says, atomic updates may simply be easier for you from an application
perspective.
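
To make the "mark deleted and insert" point concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not Solr or Lucene code; the ToyIndex class and its method names are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIndex:
    """Append-only toy index: documents are never modified in place."""
    docs: list = field(default_factory=list)    # every version ever written
    deleted: set = field(default_factory=set)   # positions marked as deleted

    def _find_live(self, doc_id):
        # Locate the current (not yet deleted) version of a document.
        for pos, doc in enumerate(self.docs):
            if doc["id"] == doc_id and pos not in self.deleted:
                return pos
        return None

    def add(self, doc):
        """Whole-document write: mark any previous version deleted, append the new one."""
        pos = self._find_live(doc["id"])
        if pos is not None:
            self.deleted.add(pos)
        self.docs.append(dict(doc))

    def atomic_update(self, doc_id, changes):
        """Partial write: rebuild the full document from the stored copy,
        then go through exactly the same add() path."""
        pos = self._find_live(doc_id)
        if pos is None:
            raise KeyError(doc_id)
        rebuilt = {**self.docs[pos], **changes}
        self.add(rebuilt)


index = ToyIndex()
index.add({"id": "1", "title": "old", "price": 10})
index.add({"id": "1", "title": "new", "price": 10})  # replace the whole document
index.atomic_update("1", {"price": 12})              # change a single field
print(len(index.docs), sorted(index.deleted))
```

In this model both kinds of write leave one more deleted entry behind and append one new full document; the only extra work in the partial write is reading the stored copy first.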

Note that Solr and Lucene are heavily optimised towards reading; writing is a
relatively heavy operation.

Upayavira

On Fri, Mar 8, 2013, at 10:41 PM, Mingfeng Yang wrote:
 Then what's the difference between adding a new document vs.
 replacing/overwriting a document?
 
 Ming-
 
 
 


Re: update some fields vs replace the whole document

2013-03-08 Thread Upayavira
With an atomic update, Solr needs to retrieve the stored fields in order
to build up the full document to insert back.

In either case, Solr has to locate the previous version and mark it
deleted before it can insert the new version.

I bet that the amount of time spent retrieving stored fields is matched
by the time saved by not having to transmit those fields over the wire,
although I'd be very curious to see someone actually test that.
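
The "over the wire" half of that trade-off can be sketched by comparing the two request bodies. This is a minimal illustration, not a benchmark: the document and its field names are hypothetical, nothing is sent to a server, and it assumes the JSON update format with the "set" and "inc" modifiers, which in turn relies on the fields being stored.

```python
import json

# Hypothetical document; the field names are made up for illustration.
full_doc = {
    "id": "book-42",
    "title": "An example title",
    "description": "A long text field. " * 50,
    "price": 15,
    "views": 8,
}

# Whole-document replace: every field travels over the wire.
replace_body = json.dumps([full_doc])

# Atomic update: only the unique key plus the fields that change.
atomic_body = json.dumps([{
    "id": "book-42",
    "price": {"set": 15},   # overwrite the value
    "views": {"inc": 1},    # increment the existing value
}])

# Compare the request body sizes in characters.
print(len(replace_body), len(atomic_body))
```

How much the smaller body actually saves, set against the cost of reading the stored fields on the server, is exactly the part that would still need measuring.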

Upayavira

On Fri, Mar 8, 2013, at 09:51 PM, Mingfeng Yang wrote:
 Generally speaking, which has better performance for Solr?
 1. updating some fields or adding new fields into a document.
 or
 2. replacing the whole document.
 
 As I understand it, updating fields requires searching for the corresponding
 doc first and then replacing the field values, while replacing the whole
 document is just like adding a new document.  Is that right?


Re: update some fields vs replace the whole document

2013-03-08 Thread Mingfeng Yang
Then what's the difference between adding a new document vs.
replacing/overwriting a document?

Ming-





Re: update some fields vs replace the whole document

2013-03-08 Thread Jack Krupansky
Generally it will be more a matter of application semantics. Solr makes it
reasonably efficient to completely overwrite the existing document and
fields, if that is what you want. But in some applications it may be
desirable to preserve some or most of the existing fields. Whether that is
easier to accomplish by completely regenerating the full document from data
stored elsewhere in the application (e.g., an RDBMS) or by doing a selective
write will depend on the application. In some apps, the rest of the data may
not be maintained separately, so a selective write makes more sense. Or
maybe the existing document contains metadata fields such as timestamps or
counters that would get reset if the whole document were regenerated.
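
As a rough illustration of that last case, the following Python sketch contrasts the two approaches on plain dictionaries. Everything in it is hypothetical (the helper names, the view_count and first_seen fields, the shape of the source row); it only models which fields survive, not how Solr stores them.

```python
def regenerate_from_source(source_row):
    """Rebuild the whole document from an external system of record.
    Only the fields the source knows about end up in the new document."""
    return {"id": source_row["id"], "title": source_row["title"]}


def selective_write(existing_doc, changes):
    """Keep every existing field and overwrite only the ones that changed."""
    return {**existing_doc, **changes}


# The indexed document carries metadata that exists nowhere else.
existing = {
    "id": "7",
    "title": "Old title",
    "view_count": 123,
    "first_seen": "2013-01-05T00:00:00Z",
}
source_row = {"id": "7", "title": "New title"}

regenerated = regenerate_from_source(source_row)
selective = selective_write(existing, {"title": source_row["title"]})

print("view_count" in regenerated, selective["view_count"])
```

If the application can regenerate view_count and first_seen from its own data, the first approach is fine; if it cannot, the selective write is the one that keeps them.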


-- Jack Krupansky

-----Original Message-----
From: Mingfeng Yang

Sent: Friday, March 08, 2013 5:41 PM
To: solr-user@lucene.apache.org
Subject: Re: update some fields vs replace the whole document

Then what's the difference between adding a new document vs.
replacing/overwriting a document?

Ming-

