Re: loading many documents by ID

Yonik Seeley Fri, 02 Feb 2007 10:22:45 -0800

On 2/1/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:

>
> Not sure... depends on how update handlers will use it...


By update handler, do you mean UpdateRequestHandler(s) or UpdateHandler?


Both.

> One thing we might not want to get rid of though is streaming
> (constructing and adding a document, then discarding it).  People are
> starting to add a lot of documents in a single XML request, and this
> will be much larger for CSV/SQL.
>

So you are uncomfortable with the Collection because you would have to
load all the documents before indexing them.  If there were many
documents, that could be a problem...

If UpdateHandler is going to take care of stuff like autocommit and
modifying documents, it seems best to have that apply to all the
documents you are going to modify as a unit.  For example, say I have
a SQL updater that will modify 100,000 documents, incrementing field
'count_*' and replacing 'fl_*'.  If the DocumentCommand only applies
to a single document, it would have to match each field as it went
along rather than once when it starts.
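
To make the "match once when it starts" point concrete, here is a minimal
sketch. It assumes the field patterns can be resolved against a known list of
field names before any document is touched. The class BatchModify, the Op
enum, and both methods are hypothetical names made up for illustration, not
Solr APIs, and 'count_*' / 'fl_*' are treated as plain prefixes.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class BatchModify {
    // Hypothetical operations, mirroring "increment count_*" and "replace fl_*".
    enum Op { INCREMENT, REPLACE }

    // Resolve prefix patterns against the known field names ONCE,
    // before any document is modified.
    static Map<String, Op> resolve(Collection<String> fieldNames,
                                   Map<String, Op> prefixOps) {
        Map<String, Op> resolved = new HashMap<>();
        for (String field : fieldNames) {
            for (Map.Entry<String, Op> e : prefixOps.entrySet()) {
                if (field.startsWith(e.getKey())) {
                    resolved.put(field, e.getValue());
                }
            }
        }
        return resolved;
    }

    // Per-document work is then a map lookup per field, with no pattern matching.
    // Assumes fields marked INCREMENT hold Integer values.
    static void apply(Map<String, Object> doc, Map<String, Op> resolved,
                      Object replacement) {
        for (Map.Entry<String, Object> f : doc.entrySet()) {
            Op op = resolved.get(f.getKey());
            if (op == Op.INCREMENT) {
                f.setValue(((Integer) f.getValue()) + 1);
            } else if (op == Op.REPLACE) {
                f.setValue(replacement);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Op> prefixOps = new HashMap<>();
        prefixOps.put("count_", Op.INCREMENT);
        prefixOps.put("fl_", Op.REPLACE);

        // Done once for the whole batch.
        Map<String, Op> resolved =
            resolve(Arrays.asList("id", "count_views", "fl_tags"), prefixOps);

        // Done for each of the (possibly 100,000) documents.
        Map<String, Object> doc = new TreeMap<>();
        doc.put("id", 7);
        doc.put("count_views", 41);
        doc.put("fl_tags", "old");
        apply(doc, resolved, "new");
        System.out.println(doc);
    }
}
```

If the command only ever saw one document at a time, the resolve() step would
have to be repeated (or cached somewhere) for each document.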

How about: Iterable<SolrDocument>


Maybe... but that might not be the easiest for request handlers to
use... they would then need to spin up a different thread and use a
pull model (provide a new doc on demand) rather than push (call
addDocument()).
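
The push/pull difference can be sketched as below. This is only an
illustration: Doc, parsePush, and parsePull are hypothetical names, not Solr
classes, and the "parser" just counts. In both styles a document can be
discarded as soon as it has been handled, so neither requires loading the
whole batch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Consumer;

public class PushVsPull {
    // Hypothetical stand-in for a document.
    static final class Doc {
        final int id;
        Doc(int id) { this.id = id; }
    }

    // Push model: the parser drives the loop and calls addDocument per doc.
    // This is the natural shape for a handler reading a stream top to bottom.
    static void parsePush(int count, Consumer<Doc> addDocument) {
        for (int i = 0; i < count; i++) {
            addDocument.accept(new Doc(i)); // doc may be discarded after this call
        }
    }

    // Pull model: the consumer drives the loop and asks for the next doc on
    // demand. The producer has to keep its position between calls.
    static Iterable<Doc> parsePull(int count) {
        return () -> new Iterator<Doc>() {
            private int next = 0;

            @Override
            public boolean hasNext() { return next < count; }

            @Override
            public Doc next() {
                if (!hasNext()) throw new NoSuchElementException();
                return new Doc(next++);
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> pushed = new ArrayList<>();
        parsePush(3, d -> pushed.add(d.id));

        List<Integer> pulled = new ArrayList<>();
        for (Doc d : parsePull(3)) {
            pulled.add(d.id);
        }
        System.out.println(pushed + " " + pulled);
    }
}
```

The iterator here is easy to write because the source is a counter. A
callback-driven parser cannot simply pause between documents, which is why
exposing it as an Iterable would likely need a second thread handing docs
across a queue.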

I'm really just thinking a little out loud... just first impressions
- don't read too much into it.
When I'm coding, the design tends to morph a lot.

I think we need to figure out what type of update semantics we want
w.r.t. adding multiple documents, and all the other misc autocommit
params.

-Yonik
