Thanks. We're probably not going to be sending huge batches of documents very often, so I'll
try a persistent connection and hope performance won't be an issue. With our document size, I
was posting around 300+ docs/s, so anything reasonably close to that will be fine. Historically
we've been processing 335k document updates per hour, so we're well under the max docs/s we've
seen with Solr.
Doug
Chris Hostetter wrote:
: Sometimes there's a field that shouldn't be multiValued, but the data comes in
: with multiple fields of the same name in a single document.
:
: Is there any way to continue processing other documents in a file even if one
: document errors out? It seems like whenever we hit one of these cases, it
: stops processing the file completely.
I believe you are correct: the UpdateRequestHandler aborts as soon as a bad
doc is found. It might be possible to make it skip bad docs and continue
processing, but what mechanism could it use to report which doc had
failed? Not all schemas have uniqueKey fields, and even if they do, the
uniqueKey field may have been the problem.
This is one of the reasons why I personally recommend only sending one doc
at a time -- if you use persistent HTTP connections, there really
shouldn't be much performance difference (and if there is, we can probably
optimize that)
-Hoss
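
[Editor's note: the one-doc-at-a-time approach over a persistent connection can be sketched as below. This is a hypothetical illustration, not code from the thread; the host, port, and the `/solr/update` XML handler path are assumptions that depend on your Solr deployment, and error handling here simply records which docs failed and moves on, which is exactly the per-file abort problem described above.]

```python
# Sketch: post one document per request, reusing a single HTTP connection
# so each bad doc only fails its own request. Stdlib only.
import http.client
from xml.sax.saxutils import escape

def doc_to_xml(doc):
    """Serialize a dict of single-valued fields to a Solr XML <add> command."""
    fields = "".join(
        '<field name="%s">%s</field>' % (escape(name), escape(str(value)))
        for name, value in doc.items()
    )
    return "<add><doc>%s</doc></add>" % fields

def post_docs(host, port, docs, path="/solr/update"):
    """POST each doc individually over a persistent connection.

    Returns the list of docs that failed, so the caller can report or
    retry them -- the rest of the batch is still processed.
    """
    conn = http.client.HTTPConnection(host, port)  # reused across requests
    failed = []
    for doc in docs:
        body = doc_to_xml(doc).encode("utf-8")
        try:
            conn.request("POST", path, body,
                         {"Content-Type": "text/xml; charset=utf-8"})
            resp = conn.getresponse()
            resp.read()  # drain the response so the connection can be reused
            if resp.status != 200:
                failed.append(doc)
        except (http.client.HTTPException, OSError):
            failed.append(doc)
            conn.close()
            conn = http.client.HTTPConnection(host, port)  # reconnect
    conn.close()
    return failed
```

Because the connection is kept open across requests, the per-request overhead is mostly the HTTP headers, which is why the per-doc approach can stay close to batch throughput.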