Thanks. We're probably not going to send huge batches of documents very often, so I'll just try a persistent connection and hope performance won't be an issue. With our document size, I was posting 300+ docs/s, so anything reasonably close to that will be fine. Historically we've processed about 335k document updates per hour, which is well under the maximum docs/s we've seen with Solr.
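[A quick sanity check on the figures quoted above — both numbers are taken from the message, nothing here is measured:]

```python
# Convert the historical hourly rate to docs/s and compare against the
# observed ~300 docs/s posting rate. Pure arithmetic, no Solr involved.
docs_per_hour = 335_000
docs_per_sec = docs_per_hour / 3600
print(f"{docs_per_sec:.0f} docs/s")  # roughly 93 docs/s, well under 300
```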

Doug

Chris Hostetter wrote:
: Sometimes there's a field that shouldn't be multiValued, but the data comes in
: with multiple fields of the same name in a single document.
:
: Is there any way to continue processing other documents in a file even if one
: document errors out? It seems like whenever we hit one of these cases, it
: stops processing the file completely.

I believe you are correct: the UpdateRequestHandler aborts as soon as a bad doc is found. It might be possible to make it skip bad docs and continue processing, but what mechanism could it use to report which doc failed? Not all schemas have uniqueKey fields, and even when they do, the uniqueKey field may itself have been the problem.

This is one of the reasons why I personally recommend sending only one doc at a time -- if you use persistent HTTP connections, there really shouldn't be much of a performance difference (and if there is, we can probably optimize that)
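[The one-doc-per-request pattern described above could be sketched as follows. This is an illustration only, assuming Solr's XML update endpoint; the host, port, `/solr/update` path, and all function names are assumptions, not from this thread:]

```python
# Sketch: post documents one at a time over a single persistent HTTP
# connection, collecting failures by batch position instead of aborting.
import http.client
from xml.sax.saxutils import escape

def doc_to_xml(doc):
    """Serialize one document dict into a Solr <add> XML payload."""
    fields = "".join(
        f'<field name="{escape(name)}">{escape(str(value))}</field>'
        for name, value in doc.items()
    )
    return f"<add><doc>{fields}</doc></add>"

def post_docs(docs, post=None):
    """Post each doc in its own request; return [(position, error), ...].

    `post` takes an XML payload and raises on failure. The default sends
    over one reused HTTPConnection, so each doc is a separate request but
    the underlying TCP connection persists (keep-alive).
    """
    if post is None:
        conn = http.client.HTTPConnection("localhost", 8983)  # assumed host/port
        def post(payload):
            conn.request("POST", "/solr/update", payload,
                         {"Content-Type": "text/xml"})
            resp = conn.getresponse()
            resp.read()  # drain the body so the connection can be reused
            if resp.status != 200:
                raise RuntimeError(f"HTTP {resp.status}")
    failures = []
    for i, doc in enumerate(docs):
        try:
            post(doc_to_xml(doc))
        except Exception as exc:
            failures.append((i, str(exc)))  # report by position, not uniqueKey
    return failures
```

[Because each document travels in its own request, a failure can be reported by its position in the batch even when the schema has no uniqueKey, or when the uniqueKey field itself is the bad one.]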


-Hoss
