Thanks, I'll create a deliberate test tomorrow and feed some random data through it several times to see what happens.
I'm also working on simply improving the buffer to handle the situation
internally, but a few hours of testing isn't a big deal.

Ta,
Greg

On 8 September 2010 21:41, Erick Erickson <erickerick...@gmail.com> wrote:

> This would be surprising behavior; if you can reliably reproduce this,
> it's worth a JIRA.
>
> But (and I'm stretching a bit here) are you sure you're committing at the
> end of the batch AND are you sure you're looking after the commit? Here's
> the scenario: Your updated document is at positions 1 and 100 in your batch.
> Somewhere around SOLR processing document 50, an autocommit occurs,
> and you're looking at your results before SOLR gets around to committing
> document 100. Like I said, it's a stretch.
>
> To test this, you need to be absolutely sure of two things before you
> search:
> 1> the batch is finished processing
> 2> you've issued a commit after the last document in the batch.
>
> If you're sure of the above and still see the problem, please let us
> know...
>
> HTH
> Erick
>
> On Tue, Sep 7, 2010 at 10:32 PM, Greg Pendlebury
> <greg.pendleb...@gmail.com> wrote:
>
> > Does anyone know with certainty how (or even if) order is evaluated when
> > updates are performed by batch?
> >
> > Our application internally buffers solr documents for speed of ingest
> > before sending them to the server in chunks. The XML documents sent to
> > the solr server contain all documents in the order they arrived without
> > any settings changed from the defaults (so overwrite = true). We are
> > careful to avoid things like HashMaps on our side since they'd lose the
> > order, but I can't be certain what occurs inside Solr.
> >
> > Sometimes if an object has been indexed twice for various reasons it
> > could appear twice in the buffer, but the most up-to-date version is
> > always last. I have however observed instances where the first copy of
> > the document is indexed and differences in the second copy are missing.
> > Does this sound likely? And if so, are there any obvious settings I can
> > play with to get the behavior I desire?
> >
> > I looked at:
> > http://wiki.apache.org/solr/UpdateXmlMessages
> >
> > but there is no mention of order, just the overwrite flag (which I'm
> > unsure how it is applied internally to an update message) and the
> > deprecated duplicates flag (which I have no idea about).
> >
> > Would switching to SolrInputDocuments on a CommonsHttpSolrServer help,
> > as per http://wiki.apache.org/solr/Solrj? There is no mention of order
> > there either, however.
> >
> > Thanks to anyone who took the time to read this.
> >
> > Ta,
> > Greg
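
For reference, here is a minimal sketch of the buffer-side workaround mentioned at the top of this message: collapse duplicates before a batch is sent, so only the most recent copy of each document remains and the question of ordering inside Solr never arises. It is written in Python purely for illustration and is not the application's actual buffer. The function name `dedupe_keep_last` and the assumption that each document is a dict with a unique `id` field are hypothetical.

```python
from collections import OrderedDict


def dedupe_keep_last(docs, key="id"):
    """Keep only the last-seen version of each document.

    Assumes (hypothetically) that every document is a dict carrying a
    unique-key field named by `key`. The result is ordered by the
    position of each document's final occurrence in the buffer.
    """
    latest = OrderedDict()
    for doc in docs:
        doc_id = doc[key]
        # Remove any earlier copy so the newer one is re-inserted at the end.
        latest.pop(doc_id, None)
        latest[doc_id] = doc
    return list(latest.values())


buffer = [
    {"id": "a", "title": "first copy"},
    {"id": "b", "title": "only copy"},
    {"id": "a", "title": "second copy"},
]
batch = dedupe_keep_last(buffer)
```

With the sample buffer above, `batch` holds document `b` followed by the second copy of `a`. An insertion-ordered map is used deliberately, since an unordered map would lose the arrival order, which is the concern raised in the original question.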