Hi,

hossman wrote:
> 
> : We index using 4 processes that read from a queue of documents. Each
> : process sends one document at a time to the /update handler.
> 
> Hmmm.. then you should have a message from the LogUpdateProcessorFactory 
> for every individual "add" command that was received ... did you crunch 
> those to see if anything odd popped up (ie: duplicated IDs)
> 
> what did the "start commit" log messages look like?
> 
> (FWIW: I have no hunches as to what caused that behavior, i'm just 
> scrounging for more data)
> 

A quick check did show me a couple of duplicates, but if I understand
correctly, even if two different processes send the same document, the last
one should overwrite the previous one. If I send the same document 10 times,
in the end it should only be in my index once, no?
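
To make my expectation concrete, here is a toy model in Python of what I
assume happens when documents are overwritten on the unique key. This is
only a sketch: ToyIndex and its methods are made-up names, not the Solr
API, and it says nothing about how Solr implements this internally.

```python
class ToyIndex:
    """Hypothetical stand-in for an index keyed on a unique id (not Solr code)."""

    def __init__(self):
        self._pending = {}    # adds received since the last commit
        self._committed = {}  # what a searcher would see

    def add(self, doc):
        # Assumption: a later add with the same id replaces the earlier one.
        self._pending[doc["id"]] = doc

    def commit(self):
        # Make everything sent so far visible.
        self._committed.update(self._pending)
        self._pending.clear()

    def num_docs(self):
        return len(self._committed)


index = ToyIndex()
for attempt in range(10):
    # The same document sent 10 times, e.g. by different processes.
    index.add({"id": "doc-1", "attempt": attempt})
index.add({"id": "doc-2", "attempt": 0})
index.commit()
count = index.num_docs()
```

Under that assumption, count ends up as 2 (doc-1 once, doc-2 once), so
duplicates alone should not inflate the document count.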

The "start commit" message is always:
start commit(optimize=false,waitFlush=true,waitSearcher=true,expungeDeletes=false)


hossman wrote:
> 
> : Yes, I double-checked that no deletes occurred. Since that indexing run, I
> : re-indexed the same set of documents twice and we always end up with 7725
> : documents; we never again saw the ~10000 document count from the
> : first time. But the difference between the first run and the others
> : was that the first time, indexing lasted a couple of hours because the
> : documents were not always accessible in our document queue. The others
> 
> Hmmm... what exactly does your indexing code do when the documents aren't 
> available?  ... and what happens if you forcibly commit in the middle of 
> reindexing (to see some of those counts again)
> 

If no document is available, the threads sleep. If a commit is sent
manually during the re-indexing, it just commits what has been sent to the
index so far.
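
Roughly, the worker logic looks like the sketch below. It is a simplified
Python illustration, not our actual code: the names worker, run_indexing
and send are made up, send only records the id where the real code posts
the document to the /update handler, and it uses threads in a single
process for brevity.

```python
import queue
import threading
import time


def worker(doc_queue, send, stop, idle_seconds=0.01):
    """Send one document at a time; sleep while the queue is empty."""
    while True:
        # Read the flag before polling, so an empty queue seen afterwards
        # really means that nothing is left to send.
        stopping = stop.is_set()
        try:
            doc = doc_queue.get_nowait()
        except queue.Empty:
            if stopping:
                return
            time.sleep(idle_seconds)  # no document available: just wait
            continue
        send(doc)


def run_indexing(docs, n_workers=4):
    doc_queue = queue.Queue()
    stop = threading.Event()
    sent = []
    lock = threading.Lock()

    def send(doc):
        # Placeholder for the HTTP request to /update (assumption).
        with lock:
            sent.append(doc["id"])

    threads = [
        threading.Thread(target=worker, args=(doc_queue, send, stop))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for doc in docs:
        doc_queue.put(doc)
    stop.set()  # everything is queued; workers exit once the queue drains
    for t in threads:
        t.join()
    return sent


sent_ids = run_indexing([{"id": str(i)} for i in range(20)])
```

Each queued document is sent exactly once here, whichever worker picks it
up, and a slow queue only makes the workers wait longer.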

I will redo the test with the same documents and under the same conditions
as our first indexing run to see whether the counts come out the same again.

Again, thanks a lot for your help.


-- 
View this message in context: 
http://old.nabble.com/Documents-disappearing-tp27659047p27794641.html
Sent from the Solr - User mailing list archive at Nabble.com.
