[mailto:f...@efendi.ca]
Sent: August-18-09 12:25 AM
To: solr-user@lucene.apache.org
Subject: Re: SOLR uniqueKey - extremely strange behavior! Documents
disappeared...
Sorry for the typo in my previous message,
Increase = 2,297,231 - 1,786,552 = 500,000 (average)
RATE (non-unique-id:unique-id) = 7,000,000 : 500,000 = 14:1
I'd say you have a lot of documents that have the same id.
When you add a doc with the same id, first the old one is deleted, then the
new one is added (atomically though).
The deleted docs are not removed from the index immediately though - the doc
id is just marked as deleted.
Over time
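The overwrite behavior described above (the old doc is marked as deleted, then the new one is added) can be sketched with a toy model. This is only an illustration in Python, not Solr's or Lucene's actual implementation; the class ToyIndex and its methods are hypothetical names, and num_docs / max_doc only loosely mirror the numDocs / maxDoc figures on the stats page.

```python
class ToyIndex:
    """Toy model of uniqueKey overwrite; NOT Solr's real implementation."""

    def __init__(self):
        self.slots = []       # every document ever written, in arrival order
        self.deleted = set()  # slot positions that are only marked as deleted
        self.live = {}        # unique id -> slot position of the live document

    def add(self, doc_id, body):
        # If the id already exists, mark the old slot as deleted first...
        old = self.live.get(doc_id)
        if old is not None:
            self.deleted.add(old)
        # ...then append the new version and point the id at it.
        self.slots.append((doc_id, body))
        self.live[doc_id] = len(self.slots) - 1

    @property
    def num_docs(self):
        # Live (searchable) documents: one per unique id.
        return len(self.live)

    @property
    def max_doc(self):
        # All slots, including those marked as deleted but not yet purged.
        return len(self.slots)

    def merge(self):
        # Physically drop the slots marked as deleted (assumed to be roughly
        # what a segment merge or optimize achieves).
        kept = [doc for pos, doc in enumerate(self.slots)
                if pos not in self.deleted]
        self.slots = kept
        self.deleted = set()
        self.live = {doc_id: pos for pos, (doc_id, _) in enumerate(kept)}


index = ToyIndex()
for i in range(14):
    index.add(i % 2, f"version {i}")  # 14 adds, but only 2 distinct ids

before = (index.num_docs, index.max_doc)
index.merge()
after = (index.num_docs, index.max_doc)
print(before, after)
```

In this toy model, 14 adds over 2 distinct ids leave only 2 live documents, which is the kind of gap between documents sent and numDocs discussed in this thread.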
But how do you explain that within an hour (after commit) I got about
500,000 new documents, while within 30 hours (after commit) I got only 1,300,000?
Same _random_enough_ documents...
BTW, the SOLR Console was showing only a few hundred deletesById, although I
don't use any deleteById explicitly; only
One more hour, and I have 0.5 million more (after commit/optimize)
Something strange is happening with the SOLR buffer flush (if we have a single
segment???)... an explicit commit prevents it...
30 hours, with index flush, commit: 783,714
+ 1 hour, commit, optimize: 1,281,851
+ 1 hour, commit, optimize:
UPDATE:
After a few more minutes (after the previous commit):
docsPending: about 7,000,000
After commit:
numDocs: 2,297,231
Increase = 2,297,231 - 1,281,851 = 1,000,000 (average)
So I have 7 docs with the same ID on average.
Having 100,000,000 and then dropping below 1,000,000 is strange; it is a
Sorry for the typo in my previous message,
Increase = 2,297,231 - 1,786,552 = 500,000 (average)
RATE (non-unique-id:unique-id) = 7,000,000 : 500,000 = 14:1
but 125:1 (initial 30 hours) was very strange...
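The ratios quoted above can be re-checked with a few lines of Python. This is just a sketch of the arithmetic using the figures from this thread; pairing 100,000,000 with 783,714 for the initial 30 hours is an assumption about how the 125:1 figure was obtained, and ratio_to_one is a hypothetical helper name.

```python
def ratio_to_one(total_adds, unique_docs):
    """Return how many documents were sent per document that stayed unique."""
    return total_adds / unique_docs


# Corrected increase in numDocs between the last two commits.
increase = 2_297_231 - 1_786_552

# 7,000,000 pending docs against the rounded and the exact increase.
rounded_rate = ratio_to_one(7_000_000, 500_000)
exact_rate = ratio_to_one(7_000_000, increase)

# Assumption: the initial 30 hours compare 100,000,000 sent with 783,714 kept.
initial_rate = ratio_to_one(100_000_000, 783_714)

print(increase)                # 510679
print(rounded_rate)            # 14.0
print(round(exact_rate, 1))    # 13.7
print(round(initial_rate, 1))  # 127.6
```

With the exact numbers the recent rate is a little under 14:1 and the initial rate is a little over 125:1, so the rounded figures in the message hold up.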
Funtick wrote:
UPDATE:
After a few more minutes (after the previous commit):
docsPending: about