UPDATE:

After a few more minutes (since the previous commit):
docsPending: about 7,000,000

After commit:
numDocs: 2,297,231

Increase = 2,297,231 - 1,281,851 = 1,015,380 (about 1,000,000)

So I have about 7 docs with the same ID on average.
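
The arithmetic above can be checked with a quick sketch. It uses only the figures quoted in this thread; the variable names are my own, and it assumes that all of the roughly 7,000,000 pending docs went into this one commit.

```python
# Figures reported in this thread; docs_pending is an approximate reading.
docs_pending = 7_000_000       # docsPending shortly before the commit
num_docs_before = 1_281_851    # numDocs after the previous commit + optimize
num_docs_after = 2_297_231     # numDocs after this commit

# Net growth in live documents caused by the commit.
increase = num_docs_after - num_docs_before

# Average number of submitted docs per newly visible document.
adds_per_new_doc = docs_pending / increase

print(f"increase: {increase:,}")
print(f"adds per new doc: {adds_per_new_doc:.1f}")
```

This only restates the numbers; it says nothing about why so many submissions share an ID.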

Adding about 100,000,000 documents and then seeing numDocs drop below
1,000,000 is strange; there is a bug somewhere... I need to investigate
ramBufferSize and MergePolicy, as well as the SOLR uniqueKey implementation...
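
For reference while investigating: with a <uniqueKey>, adding a document whose ID already exists replaces the older one, so numDocs counts only live documents, while maxDoc also counts superseded ones until segments are merged or the index is optimized. Below is a toy model of that bookkeeping. ToyIndex and its methods are hypothetical names for illustration, not Solr or Lucene code, and the 1,000-page input is made up.

```python
import hashlib


class ToyIndex:
    """Toy model of add-with-overwrite on a unique key (not Solr code)."""

    def __init__(self):
        self.live = {}     # unique key -> latest version of the document
        self.deleted = 0   # superseded versions not yet merged away

    def add(self, key, doc):
        # Re-adding an existing key marks the old version as deleted.
        if key in self.live:
            self.deleted += 1
        self.live[key] = doc

    @property
    def num_docs(self):
        return len(self.live)

    @property
    def max_doc(self):
        return len(self.live) + self.deleted

    def optimize(self):
        # Merging everything drops the deleted versions.
        self.deleted = 0


index = ToyIndex()
for i in range(7000):
    # Only 1,000 distinct inputs, so only 1,000 distinct MD5 hex keys.
    key = hashlib.md5(f"page-{i % 1000}".encode()).hexdigest()
    index.add(key, {"seq": i})

before = (index.num_docs, index.max_doc)
index.optimize()
after = (index.num_docs, index.max_doc)
print(before, after)
```

In this model 7,000 adds leave 1,000 live docs, the same kind of ratio as above. It does not explain an index shrinking without a commit, which is why the buffer flush and merge path still needs checking.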



Funtick wrote:
> 
> After running an application which heavily uses the MD5 hex representation
> as the <uniqueKey> for SOLR v.1.4-dev-trunk:
> 
> 1. After 30 hours: 
> 101,000,000 documents added
> 
> 2. Commit: 
> numDocs = 783,714 
> maxDoc = 3,975,393
> 
> 3. Upload new docs to SOLR for 1 hour(!!!!!!!), then commit, then
> optimize:
> numDocs = 1,281,851
> maxDoc = 1,281,851
> 
> It looks _extremely_ strange that within an hour I have such a huge
> increase with the same 'average' document set...
> 
> I suspect something is going wrong with the Lucene buffer flush / index
> merge OR the SOLR unique ID handling...
> 
> According to my own estimates, I should have about 10,000,000 new
> documents now... I got 0.5 million within an hour, and 0.8 million within
> a day; same 'random' documents.
> 
> This morning the index size was about 4 GB, then it suddenly dropped below
> 0.5 GB. Why? I haven't issued any "commit"...
> 
> I am using ramBufferMB=8192

-- 
View this message in context: 
http://www.nabble.com/SOLR-%3CuniqueKey%3E---extremely-strange-behavior%21-Documents-disappeared...-tp25017728p25018221.html
Sent from the Solr - User mailing list archive at Nabble.com.
