I got another corruption.
It looks like the same type of error (on a different field).
It's also not linked to a merge, since the segment size did not change.
*** good segment:
1 of 9: name=_ncc docCount=1841685
compound=false
hasProx=true
numFiles=9
size (MB)=6,683.447
diagnostics = {optimize=false, mergeFactor=10,
os.version=2.6.26-2-amd64, os=Linux, mergeDocStores=true,
lucene.version=2.9.3 951790 - 2010-06-06 01:30:55, source=merge,
os.arch=amd64, java.version=1.6.0_20, java.vendor=Sun Microsystems Inc.}
has deletions [delFileName=_ncc_22s.del]
test: open reader.........OK [275881 deleted docs]
test: fields..............OK [51 fields]
test: field norms.........OK [51 fields]
test: terms, freq, prox...OK [17952652 terms; 174113812 terms/docs
pairs; 204561440 tokens]
test: stored fields.......OK [45511958 total field count; avg
29.066 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
A few hours later:
*** broken segment:
1 of 17: name=_ncc docCount=1841685
compound=false
hasProx=true
numFiles=9
size (MB)=6,683.447
diagnostics = {optimize=false, mergeFactor=10,
os.version=2.6.26-2-amd64, os=Linux, mergeDocStores=true,
lucene.version=2.9.3 951790 - 2010-06-06 01:30:55, source=merge,
os.arch=amd64, java.version=1.6.0_20, java.vendor=Sun Microsystems Inc.}
has deletions [delFileName=_ncc_24f.del]
test: open reader.........OK [278167 deleted docs]
test: fields..............OK [51 fields]
test: field norms.........OK [51 fields]
test: terms, freq, prox...ERROR [term post_id:1599104 docFreq=1 !=
num docs seen 0 + num docs deleted 0]
java.lang.RuntimeException: term post_id:1599104 docFreq=1 != num docs
seen 0 + num docs deleted 0
at
org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:675)
at
org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530)
at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
test: stored fields.......OK [45429565 total field count; avg
29.056 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq
vector fields per doc]
FAILED
WARNING: fixIndex() would remove reference to this segment; full
exception:
java.lang.RuntimeException: Term Index test failed
at
org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543)
at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
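For context, the failing check asserts that each term's stored docFreq equals the number of postings CheckIndex actually visits, counting live and deleted docs separately. A toy sketch of that invariant in Python (function and variable names are mine, not Lucene's):

```python
def check_term(doc_freq, postings, deleted):
    """Replay the per-term consistency check: the stored docFreq must
    equal the number of postings actually visited, live or deleted."""
    seen = sum(1 for d in postings if d not in deleted)
    seen_del = sum(1 for d in postings if d in deleted)
    if doc_freq != seen + seen_del:
        raise RuntimeError(
            "term docFreq=%d != num docs seen %d + num docs deleted %d"
            % (doc_freq, seen, seen_del))

# Healthy case: docFreq matches the postings we can enumerate.
check_term(doc_freq=1, postings=[1599104], deleted=set())

# Corrupt case: docFreq claims 1 doc but no posting is reachable --
# the same shape as the failure above.
try:
    check_term(doc_freq=1, postings=[], deleted=set())
except RuntimeError as e:
    print(e)  # term docFreq=1 != num docs seen 0 + num docs deleted 0
```

This is why flipped bits in the .del file are a plausible culprit: a corrupted deletes bitmap changes which postings are counted where, without any segment data having been rewritten.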
I'll activate infoStream for next time.
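For reference, in Solr 1.4 infoStream can be enabled from solrconfig.xml rather than code; if I recall the stock 1.4 example config correctly, the setting lives in the indexDefaults section, shipped disabled (the file name below is just the example default):

```xml
<indexDefaults>
  <!-- ... existing merge/buffer settings ... -->
  <!-- Route IndexWriter's infoStream output to a file for debugging -->
  <infoStream file="INFOSTREAM.txt">true</infoStream>
</indexDefaults>
```

The output is verbose (every flush, merge and delete is logged), so plan for disk space if the corruption takes days to reappear.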
Thanks,
On 12/01/2011 00:49, Michael McCandless wrote:
When you hit corruption, is it always this same problem:
java.lang.RuntimeException: term source:margolisphil docFreq=1 !=
num docs seen 0 + num docs deleted 0
Can you run with Lucene's IndexWriter infoStream turned on, and catch
the output leading to the corruption? If something is somehow messing
up the bits in the deletes file, that could cause this.
Mike
On Mon, Jan 10, 2011 at 5:52 AM, Stéphane Delprat
<stephane.delp...@blogspirit.com> wrote:
Hi,
We are using:
Solr Specification Version: 1.4.1
Solr Implementation Version: 1.4.1 955763M - mark - 2010-06-17 18:06:42
Lucene Specification Version: 2.9.3
Lucene Implementation Version: 2.9.3 951790 - 2010-06-06 01:30:55
# java -version
java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode)
We want to index 4M docs in one core (and once that works well, we will add
other cores with 2M docs each on the same server); 1 doc is roughly 1 kB.
We use Solr replication every 5 minutes to update the slave server (queries
are executed on the slave only).
Documents change very quickly; during a normal day we have approximately:
* 200 000 updated docs
* 1000 new docs
* 200 deleted docs
I attached the last good checkIndex output: solr20110107.txt
And the corrupted one: solr20110110.txt
This is not the first time a segment has been corrupted on this server;
that's why I run checkIndex frequently. (But as you can see, the first
segment holds 1,800,000 docs and it checks out fine!)
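For anyone wanting to reproduce these reports: the output above comes from Lucene's CheckIndex command-line tool, and a typical invocation looks roughly like this (jar and index paths are examples, not our actual layout):

```shell
# Inspect the index read-only; CheckIndex prints per-segment diagnostics.
java -cp lucene-core-2.9.3.jar org.apache.lucene.index.CheckIndex \
    /path/to/solr/data/index

# -fix drops any broken segment entirely (its documents are LOST),
# so only run it against a backup or after taking one:
# java -cp lucene-core-2.9.3.jar org.apache.lucene.index.CheckIndex \
#     /path/to/solr/data/index -fix
```

Note CheckIndex should not run while an IndexWriter has the index open for writing.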
I can't find any "SEVERE", "FATAL" or "exception" entries in the Solr logs.
I also attached my schema.xml and solrconfig.xml
Is there something wrong with what we are doing? Do you need any other info?
Thanks,