Thanks for the CheckIndex info, that worked! It looks like only one of the segments in that shard has issues:

  1 of 20: name=_1om docCount=216683
    codec=Lucene3x
    compound=false
    numFiles=10
    size (MB)=5,111.421
    diagnostics = {os=Linux, os.version=3.5.7, mergeFactor=7, source=merge, lucene.version=3.6.0 1310449 - rmuir - 2012-04-06 11:31:16, os.arch=amd64, mergeMaxNumSegments=-1, java.version=1.6.0_26, java.vendor=Sun Microsystems Inc.}
    no deletions
    test: open reader.........OK
    test: check integrity.....OK
    test: check live docs.....OK
    test: fields..............OK [31 fields]
    test: field norms.........OK [20 fields]
    test: terms, freq, prox...ERROR: java.lang.AssertionError: index=216690, numBits=216683
java.lang.AssertionError: index=216690, numBits=216683
    at org.apache.lucene.util.FixedBitSet.set(FixedBitSet.java:252)
    at org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:932)
    at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1325)
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:631)
    at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2051)
    test: stored fields.......OK [3033562 total field count; avg 14 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET]
FAILED
    WARNING: fixIndex() would remove reference to this segment; full exception:
java.lang.RuntimeException: Term Index test failed
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:646)
    at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2051)

This is on ES 1.3.4, but the index I was running optimize on was likely created back in 0.9 or 1.0.

On Tuesday, March 24, 2015 at 5:27:04 AM UTC-4, Michael McCandless wrote:
>
> Hmm, not good.
>
> Which version of ES? Do you have a full stack trace for the exception?
>
> To run CheckIndex you need to add all ES jars to the classpath.
> It's easiest to just use a wildcard for this, e.g.:
>
>   java -cp "/path/to/es-install/lib/*" org.apache.lucene.index.CheckIndex ...
>
> Make sure you have the double quotes so the shell does not expand that
> wildcard!
>
> Mike McCandless
>
> On Mon, Mar 23, 2015 at 9:50 PM, <mjd...@gmail.com> wrote:
>
>> I did an optimize on this index and it looks like it caused a shard to
>> become corrupted. Or maybe the optimize just brought the shard corruption
>> to light?
>>
>> On the node that reported the corrupted shard I tried shutting it down,
>> moving the shard out and then restarting. Unfortunately the next node that
>> got that shard then started with the same corruption issues. The errors:
>>
>> Mar 24 01:40:17 localhost elasticsearch: [bma.0][WARN ][indices.cluster ] [Meteorite II] [1-2013][0] failed to start shard
>> Mar 24 01:40:17 localhost org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [1-2013][0] failed to fetch index version after copying it over
>> Mar 24 01:40:17 localhost elasticsearch: [bma.0][WARN ][cluster.action.shard ] [Meteorite II] [1-2013][0] sending failed shard for [1-2013][0], node[ZzXsIZCsTyWD2emFuU0idg], [P], s[INITIALIZING], indexUUID [_na_], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[1-2013][0] failed to fetch index version after copying it over]; nested: CorruptIndexException[[1-2013][0] Corrupted index [corrupted_OahNymObSTyBzCCPu1FuJA] caused by: CorruptIndexException[docs out of order (1493829 <= 1493874 ) (docOut: org.apache.lucene.store.RateLimitedIndexOutput@2901a3e1)]]; ]]
>>
>> I tried using CheckIndex, but had this issue:
>>
>> java.lang.IllegalArgumentException: A SPI class of type
>> org.apache.lucene.codecs.PostingsFormat with name 'es090' does not exist.
>> You need to add the corresponding JAR file supporting this SPI to your
>> classpath.The current classpath supports the following names: [Pulsing41,
>> SimpleText, Memory, BloomFilter, Direct, FSTPulsing41, FSTOrdPulsing41,
>> FST41, FSTOrd41, Lucene40, Lucene41]
>>
>> When running with:
>>
>> java -cp /usr/share/elasticsearch/lib/lucene-codecs-4.9.1.jar:/usr/share/elasticsearch/lib/lucene-core-4.9.1.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex
>>
>> I'm not a java programmer so after I tried other classpath combinations I
>> was out of ideas.
>>
>> Any tips? Looking at _cat/shards the replica is currently marked
>> "unassigned" while the primary is "initializing". Thanks!
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearc...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/31fa3d97-02fa-4d1c-b507-d413051f2ea3%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
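
For reference, the classpath advice quoted above can be wrapped in a small shell helper. This is only a sketch: the function names, the DRY_RUN switch, and the shard path are assumptions made for illustration, not anything provided by Elasticsearch or Lucene. It keeps the lib/* wildcard quoted so that the JVM, rather than the shell, expands it to every jar in the directory, and by default it only prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: function names, DRY_RUN, and the shard path are hypothetical.

# Build the classpath value. The /* stays literal because it is quoted;
# the JVM itself expands it to every jar in the directory.
checkindex_classpath() {
    printf '%s/*' "$1"
}

# Print (default) or run CheckIndex with Lucene assertions enabled.
# $1 = Elasticsearch lib directory, $2 = the shard's Lucene index directory.
run_checkindex() {
    classpath=$(checkindex_classpath "$1")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf 'java -cp "%s" -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex %s\n' "$classpath" "$2"
    else
        java -cp "$classpath" -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex "$2"
    fi
}

# Example (dry run): the second path is a placeholder for your own layout.
run_checkindex /usr/share/elasticsearch/lib /path/to/shard/index
```

Setting DRY_RUN=0 would run CheckIndex for real. It is probably safest to do that against a copy of the shard directory, or at least with the node stopped.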