Hi all, I have run CheckIndex. It seems that the index is corrupted. I got plenty of exceptions like:

test: terms, freq, prox...ERROR: java.lang.ArrayIndexOutOfBoundsException
java.lang.ArrayIndexOutOfBoundsException
    at org.apache.lucene.store.ByteArrayDataInput.readBytes(ByteArrayDataInput.java:181)
    at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.nextLeaf(BlockTreeTermsReader.java:2414)
    at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.next(BlockTreeTermsReader.java:2400)
    at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.next(BlockTreeTermsReader.java:2074)
    at org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:771)
    at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1164)
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:602)
    at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:1748)

and

test: terms, freq, prox...ERROR: java.lang.RuntimeException: term [6f 70 65 72 61 63 69 6a 61]: doc 105407 <= lastDoc 105407
java.lang.RuntimeException: term [6f 70 65 72 61 63 69 6a 61]: doc 105407 <= lastDoc 105407
    at org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:858)
    at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1164)
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:602)
    at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:1748)
test: stored fields.......OK [723321 total field count; avg 3 fields per doc]

The final warning was:

WARNING: 154 broken segments (containing 48127608 documents) detected
WARNING: would write new segments file, and 48127608 documents would be lost, if -fix were specified

As I mentioned, I ran the optimization after the initial import (no further adds or deletions were made). For the import I create CSV files and load them through the CSV upload with multiple threads. The index is otherwise queryable.

Any ideas what I should do next? Is this a bug in Lucene?

Many thanks...
Rok

On Thu, Jun 7, 2012 at 5:05 PM, Jack Krupansky <j...@basetechnology.com> wrote:

> Is the index otherwise usable for queries? And it is only the optimize
> that is failing?
>
> I suppose it is possible that the index could be corrupted, but it is also
> possible that there is a bug in Lucene.
>
> I would suggest running Lucene "CheckIndex" next. See what it has to say.
>
> See:
> https://builds.apache.org/job/Lucene-trunk/javadoc/core/org/apache/lucene/index/CheckIndex.html#main(java.lang.String[])
>
> -- Jack Krupansky
>
> -----Original Message----- From: Rok Rejc
> Sent: Thursday, June 07, 2012 5:50 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Exception when optimizing index
>
> Hi Jack,
>
> it's a virtual machine running on VMware vSphere 5 Enterprise Plus.
> The machine has 30 GB vRAM, an 8-core vCPU at 3.0 GHz, and 2 TB SATA RAID-10 over iSCSI.
> The operating system is CentOS 6.2 64-bit.
>
> Here is the Java info:
>
> - catalina.base = /usr/share/tomcat6
> - catalina.home = /usr/share/tomcat6
> - catalina.useNaming = true
> - common.loader
>     ${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar
> - file.encoding = UTF-8
> - file.encoding.pkg = sun.io
> - file.separator = /
> - java.awt.graphicsenv = sun.awt.X11GraphicsEnvironment
> - java.awt.printerjob = sun.print.PSPrinterJob
> - java.class.path
>     /usr/share/tomcat6/bin/bootstrap.jar
>     /usr/share/tomcat6/bin/tomcat-juli.jar
>     /usr/share/java/commons-daemon.jar
> - java.class.version = 50.0
> - java.endorsed.dirs
> - java.ext.dirs
>     /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/ext
>     /usr/java/packages/lib/ext
> - java.home = /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre
> - java.io.tmpdir = /var/cache/tomcat6/temp
> - java.library.path
>     /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server
>     /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64
>     /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/../lib/amd64
>     /usr/java/packages/lib/amd64
>     /usr/lib64
>     /lib64
>     /lib
>     /usr/lib
> - java.naming.factory.initial = org.apache.naming.java.javaURLContextFactory
> - java.naming.factory.url.pkgs = org.apache.naming
> - java.runtime.name = OpenJDK Runtime Environment
> - java.runtime.version = 1.6.0_22-b22
> - java.specification.name = Java Platform API Specification
> - java.specification.vendor = Sun Microsystems Inc.
> - java.specification.version = 1.6
> - java.util.logging.config.file = /usr/share/tomcat6/conf/logging.properties
> - java.util.logging.manager = org.apache.juli.ClassLoaderLogManager
> - java.vendor = Sun Microsystems Inc.
> - java.vendor.url = http://java.sun.com/
> - java.vendor.url.bug = http://java.sun.com/cgi-bin/bugreport.cgi
> - java.version = 1.6.0_22
> - java.vm.info = mixed mode
> - java.vm.name = OpenJDK 64-Bit Server VM
> - java.vm.specification.name = Java Virtual Machine Specification
> - java.vm.specification.vendor = Sun Microsystems Inc.
> - java.vm.specification.version = 1.0
> - java.vm.vendor = Sun Microsystems Inc.
> - java.vm.version = 20.0-b11
> - javax.sql.DataSource.Factory = org.apache.commons.dbcp.BasicDataSourceFactory
> - line.separator
> - os.arch = amd64
> - os.name = Linux
> - os.version = 2.6.32-220.13.1.el6.x86_64
> - package.access
>     sun.,org.apache.catalina.,org.apache.coyote.,org.apache.tomcat.,org.apache.jasper.,sun.beans.
> - package.definition
>     sun.,java.,org.apache.catalina.,org.apache.coyote.,org.apache.tomcat.,org.apache.jasper.
> - path.separator = :
> - server.loader
> - shared.loader
> - sun.arch.data.model = 64
> - sun.boot.class.path
>     /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/resources.jar
>     /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/rt.jar
>     /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/sunrsasign.jar
>     /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/jsse.jar
>     /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/jce.jar
>     /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/charsets.jar
>     /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/netx.jar
>     /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/plugin.jar
>     /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/rhino.jar
>     /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/modules/jdk.boot.jar
>     /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/classes
> - sun.boot.library.path = /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64
> - sun.cpu.endian = little
> - sun.cpu.isalist
> - sun.io.unicode.encoding = UnicodeLittle
> - sun.java.command = org.apache.catalina.startup.Bootstrap start
> - sun.java.launcher = SUN_STANDARD
> - sun.jnu.encoding = UTF-8
> - sun.management.compiler = HotSpot 64-Bit Tiered Compilers
> - sun.os.patch.level = unknown
> - tomcat.util.buf.StringCache.byte.enabled = true
> - user.country = US
> - user.dir = /usr/share/tomcat6
> - user.home = /usr/share/tomcat6
> - user.language = en
> - user.name = tomcat
> - user.timezone = Europe/Ljubljana
>
> As far as I can see from the JIRA issue, I already have the attached patch (as mentioned,
> I have a trunk version from May 12). Any ideas?
>
> Many thanks!
>
> On Wed, Jun 6, 2012 at 2:49 PM, Jack Krupansky <j...@basetechnology.com> wrote:
>
>> It could be related to https://issues.apache.org/jira/browse/LUCENE-2975.
>> At least the exception comes from the same function.
>>
>> "Caused by: java.io.IOException: Invalid vInt detected (too many bits)
>> at org.apache.lucene.store.DataInput.readVInt(DataInput.java:112)"
>>
>> What hardware and Java version are you running?
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Rok Rejc
>> Sent: Wednesday, June 06, 2012 3:45 AM
>> To: solr-user@lucene.apache.org
>> Subject: Exception when optimizing index
>>
>> Hi all,
>>
>> I have a Solr installation (version 4.0 from trunk - 1st May 2012).
>>
>> After I imported documents (99831145 documents) I ran the
>> optimization.
>> I got an exception:
>>
>> <response><lst name="responseHeader"><int name="status">500</int><int
>> name="QTime">281615</int></lst><lst name="error"><str
>> name="msg">background
>> merge hit exception: _8x(4.0):C202059 _e0(4.0):C192649 _3r(4.0):C205785
>> _1s(4.0):C203526 _4w(4.0):C199793 _7f(4.0):C193108 _dy(4.0):C185814
>> _7d(4.0):C190364 _c5(4.0):C187881 _8u(4.0):C185001 _r(4.0):C183475
>> _1r(4.0):C185622 _2s(4.0):C174349 _3s(4.0):C171683 _7h(4.0):C170618
>> _fj(4.0):C179232 _2t(4.0):C161907 _fi(4.0):C168713 _1q(4.0):C165402
>> _2r(4.0):C152995 _e1(4.0):C146080 _f4(4.0):C155072 _af(4.0):C149113
>> _dx(4.0):C147298 _3t(4.0):C150806 _q(4.0):C146874 _4v(4.0):C146324
>> _fc(4.0):C141426 _al(4.0):C125361 _64(4.0):C119208 into _ft
>> [maxNumSegments=1]</str><str name="trace">java.io.IOException:
>> background
>> merge hit exception: _8x(4.0):C202059 _e0(4.0):C192649 _3r(4.0):C205785
>> _1s(4.0):C203526 _4w(4.0):C199793 _7f(4.0):C193108 _dy(4.0):C185814
>> _7d(4.0):C190364 _c5(4.0):C187881 _8u(4.0):C185001 _r(4.0):C183475
>> _1r(4.0):C185622 _2s(4.0):C174349 _3s(4.0):C171683 _7h(4.0):C170618
>> _fj(4.0):C179232 _2t(4.0):C161907 _fi(4.0):C168713 _1q(4.0):C165402
>> _2r(4.0):C152995 _e1(4.0):C146080 _f4(4.0):C155072 _af(4.0):C149113
>> _dx(4.0):C147298 _3t(4.0):C150806 _q(4.0):C146874 _4v(4.0):C146324
>> _fc(4.0):C141426 _al(4.0):C125361 _64(4.0):C119208 into _ft
>> [maxNumSegments=1]
>> at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1475)
>> at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1412)
>> at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:385)
>> at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:82)
>> at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
>> at org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:783)
>> at org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:154)
>> at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:155)
>> at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:79)
>> at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:59)
>> at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
>> at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
>> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
>> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>> at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>> at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>> at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
>> at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>> at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>> at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
>> at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:865)
>> at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:579)
>> at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1556)
>> at java.lang.Thread.run(Thread.java:679)
>> Caused by: java.io.IOException: Invalid vInt detected (too many bits)
>> at org.apache.lucene.store.DataInput.readVInt(DataInput.java:112)
>> at org.apache.lucene.codecs.lucene40.Lucene40PostingsReader$AllDocsSegmentDocsEnum.nextUnreadDoc(Lucene40PostingsReader.java:557)
>> at org.apache.lucene.codecs.lucene40.Lucene40PostingsReader$SegmentDocsEnumBase.refill(Lucene40PostingsReader.java:408)
>> at org.apache.lucene.codecs.lucene40.Lucene40PostingsReader$AllDocsSegmentDocsEnum.nextDoc(Lucene40PostingsReader.java:508)
>> at org.apache.lucene.codecs.MappingMultiDocsEnum.nextDoc(MappingMultiDocsEnum.java:85)
>> at org.apache.lucene.codecs.PostingsConsumer.merge(PostingsConsumer.java:65)
>> at org.apache.lucene.codecs.TermsConsumer.merge(TermsConsumer.java:82)
>> at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:54)
>> at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:356)
>> at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:115)
>> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3382)
>> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3004)
>> at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382)
>> at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451)
>> </str><int name="code">500</int></lst></response>
>>
>> What could be wrong? The exception is reproducible. Is it fixed in
>> later versions?
>>
>> Many thanks...
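
For readers wondering what "Invalid vInt detected (too many bits)" means in the trace above: a vInt is a variable-length integer stored 7 bits per byte, low-order bits first, with the high bit of each byte signalling that another byte follows. A 32-bit value needs at most five bytes, and the fifth may only carry 4 bits, so anything more suggests the reader is looking at bytes that were never a valid vInt (for example, damaged or misaligned postings data). The following is only a minimal, simplified sketch of that decoding rule; the class name VIntSketch and the array-based signature are made up for illustration and this is not Lucene's actual source.

```java
import java.io.IOException;

public class VIntSketch {

    /**
     * Decodes a single variable-length int that starts at buf[0].
     * Hypothetical helper for illustration; Lucene reads from a DataInput instead.
     */
    public static int readVInt(byte[] buf) throws IOException {
        int value = 0;
        for (int i = 0; i < 5; i++) {
            if (i >= buf.length) {
                throw new IOException("Unexpected end of input");
            }
            byte b = buf[i];
            if (i < 4) {
                // Bytes 1-4 each contribute 7 payload bits.
                value |= (b & 0x7F) << (7 * i);
                if (b >= 0) {
                    return value; // high bit clear: this was the last byte
                }
            } else {
                // Byte 5 may only contribute the remaining 4 bits of a 32-bit int.
                if ((b & 0xF0) != 0) {
                    throw new IOException("Invalid vInt detected (too many bits)");
                }
                return value | ((b & 0x0F) << 28);
            }
        }
        throw new AssertionError("unreachable");
    }

    public static void main(String[] args) throws IOException {
        // 0x80 0x01 encodes 128: low 7 bits are 0, next 7 bits are 1.
        System.out.println(readVInt(new byte[] {(byte) 0x80, 0x01}));
        try {
            // Five bytes of 0xFF can never be a valid 32-bit vInt.
            readVInt(new byte[] {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF});
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Seen this way, the exception during the merge and the CheckIndex errors are consistent with each other: both indicate that the bytes in some segments do not decode as valid postings, though the sketch says nothing about how they got that way.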