Hi,

We're testing a 10-node cluster running trunk and writing a few million documents to it from Hadoop. We just saw a node die for no apparent reason. Tomcat was completely dead before it was automatically restarted. Indexing failed when it received the typical Internal Server Error. The log only shows:

2012-10-23 19:07:09,291 ERROR [solr.update.UpdateLog] - [main] - : Failure to open existing log file (non fatal) /opt/solr/cores/shard_f/data/tlog/tlog.0000000000000010484:org.apache.solr.common.SolrException: java.io.EOFException
        at org.apache.solr.update.TransactionLog.<init>(TransactionLog.java:182)
        at org.apache.solr.update.UpdateLog.init(UpdateLog.java:216)
        at org.apache.solr.update.UpdateHandler.initLog(UpdateHandler.java:82)
        at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:111)
        at org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:97)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
        at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:483)
        at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:551)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:714)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:573)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:850)
        at org.apache.solr.core.CoreContainer.load(CoreContainer.java:534)
        at org.apache.solr.core.CoreContainer.load(CoreContainer.java:356)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:308)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:107)
        ..lots of catalina traces...
Caused by: java.io.EOFException
        at org.apache.solr.common.util.FastInputStream.readUnsignedByte(FastInputStream.java:72)
        at org.apache.solr.common.util.FastInputStream.readInt(FastInputStream.java:206)
        at org.apache.solr.update.TransactionLog.readHeader(TransactionLog.java:266)
        at org.apache.solr.update.TransactionLog.<init>(TransactionLog.java:160)
        ... 44 more

According to syslog, Tomcat was not killed by the OOM killer, which is what I initially expected. Syslog is also still running ;)

It seems the error is more fatal than the "non fatal" log message claims: the indexing error and the exception happened within a few seconds of each other.

Any ideas? Existing issue? File bug?

Thanks,
Markus