Some more info to provide:

- Replication almost never completes after the "this IndexWriter is
closed" stacktraces appear.
- When replication begins after a "this IndexWriter is closed" error, the
replica fills the disk to 100% with index files under data/ over a few
hours. There are so many files in the data directory that it can't be
listed, and it takes a very long time to delete. It seems the frequent
replications are filling the disk with new files whose total size is
roughly 3 times that of the real index. Is it leaking file handles, or
forgetting that it has already downloaded something?
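
To get a rough picture of what is piling up without listing the whole
directory at once, something like the sketch below could be pointed at the
replica's data directory. It streams entries with os.scandir and totals the
file count and bytes per top-level entry. The helper names (iter_files,
summarize) are hypothetical, and the idea that the extra files are grouped
into subdirectories under data/ is an assumption, not something confirmed
here.

```python
import os
from collections import defaultdict


def iter_files(root):
    """Yield (path, size_in_bytes) for every regular file under root.

    Entries are streamed with os.scandir, so a directory holding a very
    large number of files is never loaded as one big sorted listing.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    yield entry.path, entry.stat(follow_symlinks=False).st_size


def summarize(root):
    """Return {top-level name under root: [file_count, total_bytes]}.

    Files that sit directly in root are grouped under the key ".".
    """
    totals = defaultdict(lambda: [0, 0])
    for path, size in iter_files(root):
        parts = os.path.relpath(path, root).split(os.sep, 1)
        top = parts[0] if len(parts) > 1 else "."
        totals[top][0] += 1
        totals[top][1] += size
    return dict(totals)


# Example use (the path is a placeholder, not taken from the cluster above):
# for name, (count, size) in sorted(summarize("/path/to/solr/data").items()):
#     print(f"{name}\t{count} files\t{size / 1e9:.2f} GB")
```

Comparing the per-directory totals between two runs a few minutes apart
should show whether one directory keeps growing or new ones keep appearing.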

Is this a better question for the Lucene list? It seems (see below) that
this stacktrace is occurring in the Lucene layer rather than in Solr, but
maybe someone could confirm?

"ERROR [2014-01-27 18:28:49.368] [org.apache.solr.common.SolrException]
org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
    at org.apache.lucene.index.DocumentsWriter.ensureOpen(DocumentsWriter.java:199)
    at org.apache.lucene.index.DocumentsWriter.preUpdate(DocumentsWriter.java:338)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:419)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1508)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:210)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:519)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:655)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:398)
    at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:246)
    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
    at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1820)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:656)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:359)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
    ... <chopped>"
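
As a rough check of which layer the exception is thrown from, the frames in
a trace like the one above can be classified by package prefix. This is only
a sketch: frames, layer and throwing_layer are hypothetical helper names,
and it simply treats the topmost "at ..." frame as the place where the
exception was raised.

```python
import re

# Matches "at <fully.qualified.Class.method>(<source location>)".
FRAME_RE = re.compile(r"\bat\s+([\w.$]+)\(([^)]*)\)")


def frames(trace):
    """Return the fully qualified method name of each frame, top first."""
    # Mail clients often wrap "at" onto its own line, so collapse all
    # whitespace before matching.
    flat = " ".join(trace.split())
    return [match.group(1) for match in FRAME_RE.finditer(flat)]


def layer(frame):
    """Classify a frame as 'lucene', 'solr' or 'other' by package prefix."""
    if frame.startswith("org.apache.lucene."):
        return "lucene"
    if frame.startswith("org.apache.solr."):
        return "solr"
    return "other"


def throwing_layer(trace):
    """Return the layer of the topmost frame, or None if no frame is found."""
    found = frames(trace)
    return layer(found[0]) if found else None
```

Applied to the trace above, the topmost frame is
org.apache.lucene.index.DocumentsWriter.ensureOpen, and the first Solr frame
below it is org.apache.solr.update.DirectUpdateHandler2.addDoc.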

Thanks!

Tim


On 5 February 2014 13:04, Tim Vaillancourt <t...@elementspace.com> wrote:

> Hey guys,
>
> I am troubleshooting an issue on a 4.3.1 SolrCloud: 1 collection and 2
> shards over 4 Solr instances (which results in 1 core per Solr instance).
>
> After some time in production without issues, we are seeing errors related
> to the IndexWriter all over our logs and an infinite loop of failing
> replication from the leader on our 2 replicas.
>
> We see a flood of: "org.apache.lucene.store.AlreadyClosedException: this
> IndexWriter is closed" stacktraces, then the Solr replica tries to
> replicate/recover, then fails replication and then the following 2 errors
> show up:
>
> 1) "SolrIndexWriter was not closed prior to finalize(), indicates a bug --
> POSSIBLE RESOURCE LEAK!!!"
> 2) "Error closing IndexWriter, trying rollback" (which results in a
> null-pointer exception).
>
> I'm guessing the best way forward would be to upgrade to latest, but that
> is an undertaking that will take significant time/testing. In the meantime,
> is there anything I can do to mitigate or understand the issue more?
>
> Does anyone know what the IndexWriter errors refer to?
>
> Below is a URL to a .txt file with summarized portions of my solr.log. Any
> help is really appreciated as always!!
>
> http://timvaillancourt.com.s3.amazonaws.com/tmp/solr.log-summarized.txt
>
> Thanks all,
>
> Tim
>
