Hi there,
I have been load testing Solr 4 beta (and now trunk). My configuration
is fairly simple: two servers, replicating via SolrCloud. SolrCloud is
configured as recommended in the wiki:

<updateRequestProcessorChain name="standard">
       <processor class="solr.LogUpdateProcessorFactory" />
       <processor class="solr.DistributedUpdateProcessorFactory" />
       <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

Twice now I've seen sudden thread and file-descriptor spikes along with
a complete deadlock, simultaneously on both machines. My max FD limit is
set to 1024, and (except during the spikes) I never see usage above 375 FDs.
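
For anyone who wants to watch for the same kind of spike, here is a rough sketch of how FD usage could be sampled against the soft limit. It is not how the numbers above were collected. It assumes a Linux host with /proc mounted and a user allowed to read the target process's /proc entries; the function names are made up for illustration.

```python
import os

def count_open_fds(pid="self"):
    """Return the number of open file descriptors for a process (Linux /proc only).

    For pid="self" the count includes the descriptor briefly opened to list
    the directory itself.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))

def parse_soft_nofile(limits_text):
    """Extract the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns None if the limit is 'unlimited' or the line is missing.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            soft = line.split()[3]  # fields: Max, open, files, <soft>, <hard>, files
            return None if soft == "unlimited" else int(soft)
    return None

def fd_usage(pid="self"):
    """Return (open_fds, soft_limit) for the given pid."""
    with open(f"/proc/{pid}/limits") as fh:
        limit = parse_soft_nofile(fh.read())
    return count_open_fds(pid), limit
```

Calling fd_usage(<solr pid>) on a timer and logging the pair would show how quickly usage climbs from the normal level toward the limit.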

The first FD spike was with an older trunk revision. It coincided with
a corrupt transaction log. I've lost the logs, unfortunately, but Solr
tried to re-process the same log over and over, leaking FDs until it died.

The upgraded version has not reported the corrupt transaction log issue
prior to deadlock. However, according to the log files, the deadlock
persists for about 5 minutes before FD exhaustion. The last log line
is simply "INFO: end_commit_flush".

Upon restart, I see a frightening number of corrupt transaction log
exceptions and "New transaction log already exists" exceptions.

Any thoughts?
Contact me for the thread dump; it's 1 MiB.

Thanks,
--Casey C.
