Thanks for the reply Erick,

Hard Commit - 15000ms, openSearcher=false
Soft Commit - 1000ms, openSearcher=true

The 15-second hard commit was something of a guess; I could try a smaller number. When you say "getting too large", what limit do you think it would be hitting: a ulimit (nofiles), disk space, number of changes, or a limit in Solr itself?

By my math there would be 15 tlogs max per core, but I don't really know how it all works, so I'd appreciate it if someone could fill me in or point me somewhere.
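
For a rough sense of scale, here is a back-of-the-envelope sketch in Python. It assumes the 2000-4000 updates/sec rate from the test described earlier in this thread and the commit intervals listed above. The function and variable names are made up for illustration, and nothing here reflects how Solr actually sizes or rotates its tlogs.

```python
# Back-of-the-envelope estimate only; these names are illustrative, not Solr APIs.

def updates_per_interval(updates_per_sec: int, interval_ms: int) -> int:
    """Number of updates that arrive during one commit interval."""
    return updates_per_sec * interval_ms // 1000

HARD_COMMIT_MS = 15_000              # hard commit interval from this message
SOFT_COMMIT_MS = 1_000               # soft commit interval from this message
RATE_LOW, RATE_HIGH = 2_000, 4_000   # updates/sec from the test description

low = updates_per_interval(RATE_LOW, HARD_COMMIT_MS)
high = updates_per_interval(RATE_HIGH, HARD_COMMIT_MS)
soft_per_hard = HARD_COMMIT_MS // SOFT_COMMIT_MS

print(f"updates between hard commits: {low}-{high}")
print(f"soft commits per hard commit: {soft_per_hard}")
```

Lowering the hard commit interval shrinks the first figure proportionally, which is the knob I could experiment with.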

Cheers,

Tim

On 27/07/13 07:57 AM, Erick Erickson wrote:
What is your autocommit limit? Is it possible that your transaction
logs are simply getting too large? tlogs are truncated whenever
you do a hard commit (autocommit), with openSearcher either
true or false; it doesn't matter.

FWIW,
Erick

On Fri, Jul 26, 2013 at 12:56 AM, Tim Vaillancourt<t...@elementspace.com>  
wrote:
Thanks Shawn and Yonik!

Yonik: I noticed this error appears to be fairly trivial, but it is not
appearing after a previous crash. Every time I run this high-volume test
that produced my stack trace, I zero out the logs, Solr data and Zookeeper
data and start over from scratch with a brand new collection and zero'd out
logs.

The test is mostly high volume (2000-4000 updates/sec), and at the start the
SolrCloud runs decently for a good 20-60 minutes or so, with no errors in the
logs at all. Then that stack trace occurs on all 3 nodes (staggered), and I
immediately get some replica down messages and then some "cannot connect"
errors to all other cluster nodes, which have all crashed the same way. The
tlog error could perhaps be a symptom of running out of threads.

Shawn: thanks so much for sharing those details! Yes, they seem to be nice
servers, for sure - I don't get to touch/see them but they're fast! I'll
look into firmwares for sure and will try again after updating them. These
Solr instances are not bare metal and are actually KVM VMs, so that's another
layer to look into, although it is consistent between the two clusters.

I am not currently increasing the 'nofiles' ulimit to above default like you
are, but does Solr use 10,000+ file handles? It won't hurt to try it I guess
:). To rule out Java 7, I'll probably also try Jetty 8 and Java 1.6 as an
experiment as well.

Thanks!

Tim


On 25/07/13 05:55 PM, Yonik Seeley wrote:
On Thu, Jul 25, 2013 at 7:44 PM, Tim Vaillancourt<t...@elementspace.com>
wrote:
"ERROR [2013-07-25 19:34:24.264] [org.apache.solr.common.SolrException]
Failure to open existing log file (non fatal)

That itself isn't necessarily a problem (and why it says "non fatal")
- it just means that most likely a transaction log file was
truncated from a previous crash.  It may be unrelated to the other
issues you are seeing.

-Yonik
http://lucidworks.com
