Could you give a simple stack trace dump as well? It's likely the distributed update deadlock that has been reported a few times now - I think usually with a replication factor greater than 2, but I can't be sure. The deadlock involves sending docs concurrently to replicas and I wouldn't have expected it to be so easily hit with only 2 replicas per shard. I should be able to tell from a stack trace though.
If it is that, it's on my short list to investigate (been there a long time now though - but I still hope to look at it soon).

- Mark

On Jun 17, 2013, at 1:44 PM, Rishi Easwaran <rishi.easwa...@aol.com> wrote:

> Hi All,
>
> I am trying to benchmark SOLR Cloud and it consistently hangs.
> Nothing in the logs, no stack trace, no errors, no warnings, just seems stuck.
>
> A little bit about my setup.
> I have 3 benchmark hosts, each with 96GB RAM, 24 CPUs and 1TB SSD. Each host
> is configured to have 8 SOLR cloud nodes running at 4GB each.
> JVM configs: http://apaste.info/57Ai
>
> My cluster has 12 shards with replication factor 2: http://apaste.info/09sA
>
> I originally started with SOLR 4.2, tomcat 5 and jdk 6, as we are already
> running this configuration in production in Non-Cloud form.
> It got stuck repeatedly.
>
> I decided to upgrade to the latest and greatest of everything, SOLR 4.3, JDK7
> and tomcat7.
> It still shows the same behaviour and hangs through the test.
>
> My test schema and config.
> Schema.xml - http://apaste.info/imah
> SolrConfig.xml - http://apaste.info/ku4F
>
> The test is pretty simple. It's a jmeter test with update command via SOAP rpc
> (round robin request across every node), adding in 5 fields from a csv file -
> id, guid, subject, body, compositeID (guid!id).
> Number of jmeter threads = 150, loop count = 20, num of messages to add per
> guid = 3; total 150*3*20 = 9000 documents.
>
> When cloud gets stuck, I don't get anything in the logs, but when I run
> netstat I see the following.
> Sample netstat on a stuck run: http://apaste.info/hr0O
> hycl-d20 is my jmeter host. ssd-d01/2/3 are my cloud hosts.
>
> At the moment my benchmarking efforts are at a standstill.
>
> Any help from the community would be great. I got some heap dumps and stack
> dumps, but haven't found a smoking gun yet.
> If I can provide anything else to diagnose this issue, just let me know.
>
> Thanks,
>
> Rishi.
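
For reference, here is a rough sketch of the kind of stack trace dump asked for above. For a Solr node running inside Tomcat, the usual route is the JDK's `jstack` tool pointed at the process id. The class below is a hypothetical helper, not part of Solr, and it only dumps the threads of the JVM it runs in, so it would have to be run inside (or adapted to attach to) the node in question. Note also that a hang caused by nodes waiting on each other across the cluster would not necessarily register as a JVM-level deadlock; the per-thread states and stack frames are likely the more useful part.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper (not part of Solr): dumps the threads of the current JVM.
class ThreadDumpSketch {

    // Builds a plain-text dump of every live thread with its state and stack.
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder("Thread dump\n");

        // Request lock details only where this JVM supports reporting them.
        boolean monitors = mx.isObjectMonitorUsageSupported();
        boolean syncs = mx.isSynchronizerUsageSupported();

        for (ThreadInfo info : mx.dumpAllThreads(monitors, syncs)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState());
            if (info.getLockName() != null) {
                sb.append(" waiting on ").append(info.getLockName());
            }
            sb.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }

        // Reports lock cycles inside this JVM only, not waits between nodes.
        long[] deadlocked = syncs
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        sb.append("JVM-level deadlocked threads: ")
          .append(deadlocked == null ? 0 : deadlocked.length).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

When reading a dump from a stuck node, the threads to look at first are the ones sitting in a blocked or waiting state while handling update requests.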