Update!!

This happens with replicationFactor=1
Just for kicks I created a collection with a 24 shards, replicationfactor=1 
cluster on my exisiting benchmark env.
Same behaviour, SOLR cloud just hangs. Nothing in the logs, top/heap/cpu most 
metrics looks fine.
Only indication seems to be netstat showing incoming request not being read in.
 
Yago,

I saw your previous post 
(http://lucene.472066.n3.nabble.com/updating-docs-in-solr-cloud-hangs-td4067388.html#a4067631)
Following it, Last week, I upgraded to SOLR 4.3, to see if the issue gets 
fixed, but no luck.
Looks like this is a dominant and easily reproducible issue on SOLR cloud.


Thanks,

Rishi. 





 

 

 

-----Original Message-----
From: Yago Riveiro <yago.rive...@gmail.com>
To: solr-user <solr-user@lucene.apache.org>
Sent: Mon, Jun 17, 2013 5:15 pm
Subject: Re: Solr Cloud Hangs consistently .


I can confirm that the deadlock happen with only 2 replicas by shard. I need 
shutdown one node that host a replica of the shard to recover the indexation 
capability.

-- 
Yago Riveiro
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Monday, June 17, 2013 at 6:44 PM, Rishi Easwaran wrote:

> 
> 
> Hi All,
> 
> I am trying to benchmark SOLR Cloud and it consistently hangs. 
> Nothing in the logs, no stack trace, no errors, no warnings, just seems stuck.
> 
> A little bit about my set up. 
> I have 3 benchmark hosts, each with 96GB RAM, 24 CPU's and 1TB SSD. Each host 
is configured to have 8 SOLR cloud nodes running at 4GB each.
> JVM configs: http://apaste.info/57Ai
> 
> My cluster has 12 shards with replication factor 2- http://apaste.info/09sA
> 
> I originally stated with SOLR 4.2., tomcat 5 and jdk 6, as we are already 
running this configuration in production in Non-Cloud form. 
> It got stuck repeatedly.
> 
> I decided to upgrade to the latest and greatest of everything, SOLR 4.3, JDK7 
and tomcat7. 
> It still shows same behaviour and hangs through the test.
> 
> My test schema and config.
> Schema.xml - http://apaste.info/imah
> SolrConfig.xml - http://apaste.info/ku4F
> 
> The test is pretty simple. its a jmeter test with update command via SOAP rpc 
(round robin request across every node), adding in 5 fields from a csv file - 
id, guid, subject, body, compositeID (guid!id).
> number of jmeter threads = 150. loop count = 20, num of messages to add/per 
guid = 3; total 150*3*20 = 9000 documents. 
> 
> When cloud gets stuck, i don't get anything in the logs, but when i run 
netstat i see the following.
> Sample netstat on a stuck run. http://apaste.info/hr0O 
> hycl-d20 is my jmeter host. ssd-d01/2/3 are my cloud hosts.
> 
> At the moment my benchmarking efforts are at a stand still.
> 
> Any help from the community would be great, I got some heap dumps and stack 
dumps, but haven't found a smoking gun yet.
> If I can provide anything else to diagnose this issue. just let me know.
> 
> Thanks,
> 
> Rishi. 


 

Reply via email to