I do all my indexing through HTTP POST. With replicationFactor=1 I have no
problems; if it is higher, deadlocks can appear.
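
For context, indexing over HTTP POST can be sketched roughly as below. This is only an illustration, not the actual client used here: the base URL, the collection name `collection1`, the sample document and the helper name `build_update_request` are all assumptions, and it assumes the update handler at `/update` accepts a JSON array of documents.

```python
import json
import urllib.request


def build_update_request(base_url, collection, docs):
    """Build (but do not send) an HTTP POST that adds docs to a collection.

    base_url and collection are hypothetical placeholders; docs is a list
    of dicts, one per document.
    """
    url = f"{base_url.rstrip('/')}/{collection}/update"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical document; field names follow the test described in this thread.
request = build_update_request(
    "http://localhost:8983/solr",
    "collection1",
    [{"id": "1", "guid": "g1", "subject": "hello", "body": "test"}],
)
```

Sending it would be `urllib.request.urlopen(request, timeout=30)`; a client-side timeout at least turns a hung node into a visible error instead of a blocked indexing thread.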

The stack trace I get looks like this one:
http://lucene.472066.n3.nabble.com/updating-docs-in-solr-cloud-hangs-td4067388.html#a4067862

-- 
Yago Riveiro


On Monday, June 17, 2013 at 11:03 PM, Mark Miller wrote:

> If it actually happens with replicationFactor=1, it likely doesn't have 
> anything to do with the update handler issue I'm referring to. In some cases 
> like these, people have better luck with Jetty than Tomcat - we test it much 
> more. For instance, it's set up to help avoid search-side distributed 
> deadlocks.
> 
> In any case, there is something special about it - I have done, and have 
> seen others do, a lot of heavy indexing to SolrCloud without running into 
> this, both with replicationFactor=1 and greater. So there is something 
> specific in how the load is being done or what features/methods are being 
> used that likely causes it or makes it easier to cause.
> 
> But again, the issue I know about involves threads that are not even created 
> in the replicationFactor = 1 case, so that could be a first report afaik.
> 
> - Mark
> 
> On Jun 17, 2013, at 5:52 PM, Rishi Easwaran <rishi.easwa...@aol.com> wrote:
> 
> > Update!!
> > 
> > This happens with replicationFactor=1
> > Just for kicks I created a collection with 24 shards and 
> > replicationFactor=1 on my existing benchmark env.
> > Same behaviour, SOLR cloud just hangs. Nothing in the logs; top/heap/cpu 
> > and most metrics look fine.
> > The only indication seems to be netstat showing incoming requests not 
> > being read in.
> > 
> > Yago,
> > 
> > I saw your previous post 
> > (http://lucene.472066.n3.nabble.com/updating-docs-in-solr-cloud-hangs-td4067388.html#a4067631)
> > Following it, last week I upgraded to SOLR 4.3 to see if the issue would 
> > be fixed, but no luck.
> > Looks like this is a dominant and easily reproducible issue on SOLR cloud.
> > 
> > 
> > Thanks,
> > 
> > Rishi. 
> > 
> > -----Original Message-----
> > From: Yago Riveiro <yago.rive...@gmail.com>
> > To: solr-user <solr-user@lucene.apache.org>
> > Sent: Mon, Jun 17, 2013 5:15 pm
> > Subject: Re: Solr Cloud Hangs consistently .
> > 
> > 
> > I can confirm that the deadlock happens with only 2 replicas per shard. I 
> > need to shut down one node that hosts a replica of the shard to recover 
> > the indexing capability.
> > 
> > -- 
> > Yago Riveiro
> > 
> > 
> > On Monday, June 17, 2013 at 6:44 PM, Rishi Easwaran wrote:
> > 
> > > 
> > > 
> > > Hi All,
> > > 
> > > I am trying to benchmark SOLR Cloud and it consistently hangs. 
> > > Nothing in the logs, no stack trace, no errors, no warnings, just seems 
> > > stuck.
> > > 
> > > A little bit about my set up. 
> > > I have 3 benchmark hosts, each with 96GB RAM, 24 CPUs and a 1TB SSD. Each 
> > > host is configured to have 8 SOLR cloud nodes running at 4GB each.
> > > JVM configs: http://apaste.info/57Ai
> > > 
> > > My cluster has 12 shards with replication factor 2: 
> > > http://apaste.info/09sA
> > > 
> > > I originally started with SOLR 4.2, Tomcat 5 and JDK 6, as we are already 
> > > running this configuration in production in non-cloud form. 
> > > It got stuck repeatedly.
> > > 
> > > I decided to upgrade to the latest and greatest of everything: SOLR 4.3, 
> > > JDK 7 and Tomcat 7. 
> > > It still shows the same behaviour and hangs during the test.
> > > 
> > > My test schema and config.
> > > Schema.xml - http://apaste.info/imah
> > > SolrConfig.xml - http://apaste.info/ku4F
> > > 
> > > The test is pretty simple. It's a jmeter test with the update command via 
> > > SOAP rpc (round-robin requests across every node), adding 5 fields from a 
> > > csv file: id, guid, subject, body, compositeID (guid!id).
> > > Number of jmeter threads = 150, loop count = 20, number of messages to 
> > > add per guid = 3; total 150*3*20 = 9000 documents. 
> > > 
> > > When the cloud gets stuck, I don't get anything in the logs, but when I 
> > > run netstat I see the following.
> > > Sample netstat on a stuck run: http://apaste.info/hr0O 
> > > hycl-d20 is my jmeter host. ssd-d01/2/3 are my cloud hosts.
> > > 
> > > At the moment my benchmarking efforts are at a standstill.
> > > 
> > > Any help from the community would be great. I got some heap dumps and 
> > > stack dumps, but haven't found a smoking gun yet.
> > > If I can provide anything else to diagnose this issue, just let me know.
> > > 
> > > Thanks,
> > > 
> > > Rishi. 
