[
https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726831#comment-13726831
]
Erick Erickson commented on SOLR-5081:
--------------------------------------
Yeah, that is odd. The stack traces you sent showed no deadlocks, nothing
interesting at all. I suspect it's worth checking whether anything is getting
through to Solr at all....
Hmmmm, here's a blunt-instrument test for when the cluster is hung: what
happens if you submit a query directly to one of the nodes? Does it respond,
and do you see anything in the Solr log on that node? Tip: adding
&distrib=false to the _query_ keeps that node from sending sub-queries to
other shards.
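For what it's worth, here is a minimal sketch of that direct-node probe in Python. The host name, port 8983, and the "collection1" core name are assumptions taken from the stock example setup, not from this issue; adjust them for your cluster. The short timeout is the point: a hung node should show up as a timeout rather than blocking the probe forever.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def direct_query_url(host, port=8983, core="collection1", q="*:*"):
    """Build a select URL that only hits the node it is sent to.

    port and core are assumed defaults from the Solr example setup.
    """
    # distrib=false stops this node from fanning the query out to other shards.
    params = urlencode({"q": q, "rows": 0, "distrib": "false", "wt": "json"})
    return f"http://{host}:{port}/solr/{core}/select?{params}"


def probe_node(host, timeout=10.0, **kwargs):
    """Return (status, first bytes of the body), or (None, error text)."""
    url = direct_query_url(host, **kwargs)
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read(200)
    except OSError as exc:  # covers URLError, HTTPError and socket timeouts
        return None, repr(exc)
```

Running probe_node against each node in turn while the cluster is hung, and comparing the results with what shows up in each node's Solr log, should show whether requests are reaching Solr at all.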
And I wonder what happens if you use post.jar (it comes with the example) to
send a doc to Solr while it's hung. Anything?
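If post.jar isn't handy on the box, the same single-document test can be roughly sketched in Python. This is a stand-in for post.jar, not how post.jar itself works internally. It assumes the stock example schema (an "id" field), the XML update handler at /update, port 8983, and a core named "collection1"; the doc id "hang-probe-1" is made up for illustration.

```python
from urllib.request import Request, urlopen
from xml.sax.saxutils import escape


def build_add_xml(doc_id):
    """Build a one-document XML <add> message (assumes an 'id' field exists)."""
    body = f'<add><doc><field name="id">{escape(doc_id)}</field></doc></add>'
    return body.encode("utf-8")


def post_test_doc(host, doc_id="hang-probe-1", port=8983,
                  core="collection1", timeout=10.0):
    """POST one doc to a single node; return the HTTP status or None on failure."""
    # Host, port and core are assumptions; the timeout makes a hang visible.
    url = f"http://{host}:{port}/solr/{core}/update?commit=true"
    req = Request(url, data=build_add_xml(doc_id),
                  headers={"Content-Type": "text/xml; charset=utf-8"})
    try:
        with urlopen(req, timeout=timeout) as resp:
            return resp.status
    except OSError:  # connection refused, HTTP error, or timeout
        return None
```

A None result from a node that answers queries fine would at least narrow things down to the update path.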
Clearly I'm grasping at straws here, but I'm kind of out of good ideas.
> Highly parallel document insertion hangs SolrCloud
> --------------------------------------------------
>
> Key: SOLR-5081
> URL: https://issues.apache.org/jira/browse/SOLR-5081
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
> Affects Versions: 4.3.1
> Reporter: Mike Schrag
> Attachments: threads.txt
>
>
> If I do a highly parallel document load using a Hadoop cluster into an
> 18-node SolrCloud cluster, I can deadlock Solr every time.
> The ulimits on the nodes are:
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 1031181
> max locked memory       (kbytes, -l) unlimited
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 32768
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 10240
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 515590
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
> The open file count is only around 4000 when this happens.
> If I bounce all the servers, things start working again, which makes me think
> this is Solr and not ZK.
> I'll attach the stack trace from one of the servers.