[ 
https://issues.apache.org/jira/browse/SOLR-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16302905#comment-16302905
 ] 

Gus Heck commented on SOLR-9129:
--------------------------------

Interestingly, I'm seeing this exception while working on a command for creating 
routed aliases. In my case, only one collection is being created (the initial 
collection for the routing set) in advance of creating the associated alias. I 
am already using a separate local ZooKeeper on port 2181.

> Solr Cloud hangs when creating large number of collections and node fails to 
> recover after restart
> --------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-9129
>                 URL: https://issues.apache.org/jira/browse/SOLR-9129
>             Project: Solr
>          Issue Type: Bug
>          Components: Server
>    Affects Versions: 6.0
>         Environment: OS: GNU Linux, kernel 4.4.0-22 on x86_64 (Ubuntu Linux 
> 16.04 LTS (64-bit))
> RAM: 16 GB
> CPU: Intel Core i7-4720HQ CPU @ 2.60GHz × 8
> Java version: Oracle JDK 1.8.0_92 (x64) build 1.8.0_92-b14 Java HotSpot(TM) 
> 64-Bit Server VM (build 25.92-b14, mixed mode)
>            Reporter: Peter Horvath
>         Attachments: exception1.txt, exception2.txt, exception3.txt, 
> exception4.txt, solr_node_hung_after_restart.txt, 
> solr_node_hung_before_restart.txt, solr_visualvm.png, solr_visualvm2.png
>
>
> I attempted to benchmark SolrCloud to see how well it would work with a 
> sample data set of ours. 
> I wanted to create about 2500 empty collections first to see how that would 
> scale.
> Unfortunately, the test was not successful. Solr started failing after 
> creating around 2000 collections, and the cluster failed to recover after 
> a complete restart, which is quite concerning to me. 
> I based my environment on the cloud example (I use the same config set as the 
> gettingstarted example collection, etc.), so I have the vanilla install and 
> used the following commands to bring the nodes online:
> .../solr/6.0.0/bin/solr start -m 2g -cloud -p 8983 -s ".../solr/6.0.0/example/cloud/node1/solr"
> .../solr/6.0.0/bin/solr start -m 2g -cloud -p 7574 -s ".../solr/6.0.0/example/cloud/node2/solr" -z localhost:9983
> .../solr/6.0.0/bin/solr start -m 2g -cloud -p 8984 -s ".../solr/6.0.0/example/cloud/node3/solr" -z localhost:9983
> .../solr/6.0.0/bin/solr start -m 2g -cloud -p 7575 -s ".../solr/6.0.0/example/cloud/node4/solr" -z localhost:9983
> After about 2000 collections were created, Solr hung and REST requests 
> started failing. I found the following entry in the logs, which I could 
> relate to the failed REST request. For further logs, please see the 
> attachments of this issue. 
> null:org.apache.solr.common.SolrException: Could not fully create collection: FOOBAR
>       at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:266)
>       at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:197)
>       at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)
>       at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:658)
>       at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:441)
>       at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:229)
>       at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:184)
>       at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
>       at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
>       at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>       at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>       at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>       at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
>       at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
>       at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>       at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
>       at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>       at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>       at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>       at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>       at org.eclipse.jetty.server.Server.handle(Server.java:518)
>       at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
>       at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
>       at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>       at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>       at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>       at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
>       at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
>       at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
>       at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
>       at java.lang.Thread.run(Thread.java:745)
> After the affected Solr instance failed to recover, I decided to restart 
> the whole cluster (using the official solr stop and start commands). 
> Unfortunately, after this, at least one node remained spinning in ZooKeeper 
> logic, creating more than four thousand (!!) threads.  
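
For anyone trying to reproduce the scenario quoted above, here is a minimal
sketch of how the collection-creation requests could be generated. The report
does not say how the collections were actually created beyond mentioning REST
requests, so everything here is an assumption: the helper name
build_create_urls, the "bench_" name prefix, one shard and one replica per
collection, and the use of the "gettingstarted" config set by name. It only
builds Collections API CREATE URLs and sends nothing.

```python
from urllib.parse import urlencode


def build_create_urls(count,
                      base_url="http://localhost:8983/solr",
                      prefix="bench_",                 # hypothetical naming scheme
                      config_name="gettingstarted",    # assumed config set name
                      num_shards=1,                    # assumed, not from the report
                      replication_factor=1):           # assumed, not from the report
    """Return one Collections API CREATE URL per collection to be created."""
    urls = []
    for i in range(count):
        params = {
            "action": "CREATE",
            "name": f"{prefix}{i:04d}",
            "numShards": num_shards,
            "replicationFactor": replication_factor,
            "collection.configName": config_name,
            "wt": "json",
        }
        urls.append(f"{base_url}/admin/collections?{urlencode(params)}")
    return urls


if __name__ == "__main__":
    # Print the first few requests instead of sending them.
    for url in build_create_urls(3):
        print(url)
```

Each URL could then be issued in sequence with any HTTP client, stopping at
the first failed response to see roughly where the cluster stops accepting
new collections.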



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

