[ https://issues.apache.org/jira/browse/SOLR-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16302905#comment-16302905 ]
Gus Heck commented on SOLR-9129:
--------------------------------

Interestingly, I'm seeing this exception while working on a command for creating routed aliases. In my case only one collection is being created (the initial collection for the routing set) in advance of creating the associated alias. I am already using a separate local ZooKeeper on port 2181.

> Solr Cloud hangs when creating large number of collections and node fails to recover after restart
> --------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-9129
>                 URL: https://issues.apache.org/jira/browse/SOLR-9129
>             Project: Solr
>          Issue Type: Bug
>          Components: Server
>    Affects Versions: 6.0
>         Environment: OS: GNU Linux, kernel 4.4.0-22 on x86_64 (Ubuntu Linux 16.04 LTS (64-bit))
> RAM: 16 GB
> CPU: Intel Core i7-4720HQ CPU @ 2.60GHz × 8
> Java version: Oracle JDK 1.8.0_92 (x64) build 1.8.0_92-b14 Java HotSpot(TM) 64-Bit Server VM (build 25.92-b14, mixed mode)
>            Reporter: Peter Horvath
>         Attachments: exception1.txt, exception2.txt, exception3.txt, exception4.txt, solr_node_hung_after_restart.txt, solr_node_hung_before_restart.txt, solr_visualvm.png, solr_visualvm2.png
>
>
> I attempted to benchmark SolrCloud to see how well it would work with a sample data set of ours.
> I wanted to create about 2500 empty collections first to see how that would scale.
> Unfortunately, the test was not successful. Solr started failing after creating around 2000 collections, and the cluster failed to recover after a complete restart, which is quite concerning to me.
> I based my environment on the cloud example (I use the same config set as the gettingstarted example collection, etc.), so I have the vanilla install and used the following commands to bring the nodes online:
> .../solr/6.0.0/bin/solr start -m 2g -cloud -p 8983 -s ".../solr/6.0.0/example/cloud/node1/solr"
> .../solr/6.0.0/bin/solr start -m 2g -cloud -p 7574 -s ".../solr/6.0.0/example/cloud/node2/solr" -z localhost:9983
> .../solr/6.0.0/bin/solr start -m 2g -cloud -p 8984 -s ".../solr/6.0.0/example/cloud/node3/solr" -z localhost:9983
> .../solr/6.0.0/bin/solr start -m 2g -cloud -p 7575 -s ".../solr/6.0.0/example/cloud/node4/solr" -z localhost:9983
> After about 2000 collections were created, Solr hung and REST requests started failing. I found the following entry in the logs, which I could relate to the failed REST request. For further logs, please see the attachments to this issue.
> null:org.apache.solr.common.SolrException: Could not fully create collection: FOOBAR
>       at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:266)
>       at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:197)
>       at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)
>       at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:658)
>       at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:441)
>       at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:229)
>       at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:184)
>       at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
>       at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
>       at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>       at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>       at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>       at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
>       at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
>       at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>       at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
>       at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>       at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>       at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>       at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>       at org.eclipse.jetty.server.Server.handle(Server.java:518)
>       at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
>       at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
>       at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>       at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>       at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>       at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
>       at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
>       at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
>       at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
>       at java.lang.Thread.run(Thread.java:745)
> After the affected Solr instance failed to recover, I decided to restart the whole cluster (using the official solr stop-start commands).
> Unfortunately, after this, at least one node remained spinning in ZooKeeper logic, creating more than four thousand (!!) threads.
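
For anyone trying to reproduce the report, the collection-creation loop it describes might look roughly like the sketch below. This is not the reporter's actual script, which is not included in the issue. The `bench_` name prefix, the shard and replica counts, the timeout, and the use of `gettingstarted` as the config set name are all assumptions; the request itself is the Collections API `CREATE` action with its standard `name`, `numShards`, `replicationFactor` and `collection.configName` parameters.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_create_url(base_url, name, config_name="gettingstarted",
                     num_shards=1, replication_factor=1):
    """Return a Collections API CREATE URL for one collection.

    The default config set name and the shard/replica counts are
    assumptions; the report does not state them.
    """
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "collection.configName": config_name,
        "wt": "json",
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


def create_collections(base_url, count, prefix="bench_", timeout=180):
    """Create `count` empty collections one at a time.

    Stops at the first failed request and returns how many collections
    were created before it. The name prefix is hypothetical.
    """
    for i in range(count):
        name = f"{prefix}{i:04d}"
        try:
            with urlopen(build_create_url(base_url, name), timeout=timeout) as resp:
                resp.read()  # drain the response body
        except OSError as exc:  # URLError, HTTPError and timeouts all derive from OSError
            print(f"creation failed at collection {i} ({name}): {exc}")
            return i
    return count
```

Calling `create_collections("http://localhost:8983/solr", 2500)` against the first node from the start commands above would issue the requests sequentially and report the index at which creation first fails.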