[ https://issues.apache.org/jira/browse/SOLR-11392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16185583#comment-16185583 ]
Alan Woodward commented on SOLR-11392:
--------------------------------------

It looks like there are two issues here:

1) A previous test (in the case I'm looking at, testExecutorStream) creates the mainCorpus collection, but for some reason it is created with replicas named _n1 and _n3:

{code}
[junit4] 2> 151484 INFO (qtp1541434343-1456) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/collections params={async=147c5276-5e91-4a73-912a-0025669a97ec&replicationFactor=1&collection.configName=conf&name=mainCorpus&nrtReplicas=1&action=CREATE&numShards=2&wt=javabin&version=2} status=0 QTime=1
[junit4] 2> 151485 INFO (qtp1541434343-1455) [ ] o.a.s.h.a.CollectionsHandler Invoked Collection Action :requeststatus with params requestid=147c5276-5e91-4a73-912a-0025669a97ec&action=REQUESTSTATUS&wt=javabin&version=2 and sendToOCPQueue=true
[junit4] 2> 151486 INFO (qtp1541434343-1455) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/collections params={requestid=147c5276-5e91-4a73-912a-0025669a97ec&action=REQUESTSTATUS&wt=javabin&version=2} status=0 QTime=1
[junit4] 2> 151488 INFO (OverseerThreadFactory-647-thread-5) [ ] o.a.s.c.CreateCollectionCmd Create collection mainCorpus
[junit4] 2> 151488 INFO (OverseerCollectionConfigSetProcessor-98740321293107209-127.0.0.1:65381_solr-n_0000000000) [ ] o.a.s.c.OverseerTaskQueue Response ZK path: /overseer/collection-queue-work/qnr-0000000038 doesn't exist. Requestor may have disconnected from ZooKeeper
[junit4] 2> 151610 INFO (OverseerStateUpdate-98740321293107209-127.0.0.1:65381_solr-n_0000000000) [ ] o.a.s.c.o.SliceMutator createReplica() {
[junit4] 2> "operation":"ADDREPLICA",
[junit4] 2> "collection":"mainCorpus",
[junit4] 2> "shard":"shard1",
[junit4] 2> "core":"mainCorpus_shard1_replica_n1",
[junit4] 2> "state":"down",
[junit4] 2> "base_url":"http://127.0.0.1:65381/solr",
[junit4] 2> "type":"NRT"}
[junit4] 2> 151616 INFO (OverseerStateUpdate-98740321293107209-127.0.0.1:65381_solr-n_0000000000) [ ] o.a.s.c.o.SliceMutator createReplica() {
[junit4] 2> "operation":"ADDREPLICA",
[junit4] 2> "collection":"mainCorpus",
[junit4] 2> "shard":"shard2",
[junit4] 2> "core":"mainCorpus_shard2_replica_n3",
[junit4] 2> "state":"down",
[junit4] 2> "base_url":"http://127.0.0.1:65394/solr",
[junit4] 2> "type":"NRT"}
{code}

This is a bit weird, but it works fine. At the end of the test, the collection is deleted. Then testParallelExecutorStream starts up, and it too creates a 'mainCorpus' collection, only this time with replicas named _n1 and _n2, as you'd expect.

The bug then comes when the cluster's existing CloudSolrClient is used to send updates to the newly recreated collection. The cluster state provider still has state cached from the previous test, so it thinks that the relevant replicas to send data to are _n1 and _n3. But when it gets a 404 back from the (no longer existing) _n3 replica, it doesn't invalidate its cache and try again; it just fails.

This looks like a genuine bug in CloudSolrClient. [~noble.paul] I think you're best placed to know how to fix this?

A workaround for the test is to use different collection names for the different tests.
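The invalidate-and-retry behaviour that seems to be missing could look roughly like the following. This is only a minimal, self-contained sketch and not SolrJ code: StaleCacheRetrySketch, ReplicaNotFoundException, Sender and the stateFetcher function are all hypothetical names made up for illustration, and an actual fix in CloudSolrClient may well look quite different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of invalidate-and-retry; none of these types are SolrJ APIs. */
final class StaleCacheRetrySketch {

  /** Stands in for the 404 returned when a core no longer exists. */
  static final class ReplicaNotFoundException extends RuntimeException {
    ReplicaNotFoundException(String replica) {
      super("Can not find: " + replica);
    }
  }

  /** Stands in for the HTTP call to one replica; throws if the replica is gone. */
  interface Sender {
    void send(String replica, String doc);
  }

  // Client-side cache: collection name -> replica core names.
  private final Map<String, List<String>> cachedReplicas = new HashMap<>();
  // Assumed to do a fresh read of cluster state for an existing collection.
  private final Function<String, List<String>> stateFetcher;
  private final Sender sender;

  StaleCacheRetrySketch(Function<String, List<String>> stateFetcher, Sender sender) {
    this.stateFetcher = stateFetcher;
    this.sender = sender;
  }

  /** Sends a document; on a missing replica, drops the cached state and retries once. */
  void update(String collection, String doc) {
    try {
      sendToReplicas(collection, doc);
    } catch (ReplicaNotFoundException e) {
      cachedReplicas.remove(collection); // invalidate the stale entry
      sendToReplicas(collection, doc);   // retry once with freshly fetched state
    }
  }

  private void sendToReplicas(String collection, String doc) {
    List<String> replicas = cachedReplicas.computeIfAbsent(collection, stateFetcher);
    for (String replica : replicas) {
      sender.send(replica, doc);
    }
  }

  public static void main(String[] args) {
    // "live" plays the role of the real cluster state.
    Map<String, List<String>> live = new HashMap<>();
    live.put("mainCorpus",
        Arrays.asList("mainCorpus_shard1_replica_n1", "mainCorpus_shard2_replica_n3"));
    List<String> delivered = new ArrayList<>();

    StaleCacheRetrySketch client = new StaleCacheRetrySketch(live::get, (replica, doc) -> {
      if (!live.get("mainCorpus").contains(replica)) {
        throw new ReplicaNotFoundException(replica);
      }
      delivered.add(replica);
    });

    client.update("mainCorpus", "doc-1"); // first test: warms the cache with _n1 and _n3

    // Collection deleted and recreated with different replica names.
    live.put("mainCorpus",
        Arrays.asList("mainCorpus_shard1_replica_n1", "mainCorpus_shard2_replica_n2"));
    delivered.clear();

    client.update("mainCorpus", "doc-2"); // stale _n3 fails, cache is refreshed, retry succeeds
    System.out.println(delivered);
  }
}
```

Note that in this sketch the replica that was still reachable on the first attempt receives the document a second time on the retry, so the approach only makes sense when updates are idempotent (for example, adds keyed by a unique id).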

> StreamExpressionTest.testParallelExecutorStream fails too frequently
> --------------------------------------------------------------------
>
>                 Key: SOLR-11392
>                 URL: https://issues.apache.org/jira/browse/SOLR-11392
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Joel Bernstein
>
> I've never been able to reproduce the failure but jenkins fails frequently
> with the following error:
> {code}
> Stack Trace:
> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from server at http://127.0.0.1:38180/solr/workQueue_shard2_replica_n3: Expected mime type application/octet-stream but got text/html. <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
> <title>Error 404 </title>
> </head>
> <body>
> <h2>HTTP ERROR: 404</h2>
> <p>Problem accessing /solr/workQueue_shard2_replica_n3/update. Reason:
> <pre> Can not find: /solr/workQueue_shard2_replica_n3/update</pre></p>
> <hr /><a href="http://eclipse.org/jetty">Powered by Jetty:// 9.3.20.v20170531</a><hr/>
> </body>
> </html>
> {code}
> What appears to be happening is that the test framework is having trouble
> setting up the collection.
> Here is the test code:
> {code}
> @Test
> public void testParallelExecutorStream() throws Exception {
>   CollectionAdminRequest.createCollection("workQueue", "conf", 2, 1).process(cluster.getSolrClient());
>   AbstractDistribZkTestBase.waitForRecoveriesToFinish("workQueue", cluster.getSolrClient().getZkStateReader(),
>       false, true, TIMEOUT);
>   CollectionAdminRequest.createCollection("mainCorpus", "conf", 2, 1).process(cluster.getSolrClient());
>   AbstractDistribZkTestBase.waitForRecoveriesToFinish("mainCorpus", cluster.getSolrClient().getZkStateReader(),
>       false, true, TIMEOUT);
>   CollectionAdminRequest.createCollection("destination", "conf", 2, 1).process(cluster.getSolrClient());
>   AbstractDistribZkTestBase.waitForRecoveriesToFinish("destination", cluster.getSolrClient().getZkStateReader(),
>       false, true, TIMEOUT);
>   UpdateRequest workRequest = new UpdateRequest();
>   UpdateRequest dataRequest = new UpdateRequest();
>   for (int i = 0; i < 500; i++) {
>     workRequest.add(id, String.valueOf(i), "expr_s", "update(destination, batchSize=50, search(mainCorpus, q=id:"+i+", rows=1, sort=\"id asc\", fl=\"id, body_t, field_i\"))");
>     dataRequest.add(id, String.valueOf(i), "body_t", "hello world "+i, "field_i", Integer.toString(i));
>   }
>   workRequest.commit(cluster.getSolrClient(), "workQueue");
>   dataRequest.commit(cluster.getSolrClient(), "mainCorpus");
> {code}