With some collections I am having problems under Solr 8.1.1 through 8.3; with other collections it works fine under Solr 8.1.1 through 8.3.

I'm investigating what might be wrong with the collections which have the problems.

Thanks

-----Original Message-----
From: Oakley, Craig (NIH/NLM/NCBI) [C] <craig.oak...@nih.gov.INVALID>
Sent: Tuesday, November 19, 2019 9:53 AM
To: solr-user@lucene.apache.org
Subject: RE: async BACKUP under Solr8.3

FYI, I DO succeed in doing an async backup in Solr 8.1

-----Original Message-----
From: Oakley, Craig (NIH/NLM/NCBI) [C] <craig.oak...@nih.gov.INVALID>
Sent: Tuesday, November 19, 2019 9:03 AM
To: solr-user@lucene.apache.org
Subject: RE: async BACKUP under Solr8.3

This is on a test server: simple case: one node, one shard, one replica

In production we currently use Solr 7.4 and the async BACKUP works fine.

I could test whether I get the same symptoms on Solr 8.1 and/or 8.2

Thanks

-----Original Message-----
From: Mikhail Khludnev <m...@apache.org>
Sent: Tuesday, November 19, 2019 12:40 AM
To: solr-user <solr-user@lucene.apache.org>
Subject: Re: async BACKUP under Solr8.3

Hello, Craig.
There was a significant fix for async BACKUP in 8.1, if I remember correctly. Which version did you use for it before? How many nodes, shards, and replicas does `bug` have?
Unfortunately this stack trace is not really representative; it just says that some node (OK, it's the overseer) timed out while waiting for another one.
Ideally we need logs from the overseer node and the subordinate node during the backup operation.
Thanks.

On Tue, Nov 19, 2019 at 2:13 AM Oakley, Craig (NIH/NLM/NCBI) [C] <craig.oak...@nih.gov.invalid> wrote:

> For Solr 8.3, when I attempt a command of the form
>
> host:port/solr/admin/collections?action=BACKUP&name=snapshot1&collection=col1&location=/tmp&async=bug
>
> And then when I run
> /solr/admin/collections?action=REQUESTSTATUS&requestid=bug I get
> "msg":"found [bug] in failed tasks"
>
> The solr.log file has a stack trace like the following
> 2019-11-18 17:31:31.369 ERROR (OverseerThreadFactory-9-thread-5-processing-n:host:port_solr) [c:col1 ] o.a.s.c.a.c.OverseerCollectionMessageHandler Error from shard: http://host:port/solr => org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://host:port/solr/admin/cores
>         at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:408)
> org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://host:port/solr/admin/cores
>         at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:408) ~[?:?]
>         at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:754) ~[?:?]
>         at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290) ~[?:?]
>         at org.apache.solr.handler.component.HttpShardHandler.request(HttpShardHandler.java:238) ~[?:?]
>         at org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:199) ~[?:?]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_232]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_232]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_232]
>         at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:181) ~[metrics-core-4.0.5.jar:4.0.5]
>         at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210) ~[?:?]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_232]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_232]
>         at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
> Caused by: java.util.concurrent.TimeoutException
>         at org.eclipse.jetty.client.util.InputStreamResponseListener.get(InputStreamResponseListener.java:216) ~[?:?]
>         at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:399) ~[?:?]
>         ... 12 more
>
> If I remove the async=bug, then it works
>
> In fact, the backup looks successful, but REQUESTSTATUS does not recognize it as such
>
> I notice that the 3:30am 11/4/19 Email to solr-user@lucene.apache.org mentions in Solr 8.3.0 Release Highlights "Fix for SPLITSHARD (async) with failures in underlying sub-operations can result in data loss"
>
> Did a fix to SPLITSHARD break BACKUP?
>
> Has anyone been successful running
> solr/admin/collections?action=BACKUP&async=requestname under Solr8.3?
>
> Thanks
>

--
Sincerely yours
Mikhail Khludnev
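
For anyone trying to reproduce this, the submit-then-poll sequence from the original message (BACKUP with async=..., then REQUESTSTATUS with requestid=...) can be scripted. What follows is only a minimal Python sketch and is not taken from the thread. The function names, the `fetch` hook, the polling interval, and the `wt=json` parameter are illustrative assumptions. The state names it checks ("completed", "failed", "notfound") are what REQUESTSTATUS is understood to report under `status.state`; verify them against the Reference Guide for your Solr version.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen


def collections_url(base_url, params):
    """Build a Collections API URL, e.g. .../admin/collections?action=BACKUP&..."""
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


def http_get_json(url):
    """Fetch a URL and decode its JSON body (the URLs below ask for wt=json)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def async_backup(base_url, collection, name, location, request_id,
                 fetch=http_get_json, poll_seconds=2.0, max_polls=30):
    """Submit an async BACKUP, then poll REQUESTSTATUS until a final state.

    Returns (state, msg). `fetch` is a hypothetical hook so the HTTP call
    can be replaced in tests; it must return the decoded JSON response.
    """
    # Step 1: submit the backup with an async request id.
    fetch(collections_url(base_url, {
        "action": "BACKUP",
        "name": name,
        "collection": collection,
        "location": location,
        "async": request_id,
        "wt": "json",
    }))

    # Step 2: poll REQUESTSTATUS for that request id.
    status_url = collections_url(base_url, {
        "action": "REQUESTSTATUS",
        "requestid": request_id,
        "wt": "json",
    })
    state, msg = "unknown", ""
    for _ in range(max_polls):
        status = fetch(status_url).get("status", {})
        state, msg = status.get("state", "unknown"), status.get("msg", "")
        # Assumed final states; anything else is treated as still in progress.
        if state in ("completed", "failed", "notfound"):
            break
        time.sleep(poll_seconds)
    return state, msg
```

A call such as `async_backup("http://host:port/solr", "col1", "snapshot1", "/tmp", "bug")`, with a real host and port substituted, would return `("failed", "found [bug] in failed tasks")` in the situation described above, even though the backup files appear to be written.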