[ https://issues.apache.org/jira/browse/SOLR-15045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Raj Yadav updated SOLR-15045:
-----------------------------
    Description:
Hi All,

When we issue a commit through a curl command, not all the shards receive the `start commit` request at the same time.

*Solr Setup Detail: (Running in SolrCloud mode)*

It has 6 shards, and each shard has only one replica (which is also the leader); the replica type is NRT.
Each shard is hosted on a separate physical host.

Zookeeper => We are using an external ZooKeeper ensemble (a separate 3-node cluster)

*Shard and Host name*
shard1_0=>solr_199
shard1_1=>solr_200
shard2_0=>solr_254
shard2_1=>solr_132
shard3_0=>solr_133
shard3_1=>solr_198

*The request rate on the system is currently zero; only hourly indexing is running on it.*

We are using a curl command to issue the commit.
{code:java}
curl "http://solr_254:8389/solr/my_collection/update?openSearcher=true&commit=true&wt=json"{code}
(Using the solr_254 host to issue the commit)

With the above command, all the shards start processing the commit (i.e. they receive the `start commit` request) except the one used in the curl command (i.e. shard2_0, which is hosted on solr_254). Individually, each shard takes around 10 to 12 minutes to process a hard commit (most of this time is spent reloading external files).

As per the logs, shard2_0 receives the `start commit` request after approximately 10 minutes. This leads to the following timeout error.
{code:java}
2020-12-06 18:47:47.013 ERROR org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://solr_132:9744/solr/my_collection_shard2_1_replica_n21/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Fsolr_254%3A9744%2Fsolr%2Fmy_collection_shard2_0_replica_n11%2F
    at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:407)
    at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:753)
    at org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient.request(ConcurrentUpdateHttp2SolrClient.java:369)
    at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290)
    at org.apache.solr.update.SolrCmdDistributor.doRequest(SolrCmdDistributor.java:344)
    at org.apache.solr.update.SolrCmdDistributor.lambda$submit$0(SolrCmdDistributor.java:333)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.TimeoutException
    at org.eclipse.jetty.client.util.InputStreamResponseListener.get(InputStreamResponseListener.java:216)
    at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:398)
    ... 13 more{code}
The above timeout error is between solr_254 and solr_132. Similar errors occur between solr_254 and the other 4 shards.

Since the query load is zero, CPU utilization is mostly around 3%.
After issuing the curl commit command, CPU goes up to 14% on all shards except shard2_0 (host: solr_254, the one used in the curl command).
After 10 minutes (i.e. after it receives the `start commit` request), CPU on shard2_0 also goes up to 14%.

As mentioned earlier, each shard takes around 10-12 minutes to process a commit, and because of the delay in starting the commit on one shard (shard2_0), our overall commit time has now doubled (22-24 minutes approx.).

*We are observing this delay in both hard and soft commits.*

In our solr-5.4.0 deployment (which has a similar setup), we use a similar curl command to issue the commit, and there all the shards receive the `start commit` request at the same time, including the one used in the curl command.

> Commit through curl command is causing delay in issuing commit
> --------------------------------------------------------------
>
>                 Key: SOLR-15045
>                 URL: https://issues.apache.org/jira/browse/SOLR-15045
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: SolrCloud
>    Affects Versions: 8.5.2
>         Environment: Operating system: Linux (centos 7.7.1908)
>            Reporter: Raj Yadav
>            Priority: Major
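
The per-shard delay described above (shard2_0 receiving `start commit` about 10 minutes after the others) could be measured directly from the Solr logs. The following is only a minimal sketch, not part of the original report: it assumes each log line starts with a timestamp in the same layout as the error quoted above (`2020-12-06 18:47:47.013`) and that the commit start is logged on a line containing the text `start commit`. The function names and the shape of the `logs_by_shard` input are hypothetical.

```python
import re
from datetime import datetime

# Timestamp layout assumed from the log line quoted in the report,
# e.g. "2020-12-06 18:47:47.013".
TS_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def first_start_commit(lines, not_before=None):
    """Return the timestamp of the first 'start commit' line, or None.

    `not_before` (optional datetime) skips commits logged before the
    moment the curl command was issued.
    """
    for line in lines:
        if "start commit" not in line:
            continue
        match = TS_PATTERN.match(line)
        if match is None:
            continue  # continuation line or unexpected layout
        ts = datetime.strptime(match.group(1), TS_FORMAT)
        if not_before is None or ts >= not_before:
            return ts
    return None


def commit_lag_seconds(logs_by_shard, not_before=None):
    """Map shard name -> seconds between the earliest 'start commit'
    seen on any shard and the first 'start commit' on that shard.

    `logs_by_shard` is a dict of shard name -> iterable of log lines
    (hypothetical input shape). Shards with no matching line are omitted.
    """
    starts = {}
    for shard, lines in logs_by_shard.items():
        ts = first_start_commit(lines, not_before)
        if ts is not None:
            starts[shard] = ts
    if not starts:
        return {}
    earliest = min(starts.values())
    return {shard: (ts - earliest).total_seconds() for shard, ts in starts.items()}
```

Passing the log lines of all six shards to `commit_lag_seconds` would give a lag close to zero for shards that started together and a lag of roughly 600 seconds for a shard that started about 10 minutes late, which makes the comparison with the solr-5.4.0 setup easy to quantify.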