[ 
https://issues.apache.org/jira/browse/SOLR-15045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raj Yadav updated SOLR-15045:
-----------------------------
    Description: 
Hi All,

When we issue commit through curl command, not all the shards are getting 
`start commit` requests at the same time.

*Solr Setup Detail : (Running in solrCloud mode)*
 It has 6 shards, and each shard has only one replica (which is also a
 leader) and the replica type is NRT.
 Each shards are hosted on the separate physical host.

Zookeeper => We are using external zookeeper ensemble (3 separate node
 cluster)

*Shard and Host name*
 shard1_0=>solr_199
 shard1_1=>solr_200
 shard2_0=> solr_254
 shard2_1=> solr_132
 shard3_0=>solr_133
 shard3_1=>solr_198

*Request rate on the system is currently zero and only hourly indexing*
 *running on it.*

We are using curl command to issue commit.
{code:java}
curl
"http://solr_254:8389/solr/my_collection/update?openSearcher=true&commit=true&wt=json"{code}
(Using solr_254 host to issue commit)

On using the above command all the shards have started processing commit (i.e
 getting `start commit` request) except the one used in curl command (i.e
 shard2_0 which is hosted on solr_254). Individually each shards takes around
 10 to 12 min to process hard commit (most of this time is spent on reloading
 external files).
 As per logs, shard2_0 is getting `start commit` request after 10 minutes
 (approx). This leads to following timeout error.
{code:java}
2020-12-06 18:47:47.013 ERROR
org.apache.solr.client.solrj.SolrServerException: Timeout occured while
waiting response from server at:
http://solr_132:9744/solr/my_collection_shard2_1_replica_n21/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Fsolr_254%3A9744%2Fsolr%2Fmy_collection_shard2_0_replica_n11%2F
      at
org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:407)
      at
org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:753)
      at
org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient.request(ConcurrentUpdateHttp2SolrClient.java:369)
      at
org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290)
      at
org.apache.solr.update.SolrCmdDistributor.doRequest(SolrCmdDistributor.java:344)
      at
org.apache.solr.update.SolrCmdDistributor.lambda$submit$0(SolrCmdDistributor.java:333)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at
com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180)
      at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
      at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
    Caused by: java.util.concurrent.TimeoutException
      at
org.eclipse.jetty.client.util.InputStreamResponseListener.get(InputStreamResponseListener.java:216)
      at
org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:398)
      ... 13 more{code}
Above timeout error is between solr_254 and solr_132. Similar errors are
 there between solr_254 and other 4 shards

Since query load is zero, mostly CPU utilization is around 3%.
 After issuing curl commit command, CPU goes up to 14% on all shards except
 shard2_0 (host: solr_254, the one used in curl command).
 And after 10 minutes (i.e after getting the `start commit` request)  CPU  on
 shard2_0 also goes up to 14%.

As I mentioned earlier each shards take around 10-12 mins to process commit
 and due to delay in starting commit process on one shard (shard2_0) our
 overall commit time is doubled now. (22-24 minutes approx).

*We are observing this delay in both hard and soft commit.*

In our solr-5.4.0(having similar setup), we use the similar curl command to 
issue commit, and there all the shards are getting `start commit` request at 
same time. Including the one used in curl command.

  was:
Hi All,

When we issue commit through curl command, not all the shards are getting 
`start commit` requests at the same time.


*Solr Setup Detail : (Running in solrCloud mode)*
It has 6 shards, and each shard has only one replica (which is also a
leader) and the replica type is NRT.
Each shards are hosted on the separate physical host.

Zookeeper => We are using external zookeeper ensemble (3 separate node
cluster)

*Shard and Host name*
shard1_0=>solr_199
shard1_1=>solr_200
shard2_0=> solr_254
shard2_1=> solr_132
shard3_0=>solr_133
shard3_1=>solr_198

*Request rate on the system is currently zero and only hourly indexing*
*running on it.*

We are using curl command to issue commit.
{code:java}
curl
"http://solr_254:8389/solr/my_collection/update?openSearcher=true&commit=true&wt=json"{code}

(Using solr_254 host to issue commit)

On using the above command all the shards have started processing commit (i.e
getting `start commit` request) except the one used in curl command (i.e
shard2_0 which is hosted on solr_254). Individually each shards takes around
10 to 12 min to process hard commit (most of this time is spent on reloading
external files).
As per logs, shard2_0 is getting `start commit` request after 10 minutes
(approx). This leads to following timeout error.



{code:java}
2020-12-06 18:47:47.013 ERROR
org.apache.solr.client.solrj.SolrServerException: Timeout occured while
waiting response from server at:
http://solr_132:9744/solr/my_collection_shard2_1_replica_n21/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Fsolr_254%3A9744%2Fsolr%2Fmy_collection_shard2_0_replica_n11%2F
      at
org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:407)
      at
org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:753)
      at
org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient.request(ConcurrentUpdateHttp2SolrClient.java:369)
      at
org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290)
      at
org.apache.solr.update.SolrCmdDistributor.doRequest(SolrCmdDistributor.java:344)
      at
org.apache.solr.update.SolrCmdDistributor.lambda$submit$0(SolrCmdDistributor.java:333)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at
com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180)
      at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
      at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
    Caused by: java.util.concurrent.TimeoutException
      at
org.eclipse.jetty.client.util.InputStreamResponseListener.get(InputStreamResponseListener.java:216)
      at
org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:398)
      ... 13 more{code}



Above timeout error is between solr_254 and solr_132. Similar errors are
there between solr_254 and other 4 shards


Since query load is zero, mostly CPU utilization is around 3%.
After issuing curl commit command, CPU goes up to 14% on all shards except
shard2_0 (host: solr_254, the one used in curl command).
And after 10 minutes (i.e after getting the `start commit` request)  CPU  on
shard2_0 also goes up to 14%.

As I mentioned earlier each shards take around 10-12 mins to process commit
and due to delay in starting commit process on one shard (shard2_0) our
overall commit time is doubled now. (22-24 minutes approx).

*We are observing this delay in both hard and soft commit.*


In our solr-5.4.0(having similar setup), we use similar command curl command
to issue commit and there all the shards are getting `start commit` request
at same time. Including the one used in curl command.


> Commit through curl command is causing delay in issuing commit
> --------------------------------------------------------------
>
>                 Key: SOLR-15045
>                 URL: https://issues.apache.org/jira/browse/SOLR-15045
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: 8.5.2
>         Environment: Operating system: Linux (centos 7.7.1908)
>            Reporter: Raj Yadav
>            Priority: Major
>
> Hi All,
> When we issue commit through curl command, not all the shards are getting 
> `start commit` requests at the same time.
> *Solr Setup Detail : (Running in solrCloud mode)*
>  It has 6 shards, and each shard has only one replica (which is also a
>  leader) and the replica type is NRT.
>  Each shards are hosted on the separate physical host.
> Zookeeper => We are using external zookeeper ensemble (3 separate node
>  cluster)
> *Shard and Host name*
>  shard1_0=>solr_199
>  shard1_1=>solr_200
>  shard2_0=> solr_254
>  shard2_1=> solr_132
>  shard3_0=>solr_133
>  shard3_1=>solr_198
> *Request rate on the system is currently zero and only hourly indexing*
>  *running on it.*
> We are using curl command to issue commit.
> {code:java}
> curl
> "http://solr_254:8389/solr/my_collection/update?openSearcher=true&commit=true&wt=json"{code}
> (Using solr_254 host to issue commit)
> On using the above command all the shards have started processing commit (i.e
>  getting `start commit` request) except the one used in curl command (i.e
>  shard2_0 which is hosted on solr_254). Individually each shards takes around
>  10 to 12 min to process hard commit (most of this time is spent on reloading
>  external files).
>  As per logs, shard2_0 is getting `start commit` request after 10 minutes
>  (approx). This leads to following timeout error.
> {code:java}
> 2020-12-06 18:47:47.013 ERROR
> org.apache.solr.client.solrj.SolrServerException: Timeout occured while
> waiting response from server at:
> http://solr_132:9744/solr/my_collection_shard2_1_replica_n21/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Fsolr_254%3A9744%2Fsolr%2Fmy_collection_shard2_0_replica_n11%2F
>       at
> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:407)
>       at
> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:753)
>       at
> org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient.request(ConcurrentUpdateHttp2SolrClient.java:369)
>       at
> org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290)
>       at
> org.apache.solr.update.SolrCmdDistributor.doRequest(SolrCmdDistributor.java:344)
>       at
> org.apache.solr.update.SolrCmdDistributor.lambda$submit$0(SolrCmdDistributor.java:333)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180)
>       at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
>       at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
>     Caused by: java.util.concurrent.TimeoutException
>       at
> org.eclipse.jetty.client.util.InputStreamResponseListener.get(InputStreamResponseListener.java:216)
>       at
> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:398)
>       ... 13 more{code}
> Above timeout error is between solr_254 and solr_132. Similar errors are
>  there between solr_254 and other 4 shards
> Since query load is zero, mostly CPU utilization is around 3%.
>  After issuing curl commit command, CPU goes up to 14% on all shards except
>  shard2_0 (host: solr_254, the one used in curl command).
>  And after 10 minutes (i.e after getting the `start commit` request)  CPU  on
>  shard2_0 also goes up to 14%.
> As I mentioned earlier each shards take around 10-12 mins to process commit
>  and due to delay in starting commit process on one shard (shard2_0) our
>  overall commit time is doubled now. (22-24 minutes approx).
> *We are observing this delay in both hard and soft commit.*
> In our solr-5.4.0(having similar setup), we use the similar curl command to 
> issue commit, and there all the shards are getting `start commit` request at 
> same time. Including the one used in curl command.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to