Re: Commits (with openSearcher = true) are too slow in solr 8

2020-12-09 Thread raj.yadav
Hi All,

I tried debugging but was unable to find a solution. Do let me know if the
details/logs I shared are not sufficient/clear.

Regards,
Raj



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Commits (with openSearcher = true) are too slow in solr 8

2020-12-08 Thread raj.yadav
matthew sporleder wrote
> I would stick to soft commits and schedule hard-commits as
> spaced-out-as-possible in regular maintenance windows until you can
> find the culprit of the timeout.
> 
> This way you will have very focused windows for intense monitoring
> during the hard-commit runs.

*Little correction:*
In my last post I mentioned that softCommit was working fine, with no delay
or error message. Here is what is actually happening:

1. Hard commit with openSearcher=true
curl
"http://:solr_port/solr/my_collection/update?openSearcher=true&commit=true&wt=json"

All the cores started processing the commit except the one hosted on ``.
We are also getting a timeout error on this request.

2. softCommit
curl
"http://:solr_port/solr/my_collection/update?softCommit=true&wt=json"
Same as 1.

3. Hard commit with openSearcher=false
curl
"http://:solr_port/solr/my_collection/update?openSearcher=false&commit=true&wt=json"
All the cores started processing the commit immediately and there is no error.


Solr commands used to set up the system

Solr start command
#/var/solr-8.5.2/bin/solr start -c  -p solr_port  -z
zk_host1:zk_port,zk_host1:zk_port,zk_host1:zk_port -s
/var/node_my_collection_1/solr-8.5.2/server/solr -h  -m 26g
-DzkClientTimeout=3 -force



Create collection
1. Upload config to ZooKeeper:
#var/solr-8.5.2/server/scripts/cloud-scripts/./zkcli.sh -z
zk_host1:zk_port,zk_host1:zk_port,zk_host1:zk_port  -cmd upconfig -confname
my_collection  -confdir /

2. Created collection with 3 shards (shard1, shard2, shard3):
#curl
"http://:solr_port/solr/admin/collections?action=CREATE&name=my_collection&numShards=3&replicationFactor=1&maxShardsPerNode=1&collection.configName=my_collection&createNodeSet=solr_node1:solr_port,solr_node2:solr_port,solr_node3:solr_port"

3. Used the SPLITSHARD command to split each shard into two halves
(shard1_1, shard1_0, shard2_0, ...), e.g.:
 #curl
"http://:solr_port/solr/admin/collections?action=SPLITSHARD&collection=my_collection&shard=shard1"

4. Used the DELETESHARD command to delete the old shards (shard1, shard2,
shard3), e.g.:
 #curl
"http://:solr_port/solr/admin/collections?action=DELETESHARD&collection=my_collection&shard=shard1"











Re: Commits (with openSearcher = true) are too slow in solr 8

2020-12-07 Thread matthew sporleder
I would stick to soft commits and schedule hard-commits as
spaced-out-as-possible in regular maintenance windows until you can
find the culprit of the timeout.

This way you will have very focused windows for intense monitoring
during the hard-commit runs.


On Mon, Dec 7, 2020 at 9:24 AM raj.yadav  wrote:
>
> Hi Folks,
>
> Do let me know if any more information required to debug this.
>
>
> Regards,
> Raj
>
>
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Commits (with openSearcher = true) are too slow in solr 8

2020-12-07 Thread raj.yadav
Hi Folks,

Do let me know if any more information is required to debug this.


Regards,
Raj





Re: Commits (with openSearcher = true) are too slow in solr 8

2020-12-06 Thread raj.yadav
matthew sporleder wrote
> Is zookeeper on the solr hosts or on its own?  Have you tried
> opensearcher=false (soft commit?)

1. We are using ZooKeeper in ensemble mode; it's hosted on 3 separate nodes.
2. Soft commit (openSearcher=false) is working fine. All the shards receive
the commit request immediately and it gets processed within a second.







Re: Commits (with openSearcher = true) are too slow in solr 8

2020-12-06 Thread matthew sporleder
Is zookeeper on the solr hosts or on its own?  Have you tried
opensearcher=false (soft commit?)

On Sun, Dec 6, 2020 at 6:19 PM raj.yadav  wrote:
>
> Hi Everyone,
>
> [quoted message trimmed; the message is repeated below in this thread]

Re: Commits (with openSearcher = true) are too slow in solr 8

2020-12-06 Thread raj.yadav
Hi Everyone,


matthew sporleder wrote
> Are you stuck in iowait during that commit?

During the commit operation there is no iowait.
In fact, most of the time the CPU utilization percentage is very low.

/*As I mentioned in my previous post, we are getting `SolrCmdDistributor
org.apache.solr.client.solrj.SolrServerException: Timeout occured while
waiting response from server` and `DistributedZkUpdateProcessor` ERROR on
one of the shards. This error always occurs on the shard that is used (in
the curl command) to issue the commit. (See the example below.)*/

Here are the shards and their corresponding nodes:
shard1_0=>solr_199
shard1_1=>solr_200
shard2_0=> solr_254
shard2_1=> solr_132
shard3_0=>solr_133
shard3_1=>solr_198

We are using the following command to issue the commit:
/curl
"http://solr_node:8389/solr/my_collection/update?openSearcher=true&commit=true&wt=json"/

For example, in the above command, if we replace solr_node with solr_254,
it throws SolrCmdDistributor and DistributedZkUpdateProcessor errors on
shard2_0. Similarly, if we replace solr_node with solr_200, it throws
errors on shard1_1.

*I'm not able to figure out why this is happening. Is there a connection
timeout setting affecting this? Is there a limit such that only N shards
can run commit operations simultaneously, or is it some network-related
issue?*
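
One place such a timeout could be configured (a hedged pointer, not a
confirmed cause) is the pair of distributed-update timeout settings in
solr.xml, which bound how long one node waits for another node's response to
a forwarded update or commit. The values below are illustrative placeholders,
not recommendations:

```xml
<!-- solr.xml sketch: distributed-update timeouts, in milliseconds.
     Illustrative values only; actual defaults vary by Solr version. -->
<solrcloud>
  <int name="distribUpdateConnTimeout">${distribUpdateConnTimeout:60000}</int>
  <int name="distribUpdateSoTimeout">${distribUpdateSoTimeout:600000}</int>
</solrcloud>
```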


For a better understanding of what's happening in the Solr logs, I will
walk through one commit operation here.

I used the command below to issue a commit at `2020-12-06 18:37:40` (approx):
curl
"http://solr_200:8389/solr/my_collection/update?openSearcher=true&commit=true&wt=json"


/*shard2_0 (node: solr_254) Logs:*/


*The commit was received at `2020-12-06 18:37:47` and was over by `2020-12-06
18:37:47`, since there were no changes to commit. CPU utilization during
the whole period was around 2%.*


2020-12-06 18:37:47.023 INFO  (qtp2034610694-31355) [c:my_collection
s:shard2_0 r:core_node13 x:my_collection_shard2_0_replica_n11]
o.a.s.u.DirectUpdateHandler2 start
commit{_version_=1685355093842460672,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
2020-12-06 18:37:47.023 INFO No uncommitted changes. Skipping IW.commit.
2020-12-06 18:37:47.023 INFO end_commit_flush
2020-12-06 18:37:47.023 INFO  (qtp2034610694-31355) [c:my_collection
s:shard2_0 r:core_node13 x:my_collection_shard2_0_replica_n11]
o.a.s.u.p.LogUpdateProcessorFactory [my_collection_shard2_0_replica_n11] 
webapp=/solr path=/update

params={update.distrib=TOLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://solr_200:8389/solr/my_collection_shard1_1_replica_n19/&commit_end_point=leaders&wt=javabin&version=2&expungeDeletes=false}{commit=} 0 3

/*shard2_1 (node: solr_132) Logs:*/

*The commit was received at `2020-12-06 18:37:47` and was over by `2020-12-06
18:50:46`; in between there were some external-file reloading operations (our
solr-5.4.2 system takes a similar time to reload external files, so right
now this is not a major concern for us).
CPU utilization before the commit (i.e., at the `2020-12-06 18:37:47`
timestamp) was 2%, during the commit (from `2020-12-06 18:37:47` to
`2020-12-06 18:50:46`) it was 14%, and after the commit operation it fell
back to 2%.*


2020-12-06 18:37:47.024 INFO  (qtp2034610694-30058) [c:my_collection
s:shard2_1 r:core_node22 x:my_collection_shard2_1_replica_n21]
o.a.s.u.DirectUpdateHandler2 start
commit{_version_=1685355093844557824,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

2020-12-06 18:50:46.218 INFO  (qtp2034610694-30058) [c:my_collection
s:shard2_1 r:core_node22 x:my_collection_shard2_1_replica_n21]
o.a.s.u.p.LogUpdateProcessorFactory [my_collection_shard2_1_replica_n21] 
webapp=/solr path=/update
params={update.distrib=TOLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://solr_200:8389/solr/my_collection_shard1_1_replica_n19/&commit_end_point=leaders&wt=javabin&version=2&expungeDeletes=false}{commit=}
0 779196


/*shard3_0 (node: solr_133) logs*/

Same as shard2_1: the commit was received at `2020-12-06 18:37:47` and was
over by `2020-12-06 18:49:24`.
The CPU utilization pattern is the same as shard2_1.

/*shard3_1 (node: solr_198) logs.*/

Same as shard2_1: the commit was received at `2020-12-06 18:37:47` and was
over by `2020-12-06 18:53:57`.
The CPU utilization pattern is the same as shard2_1.

/*shard1_0 (node: solr_199) logs.*/

Same as shard2_1: the commit was received at `2020-12-06 18:37:47` and was
over by `2020-12-06 18:54:51`.
The CPU utilization pattern is the same as shard2_1.

/*shard1_1 (node: solr_200) logs.*/

/This is the same solr_node that we used in the curl command to issue the
commit. As expected, we got SolrCmdDistributor and DistributedZkUpdateProcessor
errors on it./

/Until the `2020-12-06 18:46:50` timestamp, no `start commit` request was
received. CPU utilization was also at 2% until this time./
/*Received the following error at the `2020-12-06 18:47:47` timestamp:*/

2020-12-06 18:47:47.013 ERROR
(updateExecutor-5-thread-6-processing-n:solr_200:8389_solr

Re: Commits (with openSearcher = true) are too slow in solr 8

2020-12-06 Thread raj.yadav
matthew sporleder wrote
> On unix the top command will tell you.  On windows you need to find
> the disk latency stuff.

Will check this and report back here.



matthew sporleder wrote
> Are you on a spinning disk or on a (good) SSD?

We are using SSDs.


matthew sporleder wrote
> Anyway, my theory is that trying to do too many commits in parallel
> (too many or not enough shards) is causing iowait = high latency to
> work through.

Can you please elaborate on this?






Re: Commits (with openSearcher = true) are too slow in solr 8

2020-12-06 Thread matthew sporleder
On unix the top command will tell you.  On windows you need to find
the disk latency stuff.

Are you on a spinning disk or on a (good) SSD?

Anyway, my theory is that trying to do too many commits in parallel
(too many or not enough shards) is causing iowait, i.e., high latency to
work through.

On Sun, Dec 6, 2020 at 9:05 AM raj.yadav  wrote:
>
> matthew sporleder wrote
> > Are you stuck in iowait during that commit?
>
> I'm not sure how to determine that; could you help me out here?
>
>
>
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Commits (with openSearcher = true) are too slow in solr 8

2020-12-06 Thread raj.yadav
matthew sporleder wrote
> Are you stuck in iowait during that commit?

I'm not sure how to determine that; could you help me out here?






Re: Commits (with openSearcher = true) are too slow in solr 8

2020-12-04 Thread matthew sporleder
Are you stuck in iowait during that commit?



On Fri, Dec 4, 2020 at 6:28 AM raj.yadav  wrote:
>
> Hi everyone,
>
> As per suggestions in previous post (by Erick and Shawn) we did following
> changes.
>
> OLD CACHE CONFIG
> <filterCache class="solr.CaffeineCache"
>  size="32768"
>  initialSize="6000"
>  autowarmCount="6000"/>
>
> <queryResultCache class="solr.CaffeineCache"
>   size="25600"
>   initialSize="6000"
>   autowarmCount="0"/>
>
> <documentCache class="solr.CaffeineCache"
>    size="32768"
>    initialSize="6144"
>    autowarmCount="0"/>
>
> NEW CACHE CONFIG
> <filterCache class="solr.CaffeineCache"
>  size="8192"
>  initialSize="512"
>  autowarmCount="512"/>
>
> <queryResultCache class="solr.CaffeineCache"
>   size="8192"
>   initialSize="3000"
>   autowarmCount="0"/>
>
> <documentCache class="solr.CaffeineCache"
>    size="8192"
>    initialSize="3072"
>    autowarmCount="0"/>
>
>
> *Reduced JVM heap size from 30GB to 26GB*
>
>
>
> *Currently query request rate on the system is zero.
> But still, commit with openSearcher=true is taking 25 mins.*
>
> We looked into solr logs, and observed the following things:
>
> 1. /Once the commit is issued, five of the six shards (shard1_0, shard1_1,
> shard2_0, shard3_0, shard3_1) immediately started processing the commit, but
> on one shard (shard2_1) we are getting the following error:/
>
> 2020-12-03 12:29:17.518 ERROR
> (updateExecutor-5-thread-6-processing-n:solr_132:8389_solr
> x:my_collection_shard2_1_replica_n21 c:my_collection s:shard2_1
> r:core_node22) [c:my_collection s:shard2_1 r:core_node22
> x:my_collection_shard2_1_replica_n21] o.a.s.u.SolrCmdDistributor
> org.apache.solr.client.solrj.SolrServerException: Timeout occured while
> waiting response from server at:
> http://solr_198:8389/solr/my_collection_shard3_1_replica_n23/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Fsolr_132%3A8389%2Fsolr%2Fmy_collection_shard2_1_replica_n21%2F
> at
> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:407)
> at
> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:753)
> at
> org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient.request(ConcurrentUpdateHttp2SolrClient.java:369)
> at 
> org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290)
> at
> org.apache.solr.update.SolrCmdDistributor.doRequest(SolrCmdDistributor.java:344)
> at
> org.apache.solr.update.SolrCmdDistributor.lambda$submit$0(SolrCmdDistributor.java:333)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180)
> at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.util.concurrent.TimeoutException
> at
> org.eclipse.jetty.client.util.InputStreamResponseListener.get(InputStreamResponseListener.java:216)
> at
> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:398)
> ... 13 more
>
> 2020-12-03 12:29:17.518 ERROR
> (updateExecutor-5-thread-2-processing-n:solr_132:8389_solr
> x:my_collection_shard2_1_replica_n21 c:my_collection s:shard2_1
> r:core_node22) [c:my_collection s:shard2_1 r:core_node22
> x:my_collection_shard2_1_replica_n21] o.a.s.u.SolrCmdDistributor
> org.apache.solr.client.solrj.SolrServerException: Timeout occured while
> waiting response from server at:
> http://solr_199:8389/solr/my_collection_shard1_0_replica_n7/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Fsolr_132%3A8389%2Fsolr%2Fmy_collection_shard2_1_replica_n21%2F
> at
> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:407)
> at
> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:753)
> at
> org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient.request(ConcurrentUpdateHttp2SolrClient.java:369)
> at 
> org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290)
> at
> org.apache.solr.update.SolrCmdDistributor.doRequest(SolrCmdDistributor.java:344)
> at
> org.apache.solr.update.SolrCmdDistributor.lambda$submit$0(SolrCmdDistributor.java:333)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> 

Re: Commits (with openSearcher = true) are too slow in solr 8

2020-12-02 Thread raj.yadav
Hi everyone,

As per the suggestions in the previous post (by Erick and Shawn), we made the
following changes.

OLD
<filterCache class="solr.CaffeineCache"
 size="32768"
 initialSize="6000"
 autowarmCount="6000"/>

<queryResultCache class="solr.CaffeineCache"
 size="25600"
 initialSize="6000"
 autowarmCount="0"/>

<documentCache class="solr.CaffeineCache"
 size="32768"
 initialSize="6144"
 autowarmCount="0"/>

NEW
<filterCache class="solr.CaffeineCache"
 size="8192"
 initialSize="512"
 autowarmCount="512"/>

<queryResultCache class="solr.CaffeineCache"
 size="8192"
 initialSize="3000"
 autowarmCount="0"/>

<documentCache class="solr.CaffeineCache"
 size="8192"
 initialSize="3072"
 autowarmCount="0"/>

*Reduced JVM heap size from 30GB to 26GB*

GC setting:
GC_TUNE=" \
-XX:+UseG1GC \
-XX:+PerfDisableSharedMem \
-XX:+ParallelRefProcEnabled \
-XX:G1HeapRegionSize=8m \
-XX:MaxGCPauseMillis=150 \
-XX:InitiatingHeapOccupancyPercent=60 \
-XX:+UseLargePages \
-XX:+AggressiveOpts \
"

Solr collection details (running in SolrCloud mode):
It has 6 shards; each shard has only one replica (which is also the
leader) and the replica type is NRT.
Each shard's index size: 11 GB
avg size/doc: 1.0 KB
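
For rough scale, the two figures above imply the per-shard document count,
and with it the size of one filterCache entry (a back-of-the-envelope sketch
assuming ~1.0 KB/doc and roughly one bit per document per filterCache entry):

```shell
# Back-of-the-envelope sketch using the figures above (all approximate).
index_kb=$((11 * 1024 * 1024))       # 11 GB index per shard, in KB
docs_per_shard=$index_kb             # at ~1.0 KB per doc
entry_bytes=$((docs_per_shard / 8))  # filterCache entry ~ one bit per doc
echo "~${docs_per_shard} docs/shard, ~${entry_bytes} bytes per filterCache entry"
```

That works out to roughly 11.5 million documents per shard and about 1.4 MB
per filterCache entry, the same arithmetic Shawn applies earlier in the thread.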

We are running indexing on this collection:
*Indexing rate: 2.4 million per hour*

*The query rate is zero. Still, a commit with openSearcher=true takes 25 to
28 minutes.*
Is this because of heavy indexing? Also, commit time increases as the number
of documents in the collection grows.
This is not our production system. In the prod system our indexing rate is
generally 5k/hour.

Is it expected to have such a high commit time with the above indexing rate?





Re: Commits (with openSearcher = true) are too slow in solr 8

2020-11-09 Thread raj.yadav
Thanks, Shawn and  Erick.
We are step by step trying out the changes suggested in your post.
Will get back once we have some numbers.





Re: Commits (with openSearcher = true) are too slow in solr 8

2020-11-04 Thread Erick Erickson
I completely agree with Shawn. I’d emphasize that your heap is that large
probably to accommodate badly mis-configured caches.

Why it’s different in 5.4 I don’t quite know, but 10-12
minutes is unacceptable anyway.

My guess is that you made your heaps that large as a consequence of
having low hit rates. If you were using bare NOW in fq clauses,
perhaps you were getting very low hit rates as a result and expanded
the cache size, see:

https://dzone.com/articles/solr-date-math-now-and-filter

At any rate, I _strongly_ recommend that you drop your filterCache
to the default size of 512, and drop your autowarmCount to something
very small, say 16. Ditto for queryResultCache. The documentCache
to maybe 10,000 (autowarm is a no-op for documentCache). Then
drop your heap to something closer to 16G. Then test, tune, test. Do
NOT assume bigger caches are the answer until you have evidence.
Keep reducing your heap size until you start to see GC problems (on 
a test system obviously) to get your lower limit. Then add some
back for your production to give you some breathing room.
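
Written out as a solrconfig.xml fragment, the suggested starting point above
might look like the sketch below (element names follow the standard
solrconfig.xml cache section; solr.CaffeineCache is assumed from the Solr 8
setup discussed in this thread):

```xml
<!-- Sketch of the suggested starting sizes; test, tune, and re-test from here. -->
<filterCache      class="solr.CaffeineCache" size="512"   initialSize="512"   autowarmCount="16"/>
<queryResultCache class="solr.CaffeineCache" size="512"   initialSize="512"   autowarmCount="16"/>
<documentCache    class="solr.CaffeineCache" size="10000" initialSize="10000" autowarmCount="0"/>
```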

Finally, see Uwe’s blog:

https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

to get a sense of why the size on disk is not necessarily a good
indicator of the heap requirements.

Best,
Erick

> On Nov 4, 2020, at 2:40 AM, Shawn Heisey  wrote:
> 
> On 11/3/2020 11:46 PM, raj.yadav wrote:
>> We have two parallel system one is  solr 8.5.2 and other one is solr 5.4
>> In solr_5.4 commit time with opensearcher true is 10 to 12 minutes while in
>> solr_8 it's around 25 minutes.
> 
> Commits on a properly configured and sized system should take a few seconds, 
> not minutes.  10 to 12 minutes for a commit is an enormous red flag.
> 
>> This is our current caching policy of solr_8
> > <filterCache class="solr.CaffeineCache"
> >  size="32768"
> >  initialSize="6000"
> >  autowarmCount="6000"/>
> 
> This is probably the culprit.  Do you know how many entries the filterCache 
> actually ends up with?  What you've said with this config is "every time I 
> open a new searcher, I'm going to execute up to 6000 queries against the new 
> index."  If each query takes one second, running 6000 of them is going to 
> take 100 minutes.  I have seen these queries take a lot longer than one 
> second.
> 
> Also, each entry in the filterCache can be enormous, depending on the number 
> of docs in the index.  Let's say that you have five million documents in your 
> core.  With five million documents, each entry in the filterCache is going to 
> be 625000 bytes.  That means you need 20GB of heap memory for a full 
> filterCache of 32768 entries -- 20GB of memory above and beyond everything 
> else that Solr requires.  Your message doesn't say how many documents you 
> have, it only says the index is 11GB.  From that, it is not possible for me 
> to figure out how many documents you have.
> 
>> While debugging this we came across this page.
>> https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems#SolrPerformanceProblems-Slowcommits
> 
> I wrote that wiki page.
> 
>> Here one of the reasons for slow commit is mentioned as:
>> */`Heap size issues. Problems from the heap being too big will tend to be
>> infrequent, while problems from the heap being too small will tend to happen
>> consistently.`/*
>> Can anyone please help me understand the above point?
> 
> If your heap is a lot bigger than it needs to be, then what you'll see is 
> slow garbage collections, but it won't happen very often.  If the heap is too 
> small, then there will be garbage collections that happen REALLY often, 
> leaving few system resources for actually running the program.  This applies 
> to ANY Java program, not just Solr.
> 
>> System config:
>> disk size: 250 GB
>> cpu: (8 vcpus, 64 GiB memory)
>> Index size: 11 GB
>> JVM heap size: 30 GB
> 
> That heap seems to be a lot larger than it needs to be.  I have run systems 
> with over 100GB of index, with tens of millions of documents, on an 8GB heap. 
>  My filterCache on each core had a max size of 64, with an autowarmCount of 
> four ... and commits STILL would take 10 to 15 seconds, which I consider to 
> be very slow.  Most of that time was spent executing those four queries in 
> order to autowarm the filterCache.
> 
> What I would recommend you start with is reducing the size of the 
> filterCache.  Try a size of 128 and an autowarmCount of 8, see what you get 
> for a hit rate on the cache.  Adjust from there as necessary.  And I would 
> reduce the heap size for Solr as well -- your heap requirements should drop 
> dramatically with a reduced filterCache.
> 
> Thanks,
> Shawn



Re: Commits (with openSearcher = true) are too slow in solr 8

2020-11-03 Thread Shawn Heisey

On 11/3/2020 11:46 PM, raj.yadav wrote:

We have two parallel systems: one is Solr 8.5.2 and the other is Solr 5.4.
In solr_5.4, commit time with openSearcher=true is 10 to 12 minutes, while in
solr_8 it's around 25 minutes.


Commits on a properly configured and sized system should take a few 
seconds, not minutes.  10 to 12 minutes for a commit is an enormous red 
flag.



This is our current caching policy of solr_8:

<filterCache class="solr.CaffeineCache"
 size="32768"
 initialSize="6000"
 autowarmCount="6000"/>

This is probably the culprit.  Do you know how many entries the 
filterCache actually ends up with?  What you've said with this config is 
"every time I open a new searcher, I'm going to execute up to 6000 
queries against the new index."  If each query takes one second, running 
6000 of them is going to take 100 minutes.  I have seen these queries 
take a lot longer than one second.


Also, each entry in the filterCache can be enormous, depending on the 
number of docs in the index.  Let's say that you have five million 
documents in your core.  With five million documents, each entry in the 
filterCache is going to be 625000 bytes.  That means you need 20GB of 
heap memory for a full filterCache of 32768 entries -- 20GB of memory 
above and beyond everything else that Solr requires.  Your message 
doesn't say how many documents you have, it only says the index is 11GB. 
 From that, it is not possible for me to figure out how many documents 
you have.
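
The two estimates above reduce to quick shell arithmetic (the five-million-
document count is an illustrative assumption from the text, not a measured
number):

```shell
# Sketch of the arithmetic above, with the illustrative numbers from the text.
docs=5000000                        # hypothetical document count
entry_bytes=$((docs / 8))           # one bit per doc per filterCache entry
cache_entries=32768
total_gb=$((entry_bytes * cache_entries / 1000000000))
echo "${entry_bytes} bytes/entry, ~${total_gb} GB for a full ${cache_entries}-entry cache"
# Autowarm cost: up to 6000 queries at ~1 second each when a searcher opens
echo "~$((6000 / 60)) minutes to run 6000 autowarm queries"
```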



While debugging this we came across this page.
https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems#SolrPerformanceProblems-Slowcommits


I wrote that wiki page.


Here one of the reasons for slow commit is mentioned as:
*/`Heap size issues. Problems from the heap being too big will tend to be
infrequent, while problems from the heap being too small will tend to happen
consistently.`/*

Can anyone please help me understand the above point?


If your heap is a lot bigger than it needs to be, then what you'll see 
is slow garbage collections, but it won't happen very often.  If the 
heap is too small, then there will be garbage collections that happen 
REALLY often, leaving few system resources for actually running the 
program.  This applies to ANY Java program, not just Solr.



System config:
disk size: 250 GB
cpu: (8 vcpus, 64 GiB memory)
Index size: 11 GB
JVM heap size: 30 GB


That heap seems to be a lot larger than it needs to be.  I have run 
systems with over 100GB of index, with tens of millions of documents, on 
an 8GB heap.  My filterCache on each core had a max size of 64, with an 
autowarmCount of four ... and commits STILL would take 10 to 15 seconds, 
which I consider to be very slow.  Most of that time was spent executing 
those four queries in order to autowarm the filterCache.


What I would recommend you start with is reducing the size of the 
filterCache.  Try a size of 128 and an autowarmCount of 8, see what you 
get for a hit rate on the cache.  Adjust from there as necessary.  And I 
would reduce the heap size for Solr as well -- your heap requirements 
should drop dramatically with a reduced filterCache.


Thanks,
Shawn


Commits (with openSearcher = true) are too slow in solr 8

2020-11-03 Thread raj.yadav
Hi everyone,
We have two parallel systems: one is Solr 8.5.2 and the other is Solr 5.4.
In solr_5.4, commit time with openSearcher=true is 10 to 12 minutes, while in
solr_8 it's around 25 minutes.

This is our current caching policy of solr_8:

<filterCache class="solr.CaffeineCache"
 size="32768"
 initialSize="6000"
 autowarmCount="6000"/>

<queryResultCache class="solr.CaffeineCache"
 size="25600"
 initialSize="6000"
 autowarmCount="0"/>

<documentCache class="solr.CaffeineCache"
 size="32768"
 initialSize="6144"
 autowarmCount="0"/>

In Solr 5 we are using FastLRUCache (instead of CaffeineCache); the other
parameters are the same.

While debugging this we came across this page.
https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems#SolrPerformanceProblems-Slowcommits

There, one of the reasons mentioned for slow commits is:
*/`Heap size issues. Problems from the heap being too big will tend to be
infrequent, while problems from the heap being too small will tend to happen
consistently.`/*

Can anyone please help me understand the above point?

System config:
disk size: 250 GB
cpu: (8 vcpus, 64 GiB memory)
Index size: 11 GB
JVM heap size: 30 GB


