A little confused about CoreContainer.replayUpdatesExecutor

2020-10-23 Thread slly
Hello everyone.


I'm a little confused about replayUpdatesExecutor: is only one thread ever
running? OrderedExecutor limits the number of tasks that can be submitted at
once to cfg.getReplayUpdatesThreads(), the newMDCAwareCachedThreadPool can
queue up to cfg.getReplayUpdatesThreads() tasks, and the thread pool's
corePoolSize is 0.


 
 CoreContainer.java
    this.replayUpdatesExecutor = new OrderedExecutor(
        cfg.getReplayUpdatesThreads(),
        ExecutorUtil.newMDCAwareCachedThreadPool(
            cfg.getReplayUpdatesThreads(),
            new SolrNamedThreadFactory("replayUpdatesExecutor")));
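The question is really about how a cached thread pool with corePoolSize 0 behaves. A minimal standalone sketch (class name and task count are illustrative, not from Solr) shows that such a pool still spawns one thread per concurrently submitted task, because idle-thread handoff fails and a new thread is created up to the pool's maximum:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class CachedPoolDemo {
    // Submit `tasks` overlapping tasks to a cached pool (corePoolSize 0)
    // and report the peak number running at the same time.
    static int peakConcurrency(int tasks) throws InterruptedException {
        ExecutorService pool = Executors.newCachedThreadPool();
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch started = new CountDownLatch(tasks);
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                // record how many tasks are running right now
                peak.accumulateAndGet(running.incrementAndGet(), Math::max);
                started.countDown();
                try { release.await(); } catch (InterruptedException ignored) {}
                running.decrementAndGet();
            });
        }
        started.await();   // every task is running simultaneously here
        release.countDown();
        pool.shutdown();
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // all 4 tasks run concurrently despite corePoolSize == 0
        System.out.println("peak concurrent tasks: " + peakConcurrency(4));
    }
}
```

So if OrderedExecutor caps the number of in-flight tasks at cfg.getReplayUpdatesThreads(), up to that many pool threads can exist at once; corePoolSize 0 only means idle threads are eventually reclaimed, not that a single thread does all the work.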
Thanks.

Re: TieredMergePolicyFactory question

2020-10-23 Thread Erick Erickson
Well, you mentioned that the segments you're concerned about were merged a year ago.
If segments aren't being merged, they're pretty static.

There's no real harm in optimizing _occasionally_, even in an NRT index. If you
have segments that were merged that long ago, you may be indexing continually,
but it sounds like it's a situation where you update more recent docs rather
than random ones over the entire corpus.

That caution is more for indexes where you essentially replace docs in your
corpus randomly, and it’s really about wasting a lot of cycles rather than
bad stuff happening. When you randomly update documents (or delete them),
the extra work isn’t worth it.

Either operation will involve a lot of CPU cycles and can require that you have
at least as much free space on your disk as the indexes occupy, so do be aware
of that.

All that said, what evidence do you have that this is worth any effort at all?
Depending on the environment, you may not even be able to measure
performance changes so this all may be irrelevant anyway.

But to your question: yes, you can cause regular merging to merge segments with
deleted docs more aggressively by setting deletesPctAllowed in solrconfig.xml.
The default value is 33, and you can set it as low as 20 or as high as 50. We
put a floor of 20% because the cost starts to rise quickly when it's lower than
that, and expungeDeletes is a better alternative at that point.
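For reference, expungeDeletes can be requested as an attribute on a commit command posted to the update handler (per the xml-update-commands page in the ref guide); a minimal example:

```xml
<commit expungeDeletes="true" waitSearcher="false"/>
```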

This is not a hard number, and in practice the percentage of your index that
consists of deleted documents tends to be lower than this, depending of course
on your particular environment.
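A minimal solrconfig.xml sketch of that setting (value illustrative; the factory passes named properties through to Lucene's TieredMergePolicy setters):

```xml
<mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
  <double name="deletesPctAllowed">20</double>
</mergePolicyFactory>
```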

Best,
Erick

> On Oct 23, 2020, at 12:59 PM, Moulay Hicham  wrote:
> 
> Thanks Eric.
> 
> My index is near real time and frequently updated.
> I checked this page
> https://lucene.apache.org/solr/guide/8_1/uploading-data-with-index-handlers.html#xml-update-commands
> and using forceMerge/expungeDeletes are NOT recommended.
> 
> So I was hoping that the change in mergePolicyFactory will affect the
> segments with high percent of deletes as part of the REGULAR segment
> merging cycles. Is my understanding correct?
> 
> 
> 
> 
> On Fri, Oct 23, 2020 at 9:47 AM Erick Erickson 
> wrote:
> 
>> Just go ahead and optimize/forceMerge, but do _not_ optimize to one
>> segment. Or you can expungeDeletes, that will rewrite all segments with
>> more than 10% deleted docs. As of Solr 7.5, these operations respect the 5G
>> limit.
>> 
>> See: https://lucidworks.com/post/solr-and-optimizing-your-index-take-ii/
>> 
>> Best
>> Erick
>> 
>> On Fri, Oct 23, 2020, 12:36 Moulay Hicham  wrote:
>> 
>>> Hi,
>>> 
>>> I am using solr 8.1 in production. We have about 30%-50% of deleted
>>> documents in some old segments that were merged a year ago.
>>> 
>>> These segments size is about 5GB.
>>> 
>>> I was wondering why these segments have a high % of deleted docs and
>> found
>>> out that they are NOT being candidates for merging because the
>>> default TieredMergePolicy maxMergedSegmentMB is 5G.
>>> 
>>> So I have modified the TieredMergePolicyFactory config as below to
>>> lower the delete docs %
>>> 
>>> <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
>>>   <int name="maxMergeAtOnce">10</int>
>>>   <int name="segmentsPerTier">10</int>
>>>   <int name="maxMergedSegmentMB">12000</int>
>>>   <double name="deletesPctAllowed">20</double>
>>> </mergePolicyFactory>
>>> 
>>> 
>>> Do you see any issues with increasing the max merged segment to 12GB and
>>> lowered the deletedPctAllowed to 20%?
>>> 
>>> Thanks,
>>> 
>>> Moulay
>>> 
>> 



Re: TieredMergePolicyFactory question

2020-10-23 Thread Moulay Hicham
Thanks Eric.

My index is near real time and frequently updated.
I checked this page
https://lucene.apache.org/solr/guide/8_1/uploading-data-with-index-handlers.html#xml-update-commands
and using forceMerge/expungeDeletes are NOT recommended.

So I was hoping that the change in mergePolicyFactory will affect the
segments with high percent of deletes as part of the REGULAR segment
merging cycles. Is my understanding correct?




On Fri, Oct 23, 2020 at 9:47 AM Erick Erickson 
wrote:

> Just go ahead and optimize/forceMerge, but do _not_ optimize to one
> segment. Or you can expungeDeletes, that will rewrite all segments with
> more than 10% deleted docs. As of Solr 7.5, these operations respect the 5G
> limit.
>
> See: https://lucidworks.com/post/solr-and-optimizing-your-index-take-ii/
>
> Best
> Erick
>
> On Fri, Oct 23, 2020, 12:36 Moulay Hicham  wrote:
>
> > Hi,
> >
> > I am using solr 8.1 in production. We have about 30%-50% of deleted
> > documents in some old segments that were merged a year ago.
> >
> > These segments size is about 5GB.
> >
> > I was wondering why these segments have a high % of deleted docs and
> found
> > out that they are NOT being candidates for merging because the
> > default TieredMergePolicy maxMergedSegmentMB is 5G.
> >
> > So I have modified the TieredMergePolicyFactory config as below to
> > lower the delete docs %
> >
> > <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
> >   <int name="maxMergeAtOnce">10</int>
> >   <int name="segmentsPerTier">10</int>
> >   <int name="maxMergedSegmentMB">12000</int>
> >   <double name="deletesPctAllowed">20</double>
> > </mergePolicyFactory>
> >
> >
> > Do you see any issues with increasing the max merged segment to 12GB and
> > lowered the deletedPctAllowed to 20%?
> >
> > Thanks,
> >
> > Moulay
> >
>


Re: TieredMergePolicyFactory question

2020-10-23 Thread Erick Erickson
Just go ahead and optimize/forceMerge, but do _not_ optimize to one
segment. Or you can expungeDeletes, that will rewrite all segments with
more than 10% deleted docs. As of Solr 7.5, these operations respect the 5G
limit.

See: https://lucidworks.com/post/solr-and-optimizing-your-index-take-ii/

Best
Erick

On Fri, Oct 23, 2020, 12:36 Moulay Hicham  wrote:

> Hi,
>
> I am using solr 8.1 in production. We have about 30%-50% of deleted
> documents in some old segments that were merged a year ago.
>
> These segments size is about 5GB.
>
> I was wondering why these segments have a high % of deleted docs and found
> out that they are NOT being candidates for merging because the
> default TieredMergePolicy maxMergedSegmentMB is 5G.
>
> So I have modified the TieredMergePolicyFactory config as below to
> lower the delete docs %
>
> <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
>   <int name="maxMergeAtOnce">10</int>
>   <int name="segmentsPerTier">10</int>
>   <int name="maxMergedSegmentMB">12000</int>
>   <double name="deletesPctAllowed">20</double>
> </mergePolicyFactory>
>
>
> Do you see any issues with increasing the max merged segment to 12GB and
> lowered the deletedPctAllowed to 20%?
>
> Thanks,
>
> Moulay
>


TieredMergePolicyFactory question

2020-10-23 Thread Moulay Hicham
Hi,

I am using solr 8.1 in production. We have about 30%-50% of deleted
documents in some old segments that were merged a year ago.

These segments are about 5GB in size.

I was wondering why these segments have a high % of deleted docs and found
out that they are NOT being candidates for merging because the
default TieredMergePolicy maxMergedSegmentMB is 5G.

So I have modified the TieredMergePolicyFactory config as below to
lower the delete docs %


<mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
  <int name="maxMergeAtOnce">10</int>
  <int name="segmentsPerTier">10</int>
  <int name="maxMergedSegmentMB">12000</int>
  <double name="deletesPctAllowed">20</double>
</mergePolicyFactory>


Do you see any issues with increasing the max merged segment to 12GB and
lowering deletesPctAllowed to 20%?

Thanks,

Moulay


Metric Trigger not being recognised & picked up

2020-10-23 Thread Jonathan Tan
Hi All

I've been trying to get a metric trigger set up in SolrCloud 8.4.1, but
it's not working, and was hoping for some help.

I've created a metric trigger using this:

```
POST /solr/admin/autoscaling {
  "set-trigger": {
"name": "metric_trigger",
"event": "metric",
"waitFor": "10s",
"metric": "metrics:solr.jvm:os.systemCpuLoad",
"above": 0.7,
"preferredOperation": "MOVEREPLICA",
"enabled": true
  }
}
```

And I get a successful response.

I can also see the new trigger in the `files -> tree -> autoscaling.json`.

However, I don't see any difference in the logs (I had the autoscaling
logging set to debug), and it's definitely not moving any replicas around
when under load, and the node is consistently in the > 85% overall
systemCpuLoad. (I can see this as well when I use the `/metrics` endpoint
with the above key.)


I then restarted all the nodes, and saw this error on startup, saying it
couldn't set the state during a restore, with the worrying part saying that
it is discarding the trigger...

I'd really like some help with this.

We've been seeing that out of the 3 nodes, one is always - seemingly
randomly - massively utilised on CPU (maxing out 8 cores, and it's not
always the one with the overseer), so we were hoping that we could let the
Metric Trigger sort it out in the short term.

```
2020-10-22 23:03:19.905 ERROR (ScheduledTrigger-7-thread-3) [   ]
o.a.s.c.a.ScheduledTriggers Error restoring trigger state jvm_cpu_trigger
=> java.lang.NullPointerException
at
org.apache.solr.cloud.autoscaling.MetricTrigger.setState(MetricTrigger.java:94)
java.lang.NullPointerException: null
at
org.apache.solr.cloud.autoscaling.MetricTrigger.setState(MetricTrigger.java:94)
~[?:?]
at
org.apache.solr.cloud.autoscaling.TriggerBase.restoreState(TriggerBase.java:279)
~[?:?]
at
org.apache.solr.cloud.autoscaling.ScheduledTriggers$TriggerWrapper.run(ScheduledTriggers.java:638)
~[?:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
~[?:?]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[?:?]
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
~[?:?]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
~[?:?]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
~[?:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
2020-10-22 23:03:19.912 ERROR (ScheduledTrigger-7-thread-1) [   ]
o.a.s.c.a.ScheduledTriggers Failed to re-play event, discarding: {
  "id":"dd2ebf3d56bTboddkoovyjxdvy1hauq2zskpt",
  "source":"metric_trigger",
  "eventTime":15199552918891,
  "eventType":"METRIC",
  "properties":{

"node":{"mycoll-solr-solr-service-1.mycoll-solr-solr-service-headless.mycoll-solr-test:8983_solr":0.7322834645669292},
"_dequeue_time_":261690991035,
"metric":"metrics:solr.jvm:os.systemCpuLoad",
"preferredOperation":"MOVEREPLICA",
"_enqueue_time_":15479182216601,
"requestedOps":[{
"action":"MOVEREPLICA",

"hints":{"SRC_NODE":["mycoll-solr-solr-service-1.mycoll-solr-solr-service-headless.mycoll-solr-test:8983_solr"]}}],
"replaying":true}}
2020-10-22 23:03:19.913 INFO
 
(OverseerStateUpdate-144115201265369088-mycoll-solr-solr-service-0.mycoll-solr-solr-service-headless.mycoll-solr-test:8983_solr-n_000199)
[   ] o.a.s.c.o.SliceMutator createReplica() {
  "operation":"addreplica",
  "collection":"mycoll-2",
  "shard":"shard5",
  "core":"mycoll-2_shard5_replica_n122",
  "state":"down",
  "base_url":"
http://mycoll-solr-solr-service-0.mycoll-solr-solr-service-headless.mycoll-solr-test:8983/solr
",

"node_name":"mycoll-solr-solr-service-0.mycoll-solr-solr-service-headless.mycoll-solr-test:8983_solr",
  "type":"NRT"}
2020-10-22 23:03:19.921 ERROR (ScheduledTrigger-7-thread-1) [   ]
o.a.s.c.a.ScheduledTriggers Error restoring trigger state metric_trigger =>
java.lang.NullPointerException
at
org.apache.solr.cloud.autoscaling.MetricTrigger.setState(MetricTrigger.java:94)
java.lang.NullPointerException: null
at
org.apache.solr.cloud.autoscaling.MetricTrigger.setState(MetricTrigger.java:94)
~[?:?]
at
org.apache.solr.cloud.autoscaling.TriggerBase.restoreState(TriggerBase.java:279)
~[?:?]
at
org.apache.solr.cloud.autoscaling.ScheduledTriggers$TriggerWrapper.run(ScheduledTriggers.java:638)
~[?:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
~[?:?]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[?:?]
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
~[?:?]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
~[?:?]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
~[?:?]
at java.lang.Thread.run(Thread.java:834) [?:?]

```


Any help please?
Thank you
Jonathan