[ https://issues.apache.org/jira/browse/SOLR-12470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506437#comment-16506437 ]
Andrzej Bialecki commented on SOLR-12470: ------------------------------------------ I'll look into it more closely - a quick comment for now: bq. My first expectation was that I'd see 3 docs but I saw 4 docs. Curious why it's 4 ( the docs are attached as 4_docs.json ) These are STARTED and SUCCEEDED events produced by two listeners - your listener and the default one that is always created when a trigger is created. Admittedly there should be an event property that contains the listener name - it's impossible to figure out now which listener produced what event... 2 replicas on the same node ... it could be caused by the cluster layout existing just before each operation, and the autoscaling policy? AddReplicaSuggester simulates how adding a replica affects number of cores and disk space on each node, and assigns replicas to nodes based on minimizing the violations. > Search Rate Trigger created more than 3 replicas > ------------------------------------------------ > > Key: SOLR-12470 > URL: https://issues.apache.org/jira/browse/SOLR-12470 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: AutoScaling > Reporter: Varun Thacker > Assignee: Andrzej Bialecki > Priority: Major > Attachments: 4_docs.json, bug_report.txt, graph_view.png, > multiple_replicas.zip, system_docs.json > > > Here's the trigger that I created . At this point the collection was one > shard and one replica ( on node3 ) > {code:java} > curl -X POST -H 'Content-type:application/json' --data-binary '{ > "set-trigger": { > "name" : "search_rate_trigger", > "event" : "searchRate", > "collection" : "test_rate_trigger", > "rate" : 1.0, > "waitFor" : "1m", > "enabled" : true, > "actions" : [ > { > "name" : "compute_plan", > "class": "solr.ComputePlanAction" > }, > { > "name" : "execute_plan", > "class": "solr.ExecutePlanAction" > } > ] > } > }' http://localhost:8983/solr/admin/autoscaling{code} > > I also had a trigger listener setup as I was testing the listener feature > {code:java} > curl -X POST -H 'Content-type:application/json' --data-binary '{ > "set-listener": { > "name": "search_rate_listener", > "trigger": "search_rate_trigger", > "stage": ["STARTED", "ABORTED", "SUCCEEDED", "FAILED"], > "class": "solr.SystemLogListener" > } > }' http://localhost:8983/solr/admin/autoscaling{code} > > I ran a script to fire queries at every 100ms . The index didn't have any > docs so it's a simple match all query > {code:java} > while [ 1 ] > do > curl -s "http://localhost:8984/solr/test_rate_trigger/select/?q=*:*" > > /dev/null > sleep .1 > done{code} > After a few minutes I see 4 replicas being created. > Attaching logs from all 4 nodes. It should be fairly easy to reproduce with > the above mentioned steps > Also attaching all the docs from the .system collection for reference > Here's another interesting this I noticed. I re-created the setup but this > time removed the execute_plan part > Now every 1 min the compute plan action tries to create 3 replicas . Why I > found this interesting is that it was trying to create two replicas on the > same node > Does this look like a separate bug? > {code:java} > INFO - 2018-06-08 03:41:32.586; [ ] org.apache.solr.servlet.HttpSolrCall; > [admin] webapp=null path=/admin/metrics > params={wt=javabin&version=2&key=solr.jvm:os.processCpuLoad&key=solr.node:CONTAINER.fs.coreRoot.usableSpace&key=solr.jvm:os.systemLoadAverage&key=solr.jvm:memory.heap.used} > status=0 QTime=0 > INFO - 2018-06-08 03:41:40.909; [ ] org.apache.solr.servlet.HttpSolrCall; > [admin] webapp=null path=/admin/metrics > params={prefix=CONTAINER.fs.usableSpace,CORE.coreName&wt=javabin&version=2&group=solr.node,solr.core} > status=0 QTime=1 > INFO - 2018-06-08 03:41:40.932; [ ] > org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: > action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8984_solr&type=NRT > INFO - 2018-06-08 03:41:40.933; [ ] > org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: > action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8983_solr&type=NRT > INFO - 2018-06-08 03:41:40.934; [ ] > org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: > action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8983_solr&type=NRT > INFO - 2018-06-08 03:41:40.934; [ ] > org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper; returnSession, > curr-time 9184331 sessionWrapper.createTime 9184324085271, > this.sessionWrapper.createTime 9184324085271 > INFO - 2018-06-08 03:42:32.604; [ ] org.apache.solr.servlet.HttpSolrCall; > [admin] webapp=null path=/admin/metrics > params={wt=javabin&version=2&key=solr.jvm:os.processCpuLoad&key=solr.node:CONTAINER.fs.coreRoot.usableSpace&key=solr.jvm:os.systemLoadAverage&key=solr.jvm:memory.heap.used} > status=0 QTime=0 > INFO - 2018-06-08 03:42:41.525; [ ] org.apache.solr.servlet.HttpSolrCall; > [admin] webapp=null path=/admin/metrics > params={prefix=CONTAINER.fs.usableSpace,CORE.coreName&wt=javabin&version=2&group=solr.node,solr.core} > status=0 QTime=0 > INFO - 2018-06-08 03:42:41.559; [ ] > org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: > action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8984_solr&type=NRT > INFO - 2018-06-08 03:42:41.560; [ ] > org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: > action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8983_solr&type=NRT > INFO - 2018-06-08 03:42:41.560; [ ] > org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: > action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8983_solr&type=NRT > INFO - 2018-06-08 03:42:41.561; [ ] > org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper; returnSession, > curr-time 9244959 sessionWrapper.createTime 9244956725861, > this.sessionWrapper.createTime 9244956725861 > INFO - 2018-06-08 03:43:32.622; [ ] org.apache.solr.servlet.HttpSolrCall; > [admin] webapp=null path=/admin/metrics > params={wt=javabin&version=2&key=solr.jvm:os.processCpuLoad&key=solr.node:CONTAINER.fs.coreRoot.usableSpace&key=solr.jvm:os.systemLoadAverage&key=solr.jvm:memory.heap.used} > status=0 QTime=0 > INFO - 2018-06-08 03:43:42.158; [ ] org.apache.solr.servlet.HttpSolrCall; > [admin] webapp=null path=/admin/metrics > params={prefix=CONTAINER.fs.usableSpace,CORE.coreName&wt=javabin&version=2&group=solr.node,solr.core} > status=0 QTime=1 > INFO - 2018-06-08 03:43:42.178; [ ] > org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: > action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8984_solr&type=NRT > INFO - 2018-06-08 03:43:42.180; [ ] > org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: > action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8983_solr&type=NRT > INFO - 2018-06-08 03:43:42.181; [ ] > org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: > action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8983_solr&type=NRT > INFO - 2018-06-08 03:43:42.181; [ ] > org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper; returnSession, > curr-time 9305581 sessionWrapper.createTime 9305577119413, > this.sessionWrapper.createTime 9305577119413 > INFO - 2018-06-08 03:44:32.642; [ ] org.apache.solr.servlet.HttpSolrCall; > [admin] webapp=null path=/admin/metrics > params={wt=javabin&version=2&key=solr.jvm:os.processCpuLoad&key=solr.node:CONTAINER.fs.coreRoot.usableSpace&key=solr.jvm:os.systemLoadAverage&key=solr.jvm:memory.heap.used} > status=0 QTime=0 > INFO - 2018-06-08 03:44:42.759; [ ] org.apache.solr.servlet.HttpSolrCall; > [admin] webapp=null path=/admin/metrics > params={prefix=CONTAINER.fs.usableSpace,CORE.coreName&wt=javabin&version=2&group=solr.node,solr.core} > status=0 QTime=0 > INFO - 2018-06-08 03:44:42.778; [ ] > org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: > action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8984_solr&type=NRT > INFO - 2018-06-08 03:44:42.779; [ ] > org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: > action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8983_solr&type=NRT > INFO - 2018-06-08 03:44:42.779; [ ] > org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: > action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8983_solr&type=NRT > INFO - 2018-06-08 03:44:42.779; [ ] > org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper; returnSession, > curr-time 9366182 sessionWrapper.createTime 9366178796748, > this.sessionWrapper.createTime 9366178796748{code} > > Thirdly, for this above mentioned test I started observing the .system > collection . Here is a query that I used to get the documents created from > the first time the listener kicked in > {code:java} > http://localhost:8983/solr/.system/select?fq=source_s:SystemLogListener&q=*:*&rows=4&sort=timestamp%20asc{code} > My first expectation was that I'd see 3 docs but I saw 4 docs. Curious why > it's 4 ( the docs are attached as 4_docs.json ) > My intention here is to remove the system log listener with an http listener > here I wanted to understand should I be looking out for 4 events or 3 > The first reaction here is it's a minor bug hence I'm putting it as part of > this jira > Happy to break it up into smaller Jiras once I hear back if these are valid > issues. This test was run against master -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org