[ 
https://issues.apache.org/jira/browse/SOLR-12470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Thacker updated SOLR-12470:
---------------------------------
    Attachment: system_docs.json
                multiple_replicas.zip
                graph_view.png
                bug_report.txt
                4_docs.json

> Search Rate Trigger created more than 3 replicas
> ------------------------------------------------
>
>                 Key: SOLR-12470
>                 URL: https://issues.apache.org/jira/browse/SOLR-12470
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: AutoScaling
>            Reporter: Varun Thacker
>            Priority: Major
>         Attachments: 4_docs.json, bug_report.txt, graph_view.png, 
> multiple_replicas.zip, system_docs.json
>
>
> Here's the trigger that I created. At this point the collection had one 
> shard and one replica (on node3).
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '{
> "set-trigger": {
> "name" : "search_rate_trigger",
> "event" : "searchRate",
> "collection" : "test_rate_trigger",
> "rate" : 1.0,
> "waitFor" : "1m",
> "enabled" : true,
> "actions" : [
> {
> "name" : "compute_plan",
> "class": "solr.ComputePlanAction"
> },
> {
> "name" : "execute_plan",
> "class": "solr.ExecutePlanAction"
> }
> ]
> }
> }' http://localhost:8983/solr/admin/autoscaling{code}
>  
> I also had a trigger listener set up, as I was testing the listener feature:
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '{
> "set-listener": {
> "name": "search_rate_listener",
> "trigger": "search_rate_trigger",
> "stage": ["STARTED", "ABORTED", "SUCCEEDED", "FAILED"],
> "class": "solr.SystemLogListener"
> }
> }' http://localhost:8983/solr/admin/autoscaling{code}
>  
> I ran a script to fire a query every 100ms. The index didn't have any 
> docs, so it's a simple match-all query:
> {code:java}
> while [ 1 ]
> do
> curl -s "http://localhost:8984/solr/test_rate_trigger/select/?q=*:*" > /dev/null
> sleep .1
> done{code}
> After a few minutes I saw 4 replicas being created.
> Attaching logs from all 4 nodes. It should be fairly easy to reproduce with 
> the steps above.
> Also attaching all the docs from the .system collection for reference.
> Here's another interesting thing I noticed. I re-created the setup but this 
> time removed the execute_plan action.
> Now every minute the compute plan action tries to create 3 replicas. What I 
> found interesting is that it was trying to create two replicas on the 
> same node.
> Does this look like a separate bug?
> {code:java}
> INFO - 2018-06-08 03:41:32.586; [ ] org.apache.solr.servlet.HttpSolrCall; 
> [admin] webapp=null path=/admin/metrics 
> params={wt=javabin&version=2&key=solr.jvm:os.processCpuLoad&key=solr.node:CONTAINER.fs.coreRoot.usableSpace&key=solr.jvm:os.systemLoadAverage&key=solr.jvm:memory.heap.used}
>  status=0 QTime=0
> INFO - 2018-06-08 03:41:40.909; [ ] org.apache.solr.servlet.HttpSolrCall; 
> [admin] webapp=null path=/admin/metrics 
> params={prefix=CONTAINER.fs.usableSpace,CORE.coreName&wt=javabin&version=2&group=solr.node,solr.core}
>  status=0 QTime=1
> INFO - 2018-06-08 03:41:40.932; [ ] 
> org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: 
> action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8984_solr&type=NRT
> INFO - 2018-06-08 03:41:40.933; [ ] 
> org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: 
> action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8983_solr&type=NRT
> INFO - 2018-06-08 03:41:40.934; [ ] 
> org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: 
> action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8983_solr&type=NRT
> INFO - 2018-06-08 03:41:40.934; [ ] 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper; returnSession, 
> curr-time 9184331 sessionWrapper.createTime 9184324085271, 
> this.sessionWrapper.createTime 9184324085271
> INFO - 2018-06-08 03:42:32.604; [ ] org.apache.solr.servlet.HttpSolrCall; 
> [admin] webapp=null path=/admin/metrics 
> params={wt=javabin&version=2&key=solr.jvm:os.processCpuLoad&key=solr.node:CONTAINER.fs.coreRoot.usableSpace&key=solr.jvm:os.systemLoadAverage&key=solr.jvm:memory.heap.used}
>  status=0 QTime=0
> INFO - 2018-06-08 03:42:41.525; [ ] org.apache.solr.servlet.HttpSolrCall; 
> [admin] webapp=null path=/admin/metrics 
> params={prefix=CONTAINER.fs.usableSpace,CORE.coreName&wt=javabin&version=2&group=solr.node,solr.core}
>  status=0 QTime=0
> INFO - 2018-06-08 03:42:41.559; [ ] 
> org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: 
> action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8984_solr&type=NRT
> INFO - 2018-06-08 03:42:41.560; [ ] 
> org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: 
> action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8983_solr&type=NRT
> INFO - 2018-06-08 03:42:41.560; [ ] 
> org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: 
> action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8983_solr&type=NRT
> INFO - 2018-06-08 03:42:41.561; [ ] 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper; returnSession, 
> curr-time 9244959 sessionWrapper.createTime 9244956725861, 
> this.sessionWrapper.createTime 9244956725861
> INFO - 2018-06-08 03:43:32.622; [ ] org.apache.solr.servlet.HttpSolrCall; 
> [admin] webapp=null path=/admin/metrics 
> params={wt=javabin&version=2&key=solr.jvm:os.processCpuLoad&key=solr.node:CONTAINER.fs.coreRoot.usableSpace&key=solr.jvm:os.systemLoadAverage&key=solr.jvm:memory.heap.used}
>  status=0 QTime=0
> INFO - 2018-06-08 03:43:42.158; [ ] org.apache.solr.servlet.HttpSolrCall; 
> [admin] webapp=null path=/admin/metrics 
> params={prefix=CONTAINER.fs.usableSpace,CORE.coreName&wt=javabin&version=2&group=solr.node,solr.core}
>  status=0 QTime=1
> INFO - 2018-06-08 03:43:42.178; [ ] 
> org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: 
> action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8984_solr&type=NRT
> INFO - 2018-06-08 03:43:42.180; [ ] 
> org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: 
> action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8983_solr&type=NRT
> INFO - 2018-06-08 03:43:42.181; [ ] 
> org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: 
> action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8983_solr&type=NRT
> INFO - 2018-06-08 03:43:42.181; [ ] 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper; returnSession, 
> curr-time 9305581 sessionWrapper.createTime 9305577119413, 
> this.sessionWrapper.createTime 9305577119413
> INFO - 2018-06-08 03:44:32.642; [ ] org.apache.solr.servlet.HttpSolrCall; 
> [admin] webapp=null path=/admin/metrics 
> params={wt=javabin&version=2&key=solr.jvm:os.processCpuLoad&key=solr.node:CONTAINER.fs.coreRoot.usableSpace&key=solr.jvm:os.systemLoadAverage&key=solr.jvm:memory.heap.used}
>  status=0 QTime=0
> INFO - 2018-06-08 03:44:42.759; [ ] org.apache.solr.servlet.HttpSolrCall; 
> [admin] webapp=null path=/admin/metrics 
> params={prefix=CONTAINER.fs.usableSpace,CORE.coreName&wt=javabin&version=2&group=solr.node,solr.core}
>  status=0 QTime=0
> INFO - 2018-06-08 03:44:42.778; [ ] 
> org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: 
> action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8984_solr&type=NRT
> INFO - 2018-06-08 03:44:42.779; [ ] 
> org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: 
> action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8983_solr&type=NRT
> INFO - 2018-06-08 03:44:42.779; [ ] 
> org.apache.solr.cloud.autoscaling.ComputePlanAction; Computed Plan: 
> action=ADDREPLICA&collection=test_rate_trigger&shard=shard1&node=127.94.0.1:8983_solr&type=NRT
> INFO - 2018-06-08 03:44:42.779; [ ] 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper; returnSession, 
> curr-time 9366182 sessionWrapper.createTime 9366178796748, 
> this.sessionWrapper.createTime 9366178796748{code}
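A quick way to confirm the "two replicas on the same node" observation is to count the target nodes per planning cycle. The following is only a minimal sketch, assuming the log is pasted as it appears above (quote prefix and wrapped lines included) and that all plans of one cycle fall within the same clock minute, as they do in this excerpt. The function name `duplicate_targets` is made up for illustration and is not part of Solr.

```python
import re
from collections import Counter, defaultdict
from urllib.parse import parse_qs

# One log entry, after wrapped lines have been rejoined into a single string.
ENTRY_RE = re.compile(
    r"INFO - (\d{4}-\d{2}-\d{2} \d{2}:\d{2}):[\d.]+;.*?Computed Plan:\s*(\S+)"
)


def duplicate_targets(log_text):
    """Map each minute to the nodes that got more than one ADDREPLICA plan."""
    # Drop the mail quote prefix ("> ") and undo the line wrapping.
    joined = " ".join(line.lstrip("> ").strip() for line in log_text.splitlines())
    per_minute = defaultdict(Counter)
    # Split in front of every "INFO - <year>" so each chunk is one log entry.
    for entry in re.split(r"(?=INFO - \d{4})", joined):
        match = ENTRY_RE.search(entry)
        if not match:
            continue  # not a "Computed Plan" entry
        minute, params = match.groups()
        fields = parse_qs(params)
        if fields.get("action") == ["ADDREPLICA"] and "node" in fields:
            per_minute[minute][fields["node"][0]] += 1
    # Keep only the minutes in which some node was targeted more than once.
    return {
        minute: {node: n for node, n in counts.items() if n > 1}
        for minute, counts in per_minute.items()
        if any(n > 1 for n in counts.values())
    }
```

Run against the excerpt above, this should flag 127.94.0.1:8983_solr as targeted twice in every cycle. Grouping by clock minute is a simplification; a cycle that straddles a minute boundary would need grouping by a time window instead.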
>  
> Thirdly, for the test above I started observing the .system 
> collection. Here is a query that I used to get the documents created 
> the first time the listener kicked in:
> {code:java}
> http://localhost:8983/solr/.system/select?fq=source_s:SystemLogListener&q=*:*&rows=4&sort=timestamp%20asc{code}
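For anyone repeating this with a different row count or host, the same request can be assembled programmatically. This is only a sketch: the helper name `system_log_query` and its parameters are made up for illustration, while the field names (`source_s`, `timestamp`) and the default host are taken from the URL above.

```python
from urllib.parse import quote, urlencode


def system_log_query(base="http://localhost:8983/solr", rows=4):
    """Build the .system select URL for SystemLogListener docs, oldest first."""
    params = [
        ("fq", "source_s:SystemLogListener"),
        ("q", "*:*"),
        ("rows", rows),
        ("sort", "timestamp asc"),
    ]
    # quote (not quote_plus) encodes the space as %20; ':' and '*' stay literal.
    return base + "/.system/select?" + urlencode(params, safe=":*", quote_via=quote)
```

With the defaults it reproduces the URL above; raising `rows` makes it easier to see whether later trigger firings also produce 4 docs each.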
> My expectation was that I'd see 3 docs, but I saw 4. Curious why 
> it's 4 (the docs are attached as 4_docs.json).
> My intention is to replace the system log listener with an HTTP listener, 
> so I wanted to understand whether I should be looking out for 4 events or 3.
> My first reaction is that this is a minor bug, hence I'm including it in 
> this Jira.
> Happy to break this up into smaller Jiras once I hear back on whether these 
> are valid issues. This test was run against master.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)