weizhouapache commented on issue #8967:
URL: https://github.com/apache/cloudstack/issues/8967#issuecomment-2090350202

   > @DaanHoogland here is how I successfully reproduced the issue on our QA platform:
   > 
   > ```
   > 56 bytes from 1e:00:4d:00:00:c1 (XX.YY.17.30): index=959 time=275.810 usec
   > 56 bytes from 1e:00:6b:00:00:c2 (XX.YY.17.30): index=960 time=337.038 usec
   > ```
   > 
   > You were right, this is a timing issue. Here is the raw cmk code to reproduce:
   > 
   > * Create 2 isolated networks, network-a and network-b (mine are ids c6a4bc87-93cc-499c-8fdc-4e44b4d0fca0 and 7e2aa0f1-c536-4162-be43-aea277ae80aa)
   > * You don't need to add any VMs; playing with the virtual routers alone is enough to cause the issue
   > * Target a specific IP address, in my case XX.YY.17.30 (id=f1988f51-c9e5-46e4-9393-2a873b2610e0)
   > * Create 2 scripts as follows (make sure cmk works properly on your machine, I'm using 6.2.0)
   > * Run both scripts in parallel in a while true loop as follows:
   > 
   > Terminal 1:
   > 
   > ```shell
   > while true; do bash ./test1.sh; done
   > ```
   > 
   > Terminal 2:
   > 
   > ```shell
   > while true; do bash ./test2.sh; done
   > ```
   > 
   > * Test with arping on the IP (arping XX.YY.17.30); replies from two different MAC addresses should appear after around 2-3 minutes on a platform with no load.
   > 
   > ```
   > #test1.sh
   > 
   > cmk associate ipaddress projectid=05019f40-8552-4856-9cf1-02ae1b7e2621 \
   >   zoneid=3adfd657-7e4e-4613-99da-49b3a0ef9db7 ipaddress=XX.YY.17.30 account=qa \
   >   networkid=c6a4bc87-93cc-499c-8fdc-4e44b4d0fca0 # Network A
   > 
   > cmk createLoadBalancerRule algorithm=roundrobin \
   >   publicipid=f1988f51-c9e5-46e4-9393-2a873b2610e0 publicport=22 privateport=22 \
   >   name=vm-a
   > 
   > cmk disassociate ipaddress id=f1988f51-c9e5-46e4-9393-2a873b2610e0
   > 
   > cmk associate ipaddress projectid=05019f40-8552-4856-9cf1-02ae1b7e2621 \
   >   zoneid=3adfd657-7e4e-4613-99da-49b3a0ef9db7 ipaddress=XX.YY.17.30 account=qa \
   >   networkid=7e2aa0f1-c536-4162-be43-aea277ae80aa # Network B
   > 
   > cmk createLoadBalancerRule algorithm=roundrobin \
   >   publicipid=f1988f51-c9e5-46e4-9393-2a873b2610e0 publicport=22 privateport=22 \
   >   name=vm-b
   > 
   > cmk disassociate ipaddress id=f1988f51-c9e5-46e4-9393-2a873b2610e0
   > ```
   > 
   > ```
   > #test2.sh
   > 
   > cmk associate ipaddress projectid=05019f40-8552-4856-9cf1-02ae1b7e2621 \
   >   zoneid=3adfd657-7e4e-4613-99da-49b3a0ef9db7 ipaddress=XX.YY.17.30 account=qa \
   >   networkid=c6a4bc87-93cc-499c-8fdc-4e44b4d0fca0 # Network A
   > 
   > cmk createLoadBalancerRule algorithm=roundrobin \
   >   publicipid=f1988f51-c9e5-46e4-9393-2a873b2610e0 publicport=23 privateport=23 \
   >   name=vm-a
   > 
   > cmk createLoadBalancerRule algorithm=roundrobin \
   >   publicipid=f1988f51-c9e5-46e4-9393-2a873b2610e0 publicport=24 privateport=24 \
   >   name=vm-a
   > 
   > cmk disassociate ipaddress id=f1988f51-c9e5-46e4-9393-2a873b2610e0
   > 
   > cmk associate ipaddress projectid=05019f40-8552-4856-9cf1-02ae1b7e2621 \
   >   zoneid=3adfd657-7e4e-4613-99da-49b3a0ef9db7 ipaddress=XX.YY.17.30 account=qa \
   >   networkid=7e2aa0f1-c536-4162-be43-aea277ae80aa # Network B
   > 
   > cmk createLoadBalancerRule algorithm=roundrobin \
   >   publicipid=f1988f51-c9e5-46e4-9393-2a873b2610e0 publicport=23 privateport=23 \
   >   name=vm-b
   > 
   > cmk createLoadBalancerRule algorithm=roundrobin \
   >   publicipid=f1988f51-c9e5-46e4-9393-2a873b2610e0 publicport=24 privateport=24 \
   >   name=vm-b
   > 
   > cmk disassociate ipaddress id=f1988f51-c9e5-46e4-9393-2a873b2610e0
   > ```
   > 
   > Now I understand that handling concurrency isn't really a priority for this project (see another issue from last year: #7907).
   > 
   > However, with the introduction of more vectors of automation (Terraform, Kubernetes), I believe that these types of issues might become really common. In our case, the IP conflicts have impacted more than 10 IPs in total out of a pool of 400 IPs, which is over 2.5%.
   
   Thanks @vdombrovski.
   I adapted the two scripts you shared and ran them in parallel for around 2 hours with a public IP, two isolated networks, and two VMs.
   I did not face any issue.
   I tested with the latest main branch (4.20-SNAPSHOT), not 4.17.2.
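
For a multi-hour run like this, the arping check can be automated rather than watched by eye. The following is only a minimal sketch and not something used in the test above: `distinct_macs` and `has_conflict` are hypothetical helper names, and the parsing assumes that each reply line contains the responding MAC in colon-separated hex form, as in the quoted arping output.

```shell
# distinct_macs: read arping output on stdin and print each distinct
# MAC address once, lowercased and sorted.
# Assumption: reply lines contain the MAC as six colon-separated hex bytes.
distinct_macs() {
  grep -oiE '([0-9a-f]{2}:){5}[0-9a-f]{2}' | tr 'A-F' 'a-f' | sort -u
}

# has_conflict: read arping output on stdin and succeed (exit 0) only if
# more than one distinct MAC answered, i.e. the IP appears to be claimed twice.
has_conflict() {
  [ "$(distinct_macs | grep -c .)" -gt 1 ]
}
```

For example, `arping -c 60 XX.YY.17.30 | has_conflict && echo "conflict"` would report the IP as soon as a batch of probes is answered by two different MACs; `-c` limits the number of probes sent.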
   
   Can you share the full management-server.log for further investigation?
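
Alongside the full log, it may help to point at the entries that concern the affected IP. This is a hedged sketch only: `ip_events` is a hypothetical helper, and the path in the usage note assumes the default package location of the management server log, which may differ on your installation.

```shell
# ip_events LOG_FILE IP IP_ID: print the lines of LOG_FILE that mention
# either the public IP address or its UUID, matched as literal strings.
ip_events() {
  log_file=$1
  ip=$2
  ip_id=$3
  grep -F -e "$ip" -e "$ip_id" "$log_file"
}
```

A possible invocation would be `ip_events /var/log/cloudstack/management/management-server.log XX.YY.17.30 f1988f51-c9e5-46e4-9393-2a873b2610e0`. The filtered output is a navigation aid and does not replace the full log.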


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
