vdombrovski commented on issue #8967:
URL: https://github.com/apache/cloudstack/issues/8967#issuecomment-2107149571

   Hello @weizhouapache, sorry for the late reply.
   
   I ran the script again to look for the logs you mentioned, and I can confirm that:
   
   1. These logs do sometimes appear when running the script (see below)
   2. **These logs are absolutely not related to the original issue**
   
   For example, I'm getting these logs:
   
   ```
   tail -f /var/log/cloudstack/management/management-server.log | grep -E "Failed to release|Unable to revoke"
   2024-05-13 11:35:04,879 WARN  [c.c.n.IpAddressManagerImpl] (API-Job-Executor-87:ctx-e73f179d job-2834 ctx-d655a156) (logid:f8ef96fd) Unable to revoke all the firewall rules for ip id=107 as a part of ip release
   2024-05-13 11:35:22,448 WARN  [c.c.n.IpAddressManagerImpl] (API-Job-Executor-87:ctx-e73f179d job-2834 ctx-d655a156) (logid:f8ef96fd) Failed to release resources for ip address id=107
   2024-05-13 11:35:22,483 WARN  [c.c.n.NetworkServiceImpl] (API-Job-Executor-87:ctx-e73f179d job-2834 ctx-d655a156) (logid:f8ef96fd) Failed to release public ip address id=107
   2024-05-13 11:37:15,709 WARN  [c.c.n.IpAddressManagerImpl] (API-Job-Executor-112:ctx-95282fbe job-2859 ctx-4f1f38f7) (logid:ee6bbf3d) Unable to revoke all the firewall rules for ip id=107 as a part of ip release
   2024-05-13 11:37:33,802 WARN  [c.c.n.IpAddressManagerImpl] (API-Job-Executor-112:ctx-95282fbe job-2859 ctx-4f1f38f7) (logid:ee6bbf3d) Failed to release resources for ip address id=107
   2024-05-13 11:37:33,841 WARN  [c.c.n.NetworkServiceImpl] (API-Job-Executor-112:ctx-95282fbe job-2859 ctx-4f1f38f7) (logid:ee6bbf3d) Failed to release public ip address id=107
   2024-05-13 11:38:43,386 WARN  [c.c.n.IpAddressManagerImpl] (API-Job-Executor-4:ctx-a99870ee job-2876 ctx-15237616) (logid:5632f157) Unable to revoke all the firewall rules for ip id=107 as a part of ip release
   2024-05-13 11:38:53,873 WARN  [c.c.n.IpAddressManagerImpl] (API-Job-Executor-4:ctx-a99870ee job-2876 ctx-15237616) (logid:5632f157) Failed to release resources for ip address id=107
   2024-05-13 11:39:00,601 WARN  [c.c.n.NetworkServiceImpl] (API-Job-Executor-4:ctx-a99870ee job-2876 ctx-15237616) (logid:5632f157) Failed to release public ip address id=107
   2024-05-13 11:41:42,867 WARN  [c.c.n.IpAddressManagerImpl] (API-Job-Executor-41:ctx-06acb0e1 job-2913 ctx-63672a38) (logid:477fad2b) Unable to revoke all the firewall rules for ip id=107 as a part of ip release
   2024-05-13 11:42:00,821 WARN  [c.c.n.IpAddressManagerImpl] (API-Job-Executor-41:ctx-06acb0e1 job-2913 ctx-63672a38) (logid:477fad2b) Failed to release resources for ip address id=107
   2024-05-13 11:42:00,848 WARN  [c.c.n.NetworkServiceImpl] (API-Job-Executor-41:ctx-06acb0e1 job-2913 ctx-63672a38) (logid:477fad2b) Failed to release public ip address id=107
   ```
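
To make the pattern in these warnings easier to see, here is a minimal Python sketch that groups the entries by job id and measures how long each failed release attempt took. It assumes each log entry sits on a single line in the layout shown above; the names `LOG_LINE`, `group_by_job` and `job_duration_seconds` are illustrative and not part of CloudStack.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed layout, inferred from the entries quoted above:
# <timestamp> <LEVEL>  [<logger>] (<thread>:<ctx> <job-id> <ctx>) (logid:<id>) <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>\w+)\s+\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^:]+):\S+ (?P<job>job-\d+) [^)]*\)\s+"
    r"\(logid:(?P<logid>\w+)\)\s+(?P<msg>.*)$"
)

PREFIX_87 = ("WARN  [c.c.n.{cls}] (API-Job-Executor-87:ctx-e73f179d "
             "job-2834 ctx-d655a156) (logid:f8ef96fd) ")
PREFIX_112 = ("WARN  [c.c.n.{cls}] (API-Job-Executor-112:ctx-95282fbe "
              "job-2859 ctx-4f1f38f7) (logid:ee6bbf3d) ")

# A few entries copied from the log excerpt above.
SAMPLE = [
    "2024-05-13 11:35:04,879 " + PREFIX_87.format(cls="IpAddressManagerImpl")
    + "Unable to revoke all the firewall rules for ip id=107 as a part of ip release",
    "2024-05-13 11:35:22,448 " + PREFIX_87.format(cls="IpAddressManagerImpl")
    + "Failed to release resources for ip address id=107",
    "2024-05-13 11:35:22,483 " + PREFIX_87.format(cls="NetworkServiceImpl")
    + "Failed to release public ip address id=107",
    "2024-05-13 11:37:15,709 " + PREFIX_112.format(cls="IpAddressManagerImpl")
    + "Unable to revoke all the firewall rules for ip id=107 as a part of ip release",
]


def group_by_job(lines):
    """Return {job id: [(timestamp, message), ...]} for lines that match."""
    groups = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        when = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S,%f")
        groups[match["job"]].append((when, match["msg"]))
    return dict(groups)


def job_duration_seconds(entries):
    """Seconds between the first and last entry of one job."""
    times = [when for when, _ in entries]
    return (max(times) - min(times)).total_seconds()


if __name__ == "__main__":
    for job, entries in group_by_job(SAMPLE).items():
        print(job, len(entries), "entries,", job_duration_seconds(entries), "s")
```

Grouping by job id rather than by IP id keeps each release attempt separate, since every attempt quoted above targets the same `id=107`.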
   However, no IP conflict is introduced when these logs appear (arping returns a single MAC address).
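
The arping check can be automated along these lines. This is only a sketch: it pulls every MAC-shaped token out of captured arping output and reports a conflict when more than one distinct address replies. The function names are hypothetical, and the code does not depend on a specific arping output format beyond MAC addresses being printed in colon-separated hex.

```python
import re

# Matches colon-separated MAC addresses such as 02:00:5e:10:00:01.
MAC_RE = re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b")


def distinct_macs(arping_output):
    """Return the set of distinct MAC addresses (lowercased) in the output."""
    return {mac.lower() for mac in MAC_RE.findall(arping_output)}


def has_ip_conflict(arping_output):
    """True when more than one MAC address answered for the probed IP."""
    return len(distinct_macs(arping_output)) > 1
```

Feeding it the captured text of an arping run after each script iteration would flag the first iteration in which two VRs answer for the same public IP.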
   
   Similarly, when the IP conflict does appear, these logs are not present at all (at least in the last 3-4 iterations that led to the conflict in my tests).
   
   I think there are 2 separate issues here:
   
   1. The manager itself is vulnerable to race conditions, which leads to the log entries you observed and to the fact that an IP can become deadlocked in the database.
   2. The provisioning logic that governs the assignment of public IP addresses, firewall rules, etc. is itself flawed: it allows 2 VRs to deploy a stale version of the configuration, which causes the IP conflict even from a clean database state.
   
   Hopefully these details help this investigation.
   

