GitHub user dstoy53 edited a discussion: disabling security groups cleanly
In 4.20.1 (edit: KVM), is there an established non-disruptive way to disable security groups in an advanced zone for a shared network? If there's one obvious answer, please don't bother with the mess below; I got a little carried away.

A few things don't play nicely with SGs (CAPI, floating IPs), so I'm exploring how to disable the feature in place without rebuilding the network or zone. I can rely on other security boundaries instead.

Here's what I've tried in a lab:

1. Disable the security group provider - no helpful effect, causes failures later
2. Restart the agent while the provider is disabled - no effect
3. Stop and start the VM - fails to start because the security group provider was disabled in step 1
4. Update the network's network offering via the API - fails, changing the offering is only allowed for isolated networks
5. Enable the security group provider, uncheck all SGs on the VM - VM launches with the default SG

I also remember reading about a solution that just empties out security_group.py, which would work until an agent update.

Those are the UX options I could imagine. Here's the blind DB surgery that worked:

1. Change the offering id in the `networks` table - this breaks the network rendering in the UI, but then...
2. Delete the row in `ntwk_service_map` that maps the network <-> service - this fixed the UI rendering
3. Power on the VM - this succeeds, and no rules are applied (for either ebtables or iptables)
4. Disable the SG provider, power off the VM, power on - this succeeds
5. security_group.py continues to fire on power on/off and mostly complains about having nothing to do
6. Power off the VM, uncheck the Default SG that was automatically applied during my earlier efforts (it was probably causing the script to fire) - this hides the Security Group section in the VM UI
7. Power on the VM - no security group is applied, and security_group.py does not fire anymore
8. Management logs show the console proxy fails to deploy because the zone is SG enabled and none of the networks use an offering with SG - "Can not found security enabled network in SG Zone"
9. In the `data_center` table, set is_security_group_enabled to 0 for the zone's row. If I wanted to provide both SG and non-SG networks in the zone, I probably wouldn't need this step.
10. Now the only remaining problem is that my lab scenario is incomplete in an obvious way, so I can ignore the error and call it done. The network gurus refuse to design because this is the second zone, not dedicated, and the Public network gets allocated to `[ ROOT ] system` instead of `System Pool`, which is already handled in Zone 1.

The only reason I kept powering the VM on and off is to simulate a real-world live migration to an empty host, since that's one of the triggers that applies the SG rules.

Unless I completely missed the beaten path, I think the updateNetwork API should be allowed to change the network offering and also update the network service map. Then it would be up to the user to remove the security groups from the VMs. Disabling the SG provider should also clear the SG-enabled flag on the zone so system VMs can get deployed. Of course that's just one service; maybe doing this breaks NetScalers catastrophically for some unexpected reason.

I'm also guessing that the requirement to power off a VM before modifying its security groups exists because that's an easy trigger to fire the script. If so, removing the relevant row in security_group_vm_map and live migrating to another host after disabling everything should probably work correctly and avoid downtime. In short: if I butcher the DB and live migrate a VM to an empty host, everything should work out.

GitHub link: https://github.com/apache/cloudstack/discussions/11864
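
For repeatability, the DB changes in steps 1, 2 and 9 of the surgery list above (plus the untested security_group_vm_map idea) could be written down as parameterized statements. This is only a rough sketch in Python, not a tested procedure: the function name and the example IDs are hypothetical, and the column names (`network_offering_id`, `network_id`, `service`, `instance_id`) and the `SecurityGroup` service value are assumptions that should be checked against the actual schema before use. It only builds `(sql, params)` pairs and never connects to a database; the `%s` placeholder style is also an assumption about the MySQL driver in use.

```python
from typing import List, Optional, Tuple

Statement = Tuple[str, tuple]


def build_sg_removal_statements(
    network_id: int,
    new_offering_id: int,
    zone_id: int,
    vm_id: Optional[int] = None,
) -> List[Statement]:
    """Return (sql, params) pairs mirroring the manual DB steps.

    Table and column names are assumptions; verify them against the
    real schema before running anything.
    """
    statements: List[Statement] = [
        # Step 1: point the network at an offering without the SG service.
        (
            "UPDATE networks SET network_offering_id = %s WHERE id = %s",
            (new_offering_id, network_id),
        ),
        # Step 2: drop the network <-> SecurityGroup service mapping.
        (
            "DELETE FROM ntwk_service_map "
            "WHERE network_id = %s AND service = %s",
            (network_id, "SecurityGroup"),
        ),
        # Step 9: clear the SG flag on the zone so system VMs can deploy.
        (
            "UPDATE data_center SET is_security_group_enabled = 0 "
            "WHERE id = %s",
            (zone_id,),
        ),
    ]
    if vm_id is not None:
        # Untested idea from the post: detach the VM from its SGs in the DB
        # and rely on a live migration to reapply (no) rules.
        statements.append(
            (
                "DELETE FROM security_group_vm_map WHERE instance_id = %s",
                (vm_id,),
            )
        )
    return statements


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    for sql, params in build_sg_removal_statements(204, 17, 2, vm_id=1001):
        print(sql, params)
```

Each pair could then be passed to a DB-API cursor's `execute(sql, params)` inside a single transaction, after taking a backup, so that a failure partway through can be rolled back.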
