Public bug reported: neutron-fwaas replaced conntrack-tools by netlink + pyroute2[1] in order to delete faster conntrack entries when the rules associated to a firewall (through a policy) are updated.
This change highlights many warnings: 1) the code readability should be improved: ** it embeds dead code/unused constants[2] ** it uses non-meaningful variable names[3] ("h", "ct") ** it uses hardcoded values instead of existing constants[3] ** it miss docstrings/comments for tricky parts[2][3] 2) the change tricky part[3] is untested 3) the code seems unhealthy, ** nothing ensures that file descriptors/objects are correctly closed/destroyed if an exception is raised[3] ** pyroute2.netns.setns use[3] seems not (green)thread-safe. It seems a (green)thread can move the process in $netns2 when a concurrent greenthread is "working" in $netns1. It can be solved by a lock. 4) the change should highlight if and why nfct and libc C libraries used in the change are eventlet-friendly. 5) the change uses pyroute2.netns.setns in [3] which moves the current process to a specific netns ** it seems that when we enter the netns, the process seems to never go back in the root netns[9]. Is it an acceptable limitation? ** When the process is in a netns, every network action is done in the netns including rpc calls, remote sylog which won't work because they should be done in the root namespace. 1) => we can live on the short term with it (easy to address) 2) => we can address it (costly) 3) => we can address it (easy to solve) 4) => i don't know how to get an answer to this question 5) => it seems to imply that we have to dedicate specific(s) process(es) moving between netns to list/kill/flush_entries like as it's done in oslo.rootwrap daemon mode (very costly by required) These warnings are based on my understanding of the netlink feature, perhaps i miss something. [1] https://review.openstack.org/389654 [2] https://github.com/openstack/neutron-fwaas/blob/master/neutron_fwaas/common/netlink_constants.py [3] https://github.com/openstack/neutron-fwaas/blob/master/neutron_fwaas/privileged/netlink_lib.py [9] Code used to test pyroute2: import os, pyroute2, pyroute2.netns root_ifs = {x.get_attr('IFLA_IFNAME') for x in pyroute2.IPRoute().get_links()} fd = pyroute2.netns.setns('taz') ns_ifs = {x.get_attr('IFLA_IFNAME') for x in pyroute2.IPRoute().get_links()} os.close(fd) last_ifs = {x.get_attr('IFLA_IFNAME') for x in pyroute2.IPRoute().get_links()} if root_ifs == ns_ifs: print 'Add a new interface in root netns with ip tuntap a mode tap' elif last_ifs == root_ifs: print 'OK: we ended in root netns' elif root_ifs != last_ifs == ns_ifs: print 'KO: we ended in taz netns' else: print 'Something is wrong' ** Affects: neutron Importance: Undecided Assignee: Cedric Brandily (cbrandily) Status: New ** Tags: fwaas ocata-rc-potential -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1664294 Title: Netlink solution not enough mature for Ocata Status in neutron: New Bug description: neutron-fwaas replaced conntrack-tools by netlink + pyroute2[1] in order to delete faster conntrack entries when the rules associated to a firewall (through a policy) are updated. This change highlights many warnings: 1) the code readability should be improved: ** it embeds dead code/unused constants[2] ** it uses non-meaningful variable names[3] ("h", "ct") ** it uses hardcoded values instead of existing constants[3] ** it miss docstrings/comments for tricky parts[2][3] 2) the change tricky part[3] is untested 3) the code seems unhealthy, ** nothing ensures that file descriptors/objects are correctly closed/destroyed if an exception is raised[3] ** pyroute2.netns.setns use[3] seems not (green)thread-safe. It seems a (green)thread can move the process in $netns2 when a concurrent greenthread is "working" in $netns1. It can be solved by a lock. 4) the change should highlight if and why nfct and libc C libraries used in the change are eventlet-friendly. 5) the change uses pyroute2.netns.setns in [3] which moves the current process to a specific netns ** it seems that when we enter the netns, the process seems to never go back in the root netns[9]. Is it an acceptable limitation? ** When the process is in a netns, every network action is done in the netns including rpc calls, remote sylog which won't work because they should be done in the root namespace. 1) => we can live on the short term with it (easy to address) 2) => we can address it (costly) 3) => we can address it (easy to solve) 4) => i don't know how to get an answer to this question 5) => it seems to imply that we have to dedicate specific(s) process(es) moving between netns to list/kill/flush_entries like as it's done in oslo.rootwrap daemon mode (very costly by required) These warnings are based on my understanding of the netlink feature, perhaps i miss something. [1] https://review.openstack.org/389654 [2] https://github.com/openstack/neutron-fwaas/blob/master/neutron_fwaas/common/netlink_constants.py [3] https://github.com/openstack/neutron-fwaas/blob/master/neutron_fwaas/privileged/netlink_lib.py [9] Code used to test pyroute2: import os, pyroute2, pyroute2.netns root_ifs = {x.get_attr('IFLA_IFNAME') for x in pyroute2.IPRoute().get_links()} fd = pyroute2.netns.setns('taz') ns_ifs = {x.get_attr('IFLA_IFNAME') for x in pyroute2.IPRoute().get_links()} os.close(fd) last_ifs = {x.get_attr('IFLA_IFNAME') for x in pyroute2.IPRoute().get_links()} if root_ifs == ns_ifs: print 'Add a new interface in root netns with ip tuntap a mode tap' elif last_ifs == root_ifs: print 'OK: we ended in root netns' elif root_ifs != last_ifs == ns_ifs: print 'KO: we ended in taz netns' else: print 'Something is wrong' To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1664294/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp