[Yahoo-eng-team] [Bug 1905552] [NEW] neutron-fwaas netlink conntrack driver would catch error while conntrack rules protocol is 'unknown'

2020-11-25 Thread Zhang Jian
Public bug reported:

2020-11-25 11:07:32.606 127 DEBUG oslo_concurrency.lockutils [req-ab14782d-80b1-43f6-8d1b-2874531aca5e - 9d40b483f885496896d81c487f420438 - - -] Releasing semaphore "iptables-qrouter-9e18395d-961d-46b3-a0e9-4c6a94c32baf" lock /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:228
2020-11-25 11:07:32.609 127 ERROR neutron_fwaas.services.firewall.service_drivers.agents.drivers.linux.iptables_fwaas_v2 [req-ab14782d-80b1-43f6-8d1b-2874531aca5e - 9d40b483f885496896d81c487f420438 - - -] Failed to update firewall: daedc38a-04ee-4818-b7a6-3d8311d7fc30: KeyError: 'unknown'
2020-11-25 11:07:32.609 127 ERROR neutron_fwaas.services.firewall.service_drivers.agents.drivers.linux.iptables_fwaas_v2 Traceback (most recent call last):
2020-11-25 11:07:32.609 127 ERROR neutron_fwaas.services.firewall.service_drivers.agents.drivers.linux.iptables_fwaas_v2   File "/var/lib/kolla/venv/lib/python2.7/site-packages/neutron_fwaas/services/firewall/service_drivers/agents/drivers/linux/iptables_fwaas_v2.py", line 144, in update_firewall_group
2020-11-25 11:07:32.609 127 ERROR neutron_fwaas.services.firewall.service_drivers.agents.drivers.linux.iptables_fwaas_v2     apply_list, self.pre_firewall, firewall)
2020-11-25 11:07:32.609 127 ERROR neutron_fwaas.services.firewall.service_drivers.agents.drivers.linux.iptables_fwaas_v2   File "/var/lib/kolla/venv/lib/python2.7/site-packages/neutron_fwaas/services/firewall/service_drivers/agents/drivers/linux/iptables_fwaas_v2.py", line 327, in _remove_conntrack_updated_firewall
2020-11-25 11:07:32.609 127 ERROR neutron_fwaas.services.firewall.service_drivers.agents.drivers.linux.iptables_fwaas_v2     ipt_mgr.namespace)
2020-11-25 11:07:32.609 127 ERROR neutron_fwaas.services.firewall.service_drivers.agents.drivers.linux.iptables_fwaas_v2   File "/var/lib/kolla/venv/lib/python2.7/site-packages/neutron_fwaas/services/firewall/service_drivers/agents/drivers/linux/netlink_conntrack.py", line 41, in delete_entries
2020-11-25 11:07:32.609 127 ERROR neutron_fwaas.services.firewall.service_drivers.agents.drivers.linux.iptables_fwaas_v2     entries = nl_lib.list_entries(namespace)
2020-11-25 11:07:32.609 127 ERROR neutron_fwaas.services.firewall.service_drivers.agents.drivers.linux.iptables_fwaas_v2   File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 207, in _wrap
2020-11-25 11:07:32.609 127 ERROR neutron_fwaas.services.firewall.service_drivers.agents.drivers.linux.iptables_fwaas_v2     return self.channel.remote_call(name, args, kwargs)
2020-11-25 11:07:32.609 127 ERROR neutron_fwaas.services.firewall.service_drivers.agents.drivers.linux.iptables_fwaas_v2   File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_privsep/daemon.py", line 202, in remote_call
2020-11-25 11:07:32.609 127 ERROR neutron_fwaas.services.firewall.service_drivers.agents.drivers.linux.iptables_fwaas_v2     raise exc_type(*result[2])
2020-11-25 11:07:32.609 127 ERROR neutron_fwaas.services.firewall.service_drivers.agents.drivers.linux.iptables_fwaas_v2 KeyError: 'unknown'
2020-11-25 11:07:32.609 127 ERROR neutron_fwaas.services.firewall.service_drivers.agents.drivers.linux.iptables_fwaas_v2

This error appears when neutron-fwaas v2 is configured with the
netlink_conntrack driver in the agent configuration file:
vim /etc/kolla/neutron-l3-agent/fwaas_driver.ini 
   [fwaas]
   enabled = True
   agent_version = v2
   driver = iptables_v2
   conntrack_driver = netlink_conntrack

The error is triggered when the conntrack table contains entries whose protocol is reported as 'unknown', for example:
unknown  2 597 src=169.254.192.2 dst=224.0.0.22 [UNREPLIED] src=224.0.0.22 dst=169.254.192.2 mark=0 use=1
unknown  112 598 src=169.254.192.2 dst=224.0.0.18 [UNREPLIED] src=224.0.0.18 dst=169.254.192.2 mark=0 use=1

As a result, the conntrack refresh can be interrupted whenever firewall rules are updated.
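
The failure mode can be illustrated with a minimal sketch. It is not the neutron-fwaas code: PARSERS, parse_strict and parse_tolerant are hypothetical names, and the text format parsed here only mimics the conntrack listing quoted in this report. It assumes that entries are dispatched through a dict keyed by protocol name, which would explain a KeyError: 'unknown' for protocol numbers such as 2 (IGMP) and 112 (VRRP). The TCP line in SAMPLE is made up for the example.

```python
# Hypothetical sketch; none of these names come from neutron-fwaas.


def _parse_attrs(fields):
    """Collect key=value tokens; the first occurrence is the original direction."""
    attrs = {}
    for token in fields:
        if "=" in token:
            key, value = token.split("=", 1)
            attrs.setdefault(key, value)
    return attrs


# Assumed set of protocols the driver knows how to handle.
PARSERS = {"tcp": _parse_attrs, "udp": _parse_attrs, "icmp": _parse_attrs}


def parse_strict(line):
    """Dispatch on the protocol name; raises KeyError for unsupported ones."""
    fields = line.split()
    parser = PARSERS[fields[0]]  # KeyError: 'unknown' happens here
    return fields[0], parser(fields[1:])


def parse_tolerant(lines):
    """Same dispatch, but skip entries whose protocol has no parser."""
    entries = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        parser = PARSERS.get(fields[0])
        if parser is None:
            continue  # e.g. 'unknown' entries for IGMP or VRRP traffic
        entries.append((fields[0], parser(fields[1:])))
    return entries


SAMPLE = [
    # Taken from the conntrack listing in this report.
    "unknown  2 597 src=169.254.192.2 dst=224.0.0.22 [UNREPLIED] "
    "src=224.0.0.22 dst=169.254.192.2 mark=0 use=1",
    # Made-up TCP entry, for illustration only.
    "tcp      6 431999 ESTABLISHED src=192.168.1.8 dst=192.168.1.254 "
    "sport=40000 dport=22 src=192.168.1.254 dst=192.168.1.8 "
    "sport=22 dport=40000 [ASSURED] mark=0 use=1",
]

if __name__ == "__main__":
    for protocol, attrs in parse_tolerant(SAMPLE):
        print(protocol, attrs["src"], attrs["dst"])
```

Skipping unsupported entries is only one possible behaviour; whether such entries should be ignored or handled differently is a decision for the driver.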

** Affects: neutron
 Importance: Undecided
 Assignee: Zhang Jian (jasonzhangj)
 Status: New

** Changed in: neutron
 Assignee: (unassigned) => Zhang Jian (q5536487)

** Changed in: neutron
 Assignee: Zhang Jian (jasonzhangj) => (unassigned)

** Changed in: neutron
 Assignee: (unassigned) => Zhang Jian (jasonzhangj)

https://bugs.launchpad.net/bugs/1905552


[Yahoo-eng-team] [Bug 1852680] [NEW] floatingip can not access after associate to instance

2019-11-14 Thread Zhang Jian
Public bug reported:

I have successfully deployed the OpenStack Neutron component using Kolla
Ansible with the Rocky release, and I enabled the SDN ML2 plugin in Neutron's
ml2_conf.ini.
When I create a baremetal port on a VLAN internal network, the SDN controller
configures the VLAN automatically.
The network works normally, as shown below:
root@ubuntu:~# ip netns exec qrouter-50c1c5ac-1676-4a9d-ab04-a181a700  ip a
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
53: qr-66ff06af-8a:  mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:da:e3:3c brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.254/24 brd 192.168.1.255 scope global qr-66ff06af-8a
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:feda:e33c/64 scope link
       valid_lft forever preferred_lft forever
54: qg-091949c0-13:  mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:7c:5d:3f brd ff:ff:ff:ff:ff:ff
    inet 36.250.72.178/24 brd 36.250.72.255 scope global qg-091949c0-13
       valid_lft forever preferred_lft forever
    inet 36.250.72.179/32 brd 36.250.72.179 scope global qg-091949c0-13
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe7c:5d3f/64 scope link
       valid_lft forever preferred_lft forever


I can ping both the internal port and the public gateway from the router namespace:
root@ubuntu:~# ip netns exec qrouter-50c1c5ac-1676-4a9d-ab04-a181a700  ping 192.168.1.2
PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
64 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=0.594 ms
64 bytes from 192.168.1.2: icmp_seq=2 ttl=64 time=0.178 ms
^C
--- 192.168.1.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1017ms
rtt min/avg/max/mdev = 0.178/0.386/0.594/0.208 ms
root@ubuntu:~# ip netns exec qrouter-50c1c5ac-1676-4a9d-ab04-a181a700  ping 36.250.72.177
PING 36.250.72.177 (36.250.72.177) 56(84) bytes of data.
64 bytes from 36.250.72.177: icmp_seq=1 ttl=255 time=0.277 ms
64 bytes from 36.250.72.177: icmp_seq=2 ttl=255 time=0.275 ms
64 bytes from 36.250.72.177: icmp_seq=3 ttl=255 time=0.309 ms
^C
--- 36.250.72.177 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2042ms
rtt min/avg/max/mdev = 0.275/0.287/0.309/0.015 ms

The instance can also access the external network normally:
root@instance:~# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=255 time=0.277 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=255 time=0.275 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=255 time=0.309 ms
^C
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2042ms
rtt min/avg/max/mdev = 0.275/0.287/0.309/0.015 ms

But after I associate a floating IP with this port (the floating IP is 36.250.72.180):
neutron floatingip-associate   f10a8e0a-3e86-407e-a654-7187ebc16e72  386dc61a-c01c-46ff-b001-eb799b3b6042

I cannot reach the instance through 36.250.72.180, and the instance can no
longer reach the external network.
From the router namespace, however, the network still appears to work normally.

I suspect the error is caused by a mistake in my Neutron configuration, but
in some cases the floating IP is reachable and the problem does not appear.

I can only reproduce it when I create a new Neutron router and re-associate
the floating IP with a port attached to that router, in a new namespace.
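
When narrowing down a case like this, one check is whether the router namespace actually contains NAT rules for the floating IP. The sketch below is only a diagnostic aid under stated assumptions: find_nat_rules_for_ip is a hypothetical helper, not part of Neutron, and it simply scans the text printed by iptables-save for nat-table rules that mention a given address. The sample text reuses rule lines of the kind shown in this report.

```python
def find_nat_rules_for_ip(iptables_save_text, ip_address):
    """Return nat-table rules from iptables-save output that mention ip_address."""
    matches = []
    current_table = None
    for raw_line in iptables_save_text.splitlines():
        line = raw_line.strip()
        if line.startswith("*"):
            # A line such as "*nat" opens a table section.
            current_table = line[1:]
        elif line == "COMMIT":
            current_table = None
        elif current_table == "nat" and line.startswith("-A "):
            # Compare whole tokens (with any /mask stripped) so that
            # 36.250.72.17 does not match 36.250.72.179.
            addresses = {token.split("/", 1)[0] for token in line.split()}
            if ip_address in addresses:
                matches.append(line)
    return matches


SAMPLE = """\
*raw
-A PREROUTING -j neutron-l3-agent-PREROUTING
COMMIT
*nat
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A neutron-l3-agent-OUTPUT -d 36.250.72.179/32 -j DNAT --to-destination 192.168.1.8
COMMIT
"""

if __name__ == "__main__":
    for rule in find_nat_rules_for_ip(SAMPLE, "36.250.72.179"):
        print(rule)
```

The text to scan can be captured with the same "ip netns exec <namespace> iptables-save" command used in this report. An empty result for a floating IP would suggest that no NAT rules were programmed for it in that namespace, but it does not by itself identify the cause.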

The following shows the iptables and OVS configuration when the error
occurs:

root@ubuntu:~# ip netns exec qrouter-0ccc1435-636d-41b9-912c-2a96c68e6a09 iptables-save
# Generated by iptables-save v1.6.1 on Fri Nov 15 05:16:28 2019
*raw
:PREROUTING ACCEPT [113408:41184050]
:OUTPUT ACCEPT [9442:553311]
:neutron-l3-agent-OUTPUT - [0:0]
:neutron-l3-agent-PREROUTING - [0:0]
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A OUTPUT -j neutron-l3-agent-OUTPUT
COMMIT
# Completed on Fri Nov 15 05:16:28 2019
# Generated by iptables-save v1.6.1 on Fri Nov 15 05:16:28 2019
*nat
:PREROUTING ACCEPT [2515:147604]
:INPUT ACCEPT [1126:64144]
:OUTPUT ACCEPT [1:84]
:POSTROUTING ACCEPT [1148:66130]
:neutron-l3-agent-OUTPUT - [0:0]
:neutron-l3-agent-POSTROUTING - [0:0]
:neutron-l3-agent-PREROUTING - [0:0]
:neutron-l3-agent-float-snat - [0:0]
:neutron-l3-agent-snat - [0:0]
:neutron-postrouting-bottom - [0:0]
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A OUTPUT -j neutron-l3-agent-OUTPUT
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A POSTROUTING -j neutron-postrouting-bottom
-A neutron-l3-agent-OUTPUT -d 36.250.72.179/32 -j DNAT --to-destination 192.168.1.8
-A neutron-l3-agent-POSTROUTING ! -i qg-091949c0-13 ! -o qg-091949c0-13 -m conntrack ! --ctstate DNAT -j ACCEPT
-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
-A