[Openstack] Fwd: [Question #221283]: VM instance is not able to get IP address
Hi All,

I posted the question below on Launchpad (quantum) but did not get any response from the team; perhaps it is not as active as the OpenStack mailing list. I am facing the issue detailed in this question [https://answers.launchpad.net/quantum/+question/221283], did some analysis, and shared it on the same question. You can find my analysis in the mail below as well. I am looking for suggestions from the OpenStack networking experts on how to debug this issue further. My deployment is stuck because of it. I would really appreciate your help.

Thanks
Anil

-- Forwarded message --
From: Anil Vishnoi question221...@answers.launchpad.net
Date: Fri, Feb 8, 2013 at 3:41 AM
Subject: Re: [Question #221283]: VM instance is not able to get IP address
To: vishnoia...@gmail.com

Your question #221283 on quantum changed:
https://answers.launchpad.net/quantum/+question/221283

You gave more information on the question:

Hi Team,

I debugged this issue further and figured out a workaround, although it is really more of a hack than a workaround. As I mentioned in the description above, because of the actions=drop rule, DHCP packets were being dropped before they reached the DHCP agent, so it could not respond with a DHCPOFFER.

First I resolved this error

[Feb 07 17:32:40|1|netdev_linux|WARN|/sys/class/net/tap9fdb5c15-26/carrier: open failed: ]

with the following steps:

1. Disable the network namespace for the l3 agent and the DHCP agent by setting use_namespaces=False in the respective configuration files.

2. Delete the port (tap9fdb5c15-26) from the br-int bridge and re-add it. Quick instructions:

   root@management:~# ovs-vsctl del-port tap9fdb5c15-26
   root@management:~# ovs-vsctl add-port br-int tap9fdb5c15-26
   root@management:~# ovs-vsctl set port tap9fdb5c15-26 tag=1
   root@management:~# ovs-vsctl set Interface tap9fdb5c15-26 type=internal

3. Restart both services; they will create the tap devices outside the network namespace.
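The quick instructions in step 2 can be wrapped in a small helper so they are easy to repeat for another port. This is only a sketch: the function name readd_internal_port and the DRY_RUN switch are made up for illustration and are not part of Open vSwitch or Quantum. By default it only prints the ovs-vsctl commands, so nothing on the bridge changes until DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper, not from the original post.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# readd_internal_port BRIDGE PORT VLAN_TAG
# Removes PORT from BRIDGE, adds it back, restores its VLAN tag and
# marks the interface as type=internal, as in the quick instructions.
readd_internal_port() {
    bridge=$1
    port=$2
    tag=$3
    run ovs-vsctl del-port "$bridge" "$port"
    run ovs-vsctl add-port "$bridge" "$port"
    run ovs-vsctl set Port "$port" "tag=$tag"
    run ovs-vsctl set Interface "$port" type=internal
}

# Values taken from the post: bridge br-int, port tap9fdb5c15-26, tag 1.
readd_internal_port br-int tap9fdb5c15-26 1
```

Stop the DHCP agent before running this with DRY_RUN=0 and restart it afterwards, as described in step 3.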
If the network namespace is enabled, ifconfig will not show this tap device in its output, but 'ip netns exec dhcpns ip -d link' will. In my setup I followed the steps above, but if you don't want to disable namespaces, you can stop the DHCP agent, delete the port from br-int, and restart the service. That may resolve this error (it did work in my setup).

So in my setup, namespaces are disabled. The following is the output of ovs-dpctl:

root@management:~# ovs-dpctl show
system@br-eth1:
        lookups: hit:151651 missed:37759 lost:0
        flows: 3
        port 0: br-eth1 (internal)
        port 1: eth1
        port 3: phy-br-eth1
system@br-int:
        lookups: hit:1183 missed:23283 lost:0
        flows: 1
        port 0: br-int (internal)
        port 6: tap9fdb5c15-26 (internal)
        port 7: int-br-eth1
system@br-ex:
        lookups: hit:96895 missed:67156 lost:0
        flows: 16
        port 0: br-ex (internal)
        port 1: eth0

A DHCP request is a broadcast packet, and it takes the following path to reach br-int:

port 1: eth1 (br-eth1) -- port 7: int-br-eth1 (br-int)

The packet gets dropped there because of the following rule installed on the br-int bridge:

cookie=0x0, duration=11422.615s, table=0, n_packets=16711, n_bytes=1178562, priority=2,in_port=7 actions=drop

Ideally it should be forwarded to port 6: tap9fdb5c15-26 (internal) (br-int), so that it can reach the DHCP agent. So I changed the flow above to the following:

cookie=0x0, duration=3169.501s, table=0, n_packets=2562, n_bytes=228241, priority=2,in_port=7 actions=output:6

and also installed the following rule to route the DHCPOFFER packet back:

cookie=0x0, duration=4536.551s, table=0, n_packets=233, n_bytes=28896, priority=2,in_port=6 actions=output:7

After installing these two flow rules, the DHCP agent received the request and responded with a DHCPOFFER.
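The post shows the resulting flows but not the commands that installed them. One plausible way to get the same two rules on br-int is ovs-ofctl add-flow, sketched below. The helper names run and forward_flow and the DRY_RUN switch are assumptions made for this example. Ports 6 and 7 are the numbers from this particular setup; on another machine, check the OpenFlow port numbers with 'ovs-ofctl show br-int' first.

```shell
#!/bin/sh
# Sketch only: these commands are an assumption, the post does not list them.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# forward_flow BRIDGE IN_PORT OUT_PORT
# Installs a priority-2 flow that sends everything arriving on IN_PORT
# to OUT_PORT. An existing flow with the same match and priority
# (here the actions=drop rule) is replaced by add-flow.
forward_flow() {
    run ovs-ofctl add-flow "$1" "priority=2,in_port=$2,actions=output:$3"
}

run ovs-ofctl dump-flows br-int   # inspect the current rules first
forward_flow br-int 7 6           # int-br-eth1 -> DHCP tap (requests)
forward_flow br-int 6 7           # DHCP tap -> int-br-eth1 (replies)
```

These rules forward all traffic between the two ports, not only DHCP, and the OVS agent may reinstall its own flows when it restarts, so this stays a debugging hack rather than a fix.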
root@management:~# tail -f /var/log/syslog
Feb 8 03:26:16 management dnsmasq-dhcp[25811]: DHCPREQUEST(tap9fdb5c15-26) 192.168.0.3 fa:16:3e:93:74:73
Feb 8 03:26:16 management dnsmasq-dhcp[25811]: DHCPACK(tap9fdb5c15-26) 192.168.0.3 fa:16:3e:93:74:73 192-168-0-3

The DHCP response packet takes the following path:

port 6: tap9fdb5c15-26 (internal) (br-int) --- port 7: int-br-eth1 (br-int) --- port 3: phy-br-eth1 (br-eth1) --- port 1: eth1 (br-eth1)

and that way it leaves the controller node. But another rule installed on the br-eth1 bridge was dropping the response:

cookie=0x0, duration=2669.22s, table=0, n_packets=173, n_bytes=18144, priority=2,in_port=3 actions=drop

I changed this flow to

cookie=0x0, duration=2669.22s, table=0, n_packets=173, n_bytes=18144, priority=2,in_port=3 actions=output:1

so now the packet can leave the controller machine.

Now for the compute node side. The following is the ovs-dpctl output of my compute node:

system@br-eth1:
        lookups: hit:404442 missed:110048 lost:0
        flows: 1
        port 0:
Re: [Openstack] Fwd: [Question #221283]: VM instance is not able to get IP address
I'm getting a similar problem in my deploy. If someone could help, I appreciate it.

Regards.
Guilherme.