[Openstack] Fwd: [Question #221283]: VM instance is not able to get IP address

2013-02-08 Thread Anil Vishnoi
Hi All,

I posted the question below on Launchpad (quantum), but I didn't get any
response from the team; maybe it's not as active as the OpenStack mailing
list.

I am facing the issue detailed in this question [
https://answers.launchpad.net/quantum/+question/221283]. I did some
analysis and shared it on the same question. You can find my analysis in
the mail below as well.

I am looking for suggestions from the OpenStack networking experts on how
to debug this issue further. My deployment is stuck because of it. I would
really appreciate your help.

Thanks
Anil

-- Forwarded message --
From: Anil Vishnoi question221...@answers.launchpad.net
Date: Fri, Feb 8, 2013 at 3:41 AM
Subject: Re: [Question #221283]: VM instance is not able to get IP address
To: vishnoia...@gmail.com


Your question #221283 on quantum changed:
https://answers.launchpad.net/quantum/+question/221283

You gave more information on the question:
Hi Team,

I debugged this issue further and figured out one workaround. I don't
really want to call it a workaround; it's more of a hack.

As I mentioned in the description above, because of the actions=drop rule,
DHCP packets were getting dropped and never reached the DHCP agent,
and hence it was not able to respond with a DHCPOFFER.

First I resolved this error [Feb 07
17:32:40|1|netdev_linux|WARN|/sys/class/net/tap9fdb5c15-26/carrier:
open failed: ] with the following steps:
1. Disable the network namespace for the l3_agent and the dhcp agent by
setting use_namespaces=False in the respective configuration files.
2. Delete the port (tap9fdb5c15-26) from the br-int bridge.
[Quick instructions :
root@management:~# ovs-vsctl del-port tap9fdb5c15-26
root@management:~# ovs-vsctl add-port br-int tap9fdb5c15-26
root@management:~# ovs-vsctl set port tap9fdb5c15-26 tag=1
root@management:~# ovs-vsctl set Interface tap9fdb5c15-26 type=internal
]
3. Restart both services; they will then create the tap devices outside
the network namespace.

If network namespaces are enabled, ifconfig will not show this tap device
in its output, but the command 'ip netns exec dhcpns ip -d link' will
show you the device.

In my setup I followed the steps above, but even if you don't want to
disable namespaces, you can stop the dhcp agent, delete the port from
br-int and restart the service. That will possibly resolve this error (it
did work in my setup).

So in my setup, namespaces are disabled, and the following is the output
of ovs-dpctl:

root@management:~# ovs-dpctl show
system@br-eth1:
lookups: hit:151651 missed:37759 lost:0
flows: 3
port 0: br-eth1 (internal)
port 1: eth1
port 3: phy-br-eth1
system@br-int:
lookups: hit:1183 missed:23283 lost:0
flows: 1
port 0: br-int (internal)
port 6: tap9fdb5c15-26 (internal)
port 7: int-br-eth1
system@br-ex:
lookups: hit:96895 missed:67156 lost:0
flows: 16
port 0: br-ex (internal)
port 1: eth0
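
The port numbers in this output are what the in_port and output values in the flow rules refer to in this setup. As a rough aid for reading them, here is a minimal Python sketch that turns output of this shape into a per-bridge port map. The function name parse_dpctl_show is made up for illustration, and it only assumes the line format shown above.

```python
import re

def parse_dpctl_show(text):
    """Return {bridge: {port_number: port_name}} from 'ovs-dpctl show' text."""
    bridges = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("system@"):
            # Datapath header, e.g. "system@br-int:"
            current = line[len("system@"):].rstrip(":")
            bridges[current] = {}
            continue
        m = re.match(r"port (\d+): (\S+)", line)
        if m and current is not None:
            bridges[current][int(m.group(1))] = m.group(2)
    return bridges

# Sample copied from the br-int section of the output above.
SAMPLE = """\
system@br-int:
lookups: hit:1183 missed:23283 lost:0
flows: 1
port 0: br-int (internal)
port 6: tap9fdb5c15-26 (internal)
port 7: int-br-eth1
"""

ports = parse_dpctl_show(SAMPLE)
print(ports["br-int"])
```

With such a map, in_port=7 on br-int can be read as int-br-eth1 and port 6 as the DHCP tap device.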

The DHCP request is a broadcast packet, and it takes the following path to
reach br-int: port 1: eth1 (br-eth1) -- port 7: int-br-eth1 (br-int).
The packet gets dropped there because of the following rule installed on
the br-int bridge:

 cookie=0x0, duration=11422.615s, table=0, n_packets=16711,
n_bytes=1178562, priority=2,in_port=7 actions=drop
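
To spot such a rule without reading every entry by eye, a flow entry of this shape can be split into its match fields and actions. This is only a small sketch: parse_flow and drops_port are hypothetical helper names, and the parsing is deliberately simple (it would not handle actions that contain commas inside parentheses).

```python
def parse_flow(entry):
    """Split one flow entry into (match fields dict, list of actions)."""
    entry = " ".join(entry.split())  # undo line wrapping and extra spaces
    head, _, actions = entry.partition(" actions=")
    fields = {}
    for item in head.replace(", ", ",").split(","):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields, actions.split(",")

def drops_port(entry, port):
    """True if the entry matches the given in_port and only drops."""
    fields, actions = parse_flow(entry)
    return fields.get("in_port") == str(port) and actions == ["drop"]

# The rule quoted above, including its line wrap.
RULE = """ cookie=0x0, duration=11422.615s, table=0, n_packets=16711,
n_bytes=1178562, priority=2,in_port=7 actions=drop"""

print(drops_port(RULE, 7))
```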

Ideally it should be forwarded to port 6: tap9fdb5c15-26 (internal)
(br-int), and that way it can reach the DHCP agent. So I modified the
above flow to the following flow:

cookie=0x0, duration=3169.501s, table=0, n_packets=2562, n_bytes=228241,
priority=2,in_port=7 actions=output:6

and also installed the following rule to route the DHCPOFFER packet back:

cookie=0x0, duration=4536.551s, table=0, n_packets=233, n_bytes=28896,
priority=2,in_port=6 actions=output:7
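
This mail does not say which tool was used to change the flows. One possible way, assumed here, is ovs-ofctl with its mod-flows and add-flow commands. The sketch below only builds and prints the command lines, it does not run them; the helper name ofctl is made up for illustration.

```python
import shlex

def ofctl(command, bridge, match, actions):
    """Build (not run) an ovs-ofctl command as an argument list."""
    return ["ovs-ofctl", command, bridge, f"{match},actions={actions}"]

commands = [
    # Replace the drop action on traffic arriving from int-br-eth1 (port 7).
    ofctl("mod-flows", "br-int", "in_port=7", "output:6"),
    # Send traffic from the DHCP tap port (port 6) back towards br-eth1.
    ofctl("add-flow", "br-int", "priority=2,in_port=6", "output:7"),
]

for cmd in commands:
    print(" ".join(shlex.quote(part) for part in cmd))
```

Note that rules like these forward everything between the two ports, not only DHCP traffic, so they are suitable for debugging at best.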

So after installing these two flow rules, the DHCP agent got the request
and responded with a DHCPOFFER.

root@management:~# tail -f /var/log/syslog
Feb  8 03:26:16 management dnsmasq-dhcp[25811]: DHCPREQUEST(tap9fdb5c15-26)
192.168.0.3 fa:16:3e:93:74:73
Feb  8 03:26:16 management dnsmasq-dhcp[25811]: DHCPACK(tap9fdb5c15-26)
192.168.0.3 fa:16:3e:93:74:73 192-168-0-3
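
To check from a longer syslog whether a particular VM got its lease, lines shaped like the two above can be filtered by MAC address. A minimal sketch, assuming exactly this dnsmasq-dhcp line layout (message type, interface, IP, MAC); dhcp_events is a hypothetical helper name, and other message layouts are simply not matched.

```python
import re

# Matches e.g. "dnsmasq-dhcp[25811]: DHCPACK(tap9fdb5c15-26) 192.168.0.3 fa:16:..."
LOG_RE = re.compile(
    r"dnsmasq-dhcp\[\d+\]:\s+(DHCP[A-Z]+)\((\S+?)\)\s+(\S+)\s+([0-9a-f:]{17})"
)

def dhcp_events(log_text):
    """Return (message_type, interface, ip, mac) tuples found in the log."""
    return [m.groups() for m in LOG_RE.finditer(log_text)]

# The two syslog lines quoted above.
SAMPLE = (
    "Feb  8 03:26:16 management dnsmasq-dhcp[25811]: "
    "DHCPREQUEST(tap9fdb5c15-26) 192.168.0.3 fa:16:3e:93:74:73\n"
    "Feb  8 03:26:16 management dnsmasq-dhcp[25811]: "
    "DHCPACK(tap9fdb5c15-26) 192.168.0.3 fa:16:3e:93:74:73 192-168-0-3\n"
)

events = dhcp_events(SAMPLE)
acked = [e for e in events if e[0] == "DHCPACK" and e[3] == "fa:16:3e:93:74:73"]
print(acked)
```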

The DHCP response packet takes the following path: port 6: tap9fdb5c15-26
(internal) (br-int) --- port 7: int-br-eth1 (br-int) --- port 3:
phy-br-eth1 (br-eth1) --- port 1: eth1 (br-eth1), and that way it goes
out of the controller node. But on the br-eth1 bridge another rule was
installed which was dropping the response:

cookie=0x0, duration=2669.22s, table=0, n_packets=173, n_bytes=18144,
priority=2,in_port=3 actions=drop

and I changed this flow to

cookie=0x0, duration=2669.22s, table=0, n_packets=173, n_bytes=18144,
priority=2,in_port=3 actions=output:1
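
The effect of these rules on the response path can be followed with a toy model of the two bridges. This is only a sketch for reasoning about the path described in this mail: it models just the rules quoted here, treats int-br-eth1 and phy-br-eth1 as the two ends of one link, and the names FLOWS, LINKS and trace are made up for illustration.

```python
# Each bridge maps in_port -> action ("drop" or an output port number).
FLOWS = {
    "br-int": {6: 7, 7: 6},   # the two modified rules on br-int
    "br-eth1": {3: 1},        # the modified rule on br-eth1
}

# int-br-eth1 (br-int port 7) and phy-br-eth1 (br-eth1 port 3) are linked.
LINKS = {("br-int", 7): ("br-eth1", 3), ("br-eth1", 3): ("br-int", 7)}

def trace(bridge, in_port, flows, max_hops=8):
    """Follow a packet hop by hop and return the (bridge, port) steps."""
    path = [(bridge, in_port)]
    for _ in range(max_hops):
        action = flows.get(bridge, {}).get(in_port, "drop")
        path.append((bridge, action))
        if action == "drop":
            break
        nxt = LINKS.get((bridge, action))
        if nxt is None:
            break  # left through a port with no modelled peer, e.g. eth1
        bridge, in_port = nxt
        path.append(nxt)
    return path

# Response leaving the DHCP tap port with the modified rules.
print(trace("br-int", 6, FLOWS))

# Same packet with the original drop rule still on br-eth1.
OLD_FLOWS = {"br-int": {6: 7, 7: 6}, "br-eth1": {3: "drop"}}
print(trace("br-int", 6, OLD_FLOWS))
```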

so now the packet can leave the controller machine. Now follows the
story of the compute node side.

Following is the ovs-dpctl output of my compute node:

system@br-eth1:
lookups: hit:404442 missed:110048 lost:0
flows: 1
port 0: 

Re: [Openstack] Fwd: [Question #221283]: VM instance is not able to get IP address

2013-02-08 Thread Guilherme Russi
I'm hitting a similar problem in my deployment. If someone could help, I
would appreciate it.

Regards.

Guilherme.

