I cannot reproduce this problem. Did you change the DHCP lease time?

Édouard.
On Thu, Aug 8, 2013 at 7:52 AM, Chu Duc Minh <chu.ducm...@gmail.com> wrote:

> Do you have this log in agent log file :
>> 2013-08-07 13:21:46 WARNING [quantum.openstack.common.loopingcall] task
>> run outlasted interval by 2.375859 sec
>>
> Yes, I have:
> "WARNING [quantum.openstack.common.loopingcall] task run outlasted
> interval by 4.738189 sec"
>
> I set report_interval = 15 and agent_down_time = 30, then launched 50
> instances simultaneously.
> Now every instance is OK, I can ping them all. But in Dashboard, I still
> see a bug: some instances have 2 IP addresses (screenshot attached) - and
> of course, with each instance I can only ping 1 IP address.
>
> For example, for the first instance in the attached image, I checked
> Dnsmasq's host file and see 2 entries:
> *fa:16:3e:7c:42:bb*,10-2-1-41.openstacklocal,10.2.1.41
> fa:16:3e:0c:bc:4b,10-2-1-52.openstacklocal,10.2.1.52
> I can only ping 10.2.1.52.
>
> I checked the Quantum DB and saw that *fa:16:3e:7c:42:bb* still exists in
> the 'ports' table:
> ('6260622a6b324557bc9064698c8c03ed','*f3e79e1b-2236-4189-8516-fb18dc7e58a9*',
> '','dbc59888-e2be-4b31-b579-0a4575159bb1','*fa:16:3e:7c:42:bb*',
> 1,'DOWN','8420f945-2d88-4204-8444-9c078491def0','compute:None')
> Same result with quantum port-list:
> | *f3e79e1b-2236-4189-8516-fb18dc7e58a9* | | fa:16:3e:7c:42:bb |
> {"subnet_id": "4d238201-a8d5-4175-a9b4-c1d13efb5e2e", "ip_address":
> "10.2.1.41"} |
>
>
> Then I checked the Nova DB and found a record in the
> *instance_info_caches* table:
> | 2013-08-08 04:34:38 | 2013-08-08 04:38:19 | NULL | 2730 |
> [{"ovs_interfaceid": "43802d05-ee1f-401d-9d61-055444da8df4", "network":
> {"bridge": "br-int", "subnets": [{"ips": [{"meta": {}, "version": 4,
> "type": "fixed", "floating_ips": [], "address": "10.2.1.52"}], "version":
> 4, "meta": {"dhcp_server": "10.2.1.9"}, "dns": [], "routes": [], "cidr":
> "10.2.1.0/24", "gateway": {"meta": {}, "version": 4, "type": "gateway",
> "address": "10.2.1.1"}}], "meta": {"injected": false, "tenant_id":
> "6260622a6b324557bc9064698c8c03ed"}, "id":
> "dbc59888-e2be-4b31-b579-0a4575159bb1", "label": "net_minhcd_proj1"},
> "devname": "tap43802d05-ee", "qbh_params": null, "meta": {}, "address":
> "fa:16:3e:0c:bc:4b", "type": "ovs", "id":
> "43802d05-ee1f-401d-9d61-055444da8df4", "qbg_params": null},
> {"ovs_interfaceid": "*f3e79e1b-2236-4189-8516-fb18dc7e58a9*", "network":
> {"bridge": "br-int", "subnets": [{"ips": [{"meta": {}, "version": 4,
> "type": "fixed", "floating_ips": [], "address": "10.2.1.41"}], "version":
> 4, "meta": {"dhcp_server": "10.2.1.9"}, "dns": [], "routes": [], "cidr":
> "10.2.1.0/24", "gateway": {"meta": {}, "version": 4, "type": "gateway",
> "address": "10.2.1.1"}}], "meta": {"injected": false, "tenant_id":
> "6260622a6b324557bc9064698c8c03ed"}, "id":
> "dbc59888-e2be-4b31-b579-0a4575159bb1", "label": "net_minhcd_proj1"},
> "devname": "tapf3e79e1b-22", "qbh_params": null, "meta": {}, "address":
> "fa:16:3e:7c:42:bb", "type": "ovs", "id":
> "*f3e79e1b-2236-4189-8516-fb18dc7e58a9*", "qbg_params": null}] |
> 8420f945-2d88-4204-8444-9c078491def0 | 0 |
>
> In quantum-server.log:
> 2013-08-08 11:30:14 DEBUG [quantum.openstack.common.rpc.amqp] received
> {u'_context_roles': [u'admin'], u'_context_read_deleted': u'no',
> u'_context_tenant_id': None, u'args': {u'network_id':
> u'dbc59888-e2be-4b31-b579-0a4575159bb1', u'lease_remaining': 0, u'host':
> u'thor-quantum-01.localdomain', u'ip_address': u'10.2.1.41'},
> u'_unique_id': u'49f419ee040d4d77822ecf696533e484', u'_context_is_admin':
> True, u'version': u'1.0', u'_context_project_id': None,
> u'_context_timestamp': u'2013-08-08 04:26:09.092921', u'_context_user_id':
> None, u'method': u'update_lease_expiration'}
> 2013-08-08 11:30:16 DEBUG [quantum.db.dhcp_rpc_base] Updating lease
> expiration for 10.2.1.41 on network dbc59888-e2be-4b31-b579-0a4575159bb1
> from thor-quantum-01.localdomain.
> 2013-08-08 11:33:53 DEBUG [quantum.db.db_base_plugin_v2] Recycle
> 10.2.1.41
> 2013-08-08 11:33:53 DEBUG [quantum.db.db_base_plugin_v2] Recycle:
> updated last 10.2.1.39-10.2.1.41
> 2013-08-08 11:33:53 DEBUG [quantum.db.db_base_plugin_v2] Delete
> allocated IP 10.2.1.41
> (dbc59888-e2be-4b31-b579-0a4575159bb1/4d238201-a8d5-4175-a9b4-c1d13efb5e2e)
> 2013-08-08 11:33:53 DEBUG [quantum.db.db_base_plugin_v2] Recycle: last
> match for 10.2.1.39-10.2.1.41
> 2013-08-08 11:35:29 DEBUG [quantum.db.db_base_plugin_v2] Allocated IP -
> 10.2.1.41 from 10.2.1.41 to 10.2.1.42
> 2013-08-08 11:35:29 DEBUG [quantum.db.db_base_plugin_v2] Allocated IP
> 10.2.1.41
> (dbc59888-e2be-4b31-b579-0a4575159bb1/4d238201-a8d5-4175-a9b4-c1d13efb5e2e/f3e79e1b-2236-4189-8516-fb18dc7e58a9)
> (seems normal?)
>
> And when I deleted all instances, some entries still exist in Dnsmasq's
> host file --> can't ping on the next launch.
> Maybe I need to increase report_interval more, because I still see the
> message "WARNING [quantum.openstack.common.loopingcall] task run outlasted
> interval by X seconds" under heavy stress tests.
>
> But the question is, how much is enough?
> Could I fix this bug thoroughly? (apply a patch? but I would need to rename
> Quantum<->Neutron first)
>
> Thank you very much!
>
>
> On Wed, Aug 7, 2013 at 9:46 PM, Édouard Thuleau <thul...@gmail.com> wrote:
>
>> I think we have found (Sylvain and me) a problem that can explain this
>> trouble:
>>
>> When the load on the DHCP agent is too heavy (updating the dnsmasq host
>> file and sending lease updates), the state report to the Neutron server is
>> delayed, so the Neutron server considers the agent down and doesn't send
>> the port creation to the agent. So the dnsmasq host file isn't updated to
>> serve that port's IP.
>>
>> Do you have this log in agent log file :
>> 2013-08-07 13:21:46 WARNING [quantum.openstack.common.loopingcall] task
>> run outlasted interval by 2.375859 sec
>>
>> You can increase the 'report_interval' flag on the agent and the
>> 'agent_down_time' flag on the Neutron server side.
>> This problem should be corrected with this bp:
>> https://blueprints.launchpad.net/neutron/+spec/remove-dhcp-lease
>> Meanwhile, I think we should add a log warning in the Neutron server code
>> to report when it cannot notify any DHCP agent of a port creation, and
>> backport that to the Grizzly release.
>>
>> What do you think?
>>
>> I added this comment on the bug
>> https://bugs.launchpad.net/neutron/+bug/1185916
>>
>> Édouard.
>>
>>
>> On Fri, Aug 2, 2013 at 11:45 AM, Chu Duc Minh <chu.ducm...@gmail.com> wrote:
>>
>>> After I deleted 2 instances: 10.2.1.10 & 10.2.1.12
>>> The Dnsmasq's hosts file is:
>>> fa:16:3e:01:d1:70,10-2-1-1.openstacklocal,10.2.1.1
>>> fa:16:3e:71:6a:4e,10-2-1-11.openstacklocal,10.2.1.11
>>> *fa:16:3e:cf:0f:c1,10-2-1-12.openstacklocal,10.2.1.12* *<-- still
>>> exists, problem?!*
>>>
>>> fa:16:3e:35:a1:72,10-2-1-9.openstacklocal,10.2.1.9
>>>
>>>
>>> BR,
>>>
>>>
>>> On Fri, Aug 2, 2013 at 4:27 PM, Chu Duc Minh <chu.ducm...@gmail.com> wrote:
>>>
>>>> Hi, I have the same problem when I create -> terminate -> create
>>>> instances.
>>>> This problem only occurs when the new instances have the same IP as
>>>> deleted instances.
>>>>
>>>> I checked the dnsmasq's host file
>>>> /var/lib/quantum/dhcp/dbc59888-e2be-4b31-b579-0a4575159bb1/host,
>>>> sometimes it's not updated.
>>>>
>>>> I think this problem may not be related only to Dnsmasq; it may be
>>>> related to firewall rules (generated by Quantum) on the compute node
>>>> too, because I see some dropped DHCP packets:
>>>> Aug 2 14:08:11 thor-compute-03 kernel: [95971.005423]
>>>> IN=qbr23c67719-14 OUT=qbr23c67719-14 PHYSIN=qvb23c67719-14
>>>> PHYSOUT=tap23c67719-14
>>>> MAC=ff:ff:ff:ff:ff:ff:fa:16:3e:34:72:05:08:00 SRC=0.0.0.0
>>>> DST=255.255.255.255 LEN=328 TOS=0x10 PREC=0x00 TTL=128 ID=0 *PROTO=UDP
>>>> SPT=68 DPT=67* LEN=308
>>>> (DHCP Discovery packet?)
>>>> It was dropped in chain quantum-openvswi-sg-fallback, so the instance
>>>> can't get an IP, although in Dashboard I see the instance got an IP.
>>>>
>>>> I tried many times, and got a strange case: duplicate IP in Dnsmasq's
>>>> host file:
>>>> fa:16:3e:01:d1:70,10-2-1-1.openstacklocal,10.2.1.1
>>>> fa:16:3e:71:6a:4e,10-2-1-11.openstacklocal,10.2.1.11
>>>> *fa:16:3e:78:b5:2f,10-2-1-10.openstacklocal,10.2.1.10*
>>>> fa:16:3e:35:a1:72,10-2-1-9.openstacklocal,10.2.1.9
>>>> fa:16:3e:cf:0f:c1,10-2-1-12.openstacklocal,10.2.1.12
>>>> *fa:16:3e:c7:ea:0c,10-2-1-10.openstacklocal,10.2.1.10*
>>>>
>>>> My newest instance is *10.2.1.10*, and I can't ping it. In the boot log
>>>> of this instance, I found:
>>>>
>>>> cloudinitnonet waiting 120 seconds for a network device.
>>>> cloudinitnonet gave up waiting for a network device.
>>>> ciinfo: lo : 1 127.0.0.1 255.0.0.0 .
>>>> ciinfo: eth0 : 1 . . fa:16:3e:c7:ea:0c
>>>> route_info failed
>>>>
>>>> Restarting the instance didn't make it work, but restarting
>>>> quantum-dhcp-agent on the Quantum node made it work.
>>>> After the restart, the content of Dnsmasq's host file is:
>>>> fa:16:3e:01:d1:70,10-2-1-1.openstacklocal,10.2.1.1
>>>> fa:16:3e:71:6a:4e,10-2-1-11.openstacklocal,10.2.1.11
>>>> fa:16:3e:cf:0f:c1,10-2-1-12.openstacklocal,10.2.1.12
>>>> fa:16:3e:35:a1:72,10-2-1-9.openstacklocal,10.2.1.9
>>>> *fa:16:3e:c7:ea:0c,10-2-1-10.openstacklocal,10.2.1.10*
>>>>
>>>> I think it is a serious problem, hope someone could fix it soon.. :)
>>>>
>>>> Best Regards,
>>>>
>>>>
>>>> On Tue, Jul 2, 2013 at 8:01 PM, James Page <james.p...@ubuntu.com> wrote:
>>>>
>>>>> On 20/05/13 07:51, Heinonen, Johanna (NSN - FI/Espoo) wrote:
>>>>>
>>>>>> Hi,
>>>>>> I have installed grizzly with quantum and ovs-plugin. It seems that
>>>>>> grizzly allocates the third address of each subnet for dhcp. (In
>>>>>> folsom it was the second address). This means that the VMs will get
>>>>>> addresses
>>>>>>
>>>>>
>>>>> This sounds a lot like
>>>>> https://bugs.launchpad.net/ubuntu/+source/quantum/+bug/1189909;
>>>>> I'll raise a task for dnsmasq as well.
>>>>>
>>>>> Cheers
>>>>>
>>>>> James
>>>>>
>>>>> --
>>>>> James Page
>>>>> Ubuntu Core Developer
>>>>> Debian Maintainer
>>>>> james.p...@ubuntu.com
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Mailing list: https://launchpad.net/~openstack
>>>>> Post to : openst...@lists.launchpad.net
>>>>> Unsubscribe : https://launchpad.net/~openstack
>>>>> More help : https://help.launchpad.net/ListHelp
>>>>>
>>>>
>>>>
>>>
>>> _______________________________________________
>>> Mailing list:
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>> Post to : openstack@lists.openstack.org
>>> Unsubscribe :
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>
>>>
>>
>
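For anyone hitting the duplicate-IP case quoted in this thread (two MACs mapped to 10.2.1.10 in Dnsmasq's host file), a quick check can be sketched as below. This is only a rough sketch, not part of Quantum: the function name find_duplicate_ips is made up for illustration, and it assumes each entry has the mac,hostname,ip layout shown in the quoted host file. The sample data is copied from the thread; point it at your own host file instead.

```python
from collections import defaultdict


def find_duplicate_ips(host_lines):
    """Return {ip: [mac, ...]} for every IP listed under more than one MAC.

    Hypothetical helper; assumes 'mac,hostname,ip' entries as quoted in
    this thread.
    """
    macs_by_ip = defaultdict(list)
    for raw in host_lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(",")
        if len(fields) < 3:
            continue  # not a mac,hostname,ip entry
        mac, ip = fields[0].lower(), fields[-1]
        if mac not in macs_by_ip[ip]:
            macs_by_ip[ip].append(mac)
    return {ip: macs for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Sample entries taken from the host file quoted in the thread.
SAMPLE = """\
fa:16:3e:01:d1:70,10-2-1-1.openstacklocal,10.2.1.1
fa:16:3e:71:6a:4e,10-2-1-11.openstacklocal,10.2.1.11
fa:16:3e:78:b5:2f,10-2-1-10.openstacklocal,10.2.1.10
fa:16:3e:35:a1:72,10-2-1-9.openstacklocal,10.2.1.9
fa:16:3e:cf:0f:c1,10-2-1-12.openstacklocal,10.2.1.12
fa:16:3e:c7:ea:0c,10-2-1-10.openstacklocal,10.2.1.10
"""

if __name__ == "__main__":
    for ip, macs in sorted(find_duplicate_ips(SAMPLE.splitlines()).items()):
        print("%s is claimed by: %s" % (ip, ", ".join(macs)))
```

To run it against a live agent, read the lines from the network's host file (the thread mentions /var/lib/quantum/dhcp/<network-id>/host) and pass them to the function. Any IP it reports is a candidate for the stale-entry problem described above; it does not tell you which MAC is the stale one, so compare against quantum port-list.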