Re: [Openstack] inter vm communication issue

2012-06-29 Thread Tom Sante
Hi,

I am a colleague of Bram working with him on these same systems. 
We are now experiencing other issues related to networking on our nodes:

- we gave OpenStack eth0 as the VLAN interface
- eth0 and eth1 are still slaves in bond0 (mode 6)
==> we are seeing a large number of dropped packets on the eth1 interface, which
under heavy load makes the network on our VMs unstable

My guess is that using eth0 directly as the VLAN interface, while it is still
a slave in a bond, is what creates these issues.
Or should this not create issues?
 
If so, then while we managed to avoid the inter-VM communication issues by using
eth0 as the VLAN interface instead of our bond0 (= eth0+eth1), this still leaves
open the question of why a bond interface would not function as the OpenStack
VLAN interface.

Regards,

Tom
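
For anyone who wants to check the same kind of setup on their own nodes, here is a minimal Python sketch (not part of the original mail). It assumes the usual Linux locations: the bonding driver reports its state as text under /proc/net/bonding/<bond>, and per-interface drop counters live under /sys/class/net/<iface>/statistics/. The function names and the sample status text are made up for illustration; the sample was not captured from the systems discussed here.

```python
from pathlib import Path

# Illustrative text in the style of /proc/net/bonding/bond0.
# This is an assumption for the example, not output from the thread's nodes.
SAMPLE_BOND_STATUS = """\
Bonding Mode: adaptive load balancing
MII Status: up

Slave Interface: eth0
MII Status: up

Slave Interface: eth1
MII Status: up
"""


def parse_bond_status(text):
    """Return (mode, [slave names]) from bonding status text."""
    mode = None
    slaves = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or line without "key: value" shape
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            mode = value
        elif key == "Slave Interface":
            slaves.append(value)
    return mode, slaves


def read_drop_counters(iface, sysfs_root="/sys/class/net"):
    """Read the rx/tx drop counters of one interface from sysfs."""
    stats = Path(sysfs_root) / iface / "statistics"
    return {
        name: int((stats / name).read_text())
        for name in ("rx_dropped", "tx_dropped")
    }


mode, slaves = parse_bond_status(SAMPLE_BOND_STATUS)
print(mode, slaves)
```

On a real node one would pass the contents of /proc/net/bonding/bond0 to parse_bond_status() and call read_drop_counters("eth1") before and after a load test to see whether the drops follow one slave.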

On Friday 1 June 2012, at 15:28, Bram De Wilde wrote:
 The bond was the culprit!
 
 As we have been breaking our heads over this for close to 2 days it seems 
 important enough to report here:
 
 On our Ubuntu 12.04 systems we had 2 bonded interfaces configured with an IP
 in 10.0.0.0/24 in adaptive load balancing mode. We used this mode 6 type of
 bonding as bonding is not supported by the switch administrator. This appears
 not to be compatible with VLAN-tagged multi-host networking. @Vish: thanks for
 the suggestion. Any idea where we should report this issue as a bug? I
 guess not with OpenStack but rather with the ifenslave people?
 I suspect this does not occur with other, switch-based bonding modes, but
 as we have no support for these I am unable to test...
 This explains why the inter-VM communication was really unreliable and dropped
 out after a while. Using the eth0 interface instead of bond0 as the VLAN
 interface, the network is now as stable as ever.
 
 As happy OpenStack users, we will now be configuring our private cloud for
 stable operation in our department. Thanks, all!
 
 We will be working on a solution for the name resolution in vlan tagged 
 multi-host configurations, I will keep you posted as we progress.
 
 Kind regards,
 
 Bram
 
 On 1-jun-2012, at 10:02, Vishvananda Ishaya wrote:
 
  
  On Jun 1, 2012, at 12:46 AM, Bram De Wilde wrote:
  
   Thanx Vish,
   
   On the name resolution: would you consider this a bug (I can file one if 
   you would like) or a feature?
  
  Bug if it is an easy fix :)
  
   Could this be fixed by changing the /usr/bin/nova-dhcpbridge script to 
   load all MAC, hostname, IP combinations from the database instead of just 
   those for the physical host? Or would this create other issues?
  
  We would have to do some investigation into special settings. We want to 
  make sure that the host doesn't respond to dhcp requests from other hosts. 
  If it is possible to set up dnsmasq to do name resolution for the other 
  hosts without handing out IP addresses, then we could do it this way. Someone 
  will have to look into it. It might have to be something a little more 
  complicated like writing out a hosts file in addition to the dhcp file and 
  telling dnsmasq to use it. If you want to investigate the easiest way to 
  configure dnsmasq to do this, that would be a big help.
  
  Vish
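
A minimal sketch of the "hosts file in addition to the DHCP file" idea, added here for illustration and not taken from nova. dnsmasq can read extra name records from a file in /etc/hosts format via its --addn-hosts option. The record shape (Lease), the function name, the second sample instance and the domain value are assumptions; how nova-network would supply the records is not covered by the thread.

```python
from collections import namedtuple

# Hypothetical record shape. The real nova database schema is not shown in
# the thread, so this is only an assumption for illustration.
Lease = namedtuple("Lease", ["mac", "hostname", "ip"])


def render_hosts_file(leases, domain=None):
    """Render /etc/hosts-style lines: '<ip><TAB>[<fqdn> ]<hostname>'."""
    lines = []
    for lease in sorted(leases, key=lambda item: item.hostname):
        names = [lease.hostname]
        if domain:
            # Put the fully qualified name first, short name second.
            names.insert(0, f"{lease.hostname}.{domain}")
        lines.append(f"{lease.ip}\t{' '.join(names)}")
    return "\n".join(lines) + "\n"


LEASES = [
    # Second entry is invented sample data.
    Lease("fa:16:3e:3d:ff:f3", "redis-appdata1", "10.10.40.16"),
    Lease("fa:16:3e:00:00:01", "web1", "10.10.40.17"),
]

print(render_hosts_file(LEASES, domain="novalocal"), end="")
```

The MAC address is deliberately left out of the output: a hosts file only carries name-to-address records, so pointing dnsmasq at it with --addn-hosts should add name resolution without creating DHCP host entries. Whether that is enough to keep a host from answering DHCP requests for other hosts' instances would still need to be verified.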
 
 
 ___
 Mailing list: https://launchpad.net/~openstack
 Post to : openstack@lists.launchpad.net (mailto:openstack@lists.launchpad.net)
 Unsubscribe : https://launchpad.net/~openstack
 More help : https://help.launchpad.net/ListHelp






Re: [Openstack] instances loosing IP address while running, due to No DHCPOFFER

2012-06-27 Thread Tom Sante
Hey,

I seem to have the same issue with our VMs. I commented (comment #7) on a bug 
report that seems to match our DHCP issues: 
https://bugs.launchpad.net/nova/+bug/887162

Please report if you are still affected by this issue on the bug page so the 
developers can look into a fix.

Regards,


On Saturday 16 June 2012, at 01:19, Christian Parpart wrote:

 Hey all,
 
 It has now happened twice more, both times today, the last at 22:00 UTC,
 with the following in nova-network's syslog:
 
 root@gw1:/var/log# grep 'dnsmasq.*10889' daemon.log
 Jun 15 17:39:32 cesar1 dnsmasq[10889]: started, version v2.62-7-g4ce4f37 cachesize 150
 Jun 15 17:39:32 cesar1 dnsmasq[10889]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack
 Jun 15 17:39:32 cesar1 dnsmasq-dhcp[10889]: DHCP, static leases only on 10.10.40.3, lease time 3d
 Jun 15 17:39:32 cesar1 dnsmasq[10889]: reading /etc/resolv.conf
 Jun 15 17:39:32 cesar1 dnsmasq[10889]: using nameserver 4.2.2.1#53
 Jun 15 17:39:32 cesar1 dnsmasq[10889]: using nameserver 178.63.26.173#53
 Jun 15 17:39:32 cesar1 dnsmasq[10889]: using nameserver 192.168.2.122#53
 Jun 15 17:39:32 cesar1 dnsmasq[10889]: using nameserver 192.168.2.121#53
 Jun 15 17:39:32 cesar1 dnsmasq[10889]: read /etc/hosts - 519 addresses
 Jun 15 17:39:32 cesar1 dnsmasq-dhcp[10889]: read /var/lib/nova/networks/nova-br100.conf
 Jun 15 21:59:41 cesar1 dnsmasq-dhcp[10889]: DHCPREQUEST(br100) 10.10.40.16 fa:16:3e:3d:ff:f3
 Jun 15 21:59:41 cesar1 dnsmasq-dhcp[10889]: DHCPACK(br100) 10.10.40.16 fa:16:3e:3d:ff:f3 redis-appdata1
 
 It seems that this one VM was the only one that sent a DHCP request over the
 past 5 hours; that first one was answered with a DHCPACK, and that is it.
 That is also the time at which the host behind that IP (redis-appdata1)
 stopped functioning.
 
 However, I have now updated dnsmasq on our gateway node to the latest trunk
 of the dnsmasq git repository, killed dnsmasq, and restarted nova-network
 (which auto-starts dnsmasq per device).
 
 I really hoped that the bug addressed by this one particular fix was the
 cause of the downtime, but apparently there MIGHT be another factor.
 
 There is unfortunately nothing to read in the VM's syslog.
 What else could cause the VM to forget its IP?
 Can this also be caused by send_arp_for_ha=True?
 
 Regards,
 Christian.
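
To make it easier to see which clients the DHCP server actually logged traffic for, here is a small Python sketch (an editorial illustration, not something posted in the thread). It counts dnsmasq-dhcp message types per MAC address. It assumes every log entry sits on a single line, unlike mail-wrapped excerpts, and the function names are invented for this example. The two sample lines are the DHCPREQUEST/DHCPACK pair from the log above.

```python
import re
from collections import Counter, defaultdict

# Matches e.g. "dnsmasq-dhcp[10889]: DHCPACK(br100) 10.10.40.16 fa:16:..."
LOG_RE = re.compile(
    r"dnsmasq-dhcp\[\d+\]: (?P<type>DHCP[A-Z]+)\((?P<iface>[^)]+)\) (?P<rest>.*)"
)
MAC_RE = re.compile(r"\b(?:[0-9a-f]{2}:){5}[0-9a-f]{2}\b", re.IGNORECASE)


def summarize_dhcp(lines):
    """Count DHCP message types per client MAC address."""
    per_mac = defaultdict(Counter)
    for line in lines:
        match = LOG_RE.search(line)
        if not match:
            continue  # not a DHCP transaction line
        mac = MAC_RE.search(match.group("rest"))
        if mac:
            per_mac[mac.group(0).lower()][match.group("type")] += 1
    return per_mac


def unanswered_discovers(per_mac):
    """MACs with a logged DHCPDISCOVER but no logged DHCPOFFER."""
    return sorted(
        mac for mac, counts in per_mac.items()
        if counts["DHCPDISCOVER"] > 0 and counts["DHCPOFFER"] == 0
    )


SAMPLE_LOG = [
    "Jun 15 17:39:32 cesar1 dnsmasq-dhcp[10889]: DHCP, static leases only on 10.10.40.3, lease time 3d",
    "Jun 15 21:59:41 cesar1 dnsmasq-dhcp[10889]: DHCPREQUEST(br100) 10.10.40.16 fa:16:3e:3d:ff:f3",
    "Jun 15 21:59:41 cesar1 dnsmasq-dhcp[10889]: DHCPACK(br100) 10.10.40.16 fa:16:3e:3d:ff:f3 redis-appdata1",
]

summary = summarize_dhcp(SAMPLE_LOG)
for mac, counts in summary.items():
    print(mac, dict(counts))
```

This only shows the server's side of the exchange. In the case described earlier in this thread, where dnsmasq logs an offer but the instance reports none, the server log would look healthy, so a packet capture on the bridge and inside the instance would be the next thing to compare.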
 
 On Fri, Jun 15, 2012 at 2:50 AM, Nathanael Burton 
 nathanael.i.bur...@gmail.com wrote:
  FWIW I haven't run across the dnsmasq bug in our environment using EPEL 
  packages. 
  Nate
  On Jun 14, 2012 7:20 PM, Vishvananda Ishaya vishvana...@gmail.com wrote:
   Are you running in VLAN mode? If so, you probably need to update to a new 
   version of dnsmasq. See this message for reference:
   
   http://osdir.com/ml/openstack-cloud-computing/2012-05/msg00785.html 
   
   Vish
   
   On Jun 14, 2012, at 1:41 PM, Christian Parpart wrote:
Hey all,

I am really sad to say this: now that we have had quite a few instances in
production for at least 5 days, I have encountered the second instance
losing its IP address due to "No DHCPOFFER" (according to the syslog in the
instance).

I checked the logs on the central nova-network and gateway node and found
that dnsmasq still replies to requests from all the other instances. It even
received the request from the instance in question and sent an OFFER, as far
as I can tell so far (I'm investigating and will post logs ASAP). But while
dnsmasq appears to send an offer, the instance says it didn't receive
one - wtf?

Please tell me what I can do to actually *fix* this issue, since it is
really fatal for us.

One option I see (as a workaround) is to let newly created instances
retrieve their IP via DHCP, but then reconfigure /etc/network/interfaces to
continue with a static networking setup. However, I'd rather just get the
DHCP thingy fixed.

I'm very open to any kind of helpful comments :)

So long,
Christian.
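
As an illustration of the static-networking workaround mentioned above, here is a minimal Python sketch (an editorial addition, not from the original mail) that renders a static stanza in the Debian/Ubuntu /etc/network/interfaces format. The address 10.10.40.16 appears in this thread; the /24 prefix, the gateway 10.10.40.1, the interface name eth0 and the function name are assumptions for the example.

```python
import ipaddress


def render_static_stanza(iface, cidr, gateway=None):
    """Render an ifupdown 'inet static' stanza for one interface."""
    interface = ipaddress.ip_interface(cidr)
    lines = [
        f"auto {iface}",
        f"iface {iface} inet static",
        f"    address {interface.ip}",
        f"    netmask {interface.netmask}",
    ]
    if gateway is not None:
        gw = ipaddress.ip_address(gateway)
        # Guard against an obviously wrong gateway before writing config.
        if gw not in interface.network:
            raise ValueError(f"gateway {gw} is outside {interface.network}")
        lines.append(f"    gateway {gw}")
    return "\n".join(lines) + "\n"


# Prefix length and gateway below are assumed values, not from the thread.
stanza = render_static_stanza("eth0", "10.10.40.16/24", gateway="10.10.40.1")
print(stanza, end="")
```

The sketch only produces the text. Writing it into the instance and restarting networking is left out on purpose, since doing that wrongly can cut off access to the VM, and the fixed address would also have to stay in sync with what nova has allocated.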
