Re: [Openstack-operators] RHEL 7 / CentOS 7 instances losing their network gateway

2015-01-28 Thread Joe Topjian
I'm pretty sure I've resolved this issue. Since it seems to happen
randomly, it might just be a coincidence, but this is by far the longest
it has gone without recurring. :)

I noticed that CentOS 7 and RHEL 7 set a `valid_lft` and `preferred_lft`
timeout on the IPv4 address. You can see this by running `ip a` on
CentOS 7/RHEL 7 and comparing the output with either CentOS 6 or Ubuntu.
This is the first time I've seen these lifetimes used on IPv4; they're
usually used for IPv6 privacy addresses. The timeout is set to something
larger than the lease renewal time.
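
To compare distros quickly, the lifetimes can be pulled out of the `ip`
output with a few lines of Python. This is only a rough sketch: the sample
text, the address 10.0.0.5/24 and the interface name eth0 are invented for
illustration, and the function names are hypothetical.

```python
import re

# Hypothetical sample of `ip -4 addr show dev eth0` output. The address,
# interface name and lifetimes are made up for illustration.
SAMPLE = """\
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 10.0.0.5/24 brd 10.0.0.255 scope global dynamic eth0
       valid_lft 86313sec preferred_lft 86313sec
"""

ADDR_RE = re.compile(r"^\s*inet\s+(\S+)")
LFT_RE = re.compile(r"valid_lft\s+(\S+)\s+preferred_lft\s+(\S+)")


def _to_seconds(value):
    """Convert '86313sec' to 86313; return None for 'forever'."""
    if value == "forever":
        return None
    return int(value[:-3]) if value.endswith("sec") else int(value)


def parse_lifetimes(text):
    """Return a list of (address, valid_lft, preferred_lft) tuples.

    Lifetimes are integers in seconds, or None when reported as 'forever'.
    """
    results = []
    current = None
    for line in text.splitlines():
        addr = ADDR_RE.match(line)
        if addr:
            current = addr.group(1)
            continue
        lft = LFT_RE.search(line)
        if lft and current is not None:
            results.append(
                (current, _to_seconds(lft.group(1)), _to_seconds(lft.group(2)))
            )
            current = None
    return results


# Addresses with a finite valid_lft are the ones that can expire.
finite = [entry for entry in parse_lifetimes(SAMPLE) if entry[1] is not None]
```

On an instance, the text could come from running `ip -4 addr show`; an
address that reports "forever" for both values would not be affected by
the expiry described here.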

What happens, though, is that the DHCP renewal occasionally takes a little
longer to arrive. Then the `valid_lft` hits zero and the IP is removed from
the interface. When this happens, the kernel cleans up any routes that
depend on the removed IP (in this case, the default gateway).

A few seconds later, the late DHCP renewal is finally received and the IP
is added back to the interface. But due to how CentOS 7/RHEL 7 handles the
renewal in /usr/sbin/dhclient-script, the gateway is never re-added.
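
The sequence above can be written down as a toy timeline. This is a sketch
of the race as I understand it, not a model of what dhclient-script really
does internally: the function name and the numbers in the usage example are
invented, and the "route not restored" outcome simply encodes the behaviour
observed on these instances.

```python
def simulate_renewal(valid_lft, renewal_arrives_at):
    """Model one lease cycle; times are seconds since the last renewal.

    valid_lft: address lifetime set on the interface at the last renewal.
    renewal_arrives_at: when the next DHCP renewal is actually processed.
    Returns a chronological list of (time, event) tuples.
    """
    events = []
    if renewal_arrives_at < valid_lft:
        # Normal case: the lifetime is refreshed before it runs out.
        events.append(
            (renewal_arrives_at, "renewal received; address and default route kept")
        )
    else:
        # Late renewal: the address expires first and takes the route with it.
        events.append(
            (valid_lft, "valid_lft hit zero; address removed, default route dropped")
        )
        events.append(
            (renewal_arrives_at, "late renewal; address re-added, default route missing")
        )
    return events


# Hypothetical numbers: the renewal arrives 5 seconds after the address expired.
for when, what in simulate_renewal(valid_lft=120, renewal_arrives_at=125):
    print(when, what)
```

The point of the sketch is that only the ordering matters: as long as the
renewal is processed before `valid_lft` reaches zero, nothing is lost.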

My guess as to why a newer version of dnsmasq does not exhibit this issue
is that it advertises renewals a little differently: enough to trigger the
part of dhclient-script that re-adds the gateway. I have not verified this
theory, though.

What I've done for now is modify dhclient-script to remove every portion
that sets a `valid_lft` and `preferred_lft`, so they are now set to forever,
just like on other distros.

And so far, so good (crossing fingers).

Thanks,
Joe

On Tue, Jan 27, 2015 at 1:53 PM, Joe Topjian j...@topjian.net wrote:

 Hi George,

 All instances have only a single interface.

 Thanks,
 Joe

 On Tue, Jan 27, 2015 at 1:38 PM, George Shuklin george.shuk...@gmail.com
 wrote:

  How many network interfaces have your instance? If more than one - check
 settings for second network (subnet). It can have own dhcp settings which
 may mess up with routes for the main network.


 On 01/27/2015 06:08 PM, Joe Topjian wrote:

 Hello,

  I have run into two different OpenStack clouds where instances running
 either RHEL 7 or CentOS 7 images are randomly losing their network gateway.

  There's nothing in the logs that show any indication of why. There's no
 DHCP hiccup or anything like that. The gateway has just disappeared.

  If I log into the instance via another instance (so on the same subnet
 since there's no gateway), I can manually re-add the gateway and everything
 works... until it loses it again.

  One cloud is running Havana and the other is running Icehouse. Both are
 using nova-network and both are Ubuntu 12.04.

  On the Havana cloud, we decided to install the dnsmasq package from
 Ubuntu 14.04. This looks to have resolved the issue as this was back in
 November and I haven't heard an update since.

  However, we don't want to do that just yet on the Icehouse cloud. We'd
 like to understand exactly why this is happening and why updating dnsmasq
 resolves an issue that only one specific type of image is having.

  I can make my way around CentOS, but I'm not as familiar with it as I
 am with Ubuntu (especially CentOS 7). Does anyone know what change in
 RHEL7/CentOS7 might be causing this? Or does anyone have any other ideas on
 how to troubleshoot the issue?

  I currently have access to two instances in this state, so I'd be happy
 to act as remote hands and eyes. :)

  Thanks,
 Joe


 ___
 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators





Re: [Openstack-operators] RHEL 7 / CentOS 7 instances losing their network gateway

2015-01-27 Thread Joe Topjian
Thanks, Kris. I'm going to see if there are any oddities between the version
of dnsmasq packaged with 12.04/Icehouse and systemd-dhcp.

On Tue, Jan 27, 2015 at 9:25 AM, Kris G. Lindgren klindg...@godaddy.com
wrote:

  I can't help as we use config-drive to set networking and are just
 starting to roll out Cent7 vm's.  However, a huge change from Cent6 to
 Cent7 was the switch from upstart/dhclient to systemd/systemd-dhcp.
  

 Kris Lindgren
 Senior Linux Systems Engineer
 GoDaddy, LLC.



   From: Joe Topjian j...@topjian.net
 Date: Tuesday, January 27, 2015 at 9:08 AM
 To: openstack-operators@lists.openstack.org
 Subject: [Openstack-operators] RHEL 7 / CentOS 7 instances losing their
 network gateway

   Hello,

  I have run into two different OpenStack clouds where instances running
 either RHEL 7 or CentOS 7 images are randomly losing their network gateway.

  There's nothing in the logs that show any indication of why. There's no
 DHCP hiccup or anything like that. The gateway has just disappeared.

  If I log into the instance via another instance (so on the same subnet
 since there's no gateway), I can manually re-add the gateway and everything
 works... until it loses it again.

  One cloud is running Havana and the other is running Icehouse. Both are
 using nova-network and both are Ubuntu 12.04.

  On the Havana cloud, we decided to install the dnsmasq package from
 Ubuntu 14.04. This looks to have resolved the issue as this was back in
 November and I haven't heard an update since.

  However, we don't want to do that just yet on the Icehouse cloud. We'd
 like to understand exactly why this is happening and why updating dnsmasq
 resolves an issue that only one specific type of image is having.

  I can make my way around CentOS, but I'm not as familiar with it as I am
 with Ubuntu (especially CentOS 7). Does anyone know what change in
 RHEL7/CentOS7 might be causing this? Or does anyone have any other ideas on
 how to troubleshoot the issue?

  I currently have access to two instances in this state, so I'd be happy
 to act as remote hands and eyes. :)

  Thanks,
 Joe



Re: [Openstack-operators] RHEL 7 / CentOS 7 instances losing their network gateway

2015-01-27 Thread George Shuklin
How many network interfaces does your instance have? If more than one, check
the settings for the second network (subnet). It can have its own DHCP
settings, which may interfere with the routes for the main network.


On 01/27/2015 06:08 PM, Joe Topjian wrote:

Hello,

I have run into two different OpenStack clouds where instances running 
either RHEL 7 or CentOS 7 images are randomly losing their network 
gateway.


There's nothing in the logs that show any indication of why. There's 
no DHCP hiccup or anything like that. The gateway has just disappeared.


If I log into the instance via another instance (so on the same subnet 
since there's no gateway), I can manually re-add the gateway and 
everything works... until it loses it again.


One cloud is running Havana and the other is running Icehouse. Both 
are using nova-network and both are Ubuntu 12.04.


On the Havana cloud, we decided to install the dnsmasq package from 
Ubuntu 14.04. This looks to have resolved the issue as this was back 
in November and I haven't heard an update since.


However, we don't want to do that just yet on the Icehouse cloud. We'd 
like to understand exactly why this is happening and why updating 
dnsmasq resolves an issue that only one specific type of image is having.


I can make my way around CentOS, but I'm not as familiar with it as I 
am with Ubuntu (especially CentOS 7). Does anyone know what change in 
RHEL7/CentOS7 might be causing this? Or does anyone have any other 
ideas on how to troubleshoot the issue?


I currently have access to two instances in this state, so I'd be 
happy to act as remote hands and eyes. :)


Thanks,
Joe




Re: [Openstack-operators] RHEL 7 / CentOS 7 instances losing their network gateway

2015-01-27 Thread Joe Topjian
Hi George,

All instances have only a single interface.

Thanks,
Joe

On Tue, Jan 27, 2015 at 1:38 PM, George Shuklin george.shuk...@gmail.com
wrote:

  How many network interfaces have your instance? If more than one - check
 settings for second network (subnet). It can have own dhcp settings which
 may mess up with routes for the main network.


 On 01/27/2015 06:08 PM, Joe Topjian wrote:

 Hello,

  I have run into two different OpenStack clouds where instances running
 either RHEL 7 or CentOS 7 images are randomly losing their network gateway.

  There's nothing in the logs that show any indication of why. There's no
 DHCP hiccup or anything like that. The gateway has just disappeared.

  If I log into the instance via another instance (so on the same subnet
 since there's no gateway), I can manually re-add the gateway and everything
 works... until it loses it again.

  One cloud is running Havana and the other is running Icehouse. Both are
 using nova-network and both are Ubuntu 12.04.

  On the Havana cloud, we decided to install the dnsmasq package from
 Ubuntu 14.04. This looks to have resolved the issue as this was back in
 November and I haven't heard an update since.

  However, we don't want to do that just yet on the Icehouse cloud. We'd
 like to understand exactly why this is happening and why updating dnsmasq
 resolves an issue that only one specific type of image is having.

  I can make my way around CentOS, but I'm not as familiar with it as I am
 with Ubuntu (especially CentOS 7). Does anyone know what change in
 RHEL7/CentOS7 might be causing this? Or does anyone have any other ideas on
 how to troubleshoot the issue?

  I currently have access to two instances in this state, so I'd be happy
 to act as remote hands and eyes. :)

  Thanks,
 Joe




[Openstack-operators] RHEL 7 / CentOS 7 instances losing their network gateway

2015-01-27 Thread Joe Topjian
Hello,

I have run into two different OpenStack clouds where instances running
either RHEL 7 or CentOS 7 images are randomly losing their network gateway.

Nothing in the logs gives any indication of why. There's no DHCP hiccup or
anything like that. The gateway has just disappeared.

If I log into the instance via another instance (so on the same subnet
since there's no gateway), I can manually re-add the gateway and everything
works... until it loses it again.
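
For anyone wanting to spot affected instances, a check for the missing
default route could look roughly like this. It is only a sketch: the route
text, the gateway 10.0.0.1 and the interface eth0 are invented examples,
and the function name is hypothetical.

```python
def find_default_gateway(route_text):
    """Return the gateway from the first 'default via <gw>' line, or None."""
    for line in route_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "default" and fields[1] == "via":
            return fields[2]
    return None


# Hypothetical `ip -4 route show` output from a healthy instance.
HEALTHY = """\
default via 10.0.0.1 dev eth0
10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.5
"""

# The same instance after losing its gateway: only the subnet route remains.
BROKEN = """\
10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.5
"""

for name, text in (("healthy", HEALTHY), ("broken", BROKEN)):
    gateway = find_default_gateway(text)
    print(name, gateway if gateway else "no default route")
```

When the check comes back empty, the manual fix is something along the
lines of `ip route add default via <gateway> dev <interface>`, with the
values for that subnet filled in.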

One cloud is running Havana and the other is running Icehouse. Both use
nova-network and both run on Ubuntu 12.04.

On the Havana cloud, we decided to install the dnsmasq package from Ubuntu
14.04. This appears to have resolved the issue: that was back in November
and I haven't heard of a recurrence since.

However, we don't want to do that just yet on the Icehouse cloud. We'd like
to understand exactly why this is happening and why updating dnsmasq
resolves an issue that only one specific type of image is having.

I can make my way around CentOS, but I'm not as familiar with it as I am
with Ubuntu (especially CentOS 7). Does anyone know what change in
RHEL7/CentOS7 might be causing this? Or does anyone have any other ideas on
how to troubleshoot the issue?

I currently have access to two instances in this state, so I'd be happy to
act as remote hands and eyes. :)

Thanks,
Joe


Re: [Openstack-operators] RHEL 7 / CentOS 7 instances losing their network gateway

2015-01-27 Thread Kris G. Lindgren
I can't help, as we use config-drive to set networking and are just starting
to roll out Cent7 VMs. However, a huge change from Cent6 to Cent7 was the
switch from upstart/dhclient to systemd/systemd-dhcp.


Kris Lindgren
Senior Linux Systems Engineer
GoDaddy, LLC.



From: Joe Topjian j...@topjian.net
Date: Tuesday, January 27, 2015 at 9:08 AM
To: openstack-operators@lists.openstack.org
Subject: [Openstack-operators] RHEL 7 / CentOS 7 instances losing their network 
gateway

Hello,

I have run into two different OpenStack clouds where instances running either 
RHEL 7 or CentOS 7 images are randomly losing their network gateway.

There's nothing in the logs that show any indication of why. There's no DHCP 
hiccup or anything like that. The gateway has just disappeared.

If I log into the instance via another instance (so on the same subnet since 
there's no gateway), I can manually re-add the gateway and everything works... 
until it loses it again.

One cloud is running Havana and the other is running Icehouse. Both are using 
nova-network and both are Ubuntu 12.04.

On the Havana cloud, we decided to install the dnsmasq package from Ubuntu 
14.04. This looks to have resolved the issue as this was back in November and I 
haven't heard an update since.

However, we don't want to do that just yet on the Icehouse cloud. We'd like to 
understand exactly why this is happening and why updating dnsmasq resolves an 
issue that only one specific type of image is having.

I can make my way around CentOS, but I'm not as familiar with it as I am with 
Ubuntu (especially CentOS 7). Does anyone know what change in RHEL7/CentOS7 
might be causing this? Or does anyone have any other ideas on how to 
troubleshoot the issue?

I currently have access to two instances in this state, so I'd be happy to act 
as remote hands and eyes. :)

Thanks,
Joe