Re: [openstack-dev] [neutron] high dhcp lease times in neutron deployments considered harmful (or not???)

2015-01-28 Thread Chuck Carlino

On 01/28/2015 12:51 PM, Kevin Benton wrote:


If we are going to ignore the IP address changing use-case, can we 
just make the default infinity? Then nobody ever has to worry about 
control plane outages for existing clients. 24 hours is way too long to 
be useful anyway.




Why would users want to change an active port's IP address anyway? I can 
see possible use in changing an inactive port's IP address, but that 
wouldn't cause the dhcp issues mentioned here.  I worry about setting a 
default config value to handle a very unusual use case.


Chuck


On Jan 28, 2015 12:44 PM, Salvatore Orlando sorla...@nicira.com wrote:




On 28 January 2015 at 20:19, Brian Haley brian.ha...@hp.com wrote:

Hi Kevin,

On 01/28/2015 03:50 AM, Kevin Benton wrote:
 Hi,

 Approximately a year and a half ago, the default DHCP lease time in Neutron was
 increased from 120 seconds to 86400 seconds.[1] This was done with the goal of
 reducing DHCP traffic with very little discussion (based on what I can see in
 the review and bug report). While it does indeed reduce DHCP traffic, I don't
 think any bug reports were filed showing that a 120 second lease time resulted
 in too much traffic or that a jump all of the way to 86400 seconds was required
 instead of a value in the same order of magnitude.

 Why does this matter?

 Neutron ports can be updated with a new IP address from the same subnet or
 another subnet on the same network. The port update will result in anti-spoofing
 iptables rule changes that immediately stop the old IP address from working on
 the host. This means the host is unreachable for 0-12 hours based on the current
 default lease time without manual intervention[2] (assuming half-lease length
 DHCP renewal attempts).
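To make the 0-12 hour figure concrete, here is a minimal sketch of the arithmetic. It assumes, as the message does, that the client first tries to renew at half the lease time and only learns its new address on that renewal. The function name is illustrative and is not part of Neutron.

```python
def worst_case_outage_seconds(lease_seconds, renewal_fraction=0.5):
    """Longest wait until the next renewal attempt after the IP changes.

    Assumes the client renews at a fixed fraction of the lease and picks
    up its new address on that renewal.
    """
    if lease_seconds <= 0:
        raise ValueError("lease_seconds must be positive")
    return lease_seconds * renewal_fraction


# The current default of 86400 seconds gives a worst case of 12 hours.
print(worst_case_outage_seconds(86400) / 3600)
```

The lower bound of the range is zero because the address change may land just before a renewal was already due.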

So I'll first comment on the problem.  You're essentially pulling the rug out
from under these VMs by changing their IP (and that of their router and DHCP/DNS
server), but you expect they should fail quickly and come right back online.  In
a non-Neutron environment wouldn't the IT person that did this need some pretty
good heat-resistant pants for all the flames from pissed-off users?  Sure, the
guy on his laptop will just bounce the connection, but servers (aka VMs) should
stay pretty static.  VMs are servers (and cows according to some).


I actually expect this kind of operation to not be one Neutron users
will do very often, mostly because regardless of whether you're in
the cloud or not, you'd still need to wear those heat-resistant pants.


The correct solution is to be able to renumber the network so there is no issue
with the anti-spoofing rules dropping packets, or the VMs having an unreachable
IP address, but that's a much bigger nut to crack.


Indeed. In my opinion the update IP operation sets false
expectations in users. I have considered disallowing PUT on
fixed_ips in the past but that did not go ahead because there were
users leveraging it.


 Why is this on the mailing list?

 In an attempt to make the VMs usable in a much shorter timeframe following a
 Neutron port address change, I submitted a patch to reduce the default DHCP
 lease time to 8 minutes.[3] However, this was upsetting to several people,[4] so
 it was suggested I bring this discussion to the mailing list. The following are
 the high-level concerns followed by my responses:

   * 8 minutes is arbitrary
       o Yes, but it's no more arbitrary than 1440 minutes. I picked it as an
         interval because it is still 4 times larger than the last short value,
         but it still allows VMs to regain connectivity in 5 minutes in the
         event their IP is changed. If someone has a good suggestion for another
         interval based on known dnsmasq QPS limits or some other quantitative
         reason, please chime in here.

We run 48 hours as the default in our public cloud, and I did some digging to
remind myself of the multiple reasons:

1. Too much DHCP traffic.  Sure, only that initial request is broadcast, but
dnsmasq is very verbose and loves writing to syslog for everything it does -
less is more.  Do a scale test with 10K VMs and you'll quickly find out a large
portion of traffic is DHCP RENEWs, and syslog is huge.
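A rough back-of-the-envelope sketch of the renewal load being described, assuming every client renews once per half lease and that renewals are spread evenly over time. Both are simplifications, and the function name is made up for illustration.

```python
def renewals_per_second(num_clients, lease_seconds, renewal_fraction=0.5):
    """Average steady-state DHCP renewal rate across all clients."""
    renewal_interval = lease_seconds * renewal_fraction
    return num_clients / renewal_interval


# Lease times mentioned in this thread: 120 s, 8 min, 24 h, 48 h.
for lease in (120, 480, 86400, 172800):
    rate = renewals_per_second(10_000, lease)
    print(f"lease {lease:>6}s -> {rate:8.3f} renewals/s")
```

Under these assumptions 10K VMs generate on the order of a hundred or more renewals per second at a 120 second lease, and well under one per second at 24 or 48 hours. Each renewal also produces syslog output from dnsmasq.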


This is correct, and something I overlooked in 

Re: [openstack-dev] [TripleO] nominating James Polley for tripleo-core

2015-01-15 Thread Chuck Carlino

On 01/15/2015 08:49 AM, Alexis Lee wrote:

Clint Byrum said on Wed, Jan 14, 2015 at 10:14:45AM -0800:

holidays. However, I believe James has demonstrated superb review skills
and a commitment to the project that shows broad awareness of the
project.

Big +1. Thanks for taking the time to meta-review, Clint.


Alexis


I don't get a vote, but just wanted to point out James' excellent 
contributions in chasing down neutron issues.


Hmm, now that I've said it, I'm not entirely certain he'd have wanted me 
to :P


ChuckC

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [neutron] - the setup of a DHCP sub-group

2014-12-05 Thread Chuck Carlino

On 11/26/2014 08:55 PM, Don Kehn wrote:
Sure, will try and get to it over the holiday, do you have a link to 
the spec repo?



Hi Don,

Has there been any progress on a DHCP sub-group?

Regards,
Chuck


On Mon, Nov 24, 2014 at 3:27 PM, Carl Baldwin c...@ecbaldwin.net wrote:


Don,

Could the spec linked to your BP be moved to the specs repository?
I'm hesitant to start reading it as a google doc when I know I'm going
to want to make comments and ask questions.

Carl

On Thu, Nov 13, 2014 at 9:19 AM, Don Kehn dek...@gmail.com wrote:
 If this shows up twice, sorry for the repeat:

 Armando, Carl:
 During the Summit, Armando and I had a very quick conversation concerning a
 blueprint that I submitted,
 https://blueprints.launchpad.net/neutron/+spec/dhcp-cpnr-integration and
 Armando had mentioned the possibility of getting together a sub-group tasked
 with DHCP Neutron concerns. I have talked with the Infoblox folks (see
 https://blueprints.launchpad.net/neutron/+spec/neutron-ipam), and everyone
 seems to be in agreement that there is synergy, especially concerning the
 development of a relay and potentially looking into how DHCP is handled. In
 addition, during the Friday meetup session on DHCP that I gave, there seemed
 to be some general interest from some of the operators as well.

 So what would be the formality in going forth to start a sub-group and
 getting this underway?

 DeKehn



 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev






--

Don Kehn
303-442-0060




Re: [openstack-dev] [ironic] the possible use of dhcp client id

2014-11-17 Thread Chuck Carlino

On 11/17/2014 12:43 PM, Devananda van der Veen wrote:

Thanks for the reply!

On Wed Nov 12 2014 at 2:41:27 PM Chuck Carlino chuckjcarl...@gmail.com wrote:


Hi,

I'm working on the neutron side of a couple of ironic issues, and
I need some help.  Here are the issues.

 1. If a nic on an ironic server fails and is replaced by a nic
with a different mac address, neutron's dhcp service will not
serve it the same ip address.  This can be worked around by
deleting the neutron port and creating a new one, but it
leaves a window wherein the ip address could be lost to an
unrelated port creation happening at the same time.
 2. While performing large deployments, a random nic failure can
cause the failure of the entire deploy.  The ability to retry
a failed boot with a different nic has been requested.

It has been proposed that both issues could be at least partially
addressed by adding the ability to use dhcp client id to neutron.
In this solution, the dhcp client is configured to use a dhcp
client id, and the server associates this client id (instead of
mac address) with the ip address.  Note that this idea just came
up today, so no code exists yet to try things out.

My questions:

For 1, the mac address of the neutron port will be left different
from the actual nic's mac address.  Is that a problem for ironic?
It makes me feel uneasy, and might confuse users, but that's all I
got.

I think that's a show-stopper, actually. Not just because it would be 
very confusing for operators to see a fake MAC in Nova and the real 
MAC in Ironic. Neutron's lack of knowledge of the physical MAC(s) 
would seem to prevent it performing physical switch configuration (via 
ml2 plugins) for those who choose to use Ironic in a multi-tenant 
environment (eg, OnMetal).


Good to know.


In general, does using dhcp client id present any issues for
booting an ironic server?  I've done a bit of web searching and
from a protocol perspective it looks feasible, but I don't get a
sense of whether it's a good general solution.

A few things come to mind:

- How does the instance know what DHCP client ID to include in its 
request, before it has an IP by which to contact the metadata service? 
It sounds like this feature would only work if Ironic has a pre-boot 
way to pass in data (eg, configdrive). Not all our drivers support 
that today.


So using dhcp client id may not be a general solution.



- Is it possible / desirable to group multiple NICs under a single 
DHCP client ID? If so, then again, it would seem like neutron would 
need to know the physical MACs. (I recall us chatting about port 
bonding at some point, but I'm not sure if these were related 
conversations.)


I'd rather not confuse the issue with any details around how bonding or 
link aggregation works, so let's just say that in case #2 above, the 
guest may or may not be bonding the interfaces.  Since bonding occurs 
after boot, the bonding itself is not pertinent.  But yes, all NICs 
through which network boot can be attempted must present the same 
dhcp_client_id for this solution to work.  I don't see the connection to 
neutron needing correct mac addresses, though, since the client id 
effectively replaces the mac address for ip address lookup.
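As a toy illustration of what "the client id effectively replaces the mac address for ip address lookup" could mean, here is a sketch of a reservation table that prefers the client id when one is presented. The class, the method names, and the sample values are all hypothetical. No such code exists in neutron yet.

```python
class LeaseTable:
    """Toy reservation table: client id wins over MAC when both are known."""

    def __init__(self):
        self._by_client_id = {}
        self._by_mac = {}

    def reserve(self, ip_address, mac=None, client_id=None):
        """Record a reservation keyed by client id if given, else by MAC."""
        if client_id is not None:
            self._by_client_id[client_id] = ip_address
        elif mac is not None:
            self._by_mac[mac.lower()] = ip_address
        else:
            raise ValueError("need a mac or a client_id")

    def lookup(self, mac, client_id=None):
        """Return the reserved IP for a request, or None if there is none."""
        if client_id is not None and client_id in self._by_client_id:
            return self._by_client_id[client_id]
        return self._by_mac.get(mac.lower())


table = LeaseTable()
table.reserve("10.0.0.5", client_id="node-17")
# Two different NICs presenting the same client id get the same address.
first = table.lookup("52:54:00:aa:bb:01", client_id="node-17")
second = table.lookup("52:54:00:aa:bb:02", client_id="node-17")
print(first, second)
```

Note that the lookup never consults the MAC once a known client id is presented, which is exactly why the spoofing question below matters.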




- What prevents some other server from spoofing the DHCP client ID in 
a multi-tenant environment? Again, folks using an ML2 plugin today are 
able to do MAC filtering on traffic at the switch. Removing knowledge 
of the node's physical MACs looks like it breaks this.


Googling around, it looks like spoofing can be addressed as in 
https://www.ietf.org/rfc/rfc3046.txt (needs a trusted component).  I 
agree that neutron needs the correct mac address here.


Thanks,
Chuck


If you have any off-the-top 'there's no chance that'll work' or
better things to try kind of feedback, it would be great to hear
it now since I'm about to start a POC to try it out.

Thanks,
Chuck



[openstack-dev] [ironic] the possible use of dhcp client id

2014-11-12 Thread Chuck Carlino

Hi,

I'm working on the neutron side of a couple of ironic issues, and I need 
some help.  Here are the issues.


1. If a nic on an ironic server fails and is replaced by a nic with a
   different mac address, neutron's dhcp service will not serve it the
   same ip address.  This can be worked around by deleting the neutron
   port and creating a new one, but it leaves a window wherein the ip
   address could be lost to an unrelated port creation happening at the
   same time.
2. While performing large deployments, a random nic failure can cause
   the failure of the entire deploy.  The ability to retry a failed
   boot with a different nic has been requested.

It has been proposed that both issues could be at least partially 
addressed by adding the ability to use dhcp client id to neutron. In 
this solution, the dhcp client is configured to use a dhcp client id, 
and the server associates this client id (instead of mac address) with 
the ip address.  Note that this idea just came up today, so no code 
exists yet to try things out.
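For a sense of what the server side might look like, dnsmasq's --dhcp-host option accepts an "id:<client_id>" field in place of a hardware address. The sketch below formats one such entry. The helper name, the field order, and the sample values are assumptions for illustration only and do not reflect how neutron writes its host entries.

```python
def dhcp_host_entry(ip_address, hostname, mac=None, client_id=None):
    """Build one line in dnsmasq's --dhcp-host syntax.

    With a client id the entry is keyed on "id:<client_id>" instead of the
    hardware address, so a replacement NIC would still match.
    """
    if client_id is not None:
        key = f"id:{client_id}"
    elif mac is not None:
        key = mac
    else:
        raise ValueError("need a mac or a client_id")
    return ",".join([key, hostname, ip_address])


# Keyed on the MAC address, then on a client id.
print(dhcp_host_entry("10.0.0.5", "host-10-0-0-5", mac="52:54:00:aa:bb:01"))
print(dhcp_host_entry("10.0.0.5", "host-10-0-0-5", client_id="node-17"))
```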


My questions:

For 1, the mac address of the neutron port will be left different from 
the actual nic's mac address.  Is that a problem for ironic? It makes me 
feel uneasy, and might confuse users, but that's all I got.


In general, does using dhcp client id present any issues for booting an 
ironic server?  I've done a bit of web searching and from a protocol 
perspective it looks feasible, but I don't get a sense of whether it's a 
good general solution.


If you have any off-the-top 'there's no chance that'll work' or better 
things to try kind of feedback, it would be great to hear it now since 
I'm about to start a POC to try it out.


Thanks,
Chuck



Re: [openstack-dev] [neutron] Lightning talks during the Design Summit!

2014-11-01 Thread Chuck Carlino

On 10/31/2014 01:01 PM, Ian Wells wrote:
Maruti's talk is, in fact, so interesting that we should probably get 
together and talk about this earlier in the week.  I very much want to 
see virtual-physical programmatic bridging, and I know Kevin Benton is 
also interested. Arguably the MPLS VPN stuff also is similar in 
scope.  Can I propose we have a meeting on cloud edge functionality?

--
Ian.



I am interested in these discussions as well.

Chuck Carlino


Re: [openstack-dev] [neutron] allow-mac-to-be-updated

2014-10-16 Thread Chuck Carlino

On 10/14/2014 05:19 AM, Gary Kotton wrote:

Hi,
I am really in favor of this. The implementation looks great! Nova can
surely benefit from this and we can make Neutron allocations at the API
level and save a ton of complexity at the compute level.
Kudos!
Thanks
Gary

On 10/13/14, 11:31 PM, Chuck Carlino chuckjcarl...@gmail.com wrote:


Hi,

Is anyone working on this blueprint[1]?  I have an implementation [2]
and would like to write up a spec.


Spec is now available for review.

https://review.openstack.org/129085



Thanks,
Chuck

[1] https://blueprints.launchpad.net/neutron/+spec/allow-mac-to-be-updated
[2] https://review.openstack.org/#/c/112129/



[openstack-dev] [neutron] allow-mac-to-be-updated

2014-10-13 Thread Chuck Carlino

Hi,

Is anyone working on this blueprint[1]?  I have an implementation [2] 
and would like to write up a spec.
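For readers unfamiliar with the blueprint, the idea is that a port update could carry a new mac_address. Below is a minimal sketch of what such a request might look like, assuming the usual Neutron v2.0 port update shape (PUT on /v2.0/ports/{port_id} with a "port" body). The helper name and the placeholder port id are hypothetical, and the sketch only builds the request without sending it.

```python
import json
import re

_MAC_RE = re.compile(r"^([0-9a-f]{2}:){5}[0-9a-f]{2}$")


def build_mac_update_request(port_id, new_mac):
    """Return (method, path, body) for a hypothetical port MAC update."""
    mac = new_mac.strip().lower()
    if not _MAC_RE.match(mac):
        raise ValueError(f"not a MAC address: {new_mac!r}")
    body = json.dumps({"port": {"mac_address": mac}})
    return "PUT", f"/v2.0/ports/{port_id}", body


method, path, body = build_mac_update_request("PORT_ID", "FA:16:3E:00:00:01")
print(method, path, body)
```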


Thanks,
Chuck

[1] https://blueprints.launchpad.net/neutron/+spec/allow-mac-to-be-updated
[2] https://review.openstack.org/#/c/112129/
