Public bug reported:

Cloud administrators often run external monitoring systems that watch different types of resources in their cloud infrastructure. A cloud infrastructure comprises not only an OpenStack instance but also other components that are not managed by, and possibly not visible to, OpenStack, such as SDN controllers and physical network elements.
External systems may detect a fault in one or more infrastructure resources that may subsequently affect services provided by OpenStack. From a network perspective, one example of such a fault is openvswitch crashing on a compute node. With the reference implementation (ovs + neutron-l2-agent), neutron-l2-agent continues to report its state to the Neutron server as alive (the heartbeat is present, so the service is up), even though it has an internal error because it cannot reach the virtual bridge (br-int). Using tools external to OpenStack that monitor openvswitch, the administrator knows something is wrong and, as a fault management action, may want to explicitly set the agent state to down. This requires a new API exposed by Neutron that allows admins to set the aliveness state (true/false) of Neutron agents.

This feature request is in line with the work proposed for Nova [1] and implemented in Liberty. The same is currently being proposed for Cinder [2].

[1] https://blueprints.launchpad.net/nova/+spec/mark-host-down
[2] https://blueprints.launchpad.net/cinder/+spec/mark-services-down

** Affects: neutron
     Importance: Undecided
     Assignee: Carlos Goncalves (cgoncalves)
         Status: New

** Tags: rfe

** Changed in: neutron
     Assignee: (unassigned) => Carlos Goncalves (cgoncalves)

You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1513144

Title:
  Allow admin to mark agents down

Status in neutron:
  New
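To make the request concrete, here is a minimal sketch of what an external monitoring tool might send if Neutron exposed such an API. It only builds the HTTP request and does not send it. The `/v2.0/agents/{agent_id}` path is the existing Neutron agent resource, but a writable `alive` field is an assumption: it is the capability this report asks for, not something Neutron accepts today. The function name, endpoint URL, agent ID, and token below are hypothetical placeholders.

```python
import json
import urllib.request


def build_set_agent_alive_request(neutron_url, agent_id, token, alive):
    """Build (but do not send) a PUT request that sets an agent's aliveness.

    ASSUMPTION: a writable "alive" field is hypothetical. It illustrates
    the API requested in this report and is not an existing Neutron feature.
    """
    body = json.dumps({"agent": {"alive": alive}}).encode("utf-8")
    return urllib.request.Request(
        url="{}/v2.0/agents/{}".format(neutron_url.rstrip("/"), agent_id),
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,
        },
    )


# Example: a monitor that saw openvswitch crash marks the L2 agent down.
# The URL, agent ID, and token are placeholders.
req = build_set_agent_alive_request(
    "http://controller:9696", "AGENT_ID", "ADMIN_TOKEN", alive=False
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Keeping the call on the existing agent resource, rather than adding a separate endpoint, is only one possible design. The Nova blueprint referenced in [1] may be a useful comparison point for the final shape.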