> -----Original Message-----
> From: Aaron Conole [mailto:acon...@redhat.com]
> Sent: Monday, April 9, 2018 4:32 PM
> To: Mooney, Sean K <sean.k.moo...@intel.com>
> Cc: d...@openvswitch.org; Stokes, Ian <ian.sto...@intel.com>; Kevin
> Traynor <ktray...@redhat.com>; Ilya Maximets <i.maxim...@samsung.com>;
> Loftus, Ciara <ciara.lof...@intel.com>; Terry Wilson
> <twil...@redhat.com>; Assaf Muller <as...@redhat.com>
> Subject: Re: [RFC 0/2] dpdk: minor refactor of the initialization step
> 
> "Mooney, Sean K" <sean.k.moo...@intel.com> writes:
> 
> > So just from a deployment tools point of view I would like to point
> > out that this change could break existing workflows that deploy ovs in
> > a docker container.
> > Kolla ansible assumes that if the docker ovs_vswitchd container is
> > still running then it is in fact running in dpdk mode when we set
> > dpdk-init=true.
> 
> Is there a way to test this out and see the behavior?
[Mooney, Sean K] Well, you could use kolla to deploy ovs-dpdk :)
When I wrote the code I relied on the existing behavior.
When kolla-ansible deploys openstack, we first deploy the ovsdb.
https://github.com/openstack/kolla-ansible/blob/4c39ea7eccd946775722646eec19f9ea5cbe6eb5/ansible/roles/ovs-dpdk/handlers/main.yml#L2-L37
Then we start the ovs-vswitchd container
https://github.com/openstack/kolla-ansible/blob/4c39ea7eccd946775722646eec19f9ea5cbe6eb5/ansible/roles/ovs-dpdk/handlers/main.yml#L55-L73
Finally we configure the bridges and physical interfaces.
https://github.com/openstack/kolla-ansible/blob/4c39ea7eccd946775722646eec19f9ea5cbe6eb5/ansible/roles/ovs-dpdk/handlers/main.yml#L75-L90

The "- name: Ensuring ovsdpdk bridges are properly setup named" task does not
use --no-wait when creating bridges and adding interfaces, so it
will fail if the vswitchd is not running. This results in ansible not running
any further tasks on that node and reporting the error
to the user. If for some reason the ensure-bridge task passed, the next task,
which checks that an IP is assigned to the ovs bridge, would fail.
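
As a rough illustration of that ordering (a Python sketch, not the actual
kolla-ansible handlers; `run_steps`, `deploy` and the step names are
hypothetical), each step runs only if the previous one succeeded, so a
vswitchd that exits on a failed dpdk init stops the play before any bridge
is configured:

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_steps(steps: List[Step]) -> List[str]:
    """Run steps in order; stop at the first one that reports failure.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, action in steps:
        if not action():
            break  # like Ansible: no further tasks run on this node
        completed.append(name)
    return completed


def deploy(vswitchd_running: bool) -> List[str]:
    # Starting the container succeeds either way in this model; what
    # matters is whether the daemon inside is still running afterwards.
    # Bridge setup waits on ovs-vswitchd (no --no-wait), so it can only
    # succeed while the daemon is up.
    steps: List[Step] = [
        ("start ovsdb", lambda: True),
        ("start ovs-vswitchd", lambda: True),
        ("set up bridges", lambda: vswitchd_running),
        ("check bridge ip", lambda: vswitchd_running),
    ]
    return run_steps(steps)
```

With `deploy(False)` the run stops after the container start, which is the
failure the deployment tooling relies on seeing.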
> 
> It does seem strange that for a possible configuration error we abort()
[Mooney, Sean K] Why? I would expect this to be standard behavior for any daemon,
i.e. the daemon validates that its config is correct and exits if it is invalid.
If we don't abort, the vswitchd is in an undefined state. For example, is it
still using hugepages if the EAL init fails after they are allocated?
> running the vswitchd (and with --monitor set, it will continue to
> abort() over and over - so I guess you're also not using the monitor
> thread?).  In the case that an abort does happen, does the Kolla script
> distinguish between issues where dpdk setup failed vs. some other
> software issue?
> 
> > Can I request that if you make this change you add something along the
> > lines of dpdk-init-is-fatal=true/false so that we can explicitly say
> > which behavior we want.
> > I would not be surprised if people have built monitoring around "is
> > the ovs-vswitchd running"
> 
> I think they have, but I don't know that they use it to infer such low-
> level details (meaning a crash implies that dpdk configuration is
> wrong).
[Mooney, Sean K] They don't use it to infer that the dpdk configuration is wrong,
but rather that some configuration was wrong. A dpdk-init failure is currently
fatal, so it was treated the same as any other error that would have
caused the vswitchd process to exit. I believe the opnfv community used
the liveness of the vswitchd process, and in the future the dpdk keepalive
functionality, to set the dataplane status field in neutron for the host.
This allows openstack to be aware of dataplane outages.
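
A minimal sketch of that kind of liveness check, in Python and POSIX only.
It is not the opnfv implementation, which is not shown in this thread; the
`dataplane_status` helper and its "ACTIVE"/"DOWN" values are hypothetical:

```python
import os


def process_alive(pid: int) -> bool:
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0: existence check, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def dataplane_status(vswitchd_alive: bool) -> str:
    # Hypothetical status values. "Process is up" is read as "dataplane
    # is up" only because a failed dpdk init is fatal today.
    return "ACTIVE" if vswitchd_alive else "DOWN"
```

If the vswitchd kept running after a failed dpdk init, a check like this
would report the dataplane as up when it is not.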
> 
> > to infer, at least at a high level, that "everything is fine", whereas
> > the log message/db field proposed here will invalidate that.
> 
> I've added Assaf Muller from our OpenStack team as well - maybe he
> has some additional details on those mechanisms outside of Kolla (maybe
> it exists in some kind of director / other software too, as you point
> out).
> 
> > It would be easy to check that field, but it's work that needs to be
> > done in multiple places.
> 
> I think such a knob wouldn't be useful.  I believe it would either have
> to be defaulted to 'dpdk-init-is-fatal=true' to abort on failure (which
> most users would want to change making it an undesirable default)
[Mooney, Sean K] I would argue against that. I would never deploy with
dpdk-init-is-fatal=false. If your dataplane does not start, what is the point
of running ovs at all? It will not be able to forward packets.
, or
> the Kolla ansible scripts (and other detection mechanisms for dpdk
> failure - if they exist) would need to change.  Maybe there's another
> approach, though?
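
To make the two behaviors under discussion concrete, here is a small Python
model of the decision. It is only a sketch: dpdk-init-is-fatal is a proposal
in this thread, not an existing option, and `on_dpdk_init` and `Outcome` are
hypothetical names, not ovs code.

```python
from enum import Enum


class Outcome(Enum):
    RUNNING_WITH_DPDK = "running with dpdk"
    RUNNING_WITHOUT_DPDK = "running, dpdk not initialized"
    EXIT = "exit with error"


def on_dpdk_init(init_ok: bool, init_is_fatal: bool = True) -> Outcome:
    """Model what the daemon does once dpdk init has succeeded or failed.

    init_is_fatal stands in for the proposed dpdk-init-is-fatal option;
    it defaults to True to match the current abort-on-failure behavior.
    """
    if init_ok:
        return Outcome.RUNNING_WITH_DPDK
    if init_is_fatal:
        return Outcome.EXIT
    return Outcome.RUNNING_WITHOUT_DPDK
```

Only the `init_ok=False, init_is_fatal=False` case differs from today: the
process stays up, so anything that watches process liveness needs to read
the status from somewhere else.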
> 
> >> -----Original Message-----
> >> From: Aaron Conole [mailto:acon...@redhat.com]
> >> Sent: Thursday, April 5, 2018 10:23 PM
> >> To: d...@openvswitch.org
> >> Cc: Stokes, Ian <ian.sto...@intel.com>; Kevin Traynor
> >> <ktray...@redhat.com>; Ilya Maximets <i.maxim...@samsung.com>;
> >> Loftus, Ciara <ciara.lof...@intel.com>; Mooney, Sean K
> >> <sean.k.moo...@intel.com>; Terry Wilson <twil...@redhat.com>
> >> Subject: [RFC 0/2] dpdk: minor refactor of the initialization step
> >>
> >> Sometimes, DPDK initialization can fail, but ovs-vswitchd will abort
> >> in that case.  When that occurs, ovs-vswitchd will be restarted by
> >> the monitor and immediately abort.  This is rather unfriendly to
> >> users, who would prefer to possibly correct the issue or at least,
> >> not have lots of processes continually spawning.
> >>
> >> This series accepts that rte_eal_init() can and does fail for real.
> >> It reflects the initialization status in the database, as well as
> >> adding the DPDK version (where appropriate).
> >>
> >> Submitted as RFC to spawn discussion around the type to reflect for
> >> the initialized information.  Presented here as a boolean - however,
> >> it might be more interesting to be a 'string' and have more elaborate
> >> details (ex: 'failed - ovs_strerror(rte_errno)' or 'uninitialized' or
> >> 'initialized').
> >>
> >> Aaron Conole (2):
> >>   dpdk: allow init to fail
> >>   dpdk: reflect status and version in the database
> >>
> >>  lib/dpdk-stub.c            | 10 ++++++++++
> >>  lib/dpdk.c                 | 31 +++++++++++++++++++++++++------
> >>  lib/dpdk.h                 |  3 ++-
> >>  vswitchd/bridge.c          |  5 +++++
> >>  vswitchd/vswitch.ovsschema | 11 ++++++++---
> >>  vswitchd/vswitch.xml       | 11 +++++++++++
> >>  6 files changed, 61 insertions(+), 10 deletions(-)
> >>
> >> --
> >> 2.14.3
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
