Re: [ovs-dev] [PATCH ovn v2 3/7] controller: Add load balancer hairpin OF flows.

2020-11-04 Thread Numan Siddique
On Fri, Oct 30, 2020 at 2:48 AM Dumitru Ceara  wrote:
>
> On 10/27/20 6:18 PM, num...@ovn.org wrote:
> > From: Numan Siddique 
> >
>
> Hi Numan,
>
> > Presently, to handle load balancer hairpin traffic (traffic destined to a
> > load balancer VIP that is DNATted back to the backend which originated it),
> > ovn-northd adds a lot of logical flows to detect this scenario. This patch
> > attempts to reduce these logical flows. Each ovn-controller will read the
> > load balancers from the newly added southbound Load_Balancer table and add
> > the load balancer hairpin OF flows in tables 68, 69 and 70. Suppose the
> > below load balancer is configured:
> >
> > 10.0.0.10:80 = 10.0.0.4:8080, 10.0.0.5:8090; then the below flows are added:
> >
> > table=68, ip.src=10.0.0.4,ip.dst=10.0.0.4,tcp.dst=8080 actions=load:1->NXM_NX_REG9[7]
> > table=68, ip.src=10.0.0.5,ip.dst=10.0.0.5,tcp.dst=8090 actions=load:1->NXM_NX_REG9[7]
> > table=69, ip.src=10.0.0.4,ip.dst=10.0.0.10,tcp.src=8080 actions=load:1->NXM_NX_REG9[7]
> > table=69, ip.src=10.0.0.5,ip.dst=10.0.0.10,tcp.src=8090 actions=load:1->NXM_NX_REG9[7]
>
> Shouldn't this be REG10[7]?

No. The same register is used for hairpinning the traffic and for the
reverse hairpin, since we have a different table for each.
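To make the two lookups concrete, here is a minimal sketch of the conditions the table 68 and table 69 flows encode (the function names and the plain-integer representation are illustrative, not the OVN code; one flag bit suffices because a packet is matched against only one of the two tables at a time):

```c
#include <stdbool.h>
#include <stdint.h>

/* Table 68: original direction.  After the VIP DNAT, a hairpinned packet
 * has the backend as both source and destination. */
static bool
is_hairpin(uint32_t nw_src, uint32_t nw_dst, uint16_t tp_dst,
           uint32_t backend_ip, uint16_t backend_port)
{
    return nw_src == backend_ip && nw_dst == backend_ip
           && tp_dst == backend_port;
}

/* Table 69: reply direction.  The backend answers towards the VIP, so the
 * reply must be recognized and un-hairpinned. */
static bool
is_hairpin_reply(uint32_t nw_src, uint32_t nw_dst, uint16_t tp_src,
                 uint32_t backend_ip, uint16_t backend_port, uint32_t vip)
{
    return nw_src == backend_ip && nw_dst == vip
           && tp_src == backend_port;
}
```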


>
> > table=70, ct.trk && ct.dnat && ct.nw_dst == 10.0.0.10 actions=ct(commit, zone=reg12, nat(src=10.0.0.5))
> >
> > An upcoming patch will add OVN actions which do the lookup in these
> > tables to handle the hairpin traffic.
> >
> > Signed-off-by: Numan Siddique 
> > ---
> >  controller/lflow.c   | 253 +++
> >  controller/lflow.h   |   6 +-
> >  controller/ovn-controller.c  |  27 +-
> >  include/ovn/logical-fields.h |   3 +
> >  tests/ovn.at | 469 +++
> >  5 files changed, 756 insertions(+), 2 deletions(-)
> >
> > diff --git a/controller/lflow.c b/controller/lflow.c
> > index f631679c3f..657482626d 100644
> > --- a/controller/lflow.c
> > +++ b/controller/lflow.c
> > @@ -26,6 +26,7 @@
> >  #include "ovn-controller.h"
> >  #include "ovn/actions.h"
> >  #include "ovn/expr.h"
> > +#include "lib/lb.h"
> >  #include "lib/ovn-l7.h"
> >  #include "lib/ovn-sb-idl.h"
> >  #include "lib/extend-table.h"
> > @@ -1138,6 +1139,213 @@ add_neighbor_flows(struct ovsdb_idl_index 
> > *sbrec_port_binding_by_name,
> >  }
> >  }
> >
> > +static void
> > +add_lb_vip_hairpin_flows(struct ovn_lb *lb, struct lb_vip *lb_vip,
> > +                         struct lb_vip_backend *lb_backend,
> > +                         uint8_t lb_proto,
> > +                         struct ovn_desired_flow_table *flow_table)
> > +{
> > +    uint64_t stub[1024 / 8];
> > +    struct ofpbuf ofpacts = OFPBUF_STUB_INITIALIZER(stub);
> > +
> > +    uint8_t value = 1;
> > +    put_load(&value, sizeof value, MFF_LOG_FLAGS,
> > +             MLF_LOOKUP_LB_HAIRPIN_BIT, 1, &ofpacts);
> > +
> > +    ovs_be32 vip4;
> > +    struct in6_addr vip6;
> > +
> > +    if (lb_vip->addr_family == AF_INET) {
> > +        ovs_assert(ip_parse(lb_vip->vip, &vip4));
> > +    } else {
> > +        ovs_assert(ipv6_parse(lb_vip->vip, &vip6));
> > +    }
> > +
> > +    struct match hairpin_match = MATCH_CATCHALL_INITIALIZER;
> > +    struct match hairpin_reply_match = MATCH_CATCHALL_INITIALIZER;
> > +
> > +    if (lb_vip->addr_family == AF_INET) {
> > +        ovs_be32 ip4;
> > +        ovs_assert(ip_parse(lb_backend->ip, &ip4));
>
> Because we share the same ovn_lb structure between NB and SB we're now forced
> to parse the backend IP twice:
>
> ovn_sb_lb_create()
> - ovn_lb_create()
> -- ip_address_and_port_from_key()
> --- this parses the string into a sockaddr_storage

The output of ip_address_and_port_from_key() is the ip address as a
char * and the port as a uint16_t.
We need to call ip_parse()/ipv6_parse() anyway, whether we have the
same 'struct lb_vip' for both NB and SB or different structures.

Otherwise we should not use ip_address_and_port_from_key() at all.

Also, I don't see any issues with using ip_parse() here.
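The double parse under discussion can be sketched as follows. This is an illustrative stand-in, not the actual ip_address_and_port_from_key() from lib/lb.c, and it handles IPv4 "ip:port" keys only: the key is first split into an IP string plus a port, and the IP string is later reparsed into a binary address.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* First parse: split "10.0.0.10:80" into "10.0.0.10" and 80.
 * Returns 0 on success, -1 on malformed input. */
static int
split_vip_key(const char *key, char *ip_out, size_t ip_len, uint16_t *port)
{
    const char *colon = strrchr(key, ':');
    if (!colon || (size_t) (colon - key) >= ip_len) {
        return -1;
    }
    memcpy(ip_out, key, colon - key);
    ip_out[colon - key] = '\0';
    *port = (uint16_t) strtoul(colon + 1, NULL, 10);
    return 0;
}

/* Second parse, as done later in the flow-generation path: turn the IP
 * string back into a binary (network byte order) address. */
static uint32_t
ip_string_to_be32(const char *ip)
{
    struct in_addr a;
    int rc = inet_pton(AF_INET, ip, &a);
    return rc == 1 ? a.s_addr : 0;
}
```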


> --- then builds an IP string and extracts the port
>
> Here we reparse the IP string into an ovs_be32.

>
> If the structs were independent for NB/SB I think we would potentially have
> simpler code and we would avoid reparsing IPs.
>
> Also, what if the SB DB gets corrupted, do we really want ovn-controller to 
> abort?

If the lb vip is invalid, then the lib/lb.c util function
ovn_lb_create() will not even create the 'struct lb_vip' instance.
The same applies to the backend ip.

So I think it's ok to assert here.
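The invariant being described, validate once at creation so a later re-parse can safely assert, can be sketched like this (the struct and helper names are hypothetical, not the lib/lb.c API):

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct example_vip {
    char ip[INET6_ADDRSTRLEN];
    int addr_family;
};

/* Creation path: reject invalid addresses up front, so no struct is ever
 * built from a malformed VIP string. */
static struct example_vip *
example_vip_create(const char *ip, int family)
{
    unsigned char buf[16];
    if (inet_pton(family, ip, buf) != 1) {
        return NULL;            /* invalid VIP: nothing is created */
    }
    struct example_vip *vip = malloc(sizeof *vip);
    strncpy(vip->ip, ip, sizeof vip->ip - 1);
    vip->ip[sizeof vip->ip - 1] = '\0';
    vip->addr_family = family;
    return vip;
}

/* Flow-generation path: the string was validated at creation, so the
 * re-parse is asserted rather than error-handled. */
static uint32_t
example_vip_to_be32(const struct example_vip *vip)
{
    struct in_addr a;
    int rc = inet_pton(AF_INET, vip->ip, &a);
    assert(rc == 1);
    return a.s_addr;
}
```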


>
> > +
> > +        match_set_dl_type(&hairpin_match, htons(ETH_TYPE_IP));
> > +        match_set_nw_src(&hairpin_match, ip4);
> > +        match_set_nw_dst(&hairpin_match, ip4);
> > +
> > +        match_set_dl_type(&hairpin_reply_match,
> > +            htons(ETH_TYPE_IP));
> > +        match_set_nw_src(&hairpin_reply_match, ip4);
> > +        match_set_nw_dst(&hairpin_reply_match, vip4);
> > +    } else {

Re: [ovs-dev] [PATCH ovn v2 3/7] controller: Add load balancer hairpin OF flows.

2020-10-29 Thread Dumitru Ceara
On 10/27/20 6:18 PM, num...@ovn.org wrote:
> From: Numan Siddique 
>

Hi Numan,

> Presently, to handle load balancer hairpin traffic (traffic destined to a
> load balancer VIP that is DNATted back to the backend which originated it),
> ovn-northd adds a lot of logical flows to detect this scenario. This patch
> attempts to reduce these logical flows. Each ovn-controller will read the
> load balancers from the newly added southbound Load_Balancer table and add
> the load balancer hairpin OF flows in tables 68, 69 and 70. Suppose the
> below load balancer is configured:
> 
> 10.0.0.10:80 = 10.0.0.4:8080, 10.0.0.5:8090; then the below flows are added:
> 
> table=68, ip.src=10.0.0.4,ip.dst=10.0.0.4,tcp.dst=8080 actions=load:1->NXM_NX_REG9[7]
> table=68, ip.src=10.0.0.5,ip.dst=10.0.0.5,tcp.dst=8090 actions=load:1->NXM_NX_REG9[7]
> table=69, ip.src=10.0.0.4,ip.dst=10.0.0.10,tcp.src=8080 actions=load:1->NXM_NX_REG9[7]
> table=69, ip.src=10.0.0.5,ip.dst=10.0.0.10,tcp.src=8090 actions=load:1->NXM_NX_REG9[7]

Shouldn't this be REG10[7]?

> table=70, ct.trk && ct.dnat && ct.nw_dst == 10.0.0.10 actions=ct(commit, zone=reg12, nat(src=10.0.0.5))
> 
> An upcoming patch will add OVN actions which do the lookup in these
> tables to handle the hairpin traffic.
> 
> Signed-off-by: Numan Siddique 
> ---
>  controller/lflow.c   | 253 +++
>  controller/lflow.h   |   6 +-
>  controller/ovn-controller.c  |  27 +-
>  include/ovn/logical-fields.h |   3 +
>  tests/ovn.at | 469 +++
>  5 files changed, 756 insertions(+), 2 deletions(-)
> 
> diff --git a/controller/lflow.c b/controller/lflow.c
> index f631679c3f..657482626d 100644
> --- a/controller/lflow.c
> +++ b/controller/lflow.c
> @@ -26,6 +26,7 @@
>  #include "ovn-controller.h"
>  #include "ovn/actions.h"
>  #include "ovn/expr.h"
> +#include "lib/lb.h"
>  #include "lib/ovn-l7.h"
>  #include "lib/ovn-sb-idl.h"
>  #include "lib/extend-table.h"
> @@ -1138,6 +1139,213 @@ add_neighbor_flows(struct ovsdb_idl_index 
> *sbrec_port_binding_by_name,
>  }
>  }
>  
> +static void
> +add_lb_vip_hairpin_flows(struct ovn_lb *lb, struct lb_vip *lb_vip,
> +                         struct lb_vip_backend *lb_backend,
> +                         uint8_t lb_proto,
> +                         struct ovn_desired_flow_table *flow_table)
> +{
> +    uint64_t stub[1024 / 8];
> +    struct ofpbuf ofpacts = OFPBUF_STUB_INITIALIZER(stub);
> +
> +    uint8_t value = 1;
> +    put_load(&value, sizeof value, MFF_LOG_FLAGS,
> +             MLF_LOOKUP_LB_HAIRPIN_BIT, 1, &ofpacts);
> +
> +    ovs_be32 vip4;
> +    struct in6_addr vip6;
> +
> +    if (lb_vip->addr_family == AF_INET) {
> +        ovs_assert(ip_parse(lb_vip->vip, &vip4));
> +    } else {
> +        ovs_assert(ipv6_parse(lb_vip->vip, &vip6));
> +    }
> +
> +    struct match hairpin_match = MATCH_CATCHALL_INITIALIZER;
> +    struct match hairpin_reply_match = MATCH_CATCHALL_INITIALIZER;
> +
> +    if (lb_vip->addr_family == AF_INET) {
> +        ovs_be32 ip4;
> +        ovs_assert(ip_parse(lb_backend->ip, &ip4));

Because we share the same ovn_lb structure between NB and SB we're now forced
to parse the backend IP twice:

ovn_sb_lb_create()
- ovn_lb_create()
-- ip_address_and_port_from_key()
--- this parses the string into a sockaddr_storage
--- then builds an IP string and extracts the port

Here we reparse the IP string into an ovs_be32.

If the structs were independent for NB/SB I think we would potentially have
simpler code and we would avoid reparsing IPs.

Also, what if the SB DB gets corrupted, do we really want ovn-controller to 
abort?

> +
> +        match_set_dl_type(&hairpin_match, htons(ETH_TYPE_IP));
> +        match_set_nw_src(&hairpin_match, ip4);
> +        match_set_nw_dst(&hairpin_match, ip4);
> +
> +        match_set_dl_type(&hairpin_reply_match,
> +            htons(ETH_TYPE_IP));
> +        match_set_nw_src(&hairpin_reply_match, ip4);
> +        match_set_nw_dst(&hairpin_reply_match, vip4);
> +    } else {
> +        struct in6_addr ip6;
> +        ovs_assert(ipv6_parse(lb_backend->ip, &ip6));

Same here.

> +
> +        match_set_dl_type(&hairpin_match, htons(ETH_TYPE_IPV6));
> +        match_set_ipv6_src(&hairpin_match, &ip6);
> +        match_set_ipv6_dst(&hairpin_match, &ip6);
> +
> +        match_set_dl_type(&hairpin_reply_match,
> +            htons(ETH_TYPE_IPV6));

Indentation is a bit odd.

> +        match_set_ipv6_src(&hairpin_reply_match, &ip6);
> +        match_set_ipv6_dst(&hairpin_reply_match, &vip6);
> +    }
> +
> +    if (lb_backend->port) {
> +        match_set_nw_proto(&hairpin_match, lb_proto);
> +        match_set_tp_dst(&hairpin_match, htons(lb_backend->port));
> +
> +        match_set_nw_proto(&hairpin_reply_match, lb_proto);
> +        match_set_tp_src(&hairpin_reply_match,
> +            htons(lb_backend->port));

Same here.

> +    }
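The "if (lb_backend->port)" guard above only narrows the match to an L4 protocol and port when the backend actually specifies one. A minimal sketch of that semantics (hypothetical types, not the OVS match API; port 0 acts as a wildcard, as in the flow above):

```c
#include <stdbool.h>
#include <stdint.h>

/* A backend's optional L4 constraint: port == 0 means "match any
 * protocol/port", mirroring a flow with no tp_dst/nw_proto match. */
struct l4_constraint {
    uint8_t nw_proto;   /* e.g. 6 = TCP, 17 = UDP */
    uint16_t port;      /* 0 = wildcard */
};

static bool
l4_constraint_matches(const struct l4_constraint *c,
                      uint8_t nw_proto, uint16_t port)
{
    if (c->port == 0) {
        return true;    /* backend has no port: any L4 traffic matches */
    }
    return c->nw_proto == nw_proto && c->port == port;
}
```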
> +
> +    for (size_t i = 0; i < lb->slb->n_datapaths; i++) {
> +        match_set_metadata(&hairpin_match,
> +