On 10/20/22 15:49, Abhiram Sangana wrote: > Hi Dumitru, > > Thanks for reviewing the patch. > >> On 19 Oct 2022, at 14:09, Dumitru Ceara <dce...@redhat.com> wrote: >> >> Hi Abhiram, >> >> Thanks for the patch! I only skimmed the changes so this is not a full >> review but more of a discussion starter. >> >> On 10/18/22 17:33, Abhiram Sangana wrote: >>> To identify connections dropped by ACLs, users can enable logging for ACLs >>> but this approach does not scale. ACL logging uses "controller" action >>> which causes a significant spike in the CPU usage of ovs-vswitchd (and >>> ovn-controller to a lesser extent) even with metering enabled (observed >>> 65% ovs-vswitchd CPU usage for logging 1000 packets per second). Another >>> approach is to use drop sampling (patch by Adrian Moreno currently in >>> review) but we might miss specific connections of interest with this >>> approach. >>> >>> This patch commits connections dropped by ACLs to the connection tracking >>> table with a specific ACL label that was introduced in 0e0228be ( >>> northd: Add ACL label). The dropped connections are committed in a separate >>> CT zone so that they can be managed independently. >>> >> >> I'm not sure I understand how the CMS can manage this. How is this >> better than sampling? Committed connections are going to time out at >> some point (30 sec by default for udp/icmp with the kernel datapath). >> So the CMS will have to continuously monitor the contents of the >> conntrack zone? Aren't we just moving the CPU load somewhere else with >> this? Even so, there's a chance an entry is missed. > > Linux nf_conntrack module supports sending connection tracking events > to userspace via ctnetlink ("net.netfilter.nf_conntrack_events" kernel > parameter). So, CMS can parse the stream of new connection events from > conntrack and log the packets based on CT label. >
Ack. > An issue with sampling is that if there are a large number of packets > for a particular connection(s), packets of other connections might not > get sampled and we miss information about these connections. With the > conntrack approach, we get a single event for each connection (until > they time out), so there is lesser load on the CMS/collector and lesser > likelihood of missing connections. > I hadn't considered this advantage, thanks for pointing it out! >> >>> Each logical port is assigned a new zone for committing dropped flows. >>> The zone is loaded into register MFF_LOG_ACL_DROP_ZONE. >>> >>> A new lflow action "ct_commit_drop" is introduced that commits flows >>> to connection tracking table in a zone identified by >>> MFF_LOG_ACL_DROP_ZONE register. >>> >>> An ACL with "drop" action and non-empty label is translated to >>> "ct_commit_drop" >>> instead of silently dropping the packet. >>> >>> Signed-off-by: Abhiram Sangana <sangana.abhi...@nutanix.com> >>> --- >>> controller/ovn-controller.c | 23 ++++++++++++++--- >>> controller/physical.c | 32 +++++++++++++++++++++-- >>> include/ovn/actions.h | 1 + >>> include/ovn/logical-fields.h | 1 + >>> lib/actions.c | 50 ++++++++++++++++++++++++++++++++++++ >>> lib/ovn-util.c | 4 +-- >>> lib/ovn-util.h | 2 +- >>> northd/northd.c | 14 ++++++++-- >>> northd/ovn-northd.8.xml | 14 ++++++++-- >>> ovn-sb.xml | 17 ++++++++++++ >>> ovs | 2 +- >> >> This shouldn't need to change the OVS submodule. >> > My bad. Will fix this. > Cool, thanks, looking forward for v1! Regards, Dumitru > Thanks, > Abhiram Sangana > >> Thanks, >> Dumitru >> >>> utilities/ovn-nbctl.c | 7 ++--- >>> 12 files changed, 151 insertions(+), 16 deletions(-) >>> >>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c >>> index c97744d57..1ad20fe55 100644 >>> --- a/controller/ovn-controller.c >>> +++ b/controller/ovn-controller.c >>> @@ -660,8 +660,15 @@ update_ct_zones(const struct shash *binding_lports, >>> unsigned long unreq_snat_zones[BITMAP_N_LONGS(MAX_CT_ZONES)]; >>> >>> struct shash_node *shash_node; >>> + const struct binding_lport *lport; >>> SHASH_FOR_EACH (shash_node, binding_lports) { >>> sset_add(&all_users, shash_node->name); >>> + >>> + /* Zone for committing dropped connections of a vNIC. */ >>> + lport = shash_node->data; >>> + char *drop_zone = alloc_ct_zone_key(&lport->pb->header_.uuid, >>> "drop"); >>> + sset_add(&all_users, drop_zone); >>> + free(drop_zone); >>> } >>> >>> /* Local patched datapath (gateway routers) need zones assigned. */ >>> @@ -670,8 +677,8 @@ update_ct_zones(const struct shash *binding_lports, >>> HMAP_FOR_EACH (ld, hmap_node, local_datapaths) { >>> /* XXX Add method to limit zone assignment to logical router >>> * datapaths with NAT */ >>> - char *dnat = alloc_nat_zone_key(&ld->datapath->header_.uuid, >>> "dnat"); >>> - char *snat = alloc_nat_zone_key(&ld->datapath->header_.uuid, >>> "snat"); >>> + char *dnat = alloc_ct_zone_key(&ld->datapath->header_.uuid, >>> "dnat"); >>> + char *snat = alloc_ct_zone_key(&ld->datapath->header_.uuid, >>> "snat"); >>> sset_add(&all_users, dnat); >>> sset_add(&all_users, snat); >>> shash_add(&all_lds, dnat, ld); >>> @@ -2090,7 +2097,7 @@ ct_zones_datapath_binding_handler(struct engine_node >>> *node, void *data) >>> /* Check if the requested snat zone has changed for the datapath >>> * or not. If so, then fall back to full recompute of >>> * ct_zone engine. */ >>> - char *snat_dp_zone_key = alloc_nat_zone_key(&dp->header_.uuid, >>> "snat"); >>> + char *snat_dp_zone_key = alloc_ct_zone_key(&dp->header_.uuid, >>> "snat"); >>> struct simap_node *simap_node = simap_find(&ct_zones_data->current, >>> snat_dp_zone_key); >>> free(snat_dp_zone_key); >>> @@ -2148,6 +2155,16 @@ ct_zones_runtime_data_handler(struct engine_node >>> *node, void *data) >>> &ct_zones_data->pending); >>> updated = true; >>> } >>> + char *drop_zone = alloc_ct_zone_key( >>> + &t_lport->pb->header_.uuid, "drop"); >>> + if (!simap_contains(&ct_zones_data->current, drop_zone)) { >>> + alloc_id_to_ct_zone(drop_zone, >>> + &ct_zones_data->current, >>> + ct_zones_data->bitmap, &scan_start, >>> + &ct_zones_data->pending); >>> + updated = true; >>> + } >>> + free(drop_zone); >>> } else if (t_lport->tracked_type == TRACKED_RESOURCE_REMOVED) { >>> struct simap_node *ct_zone = >>> simap_find(&ct_zones_data->current, >>> diff --git a/controller/physical.c b/controller/physical.c >>> index f3c8bddce..fc46669c1 100644 >>> --- a/controller/physical.c >>> +++ b/controller/physical.c >>> @@ -60,6 +60,7 @@ struct zone_ids { >>> int ct; /* MFF_LOG_CT_ZONE. */ >>> int dnat; /* MFF_LOG_DNAT_ZONE. */ >>> int snat; /* MFF_LOG_SNAT_ZONE. */ >>> + int drop; /* MFF_LOG_ACL_DROP_ZONE. */ >>> }; >>> >>> struct tunnel { >>> @@ -204,14 +205,18 @@ get_zone_ids(const struct sbrec_port_binding *binding, >>> >>> const struct uuid *key = &binding->datapath->header_.uuid; >>> >>> - char *dnat = alloc_nat_zone_key(key, "dnat"); >>> + char *dnat = alloc_ct_zone_key(key, "dnat"); >>> zone_ids.dnat = simap_get(ct_zones, dnat); >>> free(dnat); >>> >>> - char *snat = alloc_nat_zone_key(key, "snat"); >>> + char *snat = alloc_ct_zone_key(key, "snat"); >>> zone_ids.snat = simap_get(ct_zones, snat); >>> free(snat); >>> >>> + char *drop_zone = alloc_ct_zone_key(&binding->header_.uuid, "drop"); >>> + zone_ids.drop = simap_get(ct_zones, drop_zone); >>> + free(drop_zone); >>> + >>> return zone_ids; >>> } >>> >>> @@ -822,6 +827,9 @@ put_zones_ofpacts(const struct zone_ids *zone_ids, >>> struct ofpbuf *ofpacts_p) >>> if (zone_ids->snat) { >>> put_load(zone_ids->snat, MFF_LOG_SNAT_ZONE, 0, 32, ofpacts_p); >>> } >>> + if (zone_ids->drop) { >>> + put_load(zone_ids->drop, MFF_LOG_ACL_DROP_ZONE, 0, 32, >>> ofpacts_p); >>> + } >>> } >>> } >>> >>> @@ -858,6 +866,26 @@ put_local_common_flows(uint32_t dp_key, >>> pb->header_.uuid.parts[0], &match, ofpacts_p, >>> &pb->header_.uuid); >>> >>> + if (zone_ids->drop) { >>> + /* Table 39, Priority 1. >>> + * ======================= >>> + * >>> + * Clear the logical registers (for consistent behavior with >>> packets >>> + * that get tunneled) except MFF_LOG_ACL_DROP_ZONE. */ >>> + match_init_catchall(&match); >>> + ofpbuf_clear(ofpacts_p); >>> + match_set_metadata(&match, htonll(dp_key)); >>> + for (int i = 0; i < MFF_N_LOG_REGS; i++) { >>> + if ((MFF_REG0 + i) != MFF_LOG_ACL_DROP_ZONE) { >>> + put_load(0, MFF_REG0 + i, 0, 32, ofpacts_p); >>> + } >>> + } >>> + put_resubmit(OFTABLE_LOG_EGRESS_PIPELINE, ofpacts_p); >>> + ofctrl_add_flow(flow_table, OFTABLE_CHECK_LOOPBACK, 1, >>> + pb->datapath->header_.uuid.parts[0], &match, >>> + ofpacts_p, &pb->datapath->header_.uuid); >>> + } >>> + >>> /* Table 39, Priority 100. >>> * ======================= >>> * >>> diff --git a/include/ovn/actions.h b/include/ovn/actions.h >>> index d7ee84dac..6424250a6 100644 >>> --- a/include/ovn/actions.h >>> +++ b/include/ovn/actions.h >>> @@ -121,6 +121,7 @@ struct ovn_extend_table; >>> OVNACT(COMMIT_ECMP_NH, ovnact_commit_ecmp_nh) \ >>> OVNACT(CHK_ECMP_NH_MAC, ovnact_result) \ >>> OVNACT(CHK_ECMP_NH, ovnact_result) \ >>> + OVNACT(CT_COMMIT_DROP, ovnact_nest) \ >>> >>> /* enum ovnact_type, with a member OVNACT_<ENUM> for each action. */ >>> enum OVS_PACKED_ENUM ovnact_type { >>> diff --git a/include/ovn/logical-fields.h b/include/ovn/logical-fields.h >>> index 3db7265e4..889f5f9e3 100644 >>> --- a/include/ovn/logical-fields.h >>> +++ b/include/ovn/logical-fields.h >>> @@ -47,6 +47,7 @@ enum ovn_controller_event { >>> #define MFF_LOG_REG0 MFF_REG0 >>> #define MFF_LOG_LB_ORIG_DIP_IPV4 MFF_REG1 >>> #define MFF_LOG_LB_ORIG_TP_DPORT MFF_REG2 >>> +#define MFF_LOG_ACL_DROP_ZONE MFF_REG8 >>> >>> #define MFF_LOG_XXREG0 MFF_XXREG0 >>> #define MFF_LOG_LB_ORIG_DIP_IPV6 MFF_XXREG1 >>> diff --git a/lib/actions.c b/lib/actions.c >>> index adbb42db4..dfff3e793 100644 >>> --- a/lib/actions.c >>> +++ b/lib/actions.c >>> @@ -4600,6 +4600,54 @@ encode_CHK_ECMP_NH(const struct ovnact_result *res, >>> MLF_LOOKUP_COMMIT_ECMP_NH_BIT, ofpacts); >>> } >>> >>> +static void >>> +parse_ct_commit_drop(struct action_context *ctx) >>> +{ >>> + parse_nested_action(ctx, OVNACT_CT_COMMIT_DROP, "ip", WR_CT_COMMIT); >>> +} >>> + >>> +static void >>> +format_CT_COMMIT_DROP(const struct ovnact_nest *on, struct ds *s) >>> +{ >>> + format_nested_action(on, "ct_commit_drop", s); >>> +} >>> + >>> +static void >>> +encode_CT_COMMIT_DROP(const struct ovnact_nest *on, >>> + const struct ovnact_encode_params *ep OVS_UNUSED, >>> + struct ofpbuf *ofpacts) >>> +{ >>> + struct ofpact_conntrack *ct = ofpact_put_CT(ofpacts); >>> + ct->flags = NX_CT_F_COMMIT; >>> + ct->recirc_table = NX_CT_RECIRC_NONE; >>> + ct->zone_src.field = mf_from_id(MFF_LOG_ACL_DROP_ZONE); >>> + ct->zone_src.ofs = 0; >>> + ct->zone_src.n_bits = 16; >>> + >>> + /* If the datapath supports all-zero SNAT then use it to avoid tuple >>> + * collisions at commit time between NATed and firewalled-only >>> sessions. >>> + */ >>> + if (ovs_feature_is_supported(OVS_CT_ZERO_SNAT_SUPPORT)) { >>> + size_t nat_offset = ofpacts->size; >>> + ofpbuf_pull(ofpacts, nat_offset); >>> + >>> + struct ofpact_nat *nat = ofpact_put_NAT(ofpacts); >>> + nat->flags = 0; >>> + nat->range_af = AF_UNSPEC; >>> + nat->flags |= NX_NAT_F_SRC; >>> + ofpacts->header = ofpbuf_push_uninit(ofpacts, nat_offset); >>> + ct = ofpacts->header; >>> + } >>> + >>> + size_t set_field_offset = ofpacts->size; >>> + ofpbuf_pull(ofpacts, set_field_offset); >>> + >>> + ovnacts_encode(on->nested, on->nested_len, ep, ofpacts); >>> + ofpacts->header = ofpbuf_push_uninit(ofpacts, set_field_offset); >>> + ct = ofpacts->header; >>> + ofpact_finish(ofpacts, &ct->ofpact); >>> +} >>> + >>> /* Parses an assignment or exchange or put_dhcp_opts action. */ >>> static void >>> parse_set_action(struct action_context *ctx) >>> @@ -4790,6 +4838,8 @@ parse_action(struct action_context *ctx) >>> parse_put_fdb(ctx, ovnact_put_PUT_FDB(ctx->ovnacts)); >>> } else if (lexer_match_id(ctx->lexer, "commit_ecmp_nh")) { >>> parse_commit_ecmp_nh(ctx, ovnact_put_COMMIT_ECMP_NH(ctx->ovnacts)); >>> + } else if (lexer_match_id(ctx->lexer, "ct_commit_drop")) { >>> + parse_ct_commit_drop(ctx); >>> } else { >>> lexer_syntax_error(ctx->lexer, "expecting action"); >>> } >>> diff --git a/lib/ovn-util.c b/lib/ovn-util.c >>> index 616999eab..a72533a9e 100644 >>> --- a/lib/ovn-util.c >>> +++ b/lib/ovn-util.c >>> @@ -443,12 +443,12 @@ split_addresses(const char *addresses, struct svec >>> *ipv4_addrs, >>> destroy_lport_addresses(&laddrs); >>> } >>> >>> -/* Allocates a key for NAT conntrack zone allocation for a provided >>> +/* Allocates a key for conntrack zone allocation for a provided >>> * 'key' record and a 'type'. >>> * >>> * It is the caller's responsibility to free the allocated memory. */ >>> char * >>> -alloc_nat_zone_key(const struct uuid *key, const char *type) >>> +alloc_ct_zone_key(const struct uuid *key, const char *type) >>> { >>> return xasprintf(UUID_FMT"_%s", UUID_ARGS(key), type); >>> } >>> diff --git a/lib/ovn-util.h b/lib/ovn-util.h >>> index b3905ef7b..e0855f19e 100644 >>> --- a/lib/ovn-util.h >>> +++ b/lib/ovn-util.h >>> @@ -92,7 +92,7 @@ const char *find_lport_address(const struct >>> lport_addresses *laddrs, >>> void split_addresses(const char *addresses, struct svec *ipv4_addrs, >>> struct svec *ipv6_addrs); >>> >>> -char *alloc_nat_zone_key(const struct uuid *key, const char *type); >>> +char *alloc_ct_zone_key(const struct uuid *key, const char *type); >>> >>> const char *default_nb_db(void); >>> const char *default_sb_db(void); >>> diff --git a/northd/northd.c b/northd/northd.c >>> index 7e2681865..2aff3458c 100644 >>> --- a/northd/northd.c >>> +++ b/northd/northd.c >>> @@ -287,7 +287,7 @@ enum ovn_stage { >>> * +----+----------------------------------------------+ G | >>> | >>> * | R7 | UNUSED | 1 | >>> | >>> * >>> +----+----------------------------------------------+---+------------------+ >>> - * | R8 | UNUSED | >>> + * | R8 | DROP_CT_ZONE (<= IN(/OUT)_ACL | >>> * +----+----------------------------------------------+ >>> * | R9 | UNUSED | >>> * +----+----------------------------------------------+ >>> @@ -6343,6 +6343,11 @@ consider_acl(struct hmap *lflows, struct >>> ovn_datapath *od, >>> } else { >>> ds_put_format(match, " && (%s)", acl->match); >>> build_acl_log(actions, acl, meter_groups); >>> + if (acl->label) { >>> + ds_put_format(actions, "ct_commit_drop { " >>> + "ct_label.label = %"PRId64"; }; ", >>> + acl->label); >>> + } >>> ds_put_cstr(actions, "/* drop */"); >>> ovn_lflow_add_with_hint(lflows, od, stage, >>> acl->priority + OVN_ACL_PRI_OFFSET, >>> @@ -6363,8 +6368,13 @@ consider_acl(struct hmap *lflows, struct >>> ovn_datapath *od, >>> ds_clear(match); >>> ds_clear(actions); >>> ds_put_cstr(match, REGBIT_ACL_HINT_BLOCK " == 1"); >>> - ds_put_format(actions, "ct_commit { %s = 1; }; ", >>> + ds_put_format(actions, "ct_commit { %s = 1; ", >>> ct_blocked_match); >>> + if (acl->label) { >>> + ds_put_format(actions, "ct_label.label = %"PRId64"; ", >>> + acl->label); >>> + } >>> + ds_put_cstr(actions, "}; "); >>> if (!strcmp(acl->action, "reject")) { >>> build_reject_acl_rules(od, lflows, stage, acl, match, >>> actions, &acl->header_, >>> meter_groups); >>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml >>> index b8f871394..af653be4d 100644 >>> --- a/northd/ovn-northd.8.xml >>> +++ b/northd/ovn-northd.8.xml >>> @@ -699,7 +699,12 @@ >>> connections and <code>ct_commit(ct_label=1/1);</code> for known >>> connections. Setting <code>ct_label</code> marks a connection >>> as one that was previously allowed, but should no longer be >>> - allowed due to a policy change. >>> + allowed due to a policy change. If the ACL has a >>> <code>label</code>, >>> + then it translates to >>> + <code>ct_commit_drop(ct_label.label=label)</code> for new and >>> + untracked connections and >>> + <code>ct_commit(ct_label.blocked=1; ct_label.label=label);</code> >>> + for known connections. >>> </li> >>> </ul> >>> >>> @@ -969,7 +974,12 @@ >>> or untracked connections and <code>ct_commit(ct_label=1/1);</code> >>> for >>> known connections. Setting <code>ct_label</code> marks a connection >>> as one that was previously allowed, but should no longer be >>> - allowed due to a policy change. >>> + allowed due to a policy change. If the ACL has a >>> <code>label</code>, >>> + then it translates to >>> + <code>ct_commit_drop(ct_label.label=label)</code> for new and >>> + untracked connections and >>> + <code>ct_commit(ct_label.blocked=1; ct_label.label=label);</code> >>> + for known connections. >>> </li> >>> </ul> >>> >>> diff --git a/ovn-sb.xml b/ovn-sb.xml >>> index 37a709f83..4e294a212 100644 >>> --- a/ovn-sb.xml >>> +++ b/ovn-sb.xml >>> @@ -1377,6 +1377,23 @@ >>> </p> >>> </dd> >>> >>> + <dt><code>ct_commit_drop { };</code></dt> >>> + <dt><code>ct_commit_drop { ct_mark=<var>value[/mask]</var>; >>> };</code></dt> >>> + <dt><code>ct_commit_drop { ct_label=<var>value[/mask]</var>; >>> };</code></dt> >>> + <dt><code>ct_commit_drop { ct_mark=<var>value[/mask]</var>; >>> ct_label=<var>value[/mask]</var>; };</code></dt> >>> + <dd> >>> + <p> >>> + This action is identical to <code>ct_commit</code> except that >>> the >>> + connection tracking entry is committed in a different zone. >>> + </p> >>> + >>> + <p> >>> + This action was added to store connections dropped by ACLs in a >>> + separate zone that is managed independently of the >>> + <code>ct_commit</code> zone, for debugging. >>> + </p> >>> + </dd> >>> + >>> <dt><code>ct_dnat;</code></dt> >>> <dt><code>ct_dnat(<var>IP</var>);</code></dt> >>> <dd> >>> diff --git a/ovs b/ovs >>> index 6f24c2bc7..d94cd0d3e 160000 >>> --- a/ovs >>> +++ b/ovs >>> @@ -1 +1 @@ >>> -Subproject commit 6f24c2bc769afde0a390ce344de1a7d9c592e5a6 >>> +Subproject commit d94cd0d3eec33e4290d7ca81918f5ac61444886e >>> diff --git a/utilities/ovn-nbctl.c b/utilities/ovn-nbctl.c >>> index 3bbdbd998..07509f488 100644 >>> --- a/utilities/ovn-nbctl.c >>> +++ b/utilities/ovn-nbctl.c >>> @@ -2225,10 +2225,11 @@ nbctl_acl_add(struct ctl_context *ctx) >>> /* Set the ACL label */ >>> const char *label = shash_find_data(&ctx->options, "--label"); >>> if (label) { >>> - /* Ensure that the action is either allow or allow-related */ >>> - if (strcmp(action, "allow") && strcmp(action, "allow-related")) { >>> + /* Ensure that the action is either allow or allow-related or drop */ >>> + if (strcmp(action, "allow") && strcmp(action, "allow-related") && >>> + strcmp(action, "drop")) { >>> ctl_error(ctx, "label can only be set with actions \"allow\" or " >>> - "\"allow-related\""); >>> + "\"allow-related\" or \"drop\""); >>> return; >>> } > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev