> On 01/16/2019 01:05 PM, Ilya Maximets wrote: > > Conditional EMC insert helps a lot in scenarios with high numbers of > > parallel flows, but in current implementation this option affects all > > the threads and ports at once. There are scenarios where we have > > different number of flows on different ports. For example, if one of > > the VMs encapsulates traffic using additional headers, it will receive > > large number of flows but only few flows will come out of this VM. In > > this scenario it's much faster to use EMC instead of classifier for > > traffic from the VM, but it's better to disable EMC for the traffic > > which flows to VM. > > > > To handle above issue introduced 'emc-enable' configurable to > > enable/disable EMC on a per-port basis. Ex.: > > > > ovs-vsctl set interface dpdk0 other_config:emc-enable=false > > > > EMC probability kept as is and it works for all the ports with > > 'emc-enable=true'. > > > > Hi Ilya, just a couple of minor comments below,
Hi Ilya, I agree with what Kevins flagged below, one other comment from myself. (BTW I still have to do some testing on this but the code looks good so far.) > > > Signed-off-by: Ilya Maximets <[email protected]> > > --- > > > > Version 3: > > * Minor rebase on current master. > > > > Version 2: > > * The main concern was about backward compatibility. Also, there > > is no real profit having the per-port probability value. > > So, per-port probability switched to per-port 'emc-enable' > > configurable. > > Global probability kept back and can be used without any changes. > > > > It's been a while since the first version. It's available here: > > https://patchwork.ozlabs.org/patch/800277/ > > > > Documentation/topics/dpdk/bridge.rst | 4 ++ > > NEWS | 5 +- > > lib/dpif-netdev.c | 79 +++++++++++++++++++++++++--- > > vswitchd/vswitch.xml | 19 +++++++ > > 4 files changed, 98 insertions(+), 9 deletions(-) > > > > diff --git a/Documentation/topics/dpdk/bridge.rst > > b/Documentation/topics/dpdk/bridge.rst > > index 82bdad840..2fcd86607 100644 > > --- a/Documentation/topics/dpdk/bridge.rst > > +++ b/Documentation/topics/dpdk/bridge.rst > > @@ -101,6 +101,10 @@ observed with pmd stats:: > > For certain traffic profiles with many parallel flows, it's > > recommended to set ``N`` to '0' to achieve higher forwarding > performance. > > > > +It is also possible to enable/disable EMC on per-port basis using:: > > + > > + $ ovs-vsctl set interface <iface> > > + other_config:emc-enable={true,false} > > + > > For more information on the EMC refer to :doc:`/intro/install/dpdk` . It would be good to give a short description of why/when this feature should be used. The description of the use case you give in the commit message would even do. As more 'tunable' features like this are added I think it's good for reference so users understand the motivation for the feature as they may not check commit message. Ian > > > > > > diff --git a/NEWS b/NEWS > > index 4618cc0a0..7826578d3 100644 > > --- a/NEWS > > +++ b/NEWS > > @@ -24,10 +24,13 @@ Post-v2.10.0 > > allocated dynamically using the following syntax: > > ovn-nbctl lsp-set-addresses <port> "dynamic <IP>" > > - DPDK: > > + * Add support for DPDK 18.11 > > + - Userspace datapath: > > * Add option for simple round-robin based Rxq to PMD assignment. > > It can be set with pmd-rxq-assign. > > - * Add support for DPDK 18.11 > > * Add support for Auto load balancing of PMDs (experimental) > > + * Added new per-port configurable option to manage EMC: > > + 'other_config:emc-enable'. > > - Add 'symmetric_l3' hash function. > > - OVS now honors 'updelay' and 'downdelay' for bonds with LACP > configured. > > - ovs-vswitchd: > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index > > be529b6b0..6704be400 100644 > > --- a/lib/dpif-netdev.c > > +++ b/lib/dpif-netdev.c > > @@ -474,6 +474,7 @@ struct dp_netdev_port { > > unsigned n_rxq; /* Number of elements in 'rxqs' */ > > unsigned *txq_used; /* Number of threads that use each tx > queue. */ > > struct ovs_mutex txq_used_mutex; > > + bool emc_enabled; /* If true EMC will be used. */ > > char *type; /* Port type as requested by user. */ > > char *rxq_affinity_list; /* Requested affinity of rx queues. */ > > }; > > @@ -588,6 +589,7 @@ static void dp_netdev_actions_free(struct > > dp_netdev_actions *); struct polled_queue { > > struct dp_netdev_rxq *rxq; > > odp_port_t port_no; > > + bool emc_enabled; > > }; > > > > /* Contained by struct dp_netdev_pmd_thread's 'poll_list' member. */ > > @@ -617,6 +619,8 @@ struct dp_netdev_pmd_thread_ctx { > > long long now; > > /* RX queue from which last packet was received. */ > > struct dp_netdev_rxq *last_rxq; > > + /* EMC insertion probability context for the current processing > cycle. */ > > + uint32_t emc_insert_min; > > }; > > > > /* PMD: Poll modes drivers. PMD accesses devices via polling to > > eliminate @@ -1798,6 +1802,7 @@ port_create(const char *devname, const > char *type, > > port->netdev = netdev; > > port->type = xstrdup(type); > > port->sf = sf; > > + port->emc_enabled = true; > > port->need_reconfigure = true; > > ovs_mutex_init(&port->txq_used_mutex); > > > > @@ -2830,9 +2835,7 @@ emc_probabilistic_insert(struct > dp_netdev_pmd_thread *pmd, > > * default the value is UINT32_MAX / 100 which yields an insertion > > * probability of 1/100 ie. 1% */ > > > > - uint32_t min; > > - > > - atomic_read_relaxed(&pmd->dp->emc_insert_min, &min); > > + uint32_t min = pmd->ctx.emc_insert_min; > > > > if (min && random_uint32() <= min) { > > emc_insert(&(pmd->flow_cache).emc_cache, key, flow); @@ > > -3698,7 +3701,8 @@ dpif_netdev_execute(struct dpif *dpif, struct > dpif_execute *execute) > > ovs_mutex_lock(&dp->non_pmd_mutex); > > } > > > > - /* Update current time in PMD context. */ > > + /* Update current time in PMD context. We don't care about EMC > insertion > > + * probability, because we are on a slow path. */ > > pmd_thread_ctx_time_update(pmd); > > > > /* The action processing expects the RSS hash to be valid, > > because @@ -3842,7 +3846,7 @@ dpif_netdev_set_config(struct dpif *dpif, > const struct smap *other_config) > > if (insert_min != cur_min) { > > atomic_store_relaxed(&dp->emc_insert_min, insert_min); > > if (insert_min == 0) { > > - VLOG_INFO("EMC has been disabled"); > > + VLOG_INFO("EMC insertion probability changed to zero"); > > } else { > > VLOG_INFO("EMC insertion probability changed to 1/%llu > (~%.2f%%)", > > insert_prob, (100 / (float)insert_prob)); @@ > > -3965,6 +3969,27 @@ exit: > > return error; > > } > > > > +/* Returns 'true' if one of the 'port's RX queues exists in 'poll_list' > > + * of given PMD thread. */ > > +static bool > > +dpif_netdev_pmd_polls_port(struct dp_netdev_pmd_thread *pmd, > > + struct dp_netdev_port *port) > > + OVS_EXCLUDED(pmd->port_mutex) > > +{ > > + struct rxq_poll *poll; > > + bool found = false; > > + > > + ovs_mutex_lock(&pmd->port_mutex); > > + HMAP_FOR_EACH (poll, node, &pmd->poll_list) { > > + if (port == poll->rxq->port) { > > + found = true; > > + break; > > + } > > + } > > + ovs_mutex_unlock(&pmd->port_mutex); > > + return found; > > +} > > + > > /* Changes the affinity of port's rx queues. The changes are actually > applied > > * in dpif_netdev_run(). */ > > The comment is a bit stale now. > > > static int > > @@ -3975,10 +4000,33 @@ dpif_netdev_port_set_config(struct dpif *dpif, > odp_port_t port_no, > > struct dp_netdev_port *port; > > int error = 0; > > const char *affinity_list = smap_get(cfg, "pmd-rxq-affinity"); > > + bool emc_enabled = smap_get_bool(cfg, "emc-enable", true); > > > > ovs_mutex_lock(&dp->port_mutex); > > error = get_port_by_number(dp, port_no, &port); > > - if (error || !netdev_is_pmd(port->netdev) > > + if (error) { > > + goto unlock; > > + } > > + > > + if (emc_enabled != port->emc_enabled) { > > + struct dp_netdev_pmd_thread *pmd; > > + > > + port->emc_enabled = emc_enabled; > > + /* Mark for reload all the threads that polls this port and > request > > + * for reconfiguration for the actual reloading of threads. */ > > + CMAP_FOR_EACH (pmd, node, &dp->poll_threads) { > > + if (dpif_netdev_pmd_polls_port(pmd, port)) { > > + pmd->need_reload = true; > > + } > > + } > > + dp_netdev_request_reconfigure(dp); > > + > > + VLOG_INFO("%s: EMC has been %s", netdev_get_name(port->netdev), > > + (emc_enabled) ? "enabled" : "disabled"); > > I think it would be good idea to add a similar log as > dpif_netdev_set_config() for 'emc-insert-inv-prob' when emc_enabled gets > enabled here as a reminder for the user. Otherwise, a user will see 'EMC > has been enabled' which is not incorrect, but perhaps a long time ago > 'emc-insert-inv-prob' was set to 0, and there is confusion about whether > the EMC will actually be used now or not. > > > + } > > + > > + /* Checking for RXq affinity changes. */ > > + if (!netdev_is_pmd(port->netdev) > > || nullable_string_is_equal(affinity_list, port- > >rxq_affinity_list)) { > > goto unlock; > > } > > @@ -5123,6 +5171,13 @@ dpif_netdev_run(struct dpif *dpif) > > if (!netdev_is_pmd(port->netdev)) { > > int i; > > > > + if (port->emc_enabled) { > > + atomic_read_relaxed(&dp->emc_insert_min, > > + &non_pmd->ctx.emc_insert_min); > > + } else { > > + non_pmd->ctx.emc_insert_min = 0; > > + } > > + > > for (i = 0; i < port->n_rxq; i++) { > > if (dp_netdev_process_rxq_port(non_pmd, > > &port->rxqs[i], @@ > > -5296,6 +5351,7 @@ pmd_load_queues_and_ports(struct dp_netdev_pmd_thread > *pmd, > > HMAP_FOR_EACH (poll, node, &pmd->poll_list) { > > poll_list[i].rxq = poll->rxq; > > poll_list[i].port_no = poll->rxq->port->port_no; > > + poll_list[i].emc_enabled = poll->rxq->port->emc_enabled; > > i++; > > } > > > > @@ -5360,6 +5416,14 @@ reload: > > pmd_perf_start_iteration(s); > > > > for (i = 0; i < poll_cnt; i++) { > > + > > + if (poll_list[i].emc_enabled) { > > + atomic_read_relaxed(&pmd->dp->emc_insert_min, > > + &pmd->ctx.emc_insert_min); > > + } else { > > + pmd->ctx.emc_insert_min = 0; > > + } > > + > > process_packets = > > dp_netdev_process_rxq_port(pmd, poll_list[i].rxq, > > poll_list[i].port_no); @@ > > -6301,7 +6365,7 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd, > > struct dfc_cache *cache = &pmd->flow_cache; > > struct dp_packet *packet; > > const size_t cnt = dp_packet_batch_size(packets_); > > - uint32_t cur_min; > > + uint32_t cur_min = pmd->ctx.emc_insert_min; > > int i; > > uint16_t tcp_flags; > > bool smc_enable_db; > > @@ -6309,7 +6373,6 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd, > > bool batch_enable = true; > > > > atomic_read_relaxed(&pmd->dp->smc_enable_db, &smc_enable_db); > > - atomic_read_relaxed(&pmd->dp->emc_insert_min, &cur_min); > > pmd_perf_update_counter(&pmd->perf_stats, > > md_is_valid ? PMD_STAT_RECIRC : > PMD_STAT_RECV, > > cnt); > > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index > > d58f63228..88edb5d35 100644 > > --- a/vswitchd/vswitch.xml > > +++ b/vswitchd/vswitch.xml > > @@ -3101,6 +3101,25 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 > type=patch options:peer=p1 \ > > </column> > > </group> > > > > + <group title="EMC (Exact Match Cache) Configuration"> > > + <p> > > + These settings controls behaviour of EMC lookups/insertions for > packets > > + received from the interface. > > + </p> > > + > > + <column name="other_config" key="emc-enable" type='{"type": > "boolean"}'> > > + <p> > > + Specifies if Exact Match Cache (EMC) should be used while > processing > > + packets received from this interface. > > + If true, <ref table="Open_vSwitch" column="other_config" > > + key="emc-insert-inv-prob"/> will have effect on this > interface. > > + </p> > > + <p> > > + Defaults to true. > > + </p> > > + </column> > > + </group> > > + > > <group title="MTU"> > > <p> > > The MTU (maximum transmission unit) is the largest amount of > > data > > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
