On 5/15/2019 6:02 PM, Ian Stokes wrote:
On 5/8/2019 4:13 PM, Harry van Haaren wrote:
This allows plugging-in of different subtable hash-lookup-verify
routines, and allows special casing of those functions based on
known context (eg: # of bits set) of the specific subtable.

Signed-off-by: Harry van Haaren <harry.van.haa...@intel.com>

---

v9:
- Use count_1bits in favour of __builtin_popcount (Ilya)

v6:
- Implement subtable effort per packet "lookups_match" counter  (Ilya)
- Remove double newline (Eelco)
- Remove doubel * before comments (Eelco)
- Reword comments in dpcls_lookup() for clarity (Harry)
---
  lib/dpif-netdev.c | 135 +++++++++++++++++++++++++++++++---------------
  1 file changed, 93 insertions(+), 42 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 5a6f2abac..e2e2c14e7 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -7572,6 +7572,27 @@ dpif_dummy_register(enum dummy_level level)
  
  /* Datapath Classifier. */
+/* forward declaration for lookup_func typedef */

Minor nit above, general rule of thumb for comment style, start with capital letter and finish with period.

+struct dpcls_subtable;
+
+/* Lookup function for a subtable in the dpcls. This function is called
+ * by each subtable with an array of packets, and a bitmask of packets to
+ * perform the lookup on. Using a function pointer gives flexibility to
+ * optimize the lookup function based on subtable properties and the
+ * CPU instruction set available at runtime.
+ */
+typedef uint32_t (*dpcls_subtable_lookup_func)(struct dpcls_subtable *subtable,
+                uint32_t keys_map, const struct netdev_flow_key *keys[],
+                struct dpcls_rule **rules);

Alignment for the arguments above looks odd, typically arguments are kept vertically in line with one another *similar to dpcls_subtable_lookup_generic below).

+
+/* Prototype for generic lookup func, using same code path as before */

Period at end of comment above.

+uint32_t
+dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable,
+                              uint32_t keys_map,
+                              const struct netdev_flow_key *keys[],
+                              struct dpcls_rule **rules);
+
+

One minor follow up, extra whitespace seems to have been added here, cab be reduced to 1 new line.

  /* A set of rules that all have the same fields wildcarded. */
  struct dpcls_subtable {
      /* The fields are only used by writers. */
@@ -7581,6 +7602,13 @@ struct dpcls_subtable {
      struct cmap rules;           /* Contains "struct dpcls_rule"s. */
      uint32_t hit_cnt;            /* Number of match hits in subtable in current
                                      optimization interval. */
+
+    /* the lookup function to use for this subtable. If there is a known
+     * property of the subtable (eg: only 3 bits of miniflow metadata is
+     * used for the lookup) then this can point at an optimized version of
+     * the lookup function for this particular subtable. */
the -> The above.
+    dpcls_subtable_lookup_func lookup_func;
+
      struct netdev_flow_key mask; /* Wildcards for fields (const). */
      /* 'mask' must be the last field, additional space is allocated here. */
  };
@@ -7640,6 +7668,10 @@ dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask)
      cmap_init(&subtable->rules);
      subtable->hit_cnt = 0;
      netdev_flow_key_clone(&subtable->mask, mask);
+
+    /* decide which hash/lookup/verify function to use */
+    subtable->lookup_func = dpcls_subtable_lookup_generic;

So by default dpcls_subtable_lookup_generic is set as the look up function. Makes sense as this is the only look up available at this stage. I assume it's later in the series a mechanism to select a different lookup is introduced. Or by default does the lookup always start off as the generic option when a subtable is created even when other options are possible?

+
      cmap_insert(&cls->subtables_map, &subtable->cmap_node, mask->hash);
      /* Add the new subtable at the end of the pvector (with no hits yet) */
      pvector_insert(&cls->subtables, subtable, 0);
@@ -7800,6 +7832,53 @@ dpcls_rule_matches_key(const struct dpcls_rule *rule,
      return true;
  }
+uint32_t
+dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable,
+                              uint32_t keys_map,
+                              const struct netdev_flow_key *keys[],
+                              struct dpcls_rule **rules)
+{
+        int i;
+        /* Compute hashes for the remaining keys.  Each search-key is
+         * masked with the subtable's mask to avoid hashing the wildcarded
+         * bits. */
+        uint32_t hashes[NETDEV_MAX_BURST];
+        ULLONG_FOR_EACH_1(i, keys_map) {
+            hashes[i] = netdev_flow_key_hash_in_mask(keys[i],
+                                                     &subtable->mask);
+        }
+
+        /* Lookup. */
+        const struct cmap_node *nodes[NETDEV_MAX_BURST];
+        uint32_t found_map =

Minor nit. Variable declaration. You may see a mix of approaches WRT where variables are declared. Personally I prefer at the beginning of the function to keep with the int variable you have already declared. looking at the original dpcls_lookup() where this code is moved from it looks like they also declared variables at the beginning. Would be good to follow that example.

+                cmap_find_batch(&subtable->rules, keys_map, hashes, nodes); +        /* Check results.  When the i-th bit of found_map is set, it means +         * that a set of nodes with a matching hash value was found for the +         * i-th search-key.  Due to possible hash collisions we need to check
+         * which of the found rules, if any, really matches our masked
+         * search-key. */
+        ULLONG_FOR_EACH_1(i, found_map) {
+            struct dpcls_rule *rule;
+
+            CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) {
+                if (OVS_LIKELY(dpcls_rule_matches_key(rule, keys[i]))) {
+                    rules[i] = rule;
+                    /* Even at 20 Mpps the 32-bit hit_cnt cannot wrap
+                     * within one second optimization interval. */
+                    subtable->hit_cnt++;
+                    goto next;
+                }
+            }
+            /* None of the found rules was a match.  Reset the i-th bit to
+             * keep searching this key in the next subtable. */
+            ULLONG_SET0(found_map, i);  /* Did not match. */
+        next:
+            ;                     /* Keep Sparse happy. */
+        }
+
+        return found_map;
+}
+
  /* For each miniflow in 'keys' performs a classifier lookup writing the result    * into the corresponding slot in 'rules'.  If a particular entry in 'keys' is
   * NULL it is skipped.
@@ -7818,16 +7897,12 @@ dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key *keys[],       /* The received 'cnt' miniflows are the search-keys that will be processed
       * to find a matching entry into the available subtables.
       * The number of bits in map_type is equal to NETDEV_MAX_BURST. */
-    typedef uint32_t map_type;
-#define MAP_BITS (sizeof(map_type) * CHAR_BIT)
+#define MAP_BITS (sizeof(uint32_t) * CHAR_BIT)
      BUILD_ASSERT_DECL(MAP_BITS >= NETDEV_MAX_BURST);
      struct dpcls_subtable *subtable;
-    map_type keys_map = TYPE_MAXIMUM(map_type); /* Set all bits. */
-    map_type found_map;
-    uint32_t hashes[MAP_BITS];
-    const struct cmap_node *nodes[MAP_BITS];
+    uint32_t keys_map = TYPE_MAXIMUM(uint32_t); /* Set all bits. */
      if (cnt != MAP_BITS) {
          keys_map >>= MAP_BITS - cnt; /* Clear extra bits. */
@@ -7844,52 +7919,28 @@ dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key *keys[],        * search-key, the search for that key can stop because the rules are
       * non-overlapping. */
      PVECTOR_FOR_EACH (subtable, &cls->subtables) {
-        int i;
+        /* call the subtable specific lookup function */
+        uint32_t found_map = subtable->lookup_func(subtable, keys_map,
+                                                   keys, rules);
-        /* Compute hashes for the remaining keys.  Each search-key is
-         * masked with the subtable's mask to avoid hashing the wildcarded
-         * bits. */
-        ULLONG_FOR_EACH_1(i, keys_map) {
-            hashes[i] = netdev_flow_key_hash_in_mask(keys[i],
-                                                     &subtable->mask);
-        }
-        /* Lookup. */
-        found_map = cmap_find_batch(&subtable->rules, keys_map, hashes, nodes); -        /* Check results.  When the i-th bit of found_map is set, it means -         * that a set of nodes with a matching hash value was found for the -         * i-th search-key.  Due to possible hash collisions we need to check
-         * which of the found rules, if any, really matches our masked
-         * search-key. */
-        ULLONG_FOR_EACH_1(i, found_map) {
-            struct dpcls_rule *rule;
+        /* Count the number of subtables searched for this packet match. This +         * estimates the "spread" of subtables looked at per matched packet */

Minor, missing period in comment above.

+        uint32_t pkts_matched = count_1bits(found_map);

Just to clarify, at this point found_map has been returned and it only contains matches where packets were found? i.e. it doesn't contain matches from possible hash collisions?

+        lookups_match += pkts_matched * subtable_pos;

Hi Harry, can you clarify above why '* subtable_pos' is used? Is that right? I'm just thinking that you already have the number of packets matched from the count_1bits(found_map).

Ian

-            CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) {
-                if (OVS_LIKELY(dpcls_rule_matches_key(rule, keys[i]))) {
-                    rules[i] = rule;
-                    /* Even at 20 Mpps the 32-bit hit_cnt cannot wrap
-                     * within one second optimization interval. */
-                    subtable->hit_cnt++;
-                    lookups_match += subtable_pos;
-                    goto next;
-                }
-            }
-            /* None of the found rules was a match.  Reset the i-th bit to
-             * keep searching this key in the next subtable. */
-            ULLONG_SET0(found_map, i);  /* Did not match. */
-        next:
-            ;                     /* Keep Sparse happy. */
-        }
-        keys_map &= ~found_map;             /* Clear the found rules. */
+        /* Clear the found rules, and return early if all packets are found */
Missing period in comment above.

+        keys_map &= ~found_map;
          if (!keys_map) {
              if (num_lookups_p) {
                  *num_lookups_p = lookups_match;
              }
-            return true;              /* All found. */
+            return true;
          }
          subtable_pos++;
      }
+
      if (num_lookups_p) {
          *num_lookups_p = lookups_match;
      }
-    return false;                     /* Some misses. */
+    return false;
  }


I've also run some performance tests on this patch but I didn't see any notable performance degradation from the function implementation which is good.

Ian
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to