Re: [ovs-dev] [PATCH] Improve kernel hash table

Pravin Shelar Sun, 21 Aug 2011 16:02:53 -0700

On Sat, Aug 20, 2011 at 4:31 AM, Jesse Gross <[email protected]> wrote:
> On Sat, Aug 20, 2011 at 4:49 AM, Pravin Shelar <[email protected]> wrote:
>> Currently OVS used its own hashing implmentation for hash tables which
>> has some problems, e.g. error case on deletion code. Following patch
>> replaces that with hlist based hash table which is consistent with other
>> kernel hash tables.
>>
>> Signed-off-by: Pravin Shelar <[email protected]>
>
> I get a whole pile of compiler errors with this on 2.6.38:
>
> /home/jesse/openvswitch/datapath/linux/datapath.c: In function
> ‘get_table_protected’:
> /home/jesse/openvswitch/datapath/linux/datapath.c:112:9: error:
> dereferencing pointer to incomplete type
> /home/jesse/openvswitch/datapath/linux/datapath.c:112:2: warning: type
> defaults to ‘int’ in type name
> /home/jesse/openvswitch/datapath/linux/datapath.c:112:2: warning:
> return from incompatible pointer type
> /home/jesse/openvswitch/datapath/linux/datapath.c: In function
> ‘dp_process_received_packet’:
> /home/jesse/openvswitch/datapath/linux/datapath.c:298:22: error:
> dereferencing pointer to incomplete type
> /home/jesse/openvswitch/datapath/linux/datapath.c:298:22: warning:
> type defaults to ‘int’ in declaration of ‘_________p1’
> /home/jesse/openvswitch/datapath/linux/datapath.c:298:22: error:
> dereferencing pointer to incomplete type
> /home/jesse/openvswitch/datapath/linux/datapath.c:298:3: warning: type
> defaults to ‘int’ in type name
> /home/jesse/openvswitch/datapath/linux/datapath.c:298:22: error:
> dereferencing pointer to incomplete type
> /home/jesse/openvswitch/datapath/linux/datapath.c:298:3: warning: type
> defaults to ‘int’ in type name
> /home/jesse/openvswitch/datapath/linux/datapath.c:298:3: warning:
> passing argument 1 of ‘flow_lookup’ from incompatible pointer type
> /home/jesse/openvswitch/datapath/linux/../flow.h:165:17: note:
> expected ‘struct flow_table *’ but argument is of type ‘int *’
> /home/jesse/openvswitch/datapath/linux/datapath.c: In function ‘flush_flows’:
> /home/jesse/openvswitch/datapath/linux/datapath.c:539:2: error:
> dereferencing pointer to incomplete type
> /home/jesse/openvswitch/datapath/linux/datapath.c:539:2: warning: type
> defaults to ‘int’ in type name
> /home/jesse/openvswitch/datapath/linux/datapath.c:539:2: warning:
> assignment from incompatible pointer type
> /home/jesse/openvswitch/datapath/linux/datapath.c: In function ‘expand_table’:
> /home/jesse/openvswitch/datapath/linux/datapath.c:633:3: error:
> dereferencing pointer to incomplete type
> /home/jesse/openvswitch/datapath/linux/datapath.c:633:3: warning: type
> defaults to ‘int’ in type name
> /home/jesse/openvswitch/datapath/linux/datapath.c:633:3: warning:
> assignment from incompatible pointer type
> /home/jesse/openvswitch/datapath/linux/datapath.c: In function 
> ‘odp_dp_cmd_new’:
> /home/jesse/openvswitch/datapath/linux/datapath.c:1357:2: error:
> dereferencing pointer to incomplete type
> /home/jesse/openvswitch/datapath/linux/datapath.c:1357:2: warning:
> type defaults to ‘int’ in type name
> /home/jesse/openvswitch/datapath/linux/datapath.c:1357:2: warning:
> assignment from incompatible pointer type
>
> I'm guessing that you didn't try this on a kernel that has real
> definitions for the RCU sparse/lockdep checks.
right, new rcu macro need struct definition.


>
> Also, it doesn't look like you actually deleted table.[h|c].
forgot to commit that to git patch

>
>> diff --git a/datapath/datapath.c b/datapath/datapath.c
>> index b191239..e198357 100644
>> --- a/datapath/datapath.c
>> +++ b/datapath/datapath.c
>> @@ -225,7 +225,7 @@ static void destroy_dp_rcu(struct rcu_head *rcu)
>>  {
>>        struct datapath *dp = container_of(rcu, struct datapath, rcu);
>>
>> -       tbl_destroy((struct tbl __force *)dp->table, flow_free_tbl);
>> +       flow_deferred_destroy(dp->table);
>
> It's not necessary to wait for an extra RCU grace period before
> deleting the table.  Previously it was done immediately, now it's a
> deferred free.
I just wanted to keep one interface for flow destroy as it does not
cause any problem delaying it.


>
>> +       unsigned int i;
>> +
>> +       l1 = kmalloc((n_buckets) * sizeof(struct hlist_head *), GFP_KERNEL);
>
> You can't just directly allocate the array of buckets because it will
> quickly become too large to allocate reliably (on 64-bit machines even
> the initial 1024 bucket allocation is already 2 pages).  The previous
> implementation went to a lot of work to avoid this by using a
> hierarchy of single page allocations.  The much easier solution that
> is now available, which we talked about in the past, is to use
> flex_arrays to handle the hierarchy automatically.  I already
> backported the flex_array code from the latest kernel (and improved
> it) so it should be easy to use.

ok, i fixed it.

>> +void flow_insert(struct flow_table *table, struct sw_flow *flow)
>> +{
>> +       struct hlist_head *head;
>> +       struct hlist_node *n;
>> +       struct sw_flow *cflow;
>> +       u32 hash;
>> +
>> +       hash = flow_hash(&flow->key, flow->key_len);
>> +       head = find_bucket(table, hash);
>> +
>> +       hlist_for_each_entry_rcu(cflow, n, head, hash_node) {
>> +               if (!memcmp(&cflow->key, &flow->key, flow->key_len))
>> +                       return;
>> +       }
>
> We should never add duplicate flow entries, so the above code isn't
> necessary.  If it does happen, then it will cause a memory leak
> because the new flow will silently disappear.

ok, i am adding assert so that we will get bug  if something goes wrong.

>
>> diff --git a/datapath/flow.h b/datapath/flow.h
>> index 6a3c539..6bfe893 100644
>> --- a/datapath/flow.h
>> +++ b/datapath/flow.h
>>  struct sw_flow {
>>        struct rcu_head rcu;
>> -       struct tbl_node tbl_node;
>> +       struct hlist_node  hash_node;
>>
>>        struct sw_flow_key key;
>>        struct sw_flow_actions __rcu *sf_acts;
>> +       int key_len;
>
> It looks like key_len was only added so that we can rehash the flows
> when the table is resized.  I think perhaps a better use of memory
> would be to store the hash value itself.  This way we wouldn't have to
> recompute the hashes on resize and we can also use it to speed up
> lookup by skipping comparisons of flows that may land in the same
> bucket but actually have different hash values.
>
having hash value in flow is good idea. i will add it.

>> +int tnl_init()
>> +{
>> +       port_table = alloc_buckets(port_table_size);
>> +       if (!port_table)
>> +               return -ENOMEM;
>
> Previously we would allocate the hash table when the first tunnel port
> was created but now we're doing it as soon as the module is loaded.
> Why the change?  Short of a compelling reason, I think it's better to
> do it on demand, otherwise we are consuming resources for something
> that will potentially never get used.

it saves port_table null check in lookup which i think is worth two
pages of preallocation.

>
>> +void tnl_exit()
>> +{
>> +       struct hlist_node *n, *pos;
>> +       struct hlist_head *hash_head;
>> +       struct tnl_vport * tnl_vport;
>> +       int i;
>> +
>> +       if (!port_table)
>> +               return;
>> +
>> +       for (i = 0; i < port_table_size; i++) {
>> +               hash_head = &port_table[i];
>> +               hlist_for_each_entry_safe(tnl_vport, pos, n, hash_head,
>> +                                         hash_node) {
>> +                       hlist_del_init_rcu(&tnl_vport->hash_node);
>> +                       __free_port_rcu(tnl_vport);
>> +               }
>> +       }
>
> When a datapath is deleted, it calls delete on all of the ports
> automatically so it should be necessary to do it here as well.

ok, added assert.
_______________________________________________
dev mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [PATCH] Improve kernel hash table

Reply via email to