On Sat, Aug 20, 2011 at 4:31 AM, Jesse Gross <[email protected]> wrote:
> On Sat, Aug 20, 2011 at 4:49 AM, Pravin Shelar <[email protected]> wrote:
>> Currently OVS used its own hashing implmentation for hash tables which
>> has some problems, e.g. error case on deletion code. Following patch
>> replaces that with hlist based hash table which is consistent with other
>> kernel hash tables.
>>
>> Signed-off-by: Pravin Shelar <[email protected]>
>
> I get a whole pile of compiler errors with this on 2.6.38:
>
> /home/jesse/openvswitch/datapath/linux/datapath.c: In function
> ‘get_table_protected’:
> /home/jesse/openvswitch/datapath/linux/datapath.c:112:9: error:
> dereferencing pointer to incomplete type
> /home/jesse/openvswitch/datapath/linux/datapath.c:112:2: warning: type
> defaults to ‘int’ in type name
> /home/jesse/openvswitch/datapath/linux/datapath.c:112:2: warning:
> return from incompatible pointer type
> /home/jesse/openvswitch/datapath/linux/datapath.c: In function
> ‘dp_process_received_packet’:
> /home/jesse/openvswitch/datapath/linux/datapath.c:298:22: error:
> dereferencing pointer to incomplete type
> /home/jesse/openvswitch/datapath/linux/datapath.c:298:22: warning:
> type defaults to ‘int’ in declaration of ‘_________p1’
> /home/jesse/openvswitch/datapath/linux/datapath.c:298:22: error:
> dereferencing pointer to incomplete type
> /home/jesse/openvswitch/datapath/linux/datapath.c:298:3: warning: type
> defaults to ‘int’ in type name
> /home/jesse/openvswitch/datapath/linux/datapath.c:298:22: error:
> dereferencing pointer to incomplete type
> /home/jesse/openvswitch/datapath/linux/datapath.c:298:3: warning: type
> defaults to ‘int’ in type name
> /home/jesse/openvswitch/datapath/linux/datapath.c:298:3: warning:
> passing argument 1 of ‘flow_lookup’ from incompatible pointer type
> /home/jesse/openvswitch/datapath/linux/../flow.h:165:17: note:
> expected ‘struct flow_table *’ but argument is of type ‘int *’
> /home/jesse/openvswitch/datapath/linux/datapath.c: In function ‘flush_flows’:
> /home/jesse/openvswitch/datapath/linux/datapath.c:539:2: error:
> dereferencing pointer to incomplete type
> /home/jesse/openvswitch/datapath/linux/datapath.c:539:2: warning: type
> defaults to ‘int’ in type name
> /home/jesse/openvswitch/datapath/linux/datapath.c:539:2: warning:
> assignment from incompatible pointer type
> /home/jesse/openvswitch/datapath/linux/datapath.c: In function ‘expand_table’:
> /home/jesse/openvswitch/datapath/linux/datapath.c:633:3: error:
> dereferencing pointer to incomplete type
> /home/jesse/openvswitch/datapath/linux/datapath.c:633:3: warning: type
> defaults to ‘int’ in type name
> /home/jesse/openvswitch/datapath/linux/datapath.c:633:3: warning:
> assignment from incompatible pointer type
> /home/jesse/openvswitch/datapath/linux/datapath.c: In function
> ‘odp_dp_cmd_new’:
> /home/jesse/openvswitch/datapath/linux/datapath.c:1357:2: error:
> dereferencing pointer to incomplete type
> /home/jesse/openvswitch/datapath/linux/datapath.c:1357:2: warning:
> type defaults to ‘int’ in type name
> /home/jesse/openvswitch/datapath/linux/datapath.c:1357:2: warning:
> assignment from incompatible pointer type
>
> I'm guessing that you didn't try this on a kernel that has real
> definitions for the RCU sparse/lockdep checks.
right, new rcu macro need struct definition.
>
> Also, it doesn't look like you actually deleted table.[h|c].
forgot to commit that to git patch
>
>> diff --git a/datapath/datapath.c b/datapath/datapath.c
>> index b191239..e198357 100644
>> --- a/datapath/datapath.c
>> +++ b/datapath/datapath.c
>> @@ -225,7 +225,7 @@ static void destroy_dp_rcu(struct rcu_head *rcu)
>> {
>> struct datapath *dp = container_of(rcu, struct datapath, rcu);
>>
>> - tbl_destroy((struct tbl __force *)dp->table, flow_free_tbl);
>> + flow_deferred_destroy(dp->table);
>
> It's not necessary to wait for an extra RCU grace period before
> deleting the table. Previously it was done immediately, now it's a
> deferred free.
I just wanted to keep one interface for flow destroy as it does not
cause any problem delaying it.
>
>> + unsigned int i;
>> +
>> + l1 = kmalloc((n_buckets) * sizeof(struct hlist_head *), GFP_KERNEL);
>
> You can't just directly allocate the array of buckets because it will
> quickly become too large to allocate reliably (on 64-bit machines even
> the initial 1024 bucket allocation is already 2 pages). The previous
> implementation went to a lot of work to avoid this by using a
> hierarchy of single page allocations. The much easier solution that
> is now available, which we talked about in the past, is to use
> flex_arrays to handle the hierarchy automatically. I already
> backported the flex_array code from the latest kernel (and improved
> it) so it should be easy to use.
ok, i fixed it.
>> +void flow_insert(struct flow_table *table, struct sw_flow *flow)
>> +{
>> + struct hlist_head *head;
>> + struct hlist_node *n;
>> + struct sw_flow *cflow;
>> + u32 hash;
>> +
>> + hash = flow_hash(&flow->key, flow->key_len);
>> + head = find_bucket(table, hash);
>> +
>> + hlist_for_each_entry_rcu(cflow, n, head, hash_node) {
>> + if (!memcmp(&cflow->key, &flow->key, flow->key_len))
>> + return;
>> + }
>
> We should never add duplicate flow entries, so the above code isn't
> necessary. If it does happen, then it will cause a memory leak
> because the new flow will silently disappear.
ok, i am adding assert so that we will get bug if something goes wrong.
>
>> diff --git a/datapath/flow.h b/datapath/flow.h
>> index 6a3c539..6bfe893 100644
>> --- a/datapath/flow.h
>> +++ b/datapath/flow.h
>> struct sw_flow {
>> struct rcu_head rcu;
>> - struct tbl_node tbl_node;
>> + struct hlist_node hash_node;
>>
>> struct sw_flow_key key;
>> struct sw_flow_actions __rcu *sf_acts;
>> + int key_len;
>
> It looks like key_len was only added so that we can rehash the flows
> when the table is resized. I think perhaps a better use of memory
> would be to store the hash value itself. This way we wouldn't have to
> recompute the hashes on resize and we can also use it to speed up
> lookup by skipping comparisons of flows that may land in the same
> bucket but actually have different hash values.
>
having hash value in flow is good idea. i will add it.
>> +int tnl_init()
>> +{
>> + port_table = alloc_buckets(port_table_size);
>> + if (!port_table)
>> + return -ENOMEM;
>
> Previously we would allocate the hash table when the first tunnel port
> was created but now we're doing it as soon as the module is loaded.
> Why the change? Short of a compelling reason, I think it's better to
> do it on demand, otherwise we are consuming resources for something
> that will potentially never get used.
it saves port_table null check in lookup which i think is worth two
pages of preallocation.
>
>> +void tnl_exit()
>> +{
>> + struct hlist_node *n, *pos;
>> + struct hlist_head *hash_head;
>> + struct tnl_vport * tnl_vport;
>> + int i;
>> +
>> + if (!port_table)
>> + return;
>> +
>> + for (i = 0; i < port_table_size; i++) {
>> + hash_head = &port_table[i];
>> + hlist_for_each_entry_safe(tnl_vport, pos, n, hash_head,
>> + hash_node) {
>> + hlist_del_init_rcu(&tnl_vport->hash_node);
>> + __free_port_rcu(tnl_vport);
>> + }
>> + }
>
> When a datapath is deleted, it calls delete on all of the ports
> automatically so it should be necessary to do it here as well.
ok, added assert.
_______________________________________________
dev mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/dev