On 04/02/2018 08:54 AM, Alexei Starovoitov wrote:
> On Sun, Apr 01, 2018 at 08:01:10AM -0700, John Fastabend wrote:
>> Sockmap is currently backed by an array and enforces keys to be
>> four bytes. This works well for many use cases and was originally
>> modeled after devmap which also uses four bytes keys. However,
>> this has become limiting in larger use cases where a hash would
>> be more appropriate. For example users may want to use the 5-tuple
>> of the socket as the lookup key.
>>
>> To support this add hash support.
>>
>> Signed-off-by: John Fastabend <john.fastab...@gmail.com>
> 
> api looks good, but I think it came a bit too late for this release.
> _nulls part you don't need for this hash. Few other nits:
> 

Yeah no problem will push when {bpf|net}-next opens again. We
also have some fields we need to access from sockmap programs
as well such as ip addrs and ports. I'll respin the fixes (first
two patches) for bpf-net.

>> +static void htab_elem_free_rcu(struct rcu_head *head)
>> +{
>> +    struct htab_elem *l = container_of(head, struct htab_elem, rcu);
>> +
>> +    /* must increment bpf_prog_active to avoid kprobe+bpf triggering while
>> +     * we're calling kfree, otherwise deadlock is possible if kprobes
>> +     * are placed somewhere inside of slub
>> +     */
>> +    preempt_disable();
>> +    __this_cpu_inc(bpf_prog_active);
>> +    kfree(l);
>> +    __this_cpu_dec(bpf_prog_active);
>> +    preempt_enable();
> 
> I don't think it's necessary.
> 

Yep agreed.

>> +static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
>> +{
>> +    struct bpf_htab *htab;
>> +    int i, err;
>> +    u64 cost;
>> +
>> +    if (!capable(CAP_NET_ADMIN))
>> +            return ERR_PTR(-EPERM);
>> +
>> +    /* check sanity of attributes */
>> +    if (attr->max_entries == 0 ||
>> +        attr->map_flags & ~SOCK_CREATE_FLAG_MASK)
>> +            return ERR_PTR(-EINVAL);
>> +
>> +    if (attr->value_size > KMALLOC_MAX_SIZE)
>> +            return ERR_PTR(-E2BIG);
> 
> doesn't seem to match
> +     u32 fd = *(u32 *)value;
> that is done later.
> 

Yep.

>> +static struct htab_elem *lookup_elem_raw(struct hlist_nulls_head *head,
>> +                                     u32 hash, void *key, u32 key_size)
>> +{
>> +    struct hlist_nulls_node *n;
>> +    struct htab_elem *l;
>> +
>> +    hlist_nulls_for_each_entry_rcu(l, n, head, hash_node)
>> +            if (l->hash == hash && !memcmp(&l->key, key, key_size))
>> +                    return l;
> 
> if nulls is needed, there gotta be a comment explaining it.
> 

Sure its not needed lets drop it.

> please add tests for all methods.
> 

I'll add these tests to test_maps.

>> diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
>> index f95fa67..2fa4cbb 100644
>> --- a/tools/bpf/bpftool/map.c
>> +++ b/tools/bpf/bpftool/map.c
>> @@ -67,6 +67,7 @@
>>      [BPF_MAP_TYPE_DEVMAP]           = "devmap",
>>      [BPF_MAP_TYPE_SOCKMAP]          = "sockmap",
>>      [BPF_MAP_TYPE_CPUMAP]           = "cpumap",
>> +    [BPF_MAP_TYPE_SOCKHASH]         = "sockhash",
>>  };
>>  
>>  static unsigned int get_possible_cpus(void)
>> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
>> index 9d07465..1a19450 100644
>> --- a/tools/include/uapi/linux/bpf.h
>> +++ b/tools/include/uapi/linux/bpf.h
>> @@ -115,6 +115,7 @@ enum bpf_map_type {
>>      BPF_MAP_TYPE_DEVMAP,
>>      BPF_MAP_TYPE_SOCKMAP,
>>      BPF_MAP_TYPE_CPUMAP,
>> +    BPF_MAP_TYPE_SOCKHASH,
> 
> tools/* updates should be in separate commit.
> 

Great, thanks for the review.

Reply via email to