Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-27 Thread Alexei Starovoitov
On Wed, Aug 27, 2014 at 12:18 PM, Stephen Hemminger
 wrote:
> Something in man page format similar to FreeBSD man page:
>  http://www.freebsd.org/cgi/man.cgi?bpf(4)
>
> would be more readable and reviewable.

Ok. will chop it into smallest diff possible and will add a doc
for syscall only. I guess the problem is that we have too many
docs now that talk about everything.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-27 Thread Daniel Borkmann

On 08/27/2014 09:18 PM, Stephen Hemminger wrote:

Something in man page format similar to FreeBSD man page:
  http://www.freebsd.org/cgi/man.cgi?bpf(4)

would be more readable and reviewable.


I think at some point, we could perhaps do a section 7 page
with a general overview of the engine and where it can be
applied, and let the syscall page partially refer to it so
that it doesn't get too long. So far, we tried to squeeze
everything into Documentation/networking/filter.txt, and that
itself is quite long already.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-27 Thread Stephen Hemminger
Something in man page format similar to FreeBSD man page:
 http://www.freebsd.org/cgi/man.cgi?bpf(4)

would be more readable and reviewable.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-27 Thread Andy Lutomirski
On Tue, Aug 26, 2014 at 9:57 PM, Alexei Starovoitov  wrote:
> On Tue, Aug 26, 2014 at 9:49 PM, Andy Lutomirski  wrote:
>> On Tue, Aug 26, 2014 at 9:35 PM, Alexei Starovoitov  
>> wrote:
>>> On Tue, Aug 26, 2014 at 8:56 PM, Andy Lutomirski  
>>> wrote:
 On Aug 26, 2014 7:29 PM, "Alexei Starovoitov"  wrote:
>
> Hi Ingo, David,
>
> posting whole thing again as RFC to get feedback on syscall only.
> If syscall bpf(int cmd, union bpf_attr *attr, unsigned int size) is ok,
> I'll split them into small chunks as requested and will repost without 
> RFC.

 IMO it's much easier to review a syscall if we just look at a
 specification of what it does.  The code is, in some sense, secondary.
>>>
>>> 'specification of what it does'... hmm, you mean beyond what's
>>> there in commit logs and in Documentation/networking/filter.txt ?
>>> Aren't samples at the end give an idea on 'what it does'?
>>> I'm happy to add 'specification', I just don't understand yet what
>>> it suppose to talk about beyond what's already written.
>>> I understand that the patches are missing explanation on 'why'
>>> the syscall is being added, but I don't think it's what you're asking...
>>
>> I mean a hopefully short document that defines what the syscall does.
>> It should be precise enough that one could, in principle, implement
>> the syscall just by reading the document and that one could use the
>> syscall just by reading the document.
>>
>> Given that there's a whole instruction set to go with it, it may end
>> up being moderately complicated or saying things like "see this other
>> thing for a description of the instruction set" and "there are some
>> extensible sets of functions you can call with it".
>
> I'm still lost.
>
> Here is the quote from Documentation/networking/filter.txt
> "
> 'maps' is a generic storage of different types for sharing data between kernel
> and userspace.
>
> The maps are accessed from user space via BPF syscall,
> which has commands:
> - create a map with given type and attributes
>   map_fd = bpf(BPF_MAP_CREATE, union bpf_attr *attr, u32 size)
>   using attr->map_type, attr->key_size, attr->value_size, attr->max_entries
>   returns process-local file descriptor or negative error
>
> - lookup key in a given map
>   err = bpf(BPF_MAP_LOOKUP_ELEM, union bpf_attr *attr, u32 size)
>   using attr->map_fd, attr->key, attr->value
>   returns zero and stores found elem into value or negative error
>
> - create or update key/value pair in a given map
>   err = bpf(BPF_MAP_UPDATE_ELEM, union bpf_attr *attr, u32 size)
>   using attr->map_fd, attr->key, attr->value
>   returns zero or negative error
>
> - find and delete element by key in a given map
>   err = bpf(BPF_MAP_DELETE_ELEM, union bpf_attr *attr, u32 size)
>   using attr->map_fd, attr->key
>
> - to delete map: close(fd)
>   Exiting process will delete maps automatically
>
> userspace programs uses this API to create/populate/read
> maps that eBPF programs are concurrently updating.
> "
> and more in commit log:
> "
> - load eBPF program
>   fd = bpf(BPF_PROG_LOAD, union bpf_attr *attr, u32 size)
>
>   where 'attr' is
>   struct {
>   enum bpf_prog_type prog_type;
>   __u32 insn_cnt;
>   struct bpf_insn __user *insns;
>   const char __user *license;
>   };
>   insns - array of eBPF instructions
>   license - must be GPL compatible to call helper functions marked gpl_only
>
> - unload eBPF program
>   close(fd)
> "
>
> Isn't it short and describes what it does?
> Do you want me to describe what eBPF program can do?

The problem is that everyone needs to dig around a very long patch
series to find it.  Since you're asking for a review of a syscall, it
would be nice to have everything needed to review whether the syscall
is a good idea in its present form in one place and to keep the amount
of email under control.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-27 Thread Steven Stewart-Gallus
> I'm personally not reviewing such a large patch series, sorry.
> 
> You need to submit smaller sets if you want to get reasonable
> review of your changes and ideas.

Hello. As well, this clogs up the mailing boxes of other people who
have no interest in the patch set.

Thank you,
Steven Stewart-Gallus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-27 Thread David Miller
From: Alexei Starovoitov 
Date: Tue, 26 Aug 2014 19:29:14 -0700

> posting whole thing again as RFC to get feedback on syscall only.
> If syscall bpf(int cmd, union bpf_attr *attr, unsigned int size) is ok,

I'm personally not reviewing such a large patch series, sorry.

You need to submit smaller sets if you want to get reasonable
review of your changes and ideas.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-27 Thread David Miller
From: Alexei Starovoitov a...@plumgrid.com
Date: Tue, 26 Aug 2014 19:29:14 -0700

 posting whole thing again as RFC to get feedback on syscall only.
 If syscall bpf(int cmd, union bpf_attr *attr, unsigned int size) is ok,

I'm personally not reviewing such a large patch series, sorry.

You need to submit smaller sets if you want to get reasonable
review of your changes and ideas.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-27 Thread Steven Stewart-Gallus
 I'm personally not reviewing such a large patch series, sorry.
 
 You need to submit smaller sets if you want to get reasonable
 review of your changes and ideas.

Hello. As well, this clogs up the mailing boxes of other people who
have no interest in the patch set.

Thank you,
Steven Stewart-Gallus
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-27 Thread Andy Lutomirski
On Tue, Aug 26, 2014 at 9:57 PM, Alexei Starovoitov a...@plumgrid.com wrote:
 On Tue, Aug 26, 2014 at 9:49 PM, Andy Lutomirski l...@amacapital.net wrote:
 On Tue, Aug 26, 2014 at 9:35 PM, Alexei Starovoitov a...@plumgrid.com 
 wrote:
 On Tue, Aug 26, 2014 at 8:56 PM, Andy Lutomirski l...@amacapital.net 
 wrote:
 On Aug 26, 2014 7:29 PM, Alexei Starovoitov a...@plumgrid.com wrote:

 Hi Ingo, David,

 posting whole thing again as RFC to get feedback on syscall only.
 If syscall bpf(int cmd, union bpf_attr *attr, unsigned int size) is ok,
 I'll split them into small chunks as requested and will repost without 
 RFC.

 IMO it's much easier to review a syscall if we just look at a
 specification of what it does.  The code is, in some sense, secondary.

 'specification of what it does'... hmm, you mean beyond what's
 there in commit logs and in Documentation/networking/filter.txt ?
 Aren't samples at the end give an idea on 'what it does'?
 I'm happy to add 'specification', I just don't understand yet what
 it suppose to talk about beyond what's already written.
 I understand that the patches are missing explanation on 'why'
 the syscall is being added, but I don't think it's what you're asking...

 I mean a hopefully short document that defines what the syscall does.
 It should be precise enough that one could, in principle, implement
 the syscall just by reading the document and that one could use the
 syscall just by reading the document.

 Given that there's a whole instruction set to go with it, it may end
 up being moderately complicated or saying things like see this other
 thing for a description of the instruction set and there are some
 extensible sets of functions you can call with it.

 I'm still lost.

 Here is the quote from Documentation/networking/filter.txt
 
 'maps' is a generic storage of different types for sharing data between kernel
 and userspace.

 The maps are accessed from user space via BPF syscall,
 which has commands:
 - create a map with given type and attributes
   map_fd = bpf(BPF_MAP_CREATE, union bpf_attr *attr, u32 size)
   using attr-map_type, attr-key_size, attr-value_size, attr-max_entries
   returns process-local file descriptor or negative error

 - lookup key in a given map
   err = bpf(BPF_MAP_LOOKUP_ELEM, union bpf_attr *attr, u32 size)
   using attr-map_fd, attr-key, attr-value
   returns zero and stores found elem into value or negative error

 - create or update key/value pair in a given map
   err = bpf(BPF_MAP_UPDATE_ELEM, union bpf_attr *attr, u32 size)
   using attr-map_fd, attr-key, attr-value
   returns zero or negative error

 - find and delete element by key in a given map
   err = bpf(BPF_MAP_DELETE_ELEM, union bpf_attr *attr, u32 size)
   using attr-map_fd, attr-key

 - to delete map: close(fd)
   Exiting process will delete maps automatically

 userspace programs uses this API to create/populate/read
 maps that eBPF programs are concurrently updating.
 
 and more in commit log:
 
 - load eBPF program
   fd = bpf(BPF_PROG_LOAD, union bpf_attr *attr, u32 size)

   where 'attr' is
   struct {
   enum bpf_prog_type prog_type;
   __u32 insn_cnt;
   struct bpf_insn __user *insns;
   const char __user *license;
   };
   insns - array of eBPF instructions
   license - must be GPL compatible to call helper functions marked gpl_only

 - unload eBPF program
   close(fd)
 

 Isn't it short and describes what it does?
 Do you want me to describe what eBPF program can do?

The problem is that everyone needs to dig around a very long patch
series to find it.  Since you're asking for a review of a syscall, it
would be nice to have everything needed to review whether the syscall
is a good idea in its present form in one place and to keep the amount
of email under control.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-27 Thread Stephen Hemminger
Something in man page format similar to FreeBSD man page:
 http://www.freebsd.org/cgi/man.cgi?bpf(4)

would be more readable and reviewable.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-27 Thread Daniel Borkmann

On 08/27/2014 09:18 PM, Stephen Hemminger wrote:

Something in man page format similar to FreeBSD man page:
  http://www.freebsd.org/cgi/man.cgi?bpf(4)

would be more readable and reviewable.


I think at some point, we could perhaps do a section 7 page
with a general overview of the engine and where it can be
applied, and let the syscall page partially refer to it so
that it doesn't get too long. So far, we tried to squeeze
everything into Documentation/networking/filter.txt, and that
itself is quite long already.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-27 Thread Alexei Starovoitov
On Wed, Aug 27, 2014 at 12:18 PM, Stephen Hemminger
step...@networkplumber.org wrote:
 Something in man page format similar to FreeBSD man page:
  http://www.freebsd.org/cgi/man.cgi?bpf(4)

 would be more readable and reviewable.

Ok. will chop it into smallest diff possible and will add a doc
for syscall only. I guess the problem is that we have too many
docs now that talk about everything.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-26 Thread Alexei Starovoitov
On Tue, Aug 26, 2014 at 9:49 PM, Andy Lutomirski  wrote:
> On Tue, Aug 26, 2014 at 9:35 PM, Alexei Starovoitov  wrote:
>> On Tue, Aug 26, 2014 at 8:56 PM, Andy Lutomirski  wrote:
>>> On Aug 26, 2014 7:29 PM, "Alexei Starovoitov"  wrote:

 Hi Ingo, David,

 posting whole thing again as RFC to get feedback on syscall only.
 If syscall bpf(int cmd, union bpf_attr *attr, unsigned int size) is ok,
 I'll split them into small chunks as requested and will repost without RFC.
>>>
>>> IMO it's much easier to review a syscall if we just look at a
>>> specification of what it does.  The code is, in some sense, secondary.
>>
>> 'specification of what it does'... hmm, you mean beyond what's
>> there in commit logs and in Documentation/networking/filter.txt ?
>> Aren't samples at the end give an idea on 'what it does'?
>> I'm happy to add 'specification', I just don't understand yet what
>> it suppose to talk about beyond what's already written.
>> I understand that the patches are missing explanation on 'why'
>> the syscall is being added, but I don't think it's what you're asking...
>
> I mean a hopefully short document that defines what the syscall does.
> It should be precise enough that one could, in principle, implement
> the syscall just by reading the document and that one could use the
> syscall just by reading the document.
>
> Given that there's a whole instruction set to go with it, it may end
> up being moderately complicated or saying things like "see this other
> thing for a description of the instruction set" and "there are some
> extensible sets of functions you can call with it".

I'm still lost.

Here is the quote from Documentation/networking/filter.txt
"
'maps' is a generic storage of different types for sharing data between kernel
and userspace.

The maps are accessed from user space via BPF syscall,
which has commands:
- create a map with given type and attributes
  map_fd = bpf(BPF_MAP_CREATE, union bpf_attr *attr, u32 size)
  using attr->map_type, attr->key_size, attr->value_size, attr->max_entries
  returns process-local file descriptor or negative error

- lookup key in a given map
  err = bpf(BPF_MAP_LOOKUP_ELEM, union bpf_attr *attr, u32 size)
  using attr->map_fd, attr->key, attr->value
  returns zero and stores found elem into value or negative error

- create or update key/value pair in a given map
  err = bpf(BPF_MAP_UPDATE_ELEM, union bpf_attr *attr, u32 size)
  using attr->map_fd, attr->key, attr->value
  returns zero or negative error

- find and delete element by key in a given map
  err = bpf(BPF_MAP_DELETE_ELEM, union bpf_attr *attr, u32 size)
  using attr->map_fd, attr->key

- to delete map: close(fd)
  Exiting process will delete maps automatically

userspace programs uses this API to create/populate/read
maps that eBPF programs are concurrently updating.
"
and more in commit log:
"
- load eBPF program
  fd = bpf(BPF_PROG_LOAD, union bpf_attr *attr, u32 size)

  where 'attr' is
  struct {
  enum bpf_prog_type prog_type;
  __u32 insn_cnt;
  struct bpf_insn __user *insns;
  const char __user *license;
  };
  insns - array of eBPF instructions
  license - must be GPL compatible to call helper functions marked gpl_only

- unload eBPF program
  close(fd)
"

Isn't it short and describes what it does?
Do you want me to describe what eBPF program can do?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-26 Thread Andy Lutomirski
On Tue, Aug 26, 2014 at 9:35 PM, Alexei Starovoitov  wrote:
> On Tue, Aug 26, 2014 at 8:56 PM, Andy Lutomirski  wrote:
>> On Aug 26, 2014 7:29 PM, "Alexei Starovoitov"  wrote:
>>>
>>> Hi Ingo, David,
>>>
>>> posting whole thing again as RFC to get feedback on syscall only.
>>> If syscall bpf(int cmd, union bpf_attr *attr, unsigned int size) is ok,
>>> I'll split them into small chunks as requested and will repost without RFC.
>>
>> IMO it's much easier to review a syscall if we just look at a
>> specification of what it does.  The code is, in some sense, secondary.
>
> 'specification of what it does'... hmm, you mean beyond what's
> there in commit logs and in Documentation/networking/filter.txt ?
> Aren't samples at the end give an idea on 'what it does'?
> I'm happy to add 'specification', I just don't understand yet what
> it suppose to talk about beyond what's already written.
> I understand that the patches are missing explanation on 'why'
> the syscall is being added, but I don't think it's what you're asking...

I mean a hopefully short document that defines what the syscall does.
It should be precise enough that one could, in principle, implement
the syscall just by reading the document and that one could use the
syscall just by reading the document.

Given that there's a whole instruction set to go with it, it may end
up being moderately complicated or saying things like "see this other
thing for a description of the instruction set" and "there are some
extensible sets of functions you can call with it".

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-26 Thread Alexei Starovoitov
On Tue, Aug 26, 2014 at 8:56 PM, Andy Lutomirski  wrote:
> On Aug 26, 2014 7:29 PM, "Alexei Starovoitov"  wrote:
>>
>> Hi Ingo, David,
>>
>> posting whole thing again as RFC to get feedback on syscall only.
>> If syscall bpf(int cmd, union bpf_attr *attr, unsigned int size) is ok,
>> I'll split them into small chunks as requested and will repost without RFC.
>
> IMO it's much easier to review a syscall if we just look at a
> specification of what it does.  The code is, in some sense, secondary.

'specification of what it does'... hmm, you mean beyond what's
there in commit logs and in Documentation/networking/filter.txt ?
Aren't samples at the end give an idea on 'what it does'?
I'm happy to add 'specification', I just don't understand yet what
it suppose to talk about beyond what's already written.
I understand that the patches are missing explanation on 'why'
the syscall is being added, but I don't think it's what you're asking...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-26 Thread Andy Lutomirski
On Aug 26, 2014 7:29 PM, "Alexei Starovoitov"  wrote:
>
> Hi Ingo, David,
>
> posting whole thing again as RFC to get feedback on syscall only.
> If syscall bpf(int cmd, union bpf_attr *attr, unsigned int size) is ok,
> I'll split them into small chunks as requested and will repost without RFC.

IMO it's much easier to review a syscall if we just look at a
specification of what it does.  The code is, in some sense, secondary.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-26 Thread Alexei Starovoitov
Hi Ingo, David,

posting whole thing again as RFC to get feedback on syscall only.
If syscall bpf(int cmd, union bpf_attr *attr, unsigned int size) is ok,
I'll split them into small chunks as requested and will repost without RFC.

Right now please only review syscall API
patch 0003 introduces sys_bpf and first BPF_MAP_CREATE command
patch 0005 adds four more commands to the same syscall
patch 0007 adds BPF_PROG_LOAD command
patch 0010 extends BPF_PROG_LOAD command with 3 more attributes
patch 0021 adds user space wrapper for BPF syscall
patch 0023 uses these wrappers in verifier testsuite

Note that additions of commands and attributes kept this syscall
backwards compatible from one patch to another.
I've decided not to bother with forward compatiblity for now.
We can address it later the way perf_event_open did.

Please ignore other patches, since I cannot easily remove them
without breaking compilation

btw, tested on x64/i386 and comiled tested on arm with NET-less config.

V6->V7:
- only BPF syscall interface changed from long+nlattr+a_lot_of_type_casts to
  single 'union bpf_attr'.
  It pretty much removed all type casts in kernel and in user space that
  were there because of 'long' and because of 'nlattr'

Thanks for feedback. I think this version is definitely cleaner.
As a side note I've addressed Cong's comment regarding commit log.
Now it documents syscall itself instead of wrappers of syscall.

If anyone prefers to see patches in the browser, they are here:
https://git.kernel.org/cgit/linux/kernel/git/ast/bpf.git/log/?h=v7

Thanks!

Alexei Starovoitov (28):
  net: filter: add "load 64-bit immediate" eBPF instruction
  net: filter: split filter.h and expose eBPF to user space
  bpf: introduce syscall(BPF, ...) and BPF maps
  bpf: enable bpf syscall on x64 and i386
  bpf: add lookup/update/delete/iterate methods to BPF maps
  bpf: add hashtable type of BPF maps
  bpf: expand BPF syscall with program load/unload
  bpf: handle pseudo BPF_CALL insn
  bpf: verifier (add docs)
  bpf: verifier (add ability to receive verification log)
  bpf: handle pseudo BPF_LD_IMM64 insn
  bpf: verifier (add branch/goto checks)
  bpf: verifier (add verifier core)
  bpf: verifier (add state prunning optimization)
  bpf: allow eBPF programs to use maps
  bpf: split eBPF out of NET
  tracing: allow eBPF programs to be attached to events
  tracing: allow eBPF programs call printk()
  tracing: allow eBPF programs to be attached to kprobe/kretprobe
  tracing: allow eBPF programs to call ktime_get_ns() and get_current()
  samples: bpf: add mini eBPF library to manipulate maps and programs
  samples: bpf: example of tracing filters with eBPF
  bpf: verifier test
  bpf: llvm backend
  samples: bpf: elf file loader
  samples: bpf: eBPF example in C
  samples: bpf: counting eBPF example in C
  samples: bpf: IO latency analysis (iosnoop/heatmap)

 Documentation/networking/filter.txt|  313 +++-
 arch/Kconfig   |3 +
 arch/x86/net/bpf_jit_comp.c|   17 +
 arch/x86/syscalls/syscall_32.tbl   |1 +
 arch/x86/syscalls/syscall_64.tbl   |1 +
 fs/btrfs/super.c   |3 +
 include/linux/bpf.h|  139 ++
 include/linux/filter.h |  303 +---
 include/linux/ftrace_event.h   |5 +
 include/linux/syscalls.h   |3 +-
 include/trace/bpf_trace.h  |   23 +
 include/trace/ftrace.h |   25 +
 include/uapi/asm-generic/unistd.h  |4 +-
 include/uapi/linux/Kbuild  |1 +
 include/uapi/linux/bpf.h   |  439 +
 kernel/Makefile|2 +-
 kernel/bpf/Makefile|2 +-
 kernel/bpf/core.c  |   17 +
 kernel/bpf/hashtab.c   |  365 
 kernel/bpf/syscall.c   |  645 +++
 kernel/bpf/verifier.c  | 1910 
 kernel/sys_ni.c|3 +
 kernel/trace/Kconfig   |1 +
 kernel/trace/Makefile  |1 +
 kernel/trace/bpf_trace.c   |  264 +++
 kernel/trace/trace.h   |3 +
 kernel/trace/trace_events.c|   41 +-
 kernel/trace/trace_events_filter.c |   72 +-
 kernel/trace/trace_kprobe.c|   28 +
 kernel/trace/trace_syscalls.c  |   32 +
 lib/test_bpf.c |   21 +
 net/Kconfig|1 +
 net/core/filter.c  |2 +
 samples/bpf/Makefile   |   28 +
 

[PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-26 Thread Alexei Starovoitov
Hi Ingo, David,

posting whole thing again as RFC to get feedback on syscall only.
If syscall bpf(int cmd, union bpf_attr *attr, unsigned int size) is ok,
I'll split them into small chunks as requested and will repost without RFC.

Right now please only review syscall API
patch 0003 introduces sys_bpf and first BPF_MAP_CREATE command
patch 0005 adds four more commands to the same syscall
patch 0007 adds BPF_PROG_LOAD command
patch 0010 extends BPF_PROG_LOAD command with 3 more attributes
patch 0021 adds user space wrapper for BPF syscall
patch 0023 uses these wrappers in verifier testsuite

Note that additions of commands and attributes kept this syscall
backwards compatible from one patch to another.
I've decided not to bother with forward compatiblity for now.
We can address it later the way perf_event_open did.

Please ignore other patches, since I cannot easily remove them
without breaking compilation

btw, tested on x64/i386 and comiled tested on arm with NET-less config.

V6-V7:
- only BPF syscall interface changed from long+nlattr+a_lot_of_type_casts to
  single 'union bpf_attr'.
  It pretty much removed all type casts in kernel and in user space that
  were there because of 'long' and because of 'nlattr'

Thanks for feedback. I think this version is definitely cleaner.
As a side note I've addressed Cong's comment regarding commit log.
Now it documents syscall itself instead of wrappers of syscall.

If anyone prefers to see patches in the browser, they are here:
https://git.kernel.org/cgit/linux/kernel/git/ast/bpf.git/log/?h=v7

Thanks!

Alexei Starovoitov (28):
  net: filter: add load 64-bit immediate eBPF instruction
  net: filter: split filter.h and expose eBPF to user space
  bpf: introduce syscall(BPF, ...) and BPF maps
  bpf: enable bpf syscall on x64 and i386
  bpf: add lookup/update/delete/iterate methods to BPF maps
  bpf: add hashtable type of BPF maps
  bpf: expand BPF syscall with program load/unload
  bpf: handle pseudo BPF_CALL insn
  bpf: verifier (add docs)
  bpf: verifier (add ability to receive verification log)
  bpf: handle pseudo BPF_LD_IMM64 insn
  bpf: verifier (add branch/goto checks)
  bpf: verifier (add verifier core)
  bpf: verifier (add state prunning optimization)
  bpf: allow eBPF programs to use maps
  bpf: split eBPF out of NET
  tracing: allow eBPF programs to be attached to events
  tracing: allow eBPF programs call printk()
  tracing: allow eBPF programs to be attached to kprobe/kretprobe
  tracing: allow eBPF programs to call ktime_get_ns() and get_current()
  samples: bpf: add mini eBPF library to manipulate maps and programs
  samples: bpf: example of tracing filters with eBPF
  bpf: verifier test
  bpf: llvm backend
  samples: bpf: elf file loader
  samples: bpf: eBPF example in C
  samples: bpf: counting eBPF example in C
  samples: bpf: IO latency analysis (iosnoop/heatmap)

 Documentation/networking/filter.txt|  313 +++-
 arch/Kconfig   |3 +
 arch/x86/net/bpf_jit_comp.c|   17 +
 arch/x86/syscalls/syscall_32.tbl   |1 +
 arch/x86/syscalls/syscall_64.tbl   |1 +
 fs/btrfs/super.c   |3 +
 include/linux/bpf.h|  139 ++
 include/linux/filter.h |  303 +---
 include/linux/ftrace_event.h   |5 +
 include/linux/syscalls.h   |3 +-
 include/trace/bpf_trace.h  |   23 +
 include/trace/ftrace.h |   25 +
 include/uapi/asm-generic/unistd.h  |4 +-
 include/uapi/linux/Kbuild  |1 +
 include/uapi/linux/bpf.h   |  439 +
 kernel/Makefile|2 +-
 kernel/bpf/Makefile|2 +-
 kernel/bpf/core.c  |   17 +
 kernel/bpf/hashtab.c   |  365 
 kernel/bpf/syscall.c   |  645 +++
 kernel/bpf/verifier.c  | 1910 
 kernel/sys_ni.c|3 +
 kernel/trace/Kconfig   |1 +
 kernel/trace/Makefile  |1 +
 kernel/trace/bpf_trace.c   |  264 +++
 kernel/trace/trace.h   |3 +
 kernel/trace/trace_events.c|   41 +-
 kernel/trace/trace_events_filter.c |   72 +-
 kernel/trace/trace_kprobe.c|   28 +
 kernel/trace/trace_syscalls.c  |   32 +
 lib/test_bpf.c |   21 +
 net/Kconfig|1 +
 net/core/filter.c  |2 +
 samples/bpf/Makefile   |   28 +
 

Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-26 Thread Andy Lutomirski
On Aug 26, 2014 7:29 PM, Alexei Starovoitov a...@plumgrid.com wrote:

 Hi Ingo, David,

 posting whole thing again as RFC to get feedback on syscall only.
 If syscall bpf(int cmd, union bpf_attr *attr, unsigned int size) is ok,
 I'll split them into small chunks as requested and will repost without RFC.

IMO it's much easier to review a syscall if we just look at a
specification of what it does.  The code is, in some sense, secondary.

--Andy
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-26 Thread Alexei Starovoitov
On Tue, Aug 26, 2014 at 8:56 PM, Andy Lutomirski l...@amacapital.net wrote:
 On Aug 26, 2014 7:29 PM, Alexei Starovoitov a...@plumgrid.com wrote:

 Hi Ingo, David,

 posting whole thing again as RFC to get feedback on syscall only.
 If syscall bpf(int cmd, union bpf_attr *attr, unsigned int size) is ok,
 I'll split them into small chunks as requested and will repost without RFC.

 IMO it's much easier to review a syscall if we just look at a
 specification of what it does.  The code is, in some sense, secondary.

'specification of what it does'... hmm, you mean beyond what's
there in commit logs and in Documentation/networking/filter.txt ?
Aren't samples at the end give an idea on 'what it does'?
I'm happy to add 'specification', I just don't understand yet what
it suppose to talk about beyond what's already written.
I understand that the patches are missing explanation on 'why'
the syscall is being added, but I don't think it's what you're asking...
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-26 Thread Andy Lutomirski
On Tue, Aug 26, 2014 at 9:35 PM, Alexei Starovoitov a...@plumgrid.com wrote:
 On Tue, Aug 26, 2014 at 8:56 PM, Andy Lutomirski l...@amacapital.net wrote:
 On Aug 26, 2014 7:29 PM, Alexei Starovoitov a...@plumgrid.com wrote:

 Hi Ingo, David,

 posting whole thing again as RFC to get feedback on syscall only.
 If syscall bpf(int cmd, union bpf_attr *attr, unsigned int size) is ok,
 I'll split them into small chunks as requested and will repost without RFC.

 IMO it's much easier to review a syscall if we just look at a
 specification of what it does.  The code is, in some sense, secondary.

 'specification of what it does'... hmm, you mean beyond what's
 there in commit logs and in Documentation/networking/filter.txt ?
 Aren't samples at the end give an idea on 'what it does'?
 I'm happy to add 'specification', I just don't understand yet what
 it suppose to talk about beyond what's already written.
 I understand that the patches are missing explanation on 'why'
 the syscall is being added, but I don't think it's what you're asking...

I mean a hopefully short document that defines what the syscall does.
It should be precise enough that one could, in principle, implement
the syscall just by reading the document and that one could use the
syscall just by reading the document.

Given that there's a whole instruction set to go with it, it may end
up being moderately complicated or saying things like see this other
thing for a description of the instruction set and there are some
extensible sets of functions you can call with it.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC v7 net-next 00/28] BPF syscall

2014-08-26 Thread Alexei Starovoitov
On Tue, Aug 26, 2014 at 9:49 PM, Andy Lutomirski l...@amacapital.net wrote:
 On Tue, Aug 26, 2014 at 9:35 PM, Alexei Starovoitov a...@plumgrid.com wrote:
 On Tue, Aug 26, 2014 at 8:56 PM, Andy Lutomirski l...@amacapital.net wrote:
 On Aug 26, 2014 7:29 PM, Alexei Starovoitov a...@plumgrid.com wrote:

 Hi Ingo, David,

 posting whole thing again as RFC to get feedback on syscall only.
 If syscall bpf(int cmd, union bpf_attr *attr, unsigned int size) is ok,
 I'll split them into small chunks as requested and will repost without RFC.

 IMO it's much easier to review a syscall if we just look at a
 specification of what it does.  The code is, in some sense, secondary.

 'specification of what it does'... hmm, you mean beyond what's
 there in commit logs and in Documentation/networking/filter.txt ?
 Aren't samples at the end give an idea on 'what it does'?
 I'm happy to add 'specification', I just don't understand yet what
 it suppose to talk about beyond what's already written.
 I understand that the patches are missing explanation on 'why'
 the syscall is being added, but I don't think it's what you're asking...

 I mean a hopefully short document that defines what the syscall does.
 It should be precise enough that one could, in principle, implement
 the syscall just by reading the document and that one could use the
 syscall just by reading the document.

 Given that there's a whole instruction set to go with it, it may end
 up being moderately complicated or saying things like see this other
 thing for a description of the instruction set and there are some
 extensible sets of functions you can call with it.

I'm still lost.

Here is the quote from Documentation/networking/filter.txt

'maps' is a generic storage of different types for sharing data between kernel
and userspace.

The maps are accessed from user space via BPF syscall,
which has commands:
- create a map with given type and attributes
  map_fd = bpf(BPF_MAP_CREATE, union bpf_attr *attr, u32 size)
  using attr-map_type, attr-key_size, attr-value_size, attr-max_entries
  returns process-local file descriptor or negative error

- lookup key in a given map
  err = bpf(BPF_MAP_LOOKUP_ELEM, union bpf_attr *attr, u32 size)
  using attr-map_fd, attr-key, attr-value
  returns zero and stores found elem into value or negative error

- create or update key/value pair in a given map
  err = bpf(BPF_MAP_UPDATE_ELEM, union bpf_attr *attr, u32 size)
  using attr-map_fd, attr-key, attr-value
  returns zero or negative error

- find and delete element by key in a given map
  err = bpf(BPF_MAP_DELETE_ELEM, union bpf_attr *attr, u32 size)
  using attr-map_fd, attr-key

- to delete map: close(fd)
  Exiting process will delete maps automatically

userspace programs uses this API to create/populate/read
maps that eBPF programs are concurrently updating.

and more in commit log:

- load eBPF program
  fd = bpf(BPF_PROG_LOAD, union bpf_attr *attr, u32 size)

  where 'attr' is
  struct {
  enum bpf_prog_type prog_type;
  __u32 insn_cnt;
  struct bpf_insn __user *insns;
  const char __user *license;
  };
  insns - array of eBPF instructions
  license - must be GPL compatible to call helper functions marked gpl_only

- unload eBPF program
  close(fd)


Isn't it short and describes what it does?
Do you want me to describe what eBPF program can do?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/