Re: [v8, bpf-next, 4/9] net/wireless/iwlwifi: fix iwlwifi_dev_ucode_error tracepoint

2018-05-24 Thread Alexei Starovoitov
On Wed, May 23, 2018 at 01:03:08PM +0200, Johannes Berg wrote:
> On Wed, 2018-03-28 at 12:05 -0700, Alexei Starovoitov wrote:
> > fix iwlwifi_dev_ucode_error tracepoint to pass pointer to a table
> > instead of all 17 arguments by value.
> > dvm/main.c and mvm/utils.c have 'struct iwl_error_event_table'
> > defined with very similar yet subtly different fields and offsets.
> > tracepoint is still common and using definition of 'struct 
> > iwl_error_event_table'
> > from dvm/commands.h while copying fields.
> > Long term this tracepoint probably should be split into two.
> 
> It would've been nice to CC the wireless list for wireless related
> patches ...

Ohh. I didn't realize that networking wireless doesn't fall under netdev.
I thought wireless folks are silent because they are embarrassed
by a function with 17 arguments.

> > --- a/drivers/net/wireless/intel/iwlwifi/iwl-devtrace.c
> > +++ b/drivers/net/wireless/intel/iwlwifi/iwl-devtrace.c
> > @@ -30,6 +30,7 @@
> >  #ifndef __CHECKER__
> >  #include "iwl-trans.h"
> >  
> > +#include "dvm/commands.h"
> 
> In particular, this breaks the whole driver abstraction.
> 
> > +++ b/drivers/net/wireless/intel/iwlwifi/mvm/utils.c
> > @@ -549,12 +549,7 @@ static void iwl_mvm_dump_lmac_error_log(struct iwl_mvm 
> > *mvm, u32 base)
> >  
> > IWL_ERR(mvm, "Loaded firmware version: %s\n", mvm->fw->fw_version);
> >  
> > -   trace_iwlwifi_dev_ucode_error(trans->dev, table.error_id, 
> > table.tsf_low,
> > - table.data1, table.data2, table.data3,
> > - table.blink2, table.ilink1,
> > - table.ilink2, table.bcon_time, 
> > table.gp1,
> > - table.gp2, table.fw_rev_type, 
> > table.major,
> > - table.minor, table.hw_ver, 
> > table.brd_ver);
> > +   trace_iwlwifi_dev_ucode_error(trans->dev, , table.hw_ver, 
> > table.brd_ver);
> 
> This is also utterly wrong because mvm has - for better or worse - a
> different type "struct iwl_error_event_table" in this file ...

As I was trying to explain in the commit log the single struct
is used in both places, but differences in two
"struct iwl_error_event_table" are carefully matched
field and by field. For two extra fields it was not
possible and they are passed separately as you can see above.
I still believe that tracepoint output is still exactly
the same before and after the patch.
I guess you see the breakage because new fields got
added into one "struct iwl_error_event_table",
but were not added to its evil twin "struct iwl_error_event_table"
with the same name after the patch landed ?
imo wireless folks need to avoid such naming conflicts.
I suggest to isolate common fields into separate base struct and
give two children structs different names.



Re: [PATCH v4 00/10] prevent bounds-check bypass via speculative execution

2018-01-20 Thread Alexei Starovoitov
On Sat, Jan 20, 2018 at 8:56 AM, Alexei Starovoitov
<alexei.starovoi...@gmail.com> wrote:
> On Fri, Jan 19, 2018 at 10:58:44PM -0800, Dan Williams wrote:
>> On Thu, Jan 18, 2018 at 4:01 PM, Dan Williams <dan.j.willi...@intel.com> 
>> wrote:
>> > Changes since v3 [1]
>> > * Drop 'ifence_array_ptr' and associated compile-time + run-time
>> >   switching and just use the masking approach all the time.
>> >
>> > * Convert 'get_user' to use pointer sanitization via masking rather than
>> >   lfence. '__get_user' and associated paths still rely on
>> >   lfence. (Linus)
>> >
>> >   "Basically, the rule is trivial: find all 'stac' users, and use
>> >address masking if those users already integrate the limit
>> >check, and lfence they don't."
>> >
>> > * At syscall entry sanitize the syscall number under speculation
>> >   to remove a user controlled pointer de-reference in kernel
>> >   space.  (Linus)
>> >
>> > * Fix a raw lfence in the kvm code (added for v4.15-rc8) to use
>> >   'array_ptr'.
>> >
>> > * Propose 'array_idx' as a way to sanitize user input that is
>> >   later used as an array index, but where the validation is
>> >   happening in a different code block than the array reference.
>> >   (Christian).
>> >
>> > * Fix grammar in speculation.txt (Kees)
>> >
>> > ---
>> >
>> > Quoting Mark's original RFC:
>> >
>> > "Recently, Google Project Zero discovered several classes of attack
>> > against speculative execution. One of these, known as variant-1, allows
>> > explicit bounds checks to be bypassed under speculation, providing an
>> > arbitrary read gadget. Further details can be found on the GPZ blog [2]
>> > and the Documentation patch in this series."
>> >
>> > A precondition of using this attack on the kernel is to get a user
>> > controlled pointer de-referenced (under speculation) in privileged code.
>> > The primary source of user controlled pointers in the kernel is the
>> > arguments passed to 'get_user' and '__get_user'. An example of other
>> > user controlled pointers are user-controlled array / pointer offsets.
>> >
>> > Better tooling is needed to find more arrays / pointers with user
>> > controlled indices / offsets that can be converted to use 'array_ptr' or
>> > 'array_idx'. A few are included in this set, and these are not expected
>> > to be complete. That said, the 'get_user' protections raise the bar on
>> > finding a vulnerable gadget in the kernel.
>> >
>> > These patches are also available via the 'nospec-v4' git branch here:
>> >
>> > git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux nospec-v4
>>
>> I've pushed out a nospec-v4.1 with the below minor cleanup, a fixup of
>> the changelog for "kvm, x86: fix spectre-v1 mitigation", and added
>> Paolo's ack.
>>
>>  git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux nospec-v4.1
>>
>> diff --git a/include/linux/nospec.h b/include/linux/nospec.h
>> index 8af35be1869e..b8a9222e34d1 100644
>> --- a/include/linux/nospec.h
>> +++ b/include/linux/nospec.h
>> @@ -37,7 +37,7 @@ static inline unsigned long array_ptr_mask(unsigned
>> long idx, unsigned long sz)
>> unsigned long _i = (idx);   \
>> unsigned long _mask = array_ptr_mask(_i, (sz)); \
>> \
>> -   __u._ptr = _arr + (_i & _mask); \
>> +   __u._ptr = _arr + _i;   \
>> __u._bit &= _mask;  \
>> __u._ptr;   \
>
> hmm. I'm not sure it's the right thing to do, since the macro
> is forcing cpu to speculate subsequent load from null instead
> of valid pointer.
> As Linus said: "
>  So that __u._bit masking wasn't masking
>  the pointer, it was masking the value that was *read* from the
>  pointer, so that you could know that an invalid access returned
>  0/NULL, not just the first value in the array.
> "
> imo just
>   return _arr + (_i & _mask);
> is enough. No need for union games.
> The cpu will speculate the load from _arr[0] if _i is out of bounds
> which is the same as if user passed _i == 0 which would have passed
> bounds check anyway, so I don't see any data leak from populating
> cache with _arr[0] data. In-bounds access can do that just as well
> without any speculation.

scratch that. It's array_ptr, not array_access.
The code will do if (!ptr) later, so yeah this api is fine.


Re: [PATCH v4 00/10] prevent bounds-check bypass via speculative execution

2018-01-20 Thread Alexei Starovoitov
On Fri, Jan 19, 2018 at 10:58:44PM -0800, Dan Williams wrote:
> On Thu, Jan 18, 2018 at 4:01 PM, Dan Williams  
> wrote:
> > Changes since v3 [1]
> > * Drop 'ifence_array_ptr' and associated compile-time + run-time
> >   switching and just use the masking approach all the time.
> >
> > * Convert 'get_user' to use pointer sanitization via masking rather than
> >   lfence. '__get_user' and associated paths still rely on
> >   lfence. (Linus)
> >
> >   "Basically, the rule is trivial: find all 'stac' users, and use
> >address masking if those users already integrate the limit
> >check, and lfence they don't."
> >
> > * At syscall entry sanitize the syscall number under speculation
> >   to remove a user controlled pointer de-reference in kernel
> >   space.  (Linus)
> >
> > * Fix a raw lfence in the kvm code (added for v4.15-rc8) to use
> >   'array_ptr'.
> >
> > * Propose 'array_idx' as a way to sanitize user input that is
> >   later used as an array index, but where the validation is
> >   happening in a different code block than the array reference.
> >   (Christian).
> >
> > * Fix grammar in speculation.txt (Kees)
> >
> > ---
> >
> > Quoting Mark's original RFC:
> >
> > "Recently, Google Project Zero discovered several classes of attack
> > against speculative execution. One of these, known as variant-1, allows
> > explicit bounds checks to be bypassed under speculation, providing an
> > arbitrary read gadget. Further details can be found on the GPZ blog [2]
> > and the Documentation patch in this series."
> >
> > A precondition of using this attack on the kernel is to get a user
> > controlled pointer de-referenced (under speculation) in privileged code.
> > The primary source of user controlled pointers in the kernel is the
> > arguments passed to 'get_user' and '__get_user'. An example of other
> > user controlled pointers are user-controlled array / pointer offsets.
> >
> > Better tooling is needed to find more arrays / pointers with user
> > controlled indices / offsets that can be converted to use 'array_ptr' or
> > 'array_idx'. A few are included in this set, and these are not expected
> > to be complete. That said, the 'get_user' protections raise the bar on
> > finding a vulnerable gadget in the kernel.
> >
> > These patches are also available via the 'nospec-v4' git branch here:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux nospec-v4
> 
> I've pushed out a nospec-v4.1 with the below minor cleanup, a fixup of
> the changelog for "kvm, x86: fix spectre-v1 mitigation", and added
> Paolo's ack.
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux nospec-v4.1
> 
> diff --git a/include/linux/nospec.h b/include/linux/nospec.h
> index 8af35be1869e..b8a9222e34d1 100644
> --- a/include/linux/nospec.h
> +++ b/include/linux/nospec.h
> @@ -37,7 +37,7 @@ static inline unsigned long array_ptr_mask(unsigned
> long idx, unsigned long sz)
> unsigned long _i = (idx);   \
> unsigned long _mask = array_ptr_mask(_i, (sz)); \
> \
> -   __u._ptr = _arr + (_i & _mask); \
> +   __u._ptr = _arr + _i;   \
> __u._bit &= _mask;  \
> __u._ptr;   \

hmm. I'm not sure it's the right thing to do, since the macro
is forcing cpu to speculate subsequent load from null instead
of valid pointer.
As Linus said: "
 So that __u._bit masking wasn't masking
 the pointer, it was masking the value that was *read* from the
 pointer, so that you could know that an invalid access returned
 0/NULL, not just the first value in the array.
"
imo just
  return _arr + (_i & _mask);
is enough. No need for union games.
The cpu will speculate the load from _arr[0] if _i is out of bounds
which is the same as if user passed _i == 0 which would have passed
bounds check anyway, so I don't see any data leak from populating
cache with _arr[0] data. In-bounds access can do that just as well
without any speculation.



Re: using verifier to ensure a BPF program uses certain metadata?

2017-10-18 Thread Alexei Starovoitov
On Wed, Oct 18, 2017 at 08:56:31AM +0200, Johannes Berg wrote:
> > > Now, I realize that people could trivially just work around this in
> > > their program if they wanted, but I think most will take the
> > > reminder
> > > and just implement
> > > 
> > > if (ctx->is_data_ethernet)
> > > return DROP_FRAME;
> > > 
> > > instead, since mostly data frames will not be very relevant to
> > > them.
> > > 
> > > What do you think?
> > 
> > sounds fine and considering new verifier ops after Jakub refactoring
> > a check that is_data_ethernet was accessed would fit nicely.
> > Without void** hack.
> 
> Ok, thanks! I'll have to check what Jakub is doing there, do you have a
> pointer to that refactoring?

something similar to
commit 4f9218aaf8a4 ("bpf: move knowledge about post-translation offsets out of 
verifier")



Re: using verifier to ensure a BPF program uses certain metadata?

2017-10-17 Thread Alexei Starovoitov
On Mon, Oct 16, 2017 at 09:38:44AM +0200, Johannes Berg wrote:
> Hi,
> 
> As we discussed in April already (it's really been that long...), I'd
> wanted to allow using BPF to filter wireless monitor frames, to enable
> new use cases and higher performance in monitoring. I have some code,
> at
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next.git/log/?h=bpf

bpf bits looks pretty straightforward.
attach looks fine too. I'm assuming there is some rtnl or other lock,
so multiple assigns cannot race?
It's missing query interface though.
Please add support to return prog_id.

> which implements parts of this. It's still missing the TX status path
> and perhaps associated metadata, but that part is easy.
> 
> The bigger "problem" is that we're going to be adding support for
> devices that have 802.11->Ethernet conversion already in hardware, and
> in that case the notion that the filter program will get an 802.11
> header to look at is no longer right.
> 
> Now, most likely for the actual in-service monitoring we'll actually
> have to reconstitute the 802.11 header on the fly (in pure monitoring
> where nothing else is active, we can just disable the conversion), but
> the filtering shouldn't really be reliant on that, since that's not the
> cheapest thing to do.
> 
> The obvious idea around this is to add a metadata field (just a bit
> really), something like "is_data_ethernet", saying that it was both a
> data frame and is already converted to have an Ethernet header.
> However, since these devices don't really exist yet for the vast
> majority of people, I'm a bit afraid that we'll find later a lot of
> code simply ignoring this field and looking at the "802.11" header,
> which is then broken if it encounters an Ethernet header instead.
> 
> Are there lies my question: If we added a new callback to
> bpf_verifier_ops (e.g. "post_verifier_check"), to be called after the
> normal verification, and also added a context argument to
> "is_valid_access" (*), we could easily track that this new metadata
> field is accessed, and reject programs that don't access it at all.
> 
> Now, I realize that people could trivially just work around this in
> their program if they wanted, but I think most will take the reminder
> and just implement
> 
> if (ctx->is_data_ethernet)
> return DROP_FRAME;
> 
> instead, since mostly data frames will not be very relevant to them.
> 
> What do you think?

sounds fine and considering new verifier ops after Jakub refactoring
a check that is_data_ethernet was accessed would fit nicely.
Without void** hack.



Re: [RFC 1/3] bpf/wireless: add wifimon program type

2017-04-14 Thread Alexei Starovoitov
On Wed, Apr 12, 2017 at 05:30:40PM +0200, Johannes Berg wrote:
> On Wed, 2017-04-12 at 11:19 -0400, David Miller wrote:
> > 
> > > Instead it may make more sense to just have a "wifi_info(skb,
> > info)"
> > > function you can call, e.g. with a structure that has various
> > fields
> > > and flags to see which you care about.
> > 
> > I would advise against this, let the parsing part use __sk_buff and
> > therefore have maximum flexibility.  You really cannot predict the
> > future, so in my opinion it might be unwise to constrain at this
> > point.
> 
> So my point with the wifi_info() function to call from the BPF program
> was just that putting something that's not already in struct sk_buff
> into __sk_buff doesn't really make a lot of sense - we still have to
> actually build it somewhere/somehow without knowing if it's going to be
> needed by the program. We already have things like prog->cb_access and
> prog->dst_needed now, but I'm not sure we want to blow that up much.
> 
> At the same time, technically I *do* have the information in skb->cb,
> but the format thereof is something I really wouldn't want to let
> trickle into the ABI. Therefore, I think if somebody needs something
> like the bitrate, we should have a wifi_info() function that can return
> the bitrate (among other things) without having to pre-build the data.
> We can still cache it in mac80211 if multiple programs are there, dunno
> if that makes sense.
> 
> Nevertheless, I think if I don't use __sk_buff as the program argument
> then things get really messy, so I'll stick to that, and avoid adding
> anything to it as much as possible.

so today only 'len' field is needed, but the plan to add wifi specific
stuff to bpf context?
If so, adding these very wifi specific fields to __sk_buff will confuse
other program types, so new program type (like you did) and new
'struct bpf_wifi_md' are probably cleaner.
Physically the R1 register to bpf program will still be 'struct sk_buff',
but from bpf program side it will be:
struct bpf_wifi_md {
  __u32 len;
  __u32 wifi_things;
};

At the same time if most of the __sk_buff fields can be useful to wifimon,
then just use it as-is (without restricting to 'len' only) and add wifi
specific fields to it with a comment.