from:"Kees Cook"

Re: [PATCH net-next v5 0/7] introduce DEFINE_FLEX() macro

2023-09-12 Thread Kees Cook

On Tue, Sep 12, 2023 at 07:59:30AM -0400, Przemek Kitszel wrote:
> Add DEFINE_FLEX() macro, that helps on-stack allocation of structures
> with trailing flex array member.
> Expose __struct_size() macro which reads size of data allocated
> by DEFINE_FLEX().
> 
> Accompany new macros introduction with actual usage,
> in the ice driver - hence targeting for netdev tree.
> 
> Obvious benefits include simpler resulting code, less heap usage,
> less error checking. Less obvious is the fact that compiler has
> more room to optimize, and as a whole, even with more stuff on the stack,
> we end up with overall better (smaller) report from bloat-o-meter:
> add/remove: 8/6 grow/shrink: 7/18 up/down: 2211/-2270 (-59)
> (individual results in each patch).
> 
> v5: same as v4, just not RFC
> v4: _Static_assert() to ensure compiletime const count param
> v3: tidy up 1st patch
> v2: Kees: reusing __struct_size() instead of doubling it as a new macro
> 
> Przemek Kitszel (7):
>   overflow: add DEFINE_FLEX() for on-stack allocs
>   ice: ice_sched_remove_elems: replace 1 elem array param by u32
>   ice: drop two params of ice_aq_move_sched_elems()
>   ice: make use of DEFINE_FLEX() in ice_ddp.c
>   ice: make use of DEFINE_FLEX() for struct ice_aqc_add_tx_qgrp
>   ice: make use of DEFINE_FLEX() for struct ice_aqc_dis_txq_item
>   ice: make use of DEFINE_FLEX() in ice_switch.c

Looks good to me! Feel free to pick up via netdev.

-Kees

-- 
Kees Cook

Re: [PATCH][next] sctp: Fix out-of-bounds warning in sctp_process_asconf_param()

2021-04-16 Thread Kees Cook

On Fri, Apr 16, 2021 at 02:12:36PM -0500, Gustavo A. R. Silva wrote:
> Fix the following out-of-bounds warning:
> 
> net/sctp/sm_make_chunk.c:3150:4: warning: 'memcpy' offset [17, 28] from the 
> object at 'addr' is out of the bounds of referenced subobject 'v4' with type 
> 'struct sockaddr_in' at offset 0 [-Warray-bounds]
> 
> This helps with the ongoing efforts to globally enable -Warray-bounds
> and get us closer to being able to tighten the FORTIFY_SOURCE routines
> on memcpy().
> 
> Link: https://github.com/KSPP/linux/issues/109
> Reported-by: kernel test robot 
> Signed-off-by: Gustavo A. R. Silva 

Yup!

Reviewed-by: Kees Cook 

-- 
Kees Cook

Re: [PATCH v3 2/2] wl3501_cs: Fix out-of-bounds warnings in wl3501_mgmt_join

2021-04-15 Thread Kees Cook

On Wed, Apr 14, 2021 at 06:45:15PM -0500, Gustavo A. R. Silva wrote:
> Fix the following out-of-bounds warnings by adding a new structure
> wl3501_req instead of duplicating the same members in structure
> wl3501_join_req and wl3501_scan_confirm:
> 
> arch/x86/include/asm/string_32.h:182:25: warning: '__builtin_memcpy' offset 
> [39, 108] from the object at 'sig' is out of the bounds of referenced 
> subobject 'beacon_period' with type 'short unsigned int' at offset 36 
> [-Warray-bounds]
> arch/x86/include/asm/string_32.h:182:25: warning: '__builtin_memcpy' offset 
> [25, 95] from the object at 'sig' is out of the bounds of referenced 
> subobject 'beacon_period' with type 'short unsigned int' at offset 22 
> [-Warray-bounds]
> 
> Refactor the code, accordingly:
> 
> $ pahole -C wl3501_req drivers/net/wireless/wl3501_cs.o
> struct wl3501_req {
> u16beacon_period;/* 0 2 */
> u16dtim_period;  /* 2 2 */
> u16cap_info; /* 4 2 */
> u8 bss_type; /* 6 1 */
> u8 bssid[6]; /* 7 6 */
> struct iw_mgmt_essid_pset  ssid; /*1334 */
> struct iw_mgmt_ds_pset ds_pset;  /*47 3 */
> struct iw_mgmt_cf_pset cf_pset;  /*50 8 */
> struct iw_mgmt_ibss_pset   ibss_pset;/*58 4 */
> struct iw_mgmt_data_rset   bss_basic_rset;   /*6210 */
> 
> /* size: 72, cachelines: 2, members: 10 */
> /* last cacheline: 8 bytes */
> };
> 
> $ pahole -C wl3501_join_req drivers/net/wireless/wl3501_cs.o
> struct wl3501_join_req {
> u16next_blk; /* 0 2 */
> u8 sig_id;   /* 2 1 */
> u8 reserved; /* 3 1 */
> struct iw_mgmt_data_rset   operational_rset; /* 410 */
> u16reserved2;/*14 2 */
> u16timeout;  /*16 2 */
> u16probe_delay;  /*18 2 */
> u8 timestamp[8]; /*20 8 */
> u8 local_time[8];/*28 8 */
> struct wl3501_req  req;  /*3672 */
> 
> /* size: 108, cachelines: 2, members: 10 */
> /* last cacheline: 44 bytes */
> };
> 
> $ pahole -C wl3501_scan_confirm drivers/net/wireless/wl3501_cs.o
> struct wl3501_scan_confirm {
> u16next_blk; /* 0 2 */
> u8 sig_id;   /* 2 1 */
> u8 reserved; /* 3 1 */
> u16status;   /* 4 2 */
> char   timestamp[8]; /* 6 8 */
> char   localtime[8]; /*14 8 */
> struct wl3501_req  req;  /*2272 */
> /* --- cacheline 1 boundary (64 bytes) was 30 bytes ago --- */
> u8 rssi; /*94 1 */
> 
> /* size: 96, cachelines: 2, members: 8 */
> /* padding: 1 */
> /* last cacheline: 32 bytes */
> };
> 
> The problem is that the original code is trying to copy data into a
> bunch of struct members adjacent to each other in a single call to
> memcpy(). Now that a new struct wl3501_req enclosing all those adjacent
> members is introduced, memcpy() doesn't overrun the length of
> &sig.beacon_period and &this->bss_set[i].beacon_period, because the
> address of the new struct object _req_ is used as the destination,
> instead.
> 
> This helps with the ongoing efforts to globally enable -Warray-bounds
> and get us closer to being able to tighten the FORTIFY_SOURCE routines
> on memcpy().
> 
> Link: https://github.com/KSPP/linux/issues/109
> Reported-by: kernel test robot 
> Signed-off-by: Gustavo A. R. Silva 

Awesome! Thank you for this solution.

Reviewed-by: Kees Cook 

-- 
Kees Cook

Re: [PATCH v2 2/2][next] wl3501_cs: Fix out-of-bounds warning in wl3501_mgmt_join

2021-04-07 Thread Kees Cook

 cap_info;
> - u8  bss_type;
> - u8  bssid[ETH_ALEN];
> - struct iw_mgmt_essid_pset   ssid;
> - struct iw_mgmt_ds_pset  ds_pset;
> - struct iw_mgmt_cf_pset  cf_pset;
> - struct iw_mgmt_ibss_psetibss_pset;
> - struct iw_mgmt_data_rsetbss_basic_rset;
> + struct {
> + u16 beacon_period;
> + u16 dtim_period;
> + u16 cap_info;
> + u8  bss_type;
> + u8  bssid[ETH_ALEN];
> + struct iw_mgmt_essid_pset   ssid;
> + struct iw_mgmt_ds_pset  ds_pset;
> + struct iw_mgmt_cf_pset  cf_pset;
> + struct iw_mgmt_ibss_psetibss_pset;
> + struct iw_mgmt_data_rsetbss_basic_rset;
> + } req;
>  };
>  
>  struct wl3501_join_confirm {
> diff --git a/drivers/net/wireless/wl3501_cs.c 
> b/drivers/net/wireless/wl3501_cs.c
> index e149ef81d6cc..399d3bd2ae76 100644
> --- a/drivers/net/wireless/wl3501_cs.c
> +++ b/drivers/net/wireless/wl3501_cs.c
> @@ -590,7 +590,7 @@ static int wl3501_mgmt_join(struct wl3501_card *this, u16 
> stas)
>   struct wl3501_join_req sig = {
>   .sig_id   = WL3501_SIG_JOIN_REQ,
>   .timeout  = 10,
> - .ds_pset = {
> + .req.ds_pset = {
>   .el = {
>   .id  = IW_MGMT_INFO_ELEMENT_DS_PARAMETER_SET,
>   .len = 1,
> @@ -599,7 +599,7 @@ static int wl3501_mgmt_join(struct wl3501_card *this, u16 
> stas)
>   },
>   };
>  
> - memcpy(&sig.beacon_period, &this->bss_set[stas].beacon_period, 72);
> + memcpy(&sig.req, &this->bss_set[stas].beacon_period, sizeof(sig.req));

If not, then probably something like this should be added to make sure
nothing unexpected happens to change structure sizes:

BUILD_BUG_ON(sizeof(sig.req) != 72);

>   return wl3501_esbq_exec(this, &sig, sizeof(sig));
>  }
>  
> -- 
> 2.27.0
> 

-- 
Kees Cook

Re: [PATCH v2 1/2][next] wl3501_cs: Fix out-of-bounds warning in wl3501_send_pkt

2021-04-07 Thread Kees Cook

On Wed, Mar 31, 2021 at 04:44:29PM -0500, Gustavo A. R. Silva wrote:
> Fix the following out-of-bounds warning by enclosing
> structure members daddr and saddr into new struct addr:
> 
> arch/x86/include/asm/string_32.h:182:25: warning: '__builtin_memcpy' offset 
> [18, 23] from the object at 'sig' is out of the bounds of referenced 
> subobject 'daddr' with type 'u8[6]' {aka 'unsigned char[6]'} at offset 11 
> [-Warray-bounds]
> 
> Refactor the code, accordingly:
> 
> $ pahole -C wl3501_md_req drivers/net/wireless/wl3501_cs.o
> struct wl3501_md_req {
>   u16next_blk; /* 0 2 */
>   u8 sig_id;   /* 2 1 */
>   u8 routing;  /* 3 1 */
>   u16data; /* 4 2 */
>   u16size; /* 6 2 */
>   u8 pri;  /* 8 1 */
>   u8 service_class;/* 9 1 */
>   struct {
>   u8 daddr[6]; /*10 6 */
>   u8 saddr[6]; /*16 6 */
>   } addr;  /*1012 */
> 
>   /* size: 22, cachelines: 1, members: 8 */
>   /* last cacheline: 22 bytes */
> };
> 
> The problem is that the original code is trying to copy data into a
> couple of arrays adjacent to each other in a single call to memcpy().
> Now that a new struct _addr_ enclosing those two adjacent arrays
> is introduced, memcpy() doesn't overrun the length of &sig.daddr[0],
> because the address of the new struct object _addr_ is used as
> destination, instead.
> 
> Also, this helps with the ongoing efforts to enable -Warray-bounds and
> avoid confusing the compiler.
> 
> Link: https://github.com/KSPP/linux/issues/109
> Reported-by: kernel test robot 
> Build-tested-by: kernel test robot 
> Link: https://lore.kernel.org/lkml/60641d9b.2enledogsdcsoav2%25...@intel.com/
> Signed-off-by: Gustavo A. R. Silva 

Thanks, this makes the code much easier for the compiler to validate
at compile time. These cross-field memcpy()s are weird. I like the
solution here.

Reviewed-by: Kees Cook 

-- 
Kees Cook

Re: [PATCH RESEND][next] rtl8xxxu: Fix fall-through warnings for Clang

2021-03-10 Thread Kees Cook

On Wed, Mar 10, 2021 at 02:51:24PM -0500, Jes Sorensen wrote:
> On 3/10/21 2:45 PM, Kees Cook wrote:
> > On Wed, Mar 10, 2021 at 02:31:57PM -0500, Jes Sorensen wrote:
> >> On 3/10/21 2:14 PM, Kees Cook wrote:
> >>> Hm, this conversation looks like a miscommunication, mainly? I see
> >>> Gustavo, as requested by many others[1], replacing the fallthrough
> >>> comments with the "fallthrough" statement. (This is more than just a
> >>> "Clang doesn't parse comments" issue.)
> >>>
> >>> This could be a tree-wide patch and not bother you, but Greg KH has
> >>> generally advised us to send these changes broken out. Anyway, this
> >>> change still needs to land, so what would be the preferred path? I think
> >>> Gustavo could just carry it for Linus to merge without bothering you if
> >>> that'd be preferred?
> >>
> >> I'll respond with the same I did last time, fallthrough is not C and
> >> it's ugly.
> > 
> > I understand your point of view, but this is not the consensus[1] of
> > the community. "fallthrough" is a macro, using the GCC fallthrough
> > attribute, with the expectation that we can move to the C17/C18
> > "[[fallthrough]]" statement once it is finalized by the C standards
> > body.
> 
> I don't know who decided on that, but I still disagree. It's an ugly and
> pointless change that serves little purpose. We shouldn't have allowed
> the ugly /* fall-through */ comments in either, but at least they didn't
> mess with the code. I guess when you give someone an inch, they take a mile.
> 
> Last time this came up, the discussion was that clang refused to fix
> their brokenness and therefore this nonsense was being pushed into the
> kernel. It's still a pointless argument, if clang can't fix it's crap,
> then stop using it.
> 
> As Kalle correctly pointed out, none of the previous comments to this
> were addressed, the patches were just reposted as fact. Not exactly a
> nice way to go about it either.

Do you mean changing the commit log to re-justify these changes? I
guess that could be done, but based on the thread, it didn't seem to
be needed. The change is happening to match the coding style consensus
reached to give the kernel the flexibility to move from a gcc extension
to the final C standards committee results without having to do treewide
commits again (i.e. via the macro).

-- 
Kees Cook

Re: [PATCH RESEND][next] rtl8xxxu: Fix fall-through warnings for Clang

2021-03-10 Thread Kees Cook

On Wed, Mar 10, 2021 at 02:31:57PM -0500, Jes Sorensen wrote:
> On 3/10/21 2:14 PM, Kees Cook wrote:
> > On Fri, Mar 05, 2021 at 03:40:33PM +0200, Kalle Valo wrote:
> >> "Gustavo A. R. Silva"  writes:
> >>
> >>> In preparation to enable -Wimplicit-fallthrough for Clang, fix
> >>> multiple warnings by replacing /* fall through */ comments with
> >>> the new pseudo-keyword macro fallthrough; instead of letting the
> >>> code fall through to the next case.
> >>>
> >>> Notice that Clang doesn't recognize /* fall through */ comments as
> >>> implicit fall-through markings.
> >>>
> >>> Link: https://github.com/KSPP/linux/issues/115
> >>> Signed-off-by: Gustavo A. R. Silva 
> >>
> >> It's not cool that you ignore the comments you got in [1], then after a
> >> while mark the patch as "RESEND" and not even include a changelog why it
> >> was resent.
> >>
> >> [1] 
> >> https://patchwork.kernel.org/project/linux-wireless/patch/d522f387b2d0dde774785c7169c1f25aa529989d.1605896060.git.gustavo...@kernel.org/
> > 
> > Hm, this conversation looks like a miscommunication, mainly? I see
> > Gustavo, as requested by many others[1], replacing the fallthrough
> > comments with the "fallthrough" statement. (This is more than just a
> > "Clang doesn't parse comments" issue.)
> > 
> > This could be a tree-wide patch and not bother you, but Greg KH has
> > generally advised us to send these changes broken out. Anyway, this
> > change still needs to land, so what would be the preferred path? I think
> > Gustavo could just carry it for Linus to merge without bothering you if
> > that'd be preferred?
> 
> I'll respond with the same I did last time, fallthrough is not C and
> it's ugly.

I understand your point of view, but this is not the consensus[1] of
the community. "fallthrough" is a macro, using the GCC fallthrough
attribute, with the expectation that we can move to the C17/C18
"[[fallthrough]]" statement once it is finalized by the C standards
body.

-Kees

[1] 
https://www.kernel.org/doc/html/latest/process/deprecated.html#implicit-switch-case-fall-through

-- 
Kees Cook

Re: [PATCH RESEND][next] rtl8xxxu: Fix fall-through warnings for Clang

2021-03-10 Thread Kees Cook

On Fri, Mar 05, 2021 at 03:40:33PM +0200, Kalle Valo wrote:
> "Gustavo A. R. Silva"  writes:
> 
> > In preparation to enable -Wimplicit-fallthrough for Clang, fix
> > multiple warnings by replacing /* fall through */ comments with
> > the new pseudo-keyword macro fallthrough; instead of letting the
> > code fall through to the next case.
> >
> > Notice that Clang doesn't recognize /* fall through */ comments as
> > implicit fall-through markings.
> >
> > Link: https://github.com/KSPP/linux/issues/115
> > Signed-off-by: Gustavo A. R. Silva 
> 
> It's not cool that you ignore the comments you got in [1], then after a
> while mark the patch as "RESEND" and not even include a changelog why it
> was resent.
> 
> [1] 
> https://patchwork.kernel.org/project/linux-wireless/patch/d522f387b2d0dde774785c7169c1f25aa529989d.1605896060.git.gustavo...@kernel.org/

Hm, this conversation looks like a miscommunication, mainly? I see
Gustavo, as requested by many others[1], replacing the fallthrough
comments with the "fallthrough" statement. (This is more than just a
"Clang doesn't parse comments" issue.)

This could be a tree-wide patch and not bother you, but Greg KH has
generally advised us to send these changes broken out. Anyway, this
change still needs to land, so what would be the preferred path? I think
Gustavo could just carry it for Linus to merge without bothering you if
that'd be preferred?

-Kees

[1] https://git.kernel.org/linus/294f69e662d1570703e9b56e95be37a9fd3afba5

-- 
Kees Cook

Re: [PATCH v2] seccomp: Improve performace by optimizing rmb()

2021-02-10 Thread Kees Cook

On Fri, 5 Feb 2021 11:34:09 +0800, wanghongzhe wrote:
> According to kees's suggest, we started with the patch that just replaces
> rmb() with smp_rmb() and did a performace test with UnixBench. The results
> showed the overhead about 2.53% in rmb() test compared to the smp_rmb()
> one, in a x86-64 kernel with CONFIG_SMP enabled running inside a qemu-kvm
> vm. The test is a "syscall" testcase in UnixBench, which executes 5
> syscalls in a loop during a certain timeout (100 second in our test) and
> counts the total number of executions of this 5-syscall sequence. We set a
> seccomp filter with all allow rule for all used syscalls in this test
> (which will go bitmap path) to make sure the rmb() will be executed. The
> details for the test:
> 
> [...]

Applied to for-next/seccomp, thanks!

[1/1] seccomp: Improve performace by optimizing rmb()
      https://git.kernel.org/kees/c/a381b70a1cf8

-- 
Kees Cook

Re: [PATCH v1 1/1] Firstly, as Andy mentioned, this should be smp_rmb() instead of rmb(). considering that TSYNC is a cross-thread situation, and rmb() is a mandatory barrier which should not be used

2021-02-02 Thread Kees Cook

On Tue, Feb 02, 2021 at 06:13:07PM +0800, wanghongzhe wrote:
> Secondly, the smp_rmb() should be put between reading SYSCALL_WORK_SECCOMP 
> and reading
> seccomp.mode, not between reading seccomp.mode and seccomp->filter, to make
> sure that any changes to mode from another thread have been seen after
> SYSCALL_WORK_SECCOMP was seen, as the original comment shown. This issue 
> seems to be
> misintroduced at 13aa72f0fd0a9f98a41cefb662487269e2f1ad65 which aims to
> refactor the filter callback and the API. So the intuitive solution is to put
> it back like:
> 
> Thirdly, however, we can go further to improve the performace of checking
> syscall, considering that smp_rmb is always executed on the syscall-check
> path at each time for both FILTER and STRICT check while the TSYNC case
> which may lead to race condition is just a rare situation, and that in
> some arch like Arm64 smp_rmb is dsb(ishld) not a cheap barrier() in x86-64.
> 
> As a result, smp_rmb() should only be executed when necessary, e.g, it is
> only necessary when current thread's mode is SECCOMP_MODE_DISABLED at the
> first TYSNCed time, because after that the current thread's mode will always
> be SECCOMP_MODE_FILTER (and SYSCALL_WORK_SECCOMP will always be set) and can 
> not be
> changed anymore by anyone. In other words, after that, any thread can not
> change the mode (and SYSCALL_WORK_SECCOMP), so the race condition 
> disappeared, and
> no more smb_rmb() needed ever.
> 
> So the solution is to read mode again behind smp_rmb() after the mode is seen
> as SECCOMP_MODE_DISABLED by current thread at the first TSYNCed time, and if
> the new mode don't equals to SECCOMP_MODE_FILTER, do BUG(), go to FILTER path
> otherwise.
> 
> RFC -> v1:
>  - replace rmb() with smp_rmb()
>  - move the smp_rmb() logic to the middle between SYSCALL_WORK_SECCOMP and 
> mode
> 
> Signed-off-by: wanghongzhe 
> Reviewed-by: Andy Lutomirski 
> ---
>  kernel/seccomp.c | 25 +
>  1 file changed, 17 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index 952dc1c90229..a621fb913ec6 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -1160,12 +1160,6 @@ static int __seccomp_filter(int this_syscall, const 
> struct seccomp_data *sd,
>   int data;
>   struct seccomp_data sd_local;
>  
> - /*
> -  * Make sure that any changes to mode from another thread have
> -  * been seen after SYSCALL_WORK_SECCOMP was seen.
> -  */
> - rmb();
> -
>   if (!sd) {
>   populate_seccomp_data(&sd_local);
>   sd = &sd_local;
> @@ -1289,7 +1283,6 @@ static int __seccomp_filter(int this_syscall, const 
> struct seccomp_data *sd,
>  
>  int __secure_computing(const struct seccomp_data *sd)
>  {
> - int mode = current->seccomp.mode;
>   int this_syscall;
>  
>   if (IS_ENABLED(CONFIG_CHECKPOINT_RESTORE) &&
> @@ -1299,10 +1292,26 @@ int __secure_computing(const struct seccomp_data *sd)
>   this_syscall = sd ? sd->nr :
>   syscall_get_nr(current, current_pt_regs());
>  
> - switch (mode) {
> + /*
> +  * Make sure that any changes to mode from another thread have
> +  * been seen after SYSCALL_WORK_SECCOMP was seen.
> +  */
> + smp_rmb();

Let's start with a patch that just replaces rmb() with smp_rmb() and
then work on optimizing. Can you provide performance numbers that show
rmb() (and soon smp_rmb()) is causing actual problems here?

> +
> + switch (current->seccomp.mode) {
>   case SECCOMP_MODE_STRICT:
>   __secure_computing_strict(this_syscall);  /* may call do_exit */
>   return 0;
> + /*
> +  * Make sure that change to mode (from SECCOMP_MODE_DISABLED to
> +  * SECCOMP_MODE_FILTER) from another thread using TSYNC ability
> +  * have been seen after SYSCALL_WORK_SECCOMP was seen. Read mode again 
> behind
> +  * smp_rmb(), if it equals SECCOMP_MODE_FILTER, go to the right path.
> +  */
> + case SECCOMP_MODE_DISABLED:
> + smp_rmb();
> + if (unlikely(current->seccomp.mode != SECCOMP_MODE_FILTER))
> + BUG();

BUG() should never be used[1]. This is a recoverable situation, I think, and
should be handled as such.

-Kees

[1] 
https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on

>   case SECCOMP_MODE_FILTER:
>   return __seccomp_filter(this_syscall, sd, false);
>   default:
> -- 
> 2.19.1
> 

-- 
Kees Cook

Re: UBSAN: array-index-out-of-bounds in arch_uprobe_analyze_insn

2020-12-01 Thread Kees Cook

Hi,

There appears to be a problem with prefix counting for the instruction
decoder. It looks like insn_get_prefixes() isn't keeping "nb" and "nbytes"
in sync correctly:

while (inat_is_legacy_prefix(attr)) {
/* Skip if same prefix */
for (i = 0; i < nb; i++)
if (prefixes->bytes[i] == b)
goto found;
if (nb == 4)
/* Invalid instruction */
break;
prefixes->bytes[nb++] = b;
...
found:
prefixes->nbytes++;
insn->next_byte++;
lb = b;
b = peek_next(insn_byte_t, insn);
attr = inat_get_opcode_attribute(b);
}

(nbytes is incremented on repeated prefixes, but "nb" isn't)

However, it looks like nbytes is used as an offset:

static inline int insn_offset_rex_prefix(struct insn *insn)
{
return insn->prefixes.nbytes;
}
static inline int insn_offset_vex_prefix(struct insn *insn)
{
return insn_offset_rex_prefix(insn) + insn->rex_prefix.nbytes;
}

Which means everything that iterates over prefixes.bytes[] is buggy,
since they may be trying to read past the end of the array:

$ git grep -A3 -E '< .*prefixes(\.|->)nbytes'
boot/compressed/sev-es.c:   for (i = 0; i < insn->prefixes.nbytes; i++) {
boot/compressed/sev-es.c-   insn_byte_t p =
insn->prefixes.bytes[i];
boot/compressed/sev-es.c-
boot/compressed/sev-es.c-   if (p == 0xf2 || p == 0xf3)
--
kernel/uprobes.c:   for (i = 0; i < insn->prefixes.nbytes; i++) {
kernel/uprobes.c-   insn_attr_t attr;
kernel/uprobes.c-
kernel/uprobes.c-   attr = 
inat_get_opcode_attribute(insn->prefixes.bytes[i]);
--
kernel/uprobes.c:   for (i = 0; i < insn->prefixes.nbytes; i++) {
kernel/uprobes.c-   if (insn->prefixes.bytes[i] == 0x66)
kernel/uprobes.c-   return -ENOTSUPP;
kernel/uprobes.c-   }
--
lib/insn-eval.c:for (i = 0; i < insn->prefixes.nbytes; i++) {
lib/insn-eval.c-insn_byte_t p = insn->prefixes.bytes[i];
lib/insn-eval.c-
lib/insn-eval.c-if (p == 0xf2 || p == 0xf3)
--
lib/insn-eval.c:for (i = 0; i < insn->prefixes.nbytes; i++) {
lib/insn-eval.c-insn_attr_t attr;
lib/insn-eval.c-
lib/insn-eval.c-attr = 
inat_get_opcode_attribute(insn->prefixes.bytes[i]);

I don't see a clear way to fix this.

-Kees

On Mon, Sep 21, 2020 at 09:20:07PM -0700, syzbot wrote:
> syzbot has bisected this issue to:
> 
> commit 4b2bd5fec007a4fd3fc82474b9199af25013de4c
> Author: John Stultz 
> Date:   Sat Oct 8 00:02:33 2016 +
> 
> proc: fix timerslack_ns CAP_SYS_NICE check when adjusting self
> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1697348d90
> start commit:   325d0eab Merge branch 'akpm' (patches from Andrew)
> git tree:   upstream
> final oops: https://syzkaller.appspot.com/x/report.txt?x=1597348d90
> console output: https://syzkaller.appspot.com/x/log.txt?x=1197348d90
> kernel config:  https://syzkaller.appspot.com/x/.config?x=b12e84189082991c
> dashboard link: https://syzkaller.appspot.com/bug?extid=9b64b619f10f19d19a7c
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1573a8ad90
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=164ee6c590
> 
> Reported-by: syzbot+9b64b619f10f19d19...@syzkaller.appspotmail.com
> Fixes: 4b2bd5fec007 ("proc: fix timerslack_ns CAP_SYS_NICE check when 
> adjusting self")
> 
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

-- 
Kees Cook

Re: [Intel-wired-lan] [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-25 Thread Kees Cook

On Tue, Nov 24, 2020 at 11:05:35PM -0800, James Bottomley wrote:
> Now, what we have seems to be about 6 cases (at least what's been shown
> in this thread) where a missing break would cause potentially user
> visible issues.  That means the value of this isn't zero, but it's not
> a no-brainer massive win either.  That's why I think asking what we've
> invested vs the return isn't a useless exercise.

The number is much higher[1]. If it were 6 in the entire history of the
kernel, I would agree with you. :) Some were fixed _before_ Gustavo's
effort too, which I also count towards the idea of "this is a dangerous
weakness in C, and now we have stopped it forever."

> But the broader point I'm making is just because the compiler people
> come up with a shiny new warning doesn't necessarily mean the problem
> it's detecting is one that causes us actual problems in the code base. 
> I'd really be happier if we had a theory about what classes of CVE or
> bug we could eliminate before we embrace the next new warning.

But we did! It was long ago justified and documented[2], and even links to
the CWE[3] for it. This wasn't random joy over discovering a new warning
we could turn on, this was turning on a warning that the compiler folks
finally gave us to handle an entire class of flaws. If we need to update
the code-base to address it not a useful debate -- that was settled
already, even if you're only discovering it now. :P. This last patch
set is about finishing that work for Clang, which is correctly even
more strict than GCC.

-Kees

[1] https://outflux.net/slides/2019/lss/kspp.pdf calls out specific
numbers (about 6.5% of the patches fixed missing breaks):
v4.19:  3 of 129
v4.20:  2 of  59
v5.0:   3 of  56
v5.1:  10 of 100
v5.2:   6 of  71
v5.3:   7 of  69

And in the history of the kernel, it's been an ongoing source of
flaws:

$ l --no-merges | grep -i 'missing break' | wc -l
185

The frequency of such errors being "naturally" found was pretty
steady until the static checkers started warning, and then it was
on the rise, but the full effort flushed the rest out, and now it's
dropped to almost zero:

  1 v2.6.12
  3 v2.6.16.28
  1 v2.6.17
  1 v2.6.19
  2 v2.6.21
  1 v2.6.22
  3 v2.6.24
  3 v2.6.29
  1 v2.6.32
  1 v2.6.33
  1 v2.6.35
  4 v2.6.36
  3 v2.6.38
  2 v2.6.39
  7 v3.0
  2 v3.1
  2 v3.2
  2 v3.3
  3 v3.4
  1 v3.5
  8 v3.6
  7 v3.7
  3 v3.8
  6 v3.9
  3 v3.10
  2 v3.11
  5 v3.12
  5 v3.13
  2 v3.14
  4 v3.15
  2 v3.16
  3 v3.17
  2 v3.18
  2 v3.19
  1 v4.0
  2 v4.1
  5 v4.2
  4 v4.5
  5 v4.7
  6 v4.8
  1 v4.9
  3 v4.10
  2 v4.11
  6 v4.12
  3 v4.13
  2 v4.14
  5 v4.15
  2 v4.16
  7 v4.18
  2 v4.19
  6 v4.20
  3 v5.0
 12 v5.1
  3 v5.2
  4 v5.3
  2 v5.4
  1 v5.8

And the reason it's fully zero, is because we still have the cases we're
cleaning up right now. Even this last one from v5.8 is specifically of
the same type this series addresses:

case 4:
color_index = TrueCModeIndex;
+   break;
default:
return;
}

[2] 
https://www.kernel.org/doc/html/latest/process/deprecated.html#implicit-switch-case-fall-through

All switch/case blocks must end in one of:

break;
    fallthrough;
continue;
goto ;
return [expression];

[3] https://cwe.mitre.org/data/definitions/484.html

-- 
Kees Cook

Re: [PATCH] entry: Fix boot for !CONFIG_GENERIC_ENTRY

2020-11-24 Thread Kees Cook

On Mon, Nov 23, 2020 at 10:54:58AM -0500, Gabriel Krisman Bertazi wrote:
> Gabriel Krisman Bertazi  writes:
> 
> > Jann Horn  writes:
> >> As part of fixing this, it might be a good idea to put "enum
> >> syscall_work_bit" behind a "#ifdef CONFIG_GENERIC_ENTRY" to avoid
> >> future accidents like this?
> >
> > Hi Jan, Arnd,
> >
> > That is correct.  This is a copy pasta mistake.  My apologies.  I didn't
> > have a !GENERIC_ENTRY device to test, but just the ifdef would have
> > caught it.
> 
> I have patched it as suggested.  Tested on qemu for arm32 and on bare
> metal for x86-64.
> 
> Once again, my apologies for the mistake.
> 
> -- >8 --
> Subject: [PATCH] entry: Fix boot for !CONFIG_GENERIC_ENTRY
> 
> A copy-pasta mistake tries to set SYSCALL_WORK flags instead of TIF
> flags for !CONFIG_GENERIC_ENTRY.  Also, add safeguards to catch this at
> compilation time.
> 
> Reported-by: Naresh Kamboju 
> Suggested-by: Jann Horn 
> Signed-off-by: Gabriel Krisman Bertazi 

Thanks for getting this fixed!

3136b93c3fb2 ("entry: Expose helpers to migrate TIF to SYSCALL_WORK flags")
Reviewed-by: Kees Cook 

-- 
Kees Cook

Re: [Intel-wired-lan] [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-24 Thread Kees Cook

On Mon, Nov 23, 2020 at 08:31:30AM -0800, James Bottomley wrote:
> Really, no ... something which produces no improvement has no value at
> all ... we really shouldn't be wasting maintainer time with it because
> it has a cost to merge.  I'm not sure we understand where the balance
> lies in value vs cost to merge but I am confident in the zero value
> case.

What? We can't measure how many future bugs aren't introduced because the
kernel requires explicit case flow-control statements for all new code.

We already enable -Wimplicit-fallthrough globally, so that's not the
discussion. The issue is that Clang is (correctly) even more strict
than GCC for this, so these are the remaining ones to fix for full Clang
coverage too.

People have spent more time debating this already than it would have
taken to apply the patches. :)

This is about robustness and language wrangling. It's a big code-base,
and this is the price of our managing technical debt for permanent
robustness improvements. (The numbers I ran from Gustavo's earlier
patches were that about 10% of the places adjusted were identified as
legitimate bugs being fixed. This final series may be lower, but there
are still bugs being found from it -- we need to finish this and shut
the door on it for good.)

-- 
Kees Cook

Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-24 Thread Kees Cook

On Mon, Nov 23, 2020 at 05:32:51PM -0800, Nick Desaulniers wrote:
> On Sun, Nov 22, 2020 at 8:17 AM Kees Cook  wrote:
> >
> > On Fri, Nov 20, 2020 at 11:51:42AM -0800, Jakub Kicinski wrote:
> > > If none of the 140 patches here fix a real bug, and there is no change
> > > to machine code then it sounds to me like a W=2 kind of a warning.
> >
> > FWIW, this series has found at least one bug so far:
> > https://lore.kernel.org/lkml/CAFCwf11izHF=g1mGry1fE5kvFFFrxzhPSM6qKAO8gxSp=kr...@mail.gmail.com/
> 
> So looks like the bulk of these are:
> switch (x) {
>   case 0:
> ++x;
>   default:
> break;
> }
> 
> I have a patch that fixes those up for clang:
> https://reviews.llvm.org/D91895

I still think this isn't right -- it's a case statement that runs off
the end without an explicit flow control determination. I think Clang is
right to warn for these, and GCC should also warn.

-- 
Kees Cook

Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-22 Thread Kees Cook

On Fri, Nov 20, 2020 at 11:51:42AM -0800, Jakub Kicinski wrote:
> On Fri, 20 Nov 2020 11:30:40 -0800 Kees Cook wrote:
> > On Fri, Nov 20, 2020 at 10:53:44AM -0800, Jakub Kicinski wrote:
> > > On Fri, 20 Nov 2020 12:21:39 -0600 Gustavo A. R. Silva wrote:  
> > > > This series aims to fix almost all remaining fall-through warnings in
> > > > order to enable -Wimplicit-fallthrough for Clang.
> > > > 
> > > > In preparation to enable -Wimplicit-fallthrough for Clang, explicitly
> > > > add multiple break/goto/return/fallthrough statements instead of just
> > > > letting the code fall through to the next case.
> > > > 
> > > > Notice that in order to enable -Wimplicit-fallthrough for Clang, this
> > > > change[1] is meant to be reverted at some point. So, this patch helps
> > > > to move in that direction.
> > > > 
> > > > Something important to mention is that there is currently a discrepancy
> > > > between GCC and Clang when dealing with switch fall-through to empty 
> > > > case
> > > > statements or to cases that only contain a break/continue/return
> > > > statement[2][3][4].  
> > > 
> > > Are we sure we want to make this change? Was it discussed before?
> > > 
> > > Are there any bugs Clangs puritanical definition of fallthrough helped
> > > find?
> > > 
> > > IMVHO compiler warnings are supposed to warn about issues that could
> > > be bugs. Falling through to default: break; can hardly be a bug?!  
> > 
> > It's certainly a place where the intent is not always clear. I think
> > this makes all the cases unambiguous, and doesn't impact the machine
> > code, since the compiler will happily optimize away any behavioral
> > redundancy.
> 
> If none of the 140 patches here fix a real bug, and there is no change
> to machine code then it sounds to me like a W=2 kind of a warning.

FWIW, this series has found at least one bug so far:
https://lore.kernel.org/lkml/CAFCwf11izHF=g1mGry1fE5kvFFFrxzhPSM6qKAO8gxSp=kr...@mail.gmail.com/

-- 
Kees Cook

Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-20 Thread Kees Cook

On Fri, Nov 20, 2020 at 11:51:42AM -0800, Jakub Kicinski wrote:
> On Fri, 20 Nov 2020 11:30:40 -0800 Kees Cook wrote:
> > On Fri, Nov 20, 2020 at 10:53:44AM -0800, Jakub Kicinski wrote:
> > > On Fri, 20 Nov 2020 12:21:39 -0600 Gustavo A. R. Silva wrote:  
> > > > This series aims to fix almost all remaining fall-through warnings in
> > > > order to enable -Wimplicit-fallthrough for Clang.
> > > > 
> > > > In preparation to enable -Wimplicit-fallthrough for Clang, explicitly
> > > > add multiple break/goto/return/fallthrough statements instead of just
> > > > letting the code fall through to the next case.
> > > > 
> > > > Notice that in order to enable -Wimplicit-fallthrough for Clang, this
> > > > change[1] is meant to be reverted at some point. So, this patch helps
> > > > to move in that direction.
> > > > 
> > > > Something important to mention is that there is currently a discrepancy
> > > > between GCC and Clang when dealing with switch fall-through to empty 
> > > > case
> > > > statements or to cases that only contain a break/continue/return
> > > > statement[2][3][4].  
> > > 
> > > Are we sure we want to make this change? Was it discussed before?
> > > 
> > > Are there any bugs Clangs puritanical definition of fallthrough helped
> > > find?
> > > 
> > > IMVHO compiler warnings are supposed to warn about issues that could
> > > be bugs. Falling through to default: break; can hardly be a bug?!  
> > 
> > It's certainly a place where the intent is not always clear. I think
> > this makes all the cases unambiguous, and doesn't impact the machine
> > code, since the compiler will happily optimize away any behavioral
> > redundancy.
> 
> If none of the 140 patches here fix a real bug, and there is no change
> to machine code then it sounds to me like a W=2 kind of a warning.

I'd like to avoid splitting common -W options between default and W=2
just based on the compiler. Getting -Wimplicit-fallthrough enabled found
plenty of bugs, so making sure it works correctly for both compilers
feels justified to me. (This is just a subset of the same C language
short-coming.)

> I think clang is just being annoying here, but if I'm the only one who
> feels this way chances are I'm wrong :)

It's being pretty pedantic, but I don't think it's unreasonable to
explicitly state how every case ends. GCC's silence for the case of
"fall through to a break" doesn't really seem justified.

-- 
Kees Cook

Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-20 Thread Kees Cook

On Fri, Nov 20, 2020 at 10:53:44AM -0800, Jakub Kicinski wrote:
> On Fri, 20 Nov 2020 12:21:39 -0600 Gustavo A. R. Silva wrote:
> > This series aims to fix almost all remaining fall-through warnings in
> > order to enable -Wimplicit-fallthrough for Clang.
> > 
> > In preparation to enable -Wimplicit-fallthrough for Clang, explicitly
> > add multiple break/goto/return/fallthrough statements instead of just
> > letting the code fall through to the next case.
> > 
> > Notice that in order to enable -Wimplicit-fallthrough for Clang, this
> > change[1] is meant to be reverted at some point. So, this patch helps
> > to move in that direction.
> > 
> > Something important to mention is that there is currently a discrepancy
> > between GCC and Clang when dealing with switch fall-through to empty case
> > statements or to cases that only contain a break/continue/return
> > statement[2][3][4].
> 
> Are we sure we want to make this change? Was it discussed before?
> 
> Are there any bugs Clangs puritanical definition of fallthrough helped
> find?
> 
> IMVHO compiler warnings are supposed to warn about issues that could
> be bugs. Falling through to default: break; can hardly be a bug?!

It's certainly a place where the intent is not always clear. I think
this makes all the cases unambiguous, and doesn't impact the machine
code, since the compiler will happily optimize away any behavioral
redundancy.


-- 
Kees Cook

Re: [PATCH net] cfg80211: fix callback type mismatches in wext-compat

2020-11-20 Thread Kees Cook

On Tue, Nov 17, 2020 at 02:07:43PM -0800, Sami Tolvanen wrote:
> On Tue, Nov 17, 2020 at 1:45 PM Kees Cook  wrote:
> >
> > On Tue, Nov 17, 2020 at 12:59:02PM -0800, Sami Tolvanen wrote:
> > > Instead of casting callback functions to type iw_handler, which trips
> > > indirect call checking with Clang's Control-Flow Integrity (CFI), add
> > > stub functions with the correct function type for the callbacks.
> >
> > Oh, wow. iw_handler with union iwreq_data look like really nasty hacks.
> > Aren't those just totally bypassing type checking? Where do the
> > callbacks actually happen? I couldn't find them...
> 
> The callbacks to these happen in ioctl_standard_call in wext-core.c.

Thanks! If this is all the 'old compat' code, this patch looks fine. I
think new stuff should probably encode types in some fashion (rather
than via wrappers).

-- 
Kees Cook

Re: [PATCH net] cfg80211: fix callback type mismatches in wext-compat

2020-11-20 Thread Kees Cook

On Tue, Nov 17, 2020 at 12:59:02PM -0800, Sami Tolvanen wrote:
> Instead of casting callback functions to type iw_handler, which trips
> indirect call checking with Clang's Control-Flow Integrity (CFI), add
> stub functions with the correct function type for the callbacks.
> 
> Reported-by: Sedat Dilek 
> Signed-off-by: Sami Tolvanen 

Reviewed-by: Kees Cook 

-- 
Kees Cook

Re: [PATCH net] cfg80211: fix callback type mismatches in wext-compat

2020-11-17 Thread Kees Cook

CSIWFRAG)] = (iw_handler) cfg80211_wext_siwfrag,
> - [IW_IOCTL_IDX(SIOCGIWFRAG)] = (iw_handler) cfg80211_wext_giwfrag,
> - [IW_IOCTL_IDX(SIOCSIWTXPOW)]= (iw_handler) cfg80211_wext_siwtxpower,
> - [IW_IOCTL_IDX(SIOCGIWTXPOW)]= (iw_handler) cfg80211_wext_giwtxpower,
> - [IW_IOCTL_IDX(SIOCSIWRETRY)]= (iw_handler) cfg80211_wext_siwretry,
> - [IW_IOCTL_IDX(SIOCGIWRETRY)]= (iw_handler) cfg80211_wext_giwretry,
> - [IW_IOCTL_IDX(SIOCSIWENCODE)]   = (iw_handler) cfg80211_wext_siwencode,
> - [IW_IOCTL_IDX(SIOCGIWENCODE)]   = (iw_handler) cfg80211_wext_giwencode,
> - [IW_IOCTL_IDX(SIOCSIWPOWER)]= (iw_handler) cfg80211_wext_siwpower,
> - [IW_IOCTL_IDX(SIOCGIWPOWER)]= (iw_handler) cfg80211_wext_giwpower,
> - [IW_IOCTL_IDX(SIOCSIWGENIE)]= (iw_handler) cfg80211_wext_siwgenie,
> - [IW_IOCTL_IDX(SIOCSIWAUTH)] = (iw_handler) cfg80211_wext_siwauth,
> - [IW_IOCTL_IDX(SIOCGIWAUTH)] = (iw_handler) cfg80211_wext_giwauth,
> - [IW_IOCTL_IDX(SIOCSIWENCODEEXT)]= (iw_handler) 
> cfg80211_wext_siwencodeext,
> - [IW_IOCTL_IDX(SIOCSIWPMKSA)]= (iw_handler) cfg80211_wext_siwpmksa,
> + [IW_IOCTL_IDX(SIOCGIWNAME)] = __cfg80211_wext_giwname,
> + [IW_IOCTL_IDX(SIOCSIWFREQ)] = __cfg80211_wext_siwfreq,
> + [IW_IOCTL_IDX(SIOCGIWFREQ)] = __cfg80211_wext_giwfreq,
> + [IW_IOCTL_IDX(SIOCSIWMODE)] = __cfg80211_wext_siwmode,
> + [IW_IOCTL_IDX(SIOCGIWMODE)] = __cfg80211_wext_giwmode,
> + [IW_IOCTL_IDX(SIOCGIWRANGE)]= __cfg80211_wext_giwrange,
> + [IW_IOCTL_IDX(SIOCSIWAP)]   = __cfg80211_wext_siwap,
> + [IW_IOCTL_IDX(SIOCGIWAP)]   = __cfg80211_wext_giwap,
> + [IW_IOCTL_IDX(SIOCSIWMLME)] = __cfg80211_wext_siwmlme,
> + [IW_IOCTL_IDX(SIOCSIWSCAN)] = cfg80211_wext_siwscan,
> + [IW_IOCTL_IDX(SIOCGIWSCAN)] = __cfg80211_wext_giwscan,
> + [IW_IOCTL_IDX(SIOCSIWESSID)]= __cfg80211_wext_siwessid,
> + [IW_IOCTL_IDX(SIOCGIWESSID)]= __cfg80211_wext_giwessid,
> + [IW_IOCTL_IDX(SIOCSIWRATE)] = __cfg80211_wext_siwrate,
> + [IW_IOCTL_IDX(SIOCGIWRATE)] = __cfg80211_wext_giwrate,
> + [IW_IOCTL_IDX(SIOCSIWRTS)]  = __cfg80211_wext_siwrts,
> + [IW_IOCTL_IDX(SIOCGIWRTS)]  = __cfg80211_wext_giwrts,
> + [IW_IOCTL_IDX(SIOCSIWFRAG)] = __cfg80211_wext_siwfrag,
> + [IW_IOCTL_IDX(SIOCGIWFRAG)] = __cfg80211_wext_giwfrag,
> + [IW_IOCTL_IDX(SIOCSIWTXPOW)]= cfg80211_wext_siwtxpower,
> + [IW_IOCTL_IDX(SIOCGIWTXPOW)]= cfg80211_wext_giwtxpower,
> + [IW_IOCTL_IDX(SIOCSIWRETRY)]= __cfg80211_wext_siwretry,
> + [IW_IOCTL_IDX(SIOCGIWRETRY)]= __cfg80211_wext_giwretry,
> + [IW_IOCTL_IDX(SIOCSIWENCODE)]   = __cfg80211_wext_siwencode,
> + [IW_IOCTL_IDX(SIOCGIWENCODE)]   = __cfg80211_wext_giwencode,
> + [IW_IOCTL_IDX(SIOCSIWPOWER)]= __cfg80211_wext_siwpower,
> + [IW_IOCTL_IDX(SIOCGIWPOWER)]= __cfg80211_wext_giwpower,
> + [IW_IOCTL_IDX(SIOCSIWGENIE)]= __cfg80211_wext_siwgenie,
> + [IW_IOCTL_IDX(SIOCSIWAUTH)] = __cfg80211_wext_siwauth,
> +     [IW_IOCTL_IDX(SIOCGIWAUTH)] = __cfg80211_wext_giwauth,
> + [IW_IOCTL_IDX(SIOCSIWENCODEEXT)]= __cfg80211_wext_siwencodeext,
> + [IW_IOCTL_IDX(SIOCSIWPMKSA)]= __cfg80211_wext_siwpmksa,
>  };
>  
>  const struct iw_handler_def cfg80211_wext_handler = {
> 
> base-commit: 9c87c9f41245baa3fc4716cf39141439cf405b01
> -- 
> 2.29.2.299.gdc1121823c-goog
> 

-- 
Kees Cook

Re: [PATCH] tracepoint: Do not fail unregistering a probe due to memory allocation

2020-11-17 Thread Kees Cook

On Mon, Nov 16, 2020 at 05:51:07PM -0500, Steven Rostedt wrote:
> [ Kees, I added you because you tend to know about these things.
>   Is it OK to assign a void func(void) that doesn't do anything and returns
>   nothing to a function pointer that could be call with parameters? We need
>   to add stubs for tracepoints when we fail to allocate a new array on
>   removal of a callback, but the callbacks do have arguments, but the stub
>   called does not have arguments.
> 
>   Matt, Does this patch fix the error your patch was trying to fix?
> ]

As I think got discussed in the thread, what you had here wouldn't work
in a CFI build if the function prototype of the call site and the
function don't match. (Though I can't tell if .func() is ever called?)

i.e. .func's prototype must match tp_stub_func()'s.

-- 
Kees Cook

Re: [PATCH v5 0/3] Fix inefficiences and rename nla_strlcpy

2020-11-13 Thread Kees Cook

On Fri, Nov 13, 2020 at 12:11:30PM +0100, laniel_fran...@privacyrequired.com 
wrote:
> From: Francis Laniel 
> 
> Hi.
> 
> 
> I hope you are all fine and the same for your relatives.
> 
> This patch set answers to first three issues listed in:
> https://github.com/KSPP/linux/issues/110
> 
> To sum up, the patch contributions are the following:
> 1. the first patch fixes an inefficiency where some bytes in dst were written
> twice, one with 0 the other with src content.
> 2. The second one modifies nla_strlcpy to return the same value as strscpy,
> i.e. number of bytes written or -E2BIG if src was truncated.
> It also modifies code that calls nla_strlcpy and checks for its return value.
> 3. The third renames nla_strlcpy to nla_strscpy.
> 
> Unfortunately, I did not find how to create struct nlattr objects so I tested
> my modifications on simple char* and with GDB using tc to get to
> tcf_proto_check_kind.
> 
> If you see any way to improve the code or have any remark, feel free to 
> comment.
> 
> 
> Best regards and take care of yourselves.
> 
> Francis Laniel (3):
>   Fix unefficient call to memset before memcpu in nla_strlcpy.
>   Modify return value of nla_strlcpy to match that of strscpy.
>   treewide: rename nla_strlcpy to nla_strscpy.

Thanks! This looks good to me.

Jakub, does this look ready to you?

> 
>  drivers/infiniband/core/nldev.c| 10 +++---
>  drivers/net/can/vxcan.c|  4 +--
>  drivers/net/veth.c |  4 +--
>  include/linux/genl_magic_struct.h  |  2 +-
>  include/net/netlink.h  |  4 +--
>  include/net/pkt_cls.h  |  2 +-
>  kernel/taskstats.c |  2 +-
>  lib/nlattr.c   | 42 ++
>  net/core/fib_rules.c   |  4 +--
>  net/core/rtnetlink.c   | 12 +++
>  net/decnet/dn_dev.c|  2 +-
>  net/ieee802154/nl-mac.c|  2 +-
>  net/ipv4/devinet.c |  2 +-
>  net/ipv4/fib_semantics.c   |  2 +-
>  net/ipv4/metrics.c |  2 +-
>  net/netfilter/ipset/ip_set_hash_netiface.c |  4 +--
>  net/netfilter/nf_tables_api.c  |  6 ++--
>  net/netfilter/nfnetlink_acct.c |  2 +-
>  net/netfilter/nfnetlink_cthelper.c |  4 +--
>  net/netfilter/nft_ct.c |  2 +-
>  net/netfilter/nft_log.c|  2 +-
>  net/netlabel/netlabel_mgmt.c   |  2 +-
>  net/nfc/netlink.c  |  2 +-
>  net/sched/act_api.c|  2 +-
>  net/sched/act_ipt.c|  2 +-
>  net/sched/act_simple.c |  4 +--
>  net/sched/cls_api.c|  2 +-
>  net/sched/sch_api.c|  2 +-
>  net/tipc/netlink_compat.c  |  2 +-
>  29 files changed, 73 insertions(+), 61 deletions(-)
> 
> -- 
> 2.20.1
> 

-- 
Kees Cook

Re: [PATCH v4 2/3] Modify return value of nla_strlcpy to match that of strscpy.

2020-10-30 Thread Kees Cook

On Fri, Oct 30, 2020 at 04:36:46PM +0100, laniel_fran...@privacyrequired.com 
wrote:
> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
> index 2a76a2f5ed88..f9b053b30a7b 100644
> --- a/net/sched/sch_api.c
> +++ b/net/sched/sch_api.c
> @@ -1170,7 +1170,7 @@ static struct Qdisc *qdisc_create(struct net_device 
> *dev,
>  #ifdef CONFIG_MODULES
>   if (ops == NULL && kind != NULL) {
>   char name[IFNAMSIZ];
> - if (nla_strlcpy(name, kind, IFNAMSIZ) < IFNAMSIZ) {
> + if (nla_strlcpy(name, kind, IFNAMSIZ) > 0) {
>   /* We dropped the RTNL semaphore in order to
>* perform the module load.  So, even if we
>* succeeded in loading the module we have to

Oops, I think this should be >= 0 ?

-- 
Kees Cook

Re: [RFC][PATCH v3 3/3] Rename nla_strlcpy to nla_strscpy.

2020-10-23 Thread Kees Cook

On Fri, Oct 23, 2020 at 08:29:20AM -0700, Jakub Kicinski wrote:
> On Fri, 23 Oct 2020 08:07:44 + David Laight wrote:
> > FWIW I suspect  the 'return -ERR on overflow' is going to bite us.
> > Code that does p += strsxxx(p, ..., lim - p, ...) assuming (or not
> > caring) about overflow goes badly wrong.
> 
> I don't really care either way, but in netlink there's usually an
> attribute per value, nothing combines strings like p += strx..().
> Looking at the conversion in patch 2 the callers just want to 
> check for overflow.

Right -- this is a very narrow use-case (NLA). I think this series is
fine as-is.

-- 
Kees Cook

Re: [RFC][PATCH v3 3/3] Rename nla_strlcpy to nla_strscpy.

2020-10-22 Thread Kees Cook

On Thu, Oct 22, 2020 at 11:41:31AM +0200, Francis Laniel wrote:
> Le jeudi 22 octobre 2020, 01:49:59 CEST Kees Cook a écrit :
> > On Tue, Oct 20, 2020 at 06:47:07PM +0200, 
> > laniel_fran...@privacyrequired.com 
> wrote:
> > > From: Francis Laniel 
> > > 
> > > Calls to nla_strlcpy are now replaced by calls to nla_strscpy which is the
> > > new name of this function.
> > > 
> > > Signed-off-by: Francis Laniel 
> > 
> > The Subject could also be: "treewide: Rename nla_strlcpy to nla_strscpy"
> > 
> > But otherwise, yup, easy mechanical change.
> 
> Should I submit a v4 for this change?

I'll say yes. :) Drop the RFC, bump to v4, and send it to netdev (along
with all the other CCs you have here already), and add the Reviewed-bys
from v3.

Thanks!

-Kees

> 
> > 
> > Reviewed-by: Kees Cook 
> > 
> > > ---
> > > 
> > >  drivers/infiniband/core/nldev.c| 10 +-
> > >  drivers/net/can/vxcan.c|  4 ++--
> > >  drivers/net/veth.c |  4 ++--
> > >  include/linux/genl_magic_struct.h  |  2 +-
> > >  include/net/netlink.h  |  4 ++--
> > >  include/net/pkt_cls.h  |  2 +-
> > >  kernel/taskstats.c |  2 +-
> > >  lib/nlattr.c   |  6 +++---
> > >  net/core/fib_rules.c   |  4 ++--
> > >  net/core/rtnetlink.c   | 12 ++--
> > >  net/decnet/dn_dev.c|  2 +-
> > >  net/ieee802154/nl-mac.c|  2 +-
> > >  net/ipv4/devinet.c |  2 +-
> > >  net/ipv4/fib_semantics.c   |  2 +-
> > >  net/ipv4/metrics.c |  2 +-
> > >  net/netfilter/ipset/ip_set_hash_netiface.c |  4 ++--
> > >  net/netfilter/nf_tables_api.c  |  6 +++---
> > >  net/netfilter/nfnetlink_acct.c |  2 +-
> > >  net/netfilter/nfnetlink_cthelper.c |  4 ++--
> > >  net/netfilter/nft_ct.c |  2 +-
> > >  net/netfilter/nft_log.c|  2 +-
> > >  net/netlabel/netlabel_mgmt.c   |  2 +-
> > >  net/nfc/netlink.c  |  2 +-
> > >  net/sched/act_api.c|  2 +-
> > >  net/sched/act_ipt.c|  2 +-
> > >  net/sched/act_simple.c |  4 ++--
> > >  net/sched/cls_api.c|  2 +-
> > >  net/sched/sch_api.c|  2 +-
> > >  net/tipc/netlink_compat.c  |  2 +-
> > >  29 files changed, 49 insertions(+), 49 deletions(-)
> > > 
> > > diff --git a/drivers/infiniband/core/nldev.c
> > > b/drivers/infiniband/core/nldev.c index 12d29d54a081..08366e254b1d 100644
> > > --- a/drivers/infiniband/core/nldev.c
> > > +++ b/drivers/infiniband/core/nldev.c
> > > @@ -932,7 +932,7 @@ static int nldev_set_doit(struct sk_buff *skb, struct
> > > nlmsghdr *nlh,> 
> > >   if (tb[RDMA_NLDEV_ATTR_DEV_NAME]) {
> > >   
> > >   char name[IB_DEVICE_NAME_MAX] = {};
> > > 
> > > - nla_strlcpy(name, tb[RDMA_NLDEV_ATTR_DEV_NAME],
> > > + nla_strscpy(name, tb[RDMA_NLDEV_ATTR_DEV_NAME],
> > > 
> > >   IB_DEVICE_NAME_MAX);
> > >   
> > >   if (strlen(name) == 0) {
> > >   
> > >   err = -EINVAL;
> > > 
> > > @@ -1529,13 +1529,13 @@ static int nldev_newlink(struct sk_buff *skb,
> > > struct nlmsghdr *nlh,> 
> > >   !tb[RDMA_NLDEV_ATTR_LINK_TYPE] || !tb[RDMA_NLDEV_ATTR_NDEV_NAME])
> > >   
> > >   return -EINVAL;
> > > 
> > > - nla_strlcpy(ibdev_name, tb[RDMA_NLDEV_ATTR_DEV_NAME],
> > > + nla_strscpy(ibdev_name, tb[RDMA_NLDEV_ATTR_DEV_NAME],
> > > 
> > >   sizeof(ibdev_name));
> > >   
> > >   if (strchr(ibdev_name, '%') || strlen(ibdev_name) == 0)
> > >   
> > >   return -EINVAL;
> > > 
> > > - nla_strlcpy(type, tb[RDMA_NLDEV_ATTR_LINK_TYPE], sizeof(type));
> > > - nla_strlcpy(ndev_name, tb[RDMA_NLDEV_ATTR_NDEV_NAME],
> > > + nla_strscpy(type, tb[RDMA_NLDEV_ATTR_LINK_TYPE], sizeof(type));
> > > + nla_strscpy(ndev_name, tb[RDMA_NLDEV_ATTR_NDEV_NAME],
> > > 
> > >   si

Re: [RFC][PATCH v3 3/3] Rename nla_strlcpy to nla_strscpy.

2020-10-21 Thread Kees Cook

On Tue, Oct 20, 2020 at 06:47:07PM +0200, laniel_fran...@privacyrequired.com 
wrote:
> From: Francis Laniel 
> 
> Calls to nla_strlcpy are now replaced by calls to nla_strscpy which is the new
> name of this function.
> 
> Signed-off-by: Francis Laniel 

The Subject could also be: "treewide: Rename nla_strlcpy to nla_strscpy"

But otherwise, yup, easy mechanical change.

Reviewed-by: Kees Cook 

> ---
>  drivers/infiniband/core/nldev.c| 10 +-
>  drivers/net/can/vxcan.c|  4 ++--
>  drivers/net/veth.c |  4 ++--
>  include/linux/genl_magic_struct.h  |  2 +-
>  include/net/netlink.h  |  4 ++--
>  include/net/pkt_cls.h  |  2 +-
>  kernel/taskstats.c |  2 +-
>  lib/nlattr.c   |  6 +++---
>  net/core/fib_rules.c   |  4 ++--
>  net/core/rtnetlink.c   | 12 ++--
>  net/decnet/dn_dev.c|  2 +-
>  net/ieee802154/nl-mac.c|  2 +-
>  net/ipv4/devinet.c |  2 +-
>  net/ipv4/fib_semantics.c   |  2 +-
>  net/ipv4/metrics.c |  2 +-
>  net/netfilter/ipset/ip_set_hash_netiface.c |  4 ++--
>  net/netfilter/nf_tables_api.c  |  6 +++---
>  net/netfilter/nfnetlink_acct.c |  2 +-
>  net/netfilter/nfnetlink_cthelper.c |  4 ++--
>  net/netfilter/nft_ct.c |  2 +-
>  net/netfilter/nft_log.c|  2 +-
>  net/netlabel/netlabel_mgmt.c   |  2 +-
>  net/nfc/netlink.c  |  2 +-
>  net/sched/act_api.c|  2 +-
>  net/sched/act_ipt.c|  2 +-
>  net/sched/act_simple.c |  4 ++--
>  net/sched/cls_api.c|  2 +-
>  net/sched/sch_api.c|  2 +-
>  net/tipc/netlink_compat.c  |  2 +-
>  29 files changed, 49 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
> index 12d29d54a081..08366e254b1d 100644
> --- a/drivers/infiniband/core/nldev.c
> +++ b/drivers/infiniband/core/nldev.c
> @@ -932,7 +932,7 @@ static int nldev_set_doit(struct sk_buff *skb, struct 
> nlmsghdr *nlh,
>   if (tb[RDMA_NLDEV_ATTR_DEV_NAME]) {
>   char name[IB_DEVICE_NAME_MAX] = {};
>  
> - nla_strlcpy(name, tb[RDMA_NLDEV_ATTR_DEV_NAME],
> + nla_strscpy(name, tb[RDMA_NLDEV_ATTR_DEV_NAME],
>   IB_DEVICE_NAME_MAX);
>   if (strlen(name) == 0) {
>   err = -EINVAL;
> @@ -1529,13 +1529,13 @@ static int nldev_newlink(struct sk_buff *skb, struct 
> nlmsghdr *nlh,
>   !tb[RDMA_NLDEV_ATTR_LINK_TYPE] || !tb[RDMA_NLDEV_ATTR_NDEV_NAME])
>   return -EINVAL;
>  
> - nla_strlcpy(ibdev_name, tb[RDMA_NLDEV_ATTR_DEV_NAME],
> + nla_strscpy(ibdev_name, tb[RDMA_NLDEV_ATTR_DEV_NAME],
>   sizeof(ibdev_name));
>   if (strchr(ibdev_name, '%') || strlen(ibdev_name) == 0)
>   return -EINVAL;
>  
> - nla_strlcpy(type, tb[RDMA_NLDEV_ATTR_LINK_TYPE], sizeof(type));
> - nla_strlcpy(ndev_name, tb[RDMA_NLDEV_ATTR_NDEV_NAME],
> + nla_strscpy(type, tb[RDMA_NLDEV_ATTR_LINK_TYPE], sizeof(type));
> + nla_strscpy(ndev_name, tb[RDMA_NLDEV_ATTR_NDEV_NAME],
>   sizeof(ndev_name));
>  
>   ndev = dev_get_by_name(sock_net(skb->sk), ndev_name);
> @@ -1602,7 +1602,7 @@ static int nldev_get_chardev(struct sk_buff *skb, 
> struct nlmsghdr *nlh,
>   if (err || !tb[RDMA_NLDEV_ATTR_CHARDEV_TYPE])
>   return -EINVAL;
>  
> - nla_strlcpy(client_name, tb[RDMA_NLDEV_ATTR_CHARDEV_TYPE],
> + nla_strscpy(client_name, tb[RDMA_NLDEV_ATTR_CHARDEV_TYPE],
>   sizeof(client_name));
>  
>   if (tb[RDMA_NLDEV_ATTR_DEV_INDEX]) {
> diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c
> index d6ba9426be4d..fa47bab510bb 100644
> --- a/drivers/net/can/vxcan.c
> +++ b/drivers/net/can/vxcan.c
> @@ -186,7 +186,7 @@ static int vxcan_newlink(struct net *net, struct 
> net_device *dev,
>   }
>  
>   if (ifmp && tbp[IFLA_IFNAME]) {
> - nla_strlcpy(ifname, tbp[IFLA_IFNAME], IFNAMSIZ);
> + nla_strscpy(ifname, tbp[IFLA_IFNAME], IFNAMSIZ);
>   name_assign_type = NET_NAME_USER;
>   } else {
>   snprintf(ifname, IFNAMSIZ, DRV_NAME "%%d");
> @@ -223,7 +223,7 @@ static int vxcan_newlink(struct net *net, struct 
> net_device *dev,
>  
>

Re: [RFC][PATCH v3 2/3] Modify return value of nla_strlcpy to match that of strscpy.

2020-10-21 Thread Kees Cook

On Tue, Oct 20, 2020 at 06:47:06PM +0200, laniel_fran...@privacyrequired.com 
wrote:
> From: Francis Laniel 
> 
> nla_strlcpy now returns -E2BIG if src was truncated when written to dst.
> It also returns this error value if dstsize is 0 or higher than INT_MAX.
> 
> For example, if src is "foo\0" and dst is 3 bytes long, the result will be:
> 1. "foG" after memcpy (G means garbage).
> 2. "fo\0" after memset.
> 3. -E2BIG is returned because src was not completely written into dst.
> 
> The callers of nla_strlcpy were modified to take into account this 
> modification.
> 
> Signed-off-by: Francis Laniel 

This looks correct to me. Thanks for the respin!

Reviewed-by: Kees Cook 

-- 
Kees Cook

Re: [RFC][PATCH v3 1/3] Fix unefficient call to memset before memcpu in nla_strlcpy.

2020-10-21 Thread Kees Cook

On Tue, Oct 20, 2020 at 06:47:05PM +0200, laniel_fran...@privacyrequired.com 
wrote:
> From: Francis Laniel 
> 
> Before this commit, nla_strlcpy first memseted dst to 0 then wrote src into 
> it.
> This is inefficient because bytes whom number is less than src length are 
> written
> twice.
> 
> This patch solves this issue by first writing src into dst then fill dst with
> 0's.
> Note that, in the case where src length is higher than dst, only 0 is written.
> Otherwise there are as many 0's written to fill dst.
> 
> For example, if src is "foo\0" and dst is 5 bytes long, the result will be:
> 1. "fooGG" after memcpy (G means garbage).
> 2. "foo\0\0" after memset.
> 
> Signed-off-by: Francis Laniel 

Looks good! (If there are future versions of this series, I think you
can drop the RFC part...)

Reviewed-by: Kees Cook 

-- 
Kees Cook

Re: [PATCH] random32: Restore __latent_entropy attribute on net_rand_state

2020-10-05 Thread Kees Cook

On Tue, Oct 06, 2020 at 04:28:09AM +0200, Willy Tarreau wrote:
> Hi Kees,
> 
> On Mon, Oct 05, 2020 at 07:12:29PM -0700, Kees Cook wrote:
> > On Fri, Oct 02, 2020 at 05:16:11PM +0200, Thibaut Sautereau wrote:
> > > From: Thibaut Sautereau 
> > > 
> > > Commit f227e3ec3b5c ("random32: update the net random state on interrupt
> > > and activity") broke compilation and was temporarily fixed by Linus in
> > > 83bdc7275e62 ("random32: remove net_rand_state from the latent entropy
> > > gcc plugin") by entirely moving net_rand_state out of the things handled
> > > by the latent_entropy GCC plugin.
> > > 
> > > From what I understand when reading the plugin code, using the
> > > __latent_entropy attribute on a declaration was the wrong part and
> > > simply keeping the __latent_entropy attribute on the variable definition
> > > was the correct fix.
> > > 
> > > Fixes: 83bdc7275e62 ("random32: remove net_rand_state from the latent 
> > > entropy gcc plugin")
> > > Cc: Linus Torvalds 
> > > Cc: Willy Tarreau 
> > > Cc: Emese Revfy 
> > > Signed-off-by: Thibaut Sautereau 
> > 
> > Yes, that looks correct. Thank you!
> > 
> > Acked-by: Kees Cook 
> > 
> > I'm not sure the best tree for this. Ted, Andrew, Linus? I'll take it
> > via my gcc plugin tree if no one else takes it. :)
> 
> It was already merged as commit 09a6b0bc3be79 and queued for -stable.

Ah, perfect! Thanks.

-- 
Kees Cook

Re: [PATCH] random32: Restore __latent_entropy attribute on net_rand_state

2020-10-05 Thread Kees Cook

On Fri, Oct 02, 2020 at 05:16:11PM +0200, Thibaut Sautereau wrote:
> From: Thibaut Sautereau 
> 
> Commit f227e3ec3b5c ("random32: update the net random state on interrupt
> and activity") broke compilation and was temporarily fixed by Linus in
> 83bdc7275e62 ("random32: remove net_rand_state from the latent entropy
> gcc plugin") by entirely moving net_rand_state out of the things handled
> by the latent_entropy GCC plugin.
> 
> From what I understand when reading the plugin code, using the
> __latent_entropy attribute on a declaration was the wrong part and
> simply keeping the __latent_entropy attribute on the variable definition
> was the correct fix.
> 
> Fixes: 83bdc7275e62 ("random32: remove net_rand_state from the latent entropy 
> gcc plugin")
> Cc: Linus Torvalds 
> Cc: Willy Tarreau 
> Cc: Emese Revfy 
> Signed-off-by: Thibaut Sautereau 

Yes, that looks correct. Thank you!

Acked-by: Kees Cook 

I'm not sure the best tree for this. Ted, Andrew, Linus? I'll take it
via my gcc plugin tree if no one else takes it. :)

-- 
Kees Cook

Re: [PATCH 0/3] xtensa: add seccomp support

2020-09-11 Thread Kees Cook

On Sat, Jul 18, 2020 at 07:16:51PM -0700, Max Filippov wrote:
> Hello,
> 
> this series adds support for seccomp filter on xtensa and updates
> selftests/seccomp.

Hi!

Firstly, thanks for adding seccomp support! :) I would, however, ask
that you CC maintainers on these kinds of changes for feedback. I was
surprised to find the changes in the seccomp selftests today in Linus's
tree. I didn't seem to get CCed on this series, even though
get_maintainers shows this:

$ ./scripts/get_maintainer.pl 0001-selftests-seccomp-add-xtensa-support.mbox
Kees Cook  (supporter:SECURE COMPUTING)
Andy Lutomirski  (reviewer:SECURE COMPUTING)
Will Drewry  (reviewer:SECURE COMPUTING)
Shuah Khan  (maintainer:KERNEL SELFTEST FRAMEWORK)
...

Regardless, I'm still glad to have more arch support! :) I'll send a
follow-up patch to refactor a bit of the selftest.

Thanks,

-- 
Kees Cook

Re: [PATCH -next] selftests/seccomp: Use bitwise instead of arithmetic operator for flags

2020-09-08 Thread Kees Cook

On Wed, 15 Apr 2020 12:15:01 +0800, Zou Wei wrote:
> This silences the following coccinelle warning:
> 
> "WARNING: sum of probable bitmasks, consider |"
> 
> tools/testing/selftests/seccomp/seccomp_bpf.c:3131:17-18: WARNING: sum of 
> probable bitmasks, consider |
> tools/testing/selftests/seccomp/seccomp_bpf.c:3133:18-19: WARNING: sum of 
> probable bitmasks, consider |
> tools/testing/selftests/seccomp/seccomp_bpf.c:3134:18-19: WARNING: sum of 
> probable bitmasks, consider |
> tools/testing/selftests/seccomp/seccomp_bpf.c:3135:18-19: WARNING: sum of 
> probable bitmasks, consider |

Applied, thanks!

[1/1] selftests/seccomp: Use bitwise instead of arithmetic operator for flags
  https://git.kernel.org/kees/c/76993fe3c1e4

Sorry for the massive delay on this one! I lost this email in my inbox. :)

-- 
Kees Cook

Re: [05/16] atmel: convert tasklets to use new tasklet_setup() API

2020-08-27 Thread Kees Cook

On Thu, Aug 27, 2020 at 01:23:20PM +, Kalle Valo wrote:
> Allen Pais  wrote:
> 
> > From: Allen Pais 
> > 
> > In preparation for unconditionally passing the
> > struct tasklet_struct pointer to all tasklet
> > callbacks, switch to using the new tasklet_setup()
> > and from_tasklet() to pass the tasklet pointer explicitly
> > and remove .data field.
> > 
> > Signed-off-by: Romain Perier 
> > Signed-off-by: Allen Pais 
> 
> 11 patches applied to wireless-drivers-next.git, thanks.
> 
> a36f50e5b937 atmel: convert tasklets to use new tasklet_setup() API
> fc6722301428 b43legacy: convert tasklets to use new tasklet_setup() API
> 427a06beb072 brcmsmac: convert tasklets to use new tasklet_setup() API
> ae6cf59f80f7 ipw2x00: convert tasklets to use new tasklet_setup() API
> b81b9d372ac8 iwlegacy: convert tasklets to use new tasklet_setup() API
> 7433c9690318 intersil: convert tasklets to use new tasklet_setup() API
> 51c41aa93ef5 mwl8k: convert tasklets to use new tasklet_setup() API
> aff8e8d02ec2 qtnfmac: convert tasklets to use new tasklet_setup() API
> a0d6ea9b6e1c rt2x00: convert tasklets to use new tasklet_setup() API
> d3ccc14dfe95 rtlwifi/rtw88: convert tasklets to use new tasklet_setup() API
> 26721b02466e zd1211rw: convert tasklets to use new tasklet_setup() API
> 
> -- 
> https://patchwork.kernel.org/patch/11717451/
> 
> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

FWIW, I don't think a revert is needed here to wait for the from_tasket()
-> container_from() API to land since from_tasklet() is already being
used by other trees. Let's just get this done so we can get closer to
ripping out the old tasklet API. We'll have to do a treewide
from_timer(), from_tasklet() -> container_from() anyway...

-- 
Kees Cook

Re: [PATCH] block: convert tasklets to use new tasklet_setup() API

2020-08-26 Thread Kees Cook

On Wed, Aug 26, 2020 at 12:55:28PM +0300, Dan Carpenter wrote:
> On Wed, Aug 26, 2020 at 07:21:35AM +0530, Allen Pais wrote:
> > On Thu, Aug 20, 2020 at 3:09 AM James Bottomley
> >  wrote:
> > >
> > > On Wed, 2020-08-19 at 21:54 +0530, Allen wrote:
> > > > > [...]
> > > > > > > Since both threads seem to have petered out, let me suggest in
> > > > > > > kernel.h:
> > > > > > >
> > > > > > > #define cast_out(ptr, container, member) \
> > > > > > > container_of(ptr, typeof(*container), member)
> > > > > > >
> > > > > > > It does what you want, the argument order is the same as
> > > > > > > container_of with the only difference being you name the
> > > > > > > containing structure instead of having to specify its type.
> > > > > >
> > > > > > Not to incessantly bike shed on the naming, but I don't like
> > > > > > cast_out, it's not very descriptive. And it has connotations of
> > > > > > getting rid of something, which isn't really true.
> > > > >
> > > > > Um, I thought it was exactly descriptive: you're casting to the
> > > > > outer container.  I thought about following the C++ dynamic casting
> > > > > style, so out_cast(), but that seemed a bit pejorative.  What about
> > > > > outer_cast()?
> > > > >
> > > > > > FWIW, I like the from_ part of the original naming, as it has
> > > > > > some clues as to what is being done here. Why not just
> > > > > > from_container()? That should immediately tell people what it
> > > > > > does without having to look up the implementation, even before
> > > > > > this becomes a part of the accepted coding norm.
> > > > >
> > > > > I'm not opposed to container_from() but it seems a little less
> > > > > descriptive than outer_cast() but I don't really care.  I always
> > > > > have to look up container_of() when I'm using it so this would just
> > > > > be another macro of that type ...
> > > > >
> > > >
> > > >  So far we have a few which have been suggested as replacement
> > > > for from_tasklet()
> > > >
> > > > - out_cast() or outer_cast()
> > > > - from_member().
> > > > - container_from() or from_container()
> > > >
> > > > from_container() sounds fine, would trimming it a bit work? like
> > > > from_cont().
> > >
> > > I'm fine with container_from().  It's the same form as container_of()
> > > and I think we need urgent agreement to not stall everything else so
> > > the most innocuous name is likely to get the widest acceptance.
> > 
> > Kees,
> > 
> >   Will you be  sending the newly proposed API to Linus? I have V2
> > which uses container_from()
> > ready to be sent out.
> 
> I liked that James swapped the first two arguments so that it matches
> container_of().  Plus it's nice that when you have:
> 
>   struct whatever *foo = container_from(ptr, foo, member);
> 
> Then it means that "ptr == &foo->member".

I'm a bit stalled right now -- the merge window was keeping me busy, and
this week is the Linux Plumbers Conference. This is on my list, but I
haven't gotten back around to it. If you want, feel free to send the
container_from() patch; you might be able to unblock this faster than me
right now. :)

-Kees

-- 
Kees Cook

Re: [PATCH] block: convert tasklets to use new tasklet_setup() API

2020-08-18 Thread Kees Cook

On Tue, Aug 18, 2020 at 01:00:33PM -0700, James Bottomley wrote:
> On Mon, 2020-08-17 at 13:02 -0700, Jens Axboe wrote:
> > On 8/17/20 12:48 PM, Kees Cook wrote:
> > > On Mon, Aug 17, 2020 at 12:44:34PM -0700, Jens Axboe wrote:
> > > > On 8/17/20 12:29 PM, Kees Cook wrote:
> > > > > On Mon, Aug 17, 2020 at 06:56:47AM -0700, Jens Axboe wrote:
> > > > > > On 8/17/20 2:15 AM, Allen Pais wrote:
> > > > > > > From: Allen Pais 
> > > > > > > 
> > > > > > > In preparation for unconditionally passing the
> > > > > > > struct tasklet_struct pointer to all tasklet
> > > > > > > callbacks, switch to using the new tasklet_setup()
> > > > > > > and from_tasklet() to pass the tasklet pointer explicitly.
> > > > > > 
> > > > > > Who came up with the idea to add a macro 'from_tasklet' that
> > > > > > is just container_of? container_of in the code would be
> > > > > > _much_ more readable, and not leave anyone guessing wtf
> > > > > > from_tasklet is doing.
> > > > > > 
> > > > > > I'd fix that up now before everything else goes in...
> > > > > 
> > > > > As I mentioned in the other thread, I think this makes things
> > > > > much more readable. It's the same thing that the timer_struct
> > > > > conversion did (added a container_of wrapper) to avoid the
> > > > > ever-repeating use of typeof(), long lines, etc.
> > > > 
> > > > But then it should use a generic name, instead of each sub-system 
> > > > using some random name that makes people look up exactly what it
> > > > does. I'm not huge fan of the container_of() redundancy, but
> > > > adding private variants of this doesn't seem like the best way
> > > > forward. Let's have a generic helper that does this, and use it
> > > > everywhere.
> > > 
> > > I'm open to suggestions, but as things stand, these kinds of
> > > treewide
> > 
> > On naming? Implementation is just as it stands, from_tasklet() is
> > totally generic which is why I objected to it. from_member()? Not
> > great with naming... But I can see this going further and then we'll
> > suddenly have tons of these. It's not good for readability.
> 
> Since both threads seem to have petered out, let me suggest in
> kernel.h:
> 
> #define cast_out(ptr, container, member) \
>   container_of(ptr, typeof(*container), member)
> 
> It does what you want, the argument order is the same as container_of
> with the only difference being you name the containing structure
> instead of having to specify its type.

I like this! Shall I send this to Linus to see if this can land in -rc2
for use going forward?

-- 
Kees Cook

Re: [PATCH] block: convert tasklets to use new tasklet_setup() API

2020-08-17 Thread Kees Cook

On Mon, Aug 17, 2020 at 12:44:34PM -0700, Jens Axboe wrote:
> On 8/17/20 12:29 PM, Kees Cook wrote:
> > On Mon, Aug 17, 2020 at 06:56:47AM -0700, Jens Axboe wrote:
> >> On 8/17/20 2:15 AM, Allen Pais wrote:
> >>> From: Allen Pais 
> >>>
> >>> In preparation for unconditionally passing the
> >>> struct tasklet_struct pointer to all tasklet
> >>> callbacks, switch to using the new tasklet_setup()
> >>> and from_tasklet() to pass the tasklet pointer explicitly.
> >>
> >> Who came up with the idea to add a macro 'from_tasklet' that is just
> >> container_of? container_of in the code would be _much_ more readable,
> >> and not leave anyone guessing wtf from_tasklet is doing.
> >>
> >> I'd fix that up now before everything else goes in...
> > 
> > As I mentioned in the other thread, I think this makes things much more
> > readable. It's the same thing that the timer_struct conversion did
> > (added a container_of wrapper) to avoid the ever-repeating use of
> > typeof(), long lines, etc.
> 
> But then it should use a generic name, instead of each sub-system using
> some random name that makes people look up exactly what it does. I'm not
> huge fan of the container_of() redundancy, but adding private variants
> of this doesn't seem like the best way forward. Let's have a generic
> helper that does this, and use it everywhere.

I'm open to suggestions, but as things stand, these kinds of treewide
changes end up getting whole-release delays because of the need to have
the API in place for everyone before patches to do the changes can be
sent to multiple maintainers, etc.

-- 
Kees Cook

Re: [PATCH] block: convert tasklets to use new tasklet_setup() API

2020-08-17 Thread Kees Cook

On Mon, Aug 17, 2020 at 06:56:47AM -0700, Jens Axboe wrote:
> On 8/17/20 2:15 AM, Allen Pais wrote:
> > From: Allen Pais 
> > 
> > In preparation for unconditionally passing the
> > struct tasklet_struct pointer to all tasklet
> > callbacks, switch to using the new tasklet_setup()
> > and from_tasklet() to pass the tasklet pointer explicitly.
> 
> Who came up with the idea to add a macro 'from_tasklet' that is just
> container_of? container_of in the code would be _much_ more readable,
> and not leave anyone guessing wtf from_tasklet is doing.
> 
> I'd fix that up now before everything else goes in...

As I mentioned in the other thread, I think this makes things much more
readable. It's the same thing that the timer_struct conversion did
(added a container_of wrapper) to avoid the ever-repeating use of
typeof(), long lines, etc.

-- 
Kees Cook

Re: [PATCH 0/3] Modernize tasklet callback API

2020-08-11 Thread Kees Cook

On Mon, Aug 03, 2020 at 02:16:15PM +0530, Allen wrote:
> Here's the series re-based on top of 5.8
> https://github.com/allenpais/tasklets/tree/V3

Great!

> Let me know how you would want these to be reviewed.

Was a Coccinelle script used for any of these conversions? I wonder if
it'd be easier to do a single treewide patch for the more mechanical
changes.

And, actually, I still think the "prepare" patches should just be
collapsed into the actual "covert" patches -- there are only a few.

After those, yeah, I think getting these sent to their respective
maintainers is the next step.

> Also, I was thinking if removing tasklets completely could be a task
> on KSPP wiki. If yes, I did like to take ownership of that task. I have a
> couple of ideas in mind, which could be discussed in a separate email.

Sure! I will add it to the tracker. Here's for the refactoring:
https://github.com/KSPP/linux/issues/30

and here's for the removal:
https://github.com/KSPP/linux/issues/94

if you can added details/examples of how they should be removed, that'd
help other folks too, if they wanted to jump in. :)

-Kees

-- 
Kees Cook

Re: [PATCH v7 3/9] net/scm: Regularize compat handling of scm_detach_fds()

2020-08-08 Thread Kees Cook

On Fri, Aug 07, 2020 at 05:02:15PM -0700, John Stultz wrote:
> On Fri, Aug 7, 2020 at 3:18 PM Kees Cook  wrote:
> >
> > On Fri, Aug 07, 2020 at 01:29:24PM -0700, John Stultz wrote:
> > > On Thu, Jul 9, 2020 at 11:28 AM Kees Cook  wrote:
> > > >
> > > > Duplicate the cleanups from commit 2618d530dd8b ("net/scm: cleanup
> > > > scm_detach_fds") into the compat code.
> > > >
> > > > Replace open-coded __receive_sock() with a call to the helper.
> > > >
> > > > Move the check added in commit 1f466e1f15cf ("net: cleanly handle kernel
> > > > vs user buffers for ->msg_control") to before the compat call, even
> > > > though it should be impossible for an in-kernel call to also be compat.
> > > >
> > > > Correct the int "flags" argument to unsigned int to match fd_install()
> > > > and similar APIs.
> > > >
> > > > Regularize any remaining differences, including a whitespace issue,
> > > > a checkpatch warning, and add the check from commit 6900317f5eff ("net,
> > > > scm: fix PaX detected msg_controllen overflow in scm_detach_fds") which
> > > > fixed an overflow unique to 64-bit. To avoid confusion when comparing
> > > > the compat handler to the native handler, just include the same check
> > > > in the compat handler.
> > > >
> > > > Acked-by: Christian Brauner 
> > > > Signed-off-by: Kees Cook 
> > > > ---
> > >
> > > Hey Kees,
> > >   So during the merge window (while chasing a few other regressions),
> > > I noticed occasionally my Dragonboard 845c running AOSP having trouble
> > > with the web browser crashing or other apps hanging, and I've bisected
> > > the issue down to this change.
> > >
> > > Unfortunately it doesn't revert cleanly so I can't validate reverting
> > > it sorts things against linus/HEAD.  Anyway, I wanted to check and see
> > > if you had any other reports of similar or any ideas what might be
> > > going wrong?
> >
> > Hi; Yes, sorry for the trouble. I had a typo in a refactor of
> > SCM_RIGHTS. I suspect it'll be fixed by this:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1fa2c0a0c814fbae0eb3e79a510765225570d043
> >
> > Can you verify Linus's latest tree works for you? If not, there might be
> > something else hiding in the corners...
> 
> Thanks so much! Yes, I just updated to Linus' latest and the issue has
> disappeared!
> 
> thanks again!

Whew; sorry again and thanks for testing! :)

-- 
Kees Cook

Re: [PATCH net-next v1] hinic: fix strncpy output truncated compile warnings

2020-08-07 Thread Kees Cook

eg wr timeout", "phy fault",
+   [FAULT_TYPE_MAX] = "Unknown"};
+   u8 fault_type;
int err;
 
-   if (event->type < FAULT_TYPE_MAX)
-   strncpy(type_str, fault_type[event->type], 
strlen(fault_type[event->type]));
-   else
-   strncpy(type_str, "Unknown", strlen("Unknown"));
-
-   err = devlink_fmsg_string_pair_put(fmsg, "Fault type", type_str);
+   fault_type = clamp(event->type, FAULT_TYPE_MAX);
+   err = devlink_fmsg_string_pair_put(fmsg, "Fault type", 
type_str[fault_type]);
if (err)
return err;
 


-Kees

[1] 
https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings

-- 
Kees Cook

Re: [PATCH v7 3/9] net/scm: Regularize compat handling of scm_detach_fds()

2020-08-07 Thread Kees Cook

On Fri, Aug 07, 2020 at 01:29:24PM -0700, John Stultz wrote:
> On Thu, Jul 9, 2020 at 11:28 AM Kees Cook  wrote:
> >
> > Duplicate the cleanups from commit 2618d530dd8b ("net/scm: cleanup
> > scm_detach_fds") into the compat code.
> >
> > Replace open-coded __receive_sock() with a call to the helper.
> >
> > Move the check added in commit 1f466e1f15cf ("net: cleanly handle kernel
> > vs user buffers for ->msg_control") to before the compat call, even
> > though it should be impossible for an in-kernel call to also be compat.
> >
> > Correct the int "flags" argument to unsigned int to match fd_install()
> > and similar APIs.
> >
> > Regularize any remaining differences, including a whitespace issue,
> > a checkpatch warning, and add the check from commit 6900317f5eff ("net,
> > scm: fix PaX detected msg_controllen overflow in scm_detach_fds") which
> > fixed an overflow unique to 64-bit. To avoid confusion when comparing
> > the compat handler to the native handler, just include the same check
> > in the compat handler.
> >
> > Acked-by: Christian Brauner 
> > Signed-off-by: Kees Cook 
> > ---
> 
> Hey Kees,
>   So during the merge window (while chasing a few other regressions),
> I noticed occasionally my Dragonboard 845c running AOSP having trouble
> with the web browser crashing or other apps hanging, and I've bisected
> the issue down to this change.
> 
> Unfortunately it doesn't revert cleanly so I can't validate reverting
> it sorts things against linus/HEAD.  Anyway, I wanted to check and see
> if you had any other reports of similar or any ideas what might be
> going wrong?

Hi; Yes, sorry for the trouble. I had a typo in a refactor of
SCM_RIGHTS. I suspect it'll be fixed by this:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1fa2c0a0c814fbae0eb3e79a510765225570d043

Can you verify Linus's latest tree works for you? If not, there might be
something else hiding in the corners...

Thanks!

-- 
Kees Cook

Re: [PATCH 0/3] Modernize tasklet callback API

2020-07-30 Thread Kees Cook

[heavily trimmed CC list because I think lkml is ignoring this
thread...]

On Thu, Jul 30, 2020 at 09:03:55AM +0200, Thomas Gleixner wrote:
> Kees,
> 
> Kees Cook  writes:
> > This is the infrastructure changes to prepare the tasklet API for
> > conversion to passing the tasklet struct as the callback argument instead
> > of an arbitrary unsigned long. The first patch details why this is useful
> > (it's the same rationale as the timer_struct changes from a bit ago:
> > less abuse during memory corruption attacks, more in line with existing
> > ways of doing things in the kernel, save a little space in struct,
> > etc). Notably, the existing tasklet API use is much less messy, so there
> > is less to clean up.
> >
> > It's not clear to me which tree this should go through... Greg since it
> > starts with a USB clean-up, -tip for timer or interrupt, or if I should
> > just carry it. I'm open to suggestions, but if I don't hear otherwise,
> > I'll just carry it.
> >
> > My goal is to have this merged for v5.9-rc1 so that during the v5.10
> > development cycle the new API will be available. The entire tree of
> > changes is here[1] currently, but to split it up by maintainer the
> > infrastructure changes need to be landed first.
> >
> > Review and Acks appreciated! :)
> 
> I'd rather see tasklets vanish from the planet completely, but that's
> going to be a daring feat. So, grudgingly:

Understood! I will update the comments near the tasklet API.

> Acked-by: Thomas Gleixner 

Thanks!

-- 
Kees Cook

Re: [PATCH v7 2/9] pidfd: Add missing sock updates for pidfd_getfd()

2020-07-09 Thread Kees Cook

On Thu, Jul 09, 2020 at 10:00:42PM +0200, Jann Horn wrote:
> On Thu, Jul 9, 2020 at 8:26 PM Kees Cook  wrote:
> > The sock counting (sock_update_netprioidx() and sock_update_classid())
> > was missing from pidfd's implementation of received fd installation. Add
> > a call to the new __receive_sock() helper.
> [...]
> > diff --git a/kernel/pid.c b/kernel/pid.c
> [...]
> > @@ -642,10 +643,12 @@ static int pidfd_getfd(struct pid *pid, int fd)
> > }
> >
> > ret = get_unused_fd_flags(O_CLOEXEC);
> > -   if (ret < 0)
> > +   if (ret < 0) {
> > fput(file);
> > -   else
> > +   } else {
> > fd_install(ret, file);
> > +   __receive_sock(file);
> > +   }
> 
> __receive_sock() has to be before fd_install(), otherwise `file` can
> be a dangling pointer.

I've swapped the order now and double-checked the other uses. Everything
else seems fine.

-- 
Kees Cook

Re: [PATCH v7 2/9] pidfd: Add missing sock updates for pidfd_getfd()

2020-07-09 Thread Kees Cook

On Thu, Jul 09, 2020 at 10:00:42PM +0200, Jann Horn wrote:
> On Thu, Jul 9, 2020 at 8:26 PM Kees Cook  wrote:
> > The sock counting (sock_update_netprioidx() and sock_update_classid())
> > was missing from pidfd's implementation of received fd installation. Add
> > a call to the new __receive_sock() helper.
> [...]
> > diff --git a/kernel/pid.c b/kernel/pid.c
> [...]
> > @@ -642,10 +643,12 @@ static int pidfd_getfd(struct pid *pid, int fd)
> > }
> >
> > ret = get_unused_fd_flags(O_CLOEXEC);
> > -   if (ret < 0)
> > +   if (ret < 0) {
> > fput(file);
> > -   else
> > +   } else {
> > fd_install(ret, file);
> > +   __receive_sock(file);
> > +   }
> 
> __receive_sock() has to be before fd_install(), otherwise `file` can
> be a dangling pointer.

Burned by fd_install()'s API again. Thanks. I will respin.

-- 
Kees Cook

[PATCH v7 1/9] net/compat: Add missing sock updates for SCM_RIGHTS

2020-07-09 Thread Kees Cook

Add missed sock updates to compat path via a new helper, which will be
used more in coming patches. (The net/core/scm.c code is left as-is here
to assist with -stable backports for the compat path.)

Cc: sta...@vger.kernel.org
Fixes: 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set 
correctly")
Fixes: d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set 
correctly")
Signed-off-by: Kees Cook 
---
 include/net/sock.h |  4 
 net/compat.c   |  1 +
 net/core/sock.c| 21 +
 3 files changed, 26 insertions(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index c53cc42b5ab9..2be67f1ee8b1 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -890,6 +890,8 @@ static inline int sk_memalloc_socks(void)
 {
return static_branch_unlikely(&memalloc_socks_key);
 }
+
+void __receive_sock(struct file *file);
 #else
 
 static inline int sk_memalloc_socks(void)
@@ -897,6 +899,8 @@ static inline int sk_memalloc_socks(void)
return 0;
 }
 
+static inline void __receive_sock(struct file *file)
+{ }
 #endif
 
 static inline gfp_t sk_gfp_mask(const struct sock *sk, gfp_t gfp_mask)
diff --git a/net/compat.c b/net/compat.c
index 5e3041a2c37d..2937b816107d 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -309,6 +309,7 @@ void scm_detach_fds_compat(struct msghdr *kmsg, struct 
scm_cookie *scm)
break;
}
/* Bump the usage count and install the file. */
+   __receive_sock(fp[i]);
fd_install(new_fd, get_file(fp[i]));
}
 
diff --git a/net/core/sock.c b/net/core/sock.c
index 6c4acf1f0220..bde394979041 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2840,6 +2840,27 @@ int sock_no_mmap(struct file *file, struct socket *sock, 
struct vm_area_struct *
 }
 EXPORT_SYMBOL(sock_no_mmap);
 
+/*
+ * When a file is received (via SCM_RIGHTS, etc), we must bump the
+ * various sock-based usage counts.
+ */
+void __receive_sock(struct file *file)
+{
+   struct socket *sock;
+   int error;
+
+   /*
+* The resulting value of "error" is ignored here since we only
+* need to take action when the file is a socket and testing
+* "sock" for NULL is sufficient.
+*/
+   sock = sock_from_file(file, &error);
+   if (sock) {
+   sock_update_netprioidx(&sock->sk->sk_cgrp_data);
+   sock_update_classid(&sock->sk->sk_cgrp_data);
+   }
+}
+
 ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset, 
size_t size, int flags)
 {
ssize_t res;
-- 
2.25.1

[PATCH v7 6/9] pidfd: Replace open-coded receive_fd()

2020-07-09 Thread Kees Cook

Replace the open-coded version of receive_fd() with a call to the
new helper.

Thanks to Vamshi K Sthambamkadi  for
catching a missed fput() in an earlier version of this patch.

Reviewed-by: Sargun Dhillon 
Acked-by: Christian Brauner 
Signed-off-by: Kees Cook 
---
 kernel/pid.c | 15 ++-
 1 file changed, 2 insertions(+), 13 deletions(-)

diff --git a/kernel/pid.c b/kernel/pid.c
index 85ed00abdc7c..da5aea5f04fa 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -636,19 +636,8 @@ static int pidfd_getfd(struct pid *pid, int fd)
if (IS_ERR(file))
return PTR_ERR(file);
 
-   ret = security_file_receive(file);
-   if (ret) {
-   fput(file);
-   return ret;
-   }
-
-   ret = get_unused_fd_flags(O_CLOEXEC);
-   if (ret < 0) {
-   fput(file);
-   } else {
-   fd_install(ret, file);
-   __receive_sock(file);
-   }
+   ret = receive_fd(file, O_CLOEXEC);
+   fput(file);
 
return ret;
 }
-- 
2.25.1

[PATCH v7 4/9] fs: Move __scm_install_fd() to __receive_fd()

2020-07-09 Thread Kees Cook

In preparation for users of the "install a received file" logic outside
of net/ (pidfd and seccomp), relocate and rename __scm_install_fd() from
net/core/scm.c to __receive_fd() in fs/file.c, and provide a wrapper
named receive_fd_user(), as future patches will change the interface
to __receive_fd().

Reviewed-by: Sargun Dhillon 
Acked-by: Christian Brauner 
Signed-off-by: Kees Cook 
---
 fs/file.c| 41 +
 include/linux/file.h |  8 
 include/net/scm.h|  1 -
 net/compat.c |  2 +-
 net/core/scm.c   | 27 +--
 5 files changed, 51 insertions(+), 28 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index abb8b7081d7a..6220bf440809 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 
 unsigned int sysctl_nr_open __read_mostly = 1024*1024;
 unsigned int sysctl_nr_open_min = BITS_PER_LONG;
@@ -931,6 +932,46 @@ int replace_fd(unsigned fd, struct file *file, unsigned 
flags)
return err;
 }
 
+/**
+ * __receive_fd() - Install received file into file descriptor table
+ *
+ * @file: struct file that was received from another process
+ * @ufd: __user pointer to write new fd number to
+ * @o_flags: the O_* flags to apply to the new fd entry
+ *
+ * Installs a received file into the file descriptor table, with appropriate
+ * checks and count updates. Writes the fd number to userspace.
+ *
+ * This helper handles its own reference counting of the incoming
+ * struct file.
+ *
+ * Returns -ve on error.
+ */
+int __receive_fd(struct file *file, int __user *ufd, unsigned int o_flags)
+{
+   int new_fd;
+   int error;
+
+   error = security_file_receive(file);
+   if (error)
+   return error;
+
+   new_fd = get_unused_fd_flags(o_flags);
+   if (new_fd < 0)
+   return new_fd;
+
+   error = put_user(new_fd, ufd);
+   if (error) {
+   put_unused_fd(new_fd);
+   return error;
+   }
+
+   /* Bump the sock usage counts, if any. */
+   __receive_sock(file);
+   fd_install(new_fd, get_file(file));
+   return 0;
+}
+
 static int ksys_dup3(unsigned int oldfd, unsigned int newfd, int flags)
 {
int err = -EBADF;
diff --git a/include/linux/file.h b/include/linux/file.h
index 122f80084a3e..b14ff2ffd0bd 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -91,6 +91,14 @@ extern void put_unused_fd(unsigned int fd);
 
 extern void fd_install(unsigned int fd, struct file *file);
 
+extern int __receive_fd(struct file *file, int __user *ufd,
+   unsigned int o_flags);
+static inline int receive_fd_user(struct file *file, int __user *ufd,
+ unsigned int o_flags)
+{
+   return __receive_fd(file, ufd, o_flags);
+}
+
 extern void flush_delayed_fput(void);
 extern void __fput_sync(struct file *);
 
diff --git a/include/net/scm.h b/include/net/scm.h
index 581a94d6c613..1ce365f4c256 100644
--- a/include/net/scm.h
+++ b/include/net/scm.h
@@ -37,7 +37,6 @@ struct scm_cookie {
 #endif
 };
 
-int __scm_install_fd(struct file *file, int __user *ufd, unsigned int o_flags);
 void scm_detach_fds(struct msghdr *msg, struct scm_cookie *scm);
 void scm_detach_fds_compat(struct msghdr *msg, struct scm_cookie *scm);
 int __scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie 
*scm);
diff --git a/net/compat.c b/net/compat.c
index 27d477fdcaa0..e74cd3dae8b0 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -298,7 +298,7 @@ void scm_detach_fds_compat(struct msghdr *msg, struct 
scm_cookie *scm)
int err = 0, i;
 
for (i = 0; i < fdmax; i++) {
-   err = __scm_install_fd(scm->fp->fp[i], cmsg_data + i, o_flags);
+   err = receive_fd_user(scm->fp->fp[i], cmsg_data + i, o_flags);
if (err)
break;
}
diff --git a/net/core/scm.c b/net/core/scm.c
index 44f03213dcab..67c166a7820d 100644
--- a/net/core/scm.c
+++ b/net/core/scm.c
@@ -280,31 +280,6 @@ void put_cmsg_scm_timestamping(struct msghdr *msg, struct 
scm_timestamping_inter
 }
 EXPORT_SYMBOL(put_cmsg_scm_timestamping);
 
-int __scm_install_fd(struct file *file, int __user *ufd, unsigned int o_flags)
-{
-   int new_fd;
-   int error;
-
-   error = security_file_receive(file);
-   if (error)
-   return error;
-
-   new_fd = get_unused_fd_flags(o_flags);
-   if (new_fd < 0)
-   return new_fd;
-
-   error = put_user(new_fd, ufd);
-   if (error) {
-   put_unused_fd(new_fd);
-   return error;
-   }
-
-   /* Bump the sock usage counts, if any. */
-   __receive_sock(file);
-   fd_install(new_fd, get_file(file));
-   return 0;
-}
-
 static int scm_max_fds(struct msghdr *msg)
 {
if (msg->msg_controllen <= sizeof(struct cmsghdr))
@@ -331,7 +306,7 @@ void scm_

[PATCH v7 3/9] net/scm: Regularize compat handling of scm_detach_fds()

2020-07-09 Thread Kees Cook

Duplicate the cleanups from commit 2618d530dd8b ("net/scm: cleanup
scm_detach_fds") into the compat code.

Replace open-coded __receive_sock() with a call to the helper.

Move the check added in commit 1f466e1f15cf ("net: cleanly handle kernel
vs user buffers for ->msg_control") to before the compat call, even
though it should be impossible for an in-kernel call to also be compat.

Correct the int "flags" argument to unsigned int to match fd_install()
and similar APIs.

Regularize any remaining differences, including a whitespace issue,
a checkpatch warning, and add the check from commit 6900317f5eff ("net,
scm: fix PaX detected msg_controllen overflow in scm_detach_fds") which
fixed an overflow unique to 64-bit. To avoid confusion when comparing
the compat handler to the native handler, just include the same check
in the compat handler.

Acked-by: Christian Brauner 
Signed-off-by: Kees Cook 
---
 include/net/scm.h |  1 +
 net/compat.c  | 56 +--
 net/core/scm.c| 27 ++-
 3 files changed, 37 insertions(+), 47 deletions(-)

diff --git a/include/net/scm.h b/include/net/scm.h
index 1ce365f4c256..581a94d6c613 100644
--- a/include/net/scm.h
+++ b/include/net/scm.h
@@ -37,6 +37,7 @@ struct scm_cookie {
 #endif
 };
 
+int __scm_install_fd(struct file *file, int __user *ufd, unsigned int o_flags);
 void scm_detach_fds(struct msghdr *msg, struct scm_cookie *scm);
 void scm_detach_fds_compat(struct msghdr *msg, struct scm_cookie *scm);
 int __scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie 
*scm);
diff --git a/net/compat.c b/net/compat.c
index 2937b816107d..27d477fdcaa0 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -281,40 +281,31 @@ int put_cmsg_compat(struct msghdr *kmsg, int level, int 
type, int len, void *dat
return 0;
 }
 
-void scm_detach_fds_compat(struct msghdr *kmsg, struct scm_cookie *scm)
+static int scm_max_fds_compat(struct msghdr *msg)
 {
-   struct compat_cmsghdr __user *cm = (struct compat_cmsghdr __user *) 
kmsg->msg_control;
-   int fdmax = (kmsg->msg_controllen - sizeof(struct compat_cmsghdr)) / 
sizeof(int);
-   int fdnum = scm->fp->count;
-   struct file **fp = scm->fp->fp;
-   int __user *cmfptr;
-   int err = 0, i;
+   if (msg->msg_controllen <= sizeof(struct compat_cmsghdr))
+   return 0;
+   return (msg->msg_controllen - sizeof(struct compat_cmsghdr)) / 
sizeof(int);
+}
 
-   if (fdnum < fdmax)
-   fdmax = fdnum;
+void scm_detach_fds_compat(struct msghdr *msg, struct scm_cookie *scm)
+{
+   struct compat_cmsghdr __user *cm =
+   (struct compat_cmsghdr __user *)msg->msg_control;
+   unsigned int o_flags = (msg->msg_flags & MSG_CMSG_CLOEXEC) ? O_CLOEXEC 
: 0;
+   int fdmax = min_t(int, scm_max_fds_compat(msg), scm->fp->count);
+   int __user *cmsg_data = CMSG_USER_DATA(cm);
+   int err = 0, i;
 
-   for (i = 0, cmfptr = (int __user *) CMSG_COMPAT_DATA(cm); i < fdmax; 
i++, cmfptr++) {
-   int new_fd;
-   err = security_file_receive(fp[i]);
+   for (i = 0; i < fdmax; i++) {
+   err = __scm_install_fd(scm->fp->fp[i], cmsg_data + i, o_flags);
if (err)
break;
-   err = get_unused_fd_flags(MSG_CMSG_CLOEXEC & kmsg->msg_flags
- ? O_CLOEXEC : 0);
-   if (err < 0)
-   break;
-   new_fd = err;
-   err = put_user(new_fd, cmfptr);
-   if (err) {
-   put_unused_fd(new_fd);
-   break;
-   }
-   /* Bump the usage count and install the file. */
-   __receive_sock(fp[i]);
-   fd_install(new_fd, get_file(fp[i]));
}
 
if (i > 0) {
int cmlen = CMSG_COMPAT_LEN(i * sizeof(int));
+
err = put_user(SOL_SOCKET, &cm->cmsg_level);
if (!err)
err = put_user(SCM_RIGHTS, &cm->cmsg_type);
@@ -322,16 +313,19 @@ void scm_detach_fds_compat(struct msghdr *kmsg, struct 
scm_cookie *scm)
err = put_user(cmlen, &cm->cmsg_len);
if (!err) {
cmlen = CMSG_COMPAT_SPACE(i * sizeof(int));
-   kmsg->msg_control += cmlen;
-   kmsg->msg_controllen -= cmlen;
+   if (msg->msg_controllen < cmlen)
+   cmlen = msg->msg_controllen;
+   msg->msg_control += cmlen;
+   msg->msg_controllen -= cmlen;
}
}
-   if (i < fdnum)
-   kmsg->msg_flags |= MSG_CTRUNC;
+
+   if (i < scm->fp->count || (sc

[PATCH v7 0/9] Add seccomp notifier ioctl that enables adding fds

2020-07-09 Thread Kees Cook

Hello!

v7:
- break out sock usage counting fixes into more cleanly backportable pieces
- code style cleanups (christian)
- clarify addfd commit log (christian)
- add ..._SIZE_{VER0,LATEST} and BUILD_BUG_ON()s (christian)
- remove undef (christian)
- fix addfd embedded URL reference numbers
v6: https://lore.kernel.org/lkml/20200706201720.3482959-1-keesc...@chromium.org/

This continues the thread-merge between [1] and [2]. tl;dr: add a way for
a seccomp user_notif process manager to inject files into the managed
process in order to handle emulation of various fd-returning syscalls
across security boundaries. Containers folks and Chrome are in need
of the feature, and investigating this solution uncovered (and fixed)
implementation issues with existing file sending routines.

I intend to carry this in the for-next/seccomp tree, unless someone
has objections. :) Please review and test!

-Kees

[1] https://lore.kernel.org/lkml/20200603011044.7972-1-sar...@sargun.me/
[2] https://lore.kernel.org/lkml/20200610045214.1175600-1-keesc...@chromium.org/


Kees Cook (7):
  net/compat: Add missing sock updates for SCM_RIGHTS
  pidfd: Add missing sock updates for pidfd_getfd()
  net/scm: Regularize compat handling of scm_detach_fds()
  fs: Move __scm_install_fd() to __receive_fd()
  fs: Add receive_fd() wrapper for __receive_fd()
  pidfd: Replace open-coded receive_fd()
  fs: Expand __receive_fd() to accept existing fd

Sargun Dhillon (2):
  seccomp: Introduce addfd ioctl to seccomp user notifier
  selftests/seccomp: Test SECCOMP_IOCTL_NOTIF_ADDFD

 fs/file.c |  57 +
 include/linux/file.h  |  19 ++
 include/linux/seccomp.h   |   4 +
 include/net/sock.h|   4 +
 include/uapi/linux/seccomp.h  |  22 ++
 kernel/pid.c  |  14 +-
 kernel/seccomp.c  | 173 -
 net/compat.c  |  55 ++---
 net/core/scm.c|  50 +---
 net/core/sock.c   |  21 ++
 tools/testing/selftests/seccomp/seccomp_bpf.c | 229 ++
 11 files changed, 566 insertions(+), 82 deletions(-)

-- 
2.25.1

[PATCH v7 2/9] pidfd: Add missing sock updates for pidfd_getfd()

2020-07-09 Thread Kees Cook

The sock counting (sock_update_netprioidx() and sock_update_classid())
was missing from pidfd's implementation of received fd installation. Add
a call to the new __receive_sock() helper.

Cc: sta...@vger.kernel.org
Fixes: 8649c322f75c ("pid: Implement pidfd_getfd syscall")
Signed-off-by: Kees Cook 
---
 kernel/pid.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/pid.c b/kernel/pid.c
index f1496b757162..85ed00abdc7c 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct pid init_struct_pid = {
.count  = REFCOUNT_INIT(1),
@@ -642,10 +643,12 @@ static int pidfd_getfd(struct pid *pid, int fd)
}
 
ret = get_unused_fd_flags(O_CLOEXEC);
-   if (ret < 0)
+   if (ret < 0) {
fput(file);
-   else
+   } else {
fd_install(ret, file);
+   __receive_sock(file);
+   }
 
return ret;
 }
-- 
2.25.1

[PATCH v7 7/9] fs: Expand __receive_fd() to accept existing fd

2020-07-09 Thread Kees Cook

Expand __receive_fd() with support for replace_fd() for the coming seccomp
"addfd" ioctl(). Add new wrapper receive_fd_replace() for the new behavior
and update existing wrappers to retain old behavior.

Thanks to Colin Ian King  for pointing out an
uninitialized variable exposure in an earlier version of this patch.

Reviewed-by: Sargun Dhillon 
Acked-by: Christian Brauner 
Signed-off-by: Kees Cook 
---
 fs/file.c| 25 +++--
 include/linux/file.h | 10 +++---
 2 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index 87954bab9306..feebdc17bf46 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -935,6 +935,7 @@ int replace_fd(unsigned fd, struct file *file, unsigned 
flags)
 /**
  * __receive_fd() - Install received file into file descriptor table
  *
+ * @fd: fd to install into (if negative, a new fd will be allocated)
  * @file: struct file that was received from another process
  * @ufd: __user pointer to write new fd number to
  * @o_flags: the O_* flags to apply to the new fd entry
@@ -948,7 +949,7 @@ int replace_fd(unsigned fd, struct file *file, unsigned 
flags)
  *
  * Returns newly install fd or -ve on error.
  */
-int __receive_fd(struct file *file, int __user *ufd, unsigned int o_flags)
+int __receive_fd(int fd, struct file *file, int __user *ufd, unsigned int 
o_flags)
 {
int new_fd;
int error;
@@ -957,21 +958,33 @@ int __receive_fd(struct file *file, int __user *ufd, 
unsigned int o_flags)
if (error)
return error;
 
-   new_fd = get_unused_fd_flags(o_flags);
-   if (new_fd < 0)
-   return new_fd;
+   if (fd < 0) {
+   new_fd = get_unused_fd_flags(o_flags);
+   if (new_fd < 0)
+   return new_fd;
+   } else {
+   new_fd = fd;
+   }
 
if (ufd) {
error = put_user(new_fd, ufd);
if (error) {
-   put_unused_fd(new_fd);
+   if (fd < 0)
+   put_unused_fd(new_fd);
return error;
}
}
 
+   if (fd < 0) {
+   fd_install(new_fd, get_file(file));
+   } else {
+   error = replace_fd(new_fd, file, o_flags);
+   if (error)
+   return error;
+   }
+
/* Bump the sock usage counts, if any. */
__receive_sock(file);
-   fd_install(new_fd, get_file(file));
return new_fd;
 }
 
diff --git a/include/linux/file.h b/include/linux/file.h
index d9fee9f5c8da..225982792fa2 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -92,18 +92,22 @@ extern void put_unused_fd(unsigned int fd);
 
 extern void fd_install(unsigned int fd, struct file *file);
 
-extern int __receive_fd(struct file *file, int __user *ufd,
+extern int __receive_fd(int fd, struct file *file, int __user *ufd,
unsigned int o_flags);
 static inline int receive_fd_user(struct file *file, int __user *ufd,
  unsigned int o_flags)
 {
if (ufd == NULL)
return -EFAULT;
-   return __receive_fd(file, ufd, o_flags);
+   return __receive_fd(-1, file, ufd, o_flags);
 }
 static inline int receive_fd(struct file *file, unsigned int o_flags)
 {
-   return __receive_fd(file, NULL, o_flags);
+   return __receive_fd(-1, file, NULL, o_flags);
+}
+static inline int receive_fd_replace(int fd, struct file *file, unsigned int 
o_flags)
+{
+   return __receive_fd(fd, file, NULL, o_flags);
 }
 
 extern void flush_delayed_fput(void);
-- 
2.25.1

[PATCH v7 5/9] fs: Add receive_fd() wrapper for __receive_fd()

2020-07-09 Thread Kees Cook

For both pidfd and seccomp, the __user pointer is not used. Update
__receive_fd() to make writing to ufd optional via a NULL check. However,
for the receive_fd_user() wrapper, ufd is NULL checked so an -EFAULT
can be returned to avoid changing the SCM_RIGHTS interface behavior. Add
new wrapper receive_fd() for pidfd and seccomp that does not use the ufd
argument. For the new helper, the allocated fd needs to be returned on
success. Update the existing callers to handle it.

Reviewed-by: Sargun Dhillon 
Acked-by: Christian Brauner 
Signed-off-by: Kees Cook 
---
 fs/file.c| 17 ++---
 include/linux/file.h |  7 +++
 net/compat.c |  2 +-
 net/core/scm.c   |  2 +-
 4 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index 6220bf440809..87954bab9306 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -940,12 +940,13 @@ int replace_fd(unsigned fd, struct file *file, unsigned 
flags)
  * @o_flags: the O_* flags to apply to the new fd entry
  *
  * Installs a received file into the file descriptor table, with appropriate
- * checks and count updates. Writes the fd number to userspace.
+ * checks and count updates. Optionally writes the fd number to userspace, if
+ * @ufd is non-NULL.
  *
  * This helper handles its own reference counting of the incoming
  * struct file.
  *
- * Returns -ve on error.
+ * Returns newly install fd or -ve on error.
  */
 int __receive_fd(struct file *file, int __user *ufd, unsigned int o_flags)
 {
@@ -960,16 +961,18 @@ int __receive_fd(struct file *file, int __user *ufd, 
unsigned int o_flags)
if (new_fd < 0)
return new_fd;
 
-   error = put_user(new_fd, ufd);
-   if (error) {
-   put_unused_fd(new_fd);
-   return error;
+   if (ufd) {
+   error = put_user(new_fd, ufd);
+   if (error) {
+   put_unused_fd(new_fd);
+   return error;
+   }
}
 
/* Bump the sock usage counts, if any. */
__receive_sock(file);
fd_install(new_fd, get_file(file));
-   return 0;
+   return new_fd;
 }
 
 static int ksys_dup3(unsigned int oldfd, unsigned int newfd, int flags)
diff --git a/include/linux/file.h b/include/linux/file.h
index b14ff2ffd0bd..d9fee9f5c8da 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct file;
 
@@ -96,8 +97,14 @@ extern int __receive_fd(struct file *file, int __user *ufd,
 static inline int receive_fd_user(struct file *file, int __user *ufd,
  unsigned int o_flags)
 {
+   if (ufd == NULL)
+   return -EFAULT;
return __receive_fd(file, ufd, o_flags);
 }
+static inline int receive_fd(struct file *file, unsigned int o_flags)
+{
+   return __receive_fd(file, NULL, o_flags);
+}
 
 extern void flush_delayed_fput(void);
 extern void __fput_sync(struct file *);
diff --git a/net/compat.c b/net/compat.c
index e74cd3dae8b0..dc7ddbc2b15e 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -299,7 +299,7 @@ void scm_detach_fds_compat(struct msghdr *msg, struct 
scm_cookie *scm)
 
for (i = 0; i < fdmax; i++) {
err = receive_fd_user(scm->fp->fp[i], cmsg_data + i, o_flags);
-   if (err)
+   if (err < 0)
break;
}
 
diff --git a/net/core/scm.c b/net/core/scm.c
index 67c166a7820d..8156d4fb8a39 100644
--- a/net/core/scm.c
+++ b/net/core/scm.c
@@ -307,7 +307,7 @@ void scm_detach_fds(struct msghdr *msg, struct scm_cookie 
*scm)
 
for (i = 0; i < fdmax; i++) {
err = receive_fd_user(scm->fp->fp[i], cmsg_data + i, o_flags);
-   if (err)
+   if (err < 0)
break;
}
 
-- 
2.25.1

[PATCH v7 9/9] selftests/seccomp: Test SECCOMP_IOCTL_NOTIF_ADDFD

2020-07-09 Thread Kees Cook

From: Sargun Dhillon 

Test whether we can add file descriptors in response to notifications.
This injects the file descriptors via notifications, and then uses kcmp
to determine whether or not it has been successful.

It also includes some basic sanity checking for arguments.

Signed-off-by: Sargun Dhillon 
Link: https://lore.kernel.org/r/20200603011044.7972-5-sar...@sargun.me
Co-developed-by: Kees Cook 
Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 229 ++
 1 file changed, 229 insertions(+)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 4662a25bc9e8..3f41b32b9165 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -45,6 +45,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -168,7 +169,9 @@ struct seccomp_metadata {
 
 #ifndef SECCOMP_FILTER_FLAG_NEW_LISTENER
 #define SECCOMP_FILTER_FLAG_NEW_LISTENER   (1UL << 3)
+#endif
 
+#ifndef SECCOMP_RET_USER_NOTIF
 #define SECCOMP_RET_USER_NOTIF 0x7fc0U
 
 #define SECCOMP_IOC_MAGIC  '!'
@@ -204,6 +207,39 @@ struct seccomp_notif_sizes {
 };
 #endif
 
+#ifndef SECCOMP_IOCTL_NOTIF_ADDFD
+/* On success, the return value is the remote process's added fd number */
+#define SECCOMP_IOCTL_NOTIF_ADDFD  SECCOMP_IOW(3,  \
+   struct seccomp_notif_addfd)
+
+/* valid flags for seccomp_notif_addfd */
+#define SECCOMP_ADDFD_FLAG_SETFD   (1UL << 0) /* Specify remote fd */
+
+struct seccomp_notif_addfd {
+   __u64 id;
+   __u32 flags;
+   __u32 srcfd;
+   __u32 newfd;
+   __u32 newfd_flags;
+};
+#endif
+
+struct seccomp_notif_addfd_small {
+   __u64 id;
+   char weird[4];
+};
+#define SECCOMP_IOCTL_NOTIF_ADDFD_SMALL\
+   SECCOMP_IOW(3, struct seccomp_notif_addfd_small)
+
+struct seccomp_notif_addfd_big {
+   union {
+   struct seccomp_notif_addfd addfd;
+   char buf[sizeof(struct seccomp_notif_addfd) + 8];
+   };
+};
+#define SECCOMP_IOCTL_NOTIF_ADDFD_BIG  \
+   SECCOMP_IOWR(3, struct seccomp_notif_addfd_big)
+
 #ifndef PTRACE_EVENTMSG_SYSCALL_ENTRY
 #define PTRACE_EVENTMSG_SYSCALL_ENTRY  1
 #define PTRACE_EVENTMSG_SYSCALL_EXIT   2
@@ -3740,6 +3776,199 @@ TEST(user_notification_filter_empty_threaded)
EXPECT_GT((pollfd.revents & POLLHUP) ?: 0, 0);
 }
 
+TEST(user_notification_addfd)
+{
+   pid_t pid;
+   long ret;
+   int status, listener, memfd, fd;
+   struct seccomp_notif_addfd addfd = {};
+   struct seccomp_notif_addfd_small small = {};
+   struct seccomp_notif_addfd_big big = {};
+   struct seccomp_notif req = {};
+   struct seccomp_notif_resp resp = {};
+   /* 100 ms */
+   struct timespec delay = { .tv_nsec = 1 };
+
+   memfd = memfd_create("test", 0);
+   ASSERT_GE(memfd, 0);
+
+   ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
+   ASSERT_EQ(0, ret) {
+   TH_LOG("Kernel does not support PR_SET_NO_NEW_PRIVS!");
+   }
+
+   /* Check that the basic notification machinery works */
+   listener = user_notif_syscall(__NR_getppid,
+ SECCOMP_FILTER_FLAG_NEW_LISTENER);
+   ASSERT_GE(listener, 0);
+
+   pid = fork();
+   ASSERT_GE(pid, 0);
+
+   if (pid == 0) {
+   if (syscall(__NR_getppid) != USER_NOTIF_MAGIC)
+   exit(1);
+   exit(syscall(__NR_getppid) != USER_NOTIF_MAGIC);
+   }
+
+   ASSERT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_RECV, &req), 0);
+
+   addfd.srcfd = memfd;
+   addfd.newfd = 0;
+   addfd.id = req.id;
+   addfd.flags = 0x0;
+
+   /* Verify bad newfd_flags cannot be set */
+   addfd.newfd_flags = ~O_CLOEXEC;
+   EXPECT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_ADDFD, &addfd), -1);
+   EXPECT_EQ(errno, EINVAL);
+   addfd.newfd_flags = O_CLOEXEC;
+
+   /* Verify bad flags cannot be set */
+   addfd.flags = 0xff;
+   EXPECT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_ADDFD, &addfd), -1);
+   EXPECT_EQ(errno, EINVAL);
+   addfd.flags = 0;
+
+   /* Verify that remote_fd cannot be set without setting flags */
+   addfd.newfd = 1;
+   EXPECT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_ADDFD, &addfd), -1);
+   EXPECT_EQ(errno, EINVAL);
+   addfd.newfd = 0;
+
+   /* Verify small size cannot be set */
+   EXPECT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_ADDFD_SMALL, &small), -1);
+   EXPECT_EQ(errno, EINVAL);
+
+   /* Verify we can't send bits filled in unknown buffer area */
+   memset(&big, 0xAA, sizeof(big));
+   big.addfd = addfd;
+   EXPECT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_ADDFD_BIG, &big), -1);
+   EXPECT_EQ(errno, E2BIG);
+
+
+   /* Verify we can

[PATCH v7 8/9] seccomp: Introduce addfd ioctl to seccomp user notifier

2020-07-09 Thread Kees Cook

From: Sargun Dhillon 

The current SECCOMP_RET_USER_NOTIF API allows for syscall supervision over
an fd. It is often used in settings where a supervising task emulates
syscalls on behalf of a supervised task in userspace, either to further
restrict the supervisee's syscall abilities or to circumvent kernel
enforced restrictions the supervisor deems safe to lift (e.g. actually
performing a mount(2) for an unprivileged container).

While SECCOMP_RET_USER_NOTIF allows for the interception of any syscall,
only a certain subset of syscalls could be correctly emulated. Over the
last few development cycles, the set of syscalls which can't be emulated
has been reduced due to the addition of pidfd_getfd(2). With this we are
now able to, for example, intercept syscalls that require the supervisor
to operate on file descriptors of the supervisee such as connect(2).

However, syscalls that cause new file descriptors to be installed can not
currently be correctly emulated since there is no way for the supervisor
to inject file descriptors into the supervisee. This patch adds a
new addfd ioctl to remove this restriction by allowing the supervisor to
install file descriptors into the intercepted task. By implementing this
feature via seccomp the supervisor effectively instructs the supervisee
to install a set of file descriptors into its own file descriptor table
during the intercepted syscall. This way it is possible to intercept
syscalls such as open() or accept(), and install (or replace, like
dup2(2)) the supervisor's resulting fd into the supervisee. One
replacement use-case would be to redirect the stdout and stderr of a
supervisee into log file descriptors opened by the supervisor.

The ioctl handling is based on the discussions[1] of how Extensible
Arguments should interact with ioctls. Instead of building size into
the addfd structure, make it a function of the ioctl command (which
is how sizes are normally passed to ioctls). To support forward and
backward compatibility, just mask out the direction and size, and match
everything. The size (and any future direction) checks are done along
with copy_struct_from_user() logic.

As a note, the seccomp_notif_addfd structure is laid out based on 8-byte
alignment without requiring packing as there have been packing issues
with uapi highlighted before[2][3]. Although we could overload the
newfd field and use -1 to indicate that it is not to be used, doing
so requires changing the size of the fd field, and introduces struct
packing complexity.

[1]: https://lore.kernel.org/lkml/87o8w9bcaf@mid.deneb.enyo.de/
[2]: 
https://lore.kernel.org/lkml/a328b91d-fd8f-4f27-b3c2-91a9c45f1...@rasmusvillemoes.dk/
[3]: 
https://lore.kernel.org/lkml/20200612104629.GA15814@ircssh-2.c.rugged-nimbus-611.internal

Suggested-by: Matt Denton 
Link: https://lore.kernel.org/r/20200603011044.7972-4-sar...@sargun.me
Signed-off-by: Sargun Dhillon 
Co-developed-by: Kees Cook 
Signed-off-by: Kees Cook 
---
 include/linux/seccomp.h  |   4 +
 include/uapi/linux/seccomp.h |  22 +
 kernel/seccomp.c | 173 ++-
 3 files changed, 198 insertions(+), 1 deletion(-)

diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
index babcd6c02d09..881c90b6aa25 100644
--- a/include/linux/seccomp.h
+++ b/include/linux/seccomp.h
@@ -10,6 +10,10 @@
 SECCOMP_FILTER_FLAG_NEW_LISTENER | \
 SECCOMP_FILTER_FLAG_TSYNC_ESRCH)
 
+/* sizeof() the first published struct seccomp_notif_addfd */
+#define SECCOMP_NOTIFY_ADDFD_SIZE_VER0 24
+#define SECCOMP_NOTIFY_ADDFD_SIZE_LATEST SECCOMP_NOTIFY_ADDFD_SIZE_VER0
+
 #ifdef CONFIG_SECCOMP
 
 #include 
diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 965290f7dcc2..6ba18b82a02e 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -113,6 +113,25 @@ struct seccomp_notif_resp {
__u32 flags;
 };
 
+/* valid flags for seccomp_notif_addfd */
+#define SECCOMP_ADDFD_FLAG_SETFD   (1UL << 0) /* Specify remote fd */
+
+/**
+ * struct seccomp_notif_addfd
+ * @id: The ID of the seccomp notification
+ * @flags: SECCOMP_ADDFD_FLAG_*
+ * @srcfd: The local fd number
+ * @newfd: Optional remote FD number if SETFD option is set, otherwise 0.
+ * @newfd_flags: The O_* flags the remote FD should have applied
+ */
+struct seccomp_notif_addfd {
+   __u64 id;
+   __u32 flags;
+   __u32 srcfd;
+   __u32 newfd;
+   __u32 newfd_flags;
+};
+
 #define SECCOMP_IOC_MAGIC  '!'
 #define SECCOMP_IO(nr) _IO(SECCOMP_IOC_MAGIC, nr)
 #define SECCOMP_IOR(nr, type)  _IOR(SECCOMP_IOC_MAGIC, nr, type)
@@ -124,5 +143,8 @@ struct seccomp_notif_resp {
 #define SECCOMP_IOCTL_NOTIF_SEND   SECCOMP_IOWR(1, \
struct seccomp_notif_resp)
 #define SECCOMP_IOCTL_NOTIF_ID_VALID   SECCOMP_IOW(2, __

Re: [PATCH v6 4/7] pidfd: Replace open-coded partial receive_fd()

2020-07-08 Thread Kees Cook

On Tue, Jul 07, 2020 at 02:22:20PM +0200, Christian Brauner wrote:
> So while the patch is correct it leaves 5.6 and 5.7 with a bug in the
> pidfd_getfd() implementation and that just doesn't seem right. I'm
> wondering whether we should introduce:
> 
> void sock_update(struct file *file)
> {
>   struct socket *sock;
>   int error;
> 
>   sock = sock_from_file(file, &error);
>   if (sock) {
>   sock_update_netprioidx(&sock->sk->sk_cgrp_data);
>   sock_update_classid(&sock->sk->sk_cgrp_data);
>   }
> }
> 
> and switch pidfd_getfd() over to:
> 
> diff --git a/kernel/pid.c b/kernel/pid.c
> index f1496b757162..c26bba822be3 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -642,10 +642,12 @@ static int pidfd_getfd(struct pid *pid, int fd)
> }
> 
> ret = get_unused_fd_flags(O_CLOEXEC);
> -   if (ret < 0)
> +   if (ret < 0) {
> fput(file);
> -   else
> +   } else {
> +   sock_update(file);
> fd_install(ret, file);
> +   }
> 
> return ret;
>  }
> 
> first thing in the series and then all of the other patches on top of it
> so that we can Cc stable for this and that can get it backported to 5.6,
> 5.7, and 5.8.
> 
> Alternatively, I can make this a separate bugfix patch series which I'll
> send upstream soonish. Or we have specific patches just for 5.6, 5.7,
> and 5.8. Thoughts?

Okay, I looked at hch's clean-ups again and I'm reminded why they
don't make great -stable material. :) The compat bug (also missing the
sock_update()) needs a similar fix (going back to 3.6...), so, yeah,
for ease of backport, probably an explicit sock_update() implementation
(with compat and native scm using it), and a second patch for pidfd.

Let me see what I looks best...

-- 
Kees Cook

Re: [PATCH v6 6/7] seccomp: Introduce addfd ioctl to seccomp user notifier

2020-07-08 Thread Kees Cook

On Tue, Jul 07, 2020 at 03:30:49PM +0200, Christian Brauner wrote:
> Hm, maybe change that description to sm like:
> 
> [...]

Cool, yeah. Thanks! I've tweaked it a little more

> > +   /* 24 is original sizeof(struct seccomp_notif_addfd) */
> > +   if (size < 24 || size >= PAGE_SIZE)
> > +   return -EINVAL;
> 
> Hm, so maybe add the following:
> 
> #define SECCOMP_NOTIFY_ADDFD_VER0 24
> #define SECCOMP_NOTIFY_ADDFD_LATEST SECCOMP_NOTIFY_ADDFD_VER0
> 
> and then place:
> 
> BUILD_BUG_ON(sizeof(struct seccomp_notify_addfd) < SECCOMP_NOTIFY_ADDFD_VER0);
> BUILD_BUG_ON(sizeof(struct open_how) != SECCOMP_NOTIFY_ADDFD_LATEST);

Yes, good idea (BTW, did the EA syscall docs land?)

I've made these SECCOMP_NOTIFY_ADDFD_SIZE_* to match your examples below
(i.e.  I added "SIZE" to what you suggested above).

> somewhere which is what we do for clone3(), openat2() and others to
> catch build-time nonsense.
> 
> include/uapi/linux/perf_event.h:#define PERF_ATTR_SIZE_VER0 64  /* 
> sizeof first published struct */
> include/uapi/linux/sched.h:#define CLONE_ARGS_SIZE_VER0 64 /* sizeof first 
> published struct */
> include/uapi/linux/sched/types.h:#define SCHED_ATTR_SIZE_VER0   48  /* 
> sizeof first published struct */
> include/linux/fcntl.h:#define OPEN_HOW_SIZE_VER024 /* sizeof first 
> published struct */
> include/linux/fcntl.h:#define OPEN_HOW_SIZE_LATEST  OPEN_HOW_SIZE_VER0

The ..._SIZE_VER0 and ...LATEST stuff doesn't seem useful to export via
UAPI. Above, 2 of the 3 export to uapi. Is there a specific rationale
for which should and which shouldn't?

> > +#undef EA_IOCTL
> 
> Why is this undefed? :)

It was defined "in" a function, so I like to mimic function visibility.
But you're right; there's no reason to undef it.

-- 
Kees Cook

Re: [PATCH v6 5/7] fs: Expand __receive_fd() to accept existing fd

2020-07-08 Thread Kees Cook

On Tue, Jul 07, 2020 at 02:38:54PM +0200, Christian Brauner wrote:
> On Mon, Jul 06, 2020 at 01:17:18PM -0700, Kees Cook wrote:
> > Expand __receive_fd() with support for replace_fd() for the coming seccomp
> > "addfd" ioctl(). Add new wrapper receive_fd_replace() for the new behavior
> > and update existing wrappers to retain old behavior.
> > 
> > Thanks to Colin Ian King  for pointing out an
> > uninitialized variable exposure in an earlier version of this patch.
> > 
> > Reviewed-by: Sargun Dhillon 
> > Signed-off-by: Kees Cook 
> > ---
> 
> Thanks!
> (One tiny-nit below.)
> Acked-by: Christian Brauner 
> 
> >  fs/file.c| 24 ++--
> >  include/linux/file.h | 10 +++---
> >  2 files changed, 25 insertions(+), 9 deletions(-)
> > 
> > diff --git a/fs/file.c b/fs/file.c
> > index 0efdcf413210..11313ff36802 100644
> > --- a/fs/file.c
> > +++ b/fs/file.c
> > @@ -937,6 +937,7 @@ int replace_fd(unsigned fd, struct file *file, unsigned 
> > flags)
> >  /**
> >   * __receive_fd() - Install received file into file descriptor table
> >   *
> > + * @fd: fd to install into (if negative, a new fd will be allocated)
> >   * @file: struct file that was received from another process
> >   * @ufd: __user pointer to write new fd number to
> >   * @o_flags: the O_* flags to apply to the new fd entry
> > @@ -950,7 +951,7 @@ int replace_fd(unsigned fd, struct file *file, unsigned 
> > flags)
> >   *
> >   * Returns newly install fd or -ve on error.
> >   */
> > -int __receive_fd(struct file *file, int __user *ufd, unsigned int o_flags)
> > +int __receive_fd(int fd, struct file *file, int __user *ufd, unsigned int 
> > o_flags)
> >  {
> > struct socket *sock;
> > int new_fd;
> > @@ -960,18 +961,30 @@ int __receive_fd(struct file *file, int __user *ufd, 
> > unsigned int o_flags)
> > if (error)
> > return error;
> >  
> > -   new_fd = get_unused_fd_flags(o_flags);
> > -   if (new_fd < 0)
> > -   return new_fd;
> > +   if (fd < 0) {
> > +   new_fd = get_unused_fd_flags(o_flags);
> > +   if (new_fd < 0)
> > +   return new_fd;
> > +   } else
> > +   new_fd = fd;
> 
> This is nitpicky but coding style technically wants us to use braces
> around both branches if one of them requires them. ;)

Ah yeah, good point. Fixed. Thanks!

-- 
Kees Cook

Re: [PATCH v6 4/7] pidfd: Replace open-coded partial receive_fd()

2020-07-08 Thread Kees Cook

On Tue, Jul 07, 2020 at 02:22:20PM +0200, Christian Brauner wrote:
> On Mon, Jul 06, 2020 at 01:17:17PM -0700, Kees Cook wrote:
> > The sock counting (sock_update_netprioidx() and sock_update_classid()) was
> > missing from pidfd's implementation of received fd installation. Replace
> > the open-coded version with a call to the new receive_fd()
> > helper.
> > 
> > Thanks to Vamshi K Sthambamkadi  for
> > catching a missed fput() in an earlier version of this patch.
> > 
> > Fixes: 8649c322f75c ("pid: Implement pidfd_getfd syscall")
> > Reviewed-by: Sargun Dhillon 
> > Signed-off-by: Kees Cook 
> > ---
> 
> Thanks!
> Acked-by: Christian Brauner 
> 
> Christoph, Kees,
> 
> So while the patch is correct it leaves 5.6 and 5.7 with a bug in the
> pidfd_getfd() implementation and that just doesn't seem right. I'm
> wondering whether we should introduce:
> 
> void sock_update(struct file *file)
> {
>   struct socket *sock;
>   int error;
> 
>   sock = sock_from_file(file, &error);
>   if (sock) {
>   sock_update_netprioidx(&sock->sk->sk_cgrp_data);
>   sock_update_classid(&sock->sk->sk_cgrp_data);
>   }
> }
> 
> and switch pidfd_getfd() over to:
> 
> diff --git a/kernel/pid.c b/kernel/pid.c
> index f1496b757162..c26bba822be3 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -642,10 +642,12 @@ static int pidfd_getfd(struct pid *pid, int fd)
> }
> 
> ret = get_unused_fd_flags(O_CLOEXEC);
> -   if (ret < 0)
> +   if (ret < 0) {
> fput(file);
> -   else
> +   } else {
> +   sock_update(file);
> fd_install(ret, file);
> +   }
> 
> return ret;
>  }
> 
> first thing in the series and then all of the other patches on top of it
> so that we can Cc stable for this and that can get it backported to 5.6,
> 5.7, and 5.8.
> 
> Alternatively, I can make this a separate bugfix patch series which I'll
> send upstream soonish. Or we have specific patches just for 5.6, 5.7,
> and 5.8. Thoughts?

I was thinking of just tossing the entire series (hch's and mine) at
-stable since it's relatively narrow. I'll look at what's needed for
backports...

> 
> Thanks!
> Christian

-- 
Kees Cook

Re: [PATCH v6 3/7] fs: Add receive_fd() wrapper for __receive_fd()

2020-07-08 Thread Kees Cook

On Tue, Jul 07, 2020 at 01:49:23PM +0200, Christian Brauner wrote:
> On Mon, Jul 06, 2020 at 01:17:16PM -0700, Kees Cook wrote:
> > For both pidfd and seccomp, the __user pointer is not used. Update
> > __receive_fd() to make writing to ufd optional via a NULL check. However,
> > for the receive_fd_user() wrapper, ufd is NULL checked so an -EFAULT
> > can be returned to avoid changing the SCM_RIGHTS interface behavior. Add
> > new wrapper receive_fd() for pidfd and seccomp that does not use the ufd
> > argument. For the new helper, the allocated fd needs to be returned on
> > success. Update the existing callers to handle it.
> > 
> > Reviewed-by: Sargun Dhillon 
> > Signed-off-by: Kees Cook 
> > ---
> 
> Hm, I'm not sure why 2/7 and 3/7 aren't just one patch but ok. :)

I wanted to do a "clean" move from one source to another without any
behavioral changes first.

> Acked-by: Christian Brauner 

Thanks!

-- 
Kees Cook

Re: [PATCH v6 1/7] net/scm: Regularize compat handling of scm_detach_fds()

2020-07-08 Thread Kees Cook

On Tue, Jul 07, 2020 at 01:41:03PM +0200, Christian Brauner wrote:
> On Mon, Jul 06, 2020 at 01:17:14PM -0700, Kees Cook wrote:
> > Duplicate the cleanups from commit 2618d530dd8b ("net/scm: cleanup
> > scm_detach_fds") into the compat code.
> > 
> > Move the check added in commit 1f466e1f15cf ("net: cleanly handle kernel
> > vs user buffers for ->msg_control") to before the compat call, even
> > though it should be impossible for an in-kernel call to also be compat.
> > 
> > Correct the int "flags" argument to unsigned int to match fd_install()
> > and similar APIs.
> > 
> > Regularize any remaining differences, including a whitespace issue,
> > a checkpatch warning, and add the check from commit 6900317f5eff ("net,
> > scm: fix PaX detected msg_controllen overflow in scm_detach_fds") which
> > fixed an overflow unique to 64-bit. To avoid confusion when comparing
> > the compat handler to the native handler, just include the same check
> > in the compat handler.
> > 
> > Fixes: 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not 
> > set correctly")
> > Fixes: d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not 
> > set correctly")
> > Signed-off-by: Kees Cook 
> > ---
> 
> Thanks. Just a comment below.
> Acked-by: Christian Brauner 

Thanks!

> >  include/net/scm.h |  1 +
> >  net/compat.c  | 55 +--
> >  net/core/scm.c| 18 
> >  3 files changed, 35 insertions(+), 39 deletions(-)
> > 
> > diff --git a/include/net/scm.h b/include/net/scm.h
> > index 1ce365f4c256..581a94d6c613 100644
> > --- a/include/net/scm.h
> > +++ b/include/net/scm.h
> > @@ -37,6 +37,7 @@ struct scm_cookie {
> >  #endif
> >  };
> >  
> > +int __scm_install_fd(struct file *file, int __user *ufd, unsigned int 
> > o_flags);
> >  void scm_detach_fds(struct msghdr *msg, struct scm_cookie *scm);
> >  void scm_detach_fds_compat(struct msghdr *msg, struct scm_cookie *scm);
> >  int __scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie 
> > *scm);
> > diff --git a/net/compat.c b/net/compat.c
> > index 5e3041a2c37d..27d477fdcaa0 100644
> > --- a/net/compat.c
> > +++ b/net/compat.c
> > @@ -281,39 +281,31 @@ int put_cmsg_compat(struct msghdr *kmsg, int level, 
> > int type, int len, void *dat
> > return 0;
> >  }
> >  
> > -void scm_detach_fds_compat(struct msghdr *kmsg, struct scm_cookie *scm)
> > +static int scm_max_fds_compat(struct msghdr *msg)
> >  {
> > -   struct compat_cmsghdr __user *cm = (struct compat_cmsghdr __user *) 
> > kmsg->msg_control;
> > -   int fdmax = (kmsg->msg_controllen - sizeof(struct compat_cmsghdr)) / 
> > sizeof(int);
> > -   int fdnum = scm->fp->count;
> > -   struct file **fp = scm->fp->fp;
> > -   int __user *cmfptr;
> > -   int err = 0, i;
> > +   if (msg->msg_controllen <= sizeof(struct compat_cmsghdr))
> > +   return 0;
> > +   return (msg->msg_controllen - sizeof(struct compat_cmsghdr)) / 
> > sizeof(int);
> > +}
> >  
> > -   if (fdnum < fdmax)
> > -   fdmax = fdnum;
> > +void scm_detach_fds_compat(struct msghdr *msg, struct scm_cookie *scm)
> > +{
> > +   struct compat_cmsghdr __user *cm =
> > +   (struct compat_cmsghdr __user *)msg->msg_control;
> > +   unsigned int o_flags = (msg->msg_flags & MSG_CMSG_CLOEXEC) ? O_CLOEXEC 
> > : 0;
> > +   int fdmax = min_t(int, scm_max_fds_compat(msg), scm->fp->count);
> 
> Just a note that SCM_RIGHTS fd-sending is limited to 253 (SCM_MAX_FD)
> fds so min_t should never ouput > SCM_MAX_FD here afaict.
> 
> > +   int __user *cmsg_data = CMSG_USER_DATA(cm);
> > +   int err = 0, i;
> >  
> > -   for (i = 0, cmfptr = (int __user *) CMSG_COMPAT_DATA(cm); i < fdmax; 
> > i++, cmfptr++) {
> > -   int new_fd;
> > -   err = security_file_receive(fp[i]);
> > +   for (i = 0; i < fdmax; i++) {
> > +   err = __scm_install_fd(scm->fp->fp[i], cmsg_data + i, o_flags);
> > if (err)
> > break;
> > -   err = get_unused_fd_flags(MSG_CMSG_CLOEXEC & kmsg->msg_flags
> > - ? O_CLOEXEC : 0);
> > -   if (err < 0)
> > -   break;
> > -   new_fd = err;
> > -   err = put_user(new_fd, cmfptr);
> > -   i

[PATCH v6 1/7] net/scm: Regularize compat handling of scm_detach_fds()

2020-07-06 Thread Kees Cook

Duplicate the cleanups from commit 2618d530dd8b ("net/scm: cleanup
scm_detach_fds") into the compat code.

Move the check added in commit 1f466e1f15cf ("net: cleanly handle kernel
vs user buffers for ->msg_control") to before the compat call, even
though it should be impossible for an in-kernel call to also be compat.

Correct the int "flags" argument to unsigned int to match fd_install()
and similar APIs.

Regularize any remaining differences, including a whitespace issue,
a checkpatch warning, and add the check from commit 6900317f5eff ("net,
scm: fix PaX detected msg_controllen overflow in scm_detach_fds") which
fixed an overflow unique to 64-bit. To avoid confusion when comparing
the compat handler to the native handler, just include the same check
in the compat handler.

Fixes: 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set 
correctly")
Fixes: d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set 
correctly")
Signed-off-by: Kees Cook 
---
 include/net/scm.h |  1 +
 net/compat.c  | 55 +--
 net/core/scm.c| 18 
 3 files changed, 35 insertions(+), 39 deletions(-)

diff --git a/include/net/scm.h b/include/net/scm.h
index 1ce365f4c256..581a94d6c613 100644
--- a/include/net/scm.h
+++ b/include/net/scm.h
@@ -37,6 +37,7 @@ struct scm_cookie {
 #endif
 };
 
+int __scm_install_fd(struct file *file, int __user *ufd, unsigned int o_flags);
 void scm_detach_fds(struct msghdr *msg, struct scm_cookie *scm);
 void scm_detach_fds_compat(struct msghdr *msg, struct scm_cookie *scm);
 int __scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie 
*scm);
diff --git a/net/compat.c b/net/compat.c
index 5e3041a2c37d..27d477fdcaa0 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -281,39 +281,31 @@ int put_cmsg_compat(struct msghdr *kmsg, int level, int 
type, int len, void *dat
return 0;
 }
 
-void scm_detach_fds_compat(struct msghdr *kmsg, struct scm_cookie *scm)
+static int scm_max_fds_compat(struct msghdr *msg)
 {
-   struct compat_cmsghdr __user *cm = (struct compat_cmsghdr __user *) 
kmsg->msg_control;
-   int fdmax = (kmsg->msg_controllen - sizeof(struct compat_cmsghdr)) / 
sizeof(int);
-   int fdnum = scm->fp->count;
-   struct file **fp = scm->fp->fp;
-   int __user *cmfptr;
-   int err = 0, i;
+   if (msg->msg_controllen <= sizeof(struct compat_cmsghdr))
+   return 0;
+   return (msg->msg_controllen - sizeof(struct compat_cmsghdr)) / 
sizeof(int);
+}
 
-   if (fdnum < fdmax)
-   fdmax = fdnum;
+void scm_detach_fds_compat(struct msghdr *msg, struct scm_cookie *scm)
+{
+   struct compat_cmsghdr __user *cm =
+   (struct compat_cmsghdr __user *)msg->msg_control;
+   unsigned int o_flags = (msg->msg_flags & MSG_CMSG_CLOEXEC) ? O_CLOEXEC 
: 0;
+   int fdmax = min_t(int, scm_max_fds_compat(msg), scm->fp->count);
+   int __user *cmsg_data = CMSG_USER_DATA(cm);
+   int err = 0, i;
 
-   for (i = 0, cmfptr = (int __user *) CMSG_COMPAT_DATA(cm); i < fdmax; 
i++, cmfptr++) {
-   int new_fd;
-   err = security_file_receive(fp[i]);
+   for (i = 0; i < fdmax; i++) {
+   err = __scm_install_fd(scm->fp->fp[i], cmsg_data + i, o_flags);
if (err)
break;
-   err = get_unused_fd_flags(MSG_CMSG_CLOEXEC & kmsg->msg_flags
- ? O_CLOEXEC : 0);
-   if (err < 0)
-   break;
-   new_fd = err;
-   err = put_user(new_fd, cmfptr);
-   if (err) {
-   put_unused_fd(new_fd);
-   break;
-   }
-   /* Bump the usage count and install the file. */
-   fd_install(new_fd, get_file(fp[i]));
}
 
if (i > 0) {
int cmlen = CMSG_COMPAT_LEN(i * sizeof(int));
+
err = put_user(SOL_SOCKET, &cm->cmsg_level);
if (!err)
err = put_user(SCM_RIGHTS, &cm->cmsg_type);
@@ -321,16 +313,19 @@ void scm_detach_fds_compat(struct msghdr *kmsg, struct 
scm_cookie *scm)
err = put_user(cmlen, &cm->cmsg_len);
if (!err) {
cmlen = CMSG_COMPAT_SPACE(i * sizeof(int));
-   kmsg->msg_control += cmlen;
-   kmsg->msg_controllen -= cmlen;
+   if (msg->msg_controllen < cmlen)
+   cmlen = msg->msg_controllen;
+   msg->msg_control += cmlen;
+   msg->msg_controllen -= cmlen;
}
}
-   if (i < fdnum)
-   kmsg->msg_flags

[PATCH v6 3/7] fs: Add receive_fd() wrapper for __receive_fd()

2020-07-06 Thread Kees Cook

For both pidfd and seccomp, the __user pointer is not used. Update
__receive_fd() to make writing to ufd optional via a NULL check. However,
for the receive_fd_user() wrapper, ufd is NULL checked so an -EFAULT
can be returned to avoid changing the SCM_RIGHTS interface behavior. Add
new wrapper receive_fd() for pidfd and seccomp that does not use the ufd
argument. For the new helper, the allocated fd needs to be returned on
success. Update the existing callers to handle it.

Reviewed-by: Sargun Dhillon 
Signed-off-by: Kees Cook 
---
 fs/file.c| 23 +++
 include/linux/file.h |  7 +++
 net/compat.c |  2 +-
 net/core/scm.c   |  2 +-
 4 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index 3bd67233088b..0efdcf413210 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -942,12 +942,13 @@ int replace_fd(unsigned fd, struct file *file, unsigned 
flags)
  * @o_flags: the O_* flags to apply to the new fd entry
  *
  * Installs a received file into the file descriptor table, with appropriate
- * checks and count updates. Writes the fd number to userspace.
+ * checks and count updates. Optionally writes the fd number to userspace, if
+ * @ufd is non-NULL.
  *
  * This helper handles its own reference counting of the incoming
  * struct file.
  *
- * Returns -ve on error.
+ * Returns newly install fd or -ve on error.
  */
 int __receive_fd(struct file *file, int __user *ufd, unsigned int o_flags)
 {
@@ -963,20 +964,26 @@ int __receive_fd(struct file *file, int __user *ufd, 
unsigned int o_flags)
if (new_fd < 0)
return new_fd;
 
-   error = put_user(new_fd, ufd);
-   if (error) {
-   put_unused_fd(new_fd);
-   return error;
+   if (ufd) {
+   error = put_user(new_fd, ufd);
+   if (error) {
+   put_unused_fd(new_fd);
+   return error;
+   }
}
 
-   /* Bump the usage count and install the file. */
+   /*
+* Bump the usage count and install the file. The resulting value of
+* "error" is ignored here since we only need to take action when
+* the file is a socket and testing "sock" for NULL is sufficient.
+*/
sock = sock_from_file(file, &error);
if (sock) {
sock_update_netprioidx(&sock->sk->sk_cgrp_data);
sock_update_classid(&sock->sk->sk_cgrp_data);
}
fd_install(new_fd, get_file(file));
-   return 0;
+   return new_fd;
 }
 
 static int ksys_dup3(unsigned int oldfd, unsigned int newfd, int flags)
diff --git a/include/linux/file.h b/include/linux/file.h
index b14ff2ffd0bd..d9fee9f5c8da 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct file;
 
@@ -96,8 +97,14 @@ extern int __receive_fd(struct file *file, int __user *ufd,
 static inline int receive_fd_user(struct file *file, int __user *ufd,
  unsigned int o_flags)
 {
+   if (ufd == NULL)
+   return -EFAULT;
return __receive_fd(file, ufd, o_flags);
 }
+static inline int receive_fd(struct file *file, unsigned int o_flags)
+{
+   return __receive_fd(file, NULL, o_flags);
+}
 
 extern void flush_delayed_fput(void);
 extern void __fput_sync(struct file *);
diff --git a/net/compat.c b/net/compat.c
index e74cd3dae8b0..dc7ddbc2b15e 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -299,7 +299,7 @@ void scm_detach_fds_compat(struct msghdr *msg, struct 
scm_cookie *scm)
 
for (i = 0; i < fdmax; i++) {
err = receive_fd_user(scm->fp->fp[i], cmsg_data + i, o_flags);
-   if (err)
+   if (err < 0)
break;
}
 
diff --git a/net/core/scm.c b/net/core/scm.c
index 67c166a7820d..8156d4fb8a39 100644
--- a/net/core/scm.c
+++ b/net/core/scm.c
@@ -307,7 +307,7 @@ void scm_detach_fds(struct msghdr *msg, struct scm_cookie 
*scm)
 
for (i = 0; i < fdmax; i++) {
err = receive_fd_user(scm->fp->fp[i], cmsg_data + i, o_flags);
-   if (err)
+   if (err < 0)
break;
}
 
-- 
2.25.1

[PATCH v6 2/7] fs: Move __scm_install_fd() to __receive_fd()

2020-07-06 Thread Kees Cook

In preparation for users of the "install a received file" logic outside
of net/ (pidfd and seccomp), relocate and rename __scm_install_fd() from
net/core/scm.c to __receive_fd() in fs/file.c, and provide a wrapper
named receive_fd_user(), as future patches will change the interface
to __receive_fd().

Reviewed-by: Sargun Dhillon 
Signed-off-by: Kees Cook 
---
 fs/file.c| 48 
 include/linux/file.h |  8 
 include/linux/net.h  |  9 +
 include/net/scm.h|  1 -
 net/compat.c |  2 +-
 net/core/scm.c   | 32 +
 6 files changed, 67 insertions(+), 33 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index abb8b7081d7a..3bd67233088b 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -18,6 +19,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 unsigned int sysctl_nr_open __read_mostly = 1024*1024;
 unsigned int sysctl_nr_open_min = BITS_PER_LONG;
@@ -931,6 +934,51 @@ int replace_fd(unsigned fd, struct file *file, unsigned 
flags)
return err;
 }
 
+/**
+ * __receive_fd() - Install received file into file descriptor table
+ *
+ * @file: struct file that was received from another process
+ * @ufd: __user pointer to write new fd number to
+ * @o_flags: the O_* flags to apply to the new fd entry
+ *
+ * Installs a received file into the file descriptor table, with appropriate
+ * checks and count updates. Writes the fd number to userspace.
+ *
+ * This helper handles its own reference counting of the incoming
+ * struct file.
+ *
+ * Returns -ve on error.
+ */
+int __receive_fd(struct file *file, int __user *ufd, unsigned int o_flags)
+{
+   struct socket *sock;
+   int new_fd;
+   int error;
+
+   error = security_file_receive(file);
+   if (error)
+   return error;
+
+   new_fd = get_unused_fd_flags(o_flags);
+   if (new_fd < 0)
+   return new_fd;
+
+   error = put_user(new_fd, ufd);
+   if (error) {
+   put_unused_fd(new_fd);
+   return error;
+   }
+
+   /* Bump the usage count and install the file. */
+   sock = sock_from_file(file, &error);
+   if (sock) {
+   sock_update_netprioidx(&sock->sk->sk_cgrp_data);
+   sock_update_classid(&sock->sk->sk_cgrp_data);
+   }
+   fd_install(new_fd, get_file(file));
+   return 0;
+}
+
 static int ksys_dup3(unsigned int oldfd, unsigned int newfd, int flags)
 {
int err = -EBADF;
diff --git a/include/linux/file.h b/include/linux/file.h
index 122f80084a3e..b14ff2ffd0bd 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -91,6 +91,14 @@ extern void put_unused_fd(unsigned int fd);
 
 extern void fd_install(unsigned int fd, struct file *file);
 
+extern int __receive_fd(struct file *file, int __user *ufd,
+   unsigned int o_flags);
+static inline int receive_fd_user(struct file *file, int __user *ufd,
+ unsigned int o_flags)
+{
+   return __receive_fd(file, ufd, o_flags);
+}
+
 extern void flush_delayed_fput(void);
 extern void __fput_sync(struct file *);
 
diff --git a/include/linux/net.h b/include/linux/net.h
index 016a9c5faa34..0ca550d0f6a6 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -240,7 +240,16 @@ int sock_sendmsg(struct socket *sock, struct msghdr *msg);
 int sock_recvmsg(struct socket *sock, struct msghdr *msg, int flags);
 struct file *sock_alloc_file(struct socket *sock, int flags, const char 
*dname);
 struct socket *sockfd_lookup(int fd, int *err);
+
+#ifdef CONFIG_NET
 struct socket *sock_from_file(struct file *file, int *err);
+#else
+static inline struct socket *sock_from_file(struct file *file, int *err)
+{
+   return NULL;
+}
+#endif
+
 #define sockfd_put(sock) fput(sock->file)
 int net_ratelimit(void);
 
diff --git a/include/net/scm.h b/include/net/scm.h
index 581a94d6c613..1ce365f4c256 100644
--- a/include/net/scm.h
+++ b/include/net/scm.h
@@ -37,7 +37,6 @@ struct scm_cookie {
 #endif
 };
 
-int __scm_install_fd(struct file *file, int __user *ufd, unsigned int o_flags);
 void scm_detach_fds(struct msghdr *msg, struct scm_cookie *scm);
 void scm_detach_fds_compat(struct msghdr *msg, struct scm_cookie *scm);
 int __scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie 
*scm);
diff --git a/net/compat.c b/net/compat.c
index 27d477fdcaa0..e74cd3dae8b0 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -298,7 +298,7 @@ void scm_detach_fds_compat(struct msghdr *msg, struct 
scm_cookie *scm)
int err = 0, i;
 
for (i = 0; i < fdmax; i++) {
-   err = __scm_install_fd(scm->fp->fp[i], cmsg_data + i, o_flags);
+   err = receive_fd_user(scm->fp->fp[i], cmsg_data + i, o_flags);
if (err)

[PATCH v6 0/7] Add seccomp notifier ioctl that enables adding fds

2020-07-06 Thread Kees Cook

Hello!

v6:
- fix missing fput()
- API name change: s/fd_install_received/receive_fd/
v5: https://lore.kernel.org/lkml/20200617220327.3731559-1-keesc...@chromium.org/

This continues the thread-merge between [1] and [2]. tl;dr: add a way for
a seccomp user_notif process manager to inject files into the managed
process in order to handle emulation of various fd-returning syscalls
across security boundaries. Containers folks and Chrome are in need
of the feature, and investigating this solution uncovered (and fixed)
implementation issues with existing file sending routines.

I intend to carry this in the for-next/seccomp tree, unless someone
has objections. :) Please review and test!

-Kees

[1] https://lore.kernel.org/lkml/20200603011044.7972-1-sar...@sargun.me/
[2] https://lore.kernel.org/lkml/20200610045214.1175600-1-keesc...@chromium.org/


Kees Cook (5):
  net/scm: Regularize compat handling of scm_detach_fds()
  fs: Move __scm_install_fd() to __receive_fd()
  fs: Add receive_fd() wrapper for __receive_fd()
  pidfd: Replace open-coded partial receive_fd()
  fs: Expand __receive_fd() to accept existing fd

Sargun Dhillon (2):
  seccomp: Introduce addfd ioctl to seccomp user notifier
  selftests/seccomp: Test SECCOMP_IOCTL_NOTIF_ADDFD

 fs/file.c |  67 +
 include/linux/file.h  |  19 ++
 include/linux/net.h   |   9 +
 include/uapi/linux/seccomp.h  |  22 ++
 kernel/pid.c  |  13 +-
 kernel/seccomp.c  | 172 -
 net/compat.c  |  55 ++---
 net/core/scm.c|  50 +---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 229 ++
 9 files changed, 554 insertions(+), 82 deletions(-)

-- 
2.25.1

[PATCH v6 5/7] fs: Expand __receive_fd() to accept existing fd

2020-07-06 Thread Kees Cook

Expand __receive_fd() with support for replace_fd() for the coming seccomp
"addfd" ioctl(). Add new wrapper receive_fd_replace() for the new behavior
and update existing wrappers to retain old behavior.

Thanks to Colin Ian King  for pointing out an
uninitialized variable exposure in an earlier version of this patch.

Reviewed-by: Sargun Dhillon 
Signed-off-by: Kees Cook 
---
 fs/file.c| 24 ++--
 include/linux/file.h | 10 +++---
 2 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index 0efdcf413210..11313ff36802 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -937,6 +937,7 @@ int replace_fd(unsigned fd, struct file *file, unsigned 
flags)
 /**
  * __receive_fd() - Install received file into file descriptor table
  *
+ * @fd: fd to install into (if negative, a new fd will be allocated)
  * @file: struct file that was received from another process
  * @ufd: __user pointer to write new fd number to
  * @o_flags: the O_* flags to apply to the new fd entry
@@ -950,7 +951,7 @@ int replace_fd(unsigned fd, struct file *file, unsigned 
flags)
  *
  * Returns newly install fd or -ve on error.
  */
-int __receive_fd(struct file *file, int __user *ufd, unsigned int o_flags)
+int __receive_fd(int fd, struct file *file, int __user *ufd, unsigned int 
o_flags)
 {
struct socket *sock;
int new_fd;
@@ -960,18 +961,30 @@ int __receive_fd(struct file *file, int __user *ufd, 
unsigned int o_flags)
if (error)
return error;
 
-   new_fd = get_unused_fd_flags(o_flags);
-   if (new_fd < 0)
-   return new_fd;
+   if (fd < 0) {
+   new_fd = get_unused_fd_flags(o_flags);
+   if (new_fd < 0)
+   return new_fd;
+   } else
+   new_fd = fd;
 
if (ufd) {
error = put_user(new_fd, ufd);
if (error) {
-   put_unused_fd(new_fd);
+   if (fd < 0)
+   put_unused_fd(new_fd);
return error;
}
}
 
+   if (fd < 0)
+   fd_install(new_fd, get_file(file));
+   else {
+   error = replace_fd(new_fd, file, o_flags);
+   if (error)
+   return error;
+   }
+
/*
 * Bump the usage count and install the file. The resulting value of
 * "error" is ignored here since we only need to take action when
@@ -982,7 +995,6 @@ int __receive_fd(struct file *file, int __user *ufd, 
unsigned int o_flags)
sock_update_netprioidx(&sock->sk->sk_cgrp_data);
sock_update_classid(&sock->sk->sk_cgrp_data);
}
-   fd_install(new_fd, get_file(file));
return new_fd;
 }
 
diff --git a/include/linux/file.h b/include/linux/file.h
index d9fee9f5c8da..225982792fa2 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -92,18 +92,22 @@ extern void put_unused_fd(unsigned int fd);
 
 extern void fd_install(unsigned int fd, struct file *file);
 
-extern int __receive_fd(struct file *file, int __user *ufd,
+extern int __receive_fd(int fd, struct file *file, int __user *ufd,
unsigned int o_flags);
 static inline int receive_fd_user(struct file *file, int __user *ufd,
  unsigned int o_flags)
 {
if (ufd == NULL)
return -EFAULT;
-   return __receive_fd(file, ufd, o_flags);
+   return __receive_fd(-1, file, ufd, o_flags);
 }
 static inline int receive_fd(struct file *file, unsigned int o_flags)
 {
-   return __receive_fd(file, NULL, o_flags);
+   return __receive_fd(-1, file, NULL, o_flags);
+}
+static inline int receive_fd_replace(int fd, struct file *file, unsigned int 
o_flags)
+{
+   return __receive_fd(fd, file, NULL, o_flags);
 }
 
 extern void flush_delayed_fput(void);
-- 
2.25.1

[PATCH v6 4/7] pidfd: Replace open-coded partial receive_fd()

2020-07-06 Thread Kees Cook

The sock counting (sock_update_netprioidx() and sock_update_classid()) was
missing from pidfd's implementation of received fd installation. Replace
the open-coded version with a call to the new receive_fd()
helper.

Thanks to Vamshi K Sthambamkadi  for
catching a missed fput() in an earlier version of this patch.

Fixes: 8649c322f75c ("pid: Implement pidfd_getfd syscall")
Reviewed-by: Sargun Dhillon 
Signed-off-by: Kees Cook 
---
 kernel/pid.c | 13 ++---
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/kernel/pid.c b/kernel/pid.c
index f1496b757162..a31c102f4c87 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -635,17 +635,8 @@ static int pidfd_getfd(struct pid *pid, int fd)
if (IS_ERR(file))
return PTR_ERR(file);
 
-   ret = security_file_receive(file);
-   if (ret) {
-   fput(file);
-   return ret;
-   }
-
-   ret = get_unused_fd_flags(O_CLOEXEC);
-   if (ret < 0)
-   fput(file);
-   else
-   fd_install(ret, file);
+   ret = receive_fd(file, O_CLOEXEC);
+   fput(file);
 
return ret;
 }
-- 
2.25.1

[PATCH v6 6/7] seccomp: Introduce addfd ioctl to seccomp user notifier

2020-07-06 Thread Kees Cook

From: Sargun Dhillon 

This adds a seccomp notifier ioctl which allows for the listener to
"add" file descriptors to a process which originated a seccomp user
notification. This allows calls like mount, and mknod to be "implemented",
as the return value, and the arguments are data in memory. On the other
hand, calls like connect can be "implemented" using pidfd_getfd.

Unfortunately, there are calls which return file descriptors, like
open, which are vulnerable to ToCToU attacks, and require that the more
privileged supervisor can inspect the argument, and perform the syscall
on behalf of the process generating the notification. This allows the
file descriptor generated from that open call to be returned to the
calling process.

In addition, there is functionality to allow for replacement of specific
file descriptors, following dup2-like semantics.

The ioctl handling is based on the discussions[1] of how Extensible
Arguments should interact with ioctls. Instead of building size into
the addfd structure, make it a function of the ioctl command (which
is how sizes are normally passed to ioctls). To support forward and
backward compatibility, just mask out the direction and size, and match
everything. The size (and any future direction) checks are done along
with copy_struct_from_user() logic.

As a note, the seccomp_notif_addfd structure is laid out based on 8-byte
alignment without requiring packing as there have been packing issues
with uapi highlighted before[1][2]. Although we could overload the
newfd field and use -1 to indicate that it is not to be used, doing
so requires changing the size of the fd field, and introduces struct
packing complexity.

[1]: https://lore.kernel.org/lkml/87o8w9bcaf@mid.deneb.enyo.de/
[2]: 
https://lore.kernel.org/lkml/a328b91d-fd8f-4f27-b3c2-91a9c45f1...@rasmusvillemoes.dk/
[3]: 
https://lore.kernel.org/lkml/20200612104629.GA15814@ircssh-2.c.rugged-nimbus-611.internal

Suggested-by: Matt Denton 
Link: https://lore.kernel.org/r/20200603011044.7972-4-sar...@sargun.me
Signed-off-by: Sargun Dhillon 
Co-developed-by: Kees Cook 
Signed-off-by: Kees Cook 
---
 include/uapi/linux/seccomp.h |  22 +
 kernel/seccomp.c | 172 ++-
 2 files changed, 193 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 965290f7dcc2..6ba18b82a02e 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -113,6 +113,25 @@ struct seccomp_notif_resp {
__u32 flags;
 };
 
+/* valid flags for seccomp_notif_addfd */
+#define SECCOMP_ADDFD_FLAG_SETFD   (1UL << 0) /* Specify remote fd */
+
+/**
+ * struct seccomp_notif_addfd
+ * @id: The ID of the seccomp notification
+ * @flags: SECCOMP_ADDFD_FLAG_*
+ * @srcfd: The local fd number
+ * @newfd: Optional remote FD number if SETFD option is set, otherwise 0.
+ * @newfd_flags: The O_* flags the remote FD should have applied
+ */
+struct seccomp_notif_addfd {
+   __u64 id;
+   __u32 flags;
+   __u32 srcfd;
+   __u32 newfd;
+   __u32 newfd_flags;
+};
+
 #define SECCOMP_IOC_MAGIC  '!'
 #define SECCOMP_IO(nr) _IO(SECCOMP_IOC_MAGIC, nr)
 #define SECCOMP_IOR(nr, type)  _IOR(SECCOMP_IOC_MAGIC, nr, type)
@@ -124,5 +143,8 @@ struct seccomp_notif_resp {
 #define SECCOMP_IOCTL_NOTIF_SEND   SECCOMP_IOWR(1, \
struct seccomp_notif_resp)
 #define SECCOMP_IOCTL_NOTIF_ID_VALID   SECCOMP_IOW(2, __u64)
+/* On success, the return value is the remote process's added fd number */
+#define SECCOMP_IOCTL_NOTIF_ADDFD  SECCOMP_IOW(3, \
+   struct seccomp_notif_addfd)
 
 #endif /* _UAPI_LINUX_SECCOMP_H */
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 866a432cd746..ee314036e429 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -87,10 +87,42 @@ struct seccomp_knotif {
long val;
u32 flags;
 
-   /* Signals when this has entered SECCOMP_NOTIFY_REPLIED */
+   /*
+* Signals when this has changed states, such as the listener
+* dying, a new seccomp addfd message, or changing to REPLIED
+*/
struct completion ready;
 
struct list_head list;
+
+   /* outstanding addfd requests */
+   struct list_head addfd;
+};
+
+/**
+ * struct seccomp_kaddfd - container for seccomp_addfd ioctl messages
+ *
+ * @file: A reference to the file to install in the other task
+ * @fd: The fd number to install it at. If the fd number is -1, it means the
+ *  installing process should allocate the fd as normal.
+ * @flags: The flags for the new file descriptor. At the moment, only O_CLOEXEC
+ * is allowed.
+ * @ret: The return value of the installing process. It is set to the fd num
+ *   upon success (>= 0).
+ * @completion: Indicates that the installing process has completed

[PATCH v6 7/7] selftests/seccomp: Test SECCOMP_IOCTL_NOTIF_ADDFD

2020-07-06 Thread Kees Cook

From: Sargun Dhillon 

Test whether we can add file descriptors in response to notifications.
This injects the file descriptors via notifications, and then uses kcmp
to determine whether or not it has been successful.

It also includes some basic sanity checking for arguments.

Signed-off-by: Sargun Dhillon 
Link: https://lore.kernel.org/r/20200603011044.7972-5-sar...@sargun.me
Co-developed-by: Kees Cook 
Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 229 ++
 1 file changed, 229 insertions(+)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 2c77105e50e6..b854a6c5bf49 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -45,6 +45,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -168,7 +169,9 @@ struct seccomp_metadata {
 
 #ifndef SECCOMP_FILTER_FLAG_NEW_LISTENER
 #define SECCOMP_FILTER_FLAG_NEW_LISTENER   (1UL << 3)
+#endif
 
+#ifndef SECCOMP_RET_USER_NOTIF
 #define SECCOMP_RET_USER_NOTIF 0x7fc0U
 
 #define SECCOMP_IOC_MAGIC  '!'
@@ -204,6 +207,39 @@ struct seccomp_notif_sizes {
 };
 #endif
 
+#ifndef SECCOMP_IOCTL_NOTIF_ADDFD
+/* On success, the return value is the remote process's added fd number */
+#define SECCOMP_IOCTL_NOTIF_ADDFD  SECCOMP_IOW(3,  \
+   struct seccomp_notif_addfd)
+
+/* valid flags for seccomp_notif_addfd */
+#define SECCOMP_ADDFD_FLAG_SETFD   (1UL << 0) /* Specify remote fd */
+
+struct seccomp_notif_addfd {
+   __u64 id;
+   __u32 flags;
+   __u32 srcfd;
+   __u32 newfd;
+   __u32 newfd_flags;
+};
+#endif
+
+struct seccomp_notif_addfd_small {
+   __u64 id;
+   char weird[4];
+};
+#define SECCOMP_IOCTL_NOTIF_ADDFD_SMALL\
+   SECCOMP_IOW(3, struct seccomp_notif_addfd_small)
+
+struct seccomp_notif_addfd_big {
+   union {
+   struct seccomp_notif_addfd addfd;
+   char buf[sizeof(struct seccomp_notif_addfd) + 8];
+   };
+};
+#define SECCOMP_IOCTL_NOTIF_ADDFD_BIG  \
+   SECCOMP_IOWR(3, struct seccomp_notif_addfd_big)
+
 #ifndef PTRACE_EVENTMSG_SYSCALL_ENTRY
 #define PTRACE_EVENTMSG_SYSCALL_ENTRY  1
 #define PTRACE_EVENTMSG_SYSCALL_EXIT   2
@@ -3738,6 +3774,199 @@ TEST(user_notification_filter_empty_threaded)
EXPECT_GT((pollfd.revents & POLLHUP) ?: 0, 0);
 }
 
+TEST(user_notification_addfd)
+{
+   pid_t pid;
+   long ret;
+   int status, listener, memfd, fd;
+   struct seccomp_notif_addfd addfd = {};
+   struct seccomp_notif_addfd_small small = {};
+   struct seccomp_notif_addfd_big big = {};
+   struct seccomp_notif req = {};
+   struct seccomp_notif_resp resp = {};
+   /* 100 ms */
+   struct timespec delay = { .tv_nsec = 1 };
+
+   memfd = memfd_create("test", 0);
+   ASSERT_GE(memfd, 0);
+
+   ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
+   ASSERT_EQ(0, ret) {
+   TH_LOG("Kernel does not support PR_SET_NO_NEW_PRIVS!");
+   }
+
+   /* Check that the basic notification machinery works */
+   listener = user_notif_syscall(__NR_getppid,
+ SECCOMP_FILTER_FLAG_NEW_LISTENER);
+   ASSERT_GE(listener, 0);
+
+   pid = fork();
+   ASSERT_GE(pid, 0);
+
+   if (pid == 0) {
+   if (syscall(__NR_getppid) != USER_NOTIF_MAGIC)
+   exit(1);
+   exit(syscall(__NR_getppid) != USER_NOTIF_MAGIC);
+   }
+
+   ASSERT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_RECV, &req), 0);
+
+   addfd.srcfd = memfd;
+   addfd.newfd = 0;
+   addfd.id = req.id;
+   addfd.flags = 0x0;
+
+   /* Verify bad newfd_flags cannot be set */
+   addfd.newfd_flags = ~O_CLOEXEC;
+   EXPECT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_ADDFD, &addfd), -1);
+   EXPECT_EQ(errno, EINVAL);
+   addfd.newfd_flags = O_CLOEXEC;
+
+   /* Verify bad flags cannot be set */
+   addfd.flags = 0xff;
+   EXPECT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_ADDFD, &addfd), -1);
+   EXPECT_EQ(errno, EINVAL);
+   addfd.flags = 0;
+
+   /* Verify that remote_fd cannot be set without setting flags */
+   addfd.newfd = 1;
+   EXPECT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_ADDFD, &addfd), -1);
+   EXPECT_EQ(errno, EINVAL);
+   addfd.newfd = 0;
+
+   /* Verify small size cannot be set */
+   EXPECT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_ADDFD_SMALL, &small), -1);
+   EXPECT_EQ(errno, EINVAL);
+
+   /* Verify we can't send bits filled in unknown buffer area */
+   memset(&big, 0xAA, sizeof(big));
+   big.addfd = addfd;
+   EXPECT_EQ(ioctl(listener, SECCOMP_IOCTL_NOTIF_ADDFD_BIG, &big), -1);
+   EXPECT_EQ(errno, E2BIG);
+
+
+   /* Verify we can

Re: [PATCH v5 4/7] pidfd: Replace open-coded partial fd_install_received()

2020-07-06 Thread Kees Cook

On Mon, Jul 06, 2020 at 06:12:45PM +0200, Christian Brauner wrote:
> On Mon, Jul 06, 2020 at 08:34:06AM -0700, Kees Cook wrote:
> > Yup, this was a mistake in my refactoring of the pidfs changes.
> 
> I already did.

Er, what? (I had a typo in my quote: s/pidfs/pidfd/.) I was trying to
say that this was just a mistake in my refactoring of the pidfd usage of
the new helper.

> > I still don't agree: it radically complicates the SCM_RIGHTS and seccomp
> 
> I'm sorry, I don't buy it yet, though I might've missed something in the
> discussions: :)
> After applying the patches in your series this literally is just (which
> is hardly radical ;):

Agreed, "radical" was too strong.

> diff --git a/fs/file.c b/fs/file.c
> index 9568bcfd1f44..26930b2ea39d 100644
> --- a/fs/file.c
> +++ b/fs/file.c
> @@ -974,7 +974,7 @@ int __fd_install_received(int fd, struct file *file, int 
> __user *ufd,
> }
> 
> if (fd < 0)
> -   fd_install(new_fd, get_file(file));
> +   fd_install(new_fd, file);
> else {
> new_fd = fd;
> error = replace_fd(new_fd, file, o_flags);
> diff --git a/net/compat.c b/net/compat.c
> index 71494337cca7..605a5a67200c 100644
> --- a/net/compat.c
> +++ b/net/compat.c
> @@ -298,9 +298,11 @@ void scm_detach_fds_compat(struct msghdr *msg, struct 
> scm_cookie *scm)
> int err = 0, i;
> 
> for (i = 0; i < fdmax; i++) {
> -   err = fd_install_received_user(scm->fp->fp[i], cmsg_data + i, 
> o_flags);
> -   if (err < 0)
> +   err = fd_install_received_user(get_file(scm->fp->fp[i]), 
> cmsg_data + i, o_flags);
> +   if (err < 0) {
> +   fput(scm->fp->fp[i]);
> break;
> +   }
> }
> 
> if (i > 0) {
> diff --git a/net/core/scm.c b/net/core/scm.c
> index b9a0442ebd26..0d06446ae598 100644
> --- a/net/core/scm.c
> +++ b/net/core/scm.c
> @@ -306,9 +306,11 @@ void scm_detach_fds(struct msghdr *msg, struct 
> scm_cookie *scm)
> }
> 
> for (i = 0; i < fdmax; i++) {
> -   err = fd_install_received_user(scm->fp->fp[i], cmsg_data + i, 
> o_flags);
> -   if (err < 0)
> +   err = fd_install_received_user(get_file(scm->fp->fp[i]), 
> cmsg_data + i, o_flags);
> +   if (err < 0) {
> +   fput(scm->fp->fp[i]);
> break;
> +   }
> }
> 
> if (i > 0) {

But my point stands: I really dislike this; suddenly the caller needs to
manage this when it should be an entirely internal detail to the
function. It was only pidfd doing it wrong, and that was entirely my
fault in the conversion.

> The problem here is that the current patch invites bugs and has already
> produced one because fd_install() and fd_install_*() have the same
> naming scheme but different behavior when dealing with references.
> That's just not a good idea.

I will rename the helper and add explicit documentation, but I really
don't think callers should have to deal with managing the helper's split
ref lifetime.

-- 
Kees Cook

Re: [PATCH v5 4/7] pidfd: Replace open-coded partial fd_install_received()

2020-07-06 Thread Kees Cook

On Mon, Jul 06, 2020 at 03:07:13PM +0200, Christian Brauner wrote:
> On Wed, Jun 17, 2020 at 03:03:24PM -0700, Kees Cook wrote:
> > The sock counting (sock_update_netprioidx() and sock_update_classid()) was
> > missing from pidfd's implementation of received fd installation. Replace
> > the open-coded version with a call to the new fd_install_received()
> > helper.
> > 
> > Fixes: 8649c322f75c ("pid: Implement pidfd_getfd syscall")
> > Signed-off-by: Kees Cook 
> > ---
> >  kernel/pid.c | 11 +--
> >  1 file changed, 1 insertion(+), 10 deletions(-)
> > 
> > diff --git a/kernel/pid.c b/kernel/pid.c
> > index f1496b757162..24924ec5df0e 100644
> > --- a/kernel/pid.c
> > +++ b/kernel/pid.c
> > @@ -635,18 +635,9 @@ static int pidfd_getfd(struct pid *pid, int fd)
> > if (IS_ERR(file))
> > return PTR_ERR(file);
> >  
> > -   ret = security_file_receive(file);
> > -   if (ret) {
> > -   fput(file);
> > -   return ret;
> > -   }
> > -
> > -   ret = get_unused_fd_flags(O_CLOEXEC);
> > +   ret = fd_install_received(file, O_CLOEXEC);
> > if (ret < 0)
> > fput(file);
> > -   else
> > -   fd_install(ret, file);
> 
> So someone just sent a fix for pidfd_getfd() that was based on the
> changes done here.

Hi! Ah yes, that didn't get CCed to me. I'll go reply.

> I've been on vacation so didn't have a change to review this series and
> I see it's already in linux-next. This introduces a memory leak and
> actually proves a point I tried to stress when adding this helper:
> fd_install_received() in contrast to fd_install() does _not_ consume a
> reference because it takes one before it calls into fd_install(). That
> means, you need an unconditional fput() here both in the failure and
> error path.

Yup, this was a mistake in my refactoring of the pidfs changes.

> I strongly suggest though that we simply align the behavior between
> fd_install() and fd_install_received() and have the latter simply
> consume a reference when it succeeds! Imho, this bug proves that I was
> right to insist on this before. ;)

I still don't agree: it radically complicates the SCM_RIGHTS and seccomp
cases. The primary difference is that fd_install() cannot fail, and it
was optimized for this situation. The other file-related helpers that
can fail do not consume the reference, so this is in keeping with those
as well.

-- 
Kees Cook

Re: use-after-free in Bluetooth: 6lowpan (was Re: KASAN: use-after-free Write in refcount_warn_saturate)

2020-07-05 Thread Kees Cook

On Thu, Feb 27, 2020 at 11:50:11PM -0800, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:f8788d86 Linux 5.6-rc3
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13005fd9e0
> kernel config:  https://syzkaller.appspot.com/x/.config?x=9833e26bab355358
> dashboard link: https://syzkaller.appspot.com/bug?extid=7dd7f2f77a7a01d1dc14
> compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
> userspace arch: i386
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=17e3ebede0
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16a9f8f9e0
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+7dd7f2f77a7a01d1d...@syzkaller.appspotmail.com
> 
> ==
> BUG: KASAN: use-after-free in atomic_set 
> include/asm-generic/atomic-instrumented.h:44 [inline]
> BUG: KASAN: use-after-free in refcount_set include/linux/refcount.h:123 
> [inline]
> BUG: KASAN: use-after-free in refcount_warn_saturate+0x1f/0x1f0 
> lib/refcount.c:15
> Write of size 4 at addr 888090eb4018 by task kworker/1:24/2888
> 
> CPU: 1 PID: 2888 Comm: kworker/1:24 Not tainted 5.6.0-rc3-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Workqueue: events do_enable_set
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x197/0x210 lib/dump_stack.c:118
>  print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
>  __kasan_report.cold+0x1b/0x32 mm/kasan/report.c:506
>  kasan_report+0x12/0x20 mm/kasan/common.c:641
>  check_memory_region_inline mm/kasan/generic.c:185 [inline]
>  check_memory_region+0x134/0x1a0 mm/kasan/generic.c:192
>  __kasan_check_write+0x14/0x20 mm/kasan/common.c:101
>  atomic_set include/asm-generic/atomic-instrumented.h:44 [inline]
>  refcount_set include/linux/refcount.h:123 [inline]
>  refcount_warn_saturate+0x1f/0x1f0 lib/refcount.c:15
>  refcount_sub_and_test include/linux/refcount.h:261 [inline]
>  refcount_dec_and_test include/linux/refcount.h:281 [inline]
>  kref_put include/linux/kref.h:64 [inline]
>  l2cap_chan_put+0x1d9/0x240 net/bluetooth/l2cap_core.c:498
>  do_enable_set+0x54b/0x960 net/bluetooth/6lowpan.c:1075
>  process_one_work+0xa05/0x17a0 kernel/workqueue.c:2264
>  worker_thread+0x98/0xe40 kernel/workqueue.c:2410
>  kthread+0x361/0x430 kernel/kthread.c:255
>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> 
> Allocated by task 2888:
>  save_stack+0x23/0x90 mm/kasan/common.c:72
>  set_track mm/kasan/common.c:80 [inline]
>  __kasan_kmalloc mm/kasan/common.c:515 [inline]
>  __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:488
>  kasan_kmalloc+0x9/0x10 mm/kasan/common.c:529
>  kmem_cache_alloc_trace+0x158/0x790 mm/slab.c:3551
>  kmalloc include/linux/slab.h:555 [inline]
>  kzalloc include/linux/slab.h:669 [inline]
>  l2cap_chan_create+0x45/0x3a0 net/bluetooth/l2cap_core.c:446
>  chan_create+0x10/0xe0 net/bluetooth/6lowpan.c:640
>  bt_6lowpan_listen net/bluetooth/6lowpan.c:959 [inline]
>  do_enable_set+0x583/0x960 net/bluetooth/6lowpan.c:1078
>  process_one_work+0xa05/0x17a0 kernel/workqueue.c:2264
>  worker_thread+0x98/0xe40 kernel/workqueue.c:2410
>  kthread+0x361/0x430 kernel/kthread.c:255
>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> 
> Freed by task 2975:
>  save_stack+0x23/0x90 mm/kasan/common.c:72
>  set_track mm/kasan/common.c:80 [inline]
>  kasan_set_free_info mm/kasan/common.c:337 [inline]
>  __kasan_slab_free+0x102/0x150 mm/kasan/common.c:476
>  kasan_slab_free+0xe/0x10 mm/kasan/common.c:485
>  __cache_free mm/slab.c:3426 [inline]
>  kfree+0x10a/0x2c0 mm/slab.c:3757
>  l2cap_chan_destroy net/bluetooth/l2cap_core.c:484 [inline]
>  kref_put include/linux/kref.h:65 [inline]
>  l2cap_chan_put+0x1b7/0x240 net/bluetooth/l2cap_core.c:498
>  do_enable_set+0x54b/0x960 net/bluetooth/6lowpan.c:1075
>  process_one_work+0xa05/0x17a0 kernel/workqueue.c:2264
>  worker_thread+0x98/0xe40 kernel/workqueue.c:2410
>  kthread+0x361/0x430 kernel/kthread.c:255
>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

The free and use-after-free are both via the same do_enable_set() path
which implies there are racing writes to the debugfs write handler. It
seems locking is missing for both listen_chan and enable_6lowpan. The
latter seems misused in is_bt_6lowpan(), which should likely just be
checking for chan->ops == &bt_6lowpan_chan_ops, I think?

I have no way to actually test changes to this code...

-- 
Kees Cook

Re: [PATCH 4/5] kprobes: Do not expose probe addresses to non-CAP_SYSLOG

2020-07-05 Thread Kees Cook

On Sun, Jul 05, 2020 at 01:10:54PM -0700, Linus Torvalds wrote:
> On Fri, Jul 3, 2020 at 8:50 AM Kees Cook  wrote:
> >
> > With 67 kthreads on a booted system, this patch does not immediately
> > blow up...
> 
> Did you try making read/write inc/dec that thing too? Or does that
> just blow up with tons of warnings?

I hadn't gotten to that yet. I need to catch up on other stuff first,
and I thought I'd give people time to scream if they tested this
themselves. :)

-- 
Kees Cook

Re: [PATCH 4/5] kprobes: Do not expose probe addresses to non-CAP_SYSLOG

2020-07-03 Thread Kees Cook

On Thu, Jul 02, 2020 at 06:00:17PM -0700, Linus Torvalds wrote:
> If somebody is interested in looking into things like that, it might
> be a good idea to have kernel threads with that counter incremented by
> default.

With 67 kthreads on a booted system, this patch does not immediately
blow up... And it likely needs some beautification. (Note that
current_cred_*() calls current_cred() under the hood, so AFAICT, only
current_cred() needs coverage.)

diff --git a/include/linux/cred.h b/include/linux/cred.h
index 18639c069263..a624847cb0ce 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -295,7 +295,10 @@ static inline void put_cred(const struct cred *_cred)
  * since nobody else can modify it.
  */
 #define current_cred() \
-   rcu_dereference_protected(current->cred, 1)
+({ \
+   WARN_ON_ONCE(current->warn_on_current_cred);\
+   rcu_dereference_protected(current->cred, 1);\
+})
 
 /**
  * current_real_cred - Access the current task's objective credentials
diff --git a/include/linux/sched.h b/include/linux/sched.h
index b62e6aaf28f0..21ab1b81aa40 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -652,6 +652,7 @@ struct task_struct {
/* Per task flags (PF_*), defined further below: */
unsigned intflags;
unsigned intptrace;
+   unsigned intwarn_on_current_cred;
 
 #ifdef CONFIG_SMP
struct llist_node   wake_entry;
diff --git a/kernel/fork.c b/kernel/fork.c
index 142b23645d82..2e181b9bfd3f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2527,8 +2527,12 @@ pid_t kernel_thread(int (*fn)(void *), void *arg, 
unsigned long flags)
.stack  = (unsigned long)fn,
.stack_size = (unsigned long)arg,
};
+   pid_t pid;
 
-   return _do_fork(&args);
+   pid = _do_fork(&args);
+   if (pid == 0)
+   current->warn_on_current_cred = 1;
+   return pid;
 }
 
 #ifdef __ARCH_WANT_SYS_FORK


-- 
Kees Cook

Re: [PATCH 2/5] module: Refactor section attr into bin attribute

2020-07-03 Thread Kees Cook

On Fri, Jul 03, 2020 at 08:02:07AM +0200, Greg Kroah-Hartman wrote:
> On Thu, Jul 02, 2020 at 04:26:35PM -0700, Kees Cook wrote:
> > +   sattr->battr.size = 3 /* "0x", "\n" */ + (BITS_PER_LONG / 4);
> 
> They get a correct "size" value now, nice!

Yeah, though I do have some concerns that switching to a bin attribute
changes the userspace behavior a bit. With seq_file-based "show", we
get a 4096 size, and seeking isn't possible (lseek to non-0 location
will fail). With the raw "read", we get the right size, but lseek()
is allowed (but I've got the "read" handler refuse reads starting from
non-zero). When I reviewed[1] potential readers (elftutils, systemtap,
kmod), they all seem to do normal things (fopen/fscanf/fclose), so I'm
hoping this won't be a problem in practice.

> Reviewed-by: Greg Kroah-Hartman 

Thanks!

[1] https://codesearch.debian.net/search?q=%2Fsys%2Fmodule.*sections&literal=0

-- 
Kees Cook

Re: [PATCH 4/5] kprobes: Do not expose probe addresses to non-CAP_SYSLOG

2020-07-03 Thread Kees Cook

On Thu, Jul 02, 2020 at 06:00:17PM -0700, Linus Torvalds wrote:
> On Thu, Jul 2, 2020 at 4:26 PM Kees Cook  wrote:
> >
> > The kprobe show() functions were using "current"'s creds instead
> > of the file opener's creds for kallsyms visibility. Fix to use
> > seq_file->file->f_cred.
> 
> Side note: I have a distinct - but despite that possibly quite
> incorrect - memory that I've discussed with somebody several years ago
> about making "current_cred()" simply warn in any IO context.
> 
> IOW, we could have read and write just increment/decrement a
> per-thread counter, and have current_cred() do a WARN_ON_ONCE() if
> it's called with that counter incremented.

That does sound familiar. I can't find a thread on it, but my search
abilities are poor. :) So an increment/decrement in all the IO-related
syscalls, or were you thinking of some other location?

> The issue of ioctl's is a bit less obvious - there are reasons to
> argue those should also use open-time credentials, but on the other
> hand I think it's reasonable to pass a file descriptor to a suid app
> in order for that app to do things that the normal user cannot.
>
> But read/write are dangerous because of the "it's so easy to fool suid
> apps to read/write stdin/stdout".
>
> So pread/pwrite/ioctl/splice etc are things that suid applications
> very much do on purpose to affect a file descriptor. But plain
> read/write are things that might be accidental and used by attack
> vectors.

So probably just start with read/write and tighten it over time, if we
find other clear places, leaving ioctl/pread/pwrite/splice alone.

> If somebody is interested in looking into things like that, it might
> be a good idea to have kernel threads with that counter incremented by
> 
> Just throwing this idea out in case somebody wants to try it. It's not
> just "current_cred", of course. It's all the current_cred_xxx() users
> too. But it may be that there are a ton of false positives because
> maybe some code on purpose ends up doing things like just *comparing*
> current_cred with file->f_cred and then that would warn too.

Yeah ... and I think the kthread test should answer that question.

-- 
Kees Cook

[PATCH 3/5] module: Do not expose section addresses to non-CAP_SYSLOG

2020-07-02 Thread Kees Cook

The printing of section addresses in /sys/module/*/sections/* was not
using the correct credentials to evaluate visibility.

Before:

 # cat /sys/module/*/sections/.*text
 0xc0458000
 ...
 # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text"
 0xc0458000
 ...

After:

 # cat /sys/module/*/sections/*.text
 0xc0458000
 ...
 # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text"
 0x
 ...

Additionally replaces the existing (safe) /proc/modules check with
file->f_cred for consistency.

Cc: sta...@vger.kernel.org
Reported-by: Dominik Czarnota 
Fixes: be71eda5383f ("module: Fix display of wrong module .text address")
Signed-off-by: Kees Cook 
---
 kernel/module.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 9e2954519259..e6c7571092cb 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -1530,8 +1530,8 @@ static ssize_t module_sect_read(struct file *file, struct 
kobject *kobj,
if (pos != 0)
return -EINVAL;
 
-   return sprintf(buf, "0x%px\n", kptr_restrict < 2 ?
-  (void *)sattr->address : NULL);
+   return sprintf(buf, "0x%px\n",
+  kallsyms_show_value(file->f_cred) ? (void 
*)sattr->address : NULL);
 }
 
 static void free_sect_attrs(struct module_sect_attrs *sect_attrs)
@@ -4380,7 +4380,7 @@ static int modules_open(struct inode *inode, struct file 
*file)
 
if (!err) {
struct seq_file *m = file->private_data;
-   m->private = kallsyms_show_value(current_cred()) ? NULL : (void 
*)8ul;
+   m->private = kallsyms_show_value(file->f_cred) ? NULL : (void 
*)8ul;
}
 
return err;
-- 
2.25.1

[PATCH 0/5] Refactor kallsyms_show_value() users for correct cred

2020-07-02 Thread Kees Cook

Hi,

I'm not sure who should carry this tree (me? Greg? akpm? Linus?), but
it fixes a kernel address exposure bug reported by Dominik Czarnota,
where /sys/modules/*/sections/* contents were visible to uid-0 without
CAP_SYSLOG (e.g. in containers):

This is correct, with CAP_SYSLOG:
 # cat /sys/module/*/sections/.*text
 0xc0458000
 ...

This is broken:
 # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text"
 0xc0458000
 ...

Fixing this required refactoring of several internals, and in the process
uncovered other users of kallsyms_show_value() that were doing checks
during "read" context instead of "open" context. This fixes all of these
cases by plumbing the file->f_cred through to their ultimate checks via
kallsyms_show_value()'s new cred argument.

Testing, reviews, and acks appreciated. :)

Thanks!

-Kees


Kees Cook (5):
  kallsyms: Refactor kallsyms_show_value() to take cred
  module: Refactor section attr into bin attribute
  module: Do not expose section addresses to non-CAP_SYSLOG
  kprobes: Do not expose probe addresses to non-CAP_SYSLOG
  bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok()

 include/linux/filter.h |  4 +--
 include/linux/kallsyms.h   |  5 ++--
 kernel/bpf/syscall.c   | 37 +++
 kernel/kallsyms.c  | 17 -
 kernel/kprobes.c   |  4 +--
 kernel/module.c| 51 --
 net/core/sysctl_net_core.c |  2 +-
 7 files changed, 67 insertions(+), 53 deletions(-)

-- 
2.25.1

[PATCH 5/5] bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok()

2020-07-02 Thread Kees Cook

When evaluating access control over kallsyms visibility, credentials at
open() time need to be used, not the "current" creds (though in BPF's
case, this has likely always been the same). Plumb access to associated
file->f_cred down through bpf_dump_raw_ok() and its callers now that
kallsysm_show_value() has been refactored to take struct cred.

Cc: sta...@vger.kernel.org
Fixes: 7105e828c087 ("bpf: allow for correlation of maps and helpers in dump")
Signed-off-by: Kees Cook 
---
 include/linux/filter.h |  4 ++--
 kernel/bpf/syscall.c   | 37 +
 net/core/sysctl_net_core.c |  2 +-
 3 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 55104f6c78e8..0b0144752d78 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -884,12 +884,12 @@ void bpf_jit_compile(struct bpf_prog *prog);
 bool bpf_jit_needs_zext(void);
 bool bpf_helper_changes_pkt_data(void *func);
 
-static inline bool bpf_dump_raw_ok(void)
+static inline bool bpf_dump_raw_ok(const struct cred *cred)
 {
/* Reconstruction of call-sites is dependent on kallsyms,
 * thus make dump the same restriction.
 */
-   return kallsyms_show_value(current_cred());
+   return kallsyms_show_value(cred);
 }
 
 struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 8da159936bab..859053ddf05b 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3139,7 +3139,8 @@ static const struct bpf_map *bpf_map_from_imm(const 
struct bpf_prog *prog,
return NULL;
 }
 
-static struct bpf_insn *bpf_insn_prepare_dump(const struct bpf_prog *prog)
+static struct bpf_insn *bpf_insn_prepare_dump(const struct bpf_prog *prog,
+ const struct cred *f_cred)
 {
const struct bpf_map *map;
struct bpf_insn *insns;
@@ -3165,7 +3166,7 @@ static struct bpf_insn *bpf_insn_prepare_dump(const 
struct bpf_prog *prog)
code == (BPF_JMP | BPF_CALL_ARGS)) {
if (code == (BPF_JMP | BPF_CALL_ARGS))
insns[i].code = BPF_JMP | BPF_CALL;
-   if (!bpf_dump_raw_ok())
+   if (!bpf_dump_raw_ok(f_cred))
insns[i].imm = 0;
continue;
}
@@ -3221,7 +3222,8 @@ static int set_info_rec_size(struct bpf_prog_info *info)
return 0;
 }
 
-static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
+static int bpf_prog_get_info_by_fd(struct file *file,
+  struct bpf_prog *prog,
   const union bpf_attr *attr,
   union bpf_attr __user *uattr)
 {
@@ -3290,11 +3292,11 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog 
*prog,
struct bpf_insn *insns_sanitized;
bool fault;
 
-   if (prog->blinded && !bpf_dump_raw_ok()) {
+   if (prog->blinded && !bpf_dump_raw_ok(file->f_cred)) {
info.xlated_prog_insns = 0;
goto done;
}
-   insns_sanitized = bpf_insn_prepare_dump(prog);
+   insns_sanitized = bpf_insn_prepare_dump(prog, file->f_cred);
if (!insns_sanitized)
return -ENOMEM;
uinsns = u64_to_user_ptr(info.xlated_prog_insns);
@@ -3328,7 +3330,7 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
}
 
if (info.jited_prog_len && ulen) {
-   if (bpf_dump_raw_ok()) {
+   if (bpf_dump_raw_ok(file->f_cred)) {
uinsns = u64_to_user_ptr(info.jited_prog_insns);
ulen = min_t(u32, info.jited_prog_len, ulen);
 
@@ -3363,7 +3365,7 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
ulen = info.nr_jited_ksyms;
info.nr_jited_ksyms = prog->aux->func_cnt ? : 1;
if (ulen) {
-   if (bpf_dump_raw_ok()) {
+   if (bpf_dump_raw_ok(file->f_cred)) {
unsigned long ksym_addr;
u64 __user *user_ksyms;
u32 i;
@@ -3394,7 +3396,7 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
ulen = info.nr_jited_func_lens;
info.nr_jited_func_lens = prog->aux->func_cnt ? : 1;
if (ulen) {
-   if (bpf_dump_raw_ok()) {
+   if (bpf_dump_raw_ok(file->f_cred)) {
u32 __user *user_lens;
u32 func_len, i;
 
@@ -3451,7 +3453,7 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
else
info.nr_jited_line_info = 0;
if (info.nr_jited_line_info && ulen

[PATCH 2/5] module: Refactor section attr into bin attribute

2020-07-02 Thread Kees Cook

In order to gain access to the open file's f_cred for kallsym visibility
permission checks, refactor the module section attributes to use the
bin_attribute instead of attribute interface. Additionally removes the
redundant "name" struct member.

Cc: sta...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 kernel/module.c | 45 -
 1 file changed, 24 insertions(+), 21 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index a5022ae84e50..9e2954519259 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -1510,8 +1510,7 @@ static inline bool sect_empty(const Elf_Shdr *sect)
 }
 
 struct module_sect_attr {
-   struct module_attribute mattr;
-   char *name;
+   struct bin_attribute battr;
unsigned long address;
 };
 
@@ -1521,11 +1520,16 @@ struct module_sect_attrs {
struct module_sect_attr attrs[];
 };
 
-static ssize_t module_sect_show(struct module_attribute *mattr,
-   struct module_kobject *mk, char *buf)
+static ssize_t module_sect_read(struct file *file, struct kobject *kobj,
+   struct bin_attribute *battr,
+   char *buf, loff_t pos, size_t count)
 {
struct module_sect_attr *sattr =
-   container_of(mattr, struct module_sect_attr, mattr);
+   container_of(battr, struct module_sect_attr, battr);
+
+   if (pos != 0)
+   return -EINVAL;
+
return sprintf(buf, "0x%px\n", kptr_restrict < 2 ?
   (void *)sattr->address : NULL);
 }
@@ -1535,7 +1539,7 @@ static void free_sect_attrs(struct module_sect_attrs 
*sect_attrs)
unsigned int section;
 
for (section = 0; section < sect_attrs->nsections; section++)
-   kfree(sect_attrs->attrs[section].name);
+   kfree(sect_attrs->attrs[section].battr.attr.name);
kfree(sect_attrs);
 }
 
@@ -1544,42 +1548,41 @@ static void add_sect_attrs(struct module *mod, const 
struct load_info *info)
unsigned int nloaded = 0, i, size[2];
struct module_sect_attrs *sect_attrs;
struct module_sect_attr *sattr;
-   struct attribute **gattr;
+   struct bin_attribute **gattr;
 
/* Count loaded sections and allocate structures */
for (i = 0; i < info->hdr->e_shnum; i++)
if (!sect_empty(&info->sechdrs[i]))
nloaded++;
size[0] = ALIGN(struct_size(sect_attrs, attrs, nloaded),
-   sizeof(sect_attrs->grp.attrs[0]));
-   size[1] = (nloaded + 1) * sizeof(sect_attrs->grp.attrs[0]);
+   sizeof(sect_attrs->grp.bin_attrs[0]));
+   size[1] = (nloaded + 1) * sizeof(sect_attrs->grp.bin_attrs[0]);
sect_attrs = kzalloc(size[0] + size[1], GFP_KERNEL);
if (sect_attrs == NULL)
return;
 
/* Setup section attributes. */
sect_attrs->grp.name = "sections";
-   sect_attrs->grp.attrs = (void *)sect_attrs + size[0];
+   sect_attrs->grp.bin_attrs = (void *)sect_attrs + size[0];
 
sect_attrs->nsections = 0;
sattr = §_attrs->attrs[0];
-   gattr = §_attrs->grp.attrs[0];
+   gattr = §_attrs->grp.bin_attrs[0];
for (i = 0; i < info->hdr->e_shnum; i++) {
Elf_Shdr *sec = &info->sechdrs[i];
if (sect_empty(sec))
continue;
+   sysfs_bin_attr_init(&sattr->battr);
sattr->address = sec->sh_addr;
-   sattr->name = kstrdup(info->secstrings + sec->sh_name,
-   GFP_KERNEL);
-   if (sattr->name == NULL)
+   sattr->battr.attr.name =
+   kstrdup(info->secstrings + sec->sh_name, GFP_KERNEL);
+   if (sattr->battr.attr.name == NULL)
goto out;
sect_attrs->nsections++;
-   sysfs_attr_init(&sattr->mattr.attr);
-   sattr->mattr.show = module_sect_show;
-   sattr->mattr.store = NULL;
-   sattr->mattr.attr.name = sattr->name;
-   sattr->mattr.attr.mode = S_IRUSR;
-   *(gattr++) = &(sattr++)->mattr.attr;
+   sattr->battr.read = module_sect_read;
+   sattr->battr.size = 3 /* "0x", "\n" */ + (BITS_PER_LONG / 4);
+   sattr->battr.attr.mode = 0400;
+   *(gattr++) = &(sattr++)->battr;
}
*gattr = NULL;
 
@@ -1669,7 +1672,7 @@ static void add_notes_attrs(struct module *mod, const 
struct load_info *info)
continue;
if (info->sechdrs[i].sh_type == SHT_NOTE) {
sysfs_bin_attr_init(natt

[PATCH 1/5] kallsyms: Refactor kallsyms_show_value() to take cred

2020-07-02 Thread Kees Cook

In order to perform future tests against the cred saved during open(),
switch kallsyms_show_value() to operate on a cred, and have all current
callers pass current_cred(). This makes it very obvious where callers
are checking the wrong credential in their "read" contexts. These will
be fixed in the coming patches.

Additionally switch return value to bool, since it is always used as a
direct permission check, not a 0-on-success, negative-on-error style
function return.

Cc: sta...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 include/linux/filter.h   |  2 +-
 include/linux/kallsyms.h |  5 +++--
 kernel/kallsyms.c| 17 +++--
 kernel/kprobes.c |  4 ++--
 kernel/module.c  |  2 +-
 5 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 259377723603..55104f6c78e8 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -889,7 +889,7 @@ static inline bool bpf_dump_raw_ok(void)
/* Reconstruction of call-sites is dependent on kallsyms,
 * thus make dump the same restriction.
 */
-   return kallsyms_show_value() == 1;
+   return kallsyms_show_value(current_cred());
 }
 
 struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
index 98338dc6b5d2..481273f0c72d 100644
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -18,6 +18,7 @@
 #define KSYM_SYMBOL_LEN (sizeof("%s+%#lx/%#lx [%s]") + (KSYM_NAME_LEN - 1) + \
 2*(BITS_PER_LONG*3/10) + (MODULE_NAME_LEN - 1) + 1)
 
+struct cred;
 struct module;
 
 static inline int is_kernel_inittext(unsigned long addr)
@@ -98,7 +99,7 @@ int lookup_symbol_name(unsigned long addr, char *symname);
 int lookup_symbol_attrs(unsigned long addr, unsigned long *size, unsigned long 
*offset, char *modname, char *name);
 
 /* How and when do we show kallsyms values? */
-extern int kallsyms_show_value(void);
+extern bool kallsyms_show_value(const struct cred *cred);
 
 #else /* !CONFIG_KALLSYMS */
 
@@ -158,7 +159,7 @@ static inline int lookup_symbol_attrs(unsigned long addr, 
unsigned long *size, u
return -ERANGE;
 }
 
-static inline int kallsyms_show_value(void)
+static inline bool kallsyms_show_value(const struct cred *cred)
 {
return false;
 }
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 16c8c605f4b0..bb14e64f62a4 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -644,19 +644,20 @@ static inline int kallsyms_for_perf(void)
  * Otherwise, require CAP_SYSLOG (assuming kptr_restrict isn't set to
  * block even that).
  */
-int kallsyms_show_value(void)
+bool kallsyms_show_value(const struct cred *cred)
 {
switch (kptr_restrict) {
case 0:
if (kallsyms_for_perf())
-   return 1;
+   return true;
/* fallthrough */
case 1:
-   if (has_capability_noaudit(current, CAP_SYSLOG))
-   return 1;
+   if (security_capable(cred, &init_user_ns, CAP_SYSLOG,
+CAP_OPT_NOAUDIT) == 0)
+   return true;
/* fallthrough */
default:
-   return 0;
+   return false;
}
 }
 
@@ -673,7 +674,11 @@ static int kallsyms_open(struct inode *inode, struct file 
*file)
return -ENOMEM;
reset_iter(iter, 0);
 
-   iter->show_value = kallsyms_show_value();
+   /*
+* Instead of checking this on every s_show() call, cache
+* the result here at open time.
+*/
+   iter->show_value = kallsyms_show_value(file->f_cred);
return 0;
 }
 
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 4a904cc56d68..d4de217e4a91 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2448,7 +2448,7 @@ static void report_probe(struct seq_file *pi, struct 
kprobe *p,
else
kprobe_type = "k";
 
-   if (!kallsyms_show_value())
+   if (!kallsyms_show_value(current_cred()))
addr = NULL;
 
if (sym)
@@ -2540,7 +2540,7 @@ static int kprobe_blacklist_seq_show(struct seq_file *m, 
void *v)
 * If /proc/kallsyms is not showing kernel address, we won't
 * show them here either.
 */
-   if (!kallsyms_show_value())
+   if (!kallsyms_show_value(current_cred()))
seq_printf(m, "0x%px-0x%px\t%ps\n", NULL, NULL,
   (void *)ent->start_addr);
else
diff --git a/kernel/module.c b/kernel/module.c
index e8a198588f26..a5022ae84e50 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -4377,7 +4377,7 @@ static int modules_open(struct inode *inode, struct file 
*file)
 
if (!err) {
struct seq_file *m = file->private_data;
-   m->private

[PATCH 4/5] kprobes: Do not expose probe addresses to non-CAP_SYSLOG

2020-07-02 Thread Kees Cook

The kprobe show() functions were using "current"'s creds instead
of the file opener's creds for kallsyms visibility. Fix to use
seq_file->file->f_cred.

Cc: sta...@vger.kernel.org
Fixes: 81365a947de4 ("kprobes: Show address of kprobes if kallsyms does")
Fixes: ffb9bd68ebdb ("kprobes: Show blacklist addresses as same as kallsyms 
does")
Signed-off-by: Kees Cook 
---
 kernel/kprobes.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index d4de217e4a91..2e97febeef77 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2448,7 +2448,7 @@ static void report_probe(struct seq_file *pi, struct 
kprobe *p,
else
kprobe_type = "k";
 
-   if (!kallsyms_show_value(current_cred()))
+   if (!kallsyms_show_value(pi->file->f_cred))
addr = NULL;
 
if (sym)
@@ -2540,7 +2540,7 @@ static int kprobe_blacklist_seq_show(struct seq_file *m, 
void *v)
 * If /proc/kallsyms is not showing kernel address, we won't
 * show them here either.
 */
-   if (!kallsyms_show_value(current_cred()))
+   if (!kallsyms_show_value(m->file->f_cred))
seq_printf(m, "0x%px-0x%px\t%ps\n", NULL, NULL,
   (void *)ent->start_addr);
else
-- 
2.25.1

Re: [PATCH v2 08/16] spi: davinci: Remove uninitialized_var() usage

2020-07-02 Thread Kees Cook

On Thu, Jul 02, 2020 at 04:23:35PM +0100, Mark Brown wrote:
> On Thu, Jul 02, 2020 at 08:21:40AM -0700, Kees Cook wrote:
> > On Wed, Jul 01, 2020 at 09:39:20PM +0100, Mark Brown wrote:
> 
> > > Please copy maintainers on patches :(
> 
> > Hi! Sorry about that; the CC list was giant, so I had opted for using
> > subsystem mailing lists where possible.
> 
> If you're going to err in a direction there I'd err in the direction of
> CCing the people not the list - I only saw this since I was looking for
> something else, I don't normally see stuff in the mailing list folder.

Yeah, I've gotten conflicting feedback on treewide changes:
- please CC me on only the one patch, I don't want to see everything else
- please CC me on the whole series, I want the full context for the change

I opted toward "CC me on this series", but then I get stuck when the CC
is giant. I think I may switch back to individual CCs for specific
patches, and point people to lore if they want greater context. (lore
didn't exist before...)

Thanks for the poke to make me reconsider this workflow. :)

-- 
Kees Cook

Re: [PATCH v2 08/16] spi: davinci: Remove uninitialized_var() usage

2020-07-02 Thread Kees Cook

On Wed, Jul 01, 2020 at 09:39:20PM +0100, Mark Brown wrote:
> On Fri, Jun 19, 2020 at 08:29:59PM -0700, Kees Cook wrote:
> > Using uninitialized_var() is dangerous as it papers over real bugs[1]
> > (or can in the future), and suppresses unrelated compiler warnings (e.g.
> > "unused variable"). If the compiler thinks it is uninitialized, either
> > simply initialize the variable or make compiler changes. As a precursor
> > to removing[2] this[3] macro[4], just remove this variable since it was
> > actually unused:
> 
> Please copy maintainers on patches :(

Hi! Sorry about that; the CC list was giant, so I had opted for using
subsystem mailing lists where possible.

> Acked-by: Mark Brown 

Thanks!

-- 
Kees Cook

Re: [PATCH v2 04/16] b43: Remove uninitialized_var() usage

2020-06-22 Thread Kees Cook

On Mon, Jun 22, 2020 at 10:04:18AM -0700, Nick Desaulniers wrote:
> On Fri, Jun 19, 2020 at 8:30 PM Kees Cook  wrote:
> >
> > Using uninitialized_var() is dangerous as it papers over real bugs[1]
> > (or can in the future), and suppresses unrelated compiler warnings (e.g.
> > "unused variable"). If the compiler thinks it is uninitialized, either
> > simply initialize the variable or make compiler changes. As a precursor
> > to removing[2] this[3] macro[4], just initialize this variable to NULL.
> > No later NULL deref is possible due to the early returns outside of the
> > (phy->rev >= 7 && phy->rev < 19) case, which explicitly tests for NULL.
> >
> > [1] https://lore.kernel.org/lkml/20200603174714.192027-1-gli...@google.com/
> > [2] 
> > https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1tgqcr5vqkczwj0qxk6cernou6eedsuda...@mail.gmail.com/
> > [3] 
> > https://lore.kernel.org/lkml/ca+55afwgbgqhbp1fkxvrkepzyr5j8n1vkt1vzdz9knmpuxh...@mail.gmail.com/
> > [4] 
> > https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yvju65tplgn_ybynv0ve...@mail.gmail.com/
> >
> > Fixes: 58619b14d106 ("b43: move under broadcom vendor directory")
> > Signed-off-by: Kees Cook 
> 
> I see three total uses of uninitialized_var() in this file, do we want
> to eliminate all of them?

This is the only one that needed an explicit initialization -- all the
others are handled in the treewide patch. I *could* split it out here,
but I found it easier to keep the "no op" changes together in the
treewide patch.

-Kees

> 
> > ---
> >  drivers/net/wireless/broadcom/b43/phy_n.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/wireless/broadcom/b43/phy_n.c 
> > b/drivers/net/wireless/broadcom/b43/phy_n.c
> > index c33b4235839d..46db91846007 100644
> > --- a/drivers/net/wireless/broadcom/b43/phy_n.c
> > +++ b/drivers/net/wireless/broadcom/b43/phy_n.c
> > @@ -4222,7 +4222,7 @@ static void b43_nphy_tx_gain_table_upload(struct 
> > b43_wldev *dev)
> > u32 rfpwr_offset;
> > u8 pga_gain, pad_gain;
> >     int i;
> > -   const s16 *uninitialized_var(rf_pwr_offset_table);
> > +   const s16 *rf_pwr_offset_table = NULL;
> >
> > table = b43_nphy_get_tx_gain_table(dev);
> > if (!table)
> > --
> 
> -- 
> Thanks,
> ~Nick Desaulniers

-- 
Kees Cook

Re: [PATCH v2 00/16] Remove uninitialized_var() macro

2020-06-20 Thread Kees Cook

On Sat, Jun 20, 2020 at 09:03:34AM +0200, Sedat Dilek wrote:
> On Sat, Jun 20, 2020 at 5:30 AM Kees Cook  wrote:
> >
> > v2:
> > - more special-cased fixes
> > - add reviews
> > v1: 
> > https://lore.kernel.org/lkml/20200603233203.1695403-1-keesc...@chromium.org
> >
> > Using uninitialized_var() is dangerous as it papers over real bugs[1]
> > (or can in the future), and suppresses unrelated compiler warnings
> > (e.g. "unused variable"). If the compiler thinks it is uninitialized,
> > either simply initialize the variable or make compiler changes.
> >
> > As recommended[2] by[3] Linus[4], remove the macro.
> >
> > Most of the 300 uses don't cause any warnings on gcc 9.3.0, so they're in
> > a single treewide commit in this series. A few others needed to actually
> > get cleaned up, and I broke those out into individual patches.
> >
> > The tree is:
> > https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=kspp/uninit/macro
> >
> > -Kees
> >
> 
> Hi Kees,
> 
> thanks for doing a v2 of your patchset.
> 
> As I saw Jason Yan providing some "uninitialized_var() macro" patches
> to the MLs I pointen him to your tree "v1".

Thanks!

> BTW, I have tested your "v1" against Linux v5.7 (see [1]) - just
> yesterday with Linux v5.7.5-rc1.
> 
> Is it possible to have a v2 of this patchset on top od Linux v5.7 - if
> you do not mind.

Since it's only going to be for post-v5.8, I'm fine skipping the v5.7
testing. Mainly I'm looking at v5.8 and linux-next.

Thanks for looking at it!

-- 
Kees Cook

[PATCH v2 04/16] b43: Remove uninitialized_var() usage

2020-06-19 Thread Kees Cook

Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings (e.g.
"unused variable"). If the compiler thinks it is uninitialized, either
simply initialize the variable or make compiler changes. As a precursor
to removing[2] this[3] macro[4], just initialize this variable to NULL.
No later NULL deref is possible due to the early returns outside of the
(phy->rev >= 7 && phy->rev < 19) case, which explicitly tests for NULL.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-gli...@google.com/
[2] 
https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1tgqcr5vqkczwj0qxk6cernou6eedsuda...@mail.gmail.com/
[3] 
https://lore.kernel.org/lkml/ca+55afwgbgqhbp1fkxvrkepzyr5j8n1vkt1vzdz9knmpuxh...@mail.gmail.com/
[4] 
https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yvju65tplgn_ybynv0ve...@mail.gmail.com/

Fixes: 58619b14d106 ("b43: move under broadcom vendor directory")
Signed-off-by: Kees Cook 
---
 drivers/net/wireless/broadcom/b43/phy_n.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/broadcom/b43/phy_n.c 
b/drivers/net/wireless/broadcom/b43/phy_n.c
index c33b4235839d..46db91846007 100644
--- a/drivers/net/wireless/broadcom/b43/phy_n.c
+++ b/drivers/net/wireless/broadcom/b43/phy_n.c
@@ -4222,7 +4222,7 @@ static void b43_nphy_tx_gain_table_upload(struct 
b43_wldev *dev)
u32 rfpwr_offset;
u8 pga_gain, pad_gain;
int i;
-   const s16 *uninitialized_var(rf_pwr_offset_table);
+   const s16 *rf_pwr_offset_table = NULL;
 
table = b43_nphy_get_tx_gain_table(dev);
if (!table)
-- 
2.25.1

[PATCH v2 06/16] ide: Remove uninitialized_var() usage

2020-06-19 Thread Kees Cook

Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings (e.g.
"unused variable"). If the compiler thinks it is uninitialized, either
simply initialize the variable or make compiler changes. As a precursor
to removing[2] this[3] macro[4], just remove this variable since it was
actually unused:

drivers/ide/ide-taskfile.c:232:34: warning: unused variable 'flags' 
[-Wunused-variable]
unsigned long uninitialized_var(flags);
^

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-gli...@google.com/
[2] 
https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1tgqcr5vqkczwj0qxk6cernou6eedsuda...@mail.gmail.com/
[3] 
https://lore.kernel.org/lkml/ca+55afwgbgqhbp1fkxvrkepzyr5j8n1vkt1vzdz9knmpuxh...@mail.gmail.com/
[4] 
https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yvju65tplgn_ybynv0ve...@mail.gmail.com/

Fixes: ce1e518190ea ("ide: don't disable interrupts during kmap_atomic()")
Reviewed-by: Nick Desaulniers 
Signed-off-by: Kees Cook 
---
 drivers/ide/ide-taskfile.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/ide/ide-taskfile.c b/drivers/ide/ide-taskfile.c
index aab6a10435b6..a26f85ab58a9 100644
--- a/drivers/ide/ide-taskfile.c
+++ b/drivers/ide/ide-taskfile.c
@@ -229,7 +229,6 @@ void ide_pio_bytes(ide_drive_t *drive, struct ide_cmd *cmd,
ide_hwif_t *hwif = drive->hwif;
struct scatterlist *sg = hwif->sg_table;
struct scatterlist *cursg = cmd->cursg;
-   unsigned long uninitialized_var(flags);
struct page *page;
unsigned int offset;
u8 *buf;
-- 
2.25.1

[PATCH v2 03/16] drbd: Remove uninitialized_var() usage

2020-06-19 Thread Kees Cook

Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings (e.g.
"unused variable"). If the compiler thinks it is uninitialized, either
simply initialize the variable or make compiler changes. As a precursor
to removing[2] this[3] macro[4], just initialize this variable to NULL.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-gli...@google.com/
[2] 
https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1tgqcr5vqkczwj0qxk6cernou6eedsuda...@mail.gmail.com/
[3] 
https://lore.kernel.org/lkml/ca+55afwgbgqhbp1fkxvrkepzyr5j8n1vkt1vzdz9knmpuxh...@mail.gmail.com/
[4] 
https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yvju65tplgn_ybynv0ve...@mail.gmail.com/

Fixes: a29728463b25 ("drbd: Backport the "events2" command")
Reviewed-by: Nick Desaulniers 
Signed-off-by: Kees Cook 
---
 drivers/block/drbd/drbd_state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c
index eeaa3b49b264..0067d328f0b5 100644
--- a/drivers/block/drbd/drbd_state.c
+++ b/drivers/block/drbd/drbd_state.c
@@ -1604,7 +1604,7 @@ static void broadcast_state_change(struct 
drbd_state_change *state_change)
unsigned int n_device, n_connection, n_peer_device, n_peer_devices;
void (*last_func)(struct sk_buff *, unsigned int, void *,
  enum drbd_notification_type) = NULL;
-   void *uninitialized_var(last_arg);
+   void *last_arg = NULL;
 
 #define HAS_CHANGED(state) ((state)[OLD] != (state)[NEW])
 #define FINAL_STATE_CHANGE(type) \
-- 
2.25.1

[PATCH v2 05/16] rtlwifi: rtl8192cu: Remove uninitialized_var() usage

2020-06-19 Thread Kees Cook

Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings (e.g.
"unused variable"). If the compiler thinks it is uninitialized, either
simply initialize the variable or make compiler changes. As a precursor
to removing[2] this[3] macro[4], just initialize this variable to NULL,
and avoid sending garbage by returning.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-gli...@google.com/
[2] 
https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1tgqcr5vqkczwj0qxk6cernou6eedsuda...@mail.gmail.com/
[3] 
https://lore.kernel.org/lkml/ca+55afwgbgqhbp1fkxvrkepzyr5j8n1vkt1vzdz9knmpuxh...@mail.gmail.com/
[4] 
https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yvju65tplgn_ybynv0ve...@mail.gmail.com/

Fixes: dc0313f46664 ("rtlwifi: rtl8192cu: Add routine hw")
Reviewed-by: Nick Desaulniers 
Acked-by: Kalle Valo 
Signed-off-by: Kees Cook 
---
 drivers/net/wireless/realtek/rtlwifi/rtl8192cu/hw.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/hw.c 
b/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/hw.c
index f070f25bb735..5b071b70bc08 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/hw.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/hw.c
@@ -592,7 +592,7 @@ static void _rtl92cu_init_chipn_one_out_ep_priority(struct 
ieee80211_hw *hw,
bool wmm_enable,
u8 queue_sel)
 {
-   u16 uninitialized_var(value);
+   u16 value;
 
switch (queue_sel) {
case TX_SELE_HQ:
@@ -606,7 +606,7 @@ static void _rtl92cu_init_chipn_one_out_ep_priority(struct 
ieee80211_hw *hw,
break;
default:
WARN_ON(1); /* Shall not reach here! */
-   break;
+   return;
}
_rtl92c_init_chipn_reg_priority(hw, value, value, value, value,
value, value);
-- 
2.25.1

[PATCH v2 02/16] x86/mm/numa: Remove uninitialized_var() usage

2020-06-19 Thread Kees Cook

Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings (e.g.
"unused variable"). If the compiler thinks it is uninitialized, either
simply initialize the variable or make compiler changes. As a precursor
to removing[2] this[3] macro[4], refactor code to avoid its need.

The original reason for its use here was to work around the #ifdef
being the only place the variable was used. This is better expressed
using IS_ENABLED() and a new code block where the variable can be used
unconditionally.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-gli...@google.com/
[2] 
https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1tgqcr5vqkczwj0qxk6cernou6eedsuda...@mail.gmail.com/
[3] 
https://lore.kernel.org/lkml/ca+55afwgbgqhbp1fkxvrkepzyr5j8n1vkt1vzdz9knmpuxh...@mail.gmail.com/
[4] 
https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yvju65tplgn_ybynv0ve...@mail.gmail.com/

Fixes: 1e01979c8f50 ("x86, numa: Implement pfn -> nid mapping granularity 
check")
Signed-off-by: Kees Cook 
---
 arch/x86/mm/numa.c| 18 +-
 include/linux/page-flags-layout.h |  4 +++-
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 8ee952038c80..b05f45e5e8e2 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -543,7 +543,6 @@ static void __init numa_clear_kernel_node_hotplug(void)
 
 static int __init numa_register_memblks(struct numa_meminfo *mi)
 {
-   unsigned long uninitialized_var(pfn_align);
int i, nid;
 
/* Account for nodes with cpus and no memory */
@@ -571,15 +570,16 @@ static int __init numa_register_memblks(struct 
numa_meminfo *mi)
 * If sections array is gonna be used for pfn -> nid mapping, check
 * whether its granularity is fine enough.
 */
-#ifdef NODE_NOT_IN_PAGE_FLAGS
-   pfn_align = node_map_pfn_alignment();
-   if (pfn_align && pfn_align < PAGES_PER_SECTION) {
-   printk(KERN_WARNING "Node alignment %LuMB < min %LuMB, 
rejecting NUMA config\n",
-  PFN_PHYS(pfn_align) >> 20,
-  PFN_PHYS(PAGES_PER_SECTION) >> 20);
-   return -EINVAL;
+   if (IS_ENABLED(NODE_NOT_IN_PAGE_FLAGS)) {
+   unsigned long pfn_align = node_map_pfn_alignment();
+
+   if (pfn_align && pfn_align < PAGES_PER_SECTION) {
+   pr_warn("Node alignment %LuMB < min %LuMB, rejecting 
NUMA config\n",
+   PFN_PHYS(pfn_align) >> 20,
+   PFN_PHYS(PAGES_PER_SECTION) >> 20);
+   return -EINVAL;
+   }
}
-#endif
if (!numa_meminfo_cover_memory(mi))
return -EINVAL;
 
diff --git a/include/linux/page-flags-layout.h 
b/include/linux/page-flags-layout.h
index 71283739ffd2..e200eef6a7fd 100644
--- a/include/linux/page-flags-layout.h
+++ b/include/linux/page-flags-layout.h
@@ -98,9 +98,11 @@
 /*
  * We are going to use the flags for the page to node mapping if its in
  * there.  This includes the case where there is no node, so it is implicit.
+ * Note that this #define MUST have a value so that it can be tested with
+ * the IS_ENABLED() macro.
  */
 #if !(NODES_WIDTH > 0 || NODES_SHIFT == 0)
-#define NODE_NOT_IN_PAGE_FLAGS
+#define NODE_NOT_IN_PAGE_FLAGS 1
 #endif
 
 #if defined(CONFIG_NUMA_BALANCING) && LAST_CPUPID_WIDTH == 0
-- 
2.25.1

[PATCH v2 10/16] KVM: PPC: Book3S PR: Remove uninitialized_var() usage

2020-06-19 Thread Kees Cook

Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings (e.g.
"unused variable"). If the compiler thinks it is uninitialized, either
simply initialize the variable or make compiler changes. As a precursor
to removing[2] this[3] macro[4], just remove this variable since it was
actually unused:

arch/powerpc/kvm/book3s_pr.c:1832:16: warning: unused variable 'vrsave' 
[-Wunused-variable]
unsigned long vrsave;
  ^

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-gli...@google.com/
[2] 
https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1tgqcr5vqkczwj0qxk6cernou6eedsuda...@mail.gmail.com/
[3] 
https://lore.kernel.org/lkml/ca+55afwgbgqhbp1fkxvrkepzyr5j8n1vkt1vzdz9knmpuxh...@mail.gmail.com/
[4] 
https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yvju65tplgn_ybynv0ve...@mail.gmail.com/

Suggested-by: Nathan Chancellor 
Fixes: f05ed4d56e9c ("KVM: PPC: Split out code from book3s.c into book3s_pr.c")
Signed-off-by: Kees Cook 
---
 arch/powerpc/kvm/book3s_pr.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index ef54f917bdaf..ed12dfbf9bb5 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -1828,9 +1828,6 @@ static int kvmppc_vcpu_run_pr(struct kvm_vcpu *vcpu)
 {
struct kvm_run *run = vcpu->run;
int ret;
-#ifdef CONFIG_ALTIVEC
-   unsigned long uninitialized_var(vrsave);
-#endif
 
/* Check if we can run the vcpu at all */
if (!vcpu->arch.sane) {
-- 
2.25.1

[PATCH v2 01/16] docs: deprecated.rst: Add uninitialized_var()

2020-06-19 Thread Kees Cook

Nothing should be using this macro, and the entire idea of tricking the
compiler into silencing such warnings is a mistake.

Signed-off-by: Kees Cook 
---
 Documentation/process/deprecated.rst | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/Documentation/process/deprecated.rst 
b/Documentation/process/deprecated.rst
index 652e2aa02a66..943a926ecbbb 100644
--- a/Documentation/process/deprecated.rst
+++ b/Documentation/process/deprecated.rst
@@ -51,6 +51,24 @@ to make sure their systems do not continue running in the 
face of
 "unreachable" conditions. (For example, see commits like `this one
 <https://git.kernel.org/linus/d4689846881d160a4d12a514e991a740bcb5d65a>`_.)
 
+uninitialized_var()
+---
+For any compiler warnings about uninitialized variables, just add
+an initializer. Using the uninitialized_var() macro (or similar
+warning-silencing tricks) is dangerous as it papers over `real bugs
+<https://lore.kernel.org/lkml/20200603174714.192027-1-gli...@google.com/>`_
+(or can in the future), and suppresses unrelated compiler warnings
+(e.g. "unused variable"). If the compiler thinks it is uninitialized,
+either simply initialize the variable or make compiler changes. Keep in
+mind that in most cases, if an initialization is obviously redundant,
+the compiler's dead-store elimination pass will make sure there are no
+needless variable writes.
+
+As Linus has said, this macro
+`must 
<https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1tgqcr5vqkczwj0qxk6cernou6eedsuda...@mail.gmail.com/>`_
+`be 
<https://lore.kernel.org/lkml/ca+55afwgbgqhbp1fkxvrkepzyr5j8n1vkt1vzdz9knmpuxh...@mail.gmail.com/>`_
+`removed 
<https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yvju65tplgn_ybynv0ve...@mail.gmail.com/>`_.
+
 open-coded arithmetic in allocator arguments
 
 Dynamic size calculations (especially multiplication) should not be
-- 
2.25.1

[PATCH v2 12/16] f2fs: Eliminate usage of uninitialized_var() macro

2020-06-19 Thread Kees Cook

From: Jason Yan 

This is an effort to eliminate the uninitialized_var() macro[1].

The use of this macro is the wrong solution because it forces off ANY
analysis by the compiler for a given variable. It even masks "unused
variable" warnings.

Quoted from Linus[2]:

"It's a horrible thing to use, in that it adds extra cruft to the
source code, and then shuts up a compiler warning (even the _reliable_
warnings from gcc)."

Fix it by remove this variable since it is not needed at all.

[1] https://github.com/KSPP/linux/issues/81
[2] 
https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yvju65tplgn_ybynv0ve...@mail.gmail.com/

Suggested-by: Chao Yu 
Signed-off-by: Jason Yan 
Reviewed-by: Chao Yu 
Link: https://lore.kernel.org/r/20200615085132.166470-1-yanai...@huawei.com
Signed-off-by: Kees Cook 
---
 fs/f2fs/data.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 326c63879ddc..3753ba06531b 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2856,7 +2856,6 @@ static int f2fs_write_cache_pages(struct address_space 
*mapping,
};
 #endif
int nr_pages;
-   pgoff_t uninitialized_var(writeback_index);
pgoff_t index;
pgoff_t end;/* Inclusive */
pgoff_t done_index;
@@ -2875,8 +2874,7 @@ static int f2fs_write_cache_pages(struct address_space 
*mapping,
clear_inode_flag(mapping->host, FI_HOT_DATA);
 
if (wbc->range_cyclic) {
-   writeback_index = mapping->writeback_index; /* prev offset */
-   index = writeback_index;
+   index = mapping->writeback_index; /* prev offset */
end = -1;
} else {
index = wbc->range_start >> PAGE_SHIFT;
-- 
2.25.1

[PATCH v2 07/16] clk: st: Remove uninitialized_var() usage

2020-06-19 Thread Kees Cook

Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings (e.g.
"unused variable"). If the compiler thinks it is uninitialized, either
simply initialize the variable or make compiler changes. As a precursor
to removing[2] this[3] macro[4], just remove this variable since it was
actually unused:

drivers/clk/st/clkgen-fsyn.c: In function ‘quadfs_set_rate’:
drivers/clk/st/clkgen-fsyn.c:793:6: warning: unused variable ‘i’ 
[-Wunused-variable]
  793 |  int i;
  |  ^

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-gli...@google.com/
[2] 
https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1tgqcr5vqkczwj0qxk6cernou6eedsuda...@mail.gmail.com/
[3] 
https://lore.kernel.org/lkml/ca+55afwgbgqhbp1fkxvrkepzyr5j8n1vkt1vzdz9knmpuxh...@mail.gmail.com/
[4] 
https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yvju65tplgn_ybynv0ve...@mail.gmail.com/

Fixes: 5f7aa9071e93 ("clk: st: Support for QUADFS inside ClockGenB/C/D/E/F")
Signed-off-by: Kees Cook 
---
 drivers/clk/st/clkgen-fsyn.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/clk/st/clkgen-fsyn.c b/drivers/clk/st/clkgen-fsyn.c
index a156bd0c6af7..f1adc858b590 100644
--- a/drivers/clk/st/clkgen-fsyn.c
+++ b/drivers/clk/st/clkgen-fsyn.c
@@ -790,7 +790,6 @@ static int quadfs_set_rate(struct clk_hw *hw, unsigned long 
rate,
struct st_clk_quadfs_fsynth *fs = to_quadfs_fsynth(hw);
struct stm_fs params;
long hwrate;
-   int uninitialized_var(i);
 
if (!rate || !parent_rate)
return -EINVAL;
-- 
2.25.1

[PATCH v2 11/16] media: sur40: Remove uninitialized_var() usage

2020-06-19 Thread Kees Cook

Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings (e.g.
"unused variable"). If the compiler thinks it is uninitialized, either
simply initialize the variable or make compiler changes. As a precursor
to removing[2] this[3] macro[4], just remove this variable since it was
actually unused:

drivers/input/touchscreen/sur40.c:459:6: warning: variable 'packet_id' set but 
not used [-Wunused-but-set-variable]
  459 |  u32 packet_id;
  |  ^

However, in keeping with the documentation desires outlined in commit
335abaea7a27 ("Input: sur40 - silence unnecessary noisy debug output"),
comment out the assignment instead of removing it.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-gli...@google.com/
[2] 
https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1tgqcr5vqkczwj0qxk6cernou6eedsuda...@mail.gmail.com/
[3] 
https://lore.kernel.org/lkml/ca+55afwgbgqhbp1fkxvrkepzyr5j8n1vkt1vzdz9knmpuxh...@mail.gmail.com/
[4] 
https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yvju65tplgn_ybynv0ve...@mail.gmail.com/

Fixes: 335abaea7a27 ("Input: sur40 - silence unnecessary noisy debug output")
Signed-off-by: Kees Cook 
---
 drivers/input/touchscreen/sur40.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/input/touchscreen/sur40.c 
b/drivers/input/touchscreen/sur40.c
index 34d31c7ec8ba..620cdd7d214a 100644
--- a/drivers/input/touchscreen/sur40.c
+++ b/drivers/input/touchscreen/sur40.c
@@ -456,8 +456,6 @@ static void sur40_poll(struct input_dev *input)
 {
struct sur40_state *sur40 = input_get_drvdata(input);
int result, bulk_read, need_blobs, packet_blobs, i;
-   u32 uninitialized_var(packet_id);
-
struct sur40_header *header = &sur40->bulk_in_buffer->header;
struct sur40_blob *inblob = &sur40->bulk_in_buffer->blobs[0];
 
@@ -491,7 +489,7 @@ static void sur40_poll(struct input_dev *input)
if (need_blobs == -1) {
need_blobs = le16_to_cpu(header->count);
dev_dbg(sur40->dev, "need %d blobs\n", need_blobs);
-   packet_id = le32_to_cpu(header->packet_id);
+   /* packet_id = le32_to_cpu(header->packet_id); */
}
 
/*
-- 
2.25.1

[PATCH v2 14/16] checkpatch: Remove awareness of uninitialized_var() macro

2020-06-19 Thread Kees Cook

Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.

In preparation for removing[2] the[3] macro[4], partially revert
commit 16b7f3c89907 ("checkpatch: avoid warning about uninitialized_var()")
and remove all remaining mentions of uninitialized_var().

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-gli...@google.com/
[2] 
https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1tgqcr5vqkczwj0qxk6cernou6eedsuda...@mail.gmail.com/
[3] 
https://lore.kernel.org/lkml/ca+55afwgbgqhbp1fkxvrkepzyr5j8n1vkt1vzdz9knmpuxh...@mail.gmail.com/
[4] 
https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yvju65tplgn_ybynv0ve...@mail.gmail.com/

Signed-off-by: Kees Cook 
---
 scripts/checkpatch.pl | 16 +---
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 4c820607540b..60b737e222fe 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -840,7 +840,6 @@ our $FuncArg = 
qr{$Typecast{0,1}($LvalOrFunc|$Constant|$String)};
 our $declaration_macros = qr{(?x:

(?:$Storage\s+)?(?:[A-Z_][A-Z0-9]*_){0,2}(?:DEFINE|DECLARE)(?:_[A-Z0-9]+){1,6}\s*\(|
(?:$Storage\s+)?[HLP]?LIST_HEAD\s*\(|
-   (?:$Storage\s+)?${Type}\s+uninitialized_var\s*\(|
(?:SKCIPHER_REQUEST|SHASH_DESC|AHASH_REQUEST)_ON_STACK\s*\(
 )};
 
@@ -6330,8 +6329,7 @@ sub process {
if (defined $cond) {
substr($s, 0, length($cond), '');
}
-   if ($s =~ /^\s*;/ &&
-   $function_name ne 'uninitialized_var')
+   if ($s =~ /^\s*;/)
{
WARN("AVOID_EXTERNS",
 "externs should be avoided in .c files\n" 
.  $herecurr);
@@ -6350,17 +6348,13 @@ sub process {
}
 
 # check for function declarations that have arguments without identifier names
-# while avoiding uninitialized_var(x)
if (defined $stat &&
-   $stat =~ 
/^.\s*(?:extern\s+)?$Type\s*(?:($Ident)|\(\s*\*\s*$Ident\s*\))\s*\(\s*([^{]+)\s*\)\s*;/s
 &&
-   (!defined($1) ||
-(defined($1) && $1 ne "uninitialized_var")) &&
-$2 ne "void") {
-   my $args = trim($2);
+   $stat =~ 
/^.\s*(?:extern\s+)?$Type\s*(?:$Ident|\(\s*\*\s*$Ident\s*\))\s*\(\s*([^{]+)\s*\)\s*;/s
 &&
+   $1 ne "void") {
+   my $args = trim($1);
while ($args =~ 
m/\s*($Type\s*(?:$Ident|\(\s*\*\s*$Ident?\s*\)\s*$balanced_parens)?)/g) {
my $arg = trim($1);
-   if ($arg =~ /^$Type$/ &&
-   $arg !~ /enum\s+$Ident$/) {
+   if ($arg =~ /^$Type$/ && $arg !~ 
/enum\s+$Ident$/) {
WARN("FUNCTION_ARGUMENTS",
 "function definition argument 
'$arg' should also have an identifier name\n" . $herecurr);
}
-- 
2.25.1

[PATCH v2 16/16] compiler: Remove uninitialized_var() macro

2020-06-19 Thread Kees Cook

Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.

As recommended[2] by[3] Linus[4], remove the macro. With the recent
change to disable -Wmaybe-uninitialized in v5.7 in commit 78a5255ffb6a
("Stop the ad-hoc games with -Wno-maybe-initialized"), this is likely
the best time to make this treewide change.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-gli...@google.com/
[2] 
https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1tgqcr5vqkczwj0qxk6cernou6eedsuda...@mail.gmail.com/
[3] 
https://lore.kernel.org/lkml/ca+55afwgbgqhbp1fkxvrkepzyr5j8n1vkt1vzdz9knmpuxh...@mail.gmail.com/
[4] 
https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yvju65tplgn_ybynv0ve...@mail.gmail.com/

Suggested-by: Linus Torvalds 
Reviewed-by: Bart van Assche 
Reviewed-by: Miguel Ojeda 
Tested-by: Nathan Chancellor 
Tested-by: Sedat Dilek 
Signed-off-by: Kees Cook 
---
 include/linux/compiler-clang.h | 2 --
 include/linux/compiler-gcc.h   | 6 --
 tools/include/linux/compiler.h | 2 --
 tools/virtio/linux/kernel.h| 2 --
 4 files changed, 12 deletions(-)

diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index ee37256ec8bd..f2371b83aaf2 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -5,8 +5,6 @@
 
 /* Compiler specific definitions for Clang compiler */
 
-#define uninitialized_var(x) x = *(&(x))
-
 /* same as gcc, this was present in clang-2.6 so we can assume it works
  * with any version that can compile the kernel
  */
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index 7dd4e0349ef3..84e099a87383 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -59,12 +59,6 @@
(typeof(ptr)) (__ptr + (off));  \
 })
 
-/*
- * A trick to suppress uninitialized variable warning without generating any
- * code
- */
-#define uninitialized_var(x) x = x
-
 #ifdef CONFIG_RETPOLINE
 #define __noretpoline __attribute__((__indirect_branch__("keep")))
 #endif
diff --git a/tools/include/linux/compiler.h b/tools/include/linux/compiler.h
index 9f9002734e19..2f2f4082225e 100644
--- a/tools/include/linux/compiler.h
+++ b/tools/include/linux/compiler.h
@@ -111,8 +111,6 @@
 # define noinline
 #endif
 
-#define uninitialized_var(x) x = *(&(x))
-
 #include 
 
 /*
diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h
index 6683b4a70b05..1e14ab967c11 100644
--- a/tools/virtio/linux/kernel.h
+++ b/tools/virtio/linux/kernel.h
@@ -109,8 +109,6 @@ static inline void free_page(unsigned long addr)
const typeof( ((type *)0)->member ) *__mptr = (ptr);\
(type *)( (char *)__mptr - offsetof(type,member) );})
 
-#define uninitialized_var(x) x = x
-
 # ifndef likely
 #  define likely(x)(__builtin_expect(!!(x), 1))
 # endif
-- 
2.25.1

[PATCH v2 15/16] treewide: Remove uninitialized_var() usage

2020-06-19 Thread Kees Cook

Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.

I preparation for removing[2] the[3] macro[4], remove all remaining
needless uses with the following script:

git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
xargs perl -pi -e \
's/\buninitialized_var\(([^\)]+)\)/\1/g;
 s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'

drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
pathological white-space.

No outstanding warnings were found building allmodconfig with GCC 9.3.0
for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
alpha, and m68k.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-gli...@google.com/
[2] 
https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1tgqcr5vqkczwj0qxk6cernou6eedsuda...@mail.gmail.com/
[3] 
https://lore.kernel.org/lkml/ca+55afwgbgqhbp1fkxvrkepzyr5j8n1vkt1vzdz9knmpuxh...@mail.gmail.com/
[4] 
https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yvju65tplgn_ybynv0ve...@mail.gmail.com/

Reviewed-by: Leon Romanovsky  # drivers/infiniband and 
mlx4/mlx5
Acked-by: Jason Gunthorpe  # IB
Acked-by: Kalle Valo  # wireless drivers
Reviewed-by: Chao Yu  # erofs
Signed-off-by: Kees Cook 
---
 arch/arm/mach-sa1100/assabet.c   |  2 +-
 arch/arm/mm/alignment.c  |  2 +-
 arch/ia64/kernel/process.c   |  2 +-
 arch/ia64/mm/discontig.c |  2 +-
 arch/ia64/mm/tlb.c   |  2 +-
 arch/mips/lib/dump_tlb.c |  2 +-
 arch/mips/mm/init.c  |  2 +-
 arch/mips/mm/tlb-r4k.c   |  6 +++---
 arch/powerpc/kvm/book3s_64_mmu_radix.c   |  2 +-
 arch/powerpc/kvm/powerpc.c   |  2 +-
 arch/powerpc/platforms/52xx/mpc52xx_pic.c|  2 +-
 arch/s390/kernel/smp.c   |  2 +-
 arch/x86/kernel/quirks.c | 10 +-
 arch/x86/kvm/mmu/mmu.c   |  2 +-
 arch/x86/kvm/mmu/paging_tmpl.h   |  2 +-
 arch/x86/kvm/x86.c   |  2 +-
 block/blk-merge.c|  2 +-
 drivers/acpi/acpi_pad.c  |  2 +-
 drivers/ata/libata-scsi.c|  2 +-
 drivers/atm/zatm.c   |  2 +-
 drivers/block/drbd/drbd_nl.c |  6 +++---
 drivers/block/rbd.c  |  2 +-
 drivers/clk/clk-gate.c   |  2 +-
 drivers/firewire/ohci.c  | 14 +++---
 drivers/gpu/drm/bridge/sil-sii8620.c |  2 +-
 drivers/gpu/drm/drm_edid.c   |  2 +-
 drivers/gpu/drm/exynos/exynos_drm_dsi.c  |  6 +++---
 drivers/gpu/drm/i915/display/intel_fbc.c |  2 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c  |  2 +-
 drivers/gpu/drm/i915/intel_uncore.c  |  2 +-
 drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c  |  4 ++--
 drivers/i2c/busses/i2c-rk3x.c|  2 +-
 drivers/ide/ide-acpi.c   |  2 +-
 drivers/ide/ide-atapi.c  |  2 +-
 drivers/ide/ide-io-std.c |  4 ++--
 drivers/ide/ide-io.c |  8 
 drivers/ide/ide-sysfs.c  |  2 +-
 drivers/ide/umc8672.c|  2 +-
 drivers/idle/intel_idle.c|  2 +-
 drivers/infiniband/core/uverbs_cmd.c |  4 ++--
 drivers/infiniband/hw/cxgb4/cm.c |  2 +-
 drivers/infiniband/hw/cxgb4/cq.c |  2 +-
 drivers/infiniband/hw/mlx4/qp.c  |  6 +++---
 drivers/infiniband/hw/mlx5/cq.c  |  6 +++---
 drivers/infiniband/hw/mlx5/devx.c|  2 +-
 drivers/infiniband/hw/mlx5/wr.c  |  2 +-
 drivers/infiniband/hw/mthca/mthca_qp.c   | 10 +-
 drivers/infiniband/sw/siw/siw_qp_rx.c|  2 +-
 drivers/input/serio/serio_raw.c  |  2 +-
 drivers/iommu/intel/iommu.c  |  2 +-
 drivers/md/dm-io.c   |  2 +-
 drivers/md/dm-ioctl.c|  2 +-
 drivers/md/dm-snap-persistent.c  |  2 +-
 drivers/md/dm-table.c|  2 +-
 drivers/md/dm-writecache.c   |  2 +-
 drivers/md/raid5.c   |  2 +-
 drivers/media/dvb-frontends/rtl2832.c|  2 +-
 drivers/media/tuners/qt1010.c|  4 ++--
 drivers/media/usb/gspca/vicam.c  |  2 +-
 drivers/media/usb/uvc/uvc_video.c

[PATCH v2 13/16] mm/debug_vm_pgtable: Remove uninitialized_var() usage

2020-06-19 Thread Kees Cook

Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings (e.g.
"unused variable"). If the compiler thinks it is uninitialized, either
simply initialize the variable or make compiler changes. As a precursor
to removing[2] this[3] macro[4], just initialize this variable to NULL.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-gli...@google.com/
[2] 
https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1tgqcr5vqkczwj0qxk6cernou6eedsuda...@mail.gmail.com/
[3] 
https://lore.kernel.org/lkml/ca+55afwgbgqhbp1fkxvrkepzyr5j8n1vkt1vzdz9knmpuxh...@mail.gmail.com/
[4] 
https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yvju65tplgn_ybynv0ve...@mail.gmail.com/

Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table 
helpers")
Signed-off-by: Kees Cook 
---
 mm/debug_vm_pgtable.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index e45623016aea..83c9e88a052a 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -307,7 +307,7 @@ static int __init debug_vm_pgtable(void)
phys_addr_t paddr;
unsigned long vaddr, pte_aligned, pmd_aligned;
unsigned long pud_aligned, p4d_aligned, pgd_aligned;
-   spinlock_t *uninitialized_var(ptl);
+   spinlock_t *ptl = NULL;
 
pr_info("Validating architecture page table helpers\n");
prot = vm_get_page_prot(VMFLAGS);
-- 
2.25.1

1 2 3 4 5 6 7 8 >

1 - 100 of 799 matches

Mail list logo