Re: KASAN: slab-out-of-bounds Read in map_lookup_elem

2018-01-13 Thread Dmitry Vyukov
On Sun, Jan 14, 2018 at 1:13 AM, Daniel Borkmann  wrote:
> On 01/13/2018 02:58 AM, syzbot wrote:
>> Hello,
>>
>> syzkaller hit the following crash on 19d28fbd306e7ae7c1acf05c3e6968b56f0d196b
>> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>> C reproducer is attached
>> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
>> for information about syzkaller reproducers
>
> Fixed here as well:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=bbeb6e4323dad9b5e0ee9f60c223dd532e2403b1

Thanks.

Let's tell syzbot about this so that it can report bugs in
map_lookup_elem again:

#syz fix:
bpf, array: fix overflow in max_entries and undefined behavior in index_mask
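
For readers following the fix title, a minimal userspace sketch of the kind of overflow involved may help. It is not the kernel patch: bits_needed() and compute_index_mask() are hypothetical names, and the sketch assumes the array map rounds max_entries up to a power of two and uses that value minus one as an index mask. Doing the shift on a 32-bit operand is undefined once the shift count reaches 32, so the sketch shifts in 64 bits and rejects counts that no longer fit.

```c
#include <stdint.h>

/* Hypothetical helper: number of bits needed to represent v (0 for v == 0). */
static unsigned int bits_needed(uint64_t v)
{
	unsigned int n = 0;

	while (v) {
		n++;
		v >>= 1;
	}
	return n;
}

/*
 * Hypothetical helper: store (max_entries rounded up to a power of two) - 1
 * in *mask. Returns 0 on success, -1 if max_entries is 0 or if the
 * rounded-up entry count no longer fits in 32 bits.
 */
static int compute_index_mask(uint32_t max_entries, uint32_t *mask)
{
	uint64_t mask64;

	if (max_entries == 0)
		return -1;

	/* Shift in 64 bits: the count can be 32, which is undefined for a 32-bit operand. */
	mask64 = (1ULL << bits_needed((uint64_t)max_entries - 1)) - 1;

	/* The rounded-up entry count is mask64 + 1; reject it if it overflows u32. */
	if (mask64 + 1 > UINT32_MAX)
		return -1;

	*mask = (uint32_t)mask64;
	return 0;
}
```

With these assumptions, a request for 5 entries yields a mask of 7, while a request for 0x80000001 entries is rejected instead of silently wrapping.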


>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+e631b5eb810eae085...@syzkaller.appspotmail.com
>> It will help syzbot understand when the bug is fixed. See footer for details.
>> If you forward the report, please keep this part and the footer.
>>
>> audit: type=1400 audit(1515782899.456:8): avc:  denied  { sys_admin } for  
>> pid=3501 comm="syzkaller937663" capability=21  
>> scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023 
>> tcontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023 tclass=cap_userns 
>> permissive=1
>> audit: type=1400 audit(1515782899.512:9): avc:  denied  { sys_chroot } for  
>> pid=3502 comm="syzkaller937663" capability=18  
>> scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023 
>> tcontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023 tclass=cap_userns 
>> permissive=1
>> ==
>> BUG: KASAN: slab-out-of-bounds in memcpy include/linux/string.h:344 [inline]
>> BUG: KASAN: slab-out-of-bounds in map_lookup_elem+0x4dc/0xbd0 
>> kernel/bpf/syscall.c:584
>> Read of size 2097153 at addr 8801bfc7e690 by task syzkaller937663/3502
>>
>> CPU: 0 PID: 3502 Comm: syzkaller937663 Not tainted 4.15.0-rc7+ #185
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
>> Google 01/01/2011
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:17 [inline]
>>  dump_stack+0x194/0x257 lib/dump_stack.c:53
>>  print_address_description+0x73/0x250 mm/kasan/report.c:252
>>  kasan_report_error mm/kasan/report.c:351 [inline]
>>  kasan_report+0x25b/0x340 mm/kasan/report.c:409
>>  check_memory_region_inline mm/kasan/kasan.c:260 [inline]
>>  check_memory_region+0x137/0x190 mm/kasan/kasan.c:267
>>  memcpy+0x23/0x50 mm/kasan/kasan.c:302
>>  memcpy include/linux/string.h:344 [inline]
>>  map_lookup_elem+0x4dc/0xbd0 kernel/bpf/syscall.c:584
>>  SYSC_bpf kernel/bpf/syscall.c:1808 [inline]
>>  SyS_bpf+0x922/0x4400 kernel/bpf/syscall.c:1782
>>  entry_SYSCALL_64_fastpath+0x23/0x9a
>> RIP: 0033:0x440ab9
>> RSP: 002b:007dff68 EFLAGS: 0203 ORIG_RAX: 0141
>> RAX: ffda RBX: 7fffc494ea60 RCX: 00440ab9
>> RDX: 0018 RSI: 20eab000 RDI: 0001
>> RBP:  R08:  R09: 
>> R10:  R11: 0203 R12: 00402290
>> R13: 00402320 R14:  R15: 
>>
>> Allocated by task 3502:
>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>>  set_track mm/kasan/kasan.c:459 [inline]
>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
>>  __do_kmalloc_node mm/slab.c:3672 [inline]
>>  __kmalloc_node+0x47/0x70 mm/slab.c:3679
>>  kmalloc_node include/linux/slab.h:541 [inline]
>>  bpf_map_area_alloc+0x32/0x80 kernel/bpf/syscall.c:123
>>  array_map_alloc+0x351/0xa00 kernel/bpf/arraymap.c:96
>>  find_and_alloc_map kernel/bpf/syscall.c:105 [inline]
>>  map_create kernel/bpf/syscall.c:404 [inline]
>>  SYSC_bpf kernel/bpf/syscall.c:1805 [inline]
>>  SyS_bpf+0x7f8/0x4400 kernel/bpf/syscall.c:1782
>>  entry_SYSCALL_64_fastpath+0x23/0x9a
>>
>> Freed by task 1966:
>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>>  set_track mm/kasan/kasan.c:459 [inline]
>>  kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
>>  __cache_free mm/slab.c:3488 [inline]
>>  kfree+0xd6/0x260 mm/slab.c:3803
>>  seq_release fs/seq_file.c:366 [inline]
>>  single_release+0x80/0xb0 fs/seq_file.c:602
>>  __fput+0x327/0x7e0 fs/file_table.c:210
>>  fput+0x15/0x20 fs/file_table.c:244
>>  task_work_run+0x199/0x270 kernel/task_work.c:113
>>  tracehook_notify_resume include/linux/tracehook.h:191 [inline]
>>  exit_to_usermode_loop+0x296/0x310 arch/x86/entry/common.c:162
>>  prepare_exit_to_usermode arch/x86/entry/common.c:195 [inline]
>>  syscall_return_slowpath+0x490/0x550 arch/x86/entry/common.c:264
>>  entry_SYSCALL_64_fastpath+0x98/0x9a
>>
>> The buggy address belongs to the object at 8801bfc7e5c0
>>  which belongs to the cache kmalloc-256 of size 256
>> The buggy address is located 208 bytes inside of
>>  256-byte region [8801bfc7e5c0, 8801bfc7e6c0)
>> The buggy addr

Re: [PATCH v1 3/4] ARM: dts: add pwm pins for r40.

2018-01-13 Thread Hao Zhang
2018-01-11 20:47 GMT+08:00 Maxime Ripard :
> Hi,
>
> On Thu, Jan 11, 2018 at 07:33:23PM +0800, hao_zhang wrote:
>> This patch adds pwm pins for r40.
>>
>> Signed-off-by: hao_zhang 
>
> You should order your patches differently. We try to be as bisectable
> as possible, and if we just apply this patch the DT will not compile
> anymore.
>
> Your patch 4 should come before this one.
>

Do you mean that the patches are applied in order, from the first to
the last in the patch set?
Because applying the last patch first would also break the DT compile...

Thanks :)
Hao Zhang

> Your commit title and log don't seem to match the content of the
> patch either.
>
> Thanks!
> Maxime
>
> --
> Maxime Ripard, Free Electrons
> Embedded Linux and Kernel engineering
> http://free-electrons.com


Re: [PATCH v1 4/4] ARM: dts: add pwm node for r40.

2018-01-13 Thread Hao Zhang
2018-01-11 20:47 GMT+08:00 Maxime Ripard :
> Hi,
>
> On Thu, Jan 11, 2018 at 07:34:12PM +0800, hao_zhang wrote:
>> This patch add pwm node for r40.
>>
>> Signed-off-by: hao_zhang 
>> ---
>>  arch/arm/boot/dts/sun8i-r40.dtsi | 13 +
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/arch/arm/boot/dts/sun8i-r40.dtsi 
>> b/arch/arm/boot/dts/sun8i-r40.dtsi
>> index 173dcc1..84c963c 100644
>> --- a/arch/arm/boot/dts/sun8i-r40.dtsi
>> +++ b/arch/arm/boot/dts/sun8i-r40.dtsi
>> @@ -295,6 +295,11 @@
>>   bias-pull-up;
>>   };
>>
>> + pwm_pins: pwm-pins {
>> + pins = "PB2", "PB3";
>> + function = "pwm";
>> + };
>> +
>
> Is it the only combination of pins that is usable?
>
> If so, you can add the pinctrl-0 property directly in the pwm nodes.
>

The R40/V40/T3 has 8 PWM channels; the pins that can be configured as PWM are:
PB2, PB3, PI20, PI21, PB20, PB21, PB9, PB10

PB2 and PB3 can be configured on the bananapi-m2-ultra and on my T3 board, but
the other pins either do not exist on the board or conflict with other
functions, so I only added PB2 and PB3. But I think splitting them is better,
like this:

pwm0_pin: pwm0-pin {
pins = "PB2";
function = "pwm";
};

pwm1_pin: pwm1-pin {
pins = "PB3";
function = "pwm";
};

Should the nodes for pwm2~7 also be added here?

On sun8i-r40-bananapi-m2-ultra.dts:
Because this is board-specific, I think adding just pinctrl-0 = <&pwm0_pin>
for the bananapi-m2-ultra board is enough (I only use PB3 to test pwm channel 1).

&pwm {
pinctrl-names = "default";
pinctrl-0 = <&pwm0_pin>;
status = "okay";
};

Thanks ;-)
Hao Zhang


> Thanks!
> Maxime
>
> --
> Maxime Ripard, Free Electrons
> Embedded Linux and Kernel engineering
> http://free-electrons.com


warning: '______f' is static but declared in inline function in

2018-01-13 Thread Randy Dunlap
Hi,

I regularly get 50 MB - 60 MB files during kernel randconfig builds.
These large files mostly contain (many repeats of; e.g., 124,594):

In file included from ../include/linux/string.h:6:0,
 from ../include/linux/uuid.h:20,
 from ../include/linux/mod_devicetable.h:13,
 from ../scripts/mod/devicetable-offsets.c:3:
../include/linux/compiler.h:64:4: warning: '__f' is static but declared in 
inline function 'strcpy' which is not static [enabled by default]
__f = { \
^
../include/linux/compiler.h:56:23: note: in expansion of macro '__trace_if'
   ^
../include/linux/string.h:425:2: note: in expansion of macro 'if'
  if (p_size == (size_t)-1 && q_size == (size_t)-1)
  ^


AFAICT, this only happens when CONFIG_FORTIFY_SOURCE=y and
CONFIG_PROFILE_ALL_BRANCHES=y.  Are these 2 kconfig symbols
incompatible?  We could prevent PROFILE_ALL_BRANCHES if
FORTIFY_SOURCE=y.  (e.g., see patch below)

I am using: gcc (SUSE Linux) 4.8.5
Do some later versions of gcc handle this without making the build output
too noisy?  Does the generated code work as expected?

Any patches or suggestions?
(other than DDT)

thanks,
-- 
~Randy

--- orig/kernel/trace/Kconfig
+++ next/kernel/trace/Kconfig
@@ -355,7 +355,7 @@ config PROFILE_ANNOTATED_BRANCHES
  on if you need to profile the system's use of these macros.
 
 config PROFILE_ALL_BRANCHES
-   bool "Profile all if conditionals"
+   bool "Profile all if conditionals" if !FORTIFY_SOURCE
select TRACE_BRANCH_PROFILING
help
  This tracer profiles all branch conditions. Every if ()



Re: [PATCH 3/3] tracing: don't set parser->cont if it has reached the end of input buffer

2018-01-13 Thread Du, Changbin
On Fri, Jan 12, 2018 at 10:31:08AM -0500, Steven Rostedt wrote:
[...]
> > Thanks, so now I understand the corner case below. Userspace tries to set the
> > filter with an unrecognized symbol name (e.g. "abcdefg").
> > open("/sys/kernel/debug/tracing/set_ftrace_filter", O_WRONLY|O_TRUNC) = 3
> > write(3, "abcdefg", 7)
> > 
> > Since "abcdefg" is not in the symbol list, we would expect the write to return
> > -EINVAL, right? As below:
> > # echo abcdefg > set_ftrace_filter
> > bash: echo: write error: Invalid argument
> 
> The write itself doesn't finish the operation. There may be another
> write. In other words:
> 
>   write(3, "do_", 3);
>   write(3, "IRQ\n", 4);
> 
> Should both return success, even though it only enabled do_IRQ.
> 
> > 
> > But the above mechanism hides the error. It returns success although no filter is
> > applied at all.
> > # echo -n abcdefg > set_ftrace_filter
> > 
> > I think in this case the kernel may require userspace to append a '\0' or a space to the
> > string buffer so everything can work.
> > 
> > Also there is another corner case. The write below doesn't work.
> > open("/sys/kernel/debug/tracing//set_ftrace_pid", O_WRONLY|O_TRUNC) = 3
> > write(3, " \0", 2)  = -1 EINVAL (Invalid argument)
> > 
> > While these work:
> > # echo "" > set_ftrace_pid
> > # echo " " > set_ftrace_pid
> > # echo -n " " > set_ftrace_pid
> > 
> > This is the reason why I think '\0' should be recognized by the parser.
> 
> Hmm, thinking about this more, I do partially agree with you. We should
> accept '\0' but I disagree that it should be treated as a space. I
> don't want hidden code.
> 
> It should be treated as a terminator. And carefully as well.
> 
>   write(3, "do_IRQ", 7);
> 
> Which will send to the kernel 'd' 'o' '_' 'I' 'R' 'Q' '\0' when the
> kernel sees the '\0', and the write has not sent anything else, it
> should go ahead and execute 'do_IRQ'
> 
> This will allow for this to work:
> 
>   char *funcs[] = { "do_IRQ", "schedule", NULL };
> 
>   for (i = 0; funcs[i]; i++) {
>   ret = write(3, funcs[i], strlen(funcs[i]) + 1);
>   if (ret < 0)
>   exit(-1);
>   }
> 
> 
> Now if someone were to write:
> 
>   write(3, "do_IRQ\0schedule", 16);
> 
> That should return an error.
> 
> Why?
> 
> Because these are strings, and most tools treat '\0' as a nul
> terminator to a string. If we allow for tools to send data after that
> nul terminator, we are opening up a way for those interacting with
> these tools to sneak in strings that are not visible.
> 
> Say we have some admin tool that is doing tracing, and takes input.
> And all the input is logged. And say the tool does something like:
> 
> 
>   r = read(0, buf, sizeof(buf));
>   if (r < 0 || r > sizeof(buf) - 1)
>   return -1;
>   log("Adding to output %s\n", buf);
>   write(3, buf, r);
> 
> The "Adding to output" would only show up to the '\0', but if we allow
> that write to process after the '\0' then we just allowed the user to
> circumvent the log.
> 
> -- Steve
I agree with your concern. So I will revise this series and drop the last patch.

-- 
Thanks,
Changbin Du
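
The terminator rule Steven describes above can be sketched as a small userspace check. This is only an illustration under stated assumptions, not the ftrace parser: check_nul_policy() and its return convention are hypothetical. A buffer with no '\0' is treated as a partial write, a single '\0' as the last byte is treated as a terminator, and any data after a '\0' is rejected.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical policy check for one write() buffer:
 *   0        no '\0' in the buffer, more input may follow
 *   1        '\0' is the last byte, treat it as a terminator
 *   -EINVAL  bytes follow the first '\0', refuse the write
 */
static int check_nul_policy(const char *buf, size_t len)
{
	const char *nul = memchr(buf, '\0', len);

	if (!nul)
		return 0;
	if ((size_t)(nul - buf) == len - 1)
		return 1;
	return -EINVAL;
}
```

Under this sketch, write(3, "do_", 3) is a partial write, write(3, "do_IRQ", 7) is terminated, and write(3, "do_IRQ\0schedule", 16) is refused, which matches the three cases discussed in the thread.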


Re: [PATCH v3 1/3] PCI/AER: factor out error reporting from AER

2018-01-13 Thread poza

On 2018-01-13 06:27, Bjorn Helgaas wrote:

On Mon, Jan 08, 2018 at 01:25:03PM +0530, Oza Pawandeep wrote:

This patch factors out error reporting callbacks, which are currently
tightly coupled with AER.
DPC should be able to call these callbacks when a DPC trigger event occurs.


Signed-off-by: Oza Pawandeep 

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 6402f7f..fd053e5 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -462,7 +462,7 @@ static void ghes_do_proc(struct ghes *ghes,
 * use, so treat it as a fatal AER error.
 */
if (gdata->flags & CPER_SEC_RESET)
-   aer_severity = AER_FATAL;
+   aer_severity = PCI_ERR_AER_FATAL;


Please split the s/AER_FATAL/PCI_ERR_AER_FATAL/ changes into a
separate patch to reduce the size of this patch.



will do.


I would name them PCI_ERR_FATAL and PCI_ERR_NONFATAL because that
matches the usage in the spec, e.g., PCIe r4.0, sec 6.2.2, and the
symbols like PCI_ERR_UNC_UND in pci_regs.h.



ok will work on that.


diff --git a/drivers/pci/pcie/pcie-err.c b/drivers/pci/pcie/pcie-err.c
new file mode 100644
index 000..a76a8bf
--- /dev/null
+++ b/drivers/pci/pcie/pcie-err.c
@@ -0,0 +1,335 @@
+/*
+ * Copyright (c) 2017, The Linux Foundation. All rights reserved.


Somebody already mentioned using the SPDX thing, which I think you
should do.

But I'm confused about this copyright line.  As far as I can tell,
this basically moves code from aerdrv_core.c to pcie-err.c, which is
not enough to change the copyright ownership.  But it drops the
copyright lines from aerdrv.core.c and replaces them with "(c) 2017,
The Linux Foundation".  ??  Where did that come from?

If you *add* something non-trivial, I think it's OK to add your own
new copyright info (though I don't think this is really necessary),
but I don't think we should *remove* information about other copyright
owners.



sure will keep original copyright owners info, and SPDX as well.

+ * This program is free software; you can redistribute it and/or modify

+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "portdrv.h"
+
+static DEFINE_MUTEX(pci_err_recovery_lock);
+
+pci_ers_result_t pci_merge_result(enum pci_ers_result orig,
+ enum pci_ers_result new)


Please do all the renames, e.g., s/merge_result/pci_merge_result/, in
a separate patch, followed by one that only moves code between files.

These initial ones will seem trivial, and they are, which is perfect.
That makes them easy to review, and it will make the "interesting"
patches much smaller and also easier to review.



OK, all the renames will be made in a separate patch.


Then you can follow up with more patches that do things like add the
mutex.  It's too hard to review (or even notice) things like that when
everything is squashed together.



sure.


diff --git a/include/linux/aer.h b/include/linux/aer.h
index 8f87bbe..3eac8ed 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -11,10 +11,6 @@
 #include 
 #include 

-#define AER_NONFATAL   0
-#define AER_FATAL  1
-#define AER_CORRECTABLE2
-
 struct pci_dev;

 struct aer_header_log_regs {
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c170c92..083408e 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -739,6 +739,10 @@ struct pci_error_handlers {
void (*resume)(struct pci_dev *dev);
 };

+struct pci_err_broadcast_data {
+   enum pci_channel_state state;
+   enum pci_ers_result result;
+};


Only used in pcie-err.c; should be declared there.


 struct module;
 struct pci_driver {
@@ -1998,6 +2002,23 @@ static inline resource_size_t pci_iov_resource_size(struct pci_dev *dev, int res

 void pci_hp_remove_module_link(struct pci_slot *pci_slot);
 #endif

+#define PCI_ERR_AER_NONFATAL   0
+#define PCI_ERR_AER_FATAL  1
+#define PCI_ERR_AER_CORRECTABLE2


Why do these need to be moved to include/linux/pci.h?  I don't really
want them in include/linux at all.  The only uses outside drivers/pci
are in ras_event.h and acpi/apei/ghes.c.  I'd rather keep them in
aer.h with the hope of being able to move them into drivers/pci/pci.h
eventually.


But if I define PCI_ERR_FATAL in aer.h, how can DPC use it?
Let me see what I can best do, because if PCI_ERR_AER_FATAL is renamed to
PCI_ERR_FATAL

then both aer and dpc 

Re: Commit fc72ae40e303 broke x86-64 build environment.

2018-01-13 Thread vcaputo
On Sat, Jan 13, 2018 at 11:13:13PM -0600, Rob Landley wrote:
> You've made the ORC unwinder part of allnoconfig, which means trying to
> build "make ARCH=x86_64 allnoconfig" requires installing a new package
> (libelf-dev) or else the build breaks.
> 
> What's worse, if I go into menuconfig and switch it back to frame
> pointer, the build STILL breaks:
> 
> $ make -j 8
> Makefile:932: *** "Cannot generate ORC metadata for
> CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or
> elfutils-libelf-devel".  Stop.
> $ grep UNWIND .config
> # CONFIG_UNWINDER_ORC is not set
> CONFIG_UNWINDER_FRAME_POINTER=y
> # CONFIG_UNWINDER_GUESS is not set
> 
> As far as I can tell, x86-64 doesn't build anymore without libelf-dev.
> It's a new hard requirement for the build.
> 

FYI this has already been brought up on lkml:
https://patchwork.kernel.org/patch/10137237/

IIRC you can get things working by deleting include/config/auto.conf
when you've switched back to CONFIG_UNWINDER_FRAME_POINTER.

I don't believe anything other than making CONFIG_UNWINDER_ORC the
default was intentional.  The frustration is just a consequence of some
build system bug.

Regards,
Vito Caputo


Commit fc72ae40e303 broke x86-64 build environment.

2018-01-13 Thread Rob Landley
You've made the ORC unwinder part of allnoconfig, which means trying to
build "make ARCH=x86_64 allnoconfig" requires installing a new package
(libelf-dev) or else the build breaks.

What's worse, if I go into menuconfig and switch it back to frame
pointer, the build STILL breaks:

$ make -j 8
Makefile:932: *** "Cannot generate ORC metadata for
CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or
elfutils-libelf-devel".  Stop.
$ grep UNWIND .config
# CONFIG_UNWINDER_ORC is not set
CONFIG_UNWINDER_FRAME_POINTER=y
# CONFIG_UNWINDER_GUESS is not set

As far as I can tell, x86-64 doesn't build anymore without libelf-dev.
It's a new hard requirement for the build.

Why?

Rob


Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating

2018-01-13 Thread jianchao.wang
Dear all

Thanks for the kind response and review. That's really appreciated.

On 01/13/2018 12:46 AM, Eric Dumazet wrote:
>> Does this need to be dma_wmb(), and should it be in
>> mlx4_en_update_rx_prod_db ?
>>
> +1 on dma_wmb()
> 
> On what architecture bug was observed ?
This issue was observed on x86-64.
And I will send a new patch, which replaces wmb() with dma_wmb(), to the customer
to confirm.

Thanks
Jianchao
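
The ordering requirement discussed here (descriptor contents must be visible before the producer index is published) can be illustrated with a userspace analogy. This is a sketch under stated assumptions, not the mlx4 driver: struct rx_ring, ring_post() and ring_peek() are hypothetical, and a C11 release fence between two threads stands in for the dma_wmb() that orders writes seen by a device.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u

/* Hypothetical ring: desc[] stands in for RX descriptors, prod for the doorbell. */
struct rx_ring {
	uint64_t desc[RING_SIZE];
	_Atomic uint32_t prod;
};

/* Producer: fill the descriptor first, then publish the new producer index. */
static void ring_post(struct rx_ring *r, uint64_t addr)
{
	uint32_t p = atomic_load_explicit(&r->prod, memory_order_relaxed);

	r->desc[p % RING_SIZE] = addr;
	/* Analogue of the write barrier: descriptor store ordered before index store. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&r->prod, p + 1, memory_order_relaxed);
}

/* Consumer: returns 1 and fills *addr if entry 'cons' has been published, else 0. */
static int ring_peek(struct rx_ring *r, uint32_t cons, uint64_t *addr)
{
	uint32_t p = atomic_load_explicit(&r->prod, memory_order_relaxed);

	if (p == cons)
		return 0;
	/* Pairs with the release fence in ring_post(). */
	atomic_thread_fence(memory_order_acquire);
	*addr = r->desc[cons % RING_SIZE];
	return 1;
}
```

Without the fence in ring_post(), the consumer could observe the new index before the descriptor contents, which is the userspace counterpart of the hardware reading a stale descriptor.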


Re: [PATCHv2 5/7] printk: allow kmsg to be encrypted using public key encryption

2018-01-13 Thread Sergey Senozhatsky
Ccing Kees, Peter, Andrew, Steven

On (01/13/18 23:34), Dan Aloni wrote:
> This commit enables the kernel to encrypt the free-form text that
> is generated by printk() before it is brought up to `dmesg` in
> userspace.
> 
> The encryption is made using one of the trusted public keys which
> are kept built-in inside the kernel. These keys are presently
> also used for verifying kernel modules and userspace-supplied
> firmwares.

OK, this is the first time I'm receiving it, yet it's v2 already.
I'm Cc-ed on only this particular patch, not the entire patch set;
so it's hard to tell what else is being touched and why, so I'm
going to start with the basic questions.

Are you fixing a real problem? Is this because you see unhashed
kernel pointers in dmesg, or is there something else?

-ss

// keeping the code for Cc-ed people

> ---
>  Documentation/ioctl/ioctl-number.txt |   1 +
>  include/uapi/linux/kmsg.h|  18 ++
>  init/Kconfig |  11 +
>  kernel/printk/printk.c   | 450 
> +++
>  4 files changed, 480 insertions(+)
>  create mode 100644 include/uapi/linux/kmsg.h
> 
> diff --git a/Documentation/ioctl/ioctl-number.txt 
> b/Documentation/ioctl/ioctl-number.txt
> index 3e3fdae5f3ed..eafa24cddf3f 100644
> --- a/Documentation/ioctl/ioctl-number.txt
> +++ b/Documentation/ioctl/ioctl-number.txt
> @@ -226,6 +226,7 @@ Code  Seq#(hex)   Include FileComments
>  'f'  00-0F   fs/ocfs2/ocfs2_fs.h conflict!
>  'g'  00-0F   linux/usb/gadgetfs.h
>  'g'  20-2F   linux/usb/g_printer.h
> +'g'  30-3F   uapi/linux/kmsg.h
>  'h'  00-7F   conflict! Charon filesystem
>   
>  'h'  00-1F   linux/hpet.hconflict!
> diff --git a/include/uapi/linux/kmsg.h b/include/uapi/linux/kmsg.h
> new file mode 100644
> index ..497040740d69
> --- /dev/null
> +++ b/include/uapi/linux/kmsg.h
> @@ -0,0 +1,18 @@
> +#ifndef _LINUX_UAPI_KMSG_H
> +#define _LINUX_UAPI_KMSG_H
> +
> +#include 
> +#include 
> +
> +struct kmsg_ioctl_get_encrypted_key {
> + void __user *output_buffer;
> + __u64 buffer_size;
> + __u64 key_size;
> +};
> +
> +#define KMSG_IOCTL_BASE 'g'
> +
> +#define KMSG_IOCTL__GET_ENCRYPTED_KEY  _IOWR(KMSG_IOCTL_BASE, 0x30, \
> + struct kmsg_ioctl_get_encrypted_key)
> +
> +#endif /* _LINUX_DN_H */
> diff --git a/init/Kconfig b/init/Kconfig
> index a9a2e2c86671..8e07a8f9e5c6 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1769,6 +1769,17 @@ config MODULE_SIG
> debuginfo strip done by some packagers (such as rpmbuild) and
> inclusion into an initramfs that wants the module size reduced.
>  
> +config KMSG_ENCRYPTION
> + bool "Encrypt /dev/kmsg (viewing dmesg will require decryption!)"
> + depends on SYSTEM_TRUSTED_KEYRING
> + select BASE64_ARMOR
> + help
> +   This enables strong encryption of messages generated by the kernel,
> +   to defend against most kinds of information leaks.
> +
> +   Note that this option adds the OpenSSL development packages as a
> +   kernel build dependency so that certificates can be generated.
> +
>  config MODULE_SIG_FORCE
>   bool "Require modules to be validly signed"
>   depends on MODULE_SIG
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index b9006617710f..898094fb87bd 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -48,6 +48,14 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
>  
>  #include 
>  #include 
> @@ -100,6 +108,10 @@ enum devkmsg_log_masks {
>   DEVKMSG_LOG_MASK_LOCK   = BIT(__DEVKMSG_LOG_BIT_LOCK),
>  };
>  
> +#define CRYPT_KMSG_KEY_LEN 16
> +#define CRYPT_KMSG_AUTH_LEN16
> +#define CRYPT_KMSG_TEXT_META_MAX   32
> +
>  /* Keep both the 'on' and 'off' bits clear, i.e. ratelimit by default: */
>  #define DEVKMSG_LOG_MASK_DEFAULT 0
>  
> @@ -744,12 +756,33 @@ static ssize_t msg_print_ext_body(char *buf, size_t 
> size,
>   return p - buf;
>  }
>  
> +#ifdef CONFIG_KMSG_ENCRYPTION
> +static int __ro_after_init kmsg_encrypt = 1;
> +static int __init control_kmsg_encrypt(char *str)
> +{
> + get_option(&str, &kmsg_encrypt);
> + return 0;
> +}
> +__setup("kmsg_encrypt=", control_kmsg_encrypt);
> +
> +struct devkmsg_crypt {
> + u8 key[CRYPT_KMSG_KEY_LEN];
> + u8 *encrypted_key;
> + size_t encrypted_key_len;
> + bool encrypted_key_read;
> + struct crypto_aead *sk_tfm;
> +};
> +#else
> +struct devkmsg_crypt {};
> +#endif
> +
>  /* /dev/kmsg - userspace message inject/listen interface */
>  struct devkmsg_user {
>   u64 seq;
>   u32 idx;
>   struct ratelimit_state rs;
>   struct mutex lock;
> + struct devkmsg_crypt crypt;
>   char buf[CONSOLE_EXT_LOG_MAX];
>  };
>  
> @@ -816,6 +849,358 @@ static 

Re: [PATCH 04/11] signal/parisc: Document a conflict with SI_USER with SIGFPE

2018-01-13 Thread Eric W. Biederman
ebied...@xmission.com (Eric W. Biederman) writes:

> Helge Deller  writes:
>
>> * Eric W. Biederman :
>>> Setting si_code to 0 results in userspace seeing an si_code of 0.
>>> This is the same si_code as SI_USER.  POSIX and common sense require
>>> that SI_USER not be a signal-specific si_code.  As such this use of 0
>>> for the si_code is a pretty horribly broken ABI.
>>> 
>>> Further, use of si_code == 0 guaranteed that copy_siginfo_to_user saw a
>>> value of __SI_KILL and now sees a value of SIL_KILL, with the result
>>> that the uid and pid fields are copied, which might copy the si_addr
>>> field by accident but certainly not by design, making this a very
>>> flaky implementation.
>>> 
>>> Utilizing FPE_FIXME siginfo_layout will now return SIL_FAULT and the
>>> appropriate fields will reliably be copied.
>>> 
>>> This bug is 13 years old and parisc machines are no longer being built
>>> so I don't know if it is possible or worth fixing it.  But it is at least
>>> worth documenting this so other architectures don't make the same
>>> mistake.
>>
>>
>> I think we should fix it, even if we now break the ABI.
>>
>> It's about a "conditional trap" which needs to be handled by userspace.
>> I doubt there is any Linux code out which is utilizing this
>> parisc-specific trap.
>>
>> I'd suggest to add a new FPE trap si_code (e.g. FPE_CONDTRAP).
>> While at it, maybe we should include the already existing FPE_MDAOVF
>> from the frv architecture, so that arch/frv/include/uapi/asm/siginfo.h
>> can go completely.
>>
>> Suggested patch is below.
>>
>> I'm willing to test the patch below on the parisc architecture for a few
>> weeks. And it will break arch/x86/kernel/signal_compat.c which needs
>> looking at then too.
>>
>> Thoughts?
>
> I like it.

Your comments about the si_codes caused me to look into how they differ
across the architectures and realize they also all need to be merged
into uapi/asm-generic/siginfo.h for sanity's sake.  In doing so I found
a couple of minor issues with my other unifications.

Rebased onto my tree your patch looks like the below.  If it does not
cause any regressions it looks like a perfect fix.  The noticeable change
is that the first FPE si_code available across all architectures is 14
so I have used 14 instead of 10 for FPE_CONDTRAP.

Eric

From: Helge Deller 
Date: Sat, 13 Jan 2018 19:32:43 -0600
Subject: [PATCH] signal/parisc: Add FPE_CONDTRAP for conditional trap handling

POSIX and common sense require that SI_USER not be a signal-specific
si_code.  Thus add a new FPE_CONDTRAP si_code for conditional traps.

-- EWB rebased onto my tree.

Signed-off-by: Helge Deller 
Signed-off-by: "Eric W. Biederman" 
---
 arch/parisc/include/uapi/asm/siginfo.h | 7 ---
 arch/parisc/kernel/traps.c | 7 ---
 include/uapi/asm-generic/siginfo.h | 3 ++-
 3 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/arch/parisc/include/uapi/asm/siginfo.h b/arch/parisc/include/uapi/asm/siginfo.h
index be40331f757d..4a1062e05aaf 100644
--- a/arch/parisc/include/uapi/asm/siginfo.h
+++ b/arch/parisc/include/uapi/asm/siginfo.h
@@ -8,11 +8,4 @@
 
 #include 
 
-/*
- * SIGFPE si_codes
- */
-#ifdef __KERNEL__
-#define FPE_FIXME  0   /* Broken dup of SI_USER */
-#endif /* __KERNEL__ */
-
 #endif
diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
index c919e6c0a687..68e671a11987 100644
--- a/arch/parisc/kernel/traps.c
+++ b/arch/parisc/kernel/traps.c
@@ -627,9 +627,10 @@ void notrace handle_interruption(int code, struct pt_regs *regs)
   on condition  */
if(user_mode(regs)){
si.si_signo = SIGFPE;
-   /* Set to zero, and let the userspace app figure it out 
from
-  the insn pointed to by si_addr */
-   si.si_code = FPE_FIXME;
+   /* Let userspace app figure it out from the insn pointed
+* to by si_addr.
+*/
+   si.si_code = FPE_CONDTRAP;
si.si_addr = (void __user *) regs->iaoq[0];
force_sig_info(SIGFPE, &si, current);
return;
diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index 254afc31e3be..ab4fad1a0cf0 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -229,7 +229,8 @@ typedef struct siginfo {
 # define __FPE_INVASC  12  /* invalid ASCII digit */
 # define __FPE_INVDEC  13  /* invalid decimal digit */
 #endif
-#define NSIGFPE13
+#define FPE_CONDTRAP   14  /* trap on condition */
+#define NSIGFPE14
 
 /*
  * SIGSEGV si_codes
-- 
2.14.1
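
To show why an si_code of 0 is a problem for userspace, here is a minimal sketch of how a SIGFPE handler might classify the signal's origin. It is an illustration only: classify_fpe() and EXAMPLE_FPE_CONDTRAP are hypothetical names, the value 14 is simply the one used in the patch above, and the sketch relies on the Linux convention that si_code values less than or equal to 0 come from another process while positive values come from the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/* Value taken from the patch in this thread; named differently to avoid
 * clashing with whatever the system headers may define. */
#define EXAMPLE_FPE_CONDTRAP 14

enum fpe_origin {
	FPE_FROM_PROCESS,	/* kill(), sigqueue(), ...: si_pid/si_uid are valid */
	FPE_FROM_FAULT		/* raised by the kernel: si_addr is valid */
};

/* Hypothetical classifier a SIGFPE handler could apply to info->si_code. */
static enum fpe_origin classify_fpe(int si_code)
{
	return si_code <= 0 ? FPE_FROM_PROCESS : FPE_FROM_FAULT;
}
```

A trap reported with si_code 0 is indistinguishable from SI_USER, so a handler using this kind of check would read si_pid and si_uid instead of si_addr. A dedicated positive code such as the one proposed in the patch lets the handler take the fault path.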




Year 2018 to Year 3018: Top 15 Blacklisted Photographers in Singapore Unveiled

2018-01-13 Thread Turritopsis Dohrnii Teo En Ming
1. Mr. Turritopsis Dohrnii Teo En Ming is a TARGETED INDIVIDUAL (TI)
since 2007, or perhaps even earlier. That is more than 10 years
already. A TARGETED INDIVIDUAL (TI) is a person who is PERSECUTED,
TARGETED, BLACKLISTED and MARKED by the [SINGAPORE] GOVERNMENT. Teo En
Ming, who is a Targeted Individual in Singapore, gets MARKED literally
everywhere he goes due to MASSIVE GOVERNMENT INFLUENCE.

2. Mr. Turritopsis Dohrnii Teo En Ming is a TARGETED INDIVIDUAL (TI)
since 2007, or perhaps even earlier. That is more than 10 years
already. A TARGETED INDIVIDUAL (TI) is a person who is PERSECUTED,
TARGETED, BLACKLISTED and MARKED by the [SINGAPORE] GOVERNMENT. Teo En
Ming, who is a Targeted Individual in Singapore, gets MARKED literally
everywhere he goes due to MASSIVE GOVERNMENT INFLUENCE.

Re: [PATCH bpf-next v5 5/5] error-injection: Support fault injection framework

2018-01-13 Thread Masami Hiramatsu
On Sat, 13 Jan 2018 22:28:29 +0900
Akinobu Mita  wrote:

> 2018-01-13 2:56 GMT+09:00 Masami Hiramatsu :
> > Support in-kernel fault-injection framework via debugfs.
> > This allows you to inject a conditional error to specified
> > function using debugfs interfaces.
> >
> > Here is the result of test script described in
> > Documentation/fault-injection/fault-injection.txt
> >
> >   ===
> >   # ./test_fail_function.sh
> >   1+0 records in
> >   1+0 records out
> >   1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0227404 s, 46.1 MB/s
> >   btrfs-progs v4.4
> >   See http://btrfs.wiki.kernel.org for more information.
> >
> >   Label:              (null)
> >   UUID:               bfa96010-12e9-4360-aed0-42eec7af5798
> >   Node size:          16384
> >   Sector size:        4096
> >   Filesystem size:    1001.00MiB
> >   Block group profiles:
> >     Data:             single            8.00MiB
> >     Metadata:         DUP              58.00MiB
> >     System:           DUP              12.00MiB
> >   SSD detected:       no
> >   Incompat features:  extref, skinny-metadata
> >   Number of devices:  1
> >   Devices:
> >      ID        SIZE  PATH
> >       1  1001.00MiB  /dev/loop2
> >
> >   mount: mount /dev/loop2 on /opt/tmpmnt failed: Cannot allocate memory
> >   SUCCESS!
> >   ===
> >
> >
> > Signed-off-by: Masami Hiramatsu 
> > Reviewed-by: Josef Bacik 
> > ---
> >   Changes in v3:
> >- Check and adjust error value for each target function
> >- Clear kprobe flag for reuse
> >- Add more documents and example
> >   Changes in v5:
> >- Support multi-function error injection
> > ---
> >  Documentation/fault-injection/fault-injection.txt |   68
> >  kernel/Makefile                                   |    1
> >  kernel/fail_function.c                            |  349 +
> >  lib/Kconfig.debug                                 |   10 +
> >  4 files changed, 428 insertions(+)
> >  create mode 100644 kernel/fail_function.c
> >
> > diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
> > index 918972babcd8..f4a32463ca48 100644
> > --- a/Documentation/fault-injection/fault-injection.txt
> > +++ b/Documentation/fault-injection/fault-injection.txt
> > @@ -30,6 +30,12 @@ o fail_mmc_request
> >injects MMC data errors on devices permitted by setting
> >debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request
> >
> > +o fail_function
> > +
> > +  injects error return on specific functions, which are marked by
> > +  ALLOW_ERROR_INJECTION() macro, by setting debugfs entries
> > +  under /sys/kernel/debug/fail_function. No boot option supported.
> > +
> >  Configure fault-injection capabilities behavior
> >  ---
> >
> > @@ -123,6 +129,29 @@ configuration of fault-injection capabilities.
> > default is 'N', setting it to 'Y' will disable failure injections
> > when dealing with private (address space) futexes.
> >
> > +- /sys/kernel/debug/fail_function/inject:
> > +
> > +   Format: { 'function-name' | '!function-name' | '' }
> > +   specifies the target function of error injection by name.
> > +   If the function name is prefixed with '!', the given function is
> > +   removed from the injection list. If nothing is specified (''),
> > +   the injection list is cleared.
> > +
> > +- /sys/kernel/debug/fail_function/injectable:
> > +
> > +   (read only) shows error injectable functions and what type of
> > +   error values can be specified. The error type will be one of
> > +   the following:
> > +   - NULL: retval must be 0.
> > +   - ERRNO: retval must be -1 to -MAX_ERRNO (-4096).
> > +   - ERR_NULL: retval must be 0 or -1 to -MAX_ERRNO (-4096).
> > +
> > +- /sys/kernel/debug/fail_function/<function-name>/retval:
> > +
> > +   specifies the "error" return value to inject into the given
> > +   function. This will be created when the user specifies a new
> > +   injection entry.
> > +
> >  o Boot option
> >
> >  In order to inject faults while debugfs is not available (early boot time),
> > @@ -268,6 +297,45 @@ trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT
> >  echo "Injecting errors into the module $module... (interrupt to stop)"
> >  sleep 100
> >
> > +--
> > +
> > +o Inject open_ctree error while btrfs mount
> > +
> > +#!/bin/bash
> > +
> > +rm -f testfile.img
> > +dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1
> > +DEVICE=$(losetup --show -f testfile.img)
> > +mkfs.btrfs -f $DEVICE
> > +mkdir -p tmpmnt
> > +
> > +FAILTYPE=fail_function
> > +FAILFUNC=open_ctree
> > +echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
> > +echo -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
> > +echo N > /sys/kernel/debug/$FAILTYPE/task-filter
> > +echo 100 > /sys/kernel/debug/$FAILTYPE/probability
> > +
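
The retval rules listed for /sys/kernel/debug/fail_function/injectable in the quoted documentation can be sketched as a small userspace check. This is only an illustration, not the code in kernel/fail_function.c: the enum, the function names and SKETCH_MAX_ERRNO are hypothetical, and the bound is taken from the kernel's MAX_ERRNO definition (4095 in include/linux/err.h), so adjust it if the bound quoted in the documentation is the intended one.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bound: include/linux/err.h defines MAX_ERRNO as 4095. */
#define SKETCH_MAX_ERRNO 4095L

/* Hypothetical names for the three error types shown by "injectable". */
enum sketch_etype { SKETCH_NULL, SKETCH_ERRNO, SKETCH_ERR_NULL };

/* True if rv lies in the errno range -1 .. -SKETCH_MAX_ERRNO. */
static bool sketch_is_errno(long rv)
{
	return rv <= -1 && rv >= -SKETCH_MAX_ERRNO;
}

/* True if rv is an acceptable retval for a function of the given type. */
bool sketch_retval_ok(enum sketch_etype type, long rv)
{
	switch (type) {
	case SKETCH_NULL:
		return rv == 0;
	case SKETCH_ERRNO:
		return sketch_is_errno(rv);
	case SKETCH_ERR_NULL:
		return rv == 0 || sketch_is_errno(rv);
	}
	return false;
}

/* Self-check using -12 (-ENOMEM), the value the example script writes. */
void sketch_retval_demo(void)
{
	assert(sketch_retval_ok(SKETCH_ERRNO, -12));
	assert(!sketch_retval_ok(SKETCH_NULL, -12));
}
```

With these rules, a positive value or a value below the errno range would be rejected for every type, which is presumably why the test script writes -12 and then expects the mount to fail with "Cannot allocate memory".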

Year 2025: Top 15 Blacklisted Photographers in Singapore Unveiled

2018-01-13 Thread Turritopsis Dohrnii Teo En Ming
1. Mr. Turritopsis Dohrnii Teo En Ming is a TARGETED INDIVIDUAL (TI)
since 2007, or perhaps even earlier. That is more than 10 years
already. A TARGETED INDIVIDUAL (TI) is a person who is PERSECUTED,
TARGETED, BLACKLISTED and MARKED by the [SINGAPORE] GOVERNMENT. Teo En
Ming, who is a Targeted Individual in Singapore, gets MARKED literally
everywhere he goes due to MASSIVE GOVERNMENT INFLUENCE.

Re: [PATCH 1/2] microblaze: fix endian handling

2018-01-13 Thread kbuild test robot
Hi Arnd,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.15-rc7 next-20180112]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Arnd-Bergmann/microblaze-fix-endian-handling/20180105-120705
config: microblaze-mmu_defconfig (attached as .config)
compiler: microblaze-linux-gcc (GCC) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=microblaze

All errors (new ones prefixed by >>):

>> arch/microblaze/lib/fastcopy.S:33:2: error: #error Microblaze LE not support ASM optimized lib func. Disable OPT_LIB_ASM.
#error Microblaze LE not support ASM optimized lib func. Disable OPT_LIB_ASM.
 ^

vim +33 arch/microblaze/lib/fastcopy.S

de93c3c1 Michal Simek 2011-01-28 @33  #error Microblaze LE not support ASM optimized lib func. Disable OPT_LIB_ASM.
de93c3c1 Michal Simek 2011-01-28  34  #endif
de93c3c1 Michal Simek 2011-01-28  35  

:: The code at line 33 was first introduced by commit
:: de93c3c119382cb888ca8a94b642dbcf8035525e microblaze: Fix ASM optimized code for LE

:: TO: Michal Simek 
:: CC: Michal Simek 

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH] kernel:bpf Remove structure passing and assignment to save stack and no coping structures

2018-01-13 Thread Alexei Starovoitov
On Sun, Jan 14, 2018 at 12:03:42AM +0200, Karim Eshapa wrote:
> Pass pointers to structures as function arguments instead of copying
> the structures, to use less stack. Also move TNUM(_v, _m) to the
> tnum.h file so that different files can use it to create anonymous
> structures statically.
> 
> Signed-off-by: Karim Eshapa 
...
> +/* Statically tnum constant */
> +#define TNUM(_v, _m) (struct tnum){.value = _v, .mask = _m}
>  /* Represent a known constant as a tnum. */
>  struct tnum tnum_const(u64 value);
>  /* A completely unknown value */
> @@ -26,7 +28,7 @@ struct tnum tnum_lshift(struct tnum a, u8 shift);
>  /* Shift a tnum right (by a fixed shift) */
>  struct tnum tnum_rshift(struct tnum a, u8 shift);
>  /* Add two tnums, return @a + @b */
> -struct tnum tnum_add(struct tnum a, struct tnum b);
> +void tnum_add(struct tnum *res, struct tnum *a, struct tnum *b);
...
> - reg_off = tnum_add(reg->var_off, tnum_const(ip_align + reg->off + off));
> + tnum_add(&reg_off, &reg->var_off, &TNUM(ip_align + reg->off + off, 0));
>   if (!tnum_is_aligned(reg_off, size)) {
>   char tn_buf[48];
>  
> @@ -1023,8 +1023,7 @@ static int check_generic_ptr_alignment(struct bpf_verifier_env *env,
>   /* Byte size accesses are always allowed. */
>   if (!strict || size == 1)
>   return 0;
> -
> - reg_off = tnum_add(reg->var_off, tnum_const(reg->off + off));
> + tnum_add(&reg_off, &reg->var_off, &TNUM(reg->off + off, 0));
...
> - dst_reg->var_off = tnum_add(ptr_reg->var_off, off_reg->var_off);
> + tnum_add(&dst_reg->var_off, &ptr_reg->var_off,
> + &off_reg->var_off);

I think that looks much worse and is more error prone.
Is it GNU or Intel style of arguments? Where is src and where is dest?
Can the same pointer be used as src and as dst? Etc.
I don't think it saves stack either.
I'd rather leave things as-is.
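
To make the aliasing point concrete, here is a minimal userspace sketch of the by-value interface being defended. It assumes the arithmetic mirrors tnum_add() in kernel/bpf/tnum.c and uses stdint types in place of u64; tnum_demo() is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct tnum: value holds the known
 * bits, mask marks the bits whose value is unknown. */
struct tnum {
	uint64_t value;
	uint64_t mask;
};

/* Same shape as the TNUM() macro quoted in the patch. */
#define TNUM(_v, _m) (struct tnum){.value = (_v), .mask = (_m)}

/* Represent a known constant as a tnum. */
struct tnum tnum_const(uint64_t value)
{
	return TNUM(value, 0);
}

/* Add two tnums by value; the arguments are private copies, so the
 * caller's variables are only changed by assigning the result. */
struct tnum tnum_add(struct tnum a, struct tnum b)
{
	uint64_t sm = a.mask + b.mask;
	uint64_t sv = a.value + b.value;
	uint64_t sigma = sm + sv;
	uint64_t chi = sigma ^ sv;		/* bits a carry may have changed */
	uint64_t mu = chi | a.mask | b.mask;	/* all bits that are unknown */

	return TNUM(sv & ~mu, mu);
}

/* Hypothetical demo: the same variable is both a source and the destination. */
void tnum_demo(void)
{
	struct tnum x = TNUM(0, 1);		/* either 0 or 1 */

	x = tnum_add(x, tnum_const(1));
	assert(x.value == 0 && x.mask == 3);	/* some value in 0..3 */
}
```

Because the operands are copied, `x = tnum_add(x, ...)` is safe with no extra rules, and the destination is visible at the call site. A pointer-based tnum_add(res, a, b) would have to document whether res may alias a or b.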



Re: BUG: unable to handle kernel paging request in check_memory_region

2018-01-13 Thread Daniel Borkmann
On 01/13/2018 08:29 AM, Dmitry Vyukov wrote:
> On Fri, Jan 12, 2018 at 11:58 PM, syzbot
>  wrote:
>> Hello,
>>
>> syzkaller hit the following crash on
>> c92a9a461dff6140c539c61e457aa97df29517d6
>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>> C reproducer is attached
>> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
>> for information about syzkaller reproducers
>>
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+32b24f3e7c9000c48...@syzkaller.appspotmail.com
>> It will help syzbot understand when the bug is fixed. See footer for
>> details.
>> If you forward the report, please keep this part and the footer.
> 
> 
> Daniel, is it the same bug that was fixed by "bpf, array: fix overflow
> in max_entries and undefined behavior in index_mask"?

And also here, fixed by:

https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=bbeb6e4323dad9b5e0ee9f60c223dd532e2403b1
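
For readers who have not opened the commit, the class of problem named in its title can be sketched in userspace as follows. This is an illustration under assumptions, not the commit's code: it assumes index_mask is derived by rounding max_entries up to a power of two, and sketch_fls64(), sketch_index_mask() and the demo function are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of bits needed to represent v (0 for v == 0); a portable
 * stand-in for a find-last-set helper. */
static unsigned int sketch_fls64(uint64_t v)
{
	unsigned int bits = 0;

	while (v) {
		bits++;
		v >>= 1;
	}
	return bits;
}

/*
 * Round max_entries up to a power of two and derive the index mask.
 * The mask is computed in 64 bits so the shift count never reaches the
 * width of the type. Returns false when the rounded-up entry count no
 * longer fits in 32 bits.
 */
bool sketch_index_mask(uint32_t max_entries, uint32_t *index_mask,
		       uint32_t *rounded_entries)
{
	uint64_t mask64;
	uint32_t mask, rounded;

	if (max_entries == 0)
		return false;

	mask64 = (1ULL << sketch_fls64((uint64_t)max_entries - 1)) - 1;
	mask = (uint32_t)mask64;
	rounded = mask + 1;		/* wraps to 0 when mask is 0xffffffff */

	if (rounded < max_entries)	/* overflow: reject the request */
		return false;

	*index_mask = mask;
	*rounded_entries = rounded;
	return true;
}

/* Hypothetical demo of one accepted and one rejected request. */
void sketch_index_mask_demo(void)
{
	uint32_t mask, entries;

	assert(sketch_index_mask(5, &mask, &entries) && mask == 7 && entries == 8);
	assert(!sketch_index_mask(0x80000001u, &mask, &entries));
}
```

The rejected case is the interesting one: any max_entries above 2^31 rounds up to 2^32, which does not fit in a 32-bit count, so a later bounds check against the wrapped value would no longer match the real allocation.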

>> audit: type=1400 audit(1515790631.378:9): avc:  denied  { sys_chroot } for
>> pid=3510 comm="syzkaller602893" capability=18
>> scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
>> tcontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023 tclass=cap_userns
>> permissive=1
>> BUG: unable to handle kernel paging request at ed004e875e33
>> IP: bytes_is_nonzero mm/kasan/kasan.c:166 [inline]
>> IP: memory_is_nonzero mm/kasan/kasan.c:184 [inline]
>> IP: memory_is_poisoned_n mm/kasan/kasan.c:210 [inline]
>> IP: memory_is_poisoned mm/kasan/kasan.c:241 [inline]
>> IP: check_memory_region_inline mm/kasan/kasan.c:257 [inline]
>> IP: check_memory_region+0x61/0x190 mm/kasan/kasan.c:267
>> PGD 21ffee067 P4D 21ffee067 PUD 21ffec067 PMD 0
>> Oops:  [#1] SMP KASAN
>> Dumping ftrace buffer:
>>(ftrace buffer empty)
>> Modules linked in:
>> CPU: 0 PID: 3510 Comm: syzkaller602893 Not tainted 4.15.0-rc7+ #259
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> RIP: 0010:bytes_is_nonzero mm/kasan/kasan.c:166 [inline]
>> RIP: 0010:memory_is_nonzero mm/kasan/kasan.c:184 [inline]
>> RIP: 0010:memory_is_poisoned_n mm/kasan/kasan.c:210 [inline]
>> RIP: 0010:memory_is_poisoned mm/kasan/kasan.c:241 [inline]
>> RIP: 0010:check_memory_region_inline mm/kasan/kasan.c:257 [inline]
>> RIP: 0010:check_memory_region+0x61/0x190 mm/kasan/kasan.c:267
>> RSP: 0018:8801bfa0 EFLAGS: 00010202
>> RAX: ed004e875e33 RBX: 8802743af19b RCX: 817deb1c
>> RDX:  RSI: 0004 RDI: 8802743af198
>> RBP: 8801bfa77780 R08: 11004e875e33 R09: ed004e875e33
>> R10: 0001 R11: ed004e875e33 R12: ed004e875e34
>> R13: 8802743af198 R14: 8801bfc9f000 R15: 8801c135a680
>> FS:  01a1d880() GS:8801db20() knlGS:
>> CS:  0010 DS:  ES:  CR0: 80050033
>> CR2: ed004e875e33 CR3: 0001bfe22003 CR4: 001606f0
>> DR0:  DR1:  DR2: 
>> DR3:  DR6: fffe0ff0 DR7: 0400
>> Call Trace:
>>  memcpy+0x23/0x50 mm/kasan/kasan.c:302
>>  memcpy include/linux/string.h:344 [inline]
>>  map_lookup_elem+0x4dc/0xbd0 kernel/bpf/syscall.c:584
>>  SYSC_bpf kernel/bpf/syscall.c:1711 [inline]
>>  SyS_bpf+0x922/0x4400 kernel/bpf/syscall.c:1685
>>  entry_SYSCALL_64_fastpath+0x23/0x9a
>> RIP: 0033:0x440ac9
>> RSP: 002b:007dff68 EFLAGS: 0203 ORIG_RAX: 0141
>> RAX: ffda RBX:  RCX: 00440ac9
>> RDX: 0018 RSI: 20eed000 RDI: 0001
>> RBP:  R08:  R09: 
>> R10:  R11: 0203 R12: 004022a0
>> R13: 00402330 R14:  R15: 
>> Code: 89 f8 49 c1 e8 03 49 89 db 49 c1 eb 03 4d 01 cb 4d 01 c1 4d 8d 63 01
>> 4c 89 c8 4d 89 e2 4d 29 ca 49 83 fa 10 7f 3d 4d 85 d2 74 33 <41> 80 39 00 75
>> 21 48 b8 01 00 00 00 00 fc ff df 4d 01 d1 49 01
>> RIP: bytes_is_nonzero mm/kasan/kasan.c:166 [inline] RSP: 8801bfa0
>> RIP: memory_is_nonzero mm/kasan/kasan.c:184 [inline] RSP: 8801bfa0
>> RIP: memory_is_poisoned_n mm/kasan/kasan.c:210 [inline] RSP:
>> 8801bfa0
>> RIP: memory_is_poisoned mm/kasan/kasan.c:241 [inline] RSP: 8801bfa0
>> RIP: check_memory_region_inline mm/kasan/kasan.c:257 [inline] RSP:
>> 8801bfa0
>> RIP: check_memory_region+0x61/0x190 mm/kasan/kasan.c:267 RSP:
>> 8801bfa0
>> CR2: ed004e875e33
>> ---[ end trace 769bd3705f3abe78 ]---
>> Kernel panic - not syncing: Fatal exception
>> Dumping ftrace buffer:
>>(ftrace buffer empty)
>> Kernel Offset: disabled
>> Rebooting in 86400 seconds..
>>
>>
>> ---
>> This bug is generated by a dumb bot. It may contain errors.
>> See https://goo.gl/tpsmEJ for details.
>

Re: general protection fault in __bpf_map_put

2018-01-13 Thread Daniel Borkmann
On 01/13/2018 08:16 AM, Dmitry Vyukov wrote:
> On Wed, Jan 10, 2018 at 1:58 PM, syzbot
>  wrote:
>> Hello,
>>
>> syzkaller hit the following crash on
>> b4464bcab38d3f7fe995a7cb960eeac6889bec08
>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>> C reproducer is attached
>> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
>> for information about syzkaller reproducers
>>
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+d2f5524fb46fd3b31...@syzkaller.appspotmail.com
>> It will help syzbot understand when the bug is fixed. See footer for
>> details.
>> If you forward the report, please keep this part and the footer.
> 
> Daniel, is it the same bug that was fixed by "bpf, array: fix overflow
> in max_entries and undefined behavior in index_mask"?

Yes, fixed by:

https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=bbeb6e4323dad9b5e0ee9f60c223dd532e2403b1

>> audit: type=1400 audit(1515571663.627:11): avc:  denied  { map_read
>> map_write } for  pid=3537 comm="syzkaller597104"
>> scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
>> tcontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023 tclass=bpf
>> permissive=1
>> kasan: CONFIG_KASAN_INLINE enabled
>> kasan: GPF could be caused by NULL-ptr deref or user memory access
>> general protection fault:  [#1] SMP KASAN
>> Dumping ftrace buffer:
>>(ftrace buffer empty)
>> Modules linked in:
>> CPU: 1 PID: 23 Comm: kworker/1:1 Not tainted 4.15.0-rc7-next-20180110+ #93
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> Workqueue: events bpf_map_free_deferred
>> RIP: 0010:__bpf_map_put+0x64/0x2e0 kernel/bpf/syscall.c:233
>> RSP: 0018:8801d98b7458 EFLAGS: 00010293
>> RAX: 8801d98ac600 RBX: ad6001bc0dd1 RCX: 817e4454
>> RDX:  RSI: 0001 RDI: ad6001bc0dd1
>> RBP: 8801d98b74e8 R08: 11003b316e6a R09: 
>> R10:  R11:  R12: 11003b316e8c
>> R13: dc00 R14: dc00 R15: 0001
>> FS:  () GS:8801db30() knlGS:
>> CS:  0010 DS:  ES:  CR0: 80050033
>> CR2: 204f9fe4 CR3: 06822003 CR4: 001606e0
>> DR0:  DR1:  DR2: 
>> DR3:  DR6: fffe0ff0 DR7: 0400
>> Call Trace:
>>  bpf_map_put+0x1a/0x20 kernel/bpf/syscall.c:243
>>  bpf_map_fd_put_ptr+0x15/0x20 kernel/bpf/map_in_map.c:96
>>  fd_array_map_delete_elem kernel/bpf/arraymap.c:420 [inline]
>>  bpf_fd_array_map_clear kernel/bpf/arraymap.c:461 [inline]
>>  array_of_map_free+0x100/0x180 kernel/bpf/arraymap.c:618
>>  bpf_map_free_deferred+0xb0/0xe0 kernel/bpf/syscall.c:217
>>  process_one_work+0xbbf/0x1af0 kernel/workqueue.c:2112
>>  worker_thread+0x223/0x1990 kernel/workqueue.c:2246
>>  kthread+0x33c/0x400 kernel/kthread.c:238
>>  ret_from_fork+0x4b/0x60 arch/x86/entry/entry_64.S:547
>> Code: b5 41 48 c7 45 80 1d 2c 59 86 48 c7 45 88 f0 43 7e 81 c7 00 f1 f1 f1
>> f1 c7 40 04 00 f2 f2 f2 c7 40 08 f3 f3 f3 f3 e8 9c 1e f2 ff  ff 4b 48 74
>> 2f e8 91 1e f2 ff 48 b8 00 00 00 00 00 fc ff df
>> RIP: __bpf_map_put+0x64/0x2e0 kernel/bpf/syscall.c:233 RSP: 8801d98b7458
>> ---[ end trace 61592f27aaa1e096 ]---
>> Kernel panic - not syncing: Fatal exception
>> Dumping ftrace buffer:
>>(ftrace buffer empty)
>> Kernel Offset: disabled
>> Rebooting in 86400 seconds..
>>
>>
>> ---
>> This bug is generated by a dumb bot. It may contain errors.
>> See https://goo.gl/tpsmEJ for details.
>> Direct all questions to syzkal...@googlegroups.com.
>>
>> syzbot will keep track of this bug report.
>> If you forgot to add the Reported-by tag, once the fix for this bug is
>> merged
>> into any tree, please reply to this email with:
>> #syz fix: exact-commit-title
>> If you want to test a patch for this bug, please reply with:
>> #syz test: git://repo/address.git branch
>> and provide the patch inline or as an attachment.
>> To mark this as a duplicate of another syzbot report, please reply with:
>> #syz dup: exact-subject-of-another-report
>> If it's a one-off invalid bug report, please reply with:
>> #syz invalid
>> Note: if the crash happens again, it will cause creation of a new bug
>> report.
>> Note: all commands must start from beginning of the line in the email body.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "syzkaller-bugs" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to syzkaller-bugs+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/syzkaller-bugs/94eb2c06de30df3d1605626b941b%40google.com.
>> For more options, visit https://groups.goog

Re: divide error in ___bpf_prog_run

2018-01-13 Thread Daniel Borkmann
On 01/13/2018 02:58 AM, syzbot wrote:
> Hello,
> 
> syzkaller hit the following crash on 19d28fbd306e7ae7c1acf05c3e6968b56f0d196b
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> C reproducer is attached
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers

Fixed by:

http://patchwork.ozlabs.org/patch/860270/
http://patchwork.ozlabs.org/patch/860275/

Will get them in as soon as DaveM has pulled the current batch into net.

> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+48340bb518e88849e...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for details.
> If you forward the report, please keep this part and the footer.
> 
> divide error:  [#1] SMP KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 0 PID: 3501 Comm: syzkaller702501 Not tainted 4.15.0-rc7+ #185
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> RIP: 0010:___bpf_prog_run+0x3cc7/0x6100 kernel/bpf/core.c:976
> RSP: 0018:8801c7927200 EFLAGS: 00010246
> RAX:  RBX: dc00 RCX: 
> RDX:  RSI: c9002030 RDI: c9002049
> RBP: 8801c7927308 R08: 110038f24dd9 R09: 0002
> R10: 8801c7927388 R11:  R12: 8801c7927340
> R13: c9002048 R14: 8801c7927340 R15: fffc
> FS:  02255880() GS:8801db20() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 20fd3000 CR3: 0001c2284004 CR4: 001606f0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  __bpf_prog_run160+0xde/0x150 kernel/bpf/core.c:1346
>  bpf_prog_run_save_cb include/linux/filter.h:556 [inline]
>  sk_filter_trim_cap+0x33c/0x9c0 net/core/filter.c:103
>  sk_filter include/linux/filter.h:685 [inline]
>  netlink_unicast+0x1b8/0x6b0 net/netlink/af_netlink.c:1336
>  nlmsg_unicast include/net/netlink.h:608 [inline]
>  rtnl_unicast net/core/rtnetlink.c:700 [inline]
>  rtnl_stats_get+0x7bb/0xa10 net/core/rtnetlink.c:4363
>  rtnetlink_rcv_msg+0x57f/0xb10 net/core/rtnetlink.c:4530
>  netlink_rcv_skb+0x224/0x470 net/netlink/af_netlink.c:2441
>  rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4548
>  netlink_unicast_kernel net/netlink/af_netlink.c:1308 [inline]
>  netlink_unicast+0x4c4/0x6b0 net/netlink/af_netlink.c:1334
>  netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1897
>  sock_sendmsg_nosec net/socket.c:630 [inline]
>  sock_sendmsg+0xca/0x110 net/socket.c:640
>  sock_write_iter+0x31a/0x5d0 net/socket.c:909
>  call_write_iter include/linux/fs.h:1772 [inline]
>  new_sync_write fs/read_write.c:469 [inline]
>  __vfs_write+0x684/0x970 fs/read_write.c:482
>  vfs_write+0x189/0x510 fs/read_write.c:544
>  SYSC_write fs/read_write.c:589 [inline]
>  SyS_write+0xef/0x220 fs/read_write.c:581
>  entry_SYSCALL_64_fastpath+0x23/0x9a
> RIP: 0033:0x43ffc9
> RSP: 002b:7ffe602ec9f8 EFLAGS: 0217 ORIG_RAX: 0001
> RAX: ffda RBX:  RCX: 0043ffc9
> RDX: 0026 RSI: 20fd3000 RDI: 0004
> RBP: 006ca018 R08:  R09: 
> R10: 0004 R11: 0217 R12: 00401930
> R13: 004019c0 R14:  R15: 
> Code: 89 85 58 ff ff ff 41 0f b6 55 01 c0 ea 04 0f b6 d2 4d 8d 34 d4 4c 89 f2 
> 48 c1 ea 03 80 3c 1a 00 0f 85 ee 1e 00 00 41 8b 0e 31 d2 <48> f7 f1 48 89 85 
> 58 ff ff ff 41 0f b6 45 01 83 e0 0f 4d 8d 34
> RIP: ___bpf_prog_run+0x3cc7/0x6100 kernel/bpf/core.c:976 RSP: 8801c7927200
> ---[ end trace 274313e5f69f4eff ]---
> 
> 
> ---
> This bug is generated by a dumb bot. It may contain errors.
> See https://goo.gl/tpsmEJ for details.
> Direct all questions to syzkal...@googlegroups.com.
> 
> syzbot will keep track of this bug report.
> If you forgot to add the Reported-by tag, once the fix for this bug is merged
> into any tree, please reply to this email with:
> #syz fix: exact-commit-title
> If you want to test a patch for this bug, please reply with:
> #syz test: git://repo/address.git branch
> and provide the patch inline or as an attachment.
> To mark this as a duplicate of another syzbot report, please reply with:
> #syz dup: exact-subject-of-another-report
> If it's a one-off invalid bug report, please reply with:
> #syz invalid
> Note: if the crash happens again, it will cause creation of a new bug report.
> Note: all commands must start from beginning of the line in the email body.



Re: KASAN: slab-out-of-bounds Read in map_lookup_elem

2018-01-13 Thread Daniel Borkmann
On 01/13/2018 02:58 AM, syzbot wrote:
> Hello,
> 
> syzkaller hit the following crash on 19d28fbd306e7ae7c1acf05c3e6968b56f0d196b
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> C reproducer is attached
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers

Fixed here as well:

https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=bbeb6e4323dad9b5e0ee9f60c223dd532e2403b1

> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+e631b5eb810eae085...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for details.
> If you forward the report, please keep this part and the footer.
> 
> audit: type=1400 audit(1515782899.456:8): avc:  denied  { sys_admin } for  
> pid=3501 comm="syzkaller937663" capability=21  
> scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023 
> tcontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023 tclass=cap_userns 
> permissive=1
> audit: type=1400 audit(1515782899.512:9): avc:  denied  { sys_chroot } for  
> pid=3502 comm="syzkaller937663" capability=18  
> scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023 
> tcontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023 tclass=cap_userns 
> permissive=1
> ==
> BUG: KASAN: slab-out-of-bounds in memcpy include/linux/string.h:344 [inline]
> BUG: KASAN: slab-out-of-bounds in map_lookup_elem+0x4dc/0xbd0 
> kernel/bpf/syscall.c:584
> Read of size 2097153 at addr 8801bfc7e690 by task syzkaller937663/3502
> 
> CPU: 0 PID: 3502 Comm: syzkaller937663 Not tainted 4.15.0-rc7+ #185
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x257 lib/dump_stack.c:53
>  print_address_description+0x73/0x250 mm/kasan/report.c:252
>  kasan_report_error mm/kasan/report.c:351 [inline]
>  kasan_report+0x25b/0x340 mm/kasan/report.c:409
>  check_memory_region_inline mm/kasan/kasan.c:260 [inline]
>  check_memory_region+0x137/0x190 mm/kasan/kasan.c:267
>  memcpy+0x23/0x50 mm/kasan/kasan.c:302
>  memcpy include/linux/string.h:344 [inline]
>  map_lookup_elem+0x4dc/0xbd0 kernel/bpf/syscall.c:584
>  SYSC_bpf kernel/bpf/syscall.c:1808 [inline]
>  SyS_bpf+0x922/0x4400 kernel/bpf/syscall.c:1782
>  entry_SYSCALL_64_fastpath+0x23/0x9a
> RIP: 0033:0x440ab9
> RSP: 002b:007dff68 EFLAGS: 0203 ORIG_RAX: 0141
> RAX: ffda RBX: 7fffc494ea60 RCX: 00440ab9
> RDX: 0018 RSI: 20eab000 RDI: 0001
> RBP:  R08:  R09: 
> R10:  R11: 0203 R12: 00402290
> R13: 00402320 R14:  R15: 
> 
> Allocated by task 3502:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>  set_track mm/kasan/kasan.c:459 [inline]
>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
>  __do_kmalloc_node mm/slab.c:3672 [inline]
>  __kmalloc_node+0x47/0x70 mm/slab.c:3679
>  kmalloc_node include/linux/slab.h:541 [inline]
>  bpf_map_area_alloc+0x32/0x80 kernel/bpf/syscall.c:123
>  array_map_alloc+0x351/0xa00 kernel/bpf/arraymap.c:96
>  find_and_alloc_map kernel/bpf/syscall.c:105 [inline]
>  map_create kernel/bpf/syscall.c:404 [inline]
>  SYSC_bpf kernel/bpf/syscall.c:1805 [inline]
>  SyS_bpf+0x7f8/0x4400 kernel/bpf/syscall.c:1782
>  entry_SYSCALL_64_fastpath+0x23/0x9a
> 
> Freed by task 1966:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>  set_track mm/kasan/kasan.c:459 [inline]
>  kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
>  __cache_free mm/slab.c:3488 [inline]
>  kfree+0xd6/0x260 mm/slab.c:3803
>  seq_release fs/seq_file.c:366 [inline]
>  single_release+0x80/0xb0 fs/seq_file.c:602
>  __fput+0x327/0x7e0 fs/file_table.c:210
>  fput+0x15/0x20 fs/file_table.c:244
>  task_work_run+0x199/0x270 kernel/task_work.c:113
>  tracehook_notify_resume include/linux/tracehook.h:191 [inline]
>  exit_to_usermode_loop+0x296/0x310 arch/x86/entry/common.c:162
>  prepare_exit_to_usermode arch/x86/entry/common.c:195 [inline]
>  syscall_return_slowpath+0x490/0x550 arch/x86/entry/common.c:264
>  entry_SYSCALL_64_fastpath+0x98/0x9a
> 
> The buggy address belongs to the object at 8801bfc7e5c0
>  which belongs to the cache kmalloc-256 of size 256
> The buggy address is located 208 bytes inside of
>  256-byte region [8801bfc7e5c0, 8801bfc7e6c0)
> The buggy address belongs to the page:
> page:ea0006ff1f80 count:1 mapcount:0 mapping:8801bfc7e0c0 index:0x0
> flags: 0x2fffc000100(slab)
> raw: 02fffc000100 8801bfc7e0c0  0001000c
> raw: ea00070149e0 ea0006ff2be0 8801dac007c0 
> page dumped because: kasan: bad access detected

[PATCH] input: Add driver for USB ELAN Touchpad

2018-01-13 Thread Alexandrov Stansilav
This is a driver for the USB touchpad found on the HP Pavilion x2 10-p0xx
laptop. On this device the keyboard and touchpad are connected as a single
USB device with two interfaces: the keyboard, which exposes ordinary keys,
and the touchpad, which also carries the FlightMode button and the audio
mute LED (which is physically placed on the keyboard for some reason).

Initially, the touchpad works in mouse emulation mode. This driver
switches it to touchpad mode, which can track 5 fingers and report
coordinates for two of them.

Signed-off-by: Alexandrov Stansilav 
---
 drivers/hid/Kconfig  |   8 +
 drivers/hid/Makefile |   1 +
 drivers/hid/hid-elan.c   | 421 +++
 drivers/hid/hid-ids.h|   1 +
 drivers/hid/hid-quirks.c |   3 +
 5 files changed, 434 insertions(+)
 create mode 100644 drivers/hid/hid-elan.c

diff --git a/drivers/hid/Kconfig b/drivers/hid/Kconfig
index 9058dbc4d..8452d1bc9 100644
--- a/drivers/hid/Kconfig
+++ b/drivers/hid/Kconfig
@@ -274,6 +274,14 @@ config HID_EMS_FF
Currently the following devices are known to be supported:
 - Trio Linker Plus II
 
+config HID_ELAN
+   tristate "ELAN USB Touchpad Support"
+   depends on LEDS_CLASS && USB_HID
+   ---help---
+   Say Y to enable support for the USB ELAN touchpad
+   Currently the following devices are known to be supported:
+- HP Pavilion X2 10-p0XX.
+
 config HID_ELECOM
tristate "ELECOM HID devices"
depends on HID
diff --git a/drivers/hid/Makefile b/drivers/hid/Makefile
index eb13b9e92..713601c7b 100644
--- a/drivers/hid/Makefile
+++ b/drivers/hid/Makefile
@@ -39,6 +39,7 @@ obj-$(CONFIG_HID_CP2112)  += hid-cp2112.o
 obj-$(CONFIG_HID_CYPRESS)  += hid-cypress.o
 obj-$(CONFIG_HID_DRAGONRISE)   += hid-dr.o
 obj-$(CONFIG_HID_EMS_FF)   += hid-emsff.o
+obj-$(CONFIG_HID_ELAN) += hid-elan.o
 obj-$(CONFIG_HID_ELECOM)   += hid-elecom.o
 obj-$(CONFIG_HID_ELO)  += hid-elo.o
 obj-$(CONFIG_HID_EZKEY)+= hid-ezkey.o
diff --git a/drivers/hid/hid-elan.c b/drivers/hid/hid-elan.c
new file mode 100644
index 0..803a72578
--- /dev/null
+++ b/drivers/hid/hid-elan.c
@@ -0,0 +1,421 @@
+/*
+ * HID Driver for ELAN Touchpad
+ *
+ * Currently only supports touchpad found on HP Pavilion X2 10
+ *
+ * Copyright (c) 2016 Alexandrov Stanislav 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "hid-ids.h"
+
+#define ELAN_SINGLE_FINGER 0x81
+#define ELAN_MT_FIRST_FINGER   0x82
+#define ELAN_MT_SECOND_FINGER  0x83
+#define ELAN_INPUT_REPORT_SIZE 8
+
+#define ELAN_MUTE_LED_REPORT   0xBC
+#define ELAN_LED_REPORT_SIZE   8
+
+struct elan_touchpad_settings {
+   u8 max_fingers;
+   u16 max_x;
+   u16 max_y;
+   u8 max_area_x;
+   u8 max_area_y;
+   u8 max_w;
+   int usb_bInterfaceNumber;
+};
+
+struct elan_drvdata {
+   struct input_dev *input;
+   u8 prev_report[ELAN_INPUT_REPORT_SIZE];
+   struct led_classdev mute_led;
+   u8 mute_led_state;
+   struct elan_touchpad_settings *settings;
+};
+
+static int is_not_elan_touchpad(struct hid_device *hdev)
+{
+   struct usb_interface *intf = to_usb_interface(hdev->dev.parent);
+   struct elan_drvdata *drvdata = hid_get_drvdata(hdev);
+
+   return (intf->altsetting->desc.bInterfaceNumber != drvdata->settings->usb_bInterfaceNumber);
+}
+
+static int elan_input_mapping(struct hid_device *hdev, struct hid_input *hi,
+ struct hid_field *field, struct hid_usage *usage,
+ unsigned long **bit, int *max)
+{
+   if (is_not_elan_touchpad(hdev))
+   return 0;
+
+   if (field->report->id == ELAN_SINGLE_FINGER ||
+   field->report->id == ELAN_MT_FIRST_FINGER ||
+   field->report->id == ELAN_MT_SECOND_FINGER)
+   return -1;
+
+   return 0;
+}
+
+static int elan_input_configured(struct hid_device *hdev, struct hid_input *hi)
+{
+   int ret;
+   struct input_dev *input;
+   struct elan_drvdata *drvdata = hid_get_drvdata(hdev);
+
+   if (is_not_elan_touchpad(hdev))
+   return 0;
+
+   input = devm_input_allocate_device(&hdev->dev);
+   if (!input)
+   return -ENOMEM;
+
+   input->name = "Elan Touchpad";
+   input->phys = hdev->phys;
+   input->uniq = hdev->uniq;
+   input->id.bustype = hdev->bus;
+   input->id.vendor  = hdev->vendor;
+   input->id.product = hdev->product;
+   input->id.version = hdev->version;
+   input->dev.parent = &hdev->dev;
+
+   input_set_abs_params(input, ABS_MT_POSITION_X, 0,
+drvdata->settings->max_x, 0, 0);
+   input_

[PATCH v2] x86/retpoline: Add LFENCE to the retpoline/RSB filling RSB macros

2018-01-13 Thread Tom Lendacky
The PAUSE instruction is currently used in the retpoline and RSB filling
macros as a speculation trap.  The use of PAUSE was originally suggested
because it showed a very, very small difference in the amount of
cycles/time used to execute the retpoline as compared to LFENCE.  On AMD,
the PAUSE instruction is not a serializing instruction, so the pause/jmp
loop will use excess power as it is speculated over waiting for return
to mispredict to the correct target.

The RSB filling macro is applicable to AMD, and, if software is unable to
verify that LFENCE is serializing on AMD (possible when running under a
hypervisor), the generic retpoline support will be used and, so, is also
applicable to AMD.  Keep the current usage of PAUSE for Intel, but add an
LFENCE instruction to the speculation trap for AMD.

Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/nospec-branch.h |6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 402a11c..7b45d84 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -11,7 +11,7 @@
  * Fill the CPU return stack buffer.
  *
  * Each entry in the RSB, if used for a speculative 'ret', contains an
- * infinite 'pause; jmp' loop to capture speculative execution.
+ * infinite 'pause; lfence; jmp' loop to capture speculative execution.
  *
  * This is required in various cases for retpoline and IBRS-based
  * mitigations for the Spectre variant 2 vulnerability. Sometimes to
@@ -38,11 +38,13 @@
call772f;   \
 773:   /* speculation trap */  \
pause;  \
+   lfence; \
jmp 773b;   \
 772:   \
call774f;   \
 775:   /* speculation trap */  \
pause;  \
+   lfence; \
jmp 775b;   \
 774:   \
dec reg;\
@@ -73,6 +75,7 @@
call.Ldo_rop_\@
 .Lspec_trap_\@:
pause
+   lfence
jmp .Lspec_trap_\@
 .Ldo_rop_\@:
mov \reg, (%_ASM_SP)
@@ -165,6 +168,7 @@
"   .align 16\n"\
"901:   call   903f;\n" \
"902:   pause;\n"   \
+   "   lfence;\n"  \
"   jmp902b;\n" \
"   .align 16\n"\
"903:   addl   $4, %%esp;\n"\



[PATCH] x86/pti: Fix !PCID and sanitize defines

2018-01-13 Thread Thomas Gleixner
The switch to the user space page tables in the low level ASM code
unconditionally sets bit 12 and bit 11 of CR3. Bit 12 switches the base
address of the page directory to the user part; bit 11 switches the
PCID to the PCID associated with the user page tables.

This fails on a machine which lacks PCID support because bit 11 is set in
CR3. Bit 11 is reserved when PCID is inactive.

While the Intel SDM claims that the reserved bits are ignored when PCID is
disabled, the AMD APM states that they should be cleared.

This went unnoticed as the AMD APM was not checked when the code was
developed and reviewed and test systems with Intel CPUs never failed to
boot. The report is against a CentOS 6 host where the guest fails to boot,
so it's not yet clear whether this is a virt issue or can happen on real
hardware too, but that's irrelevant as the AMD APM clearly asks for clearing
the reserved bits.

Make sure that on non PCID machines bit 11 is not set by the page table
switching code.

Andy suggested renaming the related bits and masks so they clearly
describe what they should be used for, which is done as well for clarity.

That split could have been done with alternatives but the macro hell is
horrible and ugly. This can be done on top if someone cares to remove the
extra orq. For now it's a straightforward fix.
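
The bit handling described above can be sketched in plain C. This is only
an illustration, not the entry code from the patch: user_cr3() and both
macro names are hypothetical, and the sketch assumes nothing beyond the
bit positions given in this message (bit 12 for the page table half,
bit 11 for the user PCID).

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as stated in the commit message; the names are made up. */
#define SKETCH_PGTABLE_SWITCH_BIT 12 /* selects the user half of the 8k PGD */
#define SKETCH_PCID_USER_BIT      11 /* selects the PCID of the user tables */

/*
 * Hypothetical helper: derive the CR3 value for the user page tables.
 * Bit 11 is reserved when PCID is inactive, so it is only set when the
 * caller says PCID is available.
 */
uint64_t user_cr3(uint64_t kernel_cr3, bool have_pcid)
{
	uint64_t cr3 = kernel_cr3 | (UINT64_C(1) << SKETCH_PGTABLE_SWITCH_BIT);

	if (have_pcid)
		cr3 |= UINT64_C(1) << SKETCH_PCID_USER_BIT;

	return cr3;
}
```

On a machine without PCID the helper only flips bit 12, which is the
behaviour this fix asks of the page table switching code.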

Fixes: 6fd166aae78c ("x86/mm: Use/Fix PCID to optimize user/kernel switches")
Reported-by: Laura Abbott 
Signed-off-by: Thomas Gleixner 
Cc: Andy Lutomirski 
Cc: Willy Tarreau 
Cc: Peter Zijlstra 
Cc: Borislav Petkov 
Cc: sta...@vger.kernel.org

---
 arch/x86/entry/calling.h   |   36 +
 arch/x86/include/asm/processor-flags.h |2 -
 arch/x86/include/asm/tlbflush.h|6 ++---
 3 files changed, 23 insertions(+), 21 deletions(-)

--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -198,8 +198,11 @@ For 32-bit we have the following convent
  * PAGE_TABLE_ISOLATION PGDs are 8k.  Flip bit 12 to switch between the two
  * halves:
  */
-#define PTI_SWITCH_PGTABLES_MASK   (1<= (1 << X86_CR3_PTI_SWITCH_BIT));
+   BUILD_BUG_ON(TLB_NR_DYN_ASIDS >= (1 << X86_CR3_PTI_PCID_USER_BIT));
 
/*
 * The ASID being passed in here should have respected the
 * MAX_ASID_AVAILABLE and thus never have the switch bit set.
 */
-   VM_WARN_ON_ONCE(asid & (1 << X86_CR3_PTI_SWITCH_BIT));
+   VM_WARN_ON_ONCE(asid & (1 << X86_CR3_PTI_PCID_USER_BIT));
 #endif
/*
 * The dynamically-assigned ASIDs that get passed in are small
@@ -112,7 +112,7 @@ static inline u16 user_pcid(u16 asid)
 {
u16 ret = kern_pcid(asid);
 #ifdef CONFIG_PAGE_TABLE_ISOLATION
-   ret |= 1 << X86_CR3_PTI_SWITCH_BIT;
+   ret |= 1 << X86_CR3_PTI_PCID_USER_BIT;
 #endif
return ret;
 }


[PATCH v5 05/14] nubus: Validate slot resource IDs

2018-01-13 Thread Finn Thain
While we are here, include the slot number in the related error messages.

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 drivers/nubus/nubus.c | 26 --
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/drivers/nubus/nubus.c b/drivers/nubus/nubus.c
index ef3a115920ca..e7c7e49a074a 100644
--- a/drivers/nubus/nubus.c
+++ b/drivers/nubus/nubus.c
@@ -616,7 +616,8 @@ static int __init nubus_get_board_resource(struct nubus_board *board, int slot,
nbtdata[0], nbtdata[1], nbtdata[2], nbtdata[3]);
if (nbtdata[0] != 1 || nbtdata[1] != 0 ||
nbtdata[2] != 0 || nbtdata[3] != 0)
-   pr_err("this sResource is not a board resource!\n");
+   pr_err("Slot %X: sResource is not a board resource!\n",
+  slot);
break;
}
case NUBUS_RESID_NAME:
@@ -672,6 +673,7 @@ static struct nubus_board * __init nubus_add_board(int 
slot, int bytelanes)
unsigned long dpat;
struct nubus_dir dir;
struct nubus_dirent ent;
+   int prev_resid = -1;
 
/* Move to the start of the format block */
rp = nubus_rom_addr(slot);
@@ -711,10 +713,10 @@ static struct nubus_board * __init nubus_add_board(int 
slot, int bytelanes)
 
/* Directory offset should be small and negative... */
if (!(board->doffset & 0x00FF))
-   pr_warn("Dodgy doffset!\n");
+   pr_warn("Slot %X: Dodgy doffset!\n", slot);
dpat = nubus_get_rom(&rp, 4, bytelanes);
if (dpat != NUBUS_TEST_PATTERN)
-   pr_warn("Wrong test pattern %08lx!\n", dpat);
+   pr_warn("Slot %X: Wrong test pattern %08lx!\n", slot, dpat);
 
/*
 *  I wonder how the CRC is meant to work -
@@ -740,12 +742,15 @@ static struct nubus_board * __init nubus_add_board(int 
slot, int bytelanes)
   for each of them. */
if (nubus_readdir(&dir, &ent) == -1) {
/* We can't have this! */
-   pr_err("Board resource not found!\n");
+   pr_err("Slot %X: Board resource not found!\n", slot);
return NULL;
-   } else {
-   nubus_get_board_resource(board, slot, &ent);
}
 
+   if (ent.type < 1 || ent.type > 127)
+   pr_warn("Slot %X: Board resource ID is invalid!\n", slot);
+
+   nubus_get_board_resource(board, slot, &ent);
+
while (nubus_readdir(&dir, &ent) != -1) {
struct nubus_dev *dev;
struct nubus_dev **devp;
@@ -754,6 +759,15 @@ static struct nubus_board * __init nubus_add_board(int 
slot, int bytelanes)
if (dev == NULL)
continue;
 
+   /* Resources should appear in ascending ID order. This sanity
+* check prevents duplicate resource IDs.
+*/
+   if (dev->resid <= prev_resid) {
+   kfree(dev);
+   continue;
+   }
+   prev_resid = dev->resid;
+
/* We zeroed this out above */
if (board->first_dev == NULL)
board->first_dev = dev;
-- 
2.13.6
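
The ordering check added in the last hunk can be tried outside the kernel
with a small stand-alone sketch. It is an illustration only:
filter_ascending() is a hypothetical helper that works on a plain array
of IDs instead of struct nubus_dev, and it assumes, as the patch comment
says, that resources should appear in ascending ID order.

```c
#include <stddef.h>

/*
 * Hypothetical helper: copy to 'out' only those IDs that are strictly
 * greater than the last accepted ID, and return how many were kept.
 * Duplicates and out-of-order entries are skipped, mirroring the
 * "dev->resid <= prev_resid" test in the patch.
 */
size_t filter_ascending(const unsigned char *ids, size_t n,
			unsigned char *out)
{
	int prev = -1;	/* lower than any valid ID, as in the patch */
	size_t kept = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (ids[i] <= prev)
			continue;	/* duplicate or descending ID */
		prev = ids[i];
		out[kept++] = ids[i];
	}
	return kept;
}
```

Starting from -1 means the first entry is always accepted, whatever its
ID is.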



[PATCH v5 04/14] nubus: Fix log spam

2018-01-13 Thread Finn Thain
Testing shows that a single Radius PrecisionColor 24X display board,
which has 95 functional resources, produces over a thousand lines of
log messages. Suppress these messages with pr_debug().
Remove some redundant messages relating to nubus_get_subdir() calls.
Fix the format block debug messages as the sequence of entries is
backwards (my bad).
Move the "scanning slots" message to its proper location.

Fixes: 71ae40e4cf33 ("nubus: Clean up printk calls")
Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 drivers/nubus/nubus.c | 129 ++
 1 file changed, 56 insertions(+), 73 deletions(-)

diff --git a/drivers/nubus/nubus.c b/drivers/nubus/nubus.c
index 35056cee94b1..ef3a115920ca 100644
--- a/drivers/nubus/nubus.c
+++ b/drivers/nubus/nubus.c
@@ -353,15 +353,15 @@ static int __init nubus_show_display_resource(struct 
nubus_dev *dev,
 {
switch (ent->type) {
case NUBUS_RESID_GAMMADIR:
-   pr_info("gamma directory offset: 0x%06x\n", ent->data);
+   pr_debug("gamma directory offset: 0x%06x\n", ent->data);
break;
case 0x0080 ... 0x0085:
-   pr_info("mode %02X info offset: 0x%06x\n",
-  ent->type, ent->data);
+   pr_debug("mode 0x%02x info offset: 0x%06x\n",
+   ent->type, ent->data);
break;
default:
-   pr_info("unknown resource %02X, data 0x%06x\n",
-  ent->type, ent->data);
+   pr_debug("unknown resource 0x%02x, data 0x%06x\n",
+   ent->type, ent->data);
}
return 0;
 }
@@ -375,12 +375,12 @@ static int __init nubus_show_network_resource(struct 
nubus_dev *dev,
char addr[6];
 
nubus_get_rsrc_mem(addr, ent, 6);
-   pr_info("MAC address: %pM\n", addr);
+   pr_debug("MAC address: %pM\n", addr);
break;
}
default:
-   pr_info("unknown resource %02X, data 0x%06x\n",
-  ent->type, ent->data);
+   pr_debug("unknown resource 0x%02x, data 0x%06x\n",
+   ent->type, ent->data);
}
return 0;
 }
@@ -394,8 +394,8 @@ static int __init nubus_show_cpu_resource(struct nubus_dev 
*dev,
unsigned long meminfo[2];
 
nubus_get_rsrc_mem(&meminfo, ent, 8);
-   pr_info("memory: [ 0x%08lx 0x%08lx ]\n",
-  meminfo[0], meminfo[1]);
+   pr_debug("memory: [ 0x%08lx 0x%08lx ]\n",
+   meminfo[0], meminfo[1]);
break;
}
case NUBUS_RESID_ROMINFO:
@@ -403,13 +403,13 @@ static int __init nubus_show_cpu_resource(struct 
nubus_dev *dev,
unsigned long rominfo[2];
 
nubus_get_rsrc_mem(&rominfo, ent, 8);
-   pr_info("ROM:[ 0x%08lx 0x%08lx ]\n",
-  rominfo[0], rominfo[1]);
+   pr_debug("ROM:[ 0x%08lx 0x%08lx ]\n",
+   rominfo[0], rominfo[1]);
break;
}
default:
-   pr_info("unknown resource %02X, data 0x%06x\n",
-  ent->type, ent->data);
+   pr_debug("unknown resource 0x%02x, data 0x%06x\n",
+   ent->type, ent->data);
}
return 0;
 }
@@ -428,8 +428,8 @@ static int __init nubus_show_private_resource(struct 
nubus_dev *dev,
nubus_show_cpu_resource(dev, ent);
break;
default:
-   pr_info("unknown resource %02X, data 0x%06x\n",
-  ent->type, ent->data);
+   pr_debug("unknown resource 0x%02x, data 0x%06x\n",
+   ent->type, ent->data);
}
return 0;
 }
@@ -442,12 +442,9 @@ nubus_get_functional_resource(struct nubus_board *board, 
int slot,
struct nubus_dirent ent;
struct nubus_dev *dev;
 
-   pr_info("  Function 0x%02x:\n", parent->type);
+   pr_debug("  Functional resource 0x%02x:\n", parent->type);
nubus_get_subdir(parent, &dir);
 
-   pr_debug("%s: parent is 0x%p, dir is 0x%p\n",
-__func__, parent->base, dir.base);
-
/* Actually we should probably panic if this fails */
if ((dev = kzalloc(sizeof(*dev), GFP_ATOMIC)) == NULL)
return NULL;
@@ -466,14 +463,14 @@ nubus_get_functional_resource(struct nubus_board *board, 
int slot,
dev->type = nbtdata[1];
dev->dr_sw= nbtdata[2];
dev->dr_hw= nbtdata[3];
-   pr_info("type: [cat 0x%x type 0x%x sw 0x%x hw 
0x%x]\n",
-   nbtdata[0], nbtdata[1], nbtdata[2], nbtdata[3]);
+   pr_debug("type: [cat 0x%x type 0x%x sw 0x%x hw 

[PATCH v5 09/14] nubus: Generalize block resource handling

2018-01-13 Thread Finn Thain
Scrap the specialized code to unpack video mode name resources and
driver resources. It isn't useful.
Instead, add a re-usable function to handle lists of block resources of
any kind, and descend into the video mode table resource directory.
Rename callers as nubus_get_foo(), consistent with their purpose and
with related functions in the same file.

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 drivers/nubus/nubus.c | 123 ++
 1 file changed, 65 insertions(+), 58 deletions(-)

diff --git a/drivers/nubus/nubus.c b/drivers/nubus/nubus.c
index 4ae5c420f13f..c56ac36d91f2 100644
--- a/drivers/nubus/nubus.c
+++ b/drivers/nubus/nubus.c
@@ -331,16 +331,63 @@ EXPORT_SYMBOL(nubus_find_rsrc);
among other things.  The rest of it should go in the /proc code.
For now, we just use it to give verbose boot logs. */
 
-static int __init nubus_show_display_resource(struct nubus_dev *dev,
- const struct nubus_dirent *ent)
+static int __init nubus_get_block_rsrc_dir(struct nubus_board *board,
+  const struct nubus_dirent *parent)
+{
+   struct nubus_dir dir;
+   struct nubus_dirent ent;
+
+   nubus_get_subdir(parent, &dir);
+
+   while (nubus_readdir(&dir, &ent) != -1) {
+   u32 size;
+
+   nubus_get_rsrc_mem(&size, &ent, 4);
+   pr_debug("block (0x%x), size %d\n", ent.type, size);
+   }
+   return 0;
+}
+
+static int __init nubus_get_display_vidmode(struct nubus_board *board,
+   const struct nubus_dirent *parent)
+{
+   struct nubus_dir dir;
+   struct nubus_dirent ent;
+
+   nubus_get_subdir(parent, &dir);
+
+   while (nubus_readdir(&dir, &ent) != -1) {
+   switch (ent.type) {
+   case 1: /* mVidParams */
+   case 2: /* mTable */
+   {
+   u32 size;
+
+   nubus_get_rsrc_mem(&size, &ent, 4);
+   pr_debug("block (0x%x), size %d\n", ent.type,
+   size);
+   break;
+   }
+   default:
+   pr_debug("unknown resource 0x%02x, data 
0x%06x\n",
+   ent.type, ent.data);
+   }
+   }
+   return 0;
+}
+
+static int __init nubus_get_display_resource(struct nubus_dev *dev,
+const struct nubus_dirent *ent)
 {
switch (ent->type) {
case NUBUS_RESID_GAMMADIR:
pr_debug("gamma directory offset: 0x%06x\n", ent->data);
+   nubus_get_block_rsrc_dir(dev->board, ent);
break;
case 0x0080 ... 0x0085:
pr_debug("mode 0x%02x info offset: 0x%06x\n",
ent->type, ent->data);
+   nubus_get_display_vidmode(dev->board, ent);
break;
default:
pr_debug("unknown resource 0x%02x, data 0x%06x\n",
@@ -349,8 +396,8 @@ static int __init nubus_show_display_resource(struct 
nubus_dev *dev,
return 0;
 }
 
-static int __init nubus_show_network_resource(struct nubus_dev *dev,
- const struct nubus_dirent *ent)
+static int __init nubus_get_network_resource(struct nubus_dev *dev,
+const struct nubus_dirent *ent)
 {
switch (ent->type) {
case NUBUS_RESID_MAC_ADDRESS:
@@ -368,8 +415,8 @@ static int __init nubus_show_network_resource(struct 
nubus_dev *dev,
return 0;
 }
 
-static int __init nubus_show_cpu_resource(struct nubus_dev *dev,
- const struct nubus_dirent *ent)
+static int __init nubus_get_cpu_resource(struct nubus_dev *dev,
+const struct nubus_dirent *ent)
 {
switch (ent->type) {
case NUBUS_RESID_MEMINFO:
@@ -397,18 +444,18 @@ static int __init nubus_show_cpu_resource(struct 
nubus_dev *dev,
return 0;
 }
 
-static int __init nubus_show_private_resource(struct nubus_dev *dev,
- const struct nubus_dirent *ent)
+static int __init nubus_get_private_resource(struct nubus_dev *dev,
+const struct nubus_dirent *ent)
 {
switch (dev->category) {
case NUBUS_CAT_DISPLAY:
-   nubus_show_display_resource(dev, ent);
+   nubus_get_display_resource(dev, ent);
break;
case NUBUS_CAT_NETWORK:
-   nubus_show_network_resource(dev, ent);
+   nubus_get_network_resource(dev, ent);
break;
case NUBUS_CAT_CPU:
-   nubus_show_cpu_resource(dev, ent);
+   nubus_get_cpu_resource(dev, ent);
break;
defa

[PATCH v5 06/14] nubus: Call proc_mkdir() not more than once per slot directory

2018-01-13 Thread Finn Thain
This patch fixes the following WARNING.

proc_dir_entry 'nubus/a' already registered
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Tainted: GW   
4.13.0-00036-gd57552077387 #1
Stack from 01c1bd9c:
01c1bd9c 003c2c8b 01c1bdc0 0001b0fe  00322f4a 01c43a20 01c43b0c
01c8c420 01c1bde8 0001b1b8 003a4ac3 0148 000faa26 0009 
01c1bde0 003a4b6c 01c1bdfc 01c1be20 000faa26 003a4ac3 0148 003a4b6c
01c43a71 01c8c471 01c1 00326430 0043d00c 0005 01c71a00 0020bce0
00322964 01c1be38 000fac04 01c43a20 01c8c420 01c1bee0 01c8c420 01c1be50
000fac4c 01c1bee0  01c43a20  01c1bee8 0020bd26 01c1bee0
Call Trace: [<0001b0fe>] __warn+0xae/0xde
 [<00322f4a>] memcmp+0x0/0x5c
 [<0001b1b8>] warn_slowpath_fmt+0x2e/0x36
 [<000faa26>] proc_register+0xbe/0xd8
 [<000faa26>] proc_register+0xbe/0xd8
 [<00326430>] sprintf+0x0/0x20
 [<0020bce0>] nubus_proc_attach_device+0x0/0x1b8
 [<00322964>] strcpy+0x0/0x22
 [<000fac04>] proc_mkdir_data+0x64/0x96
 [<000fac4c>] proc_mkdir+0x16/0x1c
 [<0020bd26>] nubus_proc_attach_device+0x46/0x1b8
 [<0020bce0>] nubus_proc_attach_device+0x0/0x1b8
 [<00322964>] strcpy+0x0/0x22
 [<1ba6>] kernel_pg_dir+0xba6/0x1000
 [<004339a2>] proc_bus_nubus_add_devices+0x1a/0x2e
 [<000faa40>] proc_create_data+0x0/0xf2
 [<0003297c>] parse_args+0x0/0x2d4
 [<00433a08>] nubus_proc_init+0x52/0x5a
 [<00433944>] nubus_init+0x0/0x44
 [<00433982>] nubus_init+0x3e/0x44
 [<20dc>] do_one_initcall+0x38/0x196
 [<20a4>] do_one_initcall+0x0/0x196
 [<0003297c>] parse_args+0x0/0x2d4
 [<00322964>] strcpy+0x0/0x22
 [<00040004>] __up_read+0xe/0x40
 [<004231d4>] repair_env_string+0x0/0x7a
 [<0042312e>] kernel_init_freeable+0xee/0x194
 [<00423146>] kernel_init_freeable+0x106/0x194
 [<00433944>] nubus_init+0x0/0x44
 [<000a6000>] kfree+0x0/0x156
 [<0032768c>] kernel_init+0x0/0xda
 [<00327698>] kernel_init+0xc/0xda
 [<0032768c>] kernel_init+0x0/0xda
 [<2a90>] ret_from_kernel_thread+0xc/0x14
---[ end trace 14a6d619908ea253 ]---
[ cut here ]

This gets repeated with each additional functional resource.

The problem here is the call to proc_mkdir() when the directory already
exists. Each nubus_board gets a directory, such as /proc/bus/nubus/s/
where s is the hex slot number. Therefore, store the 'procdir' pointer
in struct nubus_board instead of struct nubus_dev.
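
A user-space sketch of the idea, assuming simplified stand-in types:
struct board, struct dev, fake_mkdir() and attach_device() are all
hypothetical and only mimic the shape of the real structures. The point
is that the directory handle is cached on the board, so a second device
in the same slot does not create the directory again.

```c
#include <errno.h>
#include <stddef.h>

struct proc_entry { const char *name; };	/* stand-in for proc_dir_entry */

struct board {
	int slot;
	struct proc_entry *procdir;	/* NULL until the slot directory exists */
};

struct dev {
	struct board *board;
};

int mkdir_calls;	/* counts directory creations, for illustration */
static struct proc_entry slot_entry = { "slot" };

/* Hypothetical stand-in for proc_mkdir(); always succeeds here. */
struct proc_entry *fake_mkdir(void)
{
	mkdir_calls++;
	return &slot_entry;
}

int attach_device(struct dev *dev)
{
	/* The board already has its directory: nothing to do. */
	if (dev->board->procdir)
		return 0;

	dev->board->procdir = fake_mkdir();
	if (!dev->board->procdir)
		return -ENOMEM;
	return 0;
}
```

Attaching two devices that share one board calls fake_mkdir() once, which
is the behaviour the patch aims for with proc_mkdir().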

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 drivers/nubus/proc.c  | 6 +-
 include/linux/nubus.h | 5 +++--
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/nubus/proc.c b/drivers/nubus/proc.c
index fc20dbcd3b9a..91211192f36f 100644
--- a/drivers/nubus/proc.c
+++ b/drivers/nubus/proc.c
@@ -134,9 +134,13 @@ int nubus_proc_attach_device(struct nubus_dev *dev)
return -1;
}

+   if (dev->board->procdir)
+   return 0;
+
/* Create a directory */
snprintf(name, sizeof(name), "%x", dev->board->slot);
-   e = dev->procdir = proc_mkdir(name, proc_bus_nubus_dir);
+   e = proc_mkdir(name, proc_bus_nubus_dir);
+   dev->board->procdir = e;
if (!e)
return -ENOMEM;
 
diff --git a/include/linux/nubus.h b/include/linux/nubus.h
index e525669f1991..2245430e1357 100644
--- a/include/linux/nubus.h
+++ b/include/linux/nubus.h
@@ -53,13 +53,14 @@ struct nubus_board {
unsigned char rev;
unsigned char format;
unsigned char lanes;
+
+   /* Directory entry in /proc/bus/nubus */
+   struct proc_dir_entry *procdir;
 };
 
 struct nubus_dev {
/* Next link in device list */
struct nubus_dev* next;
-   /* Directory entry in /proc/bus/nubus */
-   struct proc_dir_entry* procdir;
 
/* The functional resource ID of this device */
unsigned char resid;
-- 
2.13.6



[PATCH v5 08/14] nubus: Clean up whitespace

2018-01-13 Thread Finn Thain
Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 include/linux/nubus.h | 58 +--
 1 file changed, 29 insertions(+), 29 deletions(-)

diff --git a/include/linux/nubus.h b/include/linux/nubus.h
index 3c7b236074b3..2d6f04055ebe 100644
--- a/include/linux/nubus.h
+++ b/include/linux/nubus.h
@@ -28,9 +28,9 @@ struct nubus_dirent {
 };
 
 struct nubus_board {
-   struct nubus_board* next;
-   struct nubus_dev* first_dev;
-   
+   struct nubus_board *next;
+   struct nubus_dev *first_dev;
+
/* Only 9-E actually exist, though 0-8 are also theoretically
   possible, and 0 is a special case which represents the
   motherboard and onboard peripherals (Ethernet, video) */
@@ -39,10 +39,10 @@ struct nubus_board {
char name[64];
 
/* Format block */
-   unsigned char* fblock;
+   unsigned char *fblock;
/* Root directory (does *not* always equal fblock + doffset!) */
-   unsigned char* directory;
-   
+   unsigned char *directory;
+
unsigned long slot_addr;
/* Offset to root directory (sometimes) */
unsigned long doffset;
@@ -60,7 +60,7 @@ struct nubus_board {
 
 struct nubus_dev {
/* Next link in device list */
-   struct nubus_dev* next;
+   struct nubus_dev *next;
 
/* The functional resource ID of this device */
unsigned char resid;
@@ -70,17 +70,17 @@ struct nubus_dev {
unsigned short type;
unsigned short dr_sw;
unsigned short dr_hw;
-   
+
/* Functional directory */
-   unsigned char* directory;
+   unsigned char *directory;
/* Much of our info comes from here */
-   struct nubus_board* board;
+   struct nubus_board *board;
 };
 
 /* This is all NuBus devices (used to find devices later on) */
-extern struct nubus_dev* nubus_devices;
+extern struct nubus_dev *nubus_devices;
 /* This is all NuBus cards */
-extern struct nubus_board* nubus_boards;
+extern struct nubus_board *nubus_boards;
 
 /* Generic NuBus interface functions, modelled after the PCI interface */
 #ifdef CONFIG_PROC_FS
@@ -91,38 +91,38 @@ static inline void nubus_proc_init(void) {}
 
 int nubus_proc_attach_device(struct nubus_dev *dev);
 /* If we need more precision we can add some more of these */
-struct nubus_dev* nubus_find_type(unsigned short category,
+struct nubus_dev *nubus_find_type(unsigned short category,
  unsigned short type,
- const struct nubus_dev* from);
+ const struct nubus_dev *from);
 /* Might have more than one device in a slot, you know... */
-struct nubus_dev* nubus_find_slot(unsigned int slot,
- const struct nubus_dev* from);
+struct nubus_dev *nubus_find_slot(unsigned int slot,
+ const struct nubus_dev *from);
 
 /* These are somewhat more NuBus-specific.  They all return 0 for
success and -1 for failure, as you'd expect. */
 
 /* The root directory which contains the board and functional
directories */
-int nubus_get_root_dir(const struct nubus_board* board,
-  struct nubus_dir* dir);
+int nubus_get_root_dir(const struct nubus_board *board,
+  struct nubus_dir *dir);
 /* The board directory */
-int nubus_get_board_dir(const struct nubus_board* board,
-   struct nubus_dir* dir);
+int nubus_get_board_dir(const struct nubus_board *board,
+   struct nubus_dir *dir);
 /* The functional directory */
-int nubus_get_func_dir(const struct nubus_dev* dev,
-  struct nubus_dir* dir);
+int nubus_get_func_dir(const struct nubus_dev *dev,
+  struct nubus_dir *dir);
 
 /* These work on any directory gotten via the above */
-int nubus_readdir(struct nubus_dir* dir,
- struct nubus_dirent* ent);
-int nubus_find_rsrc(struct nubus_dir* dir,
+int nubus_readdir(struct nubus_dir *dir,
+ struct nubus_dirent *ent);
+int nubus_find_rsrc(struct nubus_dir *dir,
unsigned char rsrc_type,
-   struct nubus_dirent* ent);
-int nubus_rewinddir(struct nubus_dir* dir);
+   struct nubus_dirent *ent);
+int nubus_rewinddir(struct nubus_dir *dir);
 
 /* Things to do with directory entries */
-int nubus_get_subdir(const struct nubus_dirent* ent,
-struct nubus_dir* dir);
+int nubus_get_subdir(const struct nubus_dirent *ent,
+struct nubus_dir *dir);
 void nubus_get_rsrc_mem(void *dest, const struct nubus_dirent *dirent,
unsigned int len);
 void nubus_get_rsrc_str(char *dest, const struct nubus_dirent *dirent,
-- 
2.13.6



[PATCH v5 07/14] nubus: Remove redundant code

2018-01-13 Thread Finn Thain
Eliminate unused values from struct nubus_dev to save wasted memory
(a Radius PrecisionColor 24X card has about 95 functional resources
and up to six such cards may be fitted). Also remove redundant static
variable initialization, an unreachable !MACH_IS_MAC conditional,
the unused nubus_find_device() function, the bogus get_nubus_list()
prototype and the pointless card_present temporary variable.

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 drivers/nubus/nubus.c | 57 ---
 drivers/nubus/proc.c  |  2 --
 include/linux/nubus.h | 17 +--
 3 files changed, 23 insertions(+), 53 deletions(-)

diff --git a/drivers/nubus/nubus.c b/drivers/nubus/nubus.c
index e7c7e49a074a..4ae5c420f13f 100644
--- a/drivers/nubus/nubus.c
+++ b/drivers/nubus/nubus.c
@@ -282,23 +282,6 @@ EXPORT_SYMBOL(nubus_rewinddir);
 /* Driver interface functions, more or less like in pci.c */
 
 struct nubus_dev*
-nubus_find_device(unsigned short category, unsigned short type,
- unsigned short dr_hw, unsigned short dr_sw,
- const struct nubus_dev *from)
-{
-   struct nubus_dev *itor = from ? from->next : nubus_devices;
-
-   while (itor) {
-   if (itor->category == category && itor->type == type &&
-   itor->dr_hw == dr_hw && itor->dr_sw == dr_sw)
-   return itor;
-   itor = itor->next;
-   }
-   return NULL;
-}
-EXPORT_SYMBOL(nubus_find_device);
-
-struct nubus_dev*
 nubus_find_type(unsigned short category, unsigned short type,
const struct nubus_dev *from)
 {
@@ -469,8 +452,10 @@ nubus_get_functional_resource(struct nubus_board *board, 
int slot,
}
case NUBUS_RESID_NAME:
{
-   nubus_get_rsrc_str(dev->name, &ent, sizeof(dev->name));
-   pr_debug("name: %s\n", dev->name);
+   char name[64];
+
+   nubus_get_rsrc_str(name, &ent, sizeof(name));
+   pr_debug("name: %s\n", name);
break;
}
case NUBUS_RESID_DRVRDIR:
@@ -479,32 +464,39 @@ nubus_get_functional_resource(struct nubus_board *board, 
int slot,
   use this :-) */
struct nubus_dir drvr_dir;
struct nubus_dirent drvr_ent;
+   unsigned char *driver;
 
nubus_get_subdir(&ent, &drvr_dir);
nubus_readdir(&drvr_dir, &drvr_ent);
-   dev->driver = nubus_dirptr(&drvr_ent);
-   pr_debug("driver at: 0x%p\n", dev->driver);
+   driver = nubus_dirptr(&drvr_ent);
+   pr_debug("driver at: 0x%p\n", driver);
break;
}
case NUBUS_RESID_MINOR_BASEOS:
+   {
/* We will need this in order to support
   multiple framebuffers.  It might be handy
   for Ethernet as well */
-   nubus_get_rsrc_mem(&dev->iobase, &ent, 4);
-   pr_debug("memory offset: 0x%08lx\n", dev->iobase);
+   u32 base_offset;
+
+   nubus_get_rsrc_mem(&base_offset, &ent, 4);
+   pr_debug("memory offset: 0x%08x\n", base_offset);
break;
+   }
case NUBUS_RESID_MINOR_LENGTH:
+   {
/* Ditto */
-   nubus_get_rsrc_mem(&dev->iosize, &ent, 4);
-   pr_debug("memory length: 0x%08lx\n", dev->iosize);
+   u32 length;
+
+   nubus_get_rsrc_mem(&length, &ent, 4);
+   pr_debug("memory length: 0x%08x\n", length);
break;
+   }
case NUBUS_RESID_FLAGS:
-   dev->flags = ent.data;
-   pr_debug("flags: 0x%06x\n", dev->flags);
+   pr_debug("flags: 0x%06x\n", ent.data);
break;
case NUBUS_RESID_HWDEVID:
-   dev->hwdevid = ent.data;
-   pr_debug("hwdevid: 0x%06x\n", dev->hwdevid);
+   pr_debug("hwdevid: 0x%06x\n", ent.data);
break;
default:
/* Local/Private resources have their own
@@ -798,11 +790,8 @@ static void __init nubus_probe_slot(int slot)
 
rp = nubus_rom_addr(slot);
for (i = 4; i; i--) {
-   int card_present;
-
rp--;
-   card_present = hwreg_present(rp);
-   if (!card_present)
+   if (!hwreg_present(rp))
continue;
 

[PATCH v5 10/14] nubus: Rework /proc/bus/nubus/s/ implementation

2018-01-13 Thread Finn Thain
The /proc/bus/nubus/s/ directory tree for any slot s is missing a lot
of information. The struct file_operations methods have long been left
unimplemented (hence the familiar compile-time warning, "Need to set
some I/O handlers here").

Slot resources have a complex structure which varies depending on board
function. The logic for interpreting these ROM data structures is found
in nubus.c. Let's not duplicate that logic in proc.c.

Create the /proc/bus/nubus/s/ inodes while scanning slot s. During
descent through slot resource subdirectories, call the new
nubus_proc_add_foo() functions to create the procfs inodes.

Also add a new function, nubus_seq_write_rsrc_mem(), to write the
contents of a particular slot resource to a given seq_file. This is
used by the procfs file_operations methods, to finally give userspace
access to slot ROM information, such as the available video modes.
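
The copy strategy described for nubus_seq_write_rsrc_mem() (full buffers
first, then single bytes for the remainder) can be sketched in userspace as
follows. This is only an illustration under stated assumptions: struct sink,
sink_write() and write_chunked() are hypothetical stand-ins for seq_file,
seq_write() and the new helper, and the sketch reads from plain memory rather
than through nubus_get_rom() with a bytelane mask.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a seq_file: a fixed-size output buffer. */
struct sink {
	unsigned char data[256];
	size_t used;
};

/* Append n bytes to the sink (stand-in for seq_write()/seq_putc()). */
static void sink_write(struct sink *s, const void *buf, size_t n)
{
	assert(s != NULL && s->used + n <= sizeof(s->data));
	memcpy(s->data + s->used, buf, n);
	s->used += n;
}

/* Copy len bytes from src to the sink: whole buffers while possible,
 * then the remaining bytes one at a time.
 */
static void write_chunked(struct sink *s, const unsigned char *src, size_t len)
{
	unsigned char buf[16];

	/* If possible, write out full buffers */
	while (len >= sizeof(buf)) {
		memcpy(buf, src, sizeof(buf));
		sink_write(s, buf, sizeof(buf));
		src += sizeof(buf);
		len -= sizeof(buf);
	}
	/* If not, write out individual bytes */
	while (len--)
		sink_write(s, src++, 1);
}
```

With a 37-byte source, the first loop emits two 16-byte buffers and the second
loop emits the last five bytes.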

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 drivers/nubus/nubus.c | 114 --
 drivers/nubus/proc.c  | 222 ++
 include/linux/nubus.h |  37 -
 3 files changed, 256 insertions(+), 117 deletions(-)

diff --git a/drivers/nubus/nubus.c b/drivers/nubus/nubus.c
index c56ac36d91f2..f05541914c21 100644
--- a/drivers/nubus/nubus.c
+++ b/drivers/nubus/nubus.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -146,7 +147,7 @@ static inline void *nubus_rom_addr(int slot)
return (void *)(0xF1000000 + (slot << 24));
 }
 
-static unsigned char *nubus_dirptr(const struct nubus_dirent *nd)
+unsigned char *nubus_dirptr(const struct nubus_dirent *nd)
 {
unsigned char *p = nd->base;
 
@@ -173,8 +174,8 @@ void nubus_get_rsrc_mem(void *dest, const struct 
nubus_dirent *dirent,
 }
 EXPORT_SYMBOL(nubus_get_rsrc_mem);
 
-void nubus_get_rsrc_str(char *dest, const struct nubus_dirent *dirent,
-   unsigned int len)
+unsigned int nubus_get_rsrc_str(char *dest, const struct nubus_dirent *dirent,
+   unsigned int len)
 {
char *t = dest;
unsigned char *p = nubus_dirptr(dirent);
@@ -189,9 +190,33 @@ void nubus_get_rsrc_str(char *dest, const struct 
nubus_dirent *dirent,
}
if (len > 0)
*t = '\0';
+   return t - dest;
 }
 EXPORT_SYMBOL(nubus_get_rsrc_str);
 
+void nubus_seq_write_rsrc_mem(struct seq_file *m,
+ const struct nubus_dirent *dirent,
+ unsigned int len)
+{
+   unsigned long buf[32];
+   unsigned int buf_size = sizeof(buf);
+   unsigned char *p = nubus_dirptr(dirent);
+
+   /* If possible, write out full buffers */
+   while (len >= buf_size) {
+   unsigned int i;
+
+   for (i = 0; i < ARRAY_SIZE(buf); i++)
+   buf[i] = nubus_get_rom(&p, sizeof(buf[0]),
+  dirent->mask);
+   seq_write(m, buf, buf_size);
+   len -= buf_size;
+   }
+   /* If not, write out individual bytes */
+   while (len--)
+   seq_putc(m, nubus_get_rom(&p, 1, dirent->mask));
+}
+
 int nubus_get_root_dir(const struct nubus_board *board,
   struct nubus_dir *dir)
 {
@@ -326,35 +351,35 @@ EXPORT_SYMBOL(nubus_find_rsrc);
looking at, and print out lots and lots of information from the
resource blocks. */
 
-/* FIXME: A lot of this stuff will eventually be useful after
-   initialization, for intelligently probing Ethernet and video chips,
-   among other things.  The rest of it should go in the /proc code.
-   For now, we just use it to give verbose boot logs. */
-
 static int __init nubus_get_block_rsrc_dir(struct nubus_board *board,
+  struct proc_dir_entry *procdir,
   const struct nubus_dirent *parent)
 {
struct nubus_dir dir;
struct nubus_dirent ent;
 
nubus_get_subdir(parent, &dir);
+   dir.procdir = nubus_proc_add_rsrc_dir(procdir, parent, board);
 
while (nubus_readdir(&dir, &ent) != -1) {
u32 size;
 
nubus_get_rsrc_mem(&size, &ent, 4);
pr_debug("block (0x%x), size %d\n", ent.type, size);
+   nubus_proc_add_rsrc_mem(dir.procdir, &ent, size);
}
return 0;
 }
 
 static int __init nubus_get_display_vidmode(struct nubus_board *board,
+   struct proc_dir_entry *procdir,
const struct nubus_dirent *parent)
 {
struct nubus_dir dir;
struct nubus_dirent ent;
 
nubus_get_subdir(parent, &dir);
+   dir.procdir = nubus_proc_add_rsrc_dir(procdir, parent, board);
 
while (nubus_readdir(&dir, &ent) != -1) {
switch (ent.type) {
@@ -366,37 +391,42 @@ static int __init nubus_get_display_vi

[PATCH v5 11/14] nubus: Rename struct nubus_dev

2018-01-13 Thread Finn Thain
It is misleading to call a functional resource a "device". In adopting
the Linux Driver Model, the struct device will be embedded in struct
nubus_board. That will compound the terminology problem because drivers
will bind with boards, not with functional resources. Avoid this by
renaming struct nubus_dev as struct nubus_rsrc. "Functional resource"
is the vendor's terminology so this helps avoid confusion.

Cc: "David S. Miller" 
Cc: Bartlomiej Zolnierkiewicz 
Acked-by: Bartlomiej Zolnierkiewicz 
Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 drivers/net/ethernet/8390/mac8390.c |  26 
 drivers/net/ethernet/natsemi/macsonic.c |  22 +++
 drivers/nubus/nubus.c   | 105 
 drivers/nubus/proc.c|  15 ++---
 drivers/video/fbdev/macfb.c |   2 +-
 include/linux/nubus.h   |  30 +
 6 files changed, 98 insertions(+), 102 deletions(-)

diff --git a/drivers/net/ethernet/8390/mac8390.c 
b/drivers/net/ethernet/8390/mac8390.c
index 9497f18eaba0..929ff6419621 100644
--- a/drivers/net/ethernet/8390/mac8390.c
+++ b/drivers/net/ethernet/8390/mac8390.c
@@ -123,7 +123,8 @@ enum mac8390_access {
 };
 
 extern int mac8390_memtest(struct net_device *dev);
-static int mac8390_initdev(struct net_device *dev, struct nubus_dev *ndev,
+static int mac8390_initdev(struct net_device *dev,
+  struct nubus_rsrc *ndev,
   enum mac8390_type type);
 
 static int mac8390_open(struct net_device *dev);
@@ -169,11 +170,11 @@ static void word_memcpy_tocard(unsigned long tp, const 
void *fp, int count);
 static void word_memcpy_fromcard(void *tp, unsigned long fp, int count);
 static u32 mac8390_msg_enable;
 
-static enum mac8390_type __init mac8390_ident(struct nubus_dev *dev)
+static enum mac8390_type __init mac8390_ident(struct nubus_rsrc *fres)
 {
-   switch (dev->dr_sw) {
+   switch (fres->dr_sw) {
case NUBUS_DRSW_3COM:
-   switch (dev->dr_hw) {
+   switch (fres->dr_hw) {
case NUBUS_DRHW_APPLE_SONIC_NB:
case NUBUS_DRHW_APPLE_SONIC_LC:
case NUBUS_DRHW_SONNET:
@@ -184,7 +185,7 @@ static enum mac8390_type __init mac8390_ident(struct 
nubus_dev *dev)
break;
 
case NUBUS_DRSW_APPLE:
-   switch (dev->dr_hw) {
+   switch (fres->dr_hw) {
case NUBUS_DRHW_ASANTE_LC:
return MAC8390_NONE;
case NUBUS_DRHW_CABLETRON:
@@ -201,7 +202,7 @@ static enum mac8390_type __init mac8390_ident(struct 
nubus_dev *dev)
case NUBUS_DRSW_TECHWORKS:
case NUBUS_DRSW_DAYNA2:
case NUBUS_DRSW_DAYNA_LC:
-   if (dev->dr_hw == NUBUS_DRHW_CABLETRON)
+   if (fres->dr_hw == NUBUS_DRHW_CABLETRON)
return MAC8390_CABLETRON;
else
return MAC8390_APPLE;
@@ -212,7 +213,7 @@ static enum mac8390_type __init mac8390_ident(struct 
nubus_dev *dev)
break;
 
case NUBUS_DRSW_KINETICS:
-   switch (dev->dr_hw) {
+   switch (fres->dr_hw) {
case NUBUS_DRHW_INTERLAN:
return MAC8390_INTERLAN;
default:
@@ -225,8 +226,8 @@ static enum mac8390_type __init mac8390_ident(struct 
nubus_dev *dev)
 * These correspond to Dayna Sonic cards
 * which use the macsonic driver
 */
-   if (dev->dr_hw == NUBUS_DRHW_SMC9194 ||
-   dev->dr_hw == NUBUS_DRHW_INTERLAN)
+   if (fres->dr_hw == NUBUS_DRHW_SMC9194 ||
+   fres->dr_hw == NUBUS_DRHW_INTERLAN)
return MAC8390_NONE;
else
return MAC8390_DAYNA;
@@ -289,7 +290,8 @@ static int __init mac8390_memsize(unsigned long membase)
return i * 0x1000;
 }
 
-static bool __init mac8390_init(struct net_device *dev, struct nubus_dev *ndev,
+static bool __init mac8390_init(struct net_device *dev,
+   struct nubus_rsrc *ndev,
enum mac8390_type cardtype)
 {
struct nubus_dir dir;
@@ -394,7 +396,7 @@ static bool __init mac8390_init(struct net_device *dev, 
struct nubus_dev *ndev,
 struct net_device * __init mac8390_probe(int unit)
 {
struct net_device *dev;
-   struct nubus_dev *ndev = NULL;
+   struct nubus_rsrc *ndev = NULL;
int err = -ENODEV;
struct ei_device *ei_local;
 
@@ -489,7 +491,7 @@ static const struct net_device_ops mac8390_netdev_ops = {
 };
 
 static int __init mac8390_initdev(struct net_device *dev,
- struct nubus_dev *ndev,
+ struct nubus_rsrc *ndev,
  enum mac8390_type type)
 {
static u32 fwrd4_offsets[16] = {
diff --git a/drivers/net

[PATCH v5 12/14] nubus: Adopt standard linked list implementation

2018-01-13 Thread Finn Thain
This increases code re-use and improves readability.
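
In the hunks below, callers stop using nubus_find_type() and instead walk the
whole resource list, skipping entries they do not want. A minimal userspace
sketch of that pattern follows. It assumes a hand-rolled circular list in
place of the kernel's struct list_head, and every name in it (struct rsrc,
first_rsrc_or_null(), next_rsrc_or_null(), for_each_rsrc(), count_category())
is hypothetical, chosen only to mirror the shape of the new helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list node (stand-in for list_head). */
struct list_node {
	struct list_node *next, *prev;
};

struct rsrc {
	int category;
	int slot;
	struct list_node list;
};

/* An empty list points at itself, like LIST_HEAD() in the kernel. */
static struct list_node rsrc_list = { &rsrc_list, &rsrc_list };

static void rsrc_add_tail(struct rsrc *r)
{
	r->list.prev = rsrc_list.prev;
	r->list.next = &rsrc_list;
	rsrc_list.prev->next = &r->list;
	rsrc_list.prev = &r->list;
}

/* Recover the containing struct rsrc from its embedded node. */
static struct rsrc *node_to_rsrc(struct list_node *n)
{
	return (struct rsrc *)((char *)n - offsetof(struct rsrc, list));
}

static struct rsrc *first_rsrc_or_null(void)
{
	if (rsrc_list.next == &rsrc_list)
		return NULL;
	return node_to_rsrc(rsrc_list.next);
}

static struct rsrc *next_rsrc_or_null(struct rsrc *from)
{
	assert(from != NULL);
	if (from->list.next == &rsrc_list)
		return NULL;
	return node_to_rsrc(from->list.next);
}

#define for_each_rsrc(r) \
	for ((r) = first_rsrc_or_null(); (r); (r) = next_rsrc_or_null(r))

/* The filter lives in the loop body, as in the converted drivers. */
static int count_category(int category)
{
	struct rsrc *r;
	int n = 0;

	for_each_rsrc(r) {
		if (r->category != category)
			continue;
		n++;
	}
	return n;
}
```

Keeping the filter in the caller means one iterator can serve every driver,
at the cost of a few extra lines per call site.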

Cc: "David S. Miller" 
Cc: Bartlomiej Zolnierkiewicz 
Acked-by: Bartlomiej Zolnierkiewicz 
Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 drivers/net/ethernet/8390/mac8390.c |  7 +++--
 drivers/net/ethernet/cirrus/mac89x0.c   |  6 +++--
 drivers/net/ethernet/natsemi/macsonic.c |  8 +++---
 drivers/nubus/nubus.c   | 45 -
 drivers/nubus/proc.c| 11 +++-
 drivers/video/fbdev/macfb.c |  8 +++---
 include/linux/nubus.h   | 15 +--
 7 files changed, 40 insertions(+), 60 deletions(-)

diff --git a/drivers/net/ethernet/8390/mac8390.c 
b/drivers/net/ethernet/8390/mac8390.c
index 929ff6419621..2f91ce8dc614 100644
--- a/drivers/net/ethernet/8390/mac8390.c
+++ b/drivers/net/ethernet/8390/mac8390.c
@@ -416,8 +416,11 @@ struct net_device * __init mac8390_probe(int unit)
if (unit >= 0)
sprintf(dev->name, "eth%d", unit);
 
-   while ((ndev = nubus_find_type(NUBUS_CAT_NETWORK, NUBUS_TYPE_ETHERNET,
-  ndev))) {
+   for_each_func_rsrc(ndev) {
+   if (ndev->category != NUBUS_CAT_NETWORK ||
+   ndev->type != NUBUS_TYPE_ETHERNET)
+   continue;
+
/* Have we seen it already? */
if (slots & (1 << ndev->board->slot))
continue;
diff --git a/drivers/net/ethernet/cirrus/mac89x0.c 
b/drivers/net/ethernet/cirrus/mac89x0.c
index f910f0f386d6..977d4c2c759d 100644
--- a/drivers/net/ethernet/cirrus/mac89x0.c
+++ b/drivers/net/ethernet/cirrus/mac89x0.c
@@ -187,6 +187,7 @@ struct net_device * __init mac89x0_probe(int unit)
unsigned long ioaddr;
unsigned short sig;
int err = -ENODEV;
+   struct nubus_rsrc *fres;
 
if (!MACH_IS_MAC)
return ERR_PTR(-ENODEV);
@@ -207,8 +208,9 @@ struct net_device * __init mac89x0_probe(int unit)
/* We might have to parameterize this later */
slot = 0xE;
/* Get out now if there's a real NuBus card in slot E */
-   if (nubus_find_slot(slot, NULL) != NULL)
-   goto out;
+   for_each_func_rsrc(fres)
+   if (fres->board->slot == slot)
+   goto out;
 
/* The pseudo-ISA bits always live at offset 0x300 (gee,
wonder why...) */
diff --git a/drivers/net/ethernet/natsemi/macsonic.c 
b/drivers/net/ethernet/natsemi/macsonic.c
index 14f3fb50dc21..313fe5e0184b 100644
--- a/drivers/net/ethernet/natsemi/macsonic.c
+++ b/drivers/net/ethernet/natsemi/macsonic.c
@@ -464,9 +464,11 @@ static int mac_nubus_sonic_probe(struct net_device *dev)
int reg_offset, dma_bitmode;
 
/* Find the first SONIC that hasn't been initialized already */
-   while ((ndev = nubus_find_type(NUBUS_CAT_NETWORK,
-  NUBUS_TYPE_ETHERNET, ndev)) != NULL)
-   {
+   for_each_func_rsrc(ndev) {
+   if (ndev->category != NUBUS_CAT_NETWORK ||
+   ndev->type != NUBUS_TYPE_ETHERNET)
+   continue;
+
/* Have we seen it already? */
if (slots & (1<<ndev->board->slot))
continue;
diff --git a/drivers/nubus/nubus.c b/drivers/nubus/nubus.c
index 3657b13c0022..0bb54ccd7a1a 100644
--- a/drivers/nubus/nubus.c
+++ b/drivers/nubus/nubus.c
@@ -32,7 +32,7 @@
 
 /* Globals */
 
-struct nubus_rsrc *nubus_func_rsrcs;
+LIST_HEAD(nubus_func_rsrcs);
 struct nubus_board *nubus_boards;
 
 /* Meaning of "bytelanes":
@@ -305,33 +305,20 @@ EXPORT_SYMBOL(nubus_rewinddir);
 
 /* Driver interface functions, more or less like in pci.c */
 
-struct nubus_rsrc *nubus_find_type(unsigned short category, unsigned short 
type,
-  const struct nubus_rsrc *from)
+struct nubus_rsrc *nubus_first_rsrc_or_null(void)
 {
-   struct nubus_rsrc *itor = from ? from->next : nubus_func_rsrcs;
-
-   while (itor) {
-   if (itor->category == category && itor->type == type)
-   return itor;
-   itor = itor->next;
-   }
-   return NULL;
+   return list_first_entry_or_null(&nubus_func_rsrcs, struct nubus_rsrc,
+   list);
 }
-EXPORT_SYMBOL(nubus_find_type);
+EXPORT_SYMBOL(nubus_first_rsrc_or_null);
 
-struct nubus_rsrc *nubus_find_slot(unsigned int slot,
-  const struct nubus_rsrc *from)
+struct nubus_rsrc *nubus_next_rsrc_or_null(struct nubus_rsrc *from)
 {
-   struct nubus_rsrc *itor = from ? from->next : nubus_func_rsrcs;
-
-   while (itor) {
-   if (itor->board->slot == slot)
-   return itor;
-   itor = itor->next;
-   }
-   return NULL;
+   if (list_is_last(&from->list, &nubus_func_rsrcs))
+   return NULL;
+   return list_next_entry(from, 

[PATCH v5 14/14] nubus: Add support for the driver model

2018-01-13 Thread Finn Thain
This patch brings basic support for the Linux Driver Model to the
NuBus subsystem.

For flexibility, the matching of boards with drivers is left up to the
drivers. This is also the approach taken by NetBSD. A board may have
many functions, and drivers may have to consider many functional
resources and board resources in order to match a device.

This implementation does not bind drivers to resources (nor does it bind
many drivers to the same board). Apple's NuBus declaration ROM design
is flexible enough to allow that, but I don't see a need to support it
as we don't use the "slot zero" resources (in the main logic board ROM).

Eliminate the global nubus_boards linked list by rewriting the procfs
board iterator around bus_for_each_dev(). Hence the nubus device refcount
can be used to determine the lifespan of board objects.
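
The matching policy described above (the bus accepts every pairing and the
driver's probe routine decides) can be sketched in userspace as follows. This
is an illustration under stated assumptions, not the code from bus.c: struct
device, struct board, struct board_driver, bus_match(), bus_probe() and
slot_e_probe() are simplified, hypothetical stand-ins, and container_of() is
redefined locally with offsetof().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-ins for the kernel structures. */
struct device {
	const char *name;
};

struct board {
	int slot;
	struct device dev;	/* embedded, so container_of() can find the board */
};

struct board_driver {
	int (*probe)(struct board *board);
};

/* The bus never rejects a pairing; drivers do their own matching. */
static int bus_match(struct device *dev, const struct board_driver *drv)
{
	(void)dev;
	(void)drv;
	return 1;
}

/* Hand the containing board to the driver and report its verdict. */
static int bus_probe(struct device *dev, const struct board_driver *drv)
{
	assert(dev != NULL && drv != NULL);
	if (!bus_match(dev, drv) || !drv->probe)
		return -ENODEV;
	return drv->probe(container_of(dev, struct board, dev));
}

/* Hypothetical driver that only claims the board in slot 0xE. */
static int slot_e_probe(struct board *board)
{
	return board->slot == 0xE ? 0 : -ENODEV;
}
```

A driver that returns -ENODEV from its probe routine simply leaves the board
unbound, which is how a board with several functions can be offered to each
registered driver in turn.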

Cc: Greg Kroah-Hartman 
Reviewed-by: Greg Kroah-Hartman 
Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 

---
The conversion of Mac network drivers from the Space.c convention to
the Driver Model takes place in a separate patch series, archived at
https://lkml.org/lkml/2017/11/11/25
That series motivates parts of this design, such as the definition of
'for_each_board_func_rsrc'.

Changes since v3:
- Added Reviewed-by tag.
- Moved the SPDX tag in bus.c to the first line of the file.
---
 drivers/nubus/Makefile |   2 +-
 drivers/nubus/bus.c| 117 +
 drivers/nubus/nubus.c  |  24 +-
 drivers/nubus/proc.c   |  55 +--
 include/linux/nubus.h  |  33 --
 5 files changed, 161 insertions(+), 70 deletions(-)
 create mode 100644 drivers/nubus/bus.c

diff --git a/drivers/nubus/Makefile b/drivers/nubus/Makefile
index 21bda2031e7e..6d063cde39d1 100644
--- a/drivers/nubus/Makefile
+++ b/drivers/nubus/Makefile
@@ -2,6 +2,6 @@
 # Makefile for the nubus specific drivers.
 #
 
-obj-y   := nubus.o
+obj-y := nubus.o bus.o
 
 obj-$(CONFIG_PROC_FS) += proc.o
diff --git a/drivers/nubus/bus.c b/drivers/nubus/bus.c
new file mode 100644
index 000000000000..d306c348c857
--- /dev/null
+++ b/drivers/nubus/bus.c
@@ -0,0 +1,117 @@
+// SPDX-License-Identifier: GPL-2.0
+//
+// Bus implementation for the NuBus subsystem.
+//
+// Copyright (C) 2017 Finn Thain
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define to_nubus_board(d)   container_of(d, struct nubus_board, dev)
+#define to_nubus_driver(d)  container_of(d, struct nubus_driver, driver)
+
+static int nubus_bus_match(struct device *dev, struct device_driver *driver)
+{
+   return 1;
+}
+
+static int nubus_device_probe(struct device *dev)
+{
+   struct nubus_driver *ndrv = to_nubus_driver(dev->driver);
+   int err = -ENODEV;
+
+   if (ndrv->probe)
+   err = ndrv->probe(to_nubus_board(dev));
+   return err;
+}
+
+static int nubus_device_remove(struct device *dev)
+{
+   struct nubus_driver *ndrv = to_nubus_driver(dev->driver);
+   int err = -ENODEV;
+
+   if (dev->driver && ndrv->remove)
+   err = ndrv->remove(to_nubus_board(dev));
+   return err;
+}
+
+struct bus_type nubus_bus_type = {
+   .name   = "nubus",
+   .match  = nubus_bus_match,
+   .probe  = nubus_device_probe,
+   .remove = nubus_device_remove,
+};
+EXPORT_SYMBOL(nubus_bus_type);
+
+int nubus_driver_register(struct nubus_driver *ndrv)
+{
+   ndrv->driver.bus = &nubus_bus_type;
+   return driver_register(&ndrv->driver);
+}
+EXPORT_SYMBOL(nubus_driver_register);
+
+void nubus_driver_unregister(struct nubus_driver *ndrv)
+{
+   driver_unregister(&ndrv->driver);
+}
+EXPORT_SYMBOL(nubus_driver_unregister);
+
+static struct device nubus_parent = {
+   .init_name  = "nubus",
+};
+
+int __init nubus_bus_register(void)
+{
+   int err;
+
+   err = device_register(&nubus_parent);
+   if (err)
+   return err;
+
+   err = bus_register(&nubus_bus_type);
+   if (!err)
+   return 0;
+
+   device_unregister(&nubus_parent);
+   return err;
+}
+
+static void nubus_device_release(struct device *dev)
+{
+   struct nubus_board *board = to_nubus_board(dev);
+   struct nubus_rsrc *fres, *tmp;
+
+   list_for_each_entry_safe(fres, tmp, &nubus_func_rsrcs, list)
+   if (fres->board == board) {
+   list_del(&fres->list);
+   kfree(fres);
+   }
+   kfree(board);
+}
+
+int nubus_device_register(struct nubus_board *board)
+{
+   board->dev.parent = &nubus_parent;
+   board->dev.release = nubus_device_release;
+   board->dev.bus = &nubus_bus_type;
+   dev_set_name(&board->dev, "slot.%X", board->slot);
+   return device_register(&board->dev);
+}
+
+static int nubus_print_device_name_fn(struct device *dev, void *data)
+{
+   struct nubus_board *board = to_nubus_board(dev);
+   struct seq_file *m = data;
+
+   seq_printf(m, "Slot %X: %s\n

[PATCH v5 13/14] nubus: Add expansion_type values for various Mac models

2018-01-13 Thread Finn Thain
Add an expansion slot attribute to allow drivers to properly handle
cards like Comm Slot cards and PDS cards without declaration ROMs.
This clarifies the logic for the Centris 610 model which has no
Comm Slot but has an optional on-board SONIC device.
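
As a rough illustration of how a driver might consult the new attribute, the
sketch below classifies a model by its expansion_type. The constant values
are copied from the macintosh.h hunk in this patch; struct mac_model_sketch
and the two helper functions are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values as defined in the macintosh.h hunk of this patch. */
#define MAC_EXP_NONE      0
#define MAC_EXP_PDS       1 /* Accepts only a PDS card */
#define MAC_EXP_NUBUS     2 /* Accepts only NuBus card(s) */
#define MAC_EXP_PDS_NUBUS 3 /* Accepts PDS card and/or NuBus card(s) */
#define MAC_EXP_PDS_COMM  4 /* Accepts PDS card or Comm Slot card */

/* Hypothetical cut-down model record with just the field of interest. */
struct mac_model_sketch {
	char expansion_type;
};

/* True if the model can hold at least one NuBus card. */
static bool model_accepts_nubus(const struct mac_model_sketch *m)
{
	assert(m != NULL);
	return m->expansion_type == MAC_EXP_NUBUS ||
	       m->expansion_type == MAC_EXP_PDS_NUBUS;
}

/* True if the model can hold a PDS card. */
static bool model_accepts_pds(const struct mac_model_sketch *m)
{
	assert(m != NULL);
	switch (m->expansion_type) {
	case MAC_EXP_PDS:
	case MAC_EXP_PDS_NUBUS:
	case MAC_EXP_PDS_COMM:
		return true;
	default:
		return false;
	}
}
```

The old two-valued nubus_type could not express the PDS-only or Comm Slot
cases, which is why the SE/30 entry changes from MAC_NUBUS to MAC_EXP_PDS in
the config.c hunk.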

Cc: "David S. Miller" 
Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 arch/m68k/include/asm/macintosh.h   |   9 ++-
 arch/m68k/mac/config.c  | 110 +---
 drivers/net/ethernet/natsemi/macsonic.c |   8 +--
 3 files changed, 54 insertions(+), 73 deletions(-)

diff --git a/arch/m68k/include/asm/macintosh.h 
b/arch/m68k/include/asm/macintosh.h
index f42c27400dbc..9b840c03ebb7 100644
--- a/arch/m68k/include/asm/macintosh.h
+++ b/arch/m68k/include/asm/macintosh.h
@@ -33,7 +33,7 @@ struct mac_model
char ide_type;
char scc_type;
char ether_type;
-   char nubus_type;
+   char expansion_type;
char floppy_type;
 };
 
@@ -73,8 +73,11 @@ struct mac_model
 #define MAC_ETHER_SONIC 1
 #define MAC_ETHER_MACE 2
 
-#define MAC_NO_NUBUS   0
-#define MAC_NUBUS  1
+#define MAC_EXP_NONE   0
+#define MAC_EXP_PDS    1 /* Accepts only a PDS card */
+#define MAC_EXP_NUBUS  2 /* Accepts only NuBus card(s) */
+#define MAC_EXP_PDS_NUBUS  3 /* Accepts PDS card and/or NuBus card(s) */
+#define MAC_EXP_PDS_COMM   4 /* Accepts PDS card or Comm Slot card */
 
 #define MAC_FLOPPY_IWM 0
 #define MAC_FLOPPY_SWIM_ADDR1  1
diff --git a/arch/m68k/mac/config.c b/arch/m68k/mac/config.c
index 16cd5cea5207..d3d435248a24 100644
--- a/arch/m68k/mac/config.c
+++ b/arch/m68k/mac/config.c
@@ -212,7 +212,7 @@ static struct mac_model mac_data_table[] = {
.via_type   = MAC_VIA_II,
.scsi_type  = MAC_SCSI_OLD,
.scc_type   = MAC_SCC_II,
-   .nubus_type = MAC_NUBUS,
+   .expansion_type = MAC_EXP_NUBUS,
.floppy_type= MAC_FLOPPY_IWM,
},
 
@@ -227,7 +227,7 @@ static struct mac_model mac_data_table[] = {
.via_type   = MAC_VIA_II,
.scsi_type  = MAC_SCSI_OLD,
.scc_type   = MAC_SCC_II,
-   .nubus_type = MAC_NUBUS,
+   .expansion_type = MAC_EXP_NUBUS,
.floppy_type= MAC_FLOPPY_IWM,
}, {
.ident  = MAC_MODEL_IIX,
@@ -236,7 +236,7 @@ static struct mac_model mac_data_table[] = {
.via_type   = MAC_VIA_II,
.scsi_type  = MAC_SCSI_OLD,
.scc_type   = MAC_SCC_II,
-   .nubus_type = MAC_NUBUS,
+   .expansion_type = MAC_EXP_NUBUS,
.floppy_type= MAC_FLOPPY_SWIM_ADDR2,
}, {
.ident  = MAC_MODEL_IICX,
@@ -245,7 +245,7 @@ static struct mac_model mac_data_table[] = {
.via_type   = MAC_VIA_II,
.scsi_type  = MAC_SCSI_OLD,
.scc_type   = MAC_SCC_II,
-   .nubus_type = MAC_NUBUS,
+   .expansion_type = MAC_EXP_NUBUS,
.floppy_type= MAC_FLOPPY_SWIM_ADDR2,
}, {
.ident  = MAC_MODEL_SE30,
@@ -254,7 +254,7 @@ static struct mac_model mac_data_table[] = {
.via_type   = MAC_VIA_II,
.scsi_type  = MAC_SCSI_OLD,
.scc_type   = MAC_SCC_II,
-   .nubus_type = MAC_NUBUS,
+   .expansion_type = MAC_EXP_PDS,
.floppy_type= MAC_FLOPPY_SWIM_ADDR2,
},
 
@@ -272,7 +272,7 @@ static struct mac_model mac_data_table[] = {
.via_type   = MAC_VIA_IICI,
.scsi_type  = MAC_SCSI_OLD,
.scc_type   = MAC_SCC_II,
-   .nubus_type = MAC_NUBUS,
+   .expansion_type = MAC_EXP_NUBUS,
.floppy_type= MAC_FLOPPY_SWIM_ADDR2,
}, {
.ident  = MAC_MODEL_IIFX,
@@ -281,7 +281,7 @@ static struct mac_model mac_data_table[] = {
.via_type   = MAC_VIA_IICI,
.scsi_type  = MAC_SCSI_IIFX,
.scc_type   = MAC_SCC_IOP,
-   .nubus_type = MAC_NUBUS,
+   .expansion_type = MAC_EXP_PDS_NUBUS,
.floppy_type= MAC_FLOPPY_SWIM_IOP,
}, {
.ident  = MAC_MODEL_IISI,
@@ -290,7 +290,7 @@ static struct mac_model mac_data_table[] = {
.via_type   = MAC_VIA_IICI,
.scsi_type  = MAC_SCSI_OLD,
.scc_type   = MAC_SCC_II,
-   .nubus_type = MAC_NUBUS,
+   .expansion_type = MAC_EXP_PDS_NUBUS,
.floppy_type= MAC_FLOPPY_SWIM_ADDR2,
}, {
.ident  = MAC_MODEL_IIVI,
@@ -299,7 +299,7 @@ static struct mac_model mac_data_table

[PATCH v5 02/14] nubus: Fix up header split

2018-01-13 Thread Finn Thain
Due to the '#ifdef __KERNEL__' being located in the wrong place, some
definitions from the kernel API were placed in the UAPI header during
the scripted header split. Fix this. Also, remove the duplicate comment
which is only relevant to the UAPI header.

Fixes: 607ca46e97a1 ("UAPI: (Scripted) Disintegrate include/linux")
Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 include/linux/nubus.h  | 27 +++
 include/uapi/linux/nubus.h | 23 ---
 2 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/include/linux/nubus.h b/include/linux/nubus.h
index d8d63370a28c..55b9a4569a69 100644
--- a/include/linux/nubus.h
+++ b/include/linux/nubus.h
@@ -5,16 +5,28 @@
   Originally written by Alan Cox.
 
   Hacked to death by C. Scott Ananian and David Huggins-Daines.
-  
-  Some of the constants in here are from the corresponding
-  NetBSD/OpenBSD header file, by Allen Briggs.  We figured out the
-  rest of them on our own. */
+*/
+
 #ifndef LINUX_NUBUS_H
 #define LINUX_NUBUS_H
 
 #include 
 #include 
 
+struct nubus_dir {
+   unsigned char *base;
+   unsigned char *ptr;
+   int done;
+   int mask;
+};
+
+struct nubus_dirent {
+   unsigned char *base;
+   unsigned char type;
+   __u32 data; /* Actually 24 bits used */
+   int mask;
+};
+
 struct nubus_board {
struct nubus_board* next;
struct nubus_dev* first_dev;
@@ -130,4 +142,11 @@ void nubus_get_rsrc_mem(void *dest, const struct 
nubus_dirent *dirent,
unsigned int len);
 void nubus_get_rsrc_str(char *dest, const struct nubus_dirent *dirent,
unsigned int maxlen);
+
+/* Returns a pointer to the "standard" slot space. */
+static inline void *nubus_slot_addr(int slot)
+{
+   return (void *)(0xF0000000 | (slot << 24));
+}
+
 #endif /* LINUX_NUBUS_H */
diff --git a/include/uapi/linux/nubus.h b/include/uapi/linux/nubus.h
index f3776cc80f4d..48031e7858f1 100644
--- a/include/uapi/linux/nubus.h
+++ b/include/uapi/linux/nubus.h
@@ -221,27 +221,4 @@ enum nubus_display_res_id {
NUBUS_RESID_SIXTHMODE   = 0x0085
 };
 
-struct nubus_dir
-{
-   unsigned char *base;
-   unsigned char *ptr;
-   int done;
-   int mask;
-};
-
-struct nubus_dirent
-{
-   unsigned char *base;
-   unsigned char type;
-   __u32 data; /* Actually 24bits used */
-   int mask;
-};
-
-
-/* We'd like to get rid of this eventually.  Only daynaport.c uses it now. */
-static inline void *nubus_slot_addr(int slot)
-{
-   return (void *)(0xF0000000|(slot<<24));
-}
-
 #endif /* _UAPILINUX_NUBUS_H */
-- 
2.13.6



[PATCH v5 01/14] nubus: Avoid array underflow and overflow

2018-01-13 Thread Finn Thain
Check array indices. Avoid sprintf. Use buffers of sufficient size.
Use appropriate types for array length parameters.
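
The bounded copy that the reworked nubus_get_rsrc_str() performs can be
sketched in userspace as follows. This is a minimal illustration under stated
assumptions: bounded_copy() is a hypothetical name, it reads from ordinary
memory instead of calling nubus_get_rom(), and it returns the copied length,
which the function in this patch does not do.

```c
#include <assert.h>
#include <stddef.h>

/* Copy at most len - 1 bytes from src, stopping at the first NUL, and
 * always terminate dest when len > 0. Returns the number of bytes copied,
 * not counting the terminator.
 */
static size_t bounded_copy(char *dest, const unsigned char *src, size_t len)
{
	char *t = dest;

	assert(dest != NULL && src != NULL);
	while (len > 1) {
		unsigned char c = *src++;

		if (!c)
			break;
		*t++ = (char)c;
		len--;
	}
	/* One byte of space is always held back for the terminator. */
	if (len > 0)
		*t = '\0';
	return (size_t)(t - dest);
}
```

Because the loop stops while len is still 1, the terminator never lands
outside the buffer, and a zero-length buffer is left untouched.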

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 drivers/nubus/nubus.c | 29 +
 drivers/nubus/proc.c  | 12 ++--
 include/linux/nubus.h | 10 --
 3 files changed, 27 insertions(+), 24 deletions(-)

diff --git a/drivers/nubus/nubus.c b/drivers/nubus/nubus.c
index b793727cd4f7..b6c97e07f15e 100644
--- a/drivers/nubus/nubus.c
+++ b/drivers/nubus/nubus.c
@@ -161,7 +161,7 @@ static unsigned char *nubus_dirptr(const struct 
nubus_dirent *nd)
pointed to with offsets) out of the card ROM. */
 
 void nubus_get_rsrc_mem(void *dest, const struct nubus_dirent *dirent,
-   int len)
+   unsigned int len)
 {
unsigned char *t = (unsigned char *)dest;
unsigned char *p = nubus_dirptr(dirent);
@@ -173,18 +173,22 @@ void nubus_get_rsrc_mem(void *dest, const struct 
nubus_dirent *dirent,
 }
 EXPORT_SYMBOL(nubus_get_rsrc_mem);
 
-void nubus_get_rsrc_str(void *dest, const struct nubus_dirent *dirent,
-   int len)
+void nubus_get_rsrc_str(char *dest, const struct nubus_dirent *dirent,
+   unsigned int len)
 {
-   unsigned char *t = (unsigned char *)dest;
+   char *t = dest;
unsigned char *p = nubus_dirptr(dirent);
 
-   while (len) {
-   *t = nubus_get_rom(&p, 1, dirent->mask);
-   if (!*t++)
+   while (len > 1) {
+   unsigned char c = nubus_get_rom(&p, 1, dirent->mask);
+
+   if (!c)
break;
+   *t++ = c;
len--;
}
+   if (len > 0)
+   *t = '\0';
 }
 EXPORT_SYMBOL(nubus_get_rsrc_str);
 
@@ -468,7 +472,7 @@ nubus_get_functional_resource(struct nubus_board *board, 
int slot,
}
case NUBUS_RESID_NAME:
{
-   nubus_get_rsrc_str(dev->name, &ent, 64);
+   nubus_get_rsrc_str(dev->name, &ent, sizeof(dev->name));
pr_info("name: %s\n", dev->name);
break;
}
@@ -528,7 +532,7 @@ static int __init nubus_get_vidnames(struct nubus_board 
*board,
/* Don't know what this is yet */
u16 id;
/* Longest one I've seen so far is 26 characters */
-   char name[32];
+   char name[36];
};
 
pr_info("video modes supported:\n");
@@ -598,8 +602,8 @@ static int __init nubus_get_vendorinfo(struct nubus_board 
*board,
char name[64];
 
/* These are all strings, we think */
-   nubus_get_rsrc_str(name, &ent, 64);
-   if (ent.type > 5)
+   nubus_get_rsrc_str(name, &ent, sizeof(name));
+   if (ent.type < 1 || ent.type > 5)
ent.type = 5;
pr_info("%s: %s\n", vendor_fields[ent.type - 1], name);
}
@@ -633,7 +637,8 @@ static int __init nubus_get_board_resource(struct 
nubus_board *board, int slot,
break;
}
case NUBUS_RESID_NAME:
-   nubus_get_rsrc_str(board->name, &ent, 64);
+   nubus_get_rsrc_str(board->name, &ent,
+  sizeof(board->name));
pr_info("name: %s\n", board->name);
break;
case NUBUS_RESID_ICON:
diff --git a/drivers/nubus/proc.c b/drivers/nubus/proc.c
index 004a122ac0ff..fc20dbcd3b9a 100644
--- a/drivers/nubus/proc.c
+++ b/drivers/nubus/proc.c
@@ -73,10 +73,10 @@ static void nubus_proc_subdir(struct nubus_dev* dev,
 
/* Some of these are directories, others aren't */
while (nubus_readdir(dir, &ent) != -1) {
-   char name[8];
+   char name[9];
struct proc_dir_entry* e;

-   sprintf(name, "%x", ent.type);
+   snprintf(name, sizeof(name), "%x", ent.type);
e = proc_create(name, S_IFREG | S_IRUGO | S_IWUSR, parent,
&nubus_proc_subdir_fops);
if (!e)
@@ -95,11 +95,11 @@ static void nubus_proc_populate(struct nubus_dev* dev,
/* We know these are all directories (board resource + one or
   more functional resources) */
while (nubus_readdir(root, &ent) != -1) {
-   char name[8];
+   char name[9];
struct proc_dir_entry* e;
struct nubus_dir dir;

-   sprintf(name, "%x", ent.type);
+   snprintf(name, sizeof(name), "%x", ent.type);
e = proc_mkdir(name, parent);
if (!e) return;
 
@@ -119,7 +119,7 @@ int nubus_proc_attach_device(struct nubus_dev *dev)
 {
struct proc_dir_entry *e;

[PATCH v5 03/14] nubus: Use static functions where possible

2018-01-13 Thread Finn Thain
This fixes a couple of warnings from 'make W=1':
drivers/nubus/nubus.c:790: warning: no previous prototype for 'nubus_probe_slot'
drivers/nubus/nubus.c:824: warning: no previous prototype for 'nubus_scan_bus'

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 drivers/nubus/nubus.c | 4 ++--
 include/linux/nubus.h | 1 -
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/nubus/nubus.c b/drivers/nubus/nubus.c
index b6c97e07f15e..35056cee94b1 100644
--- a/drivers/nubus/nubus.c
+++ b/drivers/nubus/nubus.c
@@ -793,7 +793,7 @@ static struct nubus_board * __init nubus_add_board(int 
slot, int bytelanes)
return board;
 }
 
-void __init nubus_probe_slot(int slot)
+static void __init nubus_probe_slot(int slot)
 {
unsigned char dp;
unsigned char *rp;
@@ -827,7 +827,7 @@ void __init nubus_probe_slot(int slot)
}
 }
 
-void __init nubus_scan_bus(void)
+static void __init nubus_scan_bus(void)
 {
int slot;
 
diff --git a/include/linux/nubus.h b/include/linux/nubus.h
index 55b9a4569a69..e525669f1991 100644
--- a/include/linux/nubus.h
+++ b/include/linux/nubus.h
@@ -92,7 +92,6 @@ extern struct nubus_dev* nubus_devices;
 extern struct nubus_board* nubus_boards;
 
 /* Generic NuBus interface functions, modelled after the PCI interface */
-void nubus_scan_bus(void);
 #ifdef CONFIG_PROC_FS
 extern void nubus_proc_init(void);
 #else
-- 
2.13.6



[PATCH v5 00/14] Modernization and fixes for NuBus subsystem

2018-01-13 Thread Finn Thain
This series begins with cleanups and fixes for the NuBus subsystem and
finishes with a patch to add support for the Linux Driver Model.
A separate series (which requires this one) modernizes NuBus drivers.

Changes since v1:
- Added the missing NULL check in nubus_device_remove().
- Squashed the two /proc/bus/nubus/s/ patches into one patch.
- Combined the two sets of /proc/bus/nubus file operations into one set.
- Used the name 'nubus_rsrc' instead of 'nubus_functional_resource'.
- Used the name 'nubus_device_register' instead of 'nubus_device_add'.
- Dropped the unused EXPORT_SYMBOL(nubus_seq_write_rsrc_mem).
- Replaced licensing text in the new file with SPDX-License-Identifier.

Changes since v2:
- Implemented an idiomatic device release function for nubus boards.
- Removed the global nubus_boards linked list.
- Removed nubus_board pointer from proc dir entry private data to improve
  modularity.
- Adopted the standard linked list implementation.
- Disambiguated unrecognized and empty resources under /proc/bus/nubus.
- Reduced redundancy in proc dir entry private data to save some memory.
- Replaced /proc/nubus custom seq file ops with single_open().

Changes since v3:
- Added Acked-by and Reviewed-by tags.
- Moved the SPDX tag in bus.c to the first line of the file.

Changes since v4:
- Addressed some code style issues.


Finn Thain (14):
  nubus: Avoid array underflow and overflow
  nubus: Fix up header split
  nubus: Use static functions where possible
  nubus: Fix log spam
  nubus: Validate slot resource IDs
  nubus: Call proc_mkdir() not more than once per slot directory
  nubus: Remove redundant code
  nubus: Clean up whitespace
  nubus: Generalize block resource handling
  nubus: Rework /proc/bus/nubus/s/ implementation
  nubus: Rename struct nubus_dev
  nubus: Adopt standard linked list implementation
  nubus: Add expansion_type values for various Mac models
  nubus: Add support for the driver model

 arch/m68k/include/asm/macintosh.h   |   9 +-
 arch/m68k/mac/config.c  | 110 +++
 drivers/net/ethernet/8390/mac8390.c |  33 +-
 drivers/net/ethernet/cirrus/mac89x0.c   |   6 +-
 drivers/net/ethernet/natsemi/macsonic.c |  38 ++-
 drivers/nubus/Makefile  |   2 +-
 drivers/nubus/bus.c | 117 +++
 drivers/nubus/nubus.c   | 542 +---
 drivers/nubus/proc.c| 281 -
 drivers/video/fbdev/macfb.c |  10 +-
 include/linux/nubus.h   | 189 +++
 include/uapi/linux/nubus.h  |  23 --
 12 files changed, 762 insertions(+), 598 deletions(-)
 create mode 100644 drivers/nubus/bus.c

-- 
2.13.6



[PATCH] kernel:bpf Remove structure passing and assignment to save stack and avoid copying structures

2018-01-13 Thread Karim Eshapa
Pass pointers to structures as function arguments instead of copying the
structures, which reduces stack usage. Also move TNUM(_v, _m) to the
tnum.h file so that it can be used in different files for creating
anonymous structures statically.

Signed-off-by: Karim Eshapa 

Thanks,
Karim
---
 include/linux/tnum.h  |  4 +++-
 kernel/bpf/tnum.c | 14 +++---
 kernel/bpf/verifier.c | 11 ++-
 3 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/include/linux/tnum.h b/include/linux/tnum.h
index 0d2d3da..72938a0 100644
--- a/include/linux/tnum.h
+++ b/include/linux/tnum.h
@@ -13,6 +13,8 @@ struct tnum {
 };
 
 /* Constructors */
+/* Statically tnum constant */
+#define TNUM(_v, _m)   (struct tnum){.value = _v, .mask = _m}
 /* Represent a known constant as a tnum. */
 struct tnum tnum_const(u64 value);
 /* A completely unknown value */
@@ -26,7 +28,7 @@ struct tnum tnum_lshift(struct tnum a, u8 shift);
 /* Shift a tnum right (by a fixed shift) */
 struct tnum tnum_rshift(struct tnum a, u8 shift);
 /* Add two tnums, return @a + @b */
-struct tnum tnum_add(struct tnum a, struct tnum b);
+void tnum_add(struct tnum *res, struct tnum *a, struct tnum *b);
 /* Subtract two tnums, return @a - @b */
 struct tnum tnum_sub(struct tnum a, struct tnum b);
 /* Bitwise-AND, return @a & @b */
diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
index 1f4bf68..89e3182 100644
--- a/kernel/bpf/tnum.c
+++ b/kernel/bpf/tnum.c
@@ -8,7 +8,6 @@
 #include 
 #include 
 
-#define TNUM(_v, _m)   (struct tnum){.value = _v, .mask = _m}
 /* A completely unknown value */
 const struct tnum tnum_unknown = { .value = 0, .mask = -1 };
 
@@ -43,16 +42,17 @@ struct tnum tnum_rshift(struct tnum a, u8 shift)
return TNUM(a.value >> shift, a.mask >> shift);
 }
 
-struct tnum tnum_add(struct tnum a, struct tnum b)
+void tnum_add(struct tnum *res, struct tnum *a, struct tnum *b)
 {
u64 sm, sv, sigma, chi, mu;
 
-   sm = a.mask + b.mask;
-   sv = a.value + b.value;
+   sm = a->mask + b->mask;
+   sv = a->value + b->value;
sigma = sm + sv;
chi = sigma ^ sv;
-   mu = chi | a.mask | b.mask;
-   return TNUM(sv & ~mu, mu);
+   mu = chi | a->mask | b->mask;
+   res->value = (sv & ~mu);
+   res->mask = mu;
 }
 
 struct tnum tnum_sub(struct tnum a, struct tnum b)
@@ -102,7 +102,7 @@ static struct tnum hma(struct tnum acc, u64 value, u64 mask)
 {
while (mask) {
if (mask & 1)
-   acc = tnum_add(acc, TNUM(0, value));
+   tnum_add(&acc, &acc, &TNUM(0, value));
mask >>= 1;
value <<= 1;
}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b414d6b..b31b1c4 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -999,7 +999,7 @@ static int check_pkt_ptr_alignment(struct bpf_verifier_env *env,
 */
ip_align = 2;
 
-   reg_off = tnum_add(reg->var_off, tnum_const(ip_align + reg->off + off));
+   tnum_add(&reg_off, &reg->var_off, &TNUM(ip_align + reg->off + off, 0));
if (!tnum_is_aligned(reg_off, size)) {
char tn_buf[48];
 
@@ -1023,8 +1023,7 @@ static int check_generic_ptr_alignment(struct bpf_verifier_env *env,
/* Byte size accesses are always allowed. */
if (!strict || size == 1)
return 0;
-
-   reg_off = tnum_add(reg->var_off, tnum_const(reg->off + off));
+   tnum_add(&reg_off, &reg->var_off, &TNUM(reg->off + off, 0));
if (!tnum_is_aligned(reg_off, size)) {
char tn_buf[48];
 
@@ -1971,7 +1970,8 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
dst_reg->umin_value = umin_ptr + umin_val;
dst_reg->umax_value = umax_ptr + umax_val;
}
-   dst_reg->var_off = tnum_add(ptr_reg->var_off, off_reg->var_off);
+   tnum_add(&dst_reg->var_off, &ptr_reg->var_off,
+   &off_reg->var_off);
dst_reg->off = ptr_reg->off;
if (reg_is_pkt_pointer(ptr_reg)) {
dst_reg->id = ++env->id_gen;
@@ -2108,7 +2108,8 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
dst_reg->umin_value += umin_val;
dst_reg->umax_value += umax_val;
}
-   dst_reg->var_off = tnum_add(dst_reg->var_off, src_reg.var_off);
+   tnum_add(&dst_reg->var_off, &dst_reg->var_off,
+   &src_reg.var_off);
break;
case BPF_SUB:
if (signed_sub_overflows(dst_reg->smin_value, smax_val) ||
-- 
2.7.4
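
To make the effect of the signature change concrete, here is a minimal
userspace sketch, not kernel code. It assumes uint64_t in place of u64,
and the names tnum_add_val, tnum_add_ptr, tnum_equal and tnum_add_demo are
hypothetical, chosen only to put the by-value and by-pointer forms side by
side. The pointer parameters are const-qualified in the sketch, which the
patch itself does not do.

```c
#include <assert.h>
#include <stdint.h>

struct tnum {
	uint64_t value;
	uint64_t mask;
};

#define TNUM(_v, _m)	(struct tnum){.value = _v, .mask = _m}

/* By-value form, matching the code before the patch. */
static struct tnum tnum_add_val(struct tnum a, struct tnum b)
{
	uint64_t sm, sv, sigma, chi, mu;

	sm = a.mask + b.mask;
	sv = a.value + b.value;
	sigma = sm + sv;
	chi = sigma ^ sv;
	mu = chi | a.mask | b.mask;
	return TNUM(sv & ~mu, mu);
}

/*
 * By-pointer form, following the proposed change. All fields of a and b
 * are read before res is written, so res may alias either input.
 */
static void tnum_add_ptr(struct tnum *res, const struct tnum *a,
			 const struct tnum *b)
{
	uint64_t sm, sv, sigma, chi, mu;

	sm = a->mask + b->mask;
	sv = a->value + b->value;
	sigma = sm + sv;
	chi = sigma ^ sv;
	mu = chi | a->mask | b->mask;
	res->value = sv & ~mu;
	res->mask = mu;
}

static int tnum_equal(struct tnum x, struct tnum y)
{
	return x.value == y.value && x.mask == y.mask;
}

/* Hypothetical self-check: both forms must agree, with and without aliasing. */
void tnum_add_demo(void)
{
	struct tnum a = TNUM(0x10, 0x03);	/* low two bits unknown */
	struct tnum b = TNUM(0x04, 0x00);	/* the constant 4 */
	struct tnum by_ptr, acc;

	tnum_add_ptr(&by_ptr, &a, &b);
	assert(tnum_equal(by_ptr, tnum_add_val(a, b)));

	/* Same call shape as the hma() hunk: output aliases the first input. */
	acc = a;
	tnum_add_ptr(&acc, &acc, &TNUM(0, 0x08));
	assert(tnum_equal(acc, tnum_add_val(a, TNUM(0, 0x08))));
}
```

Taking the address of TNUM(...) works because a C99 compound literal is an
lvalue, which is what the converted call sites in the patch rely on.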



[PATCHv2 6/7] tools: add dmesg decryption program

2018-01-13 Thread Dan Aloni
Example execution:

dmesg | dmesg-decipher 

Signed-off-by: Dan Aloni 
---
 tools/Makefile  |   9 +-
 tools/kmsg/.gitignore   |   1 +
 tools/kmsg/Makefile |  14 ++
 tools/kmsg/dmesg-decipher.c | 354 
 4 files changed, 377 insertions(+), 1 deletion(-)
 create mode 100644 tools/kmsg/.gitignore
 create mode 100644 tools/kmsg/Makefile
 create mode 100644 tools/kmsg/dmesg-decipher.c

diff --git a/tools/Makefile b/tools/Makefile
index be02c8b904db..5a661e4c9012 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -18,6 +18,7 @@ help:
@echo '  hv - tools used when in Hyper-V clients'
@echo '  iio- IIO tools'
@echo '  kvm_stat   - top-like utility for displaying kvm statistics'
+   @echo '  kmsg   - A tool for decrypting a dmesg using a private key'
@echo '  leds   - LEDs  tools'
@echo '  liblockdep - user-space wrapper for kernel locking-validator'
@echo '  bpf- misc BPF tools'
@@ -91,6 +92,9 @@ freefall: FORCE
 kvm_stat: FORCE
$(call descend,kvm/$@)
 
+kmsg: FORCE
+   $(call descend,kmsg)
+
 all: acpi cgroup cpupower gpio hv firewire liblockdep \
perf selftests spi turbostat usb \
virtio vm bpf x86_energy_perf_policy \
@@ -167,6 +171,9 @@ tmon_clean:
 freefall_clean:
$(call descend,laptop/freefall,clean)
 
+kmsg_clean:
+   $(call descend,kmsg,clean)
+
 build_clean:
$(call descend,build,clean)
 
@@ -174,6 +181,6 @@ clean: acpi_clean cgroup_clean cpupower_clean hv_clean firewire_clean \
perf_clean selftests_clean turbostat_clean spi_clean usb_clean virtio_clean \
vm_clean bpf_clean iio_clean x86_energy_perf_policy_clean tmon_clean \
freefall_clean build_clean libbpf_clean libsubcmd_clean liblockdep_clean \
-   gpio_clean objtool_clean leds_clean wmi_clean
+   gpio_clean objtool_clean leds_clean wmi_clean kmsg_clean
 
 .PHONY: FORCE
diff --git a/tools/kmsg/.gitignore b/tools/kmsg/.gitignore
new file mode 100644
index ..a5b4e26b8d0b
--- /dev/null
+++ b/tools/kmsg/.gitignore
@@ -0,0 +1 @@
+dmesg-decipher
diff --git a/tools/kmsg/Makefile b/tools/kmsg/Makefile
new file mode 100644
index ..9f4ef7b11798
--- /dev/null
+++ b/tools/kmsg/Makefile
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: GPL-2.0
+CC := $(CROSS_COMPILE)gcc
+
+CFLAGS := -O2 -Wall $$(pkg-config --libs openssl)
+
+PROGS := dmesg-decipher
+
+%: %.c
+   $(CC) $(CFLAGS) -o $@ $^
+
+all: $(PROGS)
+
+clean:
+   rm -fr $(PROGS)
diff --git a/tools/kmsg/dmesg-decipher.c b/tools/kmsg/dmesg-decipher.c
new file mode 100644
index ..1ad2b0a27402
--- /dev/null
+++ b/tools/kmsg/dmesg-decipher.c
@@ -0,0 +1,354 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * dmesg-decipher.c
+ *
+ * A sample utility to decrypt an encrypted dmesg output, for
+ * development with kernels having kmsg encryption enabled.
+ *
+ * base64 decoding code taken from lib/base64-armor.c
+ *
+ * Copyright (c) Dan Aloni, 2017
+ *
+ * Compile with:
+ *
+ * gcc -O2 -Wall $(pkg-config --libs openssl) \
+ *   dmesg-decipher.c -o dmesg-decipher
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * The following is based on code from:
+ *
+ * https://wiki.openssl.org/index.php/EVP_Authenticated_Encryption_and_Decryption
+ */
+static int aes_256_gcm_decrypt(unsigned char *ciphertext, size_t ciphertext_len,
+  unsigned char *aad, size_t aad_len,
+  unsigned char *tag, unsigned char *key,
+  unsigned char *iv, size_t iv_len,
+  unsigned char *plaintext)
+{
+   EVP_CIPHER_CTX *ctx;
+   int len;
+   int plaintext_len;
+   int ret = -1;
+
+   /* Create and initialise the context */
+   ctx = EVP_CIPHER_CTX_new();
+   if (!ctx)
+   return -1;
+
+   /* Initialise the decryption operation. */
+   if (!EVP_DecryptInit_ex(ctx, EVP_aes_128_gcm(), NULL, NULL, NULL))
+   goto free;
+
+   /* Set IV length. Not necessary if this is 12 bytes (96 bits) */
+   if (!EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GCM_SET_IVLEN, iv_len, NULL))
+   goto free;
+
+   /* Initialise key and IV */
+   if (!EVP_DecryptInit_ex(ctx, NULL, NULL, key, iv))
+   goto free;
+
+   /* Provide any AAD data. This can be called zero or more times as
+* required
+*/
+   if (aad_len != 0)
+   if (!EVP_DecryptUpdate(ctx, NULL, &len, aad, aad_len))
+   goto free;
+
+   /* Provide the message to be decrypted, and obtain the plaintext output.
+* EVP_DecryptUpdate can be called multiple times if necessary
+*/
+ 

[PATCHv2 0/7] RFC: Public key encryption of dmesg by the kernel

2018-01-13 Thread Dan Aloni
Changes from v1 [1]:

 - Made suggested fixes following a review from Randy Dunlap
 - Modified the ASCII encoding of cipher text to base64 instead of hex,
   with newlines replaced by '~' ; updated dmesg-decipher for it too
 - Moved base64 code from fs/ceph to lib, and improved it a bit
 - Improved checks that we are not overflowing the user buffer when
   using copy_to_user in the added code
 - Added some error prints to dmesg-decipher
 - Fixes to Makefile at tools/ for building 'kmsg' (should it
   build by default in target 'all'? There is an openssl dependency.)
 - checkpatch.pl linting

[1] https://lwn.net/Articles/742412/

Dan Aloni (7):
  crypto: fix memory leak in rsa-kcs1pad encryption
  Move net/ceph/armor to lib/ and add docs
  base64-armor: add bounds checking
  certs: allow in-kernel access of trusted keys
  printk: allow kmsg to be encrypted using public key encryption
  tools: add dmesg decryption program
  docs: add dmesg encryption doc

 Documentation/admin-guide/dmesg-encryption.rst | 116 +++
 Documentation/admin-guide/index.rst|   1 +
 Documentation/ioctl/ioctl-number.txt   |   1 +
 certs/system_keyring.c |  56 ++-
 crypto/rsa-pkcs1pad.c  |   9 -
 include/keys/system_keyring.h  |   3 +
 include/linux/base64-armor.h   |  70 
 include/uapi/linux/kmsg.h  |  18 +
 init/Kconfig   |  11 +
 kernel/printk/printk.c | 451 +
 lib/Kconfig|   7 +
 lib/Makefile   |   1 +
 net/ceph/armor.c => lib/base64-armor.c |  29 +-
 net/ceph/Kconfig   |   1 +
 net/ceph/Makefile  |   2 +-
 net/ceph/crypto.c  |   3 +-
 net/ceph/crypto.h  |   4 -
 tools/Makefile |   9 +-
 tools/kmsg/.gitignore  |   1 +
 tools/kmsg/Makefile|  14 +
 tools/kmsg/dmesg-decipher.c| 354 +++
 21 files changed, 1139 insertions(+), 22 deletions(-)
 create mode 100644 Documentation/admin-guide/dmesg-encryption.rst
 create mode 100644 include/linux/base64-armor.h
 create mode 100644 include/uapi/linux/kmsg.h
 rename net/ceph/armor.c => lib/base64-armor.c (75%)
 create mode 100644 tools/kmsg/.gitignore
 create mode 100644 tools/kmsg/Makefile
 create mode 100644 tools/kmsg/dmesg-decipher.c

-- 
2.14.3



[PATCHv2 3/7] base64-armor: add bounds checking

2018-01-13 Thread Dan Aloni
Future use of the API can benefit from bounds checking.

Signed-off-by: Dan Aloni 
---
 include/linux/base64-armor.h | 17 +++--
 lib/base64-armor.c   | 20 ++--
 net/ceph/crypto.c|  2 +-
 3 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/include/linux/base64-armor.h b/include/linux/base64-armor.h
index e5160c77bb2f..bb0b4491799e 100644
--- a/include/linux/base64-armor.h
+++ b/include/linux/base64-armor.h
@@ -8,11 +8,13 @@
  * not contain newlines, depending on input length.
  *
  * @dst: Beginning of the destination buffer.
+ * @dst_max: Maximum amount of bytes to write to the destination buffer.
  * @src: Beginning of the source buffer.
  * @end: Sentinel for the source buffer, pointing one byte after the
  *   last byte to be encoded.
  *
- * Returns the number of bytes written to the destination buffer.
+ * Returns the number of bytes written to the destination buffer, or
+ * an error if the output buffer is insufficient in size.
  *
  * _Neither_ the input or output are expected to be NULL-terminated.
  *
@@ -22,19 +24,21 @@
  *
  * See base64_encode_buffer_bound below.
  */
-
-extern int base64_armor(char *dst, const char *src, const char *end);
+extern int base64_armor(char *dst, int dst_max, const char *src,
+   const char *end);
 
 /**
  * base64_unarmor: Perform armored base64 decoding.
  *
  * @dst: Beginning of the destination buffer.
+ * @dst_max: Maximum amount of bytes to write to the destination buffer.
  * @src: Beginning of the source buffer
  * @end: Sentinel for the source buffer, pointing one byte after the
  *   last byte to be encoded.
  *
- * Returns the number of bytes written to the destination buffer, or
- * -EINVAL if the source buffer contains invalid bytes.
+ * Returns the number of bytes written to the destination buffer,
+ * -EINVAL if the source buffer contains invalid bytes, or -ENOSPC
+ * if the output buffer is insufficient in size.
  *
  * _Neither_ the input or output are expected to be NULL-terminated.
  *
@@ -43,7 +47,8 @@ extern int base64_armor(char *dst, const char *src, const char *end);
  *
  * See base64_decode_buffer_bound below.
  */
-extern int base64_unarmor(char *dst, const char *src, const char *end);
+extern int base64_unarmor(char *dst, int dst_max, const char *src,
+ const char *end);
 
 
 /*
diff --git a/lib/base64-armor.c b/lib/base64-armor.c
index e07d25ac2850..f4a289f8da6a 100644
--- a/lib/base64-armor.c
+++ b/lib/base64-armor.c
@@ -33,7 +33,7 @@ static int decode_bits(char c)
return -EINVAL;
 }
 
-int base64_armor(char *dst, const char *src, const char *end)
+int base64_armor(char *dst, int dst_max, const char *src, const char *end)
 {
int olen = 0;
int line = 0;
@@ -42,6 +42,8 @@ int base64_armor(char *dst, const char *src, const char *end)
unsigned char a, b, c;
 
a = *src++;
+   if (dst_max < 4)
+   return -ENOSPC;
*dst++ = encode_bits(a >> 2);
if (src < end) {
b = *src++;
@@ -62,17 +64,22 @@ int base64_armor(char *dst, const char *src, const char *end)
}
olen += 4;
line += 4;
+   dst_max -= 4;
+
if (line == 64) {
line = 0;
+   if (dst_max < 1)
+   return -ENOSPC;
*(dst++) = '\n';
olen++;
+   dst_max--;
}
}
return olen;
 }
 EXPORT_SYMBOL(base64_unarmor);
 
-int base64_unarmor(char *dst, const char *src, const char *end)
+int base64_unarmor(char *dst, int dst_max, const char *src, const char *end)
 {
int olen = 0;
 
@@ -92,13 +99,22 @@ int base64_unarmor(char *dst, const char *src, const char *end)
if (a < 0 || b < 0 || c < 0 || d < 0)
return -EINVAL;
 
+   if (dst_max < 1)
+   return -ENOSPC;
*dst++ = (a << 2) | (b >> 4);
+   dst_max--;
if (src[2] == '=')
return olen + 1;
+   if (dst_max < 1)
+   return -ENOSPC;
*dst++ = ((b & 15) << 4) | (c >> 2);
+   dst_max--;
if (src[3] == '=')
return olen + 2;
+   if (dst_max < 1)
+   return -ENOSPC;
*dst++ = ((c & 3) << 6) | d;
+   dst_max--;
olen += 3;
src += 4;
}
diff --git a/net/ceph/crypto.c b/net/ceph/crypto.c
index 25e04e3b1aa4..f7c75368989a 100644
--- a/net/ceph/crypto.c
+++ b/net/ceph/crypto.c
@@ -116,7 +116,7 @@ int ceph_crypto_key_unarmor(struct ceph_crypto_key *key, const char *inkey)
buf = kmalloc(blen, GFP_NOFS);
if (!buf)
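
The bounds check added to the encoder by this patch can be tried out in
userspace. The following is a sketch under stated assumptions, not the
kernel implementation: armor_bounded and armor_bounded_demo are
hypothetical names, the alphabet is the standard base64 one, and the space
check is done once per four-byte group before anything is read or written.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

static const char b64_alphabet[] =
	"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

static char encode_bits(int c)
{
	return b64_alphabet[c];
}

/*
 * Encode [src, end) into dst, writing at most dst_max bytes.
 * Returns the number of bytes written, or -ENOSPC if dst is too small.
 */
int armor_bounded(char *dst, int dst_max, const char *src, const char *end)
{
	int olen = 0;
	int line = 0;

	while (src < end) {
		unsigned char a, b, c;

		/* Every input group produces exactly four output bytes. */
		if (dst_max < 4)
			return -ENOSPC;

		a = *src++;
		*dst++ = encode_bits(a >> 2);
		if (src < end) {
			b = *src++;
			*dst++ = encode_bits(((a & 3) << 4) | (b >> 4));
			if (src < end) {
				c = *src++;
				*dst++ = encode_bits(((b & 15) << 2) | (c >> 6));
				*dst++ = encode_bits(c & 63);
			} else {
				*dst++ = encode_bits((b & 15) << 2);
				*dst++ = '=';
			}
		} else {
			*dst++ = encode_bits((a & 3) << 4);
			*dst++ = '=';
			*dst++ = '=';
		}
		olen += 4;
		line += 4;
		dst_max -= 4;

		/* A newline is emitted after every 64 output characters. */
		if (line == 64) {
			line = 0;
			if (dst_max < 1)
				return -ENOSPC;
			*dst++ = '\n';
			olen++;
			dst_max--;
		}
	}
	return olen;
}

/* Hypothetical self-check: an exact-fit buffer works, one byte less fails. */
void armor_bounded_demo(void)
{
	const char *in = "foobar";
	char out[16];

	assert(armor_bounded(out, 8, in, in + 6) == 8);
	assert(memcmp(out, "Zm9vYmFy", 8) == 0);
	assert(armor_bounded(out, 7, in, in + 6) == -ENOSPC);
}
```

On -ENOSPC the destination may already hold a partial encoding, so a
caller should treat its contents as undefined in that case.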
  

[PATCHv2 2/7] Move net/ceph/armor to lib/ and add docs

2018-01-13 Thread Dan Aloni
Plus, add functions that assist in managing buffer bounds.

Signed-off-by: Dan Aloni 
---
 include/linux/base64-armor.h   | 65 ++
 lib/Kconfig|  7 
 lib/Makefile   |  1 +
 net/ceph/armor.c => lib/base64-armor.c | 13 ---
 net/ceph/Kconfig   |  1 +
 net/ceph/Makefile  |  2 +-
 net/ceph/crypto.c  |  3 +-
 net/ceph/crypto.h  |  4 ---
 8 files changed, 85 insertions(+), 11 deletions(-)
 create mode 100644 include/linux/base64-armor.h
 rename net/ceph/armor.c => lib/base64-armor.c (86%)

diff --git a/include/linux/base64-armor.h b/include/linux/base64-armor.h
new file mode 100644
index ..e5160c77bb2f
--- /dev/null
+++ b/include/linux/base64-armor.h
@@ -0,0 +1,65 @@
+#ifndef __LINUX_BASE64_ARMOR_H__
+#define __LINUX_BASE64_ARMOR_H__
+
+#include 
+
+/**
+ * base64_armor: Perform armored base64 encoding. Output may or may
+ * not contain newlines, depending on input length.
+ *
+ * @dst: Beginning of the destination buffer.
+ * @src: Beginning of the source buffer.
+ * @end: Sentinel for the source buffer, pointing one byte after the
+ *   last byte to be encoded.
+ *
+ * Returns the number of bytes written to the destination buffer.
+ *
+ * _Neither_ the input or output are expected to be NULL-terminated.
+ *
+ * The number of output bytes is exactly (n * 4 + (n / 16)) where
+ * n = ((end - src) + 2) / 3. A less stringent but more wasteful
+ * validation for output buffer size can be: 4 + (end - src) * 2.
+ *
+ * See base64_encode_buffer_bound below.
+ */
+
+extern int base64_armor(char *dst, const char *src, const char *end);
+
+/**
+ * base64_unarmor: Perform armored base64 decoding.
+ *
+ * @dst: Beginning of the destination buffer.
+ * @src: Beginning of the source buffer
+ * @end: Sentinel for the source buffer, pointing one byte after the
+ *   last byte to be encoded.
+ *
+ * Returns the number of bytes written to the destination buffer, or
+ * -EINVAL if the source buffer contains invalid bytes.
+ *
+ * _Neither_ the input or output are expected to be NULL-terminated.
+ *
+ * It can be assumed that the number of output bytes is less or
+ * equals to: 3 * ((end - src) / 4).
+ *
+ * See base64_decode_buffer_bound below.
+ */
+extern int base64_unarmor(char *dst, const char *src, const char *end);
+
+
+/*
+ * Utility functions for buffer upper bounds:
+ */
+
+static inline size_t base64_encode_buffer_bound(size_t src_len)
+{
+   size_t n = (src_len + 2) / 3;
+
+   return (n * 4 + (n / 16));
+}
+
+static inline size_t base64_decode_buffer_bound(size_t src_len)
+{
+   return 3 * (src_len / 4);
+}
+
+#endif
diff --git a/lib/Kconfig b/lib/Kconfig
index c5e84fbcb30b..caddcaebbc2f 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -188,6 +188,13 @@ config CRC8
  when they need to do cyclic redundancy check according CRC8
  algorithm. Module will be called crc8.
 
+config BASE64_ARMOR
+   tristate "BASE64 encoding/decoding functions"
+   help
+ This option provides BASE64 encoding and decoding functions.
+ Module name will be base64-armor if this code is built as a
+ module.
+
 config XXHASH
tristate
 
diff --git a/lib/Makefile b/lib/Makefile
index d11c48ec8ffd..47335d28f77f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -94,6 +94,7 @@ ifneq ($(CONFIG_HAVE_DEC_LOCK),y)
   lib-y += dec_and_lock.o
 endif
 
+obj-$(CONFIG_BASE64_ARMOR) += base64-armor.o
 obj-$(CONFIG_BITREVERSE) += bitrev.o
 obj-$(CONFIG_RATIONAL) += rational.o
 obj-$(CONFIG_CRC_CCITT)+= crc-ccitt.o
diff --git a/net/ceph/armor.c b/lib/base64-armor.c
similarity index 86%
rename from net/ceph/armor.c
rename to lib/base64-armor.c
index 0db8065928df..e07d25ac2850 100644
--- a/net/ceph/armor.c
+++ b/lib/base64-armor.c
@@ -1,9 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
 
 #include 
-
-int ceph_armor(char *dst, const char *src, const char *end);
-int ceph_unarmor(char *dst, const char *src, const char *end);
+#include 
+#include 
 
 /*
  * base64 encode/decode.
@@ -34,7 +33,7 @@ static int decode_bits(char c)
return -EINVAL;
 }
 
-int ceph_armor(char *dst, const char *src, const char *end)
+int base64_armor(char *dst, const char *src, const char *end)
 {
int olen = 0;
int line = 0;
@@ -71,8 +70,9 @@ int ceph_armor(char *dst, const char *src, const char *end)
}
return olen;
 }
+EXPORT_SYMBOL(base64_unarmor);
 
-int ceph_unarmor(char *dst, const char *src, const char *end)
+int base64_unarmor(char *dst, const char *src, const char *end)
 {
int olen = 0;
 
@@ -104,3 +104,6 @@ int ceph_unarmor(char *dst, const char *src, const char *end)
}
return olen;
 }
+EXPORT_SYMBOL(base64_armor);
+
+MODULE_LICENSE("GPL v2");
diff --git a/net/ceph/Kconfig b/net/ceph/Kconfig
index f8cceb99e732..5c4e7d0f2896 100644
--- a/net/ceph/Kconfig
+++ b/ne
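
The size formulas in the kernel-doc comments of this patch can be checked
with a few worked values. This is a minimal userspace sketch: the two
helpers are copied from the header added by the patch, while
bound_examples is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Exact encoded size: four bytes per 3-byte group, plus one newline
 * for every 16 groups (64 output characters). */
static inline size_t base64_encode_buffer_bound(size_t src_len)
{
	size_t n = (src_len + 2) / 3;

	return (n * 4 + (n / 16));
}

/* Upper bound on decoded size: three bytes per four input characters. */
static inline size_t base64_decode_buffer_bound(size_t src_len)
{
	return 3 * (src_len / 4);
}

void bound_examples(void)
{
	/* 1 byte -> one group, no newline. */
	assert(base64_encode_buffer_bound(1) == 4);

	/* 48 bytes -> 16 groups = 64 characters, plus one newline. */
	assert(base64_encode_buffer_bound(48) == 65);

	/* 100 bytes -> 34 groups = 136 characters, plus 2 newlines. */
	assert(base64_encode_buffer_bound(100) == 138);

	/* Decoding those 138 characters needs at most 102 bytes. */
	assert(base64_decode_buffer_bound(138) == 102);

	/* The looser bound from the comment, 4 + len * 2, is never smaller. */
	assert(4 + 100 * 2 >= base64_encode_buffer_bound(100));
}
```

The decode bound overestimates slightly because it also counts newline
and padding characters in the input, which is harmless for sizing a buffer.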

[PATCHv2 1/7] crypto: fix memory leak in rsa-kcs1pad encryption

2018-01-13 Thread Dan Aloni
The encryption mode of pkcs1pad never uses out_sg and out_buf, so
there's no need to allocate the buffer, which presently is not even
being freed.

CC: Herbert Xu 
Signed-off-by: Dan Aloni 
---
 crypto/rsa-pkcs1pad.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/crypto/rsa-pkcs1pad.c b/crypto/rsa-pkcs1pad.c
index 2908f93c3e55..e8354084ef4e 100644
--- a/crypto/rsa-pkcs1pad.c
+++ b/crypto/rsa-pkcs1pad.c
@@ -261,15 +261,6 @@ static int pkcs1pad_encrypt(struct akcipher_request *req)
pkcs1pad_sg_set_buf(req_ctx->in_sg, req_ctx->in_buf,
ctx->key_size - 1 - req->src_len, req->src);
 
-   req_ctx->out_buf = kmalloc(ctx->key_size, GFP_KERNEL);
-   if (!req_ctx->out_buf) {
-   kfree(req_ctx->in_buf);
-   return -ENOMEM;
-   }
-
-   pkcs1pad_sg_set_buf(req_ctx->out_sg, req_ctx->out_buf,
-   ctx->key_size, NULL);
-
akcipher_request_set_tfm(&req_ctx->child_req, ctx->child);
akcipher_request_set_callback(&req_ctx->child_req, req->base.flags,
pkcs1pad_encrypt_sign_complete_cb, req);
-- 
2.14.3



Re: [PATCH v1] x86/retpoline: Use lfence in the retpoline/RSB filling RSB macros

2018-01-13 Thread Thomas Gleixner
On Sat, 13 Jan 2018, Tom Lendacky wrote:

> On 1/13/2018 8:07 AM, Van De Ven, Arjan wrote:
> >>> The RSB filling macro is applicable to AMD, and, if software is unable to
> >>> verify that lfence is serializing on AMD (possible when running under a
> >>> hypervisor), the generic retpoline support will be used and, so, is also
> >>> applicable to AMD.  Change the use of pause to lfence.
> >>>
> >>> Signed-off-by: Tom Lendacky 
> >>
> >> Conditionally-Acked-by: David Woodhouse 
> > 
> > 
> > pause is technically the "save me power" instruction
> > 
> > how about a compromise where we do a double:
> > 
> > pause
> > lfence
> > jmp 
> > 
> > as sequence... that way if the branch recovery is fast, we get the 
> > performance of pause, but if it takes a while, on AMD you get the behavior 
> > of lfence?
> 
> That should work on AMD.

I zapped the commit from tip for now until this discussion is resolved.

Thanks,

tglx

[PATCHv2 4/7] certs: allow in-kernel access of trusted keys

2018-01-13 Thread Dan Aloni
CC: David Howells 
Signed-off-by: Dan Aloni 
---
 certs/system_keyring.c| 56 ++-
 include/keys/system_keyring.h |  3 +++
 2 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/certs/system_keyring.c b/certs/system_keyring.c
index 6251d1b27f0c..843a38b43fb1 100644
--- a/certs/system_keyring.c
+++ b/certs/system_keyring.c
@@ -131,6 +131,8 @@ static __init int system_trusted_keyring_init(void)
  */
 device_initcall(system_trusted_keyring_init);
 
+static char *first_asymmetric_key_description;
+
 /*
  * Load the compiled-in list of X.509 certificates.
  */
@@ -172,8 +174,11 @@ static __init int load_system_certificate_list(void)
pr_err("Problem loading in-kernel X.509 certificate (%ld)\n",
   PTR_ERR(key));
} else {
+   first_asymmetric_key_description =
+   kstrdup(key_ref_to_ptr(key)->description,
+   GFP_KERNEL);
pr_notice("Loaded X.509 cert '%s'\n",
- key_ref_to_ptr(key)->description);
+ first_asymmetric_key_description);
key_ref_put(key);
}
p += plen;
@@ -265,3 +270,52 @@ int verify_pkcs7_signature(const void *data, size_t len,
 EXPORT_SYMBOL_GPL(verify_pkcs7_signature);
 
 #endif /* CONFIG_SYSTEM_DATA_VERIFICATION */
+
+/**
+ * get_first_asymmetric_key - Find a key by ID.
+ * @keyring: The keys to search.
+ *
+ * Return the first asymmetric key in a keyring.
+ */
+static struct key *get_first_asymmetric_key(struct key *keyring)
+{
+   key_ref_t ref;
+
+   ref = keyring_search(make_key_ref(keyring, 1),
+&key_type_asymmetric,
+first_asymmetric_key_description);
+   if (IS_ERR(ref)) {
+   switch (PTR_ERR(ref)) {
+   case -EACCES:
+   case -ENOTDIR:
+   case -EAGAIN:
+   return ERR_PTR(-ENOKEY);
+   default:
+   return ERR_CAST(ref);
+   }
+   }
+
+   return key_ref_to_ptr(ref);
+}
+
+/**
+ * find_trusted_asymmetric_key - Find a key by ID in the builtin trusted
+ * keys keyring, or return the first key in that keyring.
+ *
+ * @id_0: The first ID to look for or NULL.
+ * @id_1: The second ID to look for or NULL.
+ *
+ * The preferred identifier is the id_0 and the fallback identifier is
+ * the id_1. If both are given, the lookup is by the former, but the
+ * latter must also match. If none are given, the first key is returned.
+ */
+struct key *find_trusted_asymmetric_key(const struct asymmetric_key_id *id_0,
+   const struct asymmetric_key_id *id_1)
+{
+   struct key *keyring = builtin_trusted_keys;
+
+   if (!id_0 && !id_1)
+   return get_first_asymmetric_key(keyring);
+
+   return find_asymmetric_key(keyring, id_0, id_1, false);
+}
diff --git a/include/keys/system_keyring.h b/include/keys/system_keyring.h
index 359c2f936004..0bef29eb8297 100644
--- a/include/keys/system_keyring.h
+++ b/include/keys/system_keyring.h
@@ -13,6 +13,7 @@
 #define _KEYS_SYSTEM_KEYRING_H
 
 #include 
+#include 
 
 #ifdef CONFIG_SYSTEM_TRUSTED_KEYRING
 
@@ -61,5 +62,7 @@ static inline struct key *get_ima_blacklist_keyring(void)
 }
 #endif /* CONFIG_IMA_BLACKLIST_KEYRING */
 
+struct key *find_trusted_asymmetric_key(const struct asymmetric_key_id *id_0,
+   const struct asymmetric_key_id *id_1);
 
 #endif /* _KEYS_SYSTEM_KEYRING_H */
-- 
2.14.3



[PATCHv2 5/7] printk: allow kmsg to be encrypted using public key encryption

2018-01-13 Thread Dan Aloni
This commit enables the kernel to encrypt the free-form text that
is generated by printk() before it is brought up to `dmesg` in
userspace.

The encryption uses one of the trusted public keys that are built
into the kernel. These keys are presently also used for verifying
kernel modules and userspace-supplied firmware.

CC: Petr Mladek 
CC: Sergey Senozhatsky 
CC: Linus Torvalds 
Signed-off-by: Dan Aloni 
---
 Documentation/ioctl/ioctl-number.txt |   1 +
 include/uapi/linux/kmsg.h|  18 ++
 init/Kconfig |  11 +
 kernel/printk/printk.c   | 450 +++
 4 files changed, 480 insertions(+)
 create mode 100644 include/uapi/linux/kmsg.h

diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index 3e3fdae5f3ed..eafa24cddf3f 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -226,6 +226,7 @@ Code  Seq#(hex) Include FileComments
 'f'00-0F   fs/ocfs2/ocfs2_fs.h conflict!
 'g'00-0F   linux/usb/gadgetfs.h
 'g'20-2F   linux/usb/g_printer.h
+'g'30-3F   uapi/linux/kmsg.h
 'h'00-7F   conflict! Charon filesystem

 'h'00-1F   linux/hpet.hconflict!
diff --git a/include/uapi/linux/kmsg.h b/include/uapi/linux/kmsg.h
new file mode 100644
index ..497040740d69
--- /dev/null
+++ b/include/uapi/linux/kmsg.h
@@ -0,0 +1,18 @@
+#ifndef _LINUX_UAPI_KMSG_H
+#define _LINUX_UAPI_KMSG_H
+
+#include 
+#include 
+
+struct kmsg_ioctl_get_encrypted_key {
+   void __user *output_buffer;
+   __u64 buffer_size;
+   __u64 key_size;
+};
+
+#define KMSG_IOCTL_BASE 'g'
+
+#define KMSG_IOCTL__GET_ENCRYPTED_KEY  _IOWR(KMSG_IOCTL_BASE, 0x30, \
+   struct kmsg_ioctl_get_encrypted_key)
+
+#endif /* _LINUX_UAPI_KMSG_H */
diff --git a/init/Kconfig b/init/Kconfig
index a9a2e2c86671..8e07a8f9e5c6 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1769,6 +1769,17 @@ config MODULE_SIG
  debuginfo strip done by some packagers (such as rpmbuild) and
  inclusion into an initramfs that wants the module size reduced.
 
+config KMSG_ENCRYPTION
+   bool "Encrypt /dev/kmsg (viewing dmesg will require decryption!)"
+   depends on SYSTEM_TRUSTED_KEYRING
+   select BASE64_ARMOR
+   help
+ This enables strong encryption of messages generated by the kernel,
+ to defend against most kinds of information leaks.
+
+ Note that this option adds the OpenSSL development packages as a
+ kernel build dependency so that certificates can be generated.
+
 config MODULE_SIG_FORCE
bool "Require modules to be validly signed"
depends on MODULE_SIG
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index b9006617710f..898094fb87bd 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -48,6 +48,14 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
 
 #include 
 #include 
@@ -100,6 +108,10 @@ enum devkmsg_log_masks {
DEVKMSG_LOG_MASK_LOCK   = BIT(__DEVKMSG_LOG_BIT_LOCK),
 };
 
+#define CRYPT_KMSG_KEY_LEN 16
+#define CRYPT_KMSG_AUTH_LEN16
+#define CRYPT_KMSG_TEXT_META_MAX   32
+
 /* Keep both the 'on' and 'off' bits clear, i.e. ratelimit by default: */
 #define DEVKMSG_LOG_MASK_DEFAULT   0
 
@@ -744,12 +756,33 @@ static ssize_t msg_print_ext_body(char *buf, size_t size,
return p - buf;
 }
 
+#ifdef CONFIG_KMSG_ENCRYPTION
+static int __ro_after_init kmsg_encrypt = 1;
+static int __init control_kmsg_encrypt(char *str)
+{
+   get_option(&str, &kmsg_encrypt);
+   return 0;
+}
+__setup("kmsg_encrypt=", control_kmsg_encrypt);
+
+struct devkmsg_crypt {
+   u8 key[CRYPT_KMSG_KEY_LEN];
+   u8 *encrypted_key;
+   size_t encrypted_key_len;
+   bool encrypted_key_read;
+   struct crypto_aead *sk_tfm;
+};
+#else
+struct devkmsg_crypt {};
+#endif
+
 /* /dev/kmsg - userspace message inject/listen interface */
 struct devkmsg_user {
u64 seq;
u32 idx;
struct ratelimit_state rs;
struct mutex lock;
+   struct devkmsg_crypt crypt;
char buf[CONSOLE_EXT_LOG_MAX];
 };
 
@@ -816,6 +849,358 @@ static ssize_t devkmsg_write(struct kiocb *iocb, struct iov_iter *from)
return ret;
 }
 
+#ifdef CONFIG_KMSG_ENCRYPTION
+
+static int devkmsg_encrypt_key(struct devkmsg_crypt *crypt,
+  struct crypto_akcipher *ak_tfm)
+{
+   const struct public_key *pkey;
+   struct akcipher_request *req;
+   unsigned int out_len_max;
+   struct scatterlist src, dst;
+   void *outbuf_enc = NULL;
+   struct crypto_wait wait;
+   struct key *key;
+   int err;
+
+   if (!kmsg_encrypt)
+   return 0;
+
+   key = find_trusted_asymmetri

[PATCHv2 7/7] docs: add dmesg encryption doc

2018-01-13 Thread Dan Aloni
Reviewed-by: Randy Dunlap 
Signed-off-by: Dan Aloni 
---
 Documentation/admin-guide/dmesg-encryption.rst | 118 +
 Documentation/admin-guide/index.rst|   1 +
 2 files changed, 119 insertions(+)
 create mode 100644 Documentation/admin-guide/dmesg-encryption.rst

diff --git a/Documentation/admin-guide/dmesg-encryption.rst b/Documentation/admin-guide/dmesg-encryption.rst
new file mode 100644
index ..5aedb8db3a7c
--- /dev/null
+++ b/Documentation/admin-guide/dmesg-encryption.rst
@@ -0,0 +1,118 @@
+Kernel message encryption
+-------------------------
+
+.. CONTENTS
+..
+.. - Overview
+.. - Reason for encrypting dmesg
+.. - Compile time and run time switches
+.. - Limitations
+.. - Decrypting dmesg
+
+
+========
+Overview
+========
+
+Similar to the module signing facility, it is also possible to have the kernel
+perform public key encryption of the kernel messages that are being generated
+by printk calls.
+
+The encryption can be performed for one of the trusted public keys in the
+kernel keyring, and by default will be performed against the kernel's module
+signing key.
+
+To prevent a run-time dependency inside printk itself, the encryption takes
+place upon trying to read ``/dev/kmsg`` which is the mechanism currently used
+by ``systemd`` to read kernel messages, and is also used by ``dmesg``
+invocations.
+
+The first line being read by a ``dmesg`` opener will be an artificial line
+containing an encrypted symmetric encryption session key, in RSA PKCS#1 format.
+The other lines are messages encrypted under an AES-128-GCM scheme. All binary
+ciphertext is base64-encoded, so that the ciphertext consists solely of
+printable characters.
+
+===========
+Limitations
+===========
+
+There are various limitations one needs to consider when enabling dmesg
+encryption:
+
+  * The metadata of kernel messages is not part of the encryption (timestamp,
+    log facility, log severity).
+
+  * The dictionary that seldom accompanies messages is also not encrypted.
+
+  * Any output to any system console, happening when printk() itself is
+    executing, is also not encrypted. A potential attacker can load up
+    ``netconsole`` and have kernel messages being sent as plaintext to other
+    machines. Hopefully, on embedded devices, all system consoles are under
+    strict control of the developers.
+
+  * The syslog system call is barred from reading kmsg. Its present users are
+    few, as the system call's interface is mostly a fallback for an
+    inaccessible ``/dev/kmsg``. This is only an implementation limitation that
+    may be addressed.
+
+  * kmsg buffers will still be saved as plaintext inside kdumps. The assumption
+    is that having read access to a kdump is equivalent to full kernel
+    access anyway.
+
+===========================
+Reason for encrypting dmesg
+===========================
+
+For years, dmesg has contained data which could be utilized by vulnerability
+exploiters, allowing for privilege escalations. Developers may leave sensitive
+data in it, such as pointers, indications of driver bugs, and more.
+
+The feature is mostly aimed at device manufacturers who are not keen on
+revealing the full details of kernel execution, bugs, and crashes to their
+users, but only to their developers, so that local programs running on the
+devices cannot use the data for 'rooting' and executing exploits.
+
+==================================
+Compile time and run time switches
+==================================
+
+At build time, this feature is controlled via the ``CONFIG_KMSG_ENCRYPTION``
+configuration variable.
+
+At run time, it can be turned off by providing ``kmsg_encrypt=0`` as a boot
+time parameter.
+
+================
+Decrypting dmesg
+================
+
+A supplied program in the kernel tree named ``dmesg-decipher`` uses the OpenSSL
+library along with the paired private key of the encryption in order to
+decipher an encrypted dmesg.
+
+The output of an ordinary dmesg invocation will look like this (with the
+ciphertexts shortened here for the brevity of this document)::
+
+    [0.00] K:Zzgt0ovlRvwHfQgbQ2tdjOzgYFwrzHU00XO4=
+    [0.00] M:ogoKk3kCb6q51z8BVLr903/w==,16,12
+    [0.00] M:CcxUnMRIHrjDo+c1Zes=,16,12
+
+
+The artificial ``K:`` message is generated per opening of ``/dev/kmsg``. It
+contains the encrypted session key. The encrypted dmesg lines follow it
+(prefix ``M:``).
+
+Provided with the private key, deciphering a dmesg output should be a
+straightforward process.
+
+For example, one can save an encrypted dmesg to ``dmesg.enc`` on one machine,
+then transfer it to another machine which has access to the PEM file with the
+decrypting private key, and use the following command::
+
+    cat dmesg.enc | ./tools/kmsg/dmesg-decipher certs/signing_key.pem
+
+[0.00] Linux version 4.15.0-rc5+ (dan@jupiter) (gcc version 7.2.1 
20170915 (Red Hat 7.2.1-2) (GCC)) #109 SMP Sat Dec 30 18:32:25 IST 2017
+[   
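
The K:/M: line layout described in the documentation patch above can be split
into fields with a few lines of C. This is only a sketch under stated
assumptions: the struct and function names are hypothetical, and the text does
not say what the two trailing numbers (16 and 12 in the sample) mean, so they
are carried as opaque integers. No decoding or decryption is attempted.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one encrypted dmesg line. */
struct enc_line {
	char kind;      /* 'K' (session key) or 'M' (message) */
	char b64[256];  /* base64 payload, still encoded */
	int n1, n2;     /* trailing numeric fields, -1 when absent */
};

/* Returns 0 on success, -1 if the line does not match the K:/M: layout. */
static int parse_enc_line(const char *line, struct enc_line *out)
{
	const char *p = strchr(line, ']');   /* skip "[timestamp]" if present */
	size_t len;

	p = p ? p + 1 : line;
	while (*p == ' ')
		p++;
	if ((p[0] != 'K' && p[0] != 'M') || p[1] != ':')
		return -1;

	out->kind = p[0];
	out->n1 = out->n2 = -1;
	p += 2;

	len = strcspn(p, ",\n");             /* payload ends at ',' or end of line */
	if (len == 0 || len >= sizeof(out->b64))
		return -1;
	memcpy(out->b64, p, len);
	out->b64[len] = '\0';
	p += len;

	/* The numeric suffix is optional (the sample K: line has none). */
	if (*p == ',' && sscanf(p, ",%d,%d", &out->n1, &out->n2) != 2)
		return -1;
	return 0;
}
```

A caller would feed each line of the saved dmesg output to parse_enc_line()
and hand the base64 payload to whatever does the actual decryption.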

Re: [PATCH] [net-next] net: netsec: use dma_addr_t for storing dma address

2018-01-13 Thread Ard Biesheuvel
On 13 January 2018 at 21:13, Arnd Bergmann  wrote:
> On targets that have different sizes for phys_addr_t and dma_addr_t,
> we get a type mismatch error:
>
> drivers/net/ethernet/socionext/netsec.c: In function 'netsec_alloc_dring':
> drivers/net/ethernet/socionext/netsec.c:970:9: error: passing argument 3 of 
> 'dma_zalloc_coherent' from incompatible pointer type 
> [-Werror=incompatible-pointer-types]
>
> The code is otherwise correct, as the address is never actually used as a
> physical address but only passed into a DMA register.  For consistently,

consistency

> I'm changing the variable name as well, to clarify that this is a DMA
> address.
>
> Signed-off-by: Arnd Bergmann 

Acked-by: Ard Biesheuvel 

> ---
>  drivers/net/ethernet/socionext/netsec.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/ethernet/socionext/netsec.c 
> b/drivers/net/ethernet/socionext/netsec.c
> index 6c263af86b8a..f4c0b02ddad8 100644
> --- a/drivers/net/ethernet/socionext/netsec.c
> +++ b/drivers/net/ethernet/socionext/netsec.c
> @@ -252,7 +252,7 @@ struct netsec_desc {
>  };
>
>  struct netsec_desc_ring {
> -   phys_addr_t desc_phys;
> +   dma_addr_t desc_dma;
> struct netsec_desc *desc;
> void *vaddr;
> u16 pkt_cnt;
> @@ -953,7 +953,7 @@ static void netsec_free_dring(struct netsec_priv *priv, 
> int id)
>
> if (dring->vaddr) {
> dma_free_coherent(priv->dev, DESC_SZ * DESC_NUM,
> - dring->vaddr, dring->desc_phys);
> + dring->vaddr, dring->desc_dma);
> dring->vaddr = NULL;
> }
>
> @@ -967,7 +967,7 @@ static int netsec_alloc_dring(struct netsec_priv *priv, 
> enum ring_id id)
> int ret = 0;
>
> dring->vaddr = dma_zalloc_coherent(priv->dev, DESC_SZ * DESC_NUM,
> -  &dring->desc_phys, GFP_KERNEL);
> +  &dring->desc_dma, GFP_KERNEL);
> if (!dring->vaddr) {
> ret = -ENOMEM;
> goto err;
> @@ -1087,14 +1087,14 @@ static int netsec_reset_hardware(struct netsec_priv 
> *priv)
>
> /* set desc_start addr */
> netsec_write(priv, NETSEC_REG_NRM_RX_DESC_START_UP,
> -
> upper_32_bits(priv->desc_ring[NETSEC_RING_RX].desc_phys));
> +upper_32_bits(priv->desc_ring[NETSEC_RING_RX].desc_dma));
> netsec_write(priv, NETSEC_REG_NRM_RX_DESC_START_LW,
> -
> lower_32_bits(priv->desc_ring[NETSEC_RING_RX].desc_phys));
> +lower_32_bits(priv->desc_ring[NETSEC_RING_RX].desc_dma));
>
> netsec_write(priv, NETSEC_REG_NRM_TX_DESC_START_UP,
> -
> upper_32_bits(priv->desc_ring[NETSEC_RING_TX].desc_phys));
> +upper_32_bits(priv->desc_ring[NETSEC_RING_TX].desc_dma));
> netsec_write(priv, NETSEC_REG_NRM_TX_DESC_START_LW,
> -
> lower_32_bits(priv->desc_ring[NETSEC_RING_TX].desc_phys));
> +lower_32_bits(priv->desc_ring[NETSEC_RING_TX].desc_dma));
>
> /* set normal tx dring ring config */
> netsec_write(priv, NETSEC_REG_NRM_TX_CONFIG,
> --
> 2.9.0
>


[PATCH] [net-next] net: netsec: use dma_addr_t for storing dma address

2018-01-13 Thread Arnd Bergmann
On targets that have different sizes for phys_addr_t and dma_addr_t,
we get a type mismatch error:

drivers/net/ethernet/socionext/netsec.c: In function 'netsec_alloc_dring':
drivers/net/ethernet/socionext/netsec.c:970:9: error: passing argument 3 of 
'dma_zalloc_coherent' from incompatible pointer type 
[-Werror=incompatible-pointer-types]

The code is otherwise correct, as the address is never actually used as a
physical address but only passed into a DMA register.  For consistently,
I'm changing the variable name as well, to clarify that this is a DMA
address.

Signed-off-by: Arnd Bergmann 
---
 drivers/net/ethernet/socionext/netsec.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/socionext/netsec.c 
b/drivers/net/ethernet/socionext/netsec.c
index 6c263af86b8a..f4c0b02ddad8 100644
--- a/drivers/net/ethernet/socionext/netsec.c
+++ b/drivers/net/ethernet/socionext/netsec.c
@@ -252,7 +252,7 @@ struct netsec_desc {
 };
 
 struct netsec_desc_ring {
-   phys_addr_t desc_phys;
+   dma_addr_t desc_dma;
struct netsec_desc *desc;
void *vaddr;
u16 pkt_cnt;
@@ -953,7 +953,7 @@ static void netsec_free_dring(struct netsec_priv *priv, int 
id)
 
if (dring->vaddr) {
dma_free_coherent(priv->dev, DESC_SZ * DESC_NUM,
- dring->vaddr, dring->desc_phys);
+ dring->vaddr, dring->desc_dma);
dring->vaddr = NULL;
}
 
@@ -967,7 +967,7 @@ static int netsec_alloc_dring(struct netsec_priv *priv, 
enum ring_id id)
int ret = 0;
 
dring->vaddr = dma_zalloc_coherent(priv->dev, DESC_SZ * DESC_NUM,
-  &dring->desc_phys, GFP_KERNEL);
+  &dring->desc_dma, GFP_KERNEL);
if (!dring->vaddr) {
ret = -ENOMEM;
goto err;
@@ -1087,14 +1087,14 @@ static int netsec_reset_hardware(struct netsec_priv 
*priv)
 
/* set desc_start addr */
netsec_write(priv, NETSEC_REG_NRM_RX_DESC_START_UP,
-upper_32_bits(priv->desc_ring[NETSEC_RING_RX].desc_phys));
+upper_32_bits(priv->desc_ring[NETSEC_RING_RX].desc_dma));
netsec_write(priv, NETSEC_REG_NRM_RX_DESC_START_LW,
-lower_32_bits(priv->desc_ring[NETSEC_RING_RX].desc_phys));
+lower_32_bits(priv->desc_ring[NETSEC_RING_RX].desc_dma));
 
netsec_write(priv, NETSEC_REG_NRM_TX_DESC_START_UP,
-upper_32_bits(priv->desc_ring[NETSEC_RING_TX].desc_phys));
+upper_32_bits(priv->desc_ring[NETSEC_RING_TX].desc_dma));
netsec_write(priv, NETSEC_REG_NRM_TX_DESC_START_LW,
-lower_32_bits(priv->desc_ring[NETSEC_RING_TX].desc_phys));
+lower_32_bits(priv->desc_ring[NETSEC_RING_TX].desc_dma));
 
/* set normal tx dring ring config */
netsec_write(priv, NETSEC_REG_NRM_TX_CONFIG,
-- 
2.9.0
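
The type mismatch this patch fixes can be illustrated outside the kernel. The
sketch below is a userspace stand-in, not driver code: the typedef widths, the
struct, and fake_alloc_coherent() are assumptions made for illustration. It
shows why the handle has to be a dma_addr_t when the two types differ in size,
and how the value is then split across two 32-bit registers.

```c
#include <stdint.h>

/* Stand-ins for the kernel types; the differing widths are an assumption. */
typedef uint32_t phys_addr_t;
typedef uint64_t dma_addr_t;

struct desc_ring {
	dma_addr_t desc_dma;   /* bus address handed to the device */
};

/* Double shift keeps the expression valid even if the type is only 32 bits wide. */
static uint32_t upper_32(dma_addr_t a) { return (uint32_t)((a >> 16) >> 16); }
static uint32_t lower_32(dma_addr_t a) { return (uint32_t)(a & 0xffffffffu); }

/* Hypothetical allocator: writes sizeof(dma_addr_t) bytes through 'handle'. */
static void fake_alloc_coherent(dma_addr_t *handle)
{
	*handle = 0x0000000123456000ull;
}

/* Fills regs[0]/regs[1] with the upper and lower halves of the ring address. */
static void ring_setup(struct desc_ring *ring, uint32_t regs[2])
{
	/* Passing a phys_addr_t * here is the incompatible-pointer error from the report. */
	fake_alloc_coherent(&ring->desc_dma);
	regs[0] = upper_32(ring->desc_dma);
	regs[1] = lower_32(ring->desc_dma);
}
```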



Re: [PATCH v1] x86/retpoline: Use lfence in the retpoline/RSB filling RSB macros

2018-01-13 Thread Tom Lendacky
On 1/13/2018 8:07 AM, Van De Ven, Arjan wrote:
>>> The RSB filling macro is applicable to AMD, and, if software is unable to
>>> verify that lfence is serializing on AMD (possible when running under a
>>> hypervisor), the generic retpoline support will be used and, so, is also
>>> applicable to AMD.  Change the use of pause to lfence.
>>>
>>> Signed-off-by: Tom Lendacky 
>>
>> Conditionally-Acked-by: David Woodhouse 
> 
> 
> pause is technically the "save me power" instruction
> 
> how about a compromise where we do a double:
> 
> pause
> lfence
> jmp 
> 
> as sequence... that way if the branch recovery is fast, we get the 
> performance of pause, but if it takes a while, on AMD you get the behavior of 
> lfence?

That should work on AMD.

Thanks,
Tom

> 
> 


Re: pci/setup-bus: Delete an error message for a failed memory allocation in add_to_list()

2018-01-13 Thread SF Markus Elfring
> Your commit message says "omit an extra message", which suggests that
> there are currently two messages about the memory allocation failure,
> and that your patch removes one of them.

Yes. - There is a general transformation pattern applied.


> If that's the case, it would be nice to know where the other message is.

Have you got any special experiences with backtraces?



> If your patch removes the *only* message about the memory allocation
> failure, that might be worth doing,

Thanks for a bit of positive feedback.


> but the changelog should be clear about that

Do you distinguish the “log” from a commit description?


> and say "I don't think the error message is worthwhile
> because the function already returns failure" or something similar.

Do you find the wording “WARNING: Possible unnecessary 'out of memory' message”
(from the script “checkpatch.pl”) more reasonable?



>> * Are you looking for a reminder on the Linux allocation failure report?
> 
> I don't know what the "Linux allocation failure report" is.

This information seems to be “hidden” in source code.

https://elixir.free-electrons.com/linux/v4.15-rc7/source/include/linux/gfp.h#L191
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/gfp.h?id=c92a9a461dff6140c539c61e457aa97df29517d6#n213


Are you familiar with the usage of the option “__GFP_NOWARN”?



>>> Also, please squash all the drivers/pci patches into one.
>>
>> To which other change possibilities do you refer here?
> 
> You posted two patches that remove error messages about memory
> allocation failures:

Yes. - Also for this software area …


>   
> http://lkml.kernel.org/r/dc3922b4-50f6-e7fa-482f-18e6ff5d9...@users.sourceforge.net

Is it safer to handle adjustments for the directory “drivers/pci/hotplug” 
separately?


>   
> http://lkml.kernel.org/r/fd9d212e-e8da-1aa7-be7f-7bf6d8f1e...@users.sourceforge.net
> 
> These are doing the same thing and could be combined into one patch.

The final committer could perform such an operation if a different patch
granularity would be preferred (or if you would insist on patch squashing).
I guess that you do not need to wait for me to apply an adjusted software
combination in this case.

Regards,
Markus


Re: [PATCH 04/11] signal/parisc: Document a conflict with SI_USER with SIGFPE

2018-01-13 Thread Eric W. Biederman
Helge Deller  writes:

> * Eric W. Biederman :
>> Setting si_code to 0 results in a userspace seeing an si_code of 0.
>> This is the same si_code as SI_USER.  Posix and common sense requires
>> that SI_USER not be a signal specific si_code.  As such this use of 0
>> for the si_code is a pretty horribly broken ABI.
>> 
>> Further use of si_code == 0 guaranteed that copy_siginfo_to_user saw a
>> value of __SI_KILL and now sees a value of SIL_KILL with the result
>> that uid and pid fields are copied and which might copying the si_addr
>> field by accident but certainly not by design.  Making this a very
>> flakey implementation.
>> 
>> Utilizing FPE_FIXME siginfo_layout will now return SIL_FAULT and the
>> appropriate fields will reliably be copied.
>> 
>> This bug is 13 years old and parsic machines are no longer being built
>> so I don't know if it possible or worth fixing it.  But it is at least
>> worth documenting this so other architectures don't make the same
>> mistake.
>
>
> I think we should fix it, even if we now break the ABI.
>
> It's about a "conditional trap" which needs to be handled by userspace.
> I doubt there is any Linux code out which is utilizing this
> parisc-specific trap.
>
> I'd suggest to add a new FPE trap si_code (e.g. FPE_CONDTRAP).
> While at it, maybe we should include the already existing FPE_MDAOVF
> from the frv architecture, so that arch/frv/include/uapi/asm/siginfo.h
> can go completely.
>
> Suggested patch is below.
>
> I'm willing to test the patch below on the parisc architecture for a few
> weeks. And it will break arch/x86/kernel/signal_compat.c which needs
> looking at then too.
>
> Thoughts?

I like it.

We have the option of bringing either the ia64 or the frv si_codes
into the generic fold.  Is there any reason you chose frv?
Last I looked ia64 tended in many aspects to be well thought out,
and thus worth a careful look.

Given that a couple of weeks likely puts us on the other side of the merge
window I would like to start with my patch so I can close the potential
copying of uninitialized memory to userspace.  Then we can build yours on
top.

Although I am more than happy to add new si_codes now.

What I am in the final stages of testing and reviewing internally is the
change to merge all of struct siginfo, struct compat_siginfo,
copy_siginfo_from_user32 and copy_siginfo_to_user32 together.

I need another couple hours and I will be ready to post that.

For long term maintenance the more we can merge together the better,
as clearly some of these bugs have persisted far too long.  And
collapsing the arch specific si_codes into just one set of si_codes
looks like one more good step in that direction.

Eric


> Helge
>
>
>
> [PATCH] parisc: Add FPE_CONDTRAP for conditional trap handling
>
> Posix and common sense requires that SI_USER not be a signal specific
> si_code.  Thus add a new FPE_CONDTRAP si_code for conditional traps.
>
> Signed-off-by: Helge Deller 
>
> diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
> index 8453724b8009..13702f0f5ba1 100644
> --- a/arch/parisc/kernel/traps.c
> +++ b/arch/parisc/kernel/traps.c
> @@ -627,9 +627,9 @@ void notrace handle_interruption(int code, struct pt_regs 
> *regs)
>  on condition  */
>   if(user_mode(regs)){
>   si.si_signo = SIGFPE;
> - /* Set to zero, and let the userspace app figure it out 
> from
> -the insn pointed to by si_addr */
> - si.si_code = 0;
> + /* Let userspace app figure out from the insn pointed
> +  * to by si_addr */
> + si.si_code = FPE_CONDTRAP;
>   si.si_addr = (void __user *) regs->iaoq[0];
>   force_sig_info(SIGFPE, &si, current);
>   return;
> diff --git a/include/uapi/asm-generic/siginfo.h 
> b/include/uapi/asm-generic/siginfo.h
> index e447283b8f52..2b759fe42142 100644
> --- a/include/uapi/asm-generic/siginfo.h
> +++ b/include/uapi/asm-generic/siginfo.h
> @@ -193,7 +193,9 @@ typedef struct siginfo {
>  #define FPE_FLTRES   6   /* floating point inexact result */
>  #define FPE_FLTINV   7   /* floating point invalid operation */
>  #define FPE_FLTSUB   8   /* subscript out of range */
> -#define NSIGFPE  8
> +#define FPE_MDAOVF   9   /* media overflow */
> +#define FPE_CONDTRAP 10  /* trap on condition */
> +#define NSIGFPE  10
>  
>  /*
>   * SIGSEGV si_codes
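
The ambiguity discussed in this thread can be seen from userspace. The sketch
below is an illustration, not part of either patch; the function names are
made up. It sends SIGFPE to the calling process with kill() and records the
si_code the handler sees. On Linux that value is SI_USER, which is 0, the same
value the parisc trap path stored before the proposed FPE_CONDTRAP.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

static volatile sig_atomic_t seen_code = -1;

/* SA_SIGINFO handler: remember the si_code of the delivered signal. */
static void on_fpe(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)ctx;
	seen_code = info->si_code;
}

/* Sends SIGFPE to ourselves with kill() and returns the si_code the handler saw. */
static int si_code_of_self_kill(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_fpe;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGFPE, &sa, NULL) != 0)
		return -1;

	seen_code = -1;
	/* Assumes a single-threaded caller with SIGFPE unblocked, so the
	 * signal is delivered before kill() returns. */
	kill(getpid(), SIGFPE);
	return seen_code;
}
```

A handler that sees si_code == 0 therefore cannot tell a user-sent SIGFPE from
a kernel-generated one that also used 0, which is the argument for a dedicated
code.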


Re: [PATCH v2 16/22] mmc: tmio: fix never-detected card insertion bug

2018-01-13 Thread Wolfram Sang

> I am talking about the card detection
> by the IP-builtin circuit.

Yes, I know. As I wrote in one of the previous patches when reviewing
it, I disabled GPIO CD and used the internal mechanism (for tests where
it is relevant). Like here, too.

>  - GPIO is not set up -> mmc_gpio_get_cd() returns -ENOSYS

Thanks! That pointed me in the right direction. I missed that patch
10/22 was still under discussion and not applied to mmc/next, so I had
to pick it manually.

I can confirm now that there is an issue and your patch fixes it for the
non-GPIO case. For the GPIO case, however, the TMIO_STAT_CARD_REMOVE |
TMIO_STAT_CARD_INSERT interrupts are enabled now, too. It didn't harm
when doing my tests, but we shouldn't do it, to be safe IMO.



signature.asc
Description: PGP signature


Re: Yet another KPTI regression with 4.14.x series in a VM

2018-01-13 Thread Andy Lutomirski
On Sat, Jan 13, 2018 at 12:45 PM, Thomas Gleixner  wrote:
> On Sat, 13 Jan 2018, Andy Lutomirski wrote:
>> Trying to inventory this stuff scattered all over the place:
>>
>> #define PTI_PGTABLE_SWITCH_BIT    PAGE_SHIFT
>> #define PTI_SWITCH_PGTABLES_MASK    (1<<PAGE_SHIFT)
>> # define X86_CR3_PTI_SWITCH_BIT    11
>> #define PTI_SWITCH_MASK
>> (PTI_SWITCH_PGTABLES_MASK|(1<<X86_CR3_PTI_SWITCH_BIT))
>>
>> Blech.  I wouldn't be terribly surprised if I missed a few as well.  How 
>> about:
>>
>> PTI_USER_PGTABLE_BIT = PAGE_SHIFT
>> PTI_USER_PGTABLE_MASK = 1 << PTI_USER_PGTABLE_BIT
>> PTI_USER_PCID_BIT = 11
>> PTI_USER_PCID_MASK = 1 << PTI_USER_PCID_BIT
>> PTI_USER_PGTABLE_AND_PCID_MASK = PTI_USER_PCID_MASK | PTI_USER_PGTABLE_MASK
>>
>> This naming would make the apparently buggy code look fishy, as it
>> should.  I will give this a shot some time soon if no one beats me to
>> it.
>
> Well, the thing we tripped over is that we trusted the SDM that bit 11 is
> ignored. Seems its not and the AMD APM says that reserved bit should be
> cleared. Next time I surely stare into both
>
> So something like the below should make it clear. I've not done the
> alternatives thing yet...
>

Looks generally sane to me.

>
> 8<---
> --- a/arch/x86/entry/calling.h
> +++ b/arch/x86/entry/calling.h
> @@ -198,8 +198,11 @@ For 32-bit we have the following convent
>   * PAGE_TABLE_ISOLATION PGDs are 8k.  Flip bit 12 to switch between the two
>   * halves:
>   */
> -#define PTI_SWITCH_PGTABLES_MASK   (1<<PAGE_SHIFT)
> -#define PTI_SWITCH_MASK
> (PTI_SWITCH_PGTABLES_MASK|(1<<X86_CR3_PTI_SWITCH_BIT))
> +#define PTI_USER_PGTABLE_BIT   PAGE_SHIFT
> +#define PTI_USER_PGTABLE_MASK  (1 << PTI_USER_PGTABLE_BIT)
> +#define PTI_USER_PCID_BIT  X86_CR3_PTI_PCID_USER_BIT
> +#define PTI_USER_PCID_MASK (1 << PTI_USER_PCID_BIT)
> +#define PTI_USER_PGTABLE_AND_PCID_MASK  (PTI_USER_PCID_MASK | 
> PTI_USER_PGTABLE_MASK)
>
>  .macro SET_NOFLUSH_BIT reg:req
> bts $X86_CR3_PCID_NOFLUSH_BIT, \reg
> @@ -208,7 +211,7 @@ For 32-bit we have the following convent
>  .macro ADJUST_KERNEL_CR3 reg:req
> ALTERNATIVE "", "SET_NOFLUSH_BIT \reg", X86_FEATURE_PCID
> /* Clear PCID and "PAGE_TABLE_ISOLATION bit", point CR3 at kernel 
> pagetables: */
> -   andq$(~PTI_SWITCH_MASK), \reg
> +   andq$(~PTI_USER_PGTABLE_AND_PCID_MASK), \reg
>  .endm
>
>  .macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
> @@ -239,15 +242,18 @@ For 32-bit we have the following convent
> /* Flush needed, clear the bit */
> btr \scratch_reg, THIS_CPU_user_pcid_flush_mask
> movq\scratch_reg2, \scratch_reg
> -   jmp .Lwrcr3_\@
> +   jmp .Lwrcr3_pcid_\@
>
>  .Lnoflush_\@:
> movq\scratch_reg2, \scratch_reg
> SET_NOFLUSH_BIT \scratch_reg
>
> +.Lwcr3_pcid_\@:
> +   orq $(PTI_USER_PCID_MASK), \scratch_reg
> +
>  .Lwrcr3_\@:
> /* Flip the PGD and ASID to the user version */
> -   orq $(PTI_SWITCH_MASK), \scratch_reg
> +   orq $(PTI_USER_PGTABLE_MASK), \scratch_reg
> mov \scratch_reg, %cr3
>  .Lend_\@:
>  .endm
> @@ -272,7 +278,7 @@ For 32-bit we have the following convent
>  *
>  * That indicates a kernel CR3 value, not a user CR3.
>  */
> -   testq   $(PTI_SWITCH_MASK), \scratch_reg
> +   testq   $(PTI_USER_PGTABLE_MASK), \scratch_reg
> jz  .Ldone_\@
>
> ADJUST_KERNEL_CR3 \scratch_reg
> @@ -290,7 +296,7 @@ For 32-bit we have the following convent
>  * KERNEL pages can always resume with NOFLUSH as we do
>  * explicit flushes.
>  */
> -   bt  $X86_CR3_PTI_SWITCH_BIT, \save_reg
> +   bt  $PTI_USER_PGTABLE_BIT, \save_reg
> jnc .Lnoflush_\@
>
> /*
> --- a/arch/x86/include/asm/processor-flags.h
> +++ b/arch/x86/include/asm/processor-flags.h
> @@ -40,7 +40,7 @@
>  #define CR3_NOFLUSHBIT_ULL(63)
>
>  #ifdef CONFIG_PAGE_TABLE_ISOLATION
> -# define X86_CR3_PTI_SWITCH_BIT11
> +# define X86_CR3_PTI_PCID_USER_BIT 11
>  #endif
>
>  #else
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -81,13 +81,13 @@ static inline u16 kern_pcid(u16 asid)
>  * Make sure that the dynamic ASID space does not confict with the
>  * bit we are using to switch between user and kernel ASIDs.
>  */
> -   BUILD_BUG_ON(TLB_NR_DYN_ASIDS >= (1 << X86_CR3_PTI_SWITCH_BIT));
> +   BUILD_BUG_ON(TLB_NR_DYN_ASIDS >= (1 << X86_CR3_PTI_PCID_USER_BIT));
>
> /*
>  * The ASID being passed in here should have respected the
>  * MAX_ASID_AVAILABLE and thus never have the switch bit set.
>  */
> -   VM_WARN_ON_ONCE(asid & (1 << X86_CR3_PTI_SWITCH_BIT));
> +   VM_WARN_ON_ONCE(asid & (1 << X86_CR3_PTI_PCID_USER_BIT));
>  #endif
> /*
>  * The dynamically-assigned ASIDs that get passed in are small
> @@ -112,7 +112,7 @@ static inline u16 user_pcid(u16 asid)
>  {
>  

Re: Yet another KPTI regression with 4.14.x series in a VM

2018-01-13 Thread Thomas Gleixner
On Sat, 13 Jan 2018, Andy Lutomirski wrote:
> Trying to inventory this stuff scattered all over the place:
> 
> #define PTI_PGTABLE_SWITCH_BIT    PAGE_SHIFT
> #define PTI_SWITCH_PGTABLES_MASK    (1<<PAGE_SHIFT)
> # define X86_CR3_PTI_SWITCH_BIT    11
> #define PTI_SWITCH_MASK
> (PTI_SWITCH_PGTABLES_MASK|(1<<X86_CR3_PTI_SWITCH_BIT))
>
> Blech.  I wouldn't be terribly surprised if I missed a few as well.  How 
> about:
> 
> PTI_USER_PGTABLE_BIT = PAGE_SHIFT
> PTI_USER_PGTABLE_MASK = 1 << PTI_USER_PGTABLE_BIT
> PTI_USER_PCID_BIT = 11
> PTI_USER_PCID_MASK = 1 << PTI_USER_PCID_BIT
> PTI_USER_PGTABLE_AND_PCID_MASK = PTI_USER_PCID_MASK | PTI_USER_PGTABLE_MASK
> 
> This naming would make the apparently buggy code look fishy, as it
> should.  I will give this a shot some time soon if no one beats me to
> it.

Well, the thing we tripped over is that we trusted the SDM that bit 11 is
ignored. Seems its not and the AMD APM says that reserved bit should be
cleared. Next time I surely stare into both

So something like the below should make it clear. I've not done the
alternatives thing yet...

Thanks,

tglx

8<---
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -198,8 +198,11 @@ For 32-bit we have the following convent
  * PAGE_TABLE_ISOLATION PGDs are 8k.  Flip bit 12 to switch between the two
  * halves:
  */
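-#define PTI_SWITCH_PGTABLES_MASK   (1<<PAGE_SHIFT)
-#define PTI_SWITCH_MASK    (PTI_SWITCH_PGTABLES_MASK|(1<<X86_CR3_PTI_SWITCH_BIT))
+#define PTI_USER_PGTABLE_BIT   PAGE_SHIFT
+#define PTI_USER_PGTABLE_MASK  (1 << PTI_USER_PGTABLE_BIT)
+#define PTI_USER_PCID_BIT  X86_CR3_PTI_PCID_USER_BIT
+#define PTI_USER_PCID_MASK (1 << PTI_USER_PCID_BIT)
+#define PTI_USER_PGTABLE_AND_PCID_MASK  (PTI_USER_PCID_MASK | PTI_USER_PGTABLE_MASK)
 
 .macro SET_NOFLUSH_BIT reg:req
bts $X86_CR3_PCID_NOFLUSH_BIT, \reg
@@ -208,7 +211,7 @@ For 32-bit we have the following convent
 .macro ADJUST_KERNEL_CR3 reg:req
ALTERNATIVE "", "SET_NOFLUSH_BIT \reg", X86_FEATURE_PCID
/* Clear PCID and "PAGE_TABLE_ISOLATION bit", point CR3 at kernel pagetables: */
-   andq $(~PTI_SWITCH_MASK), \reg
+   andq $(~PTI_USER_PGTABLE_AND_PCID_MASK), \reg
 .endm
 
 .macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
@@ -239,15 +242,18 @@ For 32-bit we have the following convent
/* Flush needed, clear the bit */
btr \scratch_reg, THIS_CPU_user_pcid_flush_mask
movq \scratch_reg2, \scratch_reg
-   jmp .Lwrcr3_\@
+   jmp .Lwrcr3_pcid_\@
 
 .Lnoflush_\@:
movq \scratch_reg2, \scratch_reg
SET_NOFLUSH_BIT \scratch_reg
 
+.Lwcr3_pcid_\@:
+   orq $(PTI_USER_PCID_MASK), \scratch_reg
+
 .Lwrcr3_\@:
/* Flip the PGD and ASID to the user version */
-   orq $(PTI_SWITCH_MASK), \scratch_reg
+   orq $(PTI_USER_PGTABLE_MASK), \scratch_reg
mov \scratch_reg, %cr3
 .Lend_\@:
 .endm
@@ -272,7 +278,7 @@ For 32-bit we have the following convent
 *
 * That indicates a kernel CR3 value, not a user CR3.
 */
-   testq   $(PTI_SWITCH_MASK), \scratch_reg
+   testq   $(PTI_USER_PGTABLE_MASK), \scratch_reg
jz  .Ldone_\@
 
ADJUST_KERNEL_CR3 \scratch_reg
@@ -290,7 +296,7 @@ For 32-bit we have the following convent
 * KERNEL pages can always resume with NOFLUSH as we do
 * explicit flushes.
 */
-   bt  $X86_CR3_PTI_SWITCH_BIT, \save_reg
+   bt  $PTI_USER_PGTABLE_BIT, \save_reg
jnc .Lnoflush_\@
 
/*
--- a/arch/x86/include/asm/processor-flags.h
+++ b/arch/x86/include/asm/processor-flags.h
@@ -40,7 +40,7 @@
 #define CR3_NOFLUSH    BIT_ULL(63)
 
 #ifdef CONFIG_PAGE_TABLE_ISOLATION
-# define X86_CR3_PTI_SWITCH_BIT    11
+# define X86_CR3_PTI_PCID_USER_BIT 11
 #endif
 
 #else
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -81,13 +81,13 @@ static inline u16 kern_pcid(u16 asid)
 * Make sure that the dynamic ASID space does not confict with the
 * bit we are using to switch between user and kernel ASIDs.
 */
-   BUILD_BUG_ON(TLB_NR_DYN_ASIDS >= (1 << X86_CR3_PTI_SWITCH_BIT));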
+   BUILD_BUG_ON(TLB_NR_DYN_ASIDS >= (1 << X86_CR3_PTI_PCID_USER_BIT));
 
/*
 * The ASID being passed in here should have respected the
 * MAX_ASID_AVAILABLE and thus never have the switch bit set.
 */
-   VM_WARN_ON_ONCE(asid & (1 << X86_CR3_PTI_SWITCH_BIT));
+   VM_WARN_ON_ONCE(asid & (1 << X86_CR3_PTI_PCID_USER_BIT));
 #endif
/*
 * The dynamically-assigned ASIDs that get passed in are small
@@ -112,7 +112,7 @@ static inline u16 user_pcid(u16 asid)
 {
u16 ret = kern_pcid(asid);
 #ifdef CONFIG_PAGE_TABLE_ISOLATION
-   ret |= 1 << X86_CR3_PTI_SWITCH_BIT;
+   ret |= 1 << X86_CR3_PTI_PCID_USER_BIT;
 #endif
return ret;
 }
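
The renamed constants proposed in this thread can be checked with plain C
arithmetic. The sketch below is a userspace model, assuming PAGE_SHIFT is 12
(4 KiB pages); the two helper functions are hypothetical and only mirror, as
far as the draft patch can be read, the idea that the page-table bit is always
flipped while the PCID bit is touched only when PCID is in use.

```c
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumption: 4 KiB pages */

#define PTI_USER_PGTABLE_BIT           PAGE_SHIFT
#define PTI_USER_PGTABLE_MASK          (1ULL << PTI_USER_PGTABLE_BIT)
#define PTI_USER_PCID_BIT              11
#define PTI_USER_PCID_MASK             (1ULL << PTI_USER_PCID_BIT)
#define PTI_USER_PGTABLE_AND_PCID_MASK (PTI_USER_PCID_MASK | PTI_USER_PGTABLE_MASK)

/* Kernel CR3 from user CR3: clear both the page-table and PCID switch bits. */
static uint64_t to_kernel_cr3(uint64_t cr3)
{
	return cr3 & ~PTI_USER_PGTABLE_AND_PCID_MASK;
}

/* User CR3: always select the user page-table half; set bit 11 only with PCID. */
static uint64_t to_user_cr3(uint64_t cr3, int have_pcid)
{
	if (have_pcid)
		cr3 |= PTI_USER_PCID_MASK;
	return cr3 | PTI_USER_PGTABLE_MASK;
}
```

With these names, code that ORs in the combined mask on a machine without PCID
stands out, which is the point of the renaming.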





[GIT PULL] Staging driver fix for 4.15-rc8

2018-01-13 Thread Greg KH
The following changes since commit 30a7acd573899fd8b8ac39236eff6468b195ac7d:

  Linux 4.15-rc6 (2017-12-31 14:47:43 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git/ 
tags/staging-4.15-rc8

for you to fetch changes up to 443064cb0b1fb4569fe0a71209da7625129fb760:

  staging: android: ashmem: fix a race condition in ASHMEM_SET_SIZE ioctl 
(2018-01-09 15:32:11 +0100)


Staging driver fix for 4.15-rc8

Here is a single android ashmem bugfix that resolves a reported issue in
that interface.  It's been in linux-next this week with no reported
issues.

Signed-off-by: Greg Kroah-Hartman 


Viktor Slavkovic (1):
  staging: android: ashmem: fix a race condition in ASHMEM_SET_SIZE ioctl

 drivers/staging/android/ashmem.c | 2 ++
 1 file changed, 2 insertions(+)


[GIT PULL] USB driver fixes for 4.15-rc8

2018-01-13 Thread Greg KH
The following changes since commit 30a7acd573899fd8b8ac39236eff6468b195ac7d:

  Linux 4.15-rc6 (2017-12-31 14:47:43 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/ 
tags/usb-4.15-rc8

for you to fetch changes up to 1a2e91e795def04e15fac87b8e16b635691d0b82:

  Documentation: usb: fix typo in UVC gadgetfs config command (2018-01-11 
18:39:52 +0100)


USB fixes for 4.15-rc8

Here are some small USB fixes and device ids for 4.15-rc8

Nothing major, small fixes for various devices, some resolutions for
bugs found by fuzzers, and the usual handful of new device ids.

All of these have been in linux-next with no reported issues.

Signed-off-by: Greg Kroah-Hartman 


Alan Stern (1):
  USB: UDC core: fix double-free in usb_add_gadget_udc_release

Bin Liu (1):
  Documentation: usb: fix typo in UVC gadgetfs config command

Christian Holl (1):
  USB: serial: cp210x: add new device ID ELV ALC 8xxx

Diego Elio Pettenò (1):
  USB: serial: cp210x: add IDs for LifeScan OneTouch Verio IQ

Greg Kroah-Hartman (1):
  Merge tag 'usb-serial-4.15-rc8' of 
https://git.kernel.org/.../johan/usb-serial into usb-linus

Icenowy Zheng (1):
  uas: ignore UAS for Norelsys NS1068(X) chips

Pete Zaitcev (1):
  USB: fix usbmon BUG trigger

Shuah Khan (3):
  usbip: fix vudc_rx: harden CMD_SUBMIT path to handle malicious input
  usbip: remove kernel addresses from usb device and urb debug msgs
  usbip: vudc_tx: fix v_send_ret_submit() vulnerability to null xfer buffer

Stefan Agner (1):
  usb: misc: usb3503: make sure reset is low for at least 100us

 Documentation/usb/gadget-testing.txt |  2 +-
 drivers/usb/gadget/udc/core.c| 28 +---
 drivers/usb/misc/usb3503.c   |  2 ++
 drivers/usb/mon/mon_bin.c|  8 +++-
 drivers/usb/serial/cp210x.c  |  2 ++
 drivers/usb/storage/unusual_uas.h|  7 +++
 drivers/usb/usbip/usbip_common.c | 17 +++--
 drivers/usb/usbip/vudc_rx.c  | 19 +++
 drivers/usb/usbip/vudc_tx.c  | 11 +--
 9 files changed, 63 insertions(+), 33 deletions(-)


[GIT PULL] Char/Misc driver fixes for 4.15-rc8

2018-01-13 Thread Greg KH
The following changes since commit 30a7acd573899fd8b8ac39236eff6468b195ac7d:

  Linux 4.15-rc6 (2017-12-31 14:47:43 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git/ 
tags/char-misc-4.15-rc8

for you to fetch changes up to aa1f10e85b0ab53dee85d8e293c8159d18d293a8:

  mux: core: fix double get_device() (2018-01-09 14:19:41 +0100)


Char/Misc fixes for 4.15-rc8

Here are two bugfixes for some driver bugs for 4.15-rc8

The first is a bluetooth security bug that has been ignored by the
Bluetooth developers for months for no obvious reason at all, so I've
taken it through my tree.

The second is a simple double-free bug in the mux subsystem.

Both have been in linux-next for a while with no reported issues.

Signed-off-by: Greg Kroah-Hartman 


Ben Seri (1):
  Bluetooth: Prevent stack info leak from the EFS element.

Hans de Goede (1):
  mux: core: fix double get_device()

 drivers/mux/core.c |  4 +++-
 net/bluetooth/l2cap_core.c | 20 +++-
 2 files changed, 14 insertions(+), 10 deletions(-)


Re: [PATCH v5] leaking_addresses: add generic 32-bit support

2018-01-13 Thread Tobin C. Harding
On Sat, Jan 13, 2018 at 10:55:26AM +, Kaiwan N Billimoria wrote:
> Hi Tobin,
> 
> Thanks very much for your detailed review.
> Just wanted to say that am up to my neck in work (an exceptionally busy
> time), hence will take a while to work on this - around another 3 weeks
> perhaps.
> I'd like to continue, but if you feel it's too long please move ahead.

No worries Kaiwan, thanks for letting me know.  I'll cc you on any
patches to leaking_addresses.pl for now so you can easily keep up with
what's going on.

thanks,
Tobin.


[PATCH v2] Coccinelle: kzalloc-simple: Rename kzalloc-simple to zalloc-simple

2018-01-13 Thread Himanshu Jha
Rename kzalloc-simple to zalloc-simple since the rule is no longer
specific to the kzalloc function, but also covers the many other zero
memory allocating functions specified in the rule.

Suggested-by: SF Markus Elfring 
Signed-off-by: Himanshu Jha 
---
 v2:
-generated the patch using -M flag for renaming.
-Added Markus's name too, since he worked on this rename earlier.
 .../coccinelle/api/alloc/{kzalloc-simple.cocci => zalloc-simple.cocci}| 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename scripts/coccinelle/api/alloc/{kzalloc-simple.cocci => 
zalloc-simple.cocci} (100%)

diff --git a/scripts/coccinelle/api/alloc/kzalloc-simple.cocci 
b/scripts/coccinelle/api/alloc/zalloc-simple.cocci
similarity index 100%
rename from scripts/coccinelle/api/alloc/kzalloc-simple.cocci
rename to scripts/coccinelle/api/alloc/zalloc-simple.cocci
-- 
2.7.4



Re: [PATCH v3 8/9] x86: use __uaccess_begin_nospec and ASM_IFENCE in get_user paths

2018-01-13 Thread Eric W. Biederman
Linus Torvalds  writes:

> On Sat, Jan 13, 2018 at 11:05 AM, Linus Torvalds
>  wrote:
>>
>> I _know_ that lfence is expensive as hell on P4, for example.
>>
>> Yes, yes, "sbb" is often more expensive than most ALU instructions,
>> and Agner Fog says it has a 10-cycle latency on Prescott (which is
>> outrageous, but being one or two cycles more due to the flags
>> generation is normal). So the sbb/and may certainly add a few cycles
>> to the critical path, but on Prescott "lfence" is *50* cycles
>> according to those same tables by Agner Fog.
>
> Side note: I don't think P4 is really relevant for a performance
> discussion, I was just giving it as an example where we do know actual
> cycles.
>
> I'm much more interested in modern Intel big-core CPU's, and just
> wondering whether somebody could ask an architect.
>
> Because I _suspect_ the answer from a CPU architect would be: "Christ,
> the sbb/and sequence is much better because it doesn't have any extra
> serialization", but maybe I'm wrong, and people feel that lfence is
> particularly easy to do right without any real downside.

As an educated observer it seems like the cmpq/sbb/and sequence is an
improvement because it moves the dependency from one end of the cpu
pipeline to another.  If any cpu does data speculation on anything other
than branch targets that sequence could still be susceptible to
speculation.

From the AMD patches it appears that lfence is becoming a serializing
instruction which in principle is much more expensive.

Also do we have alternatives for these sequences so if we run on an
in-order atom (or 386 or 486) where speculation does not occur we can
avoid the cost?

Eric
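
The data-dependency idea behind the cmpq/sbb/and sequence can be sketched in
portable C. This is an illustration only, not the code from the patch series:
the function names are made up, it assumes size <= LONG_MAX and an arithmetic
right shift of negative values (implementation-defined in C, but what gcc and
clang do), and nothing guarantees that a compiler keeps it branch-free.

```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/*
 * All-ones when index < size, zero otherwise, computed without a branch:
 * the sign bit of (index | (size - 1 - index)) is set exactly when the
 * index is out of range, and the shift smears its inverse over the word.
 */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
	return (unsigned long)(~(long)(index | (size - 1UL - index))
			       >> (BITS_PER_LONG - 1));
}

/* In-range indexes pass through unchanged, out-of-range ones become 0. */
static unsigned long clamp_index(unsigned long index, unsigned long size)
{
	return index & index_mask(index, size);
}
```

The load that follows then depends on the mask computation rather than on a
predicted branch, which is the property being weighed against lfence here.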


Re: pci/setup-bus: Delete an error message for a failed memory allocation in add_to_list()

2018-01-13 Thread Bjorn Helgaas
On Sat, Jan 13, 2018 at 07:15:04AM +0100, SF Markus Elfring wrote:
> >> Omit an extra message for a memory allocation failure in this function.
> > 
> > If this is an "extra" message, I assume there's some other message?
> > Can you mention where that is in the changelog?
> 
> * Would you like to get a more detailed commit description?

Your commit message says "omit an extra message", which suggests that
there are currently two messages about the memory allocation failure,
and that your patch removes one of them.

If that's the case, it would be nice to know where the other message
is.

If your patch removes the *only* message about the memory allocation
failure, that might be worth doing, but the changelog should be clear
about that and say "I don't think the error message is worthwhile
because the function already returns failure" or something similar.

> * Are you looking for a reminder on the Linux allocation failure report?

I don't know what the "Linux allocation failure report" is.

> > Also, please squash all the drivers/pci patches into one.
> 
> To which other change possibilities do you refer here?

You posted two patches that remove error messages about memory
allocation failures:

  
http://lkml.kernel.org/r/dc3922b4-50f6-e7fa-482f-18e6ff5d9...@users.sourceforge.net
  
http://lkml.kernel.org/r/fd9d212e-e8da-1aa7-be7f-7bf6d8f1e...@users.sourceforge.net

These are doing the same thing and could be combined into one patch.

Bjorn


[PATCH v3] input: pxrc: new driver for PhoenixRC Flight Controller Adapter

2018-01-13 Thread Marcus Folkesson
This driver lets you plug your RC controller into the adapter and
use it as an input device in various RC simulators.

Signed-off-by: Marcus Folkesson 
---
v3:
- Use RUDDER and MISC instead of TILT_X and TILT_Y
- Drop kref and anchor
- Rework URB handling
- Add PM support
v2:
- Change module license to GPLv2 to match SPDX tag

 Documentation/input/devices/pxrc.rst |  57 +++
 drivers/input/joystick/Kconfig   |   9 +
 drivers/input/joystick/Makefile  |   1 +
 drivers/input/joystick/pxrc.c| 320 +++
 4 files changed, 387 insertions(+)
 create mode 100644 Documentation/input/devices/pxrc.rst
 create mode 100644 drivers/input/joystick/pxrc.c

diff --git a/Documentation/input/devices/pxrc.rst 
b/Documentation/input/devices/pxrc.rst
new file mode 100644
index ..ca11f646bae8
--- /dev/null
+++ b/Documentation/input/devices/pxrc.rst
@@ -0,0 +1,57 @@
+===
+pxrc - PhoenixRC Flight Controller Adapter
+===
+
+:Author: Marcus Folkesson 
+
+This driver let you use your own RC controller plugged into the
+adapter that comes with PhoenixRC [1]_ or other compatible adapters.
+
+The adapter supports 7 analog channels and 1 digital input switch.
+
+Notes
+=
+
+Many RC controllers is able to configure which stick goes to which channel.
+This is also configurable in most simulators, so a matching is not necessary.
+
+The driver is generating the following input event for analog channels:
+
++-++
+| Channel |  Event |
++=++
+| 1   |  ABS_X |
++-++
+| 2   |  ABS_Y |
++-++
+| 3   |  ABS_RX|
++-++
+| 4   |  ABS_RY|
++-++
+| 5   |  ABS_RUDDER|
++-++
+| 6   |  ABS_THROTTLE  |
++-++
+| 7   |  ABS_MISC  |
++-++
+
+The digital input switch is generated as an `BTN_A` event.
+
+Manual Testing
+==
+
+To test this driver's functionality you may use `input-event` which is part of
+the `input layer utilities` suite [2]_.
+
+For example::
+
+> modprobe pxrc
+> input-events 
+
+To print all input events from input `devnr`.
+
+References
+==
+
+.. [1] http://www.phoenix-sim.com/
+.. [2] https://www.kraxel.org/cgit/input/
diff --git a/drivers/input/joystick/Kconfig b/drivers/input/joystick/Kconfig
index f3c2f6ea8b44..18ab6dafff41 100644
--- a/drivers/input/joystick/Kconfig
+++ b/drivers/input/joystick/Kconfig
@@ -351,4 +351,13 @@ config JOYSTICK_PSXPAD_SPI_FF
 
  To drive rumble motor a dedicated power supply is required.
 
+config JOYSTICK_PXRC
+   tristate "PhoenixRC Flight Controller Adapter"
+   depends on USB_ARCH_HAS_HCD
+   select USB
+   help
+ Say Y here if you want to use the PhoenixRC Flight Controller Adapter.
+
+ To compile this driver as a module, choose M here: the
+ module will be called pxrc.
 endif
diff --git a/drivers/input/joystick/Makefile b/drivers/input/joystick/Makefile
index 67651efda2e1..dd0492ebbed7 100644
--- a/drivers/input/joystick/Makefile
+++ b/drivers/input/joystick/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_JOYSTICK_JOYDUMP)+= joydump.o
 obj-$(CONFIG_JOYSTICK_MAGELLAN)+= magellan.o
 obj-$(CONFIG_JOYSTICK_MAPLE)   += maplecontrol.o
 obj-$(CONFIG_JOYSTICK_PSXPAD_SPI)  += psxpad-spi.o
+obj-$(CONFIG_JOYSTICK_PXRC)+= pxrc.o
 obj-$(CONFIG_JOYSTICK_SIDEWINDER)  += sidewinder.o
 obj-$(CONFIG_JOYSTICK_SPACEBALL)   += spaceball.o
 obj-$(CONFIG_JOYSTICK_SPACEORB)+= spaceorb.o
diff --git a/drivers/input/joystick/pxrc.c b/drivers/input/joystick/pxrc.c
new file mode 100644
index ..98d9b8184c46
--- /dev/null
+++ b/drivers/input/joystick/pxrc.c
@@ -0,0 +1,320 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for Phoenix RC Flight Controller Adapter
+ *
+ * Copyright (C) 2018 Marcus Folkesson 
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define PXRC_VENDOR_ID (0x1781)
+#define PXRC_PRODUCT_ID(0x0898)
+
+static const struct usb_device_id pxrc_table[] = {
+   { USB_DEVICE(PXRC_VENDOR_ID, PXRC_PRODUCT_ID) },
+   { }
+};
+MODULE_DEVICE_TABLE(usb, pxrc_table);
+
+struct pxrc {
+   struct input_dev*input;
+   struct usb_device   *udev;
+   struct usb_interface*intf;
+   struct urb  *urb;
+   __u8epaddr;
+   charphys[64];
+   unsigned char   *data;
+   size_t  bsize;
+};
+
+static void pxrc_usb_irq(struct urb *urb)
+{
+   struct pxrc *pxrc = urb->context;
+   

Re: Yet another KPTI regression with 4.14.x series in a VM

2018-01-13 Thread Andy Lutomirski
On Fri, Jan 12, 2018 at 10:33 PM, Willy Tarreau  wrote:
> On Fri, Jan 12, 2018 at 10:08:20PM -0800, Andy Lutomirski wrote:
>> In fact, it looks like this code is totally bogus and has never been
>> correct at all.  Even in:
>>
>> commit 4b1d5ae3b103eda43f9d0f85c355bb6995b03a30
>> Author: Peter Zijlstra 
>> Date:   Mon Dec 4 15:07:59 2017 +0100
>>
>> x86/mm: Use/Fix PCID to optimize user/kernel switches
>>
>> We have:
>>
>> .macro SWITCH_TO_USER_CR3_NOSTACK scratch_reg:req scratch_reg2:req
>> ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
>> mov %cr3, \scratch_reg
>>
>> ALTERNATIVE "jmp .Lwrcr3_\@", "", X86_FEATURE_PCID
>>
>> ...
>>
>> .Lwrcr3_\@:
>> /* Flip the PGD and ASID to the user version */
>> orq $(PTI_SWITCH_MASK), \scratch_reg
>> mov \scratch_reg, %cr3
>> .Lend_\@:
>>
>> That's bogus.  PTI_SWITCH_MASK is 0x1800, which has PCID = 0x800.
>>
>> This should probably use an alternative to select between 0x1000 and
>> 0x800 depending on X86_FEATURE_PCID or just use an entirely different
>> label for the !PCID case.
>>
>> FWIW, this bit in SAVE_AND_SWITCH_TO_KERNEL_CR3
>>
>> testq   $(PTI_SWITCH_MASK), \scratch_reg
>> jz  .Ldone_\@
>>
>> is a bit silly, too.  It's *correct* (I think), but shouldn't that
>> just be bt $(PTI_SWITCH_PGTABLES_BIT), \scratch_reg, with the obvious
>> caveat that the headers don't actually define PTI_SWITCH_PGTABLES_BIT?
>
> I wondered the same initially when reading this but thought there was
> surely a good reason that I could not understand due to my lack of
> knowledge and stopped wondering. BTW your PTI_SWITCH_PGTABLES_BIT would
> in fact be PAGE_SHIFT :-)

Trying to inventory this stuff scattered all over the place:

#define PTI_PGTABLE_SWITCH_BITPAGE_SHIFT
#define PTI_SWITCH_PGTABLES_MASK(1<

Re: [Cocci] [PATCH] Coccinelle: kzalloc-simple: Rename kzalloc-simple to zalloc-simple

2018-01-13 Thread Himanshu Jha
On Sat, Jan 13, 2018 at 05:13:36PM -0200, Fabio Estevam wrote:
> On Sat, Jan 13, 2018 at 3:53 PM, Himanshu Jha
>  wrote:
> 
> > Yes, I used 'git mv'.
> >
> > It doesn't matter when applying through 'git am', both will result the
> > same AFAIK and only difference is that the patch files generated by 'git
> > format-patch' are different. But that is not important I think.
> 
> You missed the -M option when running 'git format-patch'.
> 
> For reviewers it is not that easy to realize that the 448 lines of the
> original file are the same as the ones in the new file.
> 
> The -M option generates a much cleaner patch.

I guess you're right because the patch has many lines.
No problem, I will send again with a much cleaner patch. :-)

-- 
Thanks
Himanshu Jha


[PATCH] sunrpc: Use seq_putc() in unix_gid_show()

2018-01-13 Thread SF Markus Elfring
From: Markus Elfring 
Date: Sat, 13 Jan 2018 20:33:05 +0100

A single character (line break) should be put into a sequence.
Thus use the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 net/sunrpc/svcauth_unix.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index af7f28fb8102..df1327c0dd1c 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -566,7 +566,7 @@ static int unix_gid_show(struct seq_file *m,
seq_printf(m, "%u %d:", from_kuid_munged(user_ns, ug->uid), glen);
for (i = 0; i < glen; i++)
seq_printf(m, " %d", from_kgid_munged(user_ns, ug->gi->gid[i]));
-   seq_printf(m, "\n");
+   seq_putc(m, '\n');
return 0;
 }
 
-- 
2.15.1



Re: [PATCH v3 8/9] x86: use __uaccess_begin_nospec and ASM_IFENCE in get_user paths

2018-01-13 Thread Linus Torvalds
On Sat, Jan 13, 2018 at 11:05 AM, Linus Torvalds
 wrote:
>
> I _know_ that lfence is expensive as hell on P4, for example.
>
> Yes, yes, "sbb" is often more expensive than most ALU instructions,
> and Agner Fog says it has a 10-cycle latency on Prescott (which is
> outrageous, but being one or two cycles more due to the flags
> generation is normal). So the sbb/and may certainly add a few cycles
> to the critical path, but on Prescott "lfence" is *50* cycles
> according to those same tables by Agner Fog.

Side note: I don't think P4 is really relevant for a performance
discussion, I was just giving it as an example where we do know actual
cycles.

I'm much more interested in modern Intel big-core CPU's, and just
wondering whether somebody could ask an architect.

Because I _suspect_ the answer from a CPU architect would be: "Christ,
the sbb/and sequence is much better because it doesn't have any extra
serialization", but maybe I'm wrong, and people feel that lfence is
particularly easy to do right without any real downside.

Linus


[PATCH v2] iio: adc: driver for ti adc081s/adc101s/adc121s

2018-01-13 Thread Milan Stevanovic

From fab687d20ba46d78439b6cdaf0d40b78ae68222c Mon Sep 17 00:00:00 2001
From: Milan Stevanovic 
Date: Sun, 7 Jan 2018 21:44:33 +0100
Subject: [PATCH v2] iio: adc: driver for ti adc081s/adc101s/adc121s

Add Linux device driver for TI single-channel CMOS
8/10/12-bit analog-to-digital converter with a
high-speed serial interface.

Signed-off-by: Milan Stevanovic 

---
Changes in v2:
 - Fix typo error
 - Keep Copyright comment
---
 drivers/iio/adc/ad7476.c | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/iio/adc/ad7476.c b/drivers/iio/adc/ad7476.c
index b7706bf..4fe3cf1 100644
--- a/drivers/iio/adc/ad7476.c
+++ b/drivers/iio/adc/ad7476.c
@@ -1,9 +1,10 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
- * AD7466/7/8 AD7476/5/7/8 (A) SPI ADC driver
+ * Analog Devices AD7466/7/8 AD7476/5/7/8 (A) SPI ADC driver
+ * TI ADC081S/ADC101S/ADC121S 8/10/12-bit SPI ADC driver
  *
  * Copyright 2010 Analog Devices Inc.
  *
- * Licensed under the GPL-2 or later.
  */
 
 #include 

@@ -56,6 +57,9 @@ enum ad7476_supported_device_ids {
ID_AD7468,
ID_AD7495,
ID_AD7940,
+   ID_ADC081S,
+   ID_ADC101S,
+   ID_ADC121S,
 };
 
 static irqreturn_t ad7476_trigger_handler(int irq, void  *p)

@@ -147,6 +151,8 @@ static int ad7476_read_raw(struct iio_dev *indio_dev,
},  \
 }
 
+#define ADC081S_CHAN(bits) _AD7476_CHAN((bits), 12 - (bits), \

+   BIT(IIO_CHAN_INFO_RAW))
 #define AD7476_CHAN(bits) _AD7476_CHAN((bits), 13 - (bits), \
BIT(IIO_CHAN_INFO_RAW))
 #define AD7940_CHAN(bits) _AD7476_CHAN((bits), 15 - (bits), \
@@ -192,6 +198,18 @@ static const struct ad7476_chip_info 
ad7476_chip_info_tbl[] = {
.channel[0] = AD7940_CHAN(14),
.channel[1] = IIO_CHAN_SOFT_TIMESTAMP(1),
},
+   [ID_ADC081S] = {
+   .channel[0] = ADC081S_CHAN(8),
+   .channel[1] = IIO_CHAN_SOFT_TIMESTAMP(1),
+   },
+   [ID_ADC101S] = {
+   .channel[0] = ADC081S_CHAN(10),
+   .channel[1] = IIO_CHAN_SOFT_TIMESTAMP(1),
+   },
+   [ID_ADC121S] = {
+   .channel[0] = ADC081S_CHAN(12),
+   .channel[1] = IIO_CHAN_SOFT_TIMESTAMP(1),
+   },
 };
 
 static const struct iio_info ad7476_info = {

@@ -294,6 +312,9 @@ static const struct spi_device_id ad7476_id[] = {
{"ad7910", ID_AD7467},
{"ad7920", ID_AD7466},
{"ad7940", ID_AD7940},
+   {"adc081s", ID_ADC081S},
+   {"adc101s", ID_ADC101S},
+   {"adc121s", ID_ADC121S},
{}
 };
 MODULE_DEVICE_TABLE(spi, ad7476_id);
--
2.7.4
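
The new ADC081S_CHAN() macro passes a shift of `12 - (bits)` to _AD7476_CHAN(). A minimal user-space sketch of what that shift implies is below. It assumes a 16-bit frame with four leading zero bits, then the sample MSB-first, then zero padding (worth checking against the TI datasheets), and the helper names `adc_shift` and `adc_extract` are made up for illustration; they are not part of the driver.

```c
#include <stdint.h>

/* Hypothetical helper: same arithmetic as ADC081S_CHAN(bits) -> 12 - (bits). */
static inline unsigned int adc_shift(unsigned int bits)
{
	return 12u - bits;
}

/*
 * Hypothetical helper: recover the sample from a raw 16-bit frame.
 * Assumed layout (not taken from the patch): 4 leading zeros, then
 * `bits` data bits, then (12 - bits) trailing zeros.
 */
static inline uint16_t adc_extract(uint16_t raw, unsigned int bits)
{
	uint16_t mask = (uint16_t)((1u << bits) - 1u);

	return (uint16_t)((raw >> adc_shift(bits)) & mask);
}
```

Under that assumed layout, the 8-bit part needs a shift of 4, the 10-bit part a shift of 2, and the 12-bit part no shift at all, which matches the three table entries added by the patch.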



[PATCH] Remove structure passing and assignment to save stack and no coping structures.

2018-01-13 Thread Karim Eshapa
Signed-off-by: Karim Eshapa 

Thanks,
Karim
---
 include/linux/tnum.h  |  2 +-
 kernel/bpf/tnum.c | 13 +++--
 kernel/bpf/verifier.c | 12 
 3 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/include/linux/tnum.h b/include/linux/tnum.h
index 0d2d3da..ddb1250 100644
--- a/include/linux/tnum.h
+++ b/include/linux/tnum.h
@@ -26,7 +26,7 @@ struct tnum tnum_lshift(struct tnum a, u8 shift);
 /* Shift a tnum right (by a fixed shift) */
 struct tnum tnum_rshift(struct tnum a, u8 shift);
 /* Add two tnums, return @a + @b */
-struct tnum tnum_add(struct tnum a, struct tnum b);
+void tnum_add(struct tnum *res, struct tnum *a, struct tnum *b);
 /* Subtract two tnums, return @a - @b */
 struct tnum tnum_sub(struct tnum a, struct tnum b);
 /* Bitwise-AND, return @a & @b */
diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
index 1f4bf68..f7f8b10 100644
--- a/kernel/bpf/tnum.c
+++ b/kernel/bpf/tnum.c
@@ -43,16 +43,17 @@ struct tnum tnum_rshift(struct tnum a, u8 shift)
return TNUM(a.value >> shift, a.mask >> shift);
 }
 
-struct tnum tnum_add(struct tnum a, struct tnum b)
+void tnum_add(struct tnum *res, struct tnum *a, struct tnum *b)
 {
u64 sm, sv, sigma, chi, mu;
 
-   sm = a.mask + b.mask;
-   sv = a.value + b.value;
+   sm = a->mask + b->mask;
+   sv = a->value + b->value;
sigma = sm + sv;
chi = sigma ^ sv;
-   mu = chi | a.mask | b.mask;
-   return TNUM(sv & ~mu, mu);
+   mu = chi | a->mask | b->mask;
+   res->value = (sv & ~mu);
+   res->mask = mu;
 }
 
 struct tnum tnum_sub(struct tnum a, struct tnum b)
@@ -102,7 +103,7 @@ static struct tnum hma(struct tnum acc, u64 value, u64 mask)
 {
while (mask) {
if (mask & 1)
-   acc = tnum_add(acc, TNUM(0, value));
+   tnum_add(&acc, &acc, &TNUM(0, value));
mask >>= 1;
value <<= 1;
}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b414d6b..4acc16c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -999,7 +999,8 @@ static int check_pkt_ptr_alignment(struct bpf_verifier_env 
*env,
 */
ip_align = 2;
 
-   reg_off = tnum_add(reg->var_off, tnum_const(ip_align + reg->off + off));
+   tnum_add(&reg_off, &reg->var_off,
+   &tnum_const(ip_align + reg->off + off));
if (!tnum_is_aligned(reg_off, size)) {
char tn_buf[48];
 
@@ -1024,7 +1025,8 @@ static int check_generic_ptr_alignment(struct 
bpf_verifier_env *env,
if (!strict || size == 1)
return 0;
 
-   reg_off = tnum_add(reg->var_off, tnum_const(reg->off + off));
+   tnum_add(&reg_off, &reg->var_off,
+   &tnum_const(reg->off + off));
if (!tnum_is_aligned(reg_off, size)) {
char tn_buf[48];
 
@@ -1971,7 +1973,8 @@ static int adjust_ptr_min_max_vals(struct 
bpf_verifier_env *env,
dst_reg->umin_value = umin_ptr + umin_val;
dst_reg->umax_value = umax_ptr + umax_val;
}
-   dst_reg->var_off = tnum_add(ptr_reg->var_off, off_reg->var_off);
+   tnum_add(&dst_reg->var_off, &ptr_reg->var_off,
+   &off_reg->var_off);
dst_reg->off = ptr_reg->off;
if (reg_is_pkt_pointer(ptr_reg)) {
dst_reg->id = ++env->id_gen;
@@ -2108,7 +2111,8 @@ static int adjust_scalar_min_max_vals(struct 
bpf_verifier_env *env,
dst_reg->umin_value += umin_val;
dst_reg->umax_value += umax_val;
}
-   dst_reg->var_off = tnum_add(dst_reg->var_off, src_reg.var_off);
+   tnum_add(&dst_reg->var_off, &dst_reg->var_off,
+   &src_reg.var_off);
break;
case BPF_SUB:
if (signed_sub_overflows(dst_reg->smin_value, smax_val) ||
-- 
2.7.4
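
For anyone who wants to try the pointer-based signature outside the kernel, here is a minimal user-space sketch of the same arithmetic. It is an illustration under assumptions, not the kernel code: `struct tnum` and `tnum_const` are re-declared locally with stdint types, and the `const` qualifiers are an addition. Note that standard C does not allow taking the address of a function's return value, so a call such as `&tnum_const(...)` needs a named temporary first.

```c
#include <stdint.h>

/* Local stand-in for the kernel's struct tnum (value bits + unknown-bit mask). */
struct tnum {
	uint64_t value;
	uint64_t mask;
};

/* Local stand-in for tnum_const(): a fully known value has an empty mask. */
static struct tnum tnum_const(uint64_t v)
{
	return (struct tnum){ .value = v, .mask = 0 };
}

/*
 * Pointer-based add, same arithmetic as in the patch.
 * All reads of *a and *b happen before *res is written,
 * so res may alias a or b.
 */
static void tnum_add(struct tnum *res, const struct tnum *a,
		     const struct tnum *b)
{
	uint64_t sm = a->mask + b->mask;
	uint64_t sv = a->value + b->value;
	uint64_t sigma = sm + sv;
	uint64_t chi = sigma ^ sv;
	uint64_t mu = chi | a->mask | b->mask;

	res->value = sv & ~mu;
	res->mask = mu;
}
```

A caller would write `struct tnum c = tnum_const(x); tnum_add(&out, &in, &c);` rather than passing the address of the call expression directly.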



[PATCH] l2tp: Use seq_putc() in l2tp_dfs_seq_session_show()

2018-01-13 Thread SF Markus Elfring
From: Markus Elfring 
Date: Sat, 13 Jan 2018 20:11:01 +0100

Two single characters (line breaks) should be put into a sequence.
Thus use the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 net/l2tp/l2tp_debugfs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/l2tp/l2tp_debugfs.c b/net/l2tp/l2tp_debugfs.c
index 4cc30b38aba4..0a5add3ecdba 100644
--- a/net/l2tp/l2tp_debugfs.c
+++ b/net/l2tp/l2tp_debugfs.c
@@ -191,7 +191,7 @@ static void l2tp_dfs_seq_session_show(struct seq_file *m, 
void *v)
seq_printf(m, "%02x%02x%02x%02x",
   session->cookie[4], session->cookie[5],
   session->cookie[6], session->cookie[7]);
-   seq_printf(m, "\n");
+   seq_putc(m, '\n');
}
if (session->peer_cookie_len) {
seq_printf(m, "   peer cookie %02x%02x%02x%02x",
@@ -201,7 +201,7 @@ static void l2tp_dfs_seq_session_show(struct seq_file *m, 
void *v)
seq_printf(m, "%02x%02x%02x%02x",
   session->peer_cookie[4], 
session->peer_cookie[5],
   session->peer_cookie[6], 
session->peer_cookie[7]);
-   seq_printf(m, "\n");
+   seq_putc(m, '\n');
}
 
seq_printf(m, "   %hu/%hu tx %ld/%ld/%ld rx %ld/%ld/%ld\n",
-- 
2.15.1



Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating

2018-01-13 Thread Jason Gunthorpe
On Fri, Jan 12, 2018 at 01:01:56PM -0800, Saeed Mahameed wrote:


> Simply putting a memory barrier on the top or the bottom of a functions,
> means nothing unless you are looking at the whole picture, of all the
> callers of that function to understand why is it there.

When I review code I want to see the memory barrier placed *directly*
before the write which allows the DMA.

So yes, this is my preference:

> update_doorbell() {
> dma_wmb();
> ring->db = prod;
> }

Conceptually what is happening here is very similar to what
smp_store_release() does for SMP cases. In most cases wmb should
always be strongly connected with a following write.

smp_store_release() is called 'release' because the write it
incorporates allows the other CPU to 'see' what is being
protected. Similarly here, the write to the db allows the device to
'see' the new ring data.
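
A rough user-space analogy of that pairing, written with C11 atomics rather than the kernel primitives, might look like the sketch below. Everything in it is an assumption for illustration: `struct ring`, `fill_desc`, `update_doorbell` and `consume` are made-up names, the "device" is just another reader in the same program, and a release store is only a stand-in for dma_wmb() followed by the doorbell write.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u

/* Hypothetical ring: descriptors plus a producer doorbell. */
struct ring {
	uint32_t desc[RING_SIZE];
	_Atomic uint32_t db;
};

/* Plain store into the descriptor slot; not yet visible to the consumer. */
static void fill_desc(struct ring *r, uint32_t idx, uint32_t val)
{
	r->desc[idx % RING_SIZE] = val;
}

/*
 * The barrier lives with the write it orders: the release store
 * publishes every descriptor written before it.
 */
static void update_doorbell(struct ring *r, uint32_t prod)
{
	atomic_store_explicit(&r->db, prod, memory_order_release);
}

/* Consumer side: the acquire load pairs with the release store above. */
static int consume(struct ring *r, uint32_t idx, uint32_t *out)
{
	uint32_t prod = atomic_load_explicit(&r->db, memory_order_acquire);

	if (idx >= prod)
		return 0;
	*out = r->desc[idx % RING_SIZE];
	return 1;
}
```

Keeping the ordering inside update_doorbell() means a reviewer does not have to audit every caller to see what the barrier is for.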

And this is a bad idea:

> fill buffers();
> dma_wmb();
> update_doorbell();
 
> I simply like the 2nd one since with one look you can understand
> what this dma_wmb is protecting.

What do you think the wmb is protecting in the above? It isn't the fill.

Jason


Re: [Cocci] [PATCH] Coccinelle: kzalloc-simple: Rename kzalloc-simple to zalloc-simple

2018-01-13 Thread Fabio Estevam
On Sat, Jan 13, 2018 at 3:53 PM, Himanshu Jha
 wrote:

> Yes, I used 'git mv'.
>
> It doesn't matter when applying through 'git am', both will result the
> same AFAIK and only difference is that the patch files generated by 'git
> format-patch' are different. But that is not important I think.

You missed the -M option when running 'git format-patch'.

For reviewers it is not that easy to realize that the 448 lines of the
original file are the same as the ones in the new file.

The -M option generates a much cleaner patch.


[PATCH] x86_64: trim clear_page.S includes

2018-01-13 Thread Alexey Dobriyan
After alternatives were shifted to the call site, only 2 headers are
necessary.

Signed-off-by: Alexey Dobriyan 
---

 arch/x86/lib/clear_page_64.S |2 --
 1 file changed, 2 deletions(-)

--- a/arch/x86/lib/clear_page_64.S
+++ b/arch/x86/lib/clear_page_64.S
@@ -1,6 +1,4 @@
 #include 
-#include 
-#include 
 #include 
 
 /*


Re: [PATCH v3 8/9] x86: use __uaccess_begin_nospec and ASM_IFENCE in get_user paths

2018-01-13 Thread Linus Torvalds
On Sat, Jan 13, 2018 at 10:18 AM, Dan Williams  wrote:
> diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
> index c97d935a29e8..85f400b8ee7c 100644
> --- a/arch/x86/lib/getuser.S
> +++ b/arch/x86/lib/getuser.S
> @@ -41,6 +41,7 @@ ENTRY(__get_user_1)
> cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
> jae bad_get_user
> ASM_STAC
> +   ASM_IFENCE
>  1: movzbl (%_ASM_AX),%edx
> xor %eax,%eax
> ASM_CLAC

So I really would like to know from somebody (preferably somebody with
real microarchitectural knowledge) just how expensive that "lfence"
ends up being.

Because since we could just generate the masking of the address from
the exact same condition code that we already generate, the "lfence"
really can be replaced by just two ALU instructions instead:

   diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
   index c97d935a29e8..4c378b485399 100644
   --- a/arch/x86/lib/getuser.S
   +++ b/arch/x86/lib/getuser.S
   @@ -40,6 +40,8 @@ ENTRY(__get_user_1)
   mov PER_CPU_VAR(current_task), %_ASM_DX
   cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
   jae bad_get_user
   +   sbb %_ASM_DX,%_ASM_DX
   +   and %_ASM_DX,%_ASM_AX
   ASM_STAC
1: movzbl (%_ASM_AX),%edx
   xor %eax,%eax

which looks like it should have a fairly low maximum overhead (ok, the
above is totally untested, maybe I got the condition the wrong way
around _again_).
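
For readers more comfortable in C than in x86 assembly, the masking idea can be sketched in user space as below. This is only an illustration under assumptions: `mask_user_addr` is a made-up name rather than a kernel helper, and a compiler is free to emit something other than a cmp/sbb/and sequence for it.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns addr unchanged when addr < limit,
 * and 0 otherwise, without a data-dependent branch in the source.
 *
 * "cmp limit,addr; sbb reg,reg" leaves all-ones in reg when the
 * compare borrowed (addr < limit) and zero otherwise; the same
 * mask is built here by negating the comparison result.
 */
static inline uintptr_t mask_user_addr(uintptr_t addr, uintptr_t limit)
{
	uintptr_t mask = (uintptr_t)0 - (uintptr_t)(addr < limit);

	return addr & mask;
}
```

The point of the sequence is that an out-of-range address is forced to zero through a data dependency, so a mispredicted bounds check has nothing interesting to load from.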

I _know_ that lfence is expensive as hell on P4, for example.

Yes, yes, "sbb" is often more expensive than most ALU instructions,
and Agner Fog says it has a 10-cycle latency on Prescott (which is
outrageous, but being one or two cycles more due to the flags
generation is normal). So the sbb/and may certainly add a few cycles
to the critical path, but on Prescott "lfence" is *50* cycles
according to those same tables by Agner Fog.

Is there anybody who is willing to say one way or another wrt the
"sbb/and" sequence vs "lfence".

   Linus


Re: [PATCH v2 00/19] prevent bounds-check bypass via speculative execution

2018-01-13 Thread Linus Torvalds
On Fri, Jan 12, 2018 at 4:15 PM, Tony Luck  wrote:
>
> Here there isn't any reason for speculation. The core has the
> value of 'x' in a register and the upper bound encoded into the
> "cmp" instruction.  Both are right there, no waiting, no speculation.

So this is an argument I haven't seen before (although it was brought
up in private long ago), but that is very relevant: the actual scope
and depth of speculation.

Your argument basically depends on just what gets speculated, and on
the _actual_ order of execution.

So your argument depends on "the uarch will actually run the code in
order if there are no events that block the pipeline".

Or at least it depends on a certain latency of the killing of any OoO
execution being low enough that the cache access doesn't even begin.

I realize that that is very much a particular microarchitectural
detail, but it's actually a *big* deal. Do we have a set of rules for
what is not a worry, simply because the speculated accesses get killed
early enough?

Apparently "test a register value against a constant" is good enough,
assuming that register is also needed for the address of the access.

Linus


Re: stable/linux-3.16.y build: 178 builds: 1 failed, 177 passed, 2 errors, 57 warnings (v3.16.52)

2018-01-13 Thread Manfred Spraul

Hi Arnd,

On 01/03/2018 12:15 AM, Arnd Bergmann wrote:



2 ipc/sem.c:377:6: warning: '___p1' may be used uninitialized in this function [-Wmaybe-uninitialized]

This code was last touched in 3.16 by the backport of commit
5864a2fd3088 ("ipc/sem.c: fix complex_count vs. simple op race")

The warning is in "smp_load_acquire(&sma->complex_mode))", and I suspect
that commit 27d7be1801a4 ("ipc/sem.c: avoid using spin_unlock_wait()")
avoided the warning upstream by removing the smp_mb() before it.

The smp_mb() pairs with spin_unlock_wait() in complexmode_enter()
It is removed by commit 27d7be1801a4 ("ipc/sem.c: avoid using 
spin_unlock_wait()").


From what I see, it doesn't exist in any of the stable kernels 
(intentionally, the above commit is a rewrite for better performance).


___p1 is from smp_load_acquire()
>    typeof(*p) ___p1 = READ_ONCE(*p);   \

I don't see how ___p1 could be used uninitialized. Perhaps a compiler issue?

--
    Manfred



[PATCH] x86_64: clobber flags in clear_page()

2018-01-13 Thread Alexey Dobriyan
All clear_page() implementations use XOR which resets flags.

Judging by allyesconfig disassembly no code is affected.

Signed-off-by: Alexey Dobriyan 
---

 arch/x86/include/asm/page_64.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -47,7 +47,7 @@ static inline void clear_page(void *page)
   clear_page_erms, X86_FEATURE_ERMS,
   "=D" (page),
   "0" (page)
-  : "memory", "rax", "rcx");
+  : "cc", "memory", "rax", "rcx");
 }
 
 void copy_page(void *to, void *from);


Re: [PATCH 1/3] ARM: dts: rockchip: drop veyron's nonstandard 'backlight-boot-off'

2018-01-13 Thread Heiko Stuebner
Am Freitag, 5. Januar 2018, 16:47:55 CET schrieb Brian Norris:
> This was used out-of-tree as a hack for resolving issues where some
> systems expect the backlight to turn on automatically at boot, while
> others expect to manage the backlight status via a DRM/panel driver.
> Those issues have since been fixed upstream in pwm_bl.c without device
> tree hacks, and so this un-documented property should no longer be
> useful.
> 
> Signed-off-by: Brian Norris 

applied (for 4.17 though)


Thanks
Heiko


[PATCH] Remove structure passing and assignment to save stack and no coping structures.

2018-01-13 Thread Karim Eshapa
Signed-off-by: Karim Eshapa 

Thanks,
Karim
---
 include/linux/tnum.h  |  2 +-
 kernel/bpf/tnum.c | 13 +++--
 kernel/bpf/verifier.c | 12 
 3 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/include/linux/tnum.h b/include/linux/tnum.h
index 0d2d3da..ddb1250 100644
--- a/include/linux/tnum.h
+++ b/include/linux/tnum.h
@@ -26,7 +26,7 @@ struct tnum tnum_lshift(struct tnum a, u8 shift);
 /* Shift a tnum right (by a fixed shift) */
 struct tnum tnum_rshift(struct tnum a, u8 shift);
 /* Add two tnums, return @a + @b */
-struct tnum tnum_add(struct tnum a, struct tnum b);
+void tnum_add(struct tnum *res, struct tnum *a, struct tnum *b);
 /* Subtract two tnums, return @a - @b */
 struct tnum tnum_sub(struct tnum a, struct tnum b);
 /* Bitwise-AND, return @a & @b */
diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
index 1f4bf68..f7f8b10 100644
--- a/kernel/bpf/tnum.c
+++ b/kernel/bpf/tnum.c
@@ -43,16 +43,17 @@ struct tnum tnum_rshift(struct tnum a, u8 shift)
return TNUM(a.value >> shift, a.mask >> shift);
 }
 
-struct tnum tnum_add(struct tnum a, struct tnum b)
+void tnum tnum_add(struct tnum *res, struct tnum *a, struct tnum *b)
 {
u64 sm, sv, sigma, chi, mu;
 
-   sm = a.mask + b.mask;
-   sv = a.value + b.value;
+   sm = a->mask + b->mask;
+   sv = a->value + b->value;
sigma = sm + sv;
chi = sigma ^ sv;
-   mu = chi | a.mask | b.mask;
-   return TNUM(sv & ~mu, mu);
+   mu = chi | a->mask | b->mask;
+   res->value = (sv & ~mu);
+   res->mask = mu;
 }
 
 struct tnum tnum_sub(struct tnum a, struct tnum b)
@@ -102,7 +103,7 @@ static struct tnum hma(struct tnum acc, u64 value, u64 mask)
 {
while (mask) {
if (mask & 1)
-   acc = tnum_add(acc, TNUM(0, value));
+   tnum_add(&acc, &acc, &TNUM(0, value));
mask >>= 1;
value <<= 1;
}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b414d6b..4acc16c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -999,7 +999,8 @@ static int check_pkt_ptr_alignment(struct bpf_verifier_env 
*env,
 */
ip_align = 2;
 
-   reg_off = tnum_add(reg->var_off, tnum_const(ip_align + reg->off + off));
+   tnum_add(&reg_off, &reg->var_off,
+   &tnum_const(ip_align + reg->off + off));
if (!tnum_is_aligned(reg_off, size)) {
char tn_buf[48];
 
@@ -1024,7 +1025,8 @@ static int check_generic_ptr_alignment(struct 
bpf_verifier_env *env,
if (!strict || size == 1)
return 0;
 
-   reg_off = tnum_add(reg->var_off, tnum_const(reg->off + off));
+   tnum_add(&reg_off, &reg->var_off,
+   &tnum_const(reg->off + off));
if (!tnum_is_aligned(reg_off, size)) {
char tn_buf[48];
 
@@ -1971,7 +1973,8 @@ static int adjust_ptr_min_max_vals(struct 
bpf_verifier_env *env,
dst_reg->umin_value = umin_ptr + umin_val;
dst_reg->umax_value = umax_ptr + umax_val;
}
-   dst_reg->var_off = tnum_add(ptr_reg->var_off, off_reg->var_off);
+   tnum_add(&dst_reg->var_off, &ptr_reg->var_off,
+   &off_reg->var_off);
dst_reg->off = ptr_reg->off;
if (reg_is_pkt_pointer(ptr_reg)) {
dst_reg->id = ++env->id_gen;
@@ -2108,7 +2111,8 @@ static int adjust_scalar_min_max_vals(struct 
bpf_verifier_env *env,
dst_reg->umin_value += umin_val;
dst_reg->umax_value += umax_val;
}
-   dst_reg->var_off = tnum_add(dst_reg->var_off, src_reg.var_off);
+   tnum_add(&dst_reg->var_off, &dst_reg->var_off,
+   &src_reg.var_off);
break;
case BPF_SUB:
if (signed_sub_overflows(dst_reg->smin_value, smax_val) ||
-- 
2.7.4



Re: [PATCH v5] perf tools: Add ARM Statistical Profiling Extensions (SPE) support

2018-01-13 Thread Arnaldo Carvalho de Melo
Em Fri, Jan 12, 2018 at 07:27:37PM -0600, Kim Phillips escreveu:
> 'perf record' and 'perf report --dump-raw-trace' supported in this
> release.
> 
> Example usage:
> 
>  # perf record -e arm_spe/ts_enable=1,pa_enable=1/ dd if=/dev/zero 
> of=/dev/null count=1
>  # perf report --dump-raw-trace
> 
> Note that the perf.data file is portable, so the report can be run on
> another architecture host if necessary.

Failed for these distros:

  10    36.39 centos:6              : FAIL gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
  37    35.28 oraclelinux:6         : FAIL gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
  39    37.41 ubuntu:12.04.5        : FAIL gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3

  CC   /tmp/build/perf/util/arm-spe-pkt-decoder.o
cc1: warnings being treated as errors
util/arm-spe-pkt-decoder.c: In function 'arm_spe_pkt_desc':
util/arm-spe-pkt-decoder.c:277: error: declaration of 'index' shadows a global 
declaration
/usr/include/string.h:489: error: shadowed declaration is here
mv: cannot stat `/tmp/build/perf/util/.arm-spe-pkt-decoder.o.tmp': No such file 
or directory
make[4]: *** [/tmp/build/perf/util/arm-spe-pkt-decoder.o] Error 1

Just rename 'index' to 'idx'

  14   110.68 debian:9              : FAIL gcc (Debian 6.3.0-18) 6.3.0 20170516
  15   115.91 debian:experimental   : FAIL gcc (Debian 7.2.0-17) 7.2.1 20171205
  26   124.97 fedora:25             : FAIL gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)
  27   126.05 fedora:26             : FAIL gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2)
  28   126.40 fedora:27             : FAIL gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2)
  29   119.77 fedora:rawhide        : FAIL gcc (GCC) 7.2.1 20170829 (Red Hat 7.2.1-1)
  36   127.59 opensuse:tumbleweed   : FAIL gcc (SUSE Linux) 7.2.1 20171020 [gcc-7-branch revision 253932]
  43   106.21 ubuntu:16.04          : FAIL gcc (Ubuntu 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609
  50   114.48 ubuntu:16.10          : FAIL gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005
  51   112.79 ubuntu:17.04          : FAIL gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406
  52   115.37 ubuntu:17.10          : FAIL gcc (Ubuntu 7.2.0-8ubuntu3) 7.2.0
  53   112.14 ubuntu:18.04          : FAIL gcc (Ubuntu 7.2.0-16ubuntu1) 7.2.0
5

  LD   /tmp/build/perf/util/scripting-engines/libperf-in.o
  CC   /tmp/build/perf/util/intel-bts.o
  CC   /tmp/build/perf/util/arm-spe.o
util/arm-spe.c:165:19: error: unused function 'arm_spe_update_queues' 
[-Werror,-Wunused-function]
static inline int arm_spe_update_queues(struct arm_spe *spe)
  ^
1 error generated.
mv: cannot stat '/tmp/build/perf/util/.arm-spe.o.tmp': No such file or directory
/git/linux/tools/build/Makefile.build:96: recipe for target 
'/tmp/build/perf/util/arm-spe.o' failed




[PATCH v3 8/9] x86: use __uaccess_begin_nospec and ASM_IFENCE in get_user paths

2018-01-13 Thread Dan Williams
Quoting Linus:

I do think that it would be a good idea to very expressly document
the fact that it's not that the user access itself is unsafe. I do
agree that things like "get_user()" want to be protected, but not
because of any direct bugs or problems with get_user() and friends,
but simply because get_user() is an excellent source of a pointer
that is obviously controlled from a potentially attacking user
space. So it's a prime candidate for then finding _subsequent_
accesses that can then be used to perturb the cache.

Note that '__copy_user_ll' is also called in the 'put_user' case, but
there is currently no indication that put_user in general deserves the
same hygiene as 'get_user'.

Suggested-by: Linus Torvalds 
Suggested-by: Andi Kleen 
Cc: Al Viro 
Cc: Kees Cook 
Cc: Thomas Gleixner 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: x...@kernel.org
Signed-off-by: Dan Williams 
---
 arch/x86/include/asm/uaccess.h|6 +++---
 arch/x86/include/asm/uaccess_32.h |6 +++---
 arch/x86/include/asm/uaccess_64.h |   12 ++--
 arch/x86/lib/copy_user_64.S   |3 +++
 arch/x86/lib/getuser.S|5 +
 arch/x86/lib/usercopy_32.c|8 
 6 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index a31fd4fc6483..82c73f064e76 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -450,7 +450,7 @@ do {
\
 ({ \
int __gu_err;   \
__inttype(*(ptr)) __gu_val; \
-   __uaccess_begin();  \
+   __uaccess_begin_nospec();   \
__get_user_size(__gu_val, (ptr), (size), __gu_err, -EFAULT);\
__uaccess_end();\
(x) = (__force __typeof__(*(ptr)))__gu_val; \
@@ -558,7 +558,7 @@ struct __large_struct { unsigned long buf[100]; };
  * get_user_ex(...);
  * } get_user_catch(err)
  */
-#define get_user_try   uaccess_try
+#define get_user_try   uaccess_try_nospec
 #define get_user_catch(err)uaccess_catch(err)
 
 #define get_user_ex(x, ptr)do {\
@@ -592,7 +592,7 @@ extern void __cmpxchg_wrong_size(void)
__typeof__(ptr) __uval = (uval);\
__typeof__(*(ptr)) __old = (old);   \
__typeof__(*(ptr)) __new = (new);   \
-   __uaccess_begin();  \
+   __uaccess_begin_nospec();   \
switch (size) { \
case 1: \
{   \
diff --git a/arch/x86/include/asm/uaccess_32.h 
b/arch/x86/include/asm/uaccess_32.h
index 72950401b223..ba2dc1930630 100644
--- a/arch/x86/include/asm/uaccess_32.h
+++ b/arch/x86/include/asm/uaccess_32.h
@@ -29,21 +29,21 @@ raw_copy_from_user(void *to, const void __user *from, 
unsigned long n)
switch (n) {
case 1:
ret = 0;
-   __uaccess_begin();
+   __uaccess_begin_nospec();
__get_user_asm_nozero(*(u8 *)to, from, ret,
  "b", "b", "=q", 1);
__uaccess_end();
return ret;
case 2:
ret = 0;
-   __uaccess_begin();
+   __uaccess_begin_nospec();
__get_user_asm_nozero(*(u16 *)to, from, ret,
  "w", "w", "=r", 2);
__uaccess_end();
return ret;
case 4:
ret = 0;
-   __uaccess_begin();
+   __uaccess_begin_nospec();
__get_user_asm_nozero(*(u32 *)to, from, ret,
  "l", "k", "=r", 4);
__uaccess_end();
diff --git a/arch/x86/include/asm/uaccess_64.h 
b/arch/x86/include/asm/uaccess_64.h
index f07ef3c575db..62546b3a398e 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -55,31 +55,31 @@ raw_copy_from_user(void *dst, const void __user *src, 
unsigned long size)
return copy_user_generic(dst, (__force void *)src, size);
switch (size) {
case 1:
-   __uaccess_begin();
+ 

[PATCH v3 6/9] asm/nospec: mask speculative execution flows

2018-01-13 Thread Dan Williams
'__array_ptr' is proposed as a generic mechanism to mitigate against
Spectre-variant-1 attacks, i.e. attacks that bypass memory bounds
checks via speculative execution. The '__array_ptr' implementation
appears safe for current generation cpus across multiple architectures.

In comparison, 'ifence_array_ptr' uses a hard / architectural 'ifence'
approach to preclude the possibility of speculative execution. However, it
is not the default given a concern for avoiding instruction-execution
barriers in potential fast paths.

Based on an original implementation by Linus Torvalds, tweaked to remove
speculative flows by Alexei Starovoitov, and tweaked again by Linus to
introduce an x86 assembly implementation for the mask generation.

Co-developed-by: Linus Torvalds 
Co-developed-by: Alexei Starovoitov 
Co-developed-by: Peter Zijlstra 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Signed-off-by: Dan Williams 
---
 arch/arm/Kconfig   |1 +
 arch/arm64/Kconfig |1 +
 arch/x86/Kconfig   |3 ++
 include/linux/nospec.h |   92 
 kernel/Kconfig.nospec  |   46 
 kernel/Makefile|1 +
 kernel/nospec.c|   52 +++
 lib/Kconfig|3 ++
 8 files changed, 199 insertions(+)
 create mode 100644 include/linux/nospec.h
 create mode 100644 kernel/Kconfig.nospec
 create mode 100644 kernel/nospec.c

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 51c8df561077..fd4789ec8cac 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -7,6 +7,7 @@ config ARM
select ARCH_HAS_DEBUG_VIRTUAL
select ARCH_HAS_DEVMEM_IS_ALLOWED
select ARCH_HAS_ELF_RANDOMIZE
+   select ARCH_HAS_IFENCE
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
select ARCH_HAS_STRICT_MODULE_RWX if MMU
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c9a7e9e1414f..22765c4b6986 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -16,6 +16,7 @@ config ARM64
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_GIGANTIC_PAGE if (MEMORY_ISOLATION && COMPACTION) || CMA
select ARCH_HAS_KCOV
+   select ARCH_HAS_IFENCE
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_SG_CHAIN
select ARCH_HAS_STRICT_KERNEL_RWX
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index d4fc98c50378..68698289c83c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -54,6 +54,7 @@ config X86
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_KCOVif X86_64
+   select ARCH_HAS_IFENCE
select ARCH_HAS_PMEM_APIif X86_64
# Causing hangs/crashes, see the commit that added this change for 
details.
select ARCH_HAS_REFCOUNT
@@ -442,6 +443,8 @@ config INTEL_RDT
 
  Say N if unsure.
 
+source "kernel/Kconfig.nospec"
+
 if X86_32
 config X86_EXTENDED_PLATFORM
bool "Support for extended (non-PC) x86 platforms"
diff --git a/include/linux/nospec.h b/include/linux/nospec.h
new file mode 100644
index ..f6e7ba7a7344
--- /dev/null
+++ b/include/linux/nospec.h
@@ -0,0 +1,92 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright(c) 2018 Intel Corporation. All rights reserved.
+
+#ifndef __NOSPEC_H__
+#define __NOSPEC_H__
+
+#include 
+#include 
+
+/*
+ * If idx is negative or if idx >= size then bit 63 is set in the mask,
+ * and the value of ~(-1L) is zero. When the mask is zero, bounds check
+ * failed, __array_ptr will return NULL.
+ */
+#ifndef array_ptr_mask
+#define array_ptr_mask(idx, sz)
\
+({ \
+   unsigned long mask; \
+   unsigned long _i = (idx);   \
+   unsigned long _s = (sz);\
+   \
+   mask = ~(long)(_i | (_s - 1 - _i)) >> (BITS_PER_LONG - 1);  \
+   mask;   \
+})
+#endif
+
+/**
+ * __array_ptr - Generate a pointer to an array element, ensuring
+ * the pointer is bounded under speculation to NULL.
+ *
+ * @base: the base of the array
+ * @idx: the index of the element, must be less than LONG_MAX
+ * @sz: the number of elements in the array, must be less than LONG_MAX
+ *
+ * If @idx falls in the interval [0, @sz), returns the pointer to
+ * @arr[@idx], otherwise returns NULL.
+ */
+#define __array_ptr(base, idx, sz) \
+({ \
+   union { typeof(*(base)) *_ptr; unsigned long _bit; } __u;   \
+   typeof

[PATCH v3 7/9] x86: introduce __uaccess_begin_nospec and ASM_IFENCE

2018-01-13 Thread Dan Williams
For 'get_user' paths, do not allow the kernel to speculate on the value
of a user controlled pointer. In addition to the 'stac' instruction for
Supervisor Mode Access Protection, an 'ifence' causes the 'access_ok'
result to resolve in the pipeline before the cpu might take any
speculative action on the pointer value.

Since this is a major kernel interface that deals with user controlled
data, the '__uaccess_begin_nospec' mechanism will prevent speculative
execution past an 'access_ok' permission check. While speculative
execution past 'access_ok' is not enough to lead to a kernel memory
leak, it is a necessary precondition.

To be clear, '__uaccess_begin_nospec' and ASM_IFENCE are not addressing
any known issues with 'get_user'; they are addressing a class of
potential problems that could be near 'get_user' usages. In other words,
these helpers are for hygiene, not clinical fixes.

There are no functional changes in this patch.

Suggested-by: Linus Torvalds 
Suggested-by: Andi Kleen 
Cc: Tom Lendacky 
Cc: Al Viro 
Cc: Kees Cook 
Cc: Thomas Gleixner 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: x...@kernel.org
Signed-off-by: Dan Williams 
---
 arch/x86/include/asm/smap.h|4 
 arch/x86/include/asm/uaccess.h |   10 ++
 2 files changed, 14 insertions(+)

diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
index db00bd4b..0b59707e0b46 100644
--- a/arch/x86/include/asm/smap.h
+++ b/arch/x86/include/asm/smap.h
@@ -40,6 +40,10 @@
 
 #endif /* CONFIG_X86_SMAP */
 
+#define ASM_IFENCE \
+   ALTERNATIVE_2 "", "mfence", X86_FEATURE_MFENCE_RDTSC, \
+ "lfence", X86_FEATURE_LFENCE_RDTSC
+
 #else /* __ASSEMBLY__ */
 
 #include 
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 574dff4d2913..a31fd4fc6483 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -124,6 +124,11 @@ extern int __get_user_bad(void);
 
 #define __uaccess_begin() stac()
 #define __uaccess_end()   clac()
+#define __uaccess_begin_nospec()   \
+({ \
+   stac(); \
+   ifence();   \
+})
 
 /*
  * This is a type: either unsigned long, if the argument fits into
@@ -487,6 +492,11 @@ struct __large_struct { unsigned long buf[100]; };
__uaccess_begin();  \
barrier();
 
+#define uaccess_try_nospec do {
\
+   current->thread.uaccess_err = 0;\
+   __uaccess_begin_nospec();   \
+   barrier();
+
 #define uaccess_catch(err) \
__uaccess_end();\
(err) |= (current->thread.uaccess_err ? -EFAULT : 0);   \



[PATCH v3 5/9] x86: implement ifence_array_ptr() and array_ptr_mask()

2018-01-13 Thread Dan Williams
'ifence_array_ptr' is provided as an alternative to the default
'__array_ptr' implementation that uses a mask to sanitize user
controllable pointers. Later patches will allow it to be selected via
the kernel command line. The '__array_ptr' implementation otherwise
appears safe for current generation cpus across multiple architectures.

'array_ptr_mask' is used by the default 'array_ptr' implementation to
cheaply calculate an array bounds mask.

Suggested-by: Linus Torvalds 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Signed-off-by: Dan Williams 
---
 arch/x86/include/asm/barrier.h |   46 
 1 file changed, 46 insertions(+)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index b04f572d6d97..1b507f9e2cc7 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -28,6 +28,52 @@
 #define ifence() alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC, \
   "lfence", X86_FEATURE_LFENCE_RDTSC)
 
+/**
+ * ifence_array_ptr - Generate a pointer to an array element,
+ * ensuring the pointer is bounded under speculation.
+ *
+ * @arr: the base of the array
+ * @idx: the index of the element
+ * @sz: the number of elements in the array
+ *
+ * If @idx falls in the interval [0, @sz), returns the pointer to
+ * @arr[@idx], otherwise returns NULL.
+ */
+#define ifence_array_ptr(arr, idx, sz) \
+({ \
+   typeof(*(arr)) *__arr = (arr), *__ret;  \
+   typeof(idx) __idx = (idx);  \
+   typeof(sz) __sz = (sz); \
+   \
+   __ret = __idx < __sz ? __arr + __idx : NULL;\
+   ifence();   \
+   __ret;  \
+})
+
+/**
+ * array_ptr_mask - generate a mask for array_ptr() that is ~0UL when
+ * the bounds check succeeds and 0 otherwise
+ */
+#define array_ptr_mask array_ptr_mask
+static inline unsigned long array_ptr_mask(unsigned long idx, unsigned long sz)
+{
+   unsigned long mask;
+
+   /*
+* mask = index - size, if that result is >= 0 then the index is
+* invalid and the mask is 0 else ~0
+*/
+#ifdef CONFIG_X86_32
+   asm ("cmpl %1,%2; sbbl %0,%0;"
+#else
+   asm ("cmpq %1,%2; sbbq %0,%0;"
+#endif
+   :"=r" (mask)
+   :"r"(sz),"r" (idx)
+   :"cc");
+   return mask;
+}
+
 #ifdef CONFIG_X86_PPRO_FENCE
 #define dma_rmb()  rmb()
 #else



[PATCH v3 9/9] vfs, fdtable: prevent bounds-check bypass via speculative execution

2018-01-13 Thread Dan Williams
Expectedly, static analysis reports that 'fd' is a user controlled value
that is used as a data dependency to read from the 'fdt->fd' array.  In
order to avoid potential leaks of kernel memory values, block
speculative execution of the instruction stream that could issue reads
based on an invalid 'file *' returned from __fcheck_files.
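
The shape of that change can be sketched in userspace. This is only an
illustration of the mask-based pattern, not the kernel's array_ptr():
bounds_mask(), table_ptr(), table and lookup() are made-up names, the
sketch assumes GCC or clang (arithmetic right shift of negative values),
and a compiler is free to reintroduce branches that the kernel helper is
written to avoid.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* All-ones when idx < sz, zero otherwise (both below INTPTR_MAX). */
static uintptr_t bounds_mask(uintptr_t idx, uintptr_t sz)
{
	/* Relies on arithmetic right shift of a negative value. */
	return (uintptr_t)(~(intptr_t)(idx | (sz - 1 - idx)) >>
			   (sizeof(intptr_t) * CHAR_BIT - 1));
}

/* Hypothetical analogue of array_ptr() for an array of int. */
static const int *table_ptr(const int *base, uintptr_t idx, uintptr_t sz)
{
	uintptr_t mask = bounds_mask(idx, sz);
	/* Mask the index first so the pointer arithmetic stays in bounds. */
	uintptr_t bits = (uintptr_t)(base + (idx & mask));

	/* Then mask the pointer itself: out of range yields NULL. */
	return (const int *)(bits & mask);
}

static const int table[4] = { 10, 20, 30, 40 };

/* Same structure as the patched lookup: test the pointer, then use it. */
int lookup(unsigned long idx)
{
	const int *elem = table_ptr(table, idx, 4);

	return elem ? *elem : -1;
}
```

The caller dereferences only the pointer that came back from the helper,
never a separately computed table[idx].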

Cc: Al Viro 
Co-developed-by: Elena Reshetova 
Signed-off-by: Dan Williams 
---
 include/linux/fdtable.h |7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index 1c65817673db..9731f1a255db 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -81,9 +82,11 @@ struct dentry;
 static inline struct file *__fcheck_files(struct files_struct *files, unsigned 
int fd)
 {
struct fdtable *fdt = rcu_dereference_raw(files->fdt);
+   struct file __rcu **fdp;
 
-   if (fd < fdt->max_fds)
-   return rcu_dereference_raw(fdt->fd[fd]);
+   fdp = array_ptr(fdt->fd, fd, fdt->max_fds);
+   if (fdp)
+   return rcu_dereference_raw(*fdp);
return NULL;
 }
 



Re: [PATCH] kdump: Write a correct address of mem_section into vmcoreinfo

2018-01-13 Thread Kirill A. Shutemov
On Sat, Jan 13, 2018 at 11:48:38AM +0100, Ingo Molnar wrote:
> 
> * Kirill A. Shutemov  wrote:
> 
> > Depending on configuration mem_section can now be an array or a pointer
> > to an array allocated dynamically. In most cases, we can continue to refer
> > to it as 'mem_section' regardless of what it is.
> > 
> > But there's one exception: '&mem_section' means "address of the array" if
> > mem_section is an array, but if mem_section is a pointer, it would mean
> > "address of the pointer".
> > 
> > We've stepped onto this in kdump code. VMCOREINFO_SYMBOL(mem_section)
> > writes down address of pointer into vmcoreinfo, not array as we wanted.
> > 
> > Let's introduce VMCOREINFO_SYMBOL_ARRAY() that would handle the
> > situation correctly for both cases.
> > 
> > Signed-off-by: Kirill A. Shutemov 
> > Fixes: 83e3c48729d9 ("mm/sparsemem: Allocate mem_section at runtime for 
> > CONFIG_SPARSEMEM_EXTREME=y")
> > Cc: sta...@vger.kernel.org
> > Acked-by: Baoquan He 
> > Acked-by: Dave Young 
> 
> You forgot the Reported-by - I added that to the commit.

Oops, sorry.

Note that Andrew has already picked it up and sent it upstream.

-- 
 Kirill A. Shutemov
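
The array-versus-pointer distinction described in the quoted changelog can
be reproduced in a few lines of userspace C. This is a sketch only:
SYMBOL_ADDR() and SYMBOL_ARRAY_ADDR() are hypothetical stand-ins, not the
kernel's VMCOREINFO macros, and the second one simply assumes that the
intended fix is to record the value of the name after array decay.

```c
#include <stdlib.h>

struct section { unsigned long map; };

static struct section static_sections[4];	/* array case */
static struct section *dynamic_sections;	/* pointer case */

/* Hypothetical stand-ins: record &name versus name. */
#define SYMBOL_ADDR(name)	((void *)&(name))
#define SYMBOL_ARRAY_ADDR(name)	((void *)(name))

static void ensure_allocated(void)
{
	if (!dynamic_sections)
		dynamic_sections = calloc(4, sizeof(*dynamic_sections));
}

/* For a true array, &name is the address of the data. */
int array_addr_matches_data(void)
{
	return SYMBOL_ADDR(static_sections) == (void *)&static_sections[0];
}

/* For a pointer, &name is the address of the pointer variable. */
int pointer_addr_matches_data(void)
{
	ensure_allocated();
	return SYMBOL_ADDR(dynamic_sections) == (void *)dynamic_sections;
}

/* Using the name itself gives the data address in both cases. */
int array_form_matches_data(void)
{
	ensure_allocated();
	return SYMBOL_ARRAY_ADDR(static_sections) == (void *)&static_sections[0] &&
	       SYMBOL_ARRAY_ADDR(dynamic_sections) == (void *)dynamic_sections;
}
```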


[PATCH v3 0/9] core, x86: prevent bounds-check bypass via speculative execution

2018-01-13 Thread Dan Williams
Changes since v2 [1]:
* style fix in Documentation/speculation.txt (Geert)

* add Russell and Catalin to the cc on the ARM patches (Russell)

* clarify changelog for "x86: introduce __uaccess_begin_nospec and
  ASM_IFENCE" (Eric, Linus, Josh)

* fix the dynamic 'mask' / 'ifence' toggle vs CONFIG_JUMP_LABEL=n
  (Peter)

* include the get_user_{1,2,4,8} helpers in the ASM_IFENCE protections
  (Linus)

* fix array_ptr_mask for ARCH=i386 builds (Kbuild robot)

* prioritize the get_user protections, and the fdtable fix

[1]: https://lwn.net/Articles/744141/

---

Quoting Mark's original RFC:

"Recently, Google Project Zero discovered several classes of attack
against speculative execution. One of these, known as variant-1, allows
explicit bounds checks to be bypassed under speculation, providing an
arbitrary read gadget. Further details can be found on the GPZ blog [2]
and the Documentation patch in this series."

This series incorporates Mark Rutland's latest ARM changes and adds
the x86 specific implementation of 'ifence_array_ptr'. That ifence
based approach is provided as an opt-in fallback, but the default
mitigation, '__array_ptr', uses a 'mask' approach that removes
conditional branch instructions, and otherwise aims to redirect
speculation to use a NULL pointer rather than a user controlled value.

The mask is generated by the following from Alexei and Linus:

mask = ~(long)(_i | (_s - 1 - _i)) >> (BITS_PER_LONG - 1);

...and Linus provided an optimized mask generation helper for x86:

asm ("cmpq %1,%2; sbbq %0,%0;"
:"=r" (mask)
:"r"(sz),"r" (idx)
:"cc");
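
For readers who want to check the arithmetic, the generic expression can be
compared in userspace against a plain comparison. This is a sketch under
stated assumptions: mask_generic() and mask_reference() are illustrative
names, BITS_PER_LONG is redefined locally, both arguments are assumed to be
below LONG_MAX, and the right shift of a negative long is assumed to be
arithmetic (as it is with GCC and clang).

```c
#include <limits.h>

#define BITS_PER_LONG	(sizeof(long) * CHAR_BIT)

/* The generic formula: ~0UL when idx < sz, 0 otherwise. */
unsigned long mask_generic(unsigned long idx, unsigned long sz)
{
	return (unsigned long)(~(long)(idx | (sz - 1 - idx)) >>
			       (BITS_PER_LONG - 1));
}

/* What the mask is meant to equal, written with an ordinary comparison. */
unsigned long mask_reference(unsigned long idx, unsigned long sz)
{
	return idx < sz ? ~0UL : 0UL;
}
```

With idx = 2 and sz = 4, the inner value is 2 | 1 = 3, whose complement is
negative, so the shift produces all ones. With idx = 4 and sz = 4, the
subtraction wraps to all ones, the complement is zero, and so is the mask.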

The 'array_ptr' mechanism can be switched between 'mask' and 'ifence'
via the spectre_v1={mask,ifence} command line option if
CONFIG_SPECTRE1_DYNAMIC=y, and the compile-time default is otherwise set
by selecting either CONFIG_SPECTRE1_MASK or CONFIG_SPECTRE1_IFENCE. This
level of sophistication is provided given concerns about 'value
speculation' [3].

The get_user protections and 'array_ptr' infrastructure are the only
concern of this patch set. Going forward 'array_ptr' is a tool that
sub-system maintainers can use to instrument array bounds checks like
'__fcheck_files'. When to use 'array_ptr' is saved for a future patch
set, and in the meantime the 'get_user' protections raise the bar for
launching a Spectre-v1 attack.

These patches are also available via the 'nospec-v3' git branch here:

git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux nospec-v3

Note that the BPF fix for Spectre variant1 is merged for 4.15-rc8.

[2]: 
https://googleprojectzero.blogspot.co.uk/2018/01/reading-privileged-memory-with-side.html
[3]: https://marc.info/?l=linux-netdev&m=151527996901350&w=2

---

Dan Williams (6):
  x86: implement ifence()
  x86: implement ifence_array_ptr() and array_ptr_mask()
  asm/nospec: mask speculative execution flows
  x86: introduce __uaccess_begin_nospec and ASM_IFENCE
  x86: use __uaccess_begin_nospec and ASM_IFENCE in get_user paths
  vfs, fdtable: prevent bounds-check bypass via speculative execution

Mark Rutland (3):
  Documentation: document array_ptr
  arm64: implement ifence_array_ptr()
  arm: implement ifence_array_ptr()


 Documentation/speculation.txt |  143 +
 arch/arm/Kconfig  |1 
 arch/arm/include/asm/barrier.h|   24 ++
 arch/arm64/Kconfig|1 
 arch/arm64/include/asm/barrier.h  |   24 ++
 arch/x86/Kconfig  |3 +
 arch/x86/include/asm/barrier.h|   50 +
 arch/x86/include/asm/msr.h|3 -
 arch/x86/include/asm/smap.h   |4 +
 arch/x86/include/asm/uaccess.h|   16 +++-
 arch/x86/include/asm/uaccess_32.h |6 +-
 arch/x86/include/asm/uaccess_64.h |   12 ++-
 arch/x86/lib/copy_user_64.S   |3 +
 arch/x86/lib/getuser.S|5 +
 arch/x86/lib/usercopy_32.c|8 +-
 include/linux/fdtable.h   |7 +-
 include/linux/nospec.h|   92 
 kernel/Kconfig.nospec |   46 
 kernel/Makefile   |1 
 kernel/nospec.c   |   52 +
 lib/Kconfig   |3 +
 21 files changed, 484 insertions(+), 20 deletions(-)
 create mode 100644 Documentation/speculation.txt
 create mode 100644 include/linux/nospec.h
 create mode 100644 kernel/Kconfig.nospec
 create mode 100644 kernel/nospec.c


[PATCH v3 2/9] arm64: implement ifence_array_ptr()

2018-01-13 Thread Dan Williams
From: Mark Rutland 

This patch implements ifence_array_ptr() for arm64, using an
LDR+CSEL+CSDB sequence to inhibit speculative use of the returned value.

Signed-off-by: Mark Rutland 
Signed-off-by: Will Deacon 
Cc: Catalin Marinas 
Cc: Peter Zijlstra 
Signed-off-by: Dan Williams 
---
 arch/arm64/include/asm/barrier.h |   24 
 1 file changed, 24 insertions(+)

diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 77651c49ef44..74ffcddb26e6 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -40,6 +40,30 @@
 #define dma_rmb()  dmb(oshld)
 #define dma_wmb()  dmb(oshst)
 
+#define ifence_array_ptr(arr, idx, sz) \
+({ \
+   typeof(&(arr)[0]) __nap_arr = (arr);\
+   typeof(idx) __nap_idx = (idx);  \
+   typeof(sz) __nap_sz = (sz); \
+   \
+   unsigned long __nap_ptr = (unsigned long)__nap_arr +\
+ sizeof(__nap_arr[0]) * idx;   \
+   \
+   asm volatile(   \
+   "   cmp %[i], %[s]\n"   \
+   "   b.cs1f\n"   \
+   "   ldr %[p], %[pp]\n"  \
+   "1: csel%[p], %[p], xzr, cc\n"  \
+   "   hint#0x14 // CSDB\n"\
+   : [p] "=&r" (__nap_ptr) \
+   : [pp] "m" (__nap_ptr), \
+ [i] "r" ((unsigned long)__nap_idx),   \
+ [s] "r" ((unsigned long)__nap_sz) \
+   : "cc");\
+   \
+   (typeof(&(__nap_arr)[0]))__nap_ptr; \
+})
+
 #define __smp_mb() dmb(ish)
 #define __smp_rmb()dmb(ishld)
 #define __smp_wmb()dmb(ishst)



[PATCH v3 4/9] x86: implement ifence()

2018-01-13 Thread Dan Williams
The new barrier, 'ifence', ensures that speculative execution never
crosses the fence.

Previously the kernel only needed this fence in 'rdtsc_ordered', but now
it is also proposed as a mitigation against Spectre variant1 attacks.
When used it needs to be placed in the success path after a bounds check
i.e.:

if (x < max) {
ifence();
val = array[x];
} else
return -EINVAL;

With this change the cpu will never issue speculative reads of
'array + x' with values of x >= max.
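
A userspace approximation of that placement is sketched below. It is not
the kernel's ifence(): spec_fence(), values and read_value() are
hypothetical names, GCC or clang inline assembly is assumed, and on
anything other than x86-64 the macro falls back to a compiler barrier that
does not stop hardware speculation.

```c
#include <errno.h>
#include <stddef.h>

#if defined(__x86_64__)
/* lfence is the instruction the ifence() alternative selects on most CPUs. */
#define spec_fence()	__asm__ __volatile__("lfence" ::: "memory")
#else
/* Compiler barrier only: NOT a speculation barrier. */
#define spec_fence()	__asm__ __volatile__("" ::: "memory")
#endif

static const int values[4] = { 1, 2, 3, 4 };

/* Returns the element at x, or -EINVAL when x is out of bounds. */
int read_value(size_t x)
{
	if (x < sizeof(values) / sizeof(values[0])) {
		/* Fence sits in the success path, before the dependent load. */
		spec_fence();
		return values[x];
	}
	return -EINVAL;
}
```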

'ifence', via 'ifence_array_ptr', is an opt-in fallback to the default
mitigation provided by '__array_ptr'. It is also proposed for blocking
speculation in the 'get_user' path that could otherwise bypass
'access_ok' checks. For now, just provide the common definition for
later patches to build upon.

Suggested-by: Peter Zijlstra 
Suggested-by: Alan Cox 
Cc: Tom Lendacky 
Cc: Mark Rutland 
Cc: Greg KH 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Signed-off-by: Elena Reshetova 
Signed-off-by: Dan Williams 
---
 arch/x86/include/asm/barrier.h |4 
 arch/x86/include/asm/msr.h |3 +--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index 7fb336210e1b..b04f572d6d97 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -24,6 +24,10 @@
 #define wmb()  asm volatile("sfence" ::: "memory")
 #endif
 
+/* prevent speculative execution past this barrier */
+#define ifence() alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC, \
+  "lfence", X86_FEATURE_LFENCE_RDTSC)
+
 #ifdef CONFIG_X86_PPRO_FENCE
 #define dma_rmb()  rmb()
 #else
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 07962f5f6fba..e426d2a33ff3 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -214,8 +214,7 @@ static __always_inline unsigned long long 
rdtsc_ordered(void)
 * that some other imaginary CPU is updating continuously with a
 * time stamp.
 */
-   alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC,
- "lfence", X86_FEATURE_LFENCE_RDTSC);
+   ifence();
return rdtsc();
 }
 



[PATCH v3 3/9] arm: implement ifence_array_ptr()

2018-01-13 Thread Dan Williams
From: Mark Rutland 

This patch implements ifence_array_ptr() for arm, using an
LDR+MOVCS+CSDB sequence to inhibit speculative use of the returned
value.

Cc: Russell King 
Signed-off-by: Mark Rutland 
Signed-off-by: Dan Williams 
---
 arch/arm/include/asm/barrier.h |   24 
 1 file changed, 24 insertions(+)

diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 40f5c410fd8c..919235ed6e68 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -59,6 +59,30 @@ extern void arm_heavy_mb(void);
 #define dma_wmb()  barrier()
 #endif
 
+#define ifence_array_ptr(arr, idx, sz) \
+({ \
+   typeof(&(arr)[0]) __nap_arr = (arr);\
+   typeof(idx) __nap_idx = (idx);  \
+   typeof(sz) __nap_sz = (sz); \
+   \
+   unsigned long __nap_ptr = (unsigned long)__nap_arr +\
+ sizeof(__nap_arr[0]) * idx;   \
+   \
+   asm volatile(   \
+   "   cmp %[i], %[s]\n"   \
+   "   bcs 1f\n"   \
+   "   ldr %[p], %[pp]\n"  \
+   "1: movcs   %[p], #0\n" \
+   "   .inst   0xe320f018 @ CSDB\n"\
+   : [p] "=&r" (__nap_ptr) \
+   : [pp] "m" (__nap_ptr), \
+ [i] "r" ((unsigned long)__nap_idx),   \
+ [s] "r" ((unsigned long)__nap_sz) \
+   : "cc");\
+   \
+   (typeof(&(__nap_arr)[0]))__nap_ptr; \
+})
+
 #define __smp_mb() dmb(ish)
 #define __smp_rmb()__smp_mb()
 #define __smp_wmb()dmb(ishst)



[PATCH v3 1/9] Documentation: document array_ptr

2018-01-13 Thread Dan Williams
From: Mark Rutland 

Document the rationale and usage of the new array_ptr() helper.

Signed-off-by: Mark Rutland 
Signed-off-by: Will Deacon 
Cc: Dan Williams 
Cc: Jonathan Corbet 
Cc: Peter Zijlstra 
Signed-off-by: Dan Williams 
---
 Documentation/speculation.txt |  143 +
 1 file changed, 143 insertions(+)
 create mode 100644 Documentation/speculation.txt

diff --git a/Documentation/speculation.txt b/Documentation/speculation.txt
new file mode 100644
index ..1e59d1d9eaf4
--- /dev/null
+++ b/Documentation/speculation.txt
@@ -0,0 +1,143 @@
+This document explains potential effects of speculation, and how undesirable
+effects can be mitigated portably using common APIs.
+
+===
+Speculation
+===
+
+To improve performance and minimize average latencies, many contemporary CPUs
+employ speculative execution techniques such as branch prediction, performing
+work which may be discarded at a later stage.
+
+Typically speculative execution cannot be observed from architectural state,
+such as the contents of registers. However, in some cases it is possible to
+observe its impact on microarchitectural state, such as the presence or
+absence of data in caches. Such state may form side-channels which can be
+observed to extract secret information.
+
+For example, in the presence of branch prediction, it is possible for bounds
+checks to be ignored by code which is speculatively executed. Consider the
+following code:
+
+   int load_array(int *array, unsigned int idx)
+   {
+   if (idx >= MAX_ARRAY_ELEMS)
+   return 0;
+   else
+   return array[idx];
+   }
+
+Which, on arm64, may be compiled to an assembly sequence such as:
+
+   CMP <idx>, #MAX_ARRAY_ELEMS
+   B.LT less
+   MOV <returnval>, #0
+   RET
+  less:
+   LDR <returnval>, [<array>, <idx>]
+   RET
+
+It is possible that a CPU mis-predicts the conditional branch, and
+speculatively loads array[idx], even if idx >= MAX_ARRAY_ELEMS. This value
+will subsequently be discarded, but the speculated load may affect
+microarchitectural state which can be subsequently measured.
+
+More complex sequences involving multiple dependent memory accesses may result
+in sensitive information being leaked. Consider the following code, building
+on the prior example:
+
+   int load_dependent_arrays(int *arr1, int *arr2, int idx)
+   {
+   int val1, val2;
+
+   val1 = load_array(arr1, idx);
+   val2 = load_array(arr2, val1);
+
+   return val2;
+   }
+
+Under speculation, the first call to load_array() may return the value of an
+out-of-bounds address, while the second call will influence microarchitectural
+state dependent on this value. This may provide an arbitrary read primitive.
+
+
+Mitigating speculation side-channels
+
+
+The kernel provides a generic API to ensure that bounds checks are respected
+even under speculation. Architectures which are affected by speculation-based
+side-channels are expected to implement these primitives.
+
+The array_ptr() helper in <linux/nospec.h> can be used to prevent
+information from being leaked via side-channels.
+
+A call to array_ptr(arr, idx, sz) returns a sanitized pointer to
+arr[idx] only if idx falls in the [0, sz) interval. When idx < 0 or idx >= sz,
+NULL is returned. Additionally, array_ptr() ensures that an out-of-bounds
+pointer is not propagated to code which is speculatively executed.
+
+This can be used to protect the earlier load_array() example:
+
+   int load_array(int *array, unsigned int idx)
+   {
+   int *elem;
+
+   elem = array_ptr(array, idx, MAX_ARRAY_ELEMS);
+   if (elem)
+   return *elem;
+   else
+   return 0;
+   }
+
+This can also be used in situations where multiple fields on a structure are
+accessed:
+
+   struct foo array[SIZE];
+   int a, b;
+
+   void do_thing(int idx)
+   {
+   struct foo *elem;
+
+   elem = array_ptr(array, idx, SIZE);
+   if (elem) {
+   a = elem->field_a;
+   b = elem->field_b;
+   }
+   }
+
+It is imperative that the returned pointer is used. Pointers which are
+generated separately are subject to a number of potential CPU and compiler
+optimizations, and may still be used speculatively. For example, this means
+that the following sequence is unsafe:
+
+   struct foo array[SIZE];
+   int a, b;
+
+   void do_thing(int idx)
+   {
+   if (array_ptr(array, idx, SIZE) != NULL) {
+   // unsafe as wrong pointer is used
+   a = array[idx].field_a;
+   b = array[idx].field_b;
+   }
+   }
+
+Similarly, it is unsafe to compare the returned poi

Re: [PATCH] retpoline/module: Taint kernel for missing retpoline in module

2018-01-13 Thread Andi Kleen
> > Also what's the point of putting this information into every symbol?
> 
> It makes it easy to check :)

Easier than nm?

Per symbol still doesn't make any sense to me.

> 
> > Once per module is good enough.
> > 
> > We already have similar checks for staging etc.
> 
> Sure, but this is more of a "Hey, your version of GCC is doing something
> different than what you built the kernel with, watch out!" which is much
> more generic and good to know.  A whole taint for one CPU bug type seems
> overkill to me.

I removed the taint in version 2, posted yesterday. It now just prints
the warning and resets the vulnerability reporting in sysfs.

-Andi


Re: [PATCH 4.9] x86/pti/efi: broken conversion from efi to kernel page table

2018-01-13 Thread Greg KH
On Sat, Jan 13, 2018 at 12:40:10PM -0500, Pavel Tatashin wrote:
> Hi Greg,
> 
> Yeah, the one in pgtable.c needs to be removed, I wonder how it
> compiled... I will submit a new patch for 4.9 sometime later.

It builds, just gives a warning, easy to miss if you aren't looking for
it :)



Re: [PATCH 4/8] irqchip/gic-v3: add ability to save/restore GIC/ITS state

2018-01-13 Thread Marc Zyngier
[I remember asking you to copy Sudeep Holla on this. Please do so the
next time around]

On Fri, 12 Jan 2018 21:24:18 +,
Derek Basehore wrote:
> 
> Some platforms power off GIC logic in S3, so we need to save/restore

S3 is not a GIC concept, and is only vaguely mentioned in terms of
the rk3399 silicon, if grep serves me right. Please expand on what
state this is exactly.

> state. This adds a DT-binding to save/restore the GICD/GICR/GITS
> states using the new CPU_PM_SYSTEM_ENTER/EXIT CPU PM states.

DT binding? I can't see any in this patch.

> 
> Change-Id: I1fb2117296373fa67397fdd4a8960077b241462e

It's been mentioned somewhere else in the thread: these tags have no
purpose in the kernel. Please sanitise your patches before posting them.

> Signed-off-by: Derek Basehore 
> Signed-off-by: Brian Norris 

Who is the author of this patch? If that's a joint authorship, please
use the Co-Developed-by: tag.

> ---
>  drivers/irqchip/irq-gic-v3-its.c   |  51 ++
>  drivers/irqchip/irq-gic-v3.c   | 333 
> +++--
>  include/linux/irqchip/arm-gic-v3.h |  17 ++
>  3 files changed, 391 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c 
> b/drivers/irqchip/irq-gic-v3-its.c
> index 06f025fd5726..5cb808e3d0bf 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -85,6 +85,16 @@ struct its_baser {
>  
>  struct its_device;
>  
> +/*
> + * Saved ITS state - this is where saved state for the ITS is stored
> + * when it's disabled during system suspend.
> + */
> +struct its_ctx {
> + u64 cbaser;
> + u64 cwriter;

Why do you need to save cwriter? Do you expect to perform a
save/restore in the middle of a set of commands? I would really expect
the system to be in a quiescent state, and the command queue to be
reset to an empty state on resume. Why isn't it so?

> + u32 ctlr;
> +};
> +
>  /*
>   * The ITS structure - contains most of the infrastructure, with the
>   * top-level MSI domain, the command queue, the collections, and the
> @@ -101,6 +111,7 @@ struct its_node {
>   struct its_collection   *collections;
>   struct fwnode_handle *fwnode_handle;
>   u64 (*get_msi_base)(struct its_device *its_dev);
> + struct its_ctx  its_ctx;
>   struct list_head its_device_list;
>   u64 flags;
>   unsigned long   list_nr;
> @@ -3042,6 +3053,46 @@ static void its_enable_quirks(struct its_node *its)
>   gic_enable_quirks(iidr, its_quirks, its);
>  }
>  
> +void its_save_disable(void)
> +{
> + struct its_node *its;
> +
> + spin_lock(&its_lock);
> + list_for_each_entry(its, &its_nodes, entry) {
> + struct its_ctx *ctx = &its->its_ctx;
> + void __iomem *base = its->base;
> + int i;
> +
> + ctx->ctlr = readl_relaxed(base + GITS_CTLR);
> + its_force_quiescent(base);

What if the ITS fails to become quiescent?
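
At the very least the failure should be propagated so that the suspend
can be aborted. As a sketch only, here is a userspace model with made-up
fake_* names. It assumes the quiesce helper polls a status bit a bounded
number of times and returns -EBUSY on timeout, and that any ITS already
disabled is put back in its previous state when a later one fails:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define FAKE_CTLR_ENABLE	(1U << 0)
#define FAKE_CTLR_QUIESCENT	(1U << 31)

/* Hypothetical ITS: reports quiescent after 'busy_polls' further reads. */
struct fake_its {
	uint32_t ctlr;
	unsigned int busy_polls;
};

static uint32_t fake_read_ctlr(struct fake_its *its)
{
	if (!(its->ctlr & FAKE_CTLR_ENABLE)) {
		if (its->busy_polls == 0)
			its->ctlr |= FAKE_CTLR_QUIESCENT;
		else
			its->busy_polls--;
	}
	return its->ctlr;
}

/* Disable the ITS and poll for quiescence; -EBUSY if it never settles. */
static int fake_force_quiescent(struct fake_its *its, unsigned int max_polls)
{
	its->ctlr &= ~FAKE_CTLR_ENABLE;
	while (max_polls--) {
		if (fake_read_ctlr(its) & FAKE_CTLR_QUIESCENT)
			return 0;
	}
	return -EBUSY;
}

/* Quiesce every ITS; on failure, restore the ones touched so far. */
static int fake_save_all(struct fake_its *its, uint32_t *saved, int n,
			 unsigned int max_polls)
{
	int i, err;

	for (i = 0; i < n; i++) {
		saved[i] = its[i].ctlr;
		err = fake_force_quiescent(&its[i], max_polls);
		if (err)
			goto unwind;
	}
	return 0;

unwind:
	for (; i >= 0; i--)
		its[i].ctlr = saved[i];
	return err;
}
```

The caller can then return the error from its suspend hook instead of
saving state from an ITS that is still processing commands.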

> + ctx->cbaser = gits_read_cbaser(base + GITS_CBASER);
> + ctx->cwriter = readq_relaxed(base + GITS_CWRITER);

How about those systems that do not have a readq (32bit)? Please make
sure this builds on 32bit too.
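
The usual way out is to split the access into two 32bit reads, which is
what the lo_hi_readq() helper from linux/io-64-nonatomic-lo-hi.h does in
the kernel. The following is only a userspace illustration of the idea,
with made-up fake_* names, assuming the low word sits at the lower
address and that nothing modifies the register between the two reads:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a 32bit MMIO read. */
static uint32_t fake_readl(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

/*
 * Read a 64bit register as two 32bit halves, low word first.
 * This is not atomic: it is only safe when the register cannot
 * change between the two reads (e.g. the device is quiescent).
 */
static uint64_t fake_lo_hi_readq(const volatile void *addr)
{
	const volatile uint8_t *p = addr;
	uint32_t lo, hi;

	assert(((uintptr_t)addr & 7) == 0);	/* 64bit registers are 8-byte aligned */

	lo = fake_readl(p);
	hi = fake_readl(p + 4);
	return ((uint64_t)hi << 32) | lo;
}
```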

> + for (i = 0; i < ARRAY_SIZE(its->tables); i++)
> + its->tables[i].val = its_read_baser(its, &its->tables[i]);
> + }
> + spin_unlock(&its_lock);
> +}
> +
> +void its_restore_enable(void)
> +{
> + struct its_node *its;
> +
> + spin_lock(&its_lock);
> + list_for_each_entry(its, &its_nodes, entry) {
> + struct its_ctx *ctx = &its->its_ctx;
> + void __iomem *base = its->base;
> + int i;
> +
> + gits_write_cbaser(ctx->cbaser, base + GITS_CBASER);
> + gits_write_cwriter(ctx->cwriter, base + GITS_CWRITER);
> + for (i = 0; i < ARRAY_SIZE(its->tables); i++)
> + its_write_baser(its, &its->tables[i],
> + its->tables[i].val);
> + writel_relaxed(ctx->ctlr, base + GITS_CTLR);
> + }
> + spin_unlock(&its_lock);
> +}
> +
>  static int its_init_domain(struct fwnode_handle *handle, struct its_node *its)
>  {
>   struct irq_domain *inner_domain;
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 9a7a15049903..95d37fb6f458 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -47,6 +47,36 @@ struct redist_region {
>   bool single_redist;
>  };
>  
> +struct gic_dist_ctx {
> + u64 *irouter;
> + u32 *igroupr;
> + u32 *isenabler;
> + u32 *ispendr;
> + u32 *isactiver;
> + u32 *ipriorityr;
> + u32 *icfgr;
> + u32

Re: [PATCHSET v5] blk-mq: reimplement timeout handling

2018-01-13 Thread Ming Lei
On Sat, Jan 13, 2018 at 10:45:14PM +0800, Ming Lei wrote:
> On Fri, Jan 12, 2018 at 04:55:34PM -0500, Laurence Oberman wrote:
> > On Fri, 2018-01-12 at 20:57 +, Bart Van Assche wrote:
> > > On Tue, 2018-01-09 at 08:29 -0800, Tejun Heo wrote:
> > > > Currently, blk-mq timeout path synchronizes against the usual
> > > > issue/completion path using a complex scheme involving atomic
> > > > bitflags, REQ_ATOM_*, memory barriers and subtle memory coherence
> > > > > rules.  Unfortunately, it contains quite a few holes.
> > > 
> > > Hello Tejun,
> > > 
> > > With this patch series applied I see weird hangs in blk_mq_get_tag()
> > > when I
> > > run the srp-test software. If I pull Jens' latest for-next branch and
> > > revert
> > > this patch series then the srp-test software runs successfully. Note:
> > > if you
> > > don't have InfiniBand hardware available then you will need the
> > > RDMA/CM
> > > patches for the SRP initiator and target drivers that have been
> > > posted
> > > recently on the linux-rdma mailing list to run the srp-test software.
> > > 
> > > This is how I run the srp-test software in a VM:
> > > 
> > > ./run_tests -c -d -r 10
> > > 
> > > Here is an example of what SysRq-w reported when the hang occurred:
> > > 
> > > sysrq: SysRq : Show Blocked State
> > > >   task            PC stack   pid father
> > > > kworker/u8:0    D12864     5      2 0x8000
> > > Workqueue: events_unbound sd_probe_async [sd_mod]
> > > Call Trace:
> > > ? __schedule+0x2b4/0xbb0
> > > schedule+0x2d/0x90
> > > io_schedule+0xd/0x30
> > > blk_mq_get_tag+0x169/0x290
> > > ? finish_wait+0x80/0x80
> > > blk_mq_get_request+0x16a/0x4f0
> > > blk_mq_alloc_request+0x59/0xc0
> > > blk_get_request_flags+0x3f/0x260
> > > scsi_execute+0x33/0x1e0 [scsi_mod]
> > > read_capacity_16.part.35+0x9c/0x460 [sd_mod]
> > > sd_revalidate_disk+0x14bb/0x1cb0 [sd_mod]
> > > sd_probe_async+0xf2/0x1a0 [sd_mod]
> > > process_one_work+0x21c/0x6d0
> > > worker_thread+0x35/0x380
> > > ? process_one_work+0x6d0/0x6d0
> > > kthread+0x117/0x130
> > > ? kthread_create_worker_on_cpu+0x40/0x40
> > > ret_from_fork+0x24/0x30
> > > systemd-udevd   D13672  1048285 0x0100
> > > Call Trace:
> > > ? __schedule+0x2b4/0xbb0
> > > schedule+0x2d/0x90
> > > io_schedule+0xd/0x30
> > > generic_file_read_iter+0x32f/0x970
> > > ? page_cache_tree_insert+0x100/0x100
> > > __vfs_read+0xcc/0x120
> > > vfs_read+0x96/0x140
> > > SyS_read+0x40/0xa0
> > > do_syscall_64+0x5f/0x1b0
> > > entry_SYSCALL64_slow_path+0x25/0x25
> > > RIP: 0033:0x7f8ce6d08d11
> > > RSP: 002b:7fff96dec288 EFLAGS: 0246 ORIG_RAX:
> > > 
> > > RAX: ffda RBX: 5651de7f6e10 RCX: 7f8ce6d08d11
> > > RDX: 0040 RSI: 5651de7f6e38 RDI: 0007
> > > RBP: 5651de7ea500 R08: 7f8ce6cf1c20 R09: 5651de7f6e10
> > > R10: 006f R11: 0246 R12: 01ff
> > > R13: 01ff0040 R14: 5651de7ea550 R15: 0040
> > > systemd-udevd   D13496  1049285 0x0100
> > > Call Trace:
> > > ? __schedule+0x2b4/0xbb0
> > > schedule+0x2d/0x90
> > > io_schedule+0xd/0x30
> > > blk_mq_get_tag+0x169/0x290
> > > ? finish_wait+0x80/0x80
> > > blk_mq_get_request+0x16a/0x4f0
> > > blk_mq_make_request+0x105/0x8e0
> > > ? generic_make_request+0xd6/0x3d0
> > > generic_make_request+0x103/0x3d0
> > > ? submit_bio+0x57/0x110
> > > submit_bio+0x57/0x110
> > > mpage_readpages+0x13b/0x160
> > > ? I_BDEV+0x10/0x10
> > > ? rcu_read_lock_sched_held+0x66/0x70
> > > ? __alloc_pages_nodemask+0x2e8/0x360
> > > __do_page_cache_readahead+0x2a4/0x370
> > > ? force_page_cache_readahead+0xaf/0x110
> > > force_page_cache_readahead+0xaf/0x110
> > > generic_file_read_iter+0x743/0x970
> > > ? find_held_lock+0x2d/0x90
> > > ? _raw_spin_unlock+0x29/0x40
> > > __vfs_read+0xcc/0x120
> > > vfs_read+0x96/0x140
> > > SyS_read+0x40/0xa0
> > > do_syscall_64+0x5f/0x1b0
> > > entry_SYSCALL64_slow_path+0x25/0x25
> > > RIP: 0033:0x7f8ce6d08d11
> > > RSP: 002b:7fff96dec8b8 EFLAGS: 0246 ORIG_RAX:
> > > 
> > > RAX: ffda RBX: 7f8ce7085010 RCX: 7f8ce6d08d11
> > > RDX: 0004 RSI: 7f8ce7085038 RDI: 000f
> > > RBP: 5651de7ec840 R08:  R09: 7f8ce7085010
> > > R10: 7f8ce7085028 R11: 0246 R12: 
> > > R13: 0004 R14: 5651de7ec890 R15: 0004
> > > systemd-udevd   D13672  1055285 0x0100
> > > Call Trace:
> > > ? __schedule+0x2b4/0xbb0
> > > schedule+0x2d/0x90
> > > io_schedule+0xd/0x30
> > > blk_mq_get_tag+0x169/0x290
> > > ? finish_wait+0x80/0x80
> > > blk_mq_get_request+0x16a/0x4f0
> > > blk_mq_make_request+0x105/0x8e0
> > > ? generic_make_request+0xd6/0x3d0
> > > generic_make_request+0x103/0x3d0
> > > ? submit_bio+0x57/0x110
> > > submit_bio+0x57/0x110
> > > mpage_readpages+0x13b/0x160
> > > ? I_BDEV+0x10/0x10
> > > ? rcu_read_lock_sched_held+0x66/0x70
> > > ? __alloc_pages_nodemask+0x2e8/0x3

Re: [Cocci] [PATCH] Coccinelle: kzalloc-simple: Rename kzalloc-simple to zalloc-simple

2018-01-13 Thread Himanshu Jha
On Sat, Jan 13, 2018 at 03:02:10PM -0200, Fabio Estevam wrote:
> On Sat, Jan 13, 2018 at 1:57 PM, Himanshu Jha
>  wrote:
> > Rename kzalloc-simple to zalloc-simple since now the rule is not
> > specific to kzalloc function only, but also to many other zero memory
> > allocating functions specified in the rule.
> >
> > Signed-off-by: Himanshu Jha 
> > ---
> >  scripts/coccinelle/api/alloc/kzalloc-simple.cocci | 448 --
> >  scripts/coccinelle/api/alloc/zalloc.cocci | 448 ++
> 
> You could use 'git mv' and 'git format-patch -1 -M', so that git detects the rename.

Yes, I used 'git mv'.

It doesn't matter when applying through 'git am': both produce the same
result AFAIK, and the only difference is in the patch files generated by
'git format-patch'. I don't think that is important.

Masahiro, if you have any problem with this, please tell me and I can
resend it as suggested by Fabio.

-- 
Thanks
Himanshu Jha


Re: [PATCH 4.9] x86/pti/efi: broken conversion from efi to kernel page table

2018-01-13 Thread Pavel Tatashin
Hi Greg,

Yeah, the one in pgtable.c needs to be removed, I wonder how it
compiled... I will submit a new patch for 4.9 sometime later.

Thank you,
Pavel

On Sat, Jan 13, 2018 at 12:12 PM, Greg KH  wrote:
> On Thu, Jan 11, 2018 at 04:58:20PM -0500, Pavel Tatashin wrote:
>> The page table order must be increased for EFI table in order to avoid a
>> bug where NMI tries to change the page table to kernel page table, while
>> efi page table is active.
>>
>> For more discussion about this bug, see this thread:
>> http://lkml.iu.edu/hypermail/linux/kernel/1801.1/00951.html
>>
>> Signed-off-by: Pavel Tatashin 
>> Reviewed-by: Steven Sistare 
>> Acked-by: Jiri Kosina 
>> ---
>>  arch/x86/include/asm/pgalloc.h | 11 +++
>>  arch/x86/platform/efi/efi_64.c |  2 +-
>>  2 files changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
>> index b6d425999f99..1178a51b77f3 100644
>> --- a/arch/x86/include/asm/pgalloc.h
>> +++ b/arch/x86/include/asm/pgalloc.h
>> @@ -27,6 +27,17 @@ static inline void paravirt_release_pud(unsigned long pfn) {}
>>   */
>>  extern gfp_t __userpte_alloc_gfp;
>>
>> +#ifdef CONFIG_PAGE_TABLE_ISOLATION
>> +/*
>> + * Instead of one PGD, we acquire two PGDs.  Being order-1, it is
>> + * both 8k in size and 8k-aligned.  That lets us just flip bit 12
>> + * in a pointer to swap between the two 4k halves.
>> + */
>> +#define PGD_ALLOCATION_ORDER 1
>> +#else
>> +#define PGD_ALLOCATION_ORDER 0
>> +#endif
>
> This conflicts with the definition of PGD_ALLOCATION_ORDER in
> arch/x86/mm/pgtable.c that says:
>
> /*
>  * Instead of one pgd, Kaiser acquires two pgds.  Being order-1, it is
>  * both 8k in size and 8k-aligned.  That lets us just flip bit 12
>  * in a pointer to swap between the two 4k halves.
>  */
> #define PGD_ALLOCATION_ORDER kaiser_enabled
>
> So, which is it?
>
> I'm going to go drop this from the 4.9 stable queue because of this.
>
> thanks,
>
> greg k-h
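
For reference, the bit 12 trick described in the comment quoted above can
be modelled in userspace roughly as follows. This is a sketch only: the
fake_* names are invented for the example, and it assumes the kernel half
is the first 4k page of the order-1 allocation and the user half is the
second.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define FAKE_PAGE_SHIFT	12
#define FAKE_PAGE_SIZE	(1UL << FAKE_PAGE_SHIFT)
#define FAKE_PGD_ORDER	1	/* two pages: 8k in size, 8k-aligned */

/* Allocate an order-1 "PGD": size and alignment are both 8k. */
static void *fake_pgd_alloc(void)
{
	size_t size = FAKE_PAGE_SIZE << FAKE_PGD_ORDER;

	return aligned_alloc(size, size);
}

/* Setting bit 12 moves from the first 4k half to the second. */
static void *fake_kernel_to_user_pgd(void *kernel_pgd)
{
	assert(((uintptr_t)kernel_pgd & FAKE_PAGE_SIZE) == 0);
	return (void *)((uintptr_t)kernel_pgd | FAKE_PAGE_SIZE);
}

/* Clearing bit 12 always lands on the first 4k half. */
static void *fake_user_to_kernel_pgd(void *user_pgd)
{
	return (void *)((uintptr_t)user_pgd & ~(uintptr_t)FAKE_PAGE_SIZE);
}
```

With an order-0 allocation the address with bit 12 set belongs to some
unrelated page, which is why the EFI PGD has to be allocated with the
same order as every other PGD.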


Re

2018-01-13 Thread Alex



--
Hello,

I have a project I want to bring to you. Please respond for details.

Alex

