Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Nicholas Piggin
On Wed, 22 Aug 2018 22:03:40 -0700
Linus Torvalds  wrote:

> On Wed, Aug 22, 2018 at 9:33 PM Nicholas Piggin  wrote:
> >
> > I think it was quite well understood and fixed here, a145abf12c9 but
> > again that was before I really started looking at it.  
> 
> You don't understand the problem.

More fundamentally I think I didn't understand this fix, I think
actually powerpc/radix does have a bug here. a145abf12c9 was really
just a replacement for x86's hack of expanding the TLB invalidation
range when freeing page table to capture page walk cache (powerpc/radix
needs a different instruction so that didn't work for us).

But I hadn't really looked at this fix closely rather Peter's follow up
post about making powerpc page walk cache flushing design a generic
concept.

My point in this reply was more that my patches from the other month
weren't a blundering issue to fix this bug without realising it, they
were purely about avoiding the x86 TLB range expanding hack (that won't
be needed if generic users all move over).

> 
> All the x86 people thought WE ALREADY DID THAT.
> 
> Because we had done this all correctly over a decade ago!
> 
> Nobody realized that it had been screwed up by the powerpc code, and

The powerpc/hash code is not screwed up though AFAIKS. You can't
take arch specific code and slap a "generic" label on it, least of all
the crazy powerpc/hash code, you of all people would agree with that :)

> the commit you point to was believed to be a new *powerpc* only issue,
> because the semantics on powerpc has changed because of the radix
> tree.
> 
> The semantics on x86 have never changed, they've always been the same.
> So why would the x86 people react to powerpc doing something that x86
> had already always done.
> 
> See?
> 
> Nobody cared one whit about commit a145abf12c9, because it just
> handles a new powerpc-specific case.
> 
> > I don't really understand what the issue you have with powerpc here.
> > powerpc hash has the page table flushing accessors which are just
> > no-ops, it's the generic code that fails to call them properly. Surely
> > there was no powerpc patch that removed those calls from generic code?  
> 
> Yes there was.
> 
> Look where the generic code *came* from.
> 
> It's powerpc code.
> 
> See commit 267239116987 ("mm, powerpc: move the RCU page-table freeing
> into generic code").
> 
> The powerpc code was made into the generic code, because the powerpc
> code had to handle all those special RCU freeing things etc that
> others didn't.
> 
> It's just that when x86 was then switched over to use the "generic"
> code, people didn't realize that the generic code didn't do the TLB
> invalidations for page tables, because they hadn't been needed on
> powerpc.

Sure, there was a minor bug in the port. Not that it was a closely
guarded secret that powerpc didn't flush page table pages, but it's a
relatively subtle issue in complex code. That happens.

> 
> So the powerpc code that was made generic, never really was. The new
> "generic" code had a powerpc-specific quirk.
> 
> That then very subtly broke x86, without the x86 people ever
> realizing. Because the old simple non-RCU x86 code had never had that
> issue, it just treated the leaf pages and the directory pages exactly
> the same.
> 
> See?
> 
> And THAT is why I talk about the powerpc code. Because what is
> "generic" code in 4.18 (and several releases before) oisn't actually
> generic.
> 
> And that's basically exactly the bug that the patches from PeterZ is
> fixing. Making the "tlb_remove_table()" code always flush the tlb, the
> way it should have when it was made generic.

It just sounded like you were blaming correct powerpc/hash code for
this. It's just a minor bug in taking that code into generic, not really
a big deal, right? Or are you saying powerpc devs or code could be doing
something better to play nicer with the rest of the archs?

Honestly trying to improve things here, and encouraged by x86 and ARM
looking to move over to a saner page walk cache tracking design and
sharing more code with powerpc/radix. I would help with reviewing
things or writing code or porting powerpc bits if I can.

Thanks,
Nick


Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Nicholas Piggin
On Wed, 22 Aug 2018 22:03:40 -0700
Linus Torvalds  wrote:

> On Wed, Aug 22, 2018 at 9:33 PM Nicholas Piggin  wrote:
> >
> > I think it was quite well understood and fixed here, a145abf12c9 but
> > again that was before I really started looking at it.  
> 
> You don't understand the problem.

More fundamentally I think I didn't understand this fix, I think
actually powerpc/radix does have a bug here. a145abf12c9 was really
just a replacement for x86's hack of expanding the TLB invalidation
range when freeing page table to capture page walk cache (powerpc/radix
needs a different instruction so that didn't work for us).

But I hadn't really looked at this fix closely rather Peter's follow up
post about making powerpc page walk cache flushing design a generic
concept.

My point in this reply was more that my patches from the other month
weren't a blundering issue to fix this bug without realising it, they
were purely about avoiding the x86 TLB range expanding hack (that won't
be needed if generic users all move over).

> 
> All the x86 people thought WE ALREADY DID THAT.
> 
> Because we had done this all correctly over a decade ago!
> 
> Nobody realized that it had been screwed up by the powerpc code, and

The powerpc/hash code is not screwed up though AFAIKS. You can't
take arch specific code and slap a "generic" label on it, least of all
the crazy powerpc/hash code, you of all people would agree with that :)

> the commit you point to was believed to be a new *powerpc* only issue,
> because the semantics on powerpc has changed because of the radix
> tree.
> 
> The semantics on x86 have never changed, they've always been the same.
> So why would the x86 people react to powerpc doing something that x86
> had already always done.
> 
> See?
> 
> Nobody cared one whit about commit a145abf12c9, because it just
> handles a new powerpc-specific case.
> 
> > I don't really understand what the issue you have with powerpc here.
> > powerpc hash has the page table flushing accessors which are just
> > no-ops, it's the generic code that fails to call them properly. Surely
> > there was no powerpc patch that removed those calls from generic code?  
> 
> Yes there was.
> 
> Look where the generic code *came* from.
> 
> It's powerpc code.
> 
> See commit 267239116987 ("mm, powerpc: move the RCU page-table freeing
> into generic code").
> 
> The powerpc code was made into the generic code, because the powerpc
> code had to handle all those special RCU freeing things etc that
> others didn't.
> 
> It's just that when x86 was then switched over to use the "generic"
> code, people didn't realize that the generic code didn't do the TLB
> invalidations for page tables, because they hadn't been needed on
> powerpc.

Sure, there was a minor bug in the port. Not that it was a closely
guarded secret that powerpc didn't flush page table pages, but it's a
relatively subtle issue in complex code. That happens.

> 
> So the powerpc code that was made generic, never really was. The new
> "generic" code had a powerpc-specific quirk.
> 
> That then very subtly broke x86, without the x86 people ever
> realizing. Because the old simple non-RCU x86 code had never had that
> issue, it just treated the leaf pages and the directory pages exactly
> the same.
> 
> See?
> 
> And THAT is why I talk about the powerpc code. Because what is
> "generic" code in 4.18 (and several releases before) oisn't actually
> generic.
> 
> And that's basically exactly the bug that the patches from PeterZ is
> fixing. Making the "tlb_remove_table()" code always flush the tlb, the
> way it should have when it was made generic.

It just sounded like you were blaming correct powerpc/hash code for
this. It's just a minor bug in taking that code into generic, not really
a big deal, right? Or are you saying powerpc devs or code could be doing
something better to play nicer with the rest of the archs?

Honestly trying to improve things here, and encouraged by x86 and ARM
looking to move over to a saner page walk cache tracking design and
sharing more code with powerpc/radix. I would help with reviewing
things or writing code or porting powerpc bits if I can.

Thanks,
Nick


Re: [PATCH 1/2] dt-bindings: dmaengine: add DT binding for UniPhier MIO DMAC

2018-08-22 Thread Jassi Brar
On 23 August 2018 at 10:48, Masahiro Yamada
 wrote:
> Hi Jassi,
>
>
> 2018-08-21 19:44 GMT+09:00 Jassi Brar :
>> On 21 August 2018 at 15:17, Masahiro Yamada
>>  wrote:
>>> (+CC Rob, DT, LKML)
>>>
>>> I forgot to CC this to DT community...
>>>
>>>
>>> 2018-08-21 18:30 GMT+09:00 Masahiro Yamada :
 The MIO DMAC (Media IO DMA Controller) is used in UniPhier LD4,
 Pro4, and sLD8 SoCs.

 Signed-off-by: Masahiro Yamada 
 ---

  .../devicetree/bindings/dma/uniphier-mio-dmac.txt  | 28 
 ++
  1 file changed, 28 insertions(+)
  create mode 100644 
 Documentation/devicetree/bindings/dma/uniphier-mio-dmac.txt

 diff --git a/Documentation/devicetree/bindings/dma/uniphier-mio-dmac.txt 
 b/Documentation/devicetree/bindings/dma/uniphier-mio-dmac.txt
 new file mode 100644
 index 000..a9e969e
 --- /dev/null
 +++ b/Documentation/devicetree/bindings/dma/uniphier-mio-dmac.txt
 @@ -0,0 +1,28 @@
 +UniPhier Media IO DMA controller
 +
 +This works as an external DMA engine for SD/eMMC controllers etc.
 +found in UniPhier LD4, Pro4, sLD8 SoCs.
 +
 +Required properties:
 +- compatible: should be "socionext,uniphier-mio-dmac".
 +- reg: offset and length of the register set for the device.
 +- interrupts: a list of interrupt specifiers associated with the DMA 
 channels.
 +- clocks: a single clock specifier
 +- #dma-cells: should be <1>. The single cell represents the channel 
 number.
 +- dma-channels: specify the number of the DMA channels. This should match 
 to
 +  the number of tuples in the interrupts property.
 +
>> Can we not infer the number of channels from interrupt tuples? After
>> all the driver assumes they are same.
>
>
> It would be possible to count the number of tuples
> in "interrupts".
>
>
>
> I know of_irq_count(), but I do not see any driver
> in drivers/dma/ that calls it.
>
>
> I guess the reason is that of_irq_count() is not exported,
> so tristate drivers like this cannot use it.
>
>
> I checked Documentation/devicetree/bindings/dma/,
> and some controllers specify _redundant_ dma-channels property.
>
> fsl-mxs-dma.txt
> renesas,rcar-dmac.txt
> renesas,usb-dmac.txt
>
:) I am not sure "because others are doing it" is a good reason to
introduce redundancy.


> I also see counter-implementation.
>
>
> bcm2835-dma.c hard-codes the number of channels in the driver.
> tegra210-adma.c associates nr_channels with compatible string.
>
>
>
> I will wait for comments from the maintainers.
>
> If desired, I will export of_irq_count()
> and use it from my driver.
>
If you don't want to leave too much footprint, you could do

  count = 0;
  while (of_irq_parse_one(dev, count, ) == 0) count++

of_irq_parse_one() is already exported.

A good side-effect is you wouldn't have to hardcode the count in the
driver (like bcm and tegra examples you quote).

Having said that, I wouldn't lose sleep over it. So 

Cheers!


Re: [PATCH 1/2] dt-bindings: dmaengine: add DT binding for UniPhier MIO DMAC

2018-08-22 Thread Jassi Brar
On 23 August 2018 at 10:48, Masahiro Yamada
 wrote:
> Hi Jassi,
>
>
> 2018-08-21 19:44 GMT+09:00 Jassi Brar :
>> On 21 August 2018 at 15:17, Masahiro Yamada
>>  wrote:
>>> (+CC Rob, DT, LKML)
>>>
>>> I forgot to CC this to DT community...
>>>
>>>
>>> 2018-08-21 18:30 GMT+09:00 Masahiro Yamada :
 The MIO DMAC (Media IO DMA Controller) is used in UniPhier LD4,
 Pro4, and sLD8 SoCs.

 Signed-off-by: Masahiro Yamada 
 ---

  .../devicetree/bindings/dma/uniphier-mio-dmac.txt  | 28 
 ++
  1 file changed, 28 insertions(+)
  create mode 100644 
 Documentation/devicetree/bindings/dma/uniphier-mio-dmac.txt

 diff --git a/Documentation/devicetree/bindings/dma/uniphier-mio-dmac.txt 
 b/Documentation/devicetree/bindings/dma/uniphier-mio-dmac.txt
 new file mode 100644
 index 000..a9e969e
 --- /dev/null
 +++ b/Documentation/devicetree/bindings/dma/uniphier-mio-dmac.txt
 @@ -0,0 +1,28 @@
 +UniPhier Media IO DMA controller
 +
 +This works as an external DMA engine for SD/eMMC controllers etc.
 +found in UniPhier LD4, Pro4, sLD8 SoCs.
 +
 +Required properties:
 +- compatible: should be "socionext,uniphier-mio-dmac".
 +- reg: offset and length of the register set for the device.
 +- interrupts: a list of interrupt specifiers associated with the DMA 
 channels.
 +- clocks: a single clock specifier
 +- #dma-cells: should be <1>. The single cell represents the channel 
 number.
 +- dma-channels: specify the number of the DMA channels. This should match 
 to
 +  the number of tuples in the interrupts property.
 +
>> Can we not infer the number of channels from interrupt tuples? After
>> all the driver assumes they are same.
>
>
> It would be possible to count the number of tuples
> in "interrupts".
>
>
>
> I know of_irq_count(), but I do not see any driver
> in drivers/dma/ that calls it.
>
>
> I guess the reason is that of_irq_count() is not exported,
> so tristate drivers like this cannot use it.
>
>
> I checked Documentation/devicetree/bindings/dma/,
> and some controllers specify _redundant_ dma-channels property.
>
> fsl-mxs-dma.txt
> renesas,rcar-dmac.txt
> renesas,usb-dmac.txt
>
:) I am not sure "because others are doing it" is a good reason to
introduce redundancy.


> I also see counter-implementation.
>
>
> bcm2835-dma.c hard-codes the number of channels in the driver.
> tegra210-adma.c associates nr_channels with compatible string.
>
>
>
> I will wait for comments from the maintainers.
>
> If desired, I will export of_irq_count()
> and use it from my driver.
>
If you don't want to leave too much footprint, you could do

  count = 0;
  while (of_irq_parse_one(dev, count, ) == 0) count++

of_irq_parse_one() is already exported.

A good side-effect is you wouldn't have to hardcode the count in the
driver (like bcm and tegra examples you quote).

Having said that, I wouldn't lose sleep over it. So 

Cheers!


Re: KASAN: null-ptr-deref Write in binder_update_page_range

2018-08-22 Thread Minchan Kim
Hi,

On Wed, Aug 22, 2018 at 03:07:04PM +0900, Dae R. Jeong wrote:
> Reporting the crash: KASAN: null-ptr-deref Write in binder_update_page_range
> 
> This crash has been found in v4.18-rc3 using RaceFuzzer (a modified
> version of Syzkaller), which we describe more at the end of this
> report.
> 
> Our analysis shows that the race occurs when invoking two syscalls
> concurrently, mmap$binder() and ioctl$BINDER_WRITE_READ. More
> specifically, since two code lines `alloc->vma = vma;` and
> `alloc->vma_vm_mm = vma->vm_mm;` in binder_alloc_mmap_handler() is not
> an atomic operation during mmap$binder() syscall, there is a time
> window that `alloc->vma` is assigned but `alloc->vma_vm_mm` isn't
> assigned. It causes the null pointer dereference in
> binder_alloc_new_buf_locked() since it checks whether `alloc->vma` is
> NULL, but it doesn't check that `alloc->vma_vm_mm` is NULL. More
> details on the thread interleaving and the crash log are follows.
> 
> 
> Thread interleaving:
> CPU0 (binder_alloc_mmap_handler)  CPU1 
> (binder_alloc_new_buf_locked)
> = =
> // drivers/android/binder_alloc.c
> // #L718 (v4.18-rc3)
> alloc->vma = vma;
>   // 
> drivers/android/binder_alloc.c
>   // #L346 (v4.18-rc3)
>   if (alloc->vma == NULL) {
>   ...
>   // alloc->vma is not NULL 
> at this point
>   return ERR_PTR(-ESRCH);
>   }
>   ...
>   // #L438
>   binder_update_page_range(alloc, 
> 0,
>   (void 
> *)PAGE_ALIGN((uintptr_t)buffer->data),
>   end_page_addr);
> 
>   // In 
> binder_update_page_range() #L218
>   // But still alloc->vma_vm_mm 
> is NULL here
>   if (need_mm && 
> mmget_not_zero(alloc->vma_vm_mm))
> alloc->vma_vm_mm = vma->vm_mm;
> 
> 
> Crash Log:
> ==
> BUG: KASAN: null-ptr-deref in __atomic_add_unless 
> include/asm-generic/atomic-instrumented.h:89 [inline]
> BUG: KASAN: null-ptr-deref in atomic_add_unless include/linux/atomic.h:533 
> [inline]
> BUG: KASAN: null-ptr-deref in mmget_not_zero include/linux/sched/mm.h:75 
> [inline]
> BUG: KASAN: null-ptr-deref in binder_update_page_range+0xece/0x18e0 
> drivers/android/binder_alloc.c:218
> Write of size 4 at addr 0058 by task syz-executor0/11184
> 
> CPU: 1 PID: 11184 Comm: syz-executor0 Not tainted 4.18.0-rc3 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x16e/0x22c lib/dump_stack.c:113
>  kasan_report_error mm/kasan/report.c:352 [inline]
>  kasan_report+0x163/0x380 mm/kasan/report.c:412
>  check_memory_region_inline mm/kasan/kasan.c:260 [inline]
>  check_memory_region+0x140/0x1a0 mm/kasan/kasan.c:267
>  kasan_check_write+0x14/0x20 mm/kasan/kasan.c:278
>  __atomic_add_unless include/asm-generic/atomic-instrumented.h:89 [inline]
>  atomic_add_unless include/linux/atomic.h:533 [inline]
>  mmget_not_zero include/linux/sched/mm.h:75 [inline]
>  binder_update_page_range+0xece/0x18e0 drivers/android/binder_alloc.c:218
>  binder_alloc_new_buf_locked drivers/android/binder_alloc.c:443 [inline]
>  binder_alloc_new_buf+0x467/0xc30 drivers/android/binder_alloc.c:513
>  binder_transaction+0x125b/0x4fb0 drivers/android/binder.c:2957
>  binder_thread_write+0xc08/0x2770 drivers/android/binder.c:3528
>  binder_ioctl_write_read.isra.39+0x24f/0x8e0 drivers/android/binder.c:4456
>  binder_ioctl+0xa86/0xf34 drivers/android/binder.c:4596
>  vfs_ioctl fs/ioctl.c:46 [inline]
>  do_vfs_ioctl+0x154/0xd40 fs/ioctl.c:686
>  ksys_ioctl+0x94/0xb0 fs/ioctl.c:701
>  __do_sys_ioctl fs/ioctl.c:708 [inline]
>  __se_sys_ioctl fs/ioctl.c:706 [inline]
>  __x64_sys_ioctl+0x43/0x50 fs/ioctl.c:706
>  do_syscall_64+0x167/0x4b0 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x456469
> Code: 1d ba fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 
> 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 
> 83 eb b9 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:7f575f268b28 EFLAGS: 0246 ORIG_RAX: 0010
> RAX: ffda RBX: 0072bfa0 RCX: 00456469
> RDX: 23c0 RSI: c0306201 RDI: 0016
> RBP: 

Re: KASAN: null-ptr-deref Write in binder_update_page_range

2018-08-22 Thread Minchan Kim
Hi,

On Wed, Aug 22, 2018 at 03:07:04PM +0900, Dae R. Jeong wrote:
> Reporting the crash: KASAN: null-ptr-deref Write in binder_update_page_range
> 
> This crash has been found in v4.18-rc3 using RaceFuzzer (a modified
> version of Syzkaller), which we describe more at the end of this
> report.
> 
> Our analysis shows that the race occurs when invoking two syscalls
> concurrently, mmap$binder() and ioctl$BINDER_WRITE_READ. More
> specifically, since two code lines `alloc->vma = vma;` and
> `alloc->vma_vm_mm = vma->vm_mm;` in binder_alloc_mmap_handler() is not
> an atomic operation during mmap$binder() syscall, there is a time
> window that `alloc->vma` is assigned but `alloc->vma_vm_mm` isn't
> assigned. It causes the null pointer dereference in
> binder_alloc_new_buf_locked() since it checks whether `alloc->vma` is
> NULL, but it doesn't check that `alloc->vma_vm_mm` is NULL. More
> details on the thread interleaving and the crash log are follows.
> 
> 
> Thread interleaving:
> CPU0 (binder_alloc_mmap_handler)  CPU1 
> (binder_alloc_new_buf_locked)
> = =
> // drivers/android/binder_alloc.c
> // #L718 (v4.18-rc3)
> alloc->vma = vma;
>   // 
> drivers/android/binder_alloc.c
>   // #L346 (v4.18-rc3)
>   if (alloc->vma == NULL) {
>   ...
>   // alloc->vma is not NULL 
> at this point
>   return ERR_PTR(-ESRCH);
>   }
>   ...
>   // #L438
>   binder_update_page_range(alloc, 
> 0,
>   (void 
> *)PAGE_ALIGN((uintptr_t)buffer->data),
>   end_page_addr);
> 
>   // In 
> binder_update_page_range() #L218
>   // But still alloc->vma_vm_mm 
> is NULL here
>   if (need_mm && 
> mmget_not_zero(alloc->vma_vm_mm))
> alloc->vma_vm_mm = vma->vm_mm;
> 
> 
> Crash Log:
> ==
> BUG: KASAN: null-ptr-deref in __atomic_add_unless 
> include/asm-generic/atomic-instrumented.h:89 [inline]
> BUG: KASAN: null-ptr-deref in atomic_add_unless include/linux/atomic.h:533 
> [inline]
> BUG: KASAN: null-ptr-deref in mmget_not_zero include/linux/sched/mm.h:75 
> [inline]
> BUG: KASAN: null-ptr-deref in binder_update_page_range+0xece/0x18e0 
> drivers/android/binder_alloc.c:218
> Write of size 4 at addr 0058 by task syz-executor0/11184
> 
> CPU: 1 PID: 11184 Comm: syz-executor0 Not tainted 4.18.0-rc3 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x16e/0x22c lib/dump_stack.c:113
>  kasan_report_error mm/kasan/report.c:352 [inline]
>  kasan_report+0x163/0x380 mm/kasan/report.c:412
>  check_memory_region_inline mm/kasan/kasan.c:260 [inline]
>  check_memory_region+0x140/0x1a0 mm/kasan/kasan.c:267
>  kasan_check_write+0x14/0x20 mm/kasan/kasan.c:278
>  __atomic_add_unless include/asm-generic/atomic-instrumented.h:89 [inline]
>  atomic_add_unless include/linux/atomic.h:533 [inline]
>  mmget_not_zero include/linux/sched/mm.h:75 [inline]
>  binder_update_page_range+0xece/0x18e0 drivers/android/binder_alloc.c:218
>  binder_alloc_new_buf_locked drivers/android/binder_alloc.c:443 [inline]
>  binder_alloc_new_buf+0x467/0xc30 drivers/android/binder_alloc.c:513
>  binder_transaction+0x125b/0x4fb0 drivers/android/binder.c:2957
>  binder_thread_write+0xc08/0x2770 drivers/android/binder.c:3528
>  binder_ioctl_write_read.isra.39+0x24f/0x8e0 drivers/android/binder.c:4456
>  binder_ioctl+0xa86/0xf34 drivers/android/binder.c:4596
>  vfs_ioctl fs/ioctl.c:46 [inline]
>  do_vfs_ioctl+0x154/0xd40 fs/ioctl.c:686
>  ksys_ioctl+0x94/0xb0 fs/ioctl.c:701
>  __do_sys_ioctl fs/ioctl.c:708 [inline]
>  __se_sys_ioctl fs/ioctl.c:706 [inline]
>  __x64_sys_ioctl+0x43/0x50 fs/ioctl.c:706
>  do_syscall_64+0x167/0x4b0 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x456469
> Code: 1d ba fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 
> 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 
> 83 eb b9 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:7f575f268b28 EFLAGS: 0246 ORIG_RAX: 0010
> RAX: ffda RBX: 0072bfa0 RCX: 00456469
> RDX: 23c0 RSI: c0306201 RDI: 0016
> RBP: 

[PATCH] mei: fix use-after-free in mei_cl_write

2018-08-22 Thread john . hubbard
From: John Hubbard 

KASAN reports a use-after-free during startup, in mei_cl_write:

BUG: KASAN: use-after-free in mei_cl_write+0x601/0x870 [mei]
   (drivers/misc/mei/client.c:1770)

This is caused by commit 98e70866aacb ("mei: add support for variable
length mei headers."), which changed the return value from len, to
buf-size. That ends up using a stale buf pointer, because in some
situations, the cb (callback) is deleted before it gets here.

However, fortunately, len remains unchanged throughout the function
(and I don't see anything else that would require re-reading buf->size
either), so the fix is to simply revert the change, and return len, as
before.

CC: Tomas Winkler 
CC: Arnd Bergmann 
CC: Greg Kroah-Hartman 
Signed-off-by: John Hubbard 
---
 drivers/misc/mei/client.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/mei/client.c b/drivers/misc/mei/client.c
index 4ab6251d418e..ebdcf0b450e2 100644
--- a/drivers/misc/mei/client.c
+++ b/drivers/misc/mei/client.c
@@ -1767,7 +1767,7 @@ ssize_t mei_cl_write(struct mei_cl *cl, struct mei_cl_cb 
*cb)
}
}
 
-   rets = buf->size;
+   rets = len;
 err:
cl_dbg(dev, cl, "rpm: autosuspend\n");
pm_runtime_mark_last_busy(dev->dev);
-- 
2.18.0



[PATCH] mei: fix use-after-free in mei_cl_write

2018-08-22 Thread john . hubbard
From: John Hubbard 

KASAN reports a use-after-free during startup, in mei_cl_write:

BUG: KASAN: use-after-free in mei_cl_write+0x601/0x870 [mei]
   (drivers/misc/mei/client.c:1770)

This is caused by commit 98e70866aacb ("mei: add support for variable
length mei headers."), which changed the return value from len, to
buf-size. That ends up using a stale buf pointer, because in some
situations, the cb (callback) is deleted before it gets here.

However, fortunately, len remains unchanged throughout the function
(and I don't see anything else that would require re-reading buf->size
either), so the fix is to simply revert the change, and return len, as
before.

CC: Tomas Winkler 
CC: Arnd Bergmann 
CC: Greg Kroah-Hartman 
Signed-off-by: John Hubbard 
---
 drivers/misc/mei/client.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/mei/client.c b/drivers/misc/mei/client.c
index 4ab6251d418e..ebdcf0b450e2 100644
--- a/drivers/misc/mei/client.c
+++ b/drivers/misc/mei/client.c
@@ -1767,7 +1767,7 @@ ssize_t mei_cl_write(struct mei_cl *cl, struct mei_cl_cb 
*cb)
}
}
 
-   rets = buf->size;
+   rets = len;
 err:
cl_dbg(dev, cl, "rpm: autosuspend\n");
pm_runtime_mark_last_busy(dev->dev);
-- 
2.18.0



Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Benjamin Herrenschmidt
On Wed, 2018-08-22 at 22:11 -0700, Linus Torvalds wrote:
> On Wed, Aug 22, 2018 at 9:54 PM Benjamin Herrenschmidt  
> wrote:
> > 
> > 
> > So we do need a different flush instruction for the page tables vs. the
> > normal TLB pages.
> 
> Right. ARM wants it too. x86 is odd in that a regular "invlpg" already
> invalidates all the internal tlb cache nodes.
> 
> So the "new world order" is exactly that patch that PeterZ sent you, that 
> adds a
> 
> +   unsigned intfreed_tables : 1;
> 

 .../...

> So instead, when you get to the actual "tlb_flush(tlb)", you do
> exactly that - flush the tlb. And the mmu_gather structure shows you
> how much you need to flush. If you see that "freed_tables" is set,
> then you know that you need to also do the special instruction to
> flush the inner level caches. The range continues to show the page
> range.

Yup. That looks like a generic version of the "need_flush_all" flag we
have, which is fine by us.

Just don't blame powerpc for all the historical crap :-)

Cheers,
Ben.

> 
>Linus



Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Benjamin Herrenschmidt
On Wed, 2018-08-22 at 22:11 -0700, Linus Torvalds wrote:
> On Wed, Aug 22, 2018 at 9:54 PM Benjamin Herrenschmidt  
> wrote:
> > 
> > 
> > So we do need a different flush instruction for the page tables vs. the
> > normal TLB pages.
> 
> Right. ARM wants it too. x86 is odd in that a regular "invlpg" already
> invalidates all the internal tlb cache nodes.
> 
> So the "new world order" is exactly that patch that PeterZ sent you, that 
> adds a
> 
> +   unsigned intfreed_tables : 1;
> 

 .../...

> So instead, when you get to the actual "tlb_flush(tlb)", you do
> exactly that - flush the tlb. And the mmu_gather structure shows you
> how much you need to flush. If you see that "freed_tables" is set,
> then you know that you need to also do the special instruction to
> flush the inner level caches. The range continues to show the page
> range.

Yup. That looks like a generic version of the "need_flush_all" flag we
have, which is fine by us.

Just don't blame powerpc for all the historical crap :-)

Cheers,
Ben.

> 
>Linus



Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 10:11 PM Linus Torvalds
 wrote:
>
> So instead, when you get to the actual "tlb_flush(tlb)", you do
> exactly that - flush the tlb. And the mmu_gather structure shows you
> how much you need to flush. If you see that "freed_tables" is set,
> then you know that you need to also do the special instruction to
> flush the inner level caches. The range continues to show the page
> range.

Note that this obviously works fine for a hashed table model too - you
just ignore the "freed_tables" bit entirely and continue to do
whatever you always did.

And we can ignore it on x86 too, because we just see the range, and we
invalidate the range, and that will always invalidate the mid-level
caching too.

So the new bit is literally for arm and powerpc-radix (and maybe
s390), but we want to make the actual VM interface truly generic and
not have one set of code with five different behaviors (which we
_currently_ kind of have with the whole in addition to all the
HAVE_RCU_TABLE_FREE etc config options that modify how the code works.

It would be good to also cut down on the millions of functions that
each architecture can override, because Christ, it got very confusing
at times to follow just how the code flowed from generic code to
architecture-specific macros back to generic code and then
arch-specific inline helper functions.

It's a maze of underscores and "page" vs "table", and "flush" vs "remove" etc.

But that "it would be good to really make everybody to use as much of
the generic code as possible" and everybody have the same pattern,
that's a future thing. But the whole "let's just add that
"freed_tables" thing would be part of trying to get people to use the
same overall pattern, even if some architectures might not care about
that detail.

  Linus


Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 10:11 PM Linus Torvalds
 wrote:
>
> So instead, when you get to the actual "tlb_flush(tlb)", you do
> exactly that - flush the tlb. And the mmu_gather structure shows you
> how much you need to flush. If you see that "freed_tables" is set,
> then you know that you need to also do the special instruction to
> flush the inner level caches. The range continues to show the page
> range.

Note that this obviously works fine for a hashed table model too - you
just ignore the "freed_tables" bit entirely and continue to do
whatever you always did.

And we can ignore it on x86 too, because we just see the range, and we
invalidate the range, and that will always invalidate the mid-level
caching too.

So the new bit is literally for arm and powerpc-radix (and maybe
s390), but we want to make the actual VM interface truly generic and
not have one set of code with five different behaviors (which we
_currently_ kind of have with the whole in addition to all the
HAVE_RCU_TABLE_FREE etc config options that modify how the code works.

It would be good to also cut down on the millions of functions that
each architecture can override, because Christ, it got very confusing
at times to follow just how the code flowed from generic code to
architecture-specific macros back to generic code and then
arch-specific inline helper functions.

It's a maze of underscores and "page" vs "table", and "flush" vs "remove" etc.

But that "it would be good to really make everybody to use as much of
the generic code as possible" and everybody have the same pattern,
that's a future thing. But the whole "let's just add that
"freed_tables" thing would be part of trying to get people to use the
same overall pattern, even if some architectures might not care about
that detail.

  Linus


Re: [PATCH 1/2] dt-bindings: dmaengine: add DT binding for UniPhier MIO DMAC

2018-08-22 Thread Masahiro Yamada
Hi Jassi,


2018-08-21 19:44 GMT+09:00 Jassi Brar :
> On 21 August 2018 at 15:17, Masahiro Yamada
>  wrote:
>> (+CC Rob, DT, LKML)
>>
>> I forgot to CC this to DT community...
>>
>>
>> 2018-08-21 18:30 GMT+09:00 Masahiro Yamada :
>>> The MIO DMAC (Media IO DMA Controller) is used in UniPhier LD4,
>>> Pro4, and sLD8 SoCs.
>>>
>>> Signed-off-by: Masahiro Yamada 
>>> ---
>>>
>>>  .../devicetree/bindings/dma/uniphier-mio-dmac.txt  | 28 
>>> ++
>>>  1 file changed, 28 insertions(+)
>>>  create mode 100644 
>>> Documentation/devicetree/bindings/dma/uniphier-mio-dmac.txt
>>>
>>> diff --git a/Documentation/devicetree/bindings/dma/uniphier-mio-dmac.txt 
>>> b/Documentation/devicetree/bindings/dma/uniphier-mio-dmac.txt
>>> new file mode 100644
>>> index 000..a9e969e
>>> --- /dev/null
>>> +++ b/Documentation/devicetree/bindings/dma/uniphier-mio-dmac.txt
>>> @@ -0,0 +1,28 @@
>>> +UniPhier Media IO DMA controller
>>> +
>>> +This works as an external DMA engine for SD/eMMC controllers etc.
>>> +found in UniPhier LD4, Pro4, sLD8 SoCs.
>>> +
>>> +Required properties:
>>> +- compatible: should be "socionext,uniphier-mio-dmac".
>>> +- reg: offset and length of the register set for the device.
>>> +- interrupts: a list of interrupt specifiers associated with the DMA 
>>> channels.
>>> +- clocks: a single clock specifier
>>> +- #dma-cells: should be <1>. The single cell represents the channel number.
>>> +- dma-channels: specify the number of the DMA channels. This should match 
>>> to
>>> +  the number of tuples in the interrupts property.
>>> +
> Can we not infer the number of channels from interrupt tuples? After
> all the driver assumes they are same.


It would be possible to count the number of tuples
in "interrupts".



I know of_irq_count(), but I do not see any driver
in drivers/dma/ that calls it.


I guess the reason is that of_irq_count() is not exported,
so tristate drivers like this cannot use it.


I checked Documentation/devicetree/bindings/dma/,
and some controllers specify _redundant_ dma-channels property.

fsl-mxs-dma.txt
renesas,rcar-dmac.txt
renesas,usb-dmac.txt



I also see counter-implementation.


bcm2835-dma.c hard-codes the number of channels in the driver.
tegra210-adma.c associates nr_channels with compatible string.



I will wait for comments from the maintainers.

If desired, I will export of_irq_count()
and use it from my driver.



Thanks.

-- 
Best Regards
Masahiro Yamada


Re: [PATCH 1/2] dt-bindings: dmaengine: add DT binding for UniPhier MIO DMAC

2018-08-22 Thread Masahiro Yamada
Hi Jassi,


2018-08-21 19:44 GMT+09:00 Jassi Brar :
> On 21 August 2018 at 15:17, Masahiro Yamada
>  wrote:
>> (+CC Rob, DT, LKML)
>>
>> I forgot to CC this to DT community...
>>
>>
>> 2018-08-21 18:30 GMT+09:00 Masahiro Yamada :
>>> The MIO DMAC (Media IO DMA Controller) is used in UniPhier LD4,
>>> Pro4, and sLD8 SoCs.
>>>
>>> Signed-off-by: Masahiro Yamada 
>>> ---
>>>
>>>  .../devicetree/bindings/dma/uniphier-mio-dmac.txt  | 28 
>>> ++
>>>  1 file changed, 28 insertions(+)
>>>  create mode 100644 
>>> Documentation/devicetree/bindings/dma/uniphier-mio-dmac.txt
>>>
>>> diff --git a/Documentation/devicetree/bindings/dma/uniphier-mio-dmac.txt 
>>> b/Documentation/devicetree/bindings/dma/uniphier-mio-dmac.txt
>>> new file mode 100644
>>> index 000..a9e969e
>>> --- /dev/null
>>> +++ b/Documentation/devicetree/bindings/dma/uniphier-mio-dmac.txt
>>> @@ -0,0 +1,28 @@
>>> +UniPhier Media IO DMA controller
>>> +
>>> +This works as an external DMA engine for SD/eMMC controllers etc.
>>> +found in UniPhier LD4, Pro4, sLD8 SoCs.
>>> +
>>> +Required properties:
>>> +- compatible: should be "socionext,uniphier-mio-dmac".
>>> +- reg: offset and length of the register set for the device.
>>> +- interrupts: a list of interrupt specifiers associated with the DMA 
>>> channels.
>>> +- clocks: a single clock specifier
>>> +- #dma-cells: should be <1>. The single cell represents the channel number.
>>> +- dma-channels: specify the number of the DMA channels. This should match 
>>> to
>>> +  the number of tuples in the interrupts property.
>>> +
> Can we not infer the number of channels from interrupt tuples? After
> all the driver assumes they are same.


It would be possible to count the number of tuples
in "interrupts".



I know of_irq_count(), but I do not see any driver
in drivers/dma/ that calls it.


I guess the reason is that of_irq_count() is not exported,
so tristate drivers like this cannot use it.


I checked Documentation/devicetree/bindings/dma/,
and some controllers specify _redundant_ dma-channels property.

fsl-mxs-dma.txt
renesas,rcar-dmac.txt
renesas,usb-dmac.txt



I also see counter-implementation.


bcm2835-dma.c hard-codes the number of channels in the driver.
tegra210-adma.c associates nr_channels with compatible string.



I will wait for comments from the maintainers.

If desired, I will export of_irq_count()
and use it from my driver.



Thanks.

-- 
Best Regards
Masahiro Yamada


Re: [PATCH 2/4] mm/tlb: Remove tlb_remove_table() non-concurrent condition

2018-08-22 Thread Nicholas Piggin
On Wed, 22 Aug 2018 21:54:48 -0700
Linus Torvalds  wrote:

> On Wed, Aug 22, 2018 at 9:16 PM Nicholas Piggin  wrote:
> >
> > On Wed, 22 Aug 2018 20:35:16 -0700
> > Linus Torvalds  wrote:  
> > >
> > > And yes, those lazy tlbs are all kernel threads, but they can still
> > > speculatively load user addresses.  
> >
> > So?
> >
> > If the arch does not shoot those all down after the user page tables
> > are removed then it's buggy regardless of this short cut.  
> 
> Read the code.
> 
> That shortcut frees the pages *WITHOUT* doing that TLB flush.

I've read the code and I understand that. Hence I'm asking because I
did't think the changelog matched the change. But possibly it was
because I actually didn't read the changelog enough -- I guess it does
say the TLB operation is the problem. I may have got side tracked by
the word speculativee.

> It just
> does __tlb_remove_table(), which does *not* do that whole page
> queueing so that we can batch flush and then free later.
> 
> So the fast-case it's buggy, exactly because of the reason you state.

Okay sure, thanks for confirming it for me. I would ask for changelog
to be slightly expanded on, but maybe it's just my reading
comprehension needs improvement...

> > The only real problem I could see would be if a page walk cache still
> > points to the freed table, then the table gets re-allocated and used
> > elsewhere, and meanwhile a speculative access tries to load an entry
> > from the page that is an invalid form of page table that might cause
> > a machine check or something. That would be (u)arch specific, but if
> > that's what we're concerned with here it's a different issue and needs
> > to be documented as such.  
> 
> We've *seen* that case, we had exactly that when we were being
> aggressive about trying to avoid flushes for the lazy mmu case
> entirely, because "we can just flush when we activate the lazy mm
> state instead".
> 
> The logic was actually solid from a TLB case - who cares what garbage
> TLB entries we might speculatively fill, when we're going to flush
> them before they can possibly be used.
> 
> It turns out that that logic is solid, but hits another problem: at
> least some AMD CPU's will cause machine checks when the TLB entries
> are inconsistent with the machine setup. That can't happen with a
> *good* page table, but when you speculatively walk already freed
> tables, you can get any garbage.

Yeah that does make sense.

> 
> I forget what the exact trigger was, but this was actually reported.
> So you can't free page directory pages without flushing the tlb first
> (to make that internal tlb node caches are flushed).
> 
> So the logic for freeing leaf pages and freeing middle nodes has to be
> exactly the same: you make the modification to the page table to
> remove the node/leaf, you queue up the memory for freeing, you flush
> the tlb, and you then free the page.
> 
> That's the ordering that tlb_remove_page() has always honored, but
> that tlb_remove_tabl() didn't.
> 
> It honored it for the *normal* case, which is why it took so long to
> notice that the TLB shootdown had been broken on x86 when it moved to
> the "generic" code. The *normal* case does this all right, and batches
> things up, and then when the batch fills up it does a
> tlb_table_flush() which does the TLB flush and schedules the actual
> freeing.
> 
> But there were two cases that *didn't* do that. The special "I'm the
> only thread" fast case, and the "oops I ran out of memory, so now I'll
> fake it, and just synchronize with page twalkers manually, and then do
> that special direct remove without flushing the tlb".
> 
> NOTE! Jann triggered that one by
>  (a) forcing the machine low on memory
>  (b) force-poisoning the page tables immediately after free
> 
> I suspect it's really hard to trigger under normal loads, exactly
> because the *normal* case gets it right. It's literally the "oops, I
> can't batch up because I ran out of memory" case that gets it wrong.
> 
> (And the special single-thread + lazy or use_mm() case, but that's
> going to be entirely impossible to trigger, because in practice it's
> just a single thread, and you won't ever hit the magic timing needed
> that frees the page in the single thread at exactly the same time that
> some optimistic lazy mm on another cpu happens to speculatively load
> that address).
> 
> So the "atomic_read(_users)" case is likely entirely impossible to
> trigger any sane way. But because Jann found the problem with the 'ran
> out of memory" case, we started looking at the more theoretical cases
> that matched the same kind of "no tlb flush before free" pattern.

Thanks for giving that background. In that case I'm happy with this fix.

Thanks,
Nick


Re: [PATCH 2/4] mm/tlb: Remove tlb_remove_table() non-concurrent condition

2018-08-22 Thread Nicholas Piggin
On Wed, 22 Aug 2018 21:54:48 -0700
Linus Torvalds  wrote:

> On Wed, Aug 22, 2018 at 9:16 PM Nicholas Piggin  wrote:
> >
> > On Wed, 22 Aug 2018 20:35:16 -0700
> > Linus Torvalds  wrote:  
> > >
> > > And yes, those lazy tlbs are all kernel threads, but they can still
> > > speculatively load user addresses.  
> >
> > So?
> >
> > If the arch does not shoot those all down after the user page tables
> > are removed then it's buggy regardless of this short cut.  
> 
> Read the code.
> 
> That shortcut frees the pages *WITHOUT* doing that TLB flush.

I've read the code and I understand that. Hence I'm asking because I
did't think the changelog matched the change. But possibly it was
because I actually didn't read the changelog enough -- I guess it does
say the TLB operation is the problem. I may have got side tracked by
the word speculativee.

> It just
> does __tlb_remove_table(), which does *not* do that whole page
> queueing so that we can batch flush and then free later.
> 
> So the fast-case it's buggy, exactly because of the reason you state.

Okay sure, thanks for confirming it for me. I would ask for changelog
to be slightly expanded on, but maybe it's just my reading
comprehension needs improvement...

> > The only real problem I could see would be if a page walk cache still
> > points to the freed table, then the table gets re-allocated and used
> > elsewhere, and meanwhile a speculative access tries to load an entry
> > from the page that is an invalid form of page table that might cause
> > a machine check or something. That would be (u)arch specific, but if
> > that's what we're concerned with here it's a different issue and needs
> > to be documented as such.  
> 
> We've *seen* that case, we had exactly that when we were being
> aggressive about trying to avoid flushes for the lazy mmu case
> entirely, because "we can just flush when we activate the lazy mm
> state instead".
> 
> The logic was actually solid from a TLB case - who cares what garbage
> TLB entries we might speculatively fill, when we're going to flush
> them before they can possibly be used.
> 
> It turns out that that logic is solid, but hits another problem: at
> least some AMD CPU's will cause machine checks when the TLB entries
> are inconsistent with the machine setup. That can't happen with a
> *good* page table, but when you speculatively walk already freed
> tables, you can get any garbage.

Yeah that does make sense.

> 
> I forget what the exact trigger was, but this was actually reported.
> So you can't free page directory pages without flushing the tlb first
> (to make that internal tlb node caches are flushed).
> 
> So the logic for freeing leaf pages and freeing middle nodes has to be
> exactly the same: you make the modification to the page table to
> remove the node/leaf, you queue up the memory for freeing, you flush
> the tlb, and you then free the page.
> 
> That's the ordering that tlb_remove_page() has always honored, but
> that tlb_remove_tabl() didn't.
> 
> It honored it for the *normal* case, which is why it took so long to
> notice that the TLB shootdown had been broken on x86 when it moved to
> the "generic" code. The *normal* case does this all right, and batches
> things up, and then when the batch fills up it does a
> tlb_table_flush() which does the TLB flush and schedules the actual
> freeing.
> 
> But there were two cases that *didn't* do that. The special "I'm the
> only thread" fast case, and the "oops I ran out of memory, so now I'll
> fake it, and just synchronize with page twalkers manually, and then do
> that special direct remove without flushing the tlb".
> 
> NOTE! Jann triggered that one by
>  (a) forcing the machine low on memory
>  (b) force-poisoning the page tables immediately after free
> 
> I suspect it's really hard to trigger under normal loads, exactly
> because the *normal* case gets it right. It's literally the "oops, I
> can't batch up because I ran out of memory" case that gets it wrong.
> 
> (And the special single-thread + lazy or use_mm() case, but that's
> going to be entirely impossible to trigger, because in practice it's
> just a single thread, and you won't ever hit the magic timing needed
> that frees the page in the single thread at exactly the same time that
> some optimistic lazy mm on another cpu happens to speculatively load
> that address).
> 
> So the "atomic_read(_users)" case is likely entirely impossible to
> trigger any sane way. But because Jann found the problem with the 'ran
> out of memory" case, we started looking at the more theoretical cases
> that matched the same kind of "no tlb flush before free" pattern.

Thanks for giving that background. In that case I'm happy with this fix.

Thanks,
Nick


[PATCH] mm: respect arch_dup_mmap() return value

2018-08-22 Thread Nadav Amit
Commit d70f2a14b72a4 ("include/linux/sched/mm.h: uninline
mmdrop_async(), etc") ignored the return value of arch_dup_mmap(). As a
result, on x86, a failure to duplicate the LDT (e.g., due to memory
allocation error), would leave the duplicated memory mapping in an
inconsistent state.

Fix by regarding the return value, as it was before the change.

Fixes: d70f2a14b72a4 ("include/linux/sched/mm.h: uninline mmdrop_async(), etc")
Cc: Andrew Morton 
Cc: sta...@vger.kernel.org
Signed-off-by: Nadav Amit 
---
 kernel/fork.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 1b27babc4c78..4527d1d331de 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -549,8 +549,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
goto out;
}
/* a new mm has just been created */
-   arch_dup_mmap(oldmm, mm);
-   retval = 0;
+   retval = arch_dup_mmap(oldmm, mm);
 out:
up_write(>mmap_sem);
flush_tlb_mm(oldmm);
-- 
2.17.1



[PATCH] mm: respect arch_dup_mmap() return value

2018-08-22 Thread Nadav Amit
Commit d70f2a14b72a4 ("include/linux/sched/mm.h: uninline
mmdrop_async(), etc") ignored the return value of arch_dup_mmap(). As a
result, on x86, a failure to duplicate the LDT (e.g., due to memory
allocation error), would leave the duplicated memory mapping in an
inconsistent state.

Fix by regarding the return value, as it was before the change.

Fixes: d70f2a14b72a4 ("include/linux/sched/mm.h: uninline mmdrop_async(), etc")
Cc: Andrew Morton 
Cc: sta...@vger.kernel.org
Signed-off-by: Nadav Amit 
---
 kernel/fork.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 1b27babc4c78..4527d1d331de 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -549,8 +549,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
goto out;
}
/* a new mm has just been created */
-   arch_dup_mmap(oldmm, mm);
-   retval = 0;
+   retval = arch_dup_mmap(oldmm, mm);
 out:
up_write(>mmap_sem);
flush_tlb_mm(oldmm);
-- 
2.17.1



Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 9:54 PM Benjamin Herrenschmidt  wrote:
>
>
> So we do need a different flush instruction for the page tables vs. the
> normal TLB pages.

Right. ARM wants it too. x86 is odd in that a regular "invlpg" already
invalidates all the internal tlb cache nodes.

So the "new world order" is exactly that patch that PeterZ sent you, that adds a

+   unsigned intfreed_tables : 1;

to the 'struct mmu_gather', and then makes all those
pte/pmd/pud/p4d_free_tlb() functions set that bit.

So I'm referring to the email PeterZ sent you in this thread that said:

  Nick, Will is already looking at using this to remove the synchronous
  invalidation from __p*_free_tlb() for ARM, could you have a look to see
  if PowerPC-radix could benefit from that too?

  Basically, using a patch like the below, would give your tlb_flush()
  information on if tables were removed or not.

then, in that model, you do *not* need to override these
pte/pmd/pud/p4d_free_tlb() macros at all (well, you *can* if you want
to, for doing games with the range modification, but let's sayt that
you don't need that right now).

So instead, when you get to the actual "tlb_flush(tlb)", you do
exactly that - flush the tlb. And the mmu_gather structure shows you
how much you need to flush. If you see that "freed_tables" is set,
then you know that you need to also do the special instruction to
flush the inner level caches. The range continues to show the page
range.

   Linus


Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 9:54 PM Benjamin Herrenschmidt  wrote:
>
>
> So we do need a different flush instruction for the page tables vs. the
> normal TLB pages.

Right. ARM wants it too. x86 is odd in that a regular "invlpg" already
invalidates all the internal tlb cache nodes.

So the "new world order" is exactly that patch that PeterZ sent you, that adds a

+   unsigned intfreed_tables : 1;

to the 'struct mmu_gather', and then makes all those
pte/pmd/pud/p4d_free_tlb() functions set that bit.

So I'm referring to the email PeterZ sent you in this thread that said:

  Nick, Will is already looking at using this to remove the synchronous
  invalidation from __p*_free_tlb() for ARM, could you have a look to see
  if PowerPC-radix could benefit from that too?

  Basically, using a patch like the below, would give your tlb_flush()
  information on if tables were removed or not.

then, in that model, you do *not* need to override these
pte/pmd/pud/p4d_free_tlb() macros at all (well, you *can* if you want
to, for doing games with the range modification, but let's sayt that
you don't need that right now).

So instead, when you get to the actual "tlb_flush(tlb)", you do
exactly that - flush the tlb. And the mmu_gather structure shows you
how much you need to flush. If you see that "freed_tables" is set,
then you know that you need to also do the special instruction to
flush the inner level caches. The range continues to show the page
range.

   Linus


Re: [PATCH] iscsi-target: Don't use stack buffer for scatterlist

2018-08-22 Thread Mike Christie
ccing Maurizio because he was working on the same issue.

On 08/22/2018 12:37 PM, Laura Abbott wrote:
> Fedora got a bug report of a crash with iSCSI:
> 
> kernel BUG at include/linux/scatterlist.h:143!
> ...
> RIP: 0010:iscsit_do_crypto_hash_buf+0x154/0x180 [iscsi_target_mod]
> ...
>  Call Trace:
>   ? iscsi_target_tx_thread+0x200/0x200 [iscsi_target_mod]
>   iscsit_get_rx_pdu+0x4cd/0xa90 [iscsi_target_mod]
>   ? native_sched_clock+0x3e/0xa0
>   ? iscsi_target_tx_thread+0x200/0x200 [iscsi_target_mod]
>   iscsi_target_rx_thread+0x81/0xf0 [iscsi_target_mod]
>   kthread+0x120/0x140
>   ? kthread_create_worker_on_cpu+0x70/0x70
>   ret_from_fork+0x3a/0x50
> 
> This is a BUG_ON for using a stack buffer with a scatterlist.
> There are two cases that trigger this bug. Switch to using a
> dynamically allocated buffer for one case and do not assign
> a NULL buffer in another case.
> 
> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1615258
> Signed-off-by: Laura Abbott 
> ---
>  drivers/target/iscsi/iscsi_target.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/target/iscsi/iscsi_target.c 
> b/drivers/target/iscsi/iscsi_target.c
> index 8e223799347a..8b40f0e99e2c 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -1419,7 +1419,8 @@ static void iscsit_do_crypto_hash_buf(struct 
> ahash_request *hash,
>  
>   sg_init_table(sg, ARRAY_SIZE(sg));
>   sg_set_buf(sg, buf, payload_length);
> - sg_set_buf(sg + 1, pad_bytes, padding);
> + if (padding)
> + sg_set_buf(sg + 1, pad_bytes, padding);
>  
>   ahash_request_set_crypt(hash, sg, data_crc, payload_length + padding);
>  
> @@ -3913,10 +3914,14 @@ static bool iscsi_target_check_conn_state(struct 
> iscsi_conn *conn)
>  static void iscsit_get_rx_pdu(struct iscsi_conn *conn)
>  {
>   int ret;
> - u8 buffer[ISCSI_HDR_LEN], opcode;
> + u8 *buffer, opcode;
>   u32 checksum = 0, digest = 0;
>   struct kvec iov;
>  
> + buffer = kmalloc_array(ISCSI_HDR_LEN, sizeof(*buffer), GFP_KERNEL);
> + if (!buffer)
> + return;
> +
>   while (!kthread_should_stop()) {
>   /*
>* Ensure that both TX and RX per connection kthreads
> @@ -3985,6 +3990,8 @@ static void iscsit_get_rx_pdu(struct iscsi_conn *conn)
>   if (ret < 0)
>   return;
>   }

You need to also change all the returns to breaks between the kmalloc
above and kfree here.

> + kfree(buffer);
>  }
>  
>  int iscsi_target_rx_thread(void *arg)
> 



Re: [PATCH] iscsi-target: Don't use stack buffer for scatterlist

2018-08-22 Thread Mike Christie
ccing Maurizio because he was working on the same issue.

On 08/22/2018 12:37 PM, Laura Abbott wrote:
> Fedora got a bug report of a crash with iSCSI:
> 
> kernel BUG at include/linux/scatterlist.h:143!
> ...
> RIP: 0010:iscsit_do_crypto_hash_buf+0x154/0x180 [iscsi_target_mod]
> ...
>  Call Trace:
>   ? iscsi_target_tx_thread+0x200/0x200 [iscsi_target_mod]
>   iscsit_get_rx_pdu+0x4cd/0xa90 [iscsi_target_mod]
>   ? native_sched_clock+0x3e/0xa0
>   ? iscsi_target_tx_thread+0x200/0x200 [iscsi_target_mod]
>   iscsi_target_rx_thread+0x81/0xf0 [iscsi_target_mod]
>   kthread+0x120/0x140
>   ? kthread_create_worker_on_cpu+0x70/0x70
>   ret_from_fork+0x3a/0x50
> 
> This is a BUG_ON for using a stack buffer with a scatterlist.
> There are two cases that trigger this bug. Switch to using a
> dynamically allocated buffer for one case and do not assign
> a NULL buffer in another case.
> 
> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1615258
> Signed-off-by: Laura Abbott 
> ---
>  drivers/target/iscsi/iscsi_target.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/target/iscsi/iscsi_target.c 
> b/drivers/target/iscsi/iscsi_target.c
> index 8e223799347a..8b40f0e99e2c 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -1419,7 +1419,8 @@ static void iscsit_do_crypto_hash_buf(struct 
> ahash_request *hash,
>  
>   sg_init_table(sg, ARRAY_SIZE(sg));
>   sg_set_buf(sg, buf, payload_length);
> - sg_set_buf(sg + 1, pad_bytes, padding);
> + if (padding)
> + sg_set_buf(sg + 1, pad_bytes, padding);
>  
>   ahash_request_set_crypt(hash, sg, data_crc, payload_length + padding);
>  
> @@ -3913,10 +3914,14 @@ static bool iscsi_target_check_conn_state(struct 
> iscsi_conn *conn)
>  static void iscsit_get_rx_pdu(struct iscsi_conn *conn)
>  {
>   int ret;
> - u8 buffer[ISCSI_HDR_LEN], opcode;
> + u8 *buffer, opcode;
>   u32 checksum = 0, digest = 0;
>   struct kvec iov;
>  
> + buffer = kmalloc_array(ISCSI_HDR_LEN, sizeof(*buffer), GFP_KERNEL);
> + if (!buffer)
> + return;
> +
>   while (!kthread_should_stop()) {
>   /*
>* Ensure that both TX and RX per connection kthreads
> @@ -3985,6 +3990,8 @@ static void iscsit_get_rx_pdu(struct iscsi_conn *conn)
>   if (ret < 0)
>   return;
>   }

You need to also change all the returns to breaks between the kmalloc
above and kfree here.

> + kfree(buffer);
>  }
>  
>  int iscsi_target_rx_thread(void *arg)
> 



Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 9:33 PM Nicholas Piggin  wrote:
>
> I think it was quite well understood and fixed here, a145abf12c9 but
> again that was before I really started looking at it.

You don't understand the problem.

All the x86 people thought WE ALREADY DID THAT.

Because we had done this all correctly over a decade ago!

Nobody realized that it had been screwed up by the powerpc code, and
the commit you point to was believed to be a new *powerpc* only issue,
because the semantics on powerpc has changed because of the radix
tree.

The semantics on x86 have never changed, they've always been the same.
So why would the x86 people react to powerpc doing something that x86
had already always done.

See?

Nobody cared one whit about commit a145abf12c9, because it just
handles a new powerpc-specific case.

> I don't really understand what the issue you have with powerpc here.
> powerpc hash has the page table flushing accessors which are just
> no-ops, it's the generic code that fails to call them properly. Surely
> there was no powerpc patch that removed those calls from generic code?

Yes there was.

Look where the generic code *came* from.

It's powerpc code.

See commit 267239116987 ("mm, powerpc: move the RCU page-table freeing
into generic code").

The powerpc code was made into the generic code, because the powerpc
code had to handle all those special RCU freeing things etc that
others didn't.

It's just that when x86 was then switched over to use the "generic"
code, people didn't realize that the generic code didn't do the TLB
invalidations for page tables, because they hadn't been needed on
powerpc.

So the powerpc code that was made generic, never really was. The new
"generic" code had a powerpc-specific quirk.

That then very subtly broke x86, without the x86 people ever
realizing. Because the old simple non-RCU x86 code had never had that
issue, it just treated the leaf pages and the directory pages exactly
the same.

See?

And THAT is why I talk about the powerpc code. Because what is
"generic" code in 4.18 (and several releases before) oisn't actually
generic.

And that's basically exactly the bug that the patches from PeterZ is
fixing. Making the "tlb_remove_table()" code always flush the tlb, the
way it should have when it was made generic.

  Linus


Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 9:33 PM Nicholas Piggin  wrote:
>
> I think it was quite well understood and fixed here, a145abf12c9 but
> again that was before I really started looking at it.

You don't understand the problem.

All the x86 people thought WE ALREADY DID THAT.

Because we had done this all correctly over a decade ago!

Nobody realized that it had been screwed up by the powerpc code, and
the commit you point to was believed to be a new *powerpc* only issue,
because the semantics on powerpc has changed because of the radix
tree.

The semantics on x86 have never changed, they've always been the same.
So why would the x86 people react to powerpc doing something that x86
had already always done.

See?

Nobody cared one whit about commit a145abf12c9, because it just
handles a new powerpc-specific case.

> I don't really understand what the issue you have with powerpc here.
> powerpc hash has the page table flushing accessors which are just
> no-ops, it's the generic code that fails to call them properly. Surely
> there was no powerpc patch that removed those calls from generic code?

Yes there was.

Look where the generic code *came* from.

It's powerpc code.

See commit 267239116987 ("mm, powerpc: move the RCU page-table freeing
into generic code").

The powerpc code was made into the generic code, because the powerpc
code had to handle all those special RCU freeing things etc that
others didn't.

It's just that when x86 was then switched over to use the "generic"
code, people didn't realize that the generic code didn't do the TLB
invalidations for page tables, because they hadn't been needed on
powerpc.

So the powerpc code that was made generic, never really was. The new
"generic" code had a powerpc-specific quirk.

That then very subtly broke x86, without the x86 people ever
realizing. Because the old simple non-RCU x86 code had never had that
issue, it just treated the leaf pages and the directory pages exactly
the same.

See?

And THAT is why I talk about the powerpc code. Because what is
"generic" code in 4.18 (and several releases before) oisn't actually
generic.

And that's basically exactly the bug that the patches from PeterZ is
fixing. Making the "tlb_remove_table()" code always flush the tlb, the
way it should have when it was made generic.

  Linus


Re: [PATCH 2/4] mm/tlb: Remove tlb_remove_table() non-concurrent condition

2018-08-22 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 9:16 PM Nicholas Piggin  wrote:
>
> On Wed, 22 Aug 2018 20:35:16 -0700
> Linus Torvalds  wrote:
> >
> > And yes, those lazy tlbs are all kernel threads, but they can still
> > speculatively load user addresses.
>
> So?
>
> If the arch does not shoot those all down after the user page tables
> are removed then it's buggy regardless of this short cut.

Read the code.

That shortcut frees the pages *WITHOUT* doing that TLB flush. It just
does __tlb_remove_table(), which does *not* do that whole page
queueing so that we can batch flush and then free later.

So the fast-case it's buggy, exactly because of the reason you state.

The logic used to be that if you were the only cpu that had that tlb,
and it was a mid-level table, it didn't need that synchronization at
all.

And that logic is simply wrong. Exactly because even if mm_users is 1,
there can be things looking up TLB entries on other CPU's. Either
because of a lazy mm and a hw walker with speculation, or because of
use_mm() and a software walker.

So the whole "free immediately" shortcut was bogus. You *always* need
to queue, then flush the tlb, and then free.

That said, that wasn't actually the bug that Jann found. He found the
bug a few lines lower down, where the "I can't allocate memory for
queueing" ended up *also* not flushing the TLB.

> The only real problem I could see would be if a page walk cache still
> points to the freed table, then the table gets re-allocated and used
> elsewhere, and meanwhile a speculative access tries to load an entry
> from the page that is an invalid form of page table that might cause
> a machine check or something. That would be (u)arch specific, but if
> that's what we're concerned with here it's a different issue and needs
> to be documented as such.

We've *seen* that case, we had exactly that when we were being
aggressive about trying to avoid flushes for the lazy mmu case
entirely, because "we can just flush when we activate the lazy mm
state instead".

The logic was actually solid from a TLB case - who cares what garbage
TLB entries we might speculatively fill, when we're going to flush
them before they can possibly be used.

It turns out that that logic is solid, but hits another problem: at
least some AMD CPU's will cause machine checks when the TLB entries
are inconsistent with the machine setup. That can't happen with a
*good* page table, but when you speculatively walk already freed
tables, you can get any garbage.

I forget what the exact trigger was, but this was actually reported.
So you can't free page directory pages without flushing the tlb first
(to make that internal tlb node caches are flushed).

So the logic for freeing leaf pages and freeing middle nodes has to be
exactly the same: you make the modification to the page table to
remove the node/leaf, you queue up the memory for freeing, you flush
the tlb, and you then free the page.

That's the ordering that tlb_remove_page() has always honored, but
that tlb_remove_tabl() didn't.

It honored it for the *normal* case, which is why it took so long to
notice that the TLB shootdown had been broken on x86 when it moved to
the "generic" code. The *normal* case does this all right, and batches
things up, and then when the batch fills up it does a
tlb_table_flush() which does the TLB flush and schedules the actual
freeing.

But there were two cases that *didn't* do that. The special "I'm the
only thread" fast case, and the "oops I ran out of memory, so now I'll
fake it, and just synchronize with page twalkers manually, and then do
that special direct remove without flushing the tlb".

NOTE! Jann triggered that one by
 (a) forcing the machine low on memory
 (b) force-poisoning the page tables immediately after free

I suspect it's really hard to trigger under normal loads, exactly
because the *normal* case gets it right. It's literally the "oops, I
can't batch up because I ran out of memory" case that gets it wrong.

(And the special single-thread + lazy or use_mm() case, but that's
going to be entirely impossible to trigger, because in practice it's
just a single thread, and you won't ever hit the magic timing needed
that frees the page in the single thread at exactly the same time that
some optimistic lazy mm on another cpu happens to speculatively load
that address).

So the "atomic_read(_users)" case is likely entirely impossible to
trigger any sane way. But because Jann found the problem with the 'ran
out of memory" case, we started looking at the more theoretical cases
that matched the same kind of "no tlb flush before free" pattern.

  Linus


Re: [PATCH 2/4] mm/tlb: Remove tlb_remove_table() non-concurrent condition

2018-08-22 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 9:16 PM Nicholas Piggin  wrote:
>
> On Wed, 22 Aug 2018 20:35:16 -0700
> Linus Torvalds  wrote:
> >
> > And yes, those lazy tlbs are all kernel threads, but they can still
> > speculatively load user addresses.
>
> So?
>
> If the arch does not shoot those all down after the user page tables
> are removed then it's buggy regardless of this short cut.

Read the code.

That shortcut frees the pages *WITHOUT* doing that TLB flush. It just
does __tlb_remove_table(), which does *not* do that whole page
queueing so that we can batch flush and then free later.

So the fast-case it's buggy, exactly because of the reason you state.

The logic used to be that if you were the only cpu that had that tlb,
and it was a mid-level table, it didn't need that synchronization at
all.

And that logic is simply wrong. Exactly because even if mm_users is 1,
there can be things looking up TLB entries on other CPU's. Either
because of a lazy mm and a hw walker with speculation, or because of
use_mm() and a software walker.

So the whole "free immediately" shortcut was bogus. You *always* need
to queue, then flush the tlb, and then free.

That said, that wasn't actually the bug that Jann found. He found the
bug a few lines lower down, where the "I can't allocate memory for
queueing" ended up *also* not flushing the TLB.

> The only real problem I could see would be if a page walk cache still
> points to the freed table, then the table gets re-allocated and used
> elsewhere, and meanwhile a speculative access tries to load an entry
> from the page that is an invalid form of page table that might cause
> a machine check or something. That would be (u)arch specific, but if
> that's what we're concerned with here it's a different issue and needs
> to be documented as such.

We've *seen* that case, we had exactly that when we were being
aggressive about trying to avoid flushes for the lazy mmu case
entirely, because "we can just flush when we activate the lazy mm
state instead".

The logic was actually solid from a TLB case - who cares what garbage
TLB entries we might speculatively fill, when we're going to flush
them before they can possibly be used.

It turns out that that logic is solid, but hits another problem: at
least some AMD CPU's will cause machine checks when the TLB entries
are inconsistent with the machine setup. That can't happen with a
*good* page table, but when you speculatively walk already freed
tables, you can get any garbage.

I forget what the exact trigger was, but this was actually reported.
So you can't free page directory pages without flushing the tlb first
(to make that internal tlb node caches are flushed).

So the logic for freeing leaf pages and freeing middle nodes has to be
exactly the same: you make the modification to the page table to
remove the node/leaf, you queue up the memory for freeing, you flush
the tlb, and you then free the page.

That's the ordering that tlb_remove_page() has always honored, but
that tlb_remove_tabl() didn't.

It honored it for the *normal* case, which is why it took so long to
notice that the TLB shootdown had been broken on x86 when it moved to
the "generic" code. The *normal* case does this all right, and batches
things up, and then when the batch fills up it does a
tlb_table_flush() which does the TLB flush and schedules the actual
freeing.

But there were two cases that *didn't* do that. The special "I'm the
only thread" fast case, and the "oops I ran out of memory, so now I'll
fake it, and just synchronize with page twalkers manually, and then do
that special direct remove without flushing the tlb".

NOTE! Jann triggered that one by
 (a) forcing the machine low on memory
 (b) force-poisoning the page tables immediately after free

I suspect it's really hard to trigger under normal loads, exactly
because the *normal* case gets it right. It's literally the "oops, I
can't batch up because I ran out of memory" case that gets it wrong.

(And the special single-thread + lazy or use_mm() case, but that's
going to be entirely impossible to trigger, because in practice it's
just a single thread, and you won't ever hit the magic timing needed
that frees the page in the single thread at exactly the same time that
some optimistic lazy mm on another cpu happens to speculatively load
that address).

So the "atomic_read(_users)" case is likely entirely impossible to
trigger any sane way. But because Jann found the problem with the 'ran
out of memory" case, we started looking at the more theoretical cases
that matched the same kind of "no tlb flush before free" pattern.

  Linus


Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Benjamin Herrenschmidt
On Wed, 2018-08-22 at 20:59 -0700, Linus Torvalds wrote:
> On Wed, Aug 22, 2018 at 8:45 PM Nicholas Piggin  wrote:
> > 
> > powerpc/radix has no such issue, it already does this tracking.
> 
> Yeah, I now realize that this was why you wanted to add that hacky
> thing to the generic code, so that you can add the tlb_flush_pgtable()
> call.
> 
> I thought it was because powerpc had some special flush instruction
> for it, and the regular tlb flush didn't do it. But no. It was because
> the regular code had lost the tlb flush _entirely_, because powerpc
> didn't want it.

Heh :-) Well, back on hash we didn't (we do now with Radix) but I
wouldn't blame us for the generic code being broken ... the RCU table
freeing was in arch/powerpc at the time :-) I don't think it was us
making it generic :)

> > We were discussing this a couple of months ago, I wasn't aware of ARM's
> > issue but I suggested x86 could go the same way as powerpc.
> 
> The problem is that x86 _used_ to do this all correctly long long ago.
> 
> And then we switched over to the "generic" table flushing (which
> harkens back to the powerpc code).

Yes, we wrote it the RCU stuff to solve the races with SW walking,
which is completely orthogonal with HW walking & TLB content. We didn't
do the move to generic code though ;-)

> Which actually turned out to be not generic at all, and did not flush
> the internal pages like x86 used to (back when x86 just used
> tlb_remove_page for everything).

Well, having RCU do the flushing is rather generic, it makes sense
whenever there's somebody doing a SW walk *and* you don't have IPIs to
synchronize your flushes (ie, anybody with HW TLB invalidation
broadcast basically, so ARM and us).

> So as a result, x86 had unintentionally lost the TLB flush we used to
> have, because tlb_remove_table() had lost the tlb flushing because of
> a powerpc quirk.

This is a somewhat odd way of putting the "blame" :-) But yeah ok...

> You then added it back as a hacky per-architecture hook (apparently
> having realized that you never did it at all), which didn't fix the
> unintentional lack of flushing on x86.
> 
> So now we're going to do it right.  No more "oh, powerpc didn't need
> to flush because the hash tables weren't in the tlb at all" thing in
> the generic code that then others need to work around.

So we do need a different flush instruction for the page tables vs. the
normal TLB pages. 

Cheers,
Ben.




Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Benjamin Herrenschmidt
On Wed, 2018-08-22 at 20:59 -0700, Linus Torvalds wrote:
> On Wed, Aug 22, 2018 at 8:45 PM Nicholas Piggin  wrote:
> > 
> > powerpc/radix has no such issue, it already does this tracking.
> 
> Yeah, I now realize that this was why you wanted to add that hacky
> thing to the generic code, so that you can add the tlb_flush_pgtable()
> call.
> 
> I thought it was because powerpc had some special flush instruction
> for it, and the regular tlb flush didn't do it. But no. It was because
> the regular code had lost the tlb flush _entirely_, because powerpc
> didn't want it.

Heh :-) Well, back on hash we didn't (we do now with Radix) but I
wouldn't blame us for the generic code being broken ... the RCU table
freeing was in arch/powerpc at the time :-) I don't think it was us
making it generic :)

> > We were discussing this a couple of months ago, I wasn't aware of ARM's
> > issue but I suggested x86 could go the same way as powerpc.
> 
> The problem is that x86 _used_ to do this all correctly long long ago.
> 
> And then we switched over to the "generic" table flushing (which
> harkens back to the powerpc code).

Yes, we wrote it the RCU stuff to solve the races with SW walking,
which is completely orthogonal with HW walking & TLB content. We didn't
do the move to generic code though ;-)

> Which actually turned out to be not generic at all, and did not flush
> the internal pages like x86 used to (back when x86 just used
> tlb_remove_page for everything).

Well, having RCU do the flushing is rather generic, it makes sense
whenever there's somebody doing a SW walk *and* you don't have IPIs to
synchronize your flushes (ie, anybody with HW TLB invalidation
broadcast basically, so ARM and us).

> So as a result, x86 had unintentionally lost the TLB flush we used to
> have, because tlb_remove_table() had lost the tlb flushing because of
> a powerpc quirk.

This is a somewhat odd way of putting the "blame" :-) But yeah ok...

> You then added it back as a hacky per-architecture hook (apparently
> having realized that you never did it at all), which didn't fix the
> unintentional lack of flushing on x86.
> 
> So now we're going to do it right.  No more "oh, powerpc didn't need
> to flush because the hash tables weren't in the tlb at all" thing in
> the generic code that then others need to work around.

So we do need a different flush instruction for the page tables vs. the
normal TLB pages. 

Cheers,
Ben.




Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Nicholas Piggin
On Wed, 22 Aug 2018 20:59:46 -0700
Linus Torvalds  wrote:

> On Wed, Aug 22, 2018 at 8:45 PM Nicholas Piggin  wrote:
> >
> > powerpc/radix has no such issue, it already does this tracking.  
> 
> Yeah, I now realize that this was why you wanted to add that hacky
> thing to the generic code, so that you can add the tlb_flush_pgtable()
> call.
> 
> I thought it was because powerpc had some special flush instruction
> for it, and the regular tlb flush didn't do it. But no.

powerpc/radix does have a special instruction for it, that is why I
posted the patch :)

> It was because
> the regular code had lost the tlb flush _entirely_, because powerpc
> didn't want it.

I think that was long before I started looking at the code.
powerpc/hash hardware has no idea about the page tables so yeah they
don't need it.

> 
> > We were discussing this a couple of months ago, I wasn't aware of ARM's
> > issue but I suggested x86 could go the same way as powerpc.  
> 
> The problem is that x86 _used_ to do this all correctly long long ago.
> 
> And then we switched over to the "generic" table flushing (which
> harkens back to the powerpc code).
> 
> Which actually turned out to be not generic at all, and did not flush
> the internal pages like x86 used to (back when x86 just used
> tlb_remove_page for everything).
> 
> So as a result, x86 had unintentionally lost the TLB flush we used to
> have, because tlb_remove_table() had lost the tlb flushing because of
> a powerpc quirk.
> 
> You then added it back as a hacky per-architecture hook (apparently
> having realized that you never did it at all), which didn't fix the

I think it was quite well understood and fixed here, a145abf12c9 but
again that was before I really started looking at it.

The hooks I added recently are for a different reason, and it's
actaully the opposite problem -- to work around the hacky generic code
that x86 foisted on other archs.

> unintentional lack of flushing on x86.
> 
> So now we're going to do it right.  No more "oh, powerpc didn't need
> to flush because the hash tables weren't in the tlb at all" thing in
> the generic code that then others need to work around.

I don't really understand what the issue you have with powerpc here.
powerpc hash has the page table flushing accessors which are just
no-ops, it's the generic code that fails to call them properly. Surely
there was no powerpc patch that removed those calls from generic code?

powerpc/radix yes it does some arch specific things to do its page walk
cache flushing, but it is a better design than the hacks x86 has in
generic code, surely. I thought you basically agreed and thought x86 /
generic code could move to that kind of model.

Thanks,
Nick


Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Nicholas Piggin
On Wed, 22 Aug 2018 20:59:46 -0700
Linus Torvalds  wrote:

> On Wed, Aug 22, 2018 at 8:45 PM Nicholas Piggin  wrote:
> >
> > powerpc/radix has no such issue, it already does this tracking.  
> 
> Yeah, I now realize that this was why you wanted to add that hacky
> thing to the generic code, so that you can add the tlb_flush_pgtable()
> call.
> 
> I thought it was because powerpc had some special flush instruction
> for it, and the regular tlb flush didn't do it. But no.

powerpc/radix does have a special instruction for it, that is why I
posted the patch :)

> It was because
> the regular code had lost the tlb flush _entirely_, because powerpc
> didn't want it.

I think that was long before I started looking at the code.
powerpc/hash hardware has no idea about the page tables so yeah they
don't need it.

> 
> > We were discussing this a couple of months ago, I wasn't aware of ARM's
> > issue but I suggested x86 could go the same way as powerpc.  
> 
> The problem is that x86 _used_ to do this all correctly long long ago.
> 
> And then we switched over to the "generic" table flushing (which
> harkens back to the powerpc code).
> 
> Which actually turned out to be not generic at all, and did not flush
> the internal pages like x86 used to (back when x86 just used
> tlb_remove_page for everything).
> 
> So as a result, x86 had unintentionally lost the TLB flush we used to
> have, because tlb_remove_table() had lost the tlb flushing because of
> a powerpc quirk.
> 
> You then added it back as a hacky per-architecture hook (apparently
> having realized that you never did it at all), which didn't fix the

I think it was quite well understood and fixed here, a145abf12c9 but
again that was before I really started looking at it.

The hooks I added recently are for a different reason, and it's
actaully the opposite problem -- to work around the hacky generic code
that x86 foisted on other archs.

> unintentional lack of flushing on x86.
> 
> So now we're going to do it right.  No more "oh, powerpc didn't need
> to flush because the hash tables weren't in the tlb at all" thing in
> the generic code that then others need to work around.

I don't really understand what the issue you have with powerpc here.
powerpc hash has the page table flushing accessors which are just
no-ops, it's the generic code that fails to call them properly. Surely
there was no powerpc patch that removed those calls from generic code?

powerpc/radix yes it does some arch specific things to do its page walk
cache flushing, but it is a better design than the hacks x86 has in
generic code, surely. I thought you basically agreed and thought x86 /
generic code could move to that kind of model.

Thanks,
Nick


[GIT PULL 4/4] ARM: Device-tree updates

2018-08-22 Thread Olof Johansson
Business as usual -- the bulk of our changes are to devicetree files
with new hardware support, new SoCs and platforms, and new board types.

New SoCs/platforms:
- Raspberry Pi Compute Module (CM1) and IO board
- i.MX6SSL from NXP
- Renesas RZ/N1D SoC (R9A06G032), Dual Cortex-A7 with Ethernet, CAN and
PLC interfaces
- TI AM654 SoC, Quad Cortex-A53, safety subsystem with Cortex-R5
controllers, communication and PRU subsystem and lots of other
interfaces (PCIe, USB3, etc).

New boards and systems:
- Several Atmel at91-based boards from Laird
- Marvell Armada388-based Helios4 board from SolidRun
- Samsung Aires-based phones (s5pv210)
- Allwinner A64-based Pinebook laptop

In addition to the above, there's the usual amount of new devices
described on existing platforms, fixes and tweaks and new minor variants
of boards/platforms.



The following changes since commit a8bba4bb81f33b520f2e9bee78cf2e845e16c9c3:

  Merge tag 'armsoc-defconfig' into HEAD

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc.git tags/armsoc-dt

for you to fetch changes up to afd3e3dad6761ddf08119afe121bfbe096c0844b:

  Merge tag 'qcom-arm64-for-4.19-2' of 
git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux into next/dt



Adam Ford (6):
  ARM: dts: da850-evm: Enable SATA port
  ARM: dts: da850-evm: Enable LCD and backlight
  ARM: dts: am3517-evm: Remove unnessary PMIC parameters
  ARM: dts: am3517-som: Add WL127x Wifi
  ARM: dts: am3517-som:  Add builtin Bluetooth
  ARM: dts: am3517-evm: Enable USB1 Host

Alex Gonzalez (2):
  ARM: dts: imx6ul: Add DTS for ConnectCore 6UL System-On-Module (SOM)
  ARM: dts: imx6ul: Add DTS for ConnectCore 6UL SBC Express

Alexander Kurz (6):
  ARM: dts: imx6qdl-wandboard: remove regulators bus node
  ARM: dts: imx6dl-riotboard: remove regulators bus node
  ARM: dts: imx6qdl-wandboard: enable USB OTG
  ARM: dts: imx6dl-riotboard: fix OTG regulator polarity
  ARM: dts: imx50: fix KPP pin definition typos
  ARM: dts: imx50: remove non-existent USB instances

Alexandre Belloni (4):
  dt-bindings: arm: remove PMC bindings
  dt-bindings: clk: at91: Document all the PMC compatibles
  ARM: at91: fix USB clock detection handling
  ARM: dts: fix PMC compatible

Alexandre Torgue (2):
  ARM: dts: stm32: remove gpio aliases for stm32mp157c
  ARM: dts: stm32: Reorder nodes in stm32mp157c-ed1

Amelie Delaunay (9):
  ARM: dts: stm32: update rtc st,syscfg property on stm32f429
  ARM: dts: stm32: update rtc st,syscfg property on stm32f746
  ARM: dts: stm32: add RTC support to stm32mp157c
  ARM: dts: stm32: enable RTC on stm32mp157c-ed1
  ARM: dts: stm32: enable USB Host (USBH) EHCI controller on stm32mp157c-ev1
  ARM: dts: stm32: add USB OTG HS support for stm32mp157c SoC
  ARM: dts: stm32: enable USB OTG HS on stm32mp157c-ev1
  ARM: dts: stm32: add SPI support on stm32mp157c
  ARM: dts: stm32: add SPI1 support on stm32mp157c-ev1

Amit Kucheria (2):
  arm64: dts: msm8996: thermal: Initialise via DT and add second controller
  arm64: dts: sdm845: Add tsens nodes

Anand Moon (1):
  ARM: dts: exynos: Add missing interrupts for pwm node on Exynos5

Andre Przywara (1):
  arm64: dts: allwinner: a64: Add PWM controllers

Andrey Smirnov (13):
  ARM: dts: imx51-zii-rdu1: Make sure SD1_WP is low
  ARM: dts: imx51-zii-rdu1: Populate RAVE SP EEPROM nodes
  ARM: dts: imx6qdl-zii-rdu2: Populate RAVE SP EEPROM nodes
  ARM: dts: imx6qdl-zii-rdu2: Populate RAVE SP backlight node
  ARM: dts: imx6qdl-zii-rdu2: Populate RAVE SP power button node
  ARM: dts: imx51-babbage: Make use of pinctrl_usbh1reg
  ARM: dts: imx: Add ZII SCU2 Mezz board
  ARM: dts: imx: Add ZII SCU3 ESB
  ARM: dts: imx51-zii-scu3-esb: Add switch IRQ line pinumx config
  ARM: dts: imx51-zii-scu3-esb: Fix RAVE SP watchdog compatible string
  ARM: dts: vf610: Add ZII SSMB SPU3 board
  ARM: dts: vf610: Add ZII CFU1 board
  ARM: dts: vf610-zii-ssmb-spu3: Fix W=1 level warnings

Andy Gross (2):
  Merge tag 'qcom-arm64-for-4.19' into arm64-for-4.19-2
  Merge tag 'qcom-drivers-for-4.19' into arm64-for-4.19-2

Anson Huang (14):
  ARM: dts: imx: add cooling-cells for cpufreq cooling device
  ARM: dts: imx6sll: declare src module to be compatible to imx51's src
  ARM: dts: imx6sll-evk: enable usdhc3 slot
  ARM: dts: imx6: correct anatop regulators range
  ARM: dts: imx6sx: add ocram_s support
  ARM: dts: imx6ul: add GPIO clocks
  ARM: dts: imx6sll-evk: enable PWM1 for backlight driver
  ARM: dts: imx6sll-evk: correct lcd regulator GPIO pin
  ARM: dts: imx6sll-evk: enable SEIKO 43WVF1G lcdif panel
  ARM: dts: imx6sl-evk: add missing GPIO iomux setting
  ARM: dts: 

[GIT PULL 4/4] ARM: Device-tree updates

2018-08-22 Thread Olof Johansson
Business as usual -- the bulk of our changes are to devicetree files
with new hardware support, new SoCs and platforms, and new board types.

New SoCs/platforms:
- Raspberry Pi Compute Module (CM1) and IO board
- i.MX6SSL from NXP
- Renesas RZ/N1D SoC (R9A06G032), Dual Cortex-A7 with Ethernet, CAN and
PLC interfaces
- TI AM654 SoC, Quad Cortex-A53, safety subsystem with Cortex-R5
controllers, communication and PRU subsystem and lots of other
interfaces (PCIe, USB3, etc).

New boards and systems:
- Several Atmel at91-based boards from Laird
- Marvell Armada388-based Helios4 board from SolidRun
- Samsung Aires-based phones (s5pv210)
- Allwinner A64-based Pinebook laptop

In addition to the above, there's the usual amount of new devices
described on existing platforms, fixes and tweaks and new minor variants
of boards/platforms.



The following changes since commit a8bba4bb81f33b520f2e9bee78cf2e845e16c9c3:

  Merge tag 'armsoc-defconfig' into HEAD

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc.git tags/armsoc-dt

for you to fetch changes up to afd3e3dad6761ddf08119afe121bfbe096c0844b:

  Merge tag 'qcom-arm64-for-4.19-2' of 
git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux into next/dt



Adam Ford (6):
  ARM: dts: da850-evm: Enable SATA port
  ARM: dts: da850-evm: Enable LCD and backlight
  ARM: dts: am3517-evm: Remove unnessary PMIC parameters
  ARM: dts: am3517-som: Add WL127x Wifi
  ARM: dts: am3517-som:  Add builtin Bluetooth
  ARM: dts: am3517-evm: Enable USB1 Host

Alex Gonzalez (2):
  ARM: dts: imx6ul: Add DTS for ConnectCore 6UL System-On-Module (SOM)
  ARM: dts: imx6ul: Add DTS for ConnectCore 6UL SBC Express

Alexander Kurz (6):
  ARM: dts: imx6qdl-wandboard: remove regulators bus node
  ARM: dts: imx6dl-riotboard: remove regulators bus node
  ARM: dts: imx6qdl-wandboard: enable USB OTG
  ARM: dts: imx6dl-riotboard: fix OTG regulator polarity
  ARM: dts: imx50: fix KPP pin definition typos
  ARM: dts: imx50: remove non-existent USB instances

Alexandre Belloni (4):
  dt-bindings: arm: remove PMC bindings
  dt-bindings: clk: at91: Document all the PMC compatibles
  ARM: at91: fix USB clock detection handling
  ARM: dts: fix PMC compatible

Alexandre Torgue (2):
  ARM: dts: stm32: remove gpio aliases for stm32mp157c
  ARM: dts: stm32: Reorder nodes in stm32mp157c-ed1

Amelie Delaunay (9):
  ARM: dts: stm32: update rtc st,syscfg property on stm32f429
  ARM: dts: stm32: update rtc st,syscfg property on stm32f746
  ARM: dts: stm32: add RTC support to stm32mp157c
  ARM: dts: stm32: enable RTC on stm32mp157c-ed1
  ARM: dts: stm32: enable USB Host (USBH) EHCI controller on stm32mp157c-ev1
  ARM: dts: stm32: add USB OTG HS support for stm32mp157c SoC
  ARM: dts: stm32: enable USB OTG HS on stm32mp157c-ev1
  ARM: dts: stm32: add SPI support on stm32mp157c
  ARM: dts: stm32: add SPI1 support on stm32mp157c-ev1

Amit Kucheria (2):
  arm64: dts: msm8996: thermal: Initialise via DT and add second controller
  arm64: dts: sdm845: Add tsens nodes

Anand Moon (1):
  ARM: dts: exynos: Add missing interrupts for pwm node on Exynos5

Andre Przywara (1):
  arm64: dts: allwinner: a64: Add PWM controllers

Andrey Smirnov (13):
  ARM: dts: imx51-zii-rdu1: Make sure SD1_WP is low
  ARM: dts: imx51-zii-rdu1: Populate RAVE SP EEPROM nodes
  ARM: dts: imx6qdl-zii-rdu2: Populate RAVE SP EEPROM nodes
  ARM: dts: imx6qdl-zii-rdu2: Populate RAVE SP backlight node
  ARM: dts: imx6qdl-zii-rdu2: Populate RAVE SP power button node
  ARM: dts: imx51-babbage: Make use of pinctrl_usbh1reg
  ARM: dts: imx: Add ZII SCU2 Mezz board
  ARM: dts: imx: Add ZII SCU3 ESB
  ARM: dts: imx51-zii-scu3-esb: Add switch IRQ line pinumx config
  ARM: dts: imx51-zii-scu3-esb: Fix RAVE SP watchdog compatible string
  ARM: dts: vf610: Add ZII SSMB SPU3 board
  ARM: dts: vf610: Add ZII CFU1 board
  ARM: dts: vf610-zii-ssmb-spu3: Fix W=1 level warnings

Andy Gross (2):
  Merge tag 'qcom-arm64-for-4.19' into arm64-for-4.19-2
  Merge tag 'qcom-drivers-for-4.19' into arm64-for-4.19-2

Anson Huang (14):
  ARM: dts: imx: add cooling-cells for cpufreq cooling device
  ARM: dts: imx6sll: declare src module to be compatible to imx51's src
  ARM: dts: imx6sll-evk: enable usdhc3 slot
  ARM: dts: imx6: correct anatop regulators range
  ARM: dts: imx6sx: add ocram_s support
  ARM: dts: imx6ul: add GPIO clocks
  ARM: dts: imx6sll-evk: enable PWM1 for backlight driver
  ARM: dts: imx6sll-evk: correct lcd regulator GPIO pin
  ARM: dts: imx6sll-evk: enable SEIKO 43WVF1G lcdif panel
  ARM: dts: imx6sl-evk: add missing GPIO iomux setting
  ARM: dts: 

[GIT PULL 3/4] ARM: SoC defconfig updates

2018-08-22 Thread Olof Johansson
We keep these separate since some files are shared and conflict-prone,
but there isn't really much to write about here.

Some of the churnier pieces is for the Aspeed platforms, which did an
overdue refresh of the defconfig, and enabled USB gadget and some
drivers from there. Most of the rest are minor additions here and there
to turn on drivers that are needed or useful on the various platforms.



The following changes since commit 4344bb3ec602d8b613a2f6c71f84eb631468a432:

  Merge tag 'armsoc-drivers' into HEAD

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc.git 
tags/armsoc-defconfig

for you to fetch changes up to 07d268f541dee9e58ded056fb6782942faa89a91:

  Merge tag 'v4.19-rockchip-defconfig64-1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into 
next/defconfig



Abhishek Sahu (2):
  ARM: qcom_defconfig: Enable QCOM NAND related configs
  arm64: defconfig: Enable CONFIG_MTD_NAND_QCOM for IPQ8074

Adam Ford (1):
  ARM: davinci_all_defconfig: set CONFIG_BACKLIGHT_PWM=m

Amelie Delaunay (1):
  ARM: multi_v7_defconfig: enable STM32 RTC

Baruch Siach (2):
  ARM: mvebu_v7_defconfig: sync defconfig
  ARM: mvebu_v7_defconfig: enable SFP support

Benjamin Herrenschmidt (2):
  arm: configs: Add USB gadget to Aspeed G4 defconfig
  arm: configs: Add USB gadget to Aspeed G5 defconfig

David Lechner (2):
  ARM: davinci_all_defconfig: remove CONFIG_DAVINCI_RESET_CLOCKS
  ARM: davinci_all_defconfig: Enable Bluetooth

Dien Pham (1):
  arm64: defconfig: Enable BD9571MWV regulator

Enric Balletbo i Serra (1):
  arm64: defconfig: Enable more peripherals for Samsung Chromebook Plus.

Florian Fainelli (2):
  Merge tag 'bcm2835-defconfig-next-2018-07-03' into defconfig/next
  Merge tag 'bcm2835-defconfig-64-next-2018-07-03' into defconfig-arm64/next

Geert Uytterhoeven (5):
  ARM: shmobile: defconfig: Drop NET_VENDOR_=n
  ARM: shmobile: defconfig: Enable reset controller support
  ARM: shmobile: defconfig: Enable support for RZN1D-DB
  ARM: shmobile: defconfig: Disable /sbin/hotplug fork-bomb
  ARM: multi_v7_defconfig: Enable support for RZN1D-DB

Gregory CLEMENT (1):
  ARM: multi_v7_defconfig: Add Marvell NAND controller support

Hugues Fruchet (1):
  ARM: multi_v7_defconfig: enable STM32 DCMI media support

Joel Stanley (4):
  ARM: config: aspeed: Update defconfig
  ARM: config: multi_v5: Refresh configuration
  ARM: config: multi_v5: Enable ASPEED drivers
  ARM: config: aspeed: Enable new FSI drivers

Kishon Vijay Abraham I (1):
  ARM: configs: keystone: Enable CONFIG_MMC_SDHCI_OMAP

Kunihiko Hayashi (1):
  ARM: multi_v7_defconfig: add CONFIG_UNIPHIER_THERMAL and CONFIG_SNI_AVE

Leonard Crestez (1):
  ARM: imx_v6_v7_defconfig: Enable imx6qdl-sabreauto sensors

Murali Karicheri (1):
  ARM: keystone: k2g: enable micrel and dp83867 phys

Olof Johansson (15):
  Merge tag 'keystone_config_for_4.19' of 
git://git.kernel.org/.../ssantosh/linux-keystone into next/defconfig
  Merge tag 'arm-soc/for-4.19/defconfig' of 
https://github.com/Broadcom/stblinux into next/defconfig
  Merge tag 'arm-soc/for-4.19/defconfig-arm64' of 
https://github.com/Broadcom/stblinux into next/defconfig
  Merge tag 'davinci-for-v4.19/defconfig' of 
git://git.kernel.org/.../nsekhar/linux-davinci into next/defconfig
  Merge tag 'samsung-defconfig-4.19' of 
https://git.kernel.org/.../krzk/linux into next/defconfig
  Merge tag 'stm32-defconfig-for-v4.19-1' of 
git://git.kernel.org/.../atorgue/stm32 into next/defconfig
  Merge tag 'mvebu-defconfig-4.19-1' of git://git.infradead.org/linux-mvebu 
into next/defconfig
  Merge tag 'hisi-defconfig-for-4.19' of 
git://github.com/hisilicon/linux-hisi into next/defconfig
  Merge tag 'imx-defconfig-4.19' of git://git.kernel.org/.../shawnguo/linux 
into next/defconfig
  Merge tag 'renesas-arm64-defconfig-for-v4.19' of 
https://git.kernel.org/.../horms/renesas into next/defconfig
  Merge tag 'qcom-arm64-defconfig-for-4.19' of 
git://git.kernel.org/.../agross/linux into next/defconfig
  Merge tag 'qcom-defconfig-for-4.19' of 
git://git.kernel.org/.../agross/linux into next/defconfig
  Merge tag 'renesas-arm-defconfig-for-v4.19' of 
https://git.kernel.org/.../horms/renesas into next/defconfig
  Merge tag 'aspeed-4.19-defconfig' of git://git.kernel.org/.../joel/aspeed 
into next/defconfig
  Merge tag 'v4.19-rockchip-defconfig64-1' of 
git://git.kernel.org/.../mmind/linux-rockchip into next/defconfig

Paweł Chmiel (3):
  ARM: s5pv210_defconfig: Run make savedefconfig
  ARM: s5pv210_defconfig: Enable drivers for Samsung Aries based phones
  ARM: s5pv210_defconfig: Enable options needed to boot typical Linux distro

Pierre-Yves 

[GIT PULL 3/4] ARM: SoC defconfig updates

2018-08-22 Thread Olof Johansson
We keep these separate since some files are shared and conflict-prone,
but there isn't really much to write about here.

Some of the churnier pieces is for the Aspeed platforms, which did an
overdue refresh of the defconfig, and enabled USB gadget and some
drivers from there. Most of the rest are minor additions here and there
to turn on drivers that are needed or useful on the various platforms.



The following changes since commit 4344bb3ec602d8b613a2f6c71f84eb631468a432:

  Merge tag 'armsoc-drivers' into HEAD

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc.git 
tags/armsoc-defconfig

for you to fetch changes up to 07d268f541dee9e58ded056fb6782942faa89a91:

  Merge tag 'v4.19-rockchip-defconfig64-1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into 
next/defconfig



Abhishek Sahu (2):
  ARM: qcom_defconfig: Enable QCOM NAND related configs
  arm64: defconfig: Enable CONFIG_MTD_NAND_QCOM for IPQ8074

Adam Ford (1):
  ARM: davinci_all_defconfig: set CONFIG_BACKLIGHT_PWM=m

Amelie Delaunay (1):
  ARM: multi_v7_defconfig: enable STM32 RTC

Baruch Siach (2):
  ARM: mvebu_v7_defconfig: sync defconfig
  ARM: mvebu_v7_defconfig: enable SFP support

Benjamin Herrenschmidt (2):
  arm: configs: Add USB gadget to Aspeed G4 defconfig
  arm: configs: Add USB gadget to Aspeed G5 defconfig

David Lechner (2):
  ARM: davinci_all_defconfig: remove CONFIG_DAVINCI_RESET_CLOCKS
  ARM: davinci_all_defconfig: Enable Bluetooth

Dien Pham (1):
  arm64: defconfig: Enable BD9571MWV regulator

Enric Balletbo i Serra (1):
  arm64: defconfig: Enable more peripherals for Samsung Chromebook Plus.

Florian Fainelli (2):
  Merge tag 'bcm2835-defconfig-next-2018-07-03' into defconfig/next
  Merge tag 'bcm2835-defconfig-64-next-2018-07-03' into defconfig-arm64/next

Geert Uytterhoeven (5):
  ARM: shmobile: defconfig: Drop NET_VENDOR_=n
  ARM: shmobile: defconfig: Enable reset controller support
  ARM: shmobile: defconfig: Enable support for RZN1D-DB
  ARM: shmobile: defconfig: Disable /sbin/hotplug fork-bomb
  ARM: multi_v7_defconfig: Enable support for RZN1D-DB

Gregory CLEMENT (1):
  ARM: multi_v7_defconfig: Add Marvell NAND controller support

Hugues Fruchet (1):
  ARM: multi_v7_defconfig: enable STM32 DCMI media support

Joel Stanley (4):
  ARM: config: aspeed: Update defconfig
  ARM: config: multi_v5: Refresh configuration
  ARM: config: multi_v5: Enable ASPEED drivers
  ARM: config: aspeed: Enable new FSI drivers

Kishon Vijay Abraham I (1):
  ARM: configs: keystone: Enable CONFIG_MMC_SDHCI_OMAP

Kunihiko Hayashi (1):
  ARM: multi_v7_defconfig: add CONFIG_UNIPHIER_THERMAL and CONFIG_SNI_AVE

Leonard Crestez (1):
  ARM: imx_v6_v7_defconfig: Enable imx6qdl-sabreauto sensors

Murali Karicheri (1):
  ARM: keystone: k2g: enable micrel and dp83867 phys

Olof Johansson (15):
  Merge tag 'keystone_config_for_4.19' of 
git://git.kernel.org/.../ssantosh/linux-keystone into next/defconfig
  Merge tag 'arm-soc/for-4.19/defconfig' of 
https://github.com/Broadcom/stblinux into next/defconfig
  Merge tag 'arm-soc/for-4.19/defconfig-arm64' of 
https://github.com/Broadcom/stblinux into next/defconfig
  Merge tag 'davinci-for-v4.19/defconfig' of 
git://git.kernel.org/.../nsekhar/linux-davinci into next/defconfig
  Merge tag 'samsung-defconfig-4.19' of 
https://git.kernel.org/.../krzk/linux into next/defconfig
  Merge tag 'stm32-defconfig-for-v4.19-1' of 
git://git.kernel.org/.../atorgue/stm32 into next/defconfig
  Merge tag 'mvebu-defconfig-4.19-1' of git://git.infradead.org/linux-mvebu 
into next/defconfig
  Merge tag 'hisi-defconfig-for-4.19' of 
git://github.com/hisilicon/linux-hisi into next/defconfig
  Merge tag 'imx-defconfig-4.19' of git://git.kernel.org/.../shawnguo/linux 
into next/defconfig
  Merge tag 'renesas-arm64-defconfig-for-v4.19' of 
https://git.kernel.org/.../horms/renesas into next/defconfig
  Merge tag 'qcom-arm64-defconfig-for-4.19' of 
git://git.kernel.org/.../agross/linux into next/defconfig
  Merge tag 'qcom-defconfig-for-4.19' of 
git://git.kernel.org/.../agross/linux into next/defconfig
  Merge tag 'renesas-arm-defconfig-for-v4.19' of 
https://git.kernel.org/.../horms/renesas into next/defconfig
  Merge tag 'aspeed-4.19-defconfig' of git://git.kernel.org/.../joel/aspeed 
into next/defconfig
  Merge tag 'v4.19-rockchip-defconfig64-1' of 
git://git.kernel.org/.../mmind/linux-rockchip into next/defconfig

Paweł Chmiel (3):
  ARM: s5pv210_defconfig: Run make savedefconfig
  ARM: s5pv210_defconfig: Enable drivers for Samsung Aries based phones
  ARM: s5pv210_defconfig: Enable options needed to boot typical Linux distro

Pierre-Yves 

[GIT PULL 2/4] ARM: SoC driver updates

2018-08-22 Thread Olof Johansson
Some of the larger changes this merge window:
- Removal of drivers for Exynos5440, a Samsung SoC that never saw
widespread use.
- Uniphier support for USB3 and SPI reset handling
- Syste control and SRAM drivers and bindings for Allwinner platforms
- Qualcomm AOSS (Always-on subsystem) reset controller drivers
- Raspberry Pi hwmon driver for voltage
- Mediatek pwrap (pmic) support for MT6797 SoC



The following changes since commit 8928c21c542a61c7cf7e33794a84e774040fb718:

  Merge tag 'armsoc-soc' into HEAD

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc.git 
tags/armsoc-drivers

for you to fetch changes up to 29ed45fff05899f6f39d05fe1c32b1bc51f8926b:

  Merge tag 'v4.18-next-soc' of 
https://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux into 
next/drivers



Argus Lin (4):
  dt-bindings: pwrap: mediatek: add pwrap support for MT6797
  soc: mediatek: pwrap: fix cipher init setting error
  soc: mediatek: pwrap: add pwrap driver for mt6797 SoCs
  soc: mediatek: pwrap: add mt6351 driver for mt6797 SoCs

Arnd Bergmann (2):
  tee: replace getnstimeofday64() with ktime_get_real_ts64()
  soc: ti: wkup_m3_ipc: mark PM functions as __maybe_unused

Chen-Yu Tsai (2):
  dt-bindings: sram: Rename A64 SRAM controller compatible
  soc: sunxi: sram: Add updated compatible string for A64 system control

Dan Carpenter (1):
  firmware: arm_scmi: remove some unnecessary checks

Dave Gerlach (2):
  memory: ti-emif-sram: Add resume function to recopy sram code
  soc: ti: wkup_m3_ipc: Add wkup_m3_request_wake_src

Dmitry Osipenko (1):
  memory: tegra: Correct driver probe order

Doug Berger (1):
  soc: bcm: brcmstb: pm: Add support for newer rev B3.0 controllers

Faiz Abbas (3):
  clk: ti: dra7: Add clkctrl clock data for the mcan clocks
  bus: ti-sysc: Add support for using ti-sysc for MCAN on dra76x
  bus: ti-sysc: Add support for software reset

Florian Fainelli (2):
  soc: bcm: brcmstb: Add missing DDR MEMC compatible strings
  Merge tag 'bcm2835-drivers-next-2018-07-03' into drivers/next

Horia Geantă (1):
  staging: fsl-dpaa2: eth: move generic FD defines to DPIO

Icenowy Zheng (3):
  soc: sunxi: export a regmap for EMAC clock reg on A64
  dt-bindings: add binding for the Allwinner A64 DE2 bus
  bus: add bus driver for accessing Allwinner A64 DE2

Joakim Tjernlund (1):
  soc: fsl: qe: gpio: Add qe_gpio_set_multiple

Keerthy (1):
  soc: ti: wkup_m3_ipc: Add rtc_only with ddr in self refresh mode support

Kees Cook (1):
  firmware: raspberrypi: Remove VLA usage

Krzysztof Kozlowski (4):
  ata: ahci-platform: Remove support for Exynos5440
  cpufreq: exynos: Remove support for Exynos5440
  clk: samsung: Remove support for Exynos5440
  usb: host: exynos: Remove support for Exynos5440

Kunihiko Hayashi (4):
  reset: simple: export reset_simple_ops to be referred from modules
  dt-bindings: reset: uniphier: add USB3 core reset support
  reset: uniphier: add USB3 core reset control
  reset: uniphier: add reset control support for SPI

Leonard Crestez (2):
  soc: imx: gpc: Disable 6sl display power gating for ERR006287
  soc: imx6qp: Use GENPD_FLAG_ALWAYS_ON for PU errata

Li Yang (1):
  soc: fsl: cleanup Kconfig menu

Maxime Ripard (2):
  drivers: soc: sunxi: Add support for the C1 SRAM region
  soc: sunxi: Add the A13, A23 and H3 system control compatibles

Olof Johansson (14):
  Merge tag 'tee-drv-for-4.18' of 
git://git.linaro.org/people/jens.wiklander/linux-tee into next/drivers
  Merge tag 'soc_drivers_for_4.19' of 
git://git.kernel.org/.../ssantosh/linux-keystone into next/drivers
  Merge tag 'arm-soc/for-4.19/drivers' of 
https://github.com/Broadcom/stblinux into next/drivers
  Merge tag 'omap-for-v4.19/ti-sysc-v2-signed' of 
git://git.kernel.org/.../tmlind/linux-omap into next/drivers
  Merge tag 'tegra-for-4.19-memory' of git://git.kernel.org/.../tegra/linux 
into next/drivers
  Merge tag 'scmi-update-4.19' of 
git://git.kernel.org/.../sudeep.holla/linux into next/drivers
  Merge tag 'vexpress-update-4.19' of 
git://git.kernel.org/.../sudeep.holla/linux into next/drivers
  Merge tag 'reset-for-4.19' of git://git.pengutronix.de/git/pza/linux into 
next/drivers
  Merge tag 'imx-drivers-4.19' of git://git.kernel.org/.../shawnguo/linux 
into next/drivers
  Merge tag 'qcom-drivers-for-4.19' of 
git://git.kernel.org/.../agross/linux into next/drivers
  Merge tag 'sunxi-drivers-for-4.19' of 
https://git.kernel.org/.../sunxi/linux into next/drivers
  Merge tag 'soc-fsl-for-4.19' of git://git.kernel.org/.../leo/linux into 
next/drivers
  Merge tag 'samsung-drivers-exynos5440-4.19' of 
https://git.kernel.org/.../krzk/linux into 

[GIT PULL 1/4] ARM: 32-bit SoC platform updates

2018-08-22 Thread Olof Johansson
Most of the SoC updates in this cycle are cleanups and moves to more
modern infrastructure:

- Davinci was moved to common clock framework
- OMAP1-based Amstrad E3 "Superphone" saw a bunch of cleanups to the
keyboard interface (bitbanged AT keyboard via GPIO).
- Removal of some stale code for Renesas platforms
- Power management improvements for i.MX6LL



The following changes since commit 815f0ddb346c196018d4d8f8f55c12b83da1de3f:

  include/linux/compiler*.h: make compiler-*.h mutually exclusive

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc.git tags/armsoc-soc

for you to fetch changes up to f0fc40aff6fee100ffbed8328a0df88f8aa75fce:

  ARM: uniphier: select RESET_CONTROLLER



Andy Shevchenko (1):
  ARM: OMAP2+: reuse DEFINE_SHOW_ATTRIBUTE() macro

Anson Huang (6):
  ARM: imx: add standby mode suspend for i.MX6SLL
  ARM: imx: add mem mode suspend for i.MX6SLL
  ARM: imx: add L2 page power control for GPC
  ARM: imx: add cpu idle support for i.MX6SLL
  ARM: imx: remove i.MX6SLL support in i.MX6SL cpu idle driver
  ARM: imx: enable bus auto clock gating function for i.mx6sll

Arnd Bergmann (4):
  ARM: imx: fix i.MX6SLL build
  ARM: imx: call imx6sx_cpuidle_init() conditionally for 6sll
  soc: r9a06g032: don't build SMP files for non-SMP config
  ARM: shmobile: r8a7779: hide unused r8a7779_platform_cpu_kill

Arvind Yadav (1):
  ARM: OMAP1: constify gpio_led

Bartosz Golaszewski (13):
  clk: davinci: psc-da850: remove the 'davinci_nand.0" lookup
  clk: davinci: psc-dm365: use two lookup entries for the aemif clock
  clk: davinci: psc-dm644x: use two lookup entries for the aemif clock
  clk: davinci: psc-dm646x: use two lookup entries for the aemif clock
  clk: davinci: psc-da830: add a lookup entry for aemif clock
  ARM: davinci: omapl138-hawk: add aemif & nand support
  ARM: davinci: da850-evm: use aemif platform driver in legacy mode
  ARM: davinci: dm365-evm: use the ti-aemif soc driver
  ARM: davinci: dm644x-evm: use aemif platform driver
  ARM: davinci: da830-evm: use aemif platform driver
  ARM: davinci: dm646x-evm: use aemif platform driver
  ARM: davinci: mityomapl138: use aemif platform driver
  ARM: davinci: unduplicate aemif support

Boris Brezillon (1):
  MAINTAINERS: Remove the AT91 clk driver entry

Claudiu Beznea (3):
  ARM: at91: pm: Use ULP0 naming instead of slow clock
  ARM: at91: pm: add PMC fast startup registers defines
  ARM: at91: pm: configure wakeup sources for ULP1 mode

Clément Peron (3):
  ARM: imx: remove inexistant EPIT timer init
  ARM: debug: Add iProc UART3 debug addresses
  ARM: debug: fix BCM2836 order entry

Dave Gerlach (2):
  ARM: hwmod: RTC: Don't assume lock/unlock will be called with irq enabled
  ARM: OMAP2+: sleep33/43xx: Make sleep actions configurable

David Lechner (21):
  ARM: davinci: pass clock as parameter to davinci_timer_init()
  ARM: davinci: da830: add new clock init using common clock framework
  ARM: davinci: da850: add new clock init using common clock framework
  ARM: davinci: dm355: add new clock init using common clock framework
  ARM: davinci: dm365: add new clock init using common clock framework
  ARM: davinci: dm644x: add new clock init using common clock framework
  ARM: davinci: dm646x: add new clock init using common clock framework
  ARM: davinci: da8xx: add new USB PHY clock init using common clock 
framework
  ARM: davinci: da8xx: add new sata_refclk init using common clock framework
  ARM: davinci: remove CONFIG_DAVINCI_RESET_CLOCKS
  ARM: davinci: switch to common clock framework
  ARM: davinci: da830: Remove legacy clock init
  ARM: davinci: da850: Remove legacy clock init
  ARM: davinci: dm355: Remove legacy clock init
  ARM: davinci: dm365: Remove legacy clock init
  ARM: davinci: dm644x: Remove legacy clock init
  ARM: davinci: dm646x: Remove legacy clock init
  ARM: davinci: da8xx: Remove legacy USB and SATA clock init
  ARM: davinci: remove legacy clocks
  ARM: davinci: add device tree support to timer
  ARM: davinci: da8xx-dt: switch to device tree clocks

Denis Efremov (1):
  ARM: s3c24xx: Fix typo in guard macro of s3c2412.h

Ethan Tuttle (1):
  ARM: mvebu: declare asm symbols as character arrays in pmsu.c

Fabio Estevam (2):
  ARM: imx51: Configure M4IF to avoid visual artifacts
  ARM: mx5: Set the DBGEN bit in ARM_GPC register

Geert Uytterhoeven (16):
  ARM: shmobile: r8a7790: Remove legacy SMP fallback code
  ARM: shmobile: r8a7790: Use common R-Car Gen2 machine definition
  ARM: shmobile: r8a7791: Remove legacy SMP fallback code
  ARM: shmobile: r8a7791: Use common R-Car Gen2 machine definition

[GIT PULL 0/4] ARM: SoC/platform updates

2018-08-22 Thread Olof Johansson
Hi Linus,

Another medium-sized batch of ARM updates this release cycle. Not a
large number of new platforms in this release, but there are a few --
from Renesas and TI in particular.

790 patches this time around, 173 authors. 26531 lines added and 13363
removed -- very close to the size of last release.

A bit more detail in each pull request, but this is overall a "business
as usual" release this time around.

Please merge!


Thanks,

-Olof



[GIT PULL 2/4] ARM: SoC driver updates

2018-08-22 Thread Olof Johansson
Some of the larger changes this merge window:
- Removal of drivers for Exynos5440, a Samsung SoC that never saw
widespread use.
- Uniphier support for USB3 and SPI reset handling
- Syste control and SRAM drivers and bindings for Allwinner platforms
- Qualcomm AOSS (Always-on subsystem) reset controller drivers
- Raspberry Pi hwmon driver for voltage
- Mediatek pwrap (pmic) support for MT6797 SoC



The following changes since commit 8928c21c542a61c7cf7e33794a84e774040fb718:

  Merge tag 'armsoc-soc' into HEAD

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc.git 
tags/armsoc-drivers

for you to fetch changes up to 29ed45fff05899f6f39d05fe1c32b1bc51f8926b:

  Merge tag 'v4.18-next-soc' of 
https://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux into 
next/drivers



Argus Lin (4):
  dt-bindings: pwrap: mediatek: add pwrap support for MT6797
  soc: mediatek: pwrap: fix cipher init setting error
  soc: mediatek: pwrap: add pwrap driver for mt6797 SoCs
  soc: mediatek: pwrap: add mt6351 driver for mt6797 SoCs

Arnd Bergmann (2):
  tee: replace getnstimeofday64() with ktime_get_real_ts64()
  soc: ti: wkup_m3_ipc: mark PM functions as __maybe_unused

Chen-Yu Tsai (2):
  dt-bindings: sram: Rename A64 SRAM controller compatible
  soc: sunxi: sram: Add updated compatible string for A64 system control

Dan Carpenter (1):
  firmware: arm_scmi: remove some unnecessary checks

Dave Gerlach (2):
  memory: ti-emif-sram: Add resume function to recopy sram code
  soc: ti: wkup_m3_ipc: Add wkup_m3_request_wake_src

Dmitry Osipenko (1):
  memory: tegra: Correct driver probe order

Doug Berger (1):
  soc: bcm: brcmstb: pm: Add support for newer rev B3.0 controllers

Faiz Abbas (3):
  clk: ti: dra7: Add clkctrl clock data for the mcan clocks
  bus: ti-sysc: Add support for using ti-sysc for MCAN on dra76x
  bus: ti-sysc: Add support for software reset

Florian Fainelli (2):
  soc: bcm: brcmstb: Add missing DDR MEMC compatible strings
  Merge tag 'bcm2835-drivers-next-2018-07-03' into drivers/next

Horia Geantă (1):
  staging: fsl-dpaa2: eth: move generic FD defines to DPIO

Icenowy Zheng (3):
  soc: sunxi: export a regmap for EMAC clock reg on A64
  dt-bindings: add binding for the Allwinner A64 DE2 bus
  bus: add bus driver for accessing Allwinner A64 DE2

Joakim Tjernlund (1):
  soc: fsl: qe: gpio: Add qe_gpio_set_multiple

Keerthy (1):
  soc: ti: wkup_m3_ipc: Add rtc_only with ddr in self refresh mode support

Kees Cook (1):
  firmware: raspberrypi: Remove VLA usage

Krzysztof Kozlowski (4):
  ata: ahci-platform: Remove support for Exynos5440
  cpufreq: exynos: Remove support for Exynos5440
  clk: samsung: Remove support for Exynos5440
  usb: host: exynos: Remove support for Exynos5440

Kunihiko Hayashi (4):
  reset: simple: export reset_simple_ops to be referred from modules
  dt-bindings: reset: uniphier: add USB3 core reset support
  reset: uniphier: add USB3 core reset control
  reset: uniphier: add reset control support for SPI

Leonard Crestez (2):
  soc: imx: gpc: Disable 6sl display power gating for ERR006287
  soc: imx6qp: Use GENPD_FLAG_ALWAYS_ON for PU errata

Li Yang (1):
  soc: fsl: cleanup Kconfig menu

Maxime Ripard (2):
  drivers: soc: sunxi: Add support for the C1 SRAM region
  soc: sunxi: Add the A13, A23 and H3 system control compatibles

Olof Johansson (14):
  Merge tag 'tee-drv-for-4.18' of 
git://git.linaro.org/people/jens.wiklander/linux-tee into next/drivers
  Merge tag 'soc_drivers_for_4.19' of 
git://git.kernel.org/.../ssantosh/linux-keystone into next/drivers
  Merge tag 'arm-soc/for-4.19/drivers' of 
https://github.com/Broadcom/stblinux into next/drivers
  Merge tag 'omap-for-v4.19/ti-sysc-v2-signed' of 
git://git.kernel.org/.../tmlind/linux-omap into next/drivers
  Merge tag 'tegra-for-4.19-memory' of git://git.kernel.org/.../tegra/linux 
into next/drivers
  Merge tag 'scmi-update-4.19' of 
git://git.kernel.org/.../sudeep.holla/linux into next/drivers
  Merge tag 'vexpress-update-4.19' of 
git://git.kernel.org/.../sudeep.holla/linux into next/drivers
  Merge tag 'reset-for-4.19' of git://git.pengutronix.de/git/pza/linux into 
next/drivers
  Merge tag 'imx-drivers-4.19' of git://git.kernel.org/.../shawnguo/linux 
into next/drivers
  Merge tag 'qcom-drivers-for-4.19' of 
git://git.kernel.org/.../agross/linux into next/drivers
  Merge tag 'sunxi-drivers-for-4.19' of 
https://git.kernel.org/.../sunxi/linux into next/drivers
  Merge tag 'soc-fsl-for-4.19' of git://git.kernel.org/.../leo/linux into 
next/drivers
  Merge tag 'samsung-drivers-exynos5440-4.19' of 
https://git.kernel.org/.../krzk/linux into 

[GIT PULL 1/4] ARM: 32-bit SoC platform updates

2018-08-22 Thread Olof Johansson
Most of the SoC updates in this cycle are cleanups and moves to more
modern infrastructure:

- Davinci was moved to common clock framework
- OMAP1-based Amstrad E3 "Superphone" saw a bunch of cleanups to the
keyboard interface (bitbanged AT keyboard via GPIO).
- Removal of some stale code for Renesas platforms
- Power management improvements for i.MX6LL



The following changes since commit 815f0ddb346c196018d4d8f8f55c12b83da1de3f:

  include/linux/compiler*.h: make compiler-*.h mutually exclusive

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc.git tags/armsoc-soc

for you to fetch changes up to f0fc40aff6fee100ffbed8328a0df88f8aa75fce:

  ARM: uniphier: select RESET_CONTROLLER



Andy Shevchenko (1):
  ARM: OMAP2+: reuse DEFINE_SHOW_ATTRIBUTE() macro

Anson Huang (6):
  ARM: imx: add standby mode suspend for i.MX6SLL
  ARM: imx: add mem mode suspend for i.MX6SLL
  ARM: imx: add L2 page power control for GPC
  ARM: imx: add cpu idle support for i.MX6SLL
  ARM: imx: remove i.MX6SLL support in i.MX6SL cpu idle driver
  ARM: imx: enable bus auto clock gating function for i.mx6sll

Arnd Bergmann (4):
  ARM: imx: fix i.MX6SLL build
  ARM: imx: call imx6sx_cpuidle_init() conditionally for 6sll
  soc: r9a06g032: don't build SMP files for non-SMP config
  ARM: shmobile: r8a7779: hide unused r8a7779_platform_cpu_kill

Arvind Yadav (1):
  ARM: OMAP1: constify gpio_led

Bartosz Golaszewski (13):
  clk: davinci: psc-da850: remove the 'davinci_nand.0" lookup
  clk: davinci: psc-dm365: use two lookup entries for the aemif clock
  clk: davinci: psc-dm644x: use two lookup entries for the aemif clock
  clk: davinci: psc-dm646x: use two lookup entries for the aemif clock
  clk: davinci: psc-da830: add a lookup entry for aemif clock
  ARM: davinci: omapl138-hawk: add aemif & nand support
  ARM: davinci: da850-evm: use aemif platform driver in legacy mode
  ARM: davinci: dm365-evm: use the ti-aemif soc driver
  ARM: davinci: dm644x-evm: use aemif platform driver
  ARM: davinci: da830-evm: use aemif platform driver
  ARM: davinci: dm646x-evm: use aemif platform driver
  ARM: davinci: mityomapl138: use aemif platform driver
  ARM: davinci: unduplicate aemif support

Boris Brezillon (1):
  MAINTAINERS: Remove the AT91 clk driver entry

Claudiu Beznea (3):
  ARM: at91: pm: Use ULP0 naming instead of slow clock
  ARM: at91: pm: add PMC fast startup registers defines
  ARM: at91: pm: configure wakeup sources for ULP1 mode

Clément Peron (3):
  ARM: imx: remove inexistant EPIT timer init
  ARM: debug: Add iProc UART3 debug addresses
  ARM: debug: fix BCM2836 order entry

Dave Gerlach (2):
  ARM: hwmod: RTC: Don't assume lock/unlock will be called with irq enabled
  ARM: OMAP2+: sleep33/43xx: Make sleep actions configurable

David Lechner (21):
  ARM: davinci: pass clock as parameter to davinci_timer_init()
  ARM: davinci: da830: add new clock init using common clock framework
  ARM: davinci: da850: add new clock init using common clock framework
  ARM: davinci: dm355: add new clock init using common clock framework
  ARM: davinci: dm365: add new clock init using common clock framework
  ARM: davinci: dm644x: add new clock init using common clock framework
  ARM: davinci: dm646x: add new clock init using common clock framework
  ARM: davinci: da8xx: add new USB PHY clock init using common clock 
framework
  ARM: davinci: da8xx: add new sata_refclk init using common clock framework
  ARM: davinci: remove CONFIG_DAVINCI_RESET_CLOCKS
  ARM: davinci: switch to common clock framework
  ARM: davinci: da830: Remove legacy clock init
  ARM: davinci: da850: Remove legacy clock init
  ARM: davinci: dm355: Remove legacy clock init
  ARM: davinci: dm365: Remove legacy clock init
  ARM: davinci: dm644x: Remove legacy clock init
  ARM: davinci: dm646x: Remove legacy clock init
  ARM: davinci: da8xx: Remove legacy USB and SATA clock init
  ARM: davinci: remove legacy clocks
  ARM: davinci: add device tree support to timer
  ARM: davinci: da8xx-dt: switch to device tree clocks

Denis Efremov (1):
  ARM: s3c24xx: Fix typo in guard macro of s3c2412.h

Ethan Tuttle (1):
  ARM: mvebu: declare asm symbols as character arrays in pmsu.c

Fabio Estevam (2):
  ARM: imx51: Configure M4IF to avoid visual artifacts
  ARM: mx5: Set the DBGEN bit in ARM_GPC register

Geert Uytterhoeven (16):
  ARM: shmobile: r8a7790: Remove legacy SMP fallback code
  ARM: shmobile: r8a7790: Use common R-Car Gen2 machine definition
  ARM: shmobile: r8a7791: Remove legacy SMP fallback code
  ARM: shmobile: r8a7791: Use common R-Car Gen2 machine definition

[GIT PULL 0/4] ARM: SoC/platform updates

2018-08-22 Thread Olof Johansson
Hi Linus,

Another medium-sized batch of ARM updates this release cycle. Not a
large number of new platforms in this release, but there are a few --
from Renesas and TI in particular.

790 patches this time around, 173 authors. 26531 lines added and 13363
removed -- very close to the size of last release.

A bit more detail in each pull request, but this is overall a "business
as usual" release this time around.

Please merge!


Thanks,

-Olof



Re: [RFC PATCH 3/5] RISC-V: Add cpu_operatios structure

2018-08-22 Thread Atish Patra

On 8/22/18 8:54 PM, Anup Patel wrote:

On Wed, Aug 22, 2018 at 11:33 AM, Christoph Hellwig  wrote:

On Tue, Aug 21, 2018 at 10:34:38PM +0530, Anup Patel wrote:

The cpu_operations is certainly required because SOC vendors will add
vendor-specific mechanism to selectively bringing-up CPUs/HARTs instead
of all CPUs entering Linux kernel simultaneously. In fact, we might also end-up
having CPU ON/OFF operations in SBI.


Your forgot an essential part in your analysis:  Right now we only have
one single way to deal with cpu on/offlining, and that is the dummy WFI
kind.  Once other ways show up we can build proper infrastructure, but
until then this is just a white elephant as we have no idea how these
abstractions will look like.

And my hope is that we'll just see new SBI calls, in which case we'll
just need SBI and dummy version and can avoid all the indirect calls.


IMHO, rather than waiting for new CPU ON/OFF methods to come-up we
can keep the cpu_operations ready. Also, we are not re-inventing anything
here which we might have to discard later because cpu_operations are
already tried and hardened for Linux ARM64.

I agree with you that in long-term SBI-based CPU ON/OFF will be widely
used. Most likely we will have at-least two CPU ON/OFF methods:
1. Existing lottery based spinning
2. New SBI calls

Regards,
Anup



I am fine with either keeping the cpu_ops infrastructure for now or 
reintroducing again along with better smp enablement methods.


Anyways, there were concerns about all existing booting method (all cpu 
thrown to Linux at the same time). I was thinking to adopt spin table 
boot method for RISC-V as well. I can drop this patch now and 
reintroduce with spin table boot method.


Any thoughts ?

Regards,
Atish


Re: [RFC PATCH 3/5] RISC-V: Add cpu_operatios structure

2018-08-22 Thread Atish Patra

On 8/22/18 8:54 PM, Anup Patel wrote:

On Wed, Aug 22, 2018 at 11:33 AM, Christoph Hellwig  wrote:

On Tue, Aug 21, 2018 at 10:34:38PM +0530, Anup Patel wrote:

The cpu_operations is certainly required because SOC vendors will add
vendor-specific mechanism to selectively bringing-up CPUs/HARTs instead
of all CPUs entering Linux kernel simultaneously. In fact, we might also end-up
having CPU ON/OFF operations in SBI.


Your forgot an essential part in your analysis:  Right now we only have
one single way to deal with cpu on/offlining, and that is the dummy WFI
kind.  Once other ways show up we can build proper infrastructure, but
until then this is just a white elephant as we have no idea how these
abstractions will look like.

And my hope is that we'll just see new SBI calls, in which case we'll
just need SBI and dummy version and can avoid all the indirect calls.


IMHO, rather than waiting for new CPU ON/OFF methods to come-up we
can keep the cpu_operations ready. Also, we are not re-inventing anything
here which we might have to discard later because cpu_operations are
already tried and hardened for Linux ARM64.

I agree with you that in long-term SBI-based CPU ON/OFF will be widely
used. Most likely we will have at-least two CPU ON/OFF methods:
1. Existing lottery based spinning
2. New SBI calls

Regards,
Anup



I am fine with either keeping the cpu_ops infrastructure for now or 
reintroducing again along with better smp enablement methods.


Anyways, there were concerns about all existing booting method (all cpu 
thrown to Linux at the same time). I was thinking to adopt spin table 
boot method for RISC-V as well. I can drop this patch now and 
reintroduce with spin table boot method.


Any thoughts ?

Regards,
Atish


Re: [PATCH 2/4] mm/tlb: Remove tlb_remove_table() non-concurrent condition

2018-08-22 Thread Nicholas Piggin
On Wed, 22 Aug 2018 20:35:16 -0700
Linus Torvalds  wrote:

> On Wed, Aug 22, 2018 at 8:31 PM Nicholas Piggin  wrote:
> >
> >
> > So that leaves speculative operations. I don't see where the problem is
> > with those either -- this shortcut needs to ensure there are no other
> > *non speculative* operations. mm_users is correct for that.  
> 
> No. Because mm_users doesn't contain any lazy tlb users.
> 
> And yes, those lazy tlbs are all kernel threads, but they can still
> speculatively load user addresses.

So?

If the arch does not shoot those all down after the user page tables
are removed then it's buggy regardless of this short cut.

The only real problem I could see would be if a page walk cache still
points to the freed table, then the table gets re-allocated and used
elsewhere, and meanwhile a speculative access tries to load an entry
from the page that is an invalid form of page table that might cause
a machine check or something. That would be (u)arch specific, but if
that's what we're concerned with here it's a different issue and needs
to be documented as such.

I'll have a look at powerpc and see if we can cope with it. If so, I'll
make it an arch specific opt-in short cut.

Thanks,
Nick


Re: [PATCH 2/4] mm/tlb: Remove tlb_remove_table() non-concurrent condition

2018-08-22 Thread Nicholas Piggin
On Wed, 22 Aug 2018 20:35:16 -0700
Linus Torvalds  wrote:

> On Wed, Aug 22, 2018 at 8:31 PM Nicholas Piggin  wrote:
> >
> >
> > So that leaves speculative operations. I don't see where the problem is
> > with those either -- this shortcut needs to ensure there are no other
> > *non speculative* operations. mm_users is correct for that.  
> 
> No. Because mm_users doesn't contain any lazy tlb users.
> 
> And yes, those lazy tlbs are all kernel threads, but they can still
> speculatively load user addresses.

So?

If the arch does not shoot those all down after the user page tables
are removed then it's buggy regardless of this short cut.

The only real problem I could see would be if a page walk cache still
points to the freed table, then the table gets re-allocated and used
elsewhere, and meanwhile a speculative access tries to load an entry
from the page that is an invalid form of page table that might cause
a machine check or something. That would be (u)arch specific, but if
that's what we're concerned with here it's a different issue and needs
to be documented as such.

I'll have a look at powerpc and see if we can cope with it. If so, I'll
make it an arch specific opt-in short cut.

Thanks,
Nick


Re: [lkp-robot] [ipc] 296ba26b66: BUG:sleeping_function_called_from_invalid_context_at_mm/memory.c

2018-08-22 Thread Davidlohr Bueso

On Thu, 23 Aug 2018, kernel test robot wrote:


FYI, we noticed the following commit (built with gcc-7):

commit: 296ba26b6681b6e07ed419b3004647167cb17f61 ("ipc: drop ipc_lock()")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master


I suspect this is because that commit forgot to set EIDRM for the
!ipc_valid_object() case. So the callers check IS_ERR(shm_lock()),
which won't fail the op as it did before the commit.

diff --git a/ipc/shm.c b/ipc/shm.c
index b0eb3757ab89..4cd402e4cfeb 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -199,6 +199,7 @@ static inline struct shmid_kernel *shm_lock(struct 
ipc_namespace *ns, int id)
}

ipc_unlock_object(ipcp);
+   ipcp = ERR_PTR(-EIDRM);
err:
rcu_read_unlock();
/*


Re: [lkp-robot] [ipc] 296ba26b66: BUG:sleeping_function_called_from_invalid_context_at_mm/memory.c

2018-08-22 Thread Davidlohr Bueso

On Thu, 23 Aug 2018, kernel test robot wrote:


FYI, we noticed the following commit (built with gcc-7):

commit: 296ba26b6681b6e07ed419b3004647167cb17f61 ("ipc: drop ipc_lock()")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master


I suspect this is because that commit forgot to set EIDRM for the
!ipc_valid_object() case. So the callers check IS_ERR(shm_lock()),
which won't fail the op as it did before the commit.

diff --git a/ipc/shm.c b/ipc/shm.c
index b0eb3757ab89..4cd402e4cfeb 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -199,6 +199,7 @@ static inline struct shmid_kernel *shm_lock(struct 
ipc_namespace *ns, int id)
}

ipc_unlock_object(ipcp);
+   ipcp = ERR_PTR(-EIDRM);
err:
rcu_read_unlock();
/*


Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 8:45 PM Nicholas Piggin  wrote:
>
> powerpc/radix has no such issue, it already does this tracking.

Yeah, I now realize that this was why you wanted to add that hacky
thing to the generic code, so that you can add the tlb_flush_pgtable()
call.

I thought it was because powerpc had some special flush instruction
for it, and the regular tlb flush didn't do it. But no. It was because
the regular code had lost the tlb flush _entirely_, because powerpc
didn't want it.

> We were discussing this a couple of months ago, I wasn't aware of ARM's
> issue but I suggested x86 could go the same way as powerpc.

The problem is that x86 _used_ to do this all correctly long long ago.

And then we switched over to the "generic" table flushing (which
harkens back to the powerpc code).

Which actually turned out to be not generic at all, and did not flush
the internal pages like x86 used to (back when x86 just used
tlb_remove_page for everything).

So as a result, x86 had unintentionally lost the TLB flush we used to
have, because tlb_remove_table() had lost the tlb flushing because of
a powerpc quirk.

You then added it back as a hacky per-architecture hook (apparently
having realized that you never did it at all), which didn't fix the
unintentional lack of flushing on x86.

So now we're going to do it right.  No more "oh, powerpc didn't need
to flush because the hash tables weren't in the tlb at all" thing in
the generic code that then others need to work around.

  Linus


Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 8:45 PM Nicholas Piggin  wrote:
>
> powerpc/radix has no such issue, it already does this tracking.

Yeah, I now realize that this was why you wanted to add that hacky
thing to the generic code, so that you can add the tlb_flush_pgtable()
call.

I thought it was because powerpc had some special flush instruction
for it, and the regular tlb flush didn't do it. But no. It was because
the regular code had lost the tlb flush _entirely_, because powerpc
didn't want it.

> We were discussing this a couple of months ago, I wasn't aware of ARM's
> issue but I suggested x86 could go the same way as powerpc.

The problem is that x86 _used_ to do this all correctly long long ago.

And then we switched over to the "generic" table flushing (which
harkens back to the powerpc code).

Which actually turned out to be not generic at all, and did not flush
the internal pages like x86 used to (back when x86 just used
tlb_remove_page for everything).

So as a result, x86 had unintentionally lost the TLB flush we used to
have, because tlb_remove_table() had lost the tlb flushing because of
a powerpc quirk.

You then added it back as a hacky per-architecture hook (apparently
having realized that you never did it at all), which didn't fix the
unintentional lack of flushing on x86.

So now we're going to do it right.  No more "oh, powerpc didn't need
to flush because the hash tables weren't in the tlb at all" thing in
the generic code that then others need to work around.

  Linus


mv: cannot stat 'drivers/net/wireless/intel/iwlwifi/mvm/.tmp_mx_ops.o': No such file or directory

2018-08-22 Thread kbuild test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   815f0ddb346c196018d4d8f8f55c12b83da1de3f
commit: 0fbe9a245c60bedebb6dd329966f463bb724450a microblaze: add endianness 
options to LDFLAGS instead of LD
date:   4 weeks ago
config: microblaze-allmodconfig (attached as .config)
compiler: microblaze-linux-gcc (GCC) 8.1.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout 0fbe9a245c60bedebb6dd329966f463bb724450a
# save the attached .config to linux build tree
GCC_VERSION=8.1.0 make.cross ARCH=microblaze 

All errors (new ones prefixed by >>):

   microblaze-linux-ld: drivers/net/wireless/intel/iwlwifi/mvm/ops.o: compiled 
for a little endian system and target is big endian
   microblaze-linux-ld: failed to merge target specific data of file 
drivers/net/wireless/intel/iwlwifi/mvm/ops.o
   microblaze-linux-ld: drivers/net/wireless/intel/iwlwifi/mvm/.tmp_mc_ops.o: 
compiled for a little endian system and target is big endian
   microblaze-linux-ld: failed to merge target specific data of file 
drivers/net/wireless/intel/iwlwifi/mvm/.tmp_mc_ops.o
>> mv: cannot stat 'drivers/net/wireless/intel/iwlwifi/mvm/.tmp_mx_ops.o': No 
>> such file or directory

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH] x86, kdump: Fix efi=noruntime NULL pointer dereference

2018-08-22 Thread Dave Young
On 08/22/18 at 06:23pm, Dave Young wrote:
> On 08/21/18 at 03:39pm, Ard Biesheuvel wrote:
> > On 9 August 2018 at 11:13, Dave Young  wrote:
> > > On 08/09/18 at 09:33am, Mike Galbraith wrote:
> > >> On Thu, 2018-08-09 at 12:21 +0800, Dave Young wrote:
> > >> > Hi Mike,
> > >> >
> > >> > Thanks for the patch!
> > >> > On 08/08/18 at 04:03pm, Mike Galbraith wrote:
> > >> > > When booting with efi=noruntime, we call efi_runtime_map_copy() while
> > >> > > loading the kdump kernel, and trip over a NULL efi.memmap.map.  Avoid
> > >> > > that and a useless allocation when the only mapping we can use (1:1)
> > >> > > is not available.
> > >> >
> > >> > At first glance, efi_get_runtime_map_size should return 0 in case
> > >> > noruntime.
> > >>
> > >> What efi does internally at unmap time is to leave everything except
> > >> efi.mmap.map untouched, setting it to NULL and turning off EFI_MEMMAP,
> > >> rendering efi.mmap.map accessors useless/unsafe without first checking
> > >> EFI_MEMMAP.
> > >
> > > Probably the x86 efi_init should reset nr_map to zero in case runtime is
> > > disabled.  But let's see how Ard thinks about this and cc linux-efi.
> > >
> > > As for efi_get_runtime_map_size, it was introduced for x86 kexec use.
> > > for copying runtime maps,  so I think it is reasonable this function
> > > return zero in case no runtime.
> > >
> > 
> > I don't see the patch in the context so I cannot comment in great detail.
> 
> The patch is below:
> https://lore.kernel.org/lkml/1533737025.4936.3.ca...@gmx.de
> 
> > 
> > In any case, it is better to decouple EFI_MEMMAP from EFI_RUNTIME
> > dependencies. On x86, one may imply the other, but this is not
> > generally the case.
> > 
> > That means that efi_get_runtime_map_size() should probably check the
> > EFI_RUNTIME flag, and return 0 if it is cleared. Perhaps there are
> > other places where EFI_MEMMAP flag checks are missing, but I consider
> > that a separate issue.
> 
> Yes, I also agree with to check EFI_RUNTIME_SERVICES. There is no point for
> efi_get_runtime_map_size to return a value other than 0 in case 
> EFI_RUNTIME_SERVICES
> is not set ie. via efi=noruntime
> 
> Is below patch acceptable?  The copy function can be changed to return
> an error in case map size == 0, but that can be done later along with
> the caller size cleanups in kexec code

Forgot to add Mike's reported-by tag..

Mike, since we are going this way, I'm working on a kexec code cleanup,
but it needs careful testing so still need some time.

Can you help test below efi fix and provide you tested-by if it works?

> ---
> 
> efi: check EFI_RUNTIME_SERVICES flag in runtime map copying code
> 
> Mike reported a kexec_file_load NULL pointer dereference bug like below:
> [5.878262] BUG: unable to handle kernel NULL pointer dereference at 
> 
> [5.879868] PGD 80013c1f1067 P4D 80013c1f1067 PUD 13aea7067 PMD 0 
> [5.881225] Oops:  [#1] SMP PTI
> [5.882068] Modules linked in:
> [5.882851] CPU: 0 PID: 394 Comm: kexec Kdump: loaded Not tainted 
> 4.17.0-rc2+ #648
> [5.884333] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 
> 02/06/2015
> [5.885843] RIP: 0010:memcpy_erms+0x6/0x10
> [5.886789] RSP: 0018:c958bd00 EFLAGS: 00010246
> [5.887899] RAX: 880138e050b0 RBX: 000980b0 RCX: 
> 0ba0
> [5.889304] RDX: 0ba0 RSI:  RDI: 
> 880138e050b0
> [5.890977] RBP: 880138e04000 R08: 0017 R09: 
> 0002
> [5.892524] R10: 00099000 R11: 52d0 R12: 
> 39400200
> [5.893967] R13: 880138e05000 R14: 0ba0 R15: 
> c9a4d000
> [5.895378] FS:  7f167c9e6740() GS:88013fc0() 
> knlGS:
> [5.896953] CS:  0010 DS:  ES:  CR0: 80050033
> [5.898143] CR2:  CR3: 00013c3ec002 CR4: 
> 001606f0
> [5.899542] DR0:  DR1:  DR2: 
> 
> [5.900962] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [5.902552] Call Trace:
> [5.903267]  efi_runtime_map_copy+0x28/0x30
> [5.904956]  bzImage64_load+0x59d/0x736
> [5.906881]  ? arch_kexec_kernel_image_load+0x6d/0x70
> [5.908243]  ? __se_sys_kexec_file_load+0x24b/0x750
> [5.909352]  ? _cond_resched+0x19/0x30
> [5.910286]  ? do_syscall_64+0x65/0x180
> [5.911229]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [5.912365] Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 
> 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1  
> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 
> [5.916235] RIP: memcpy_erms+0x6/0x10 RSP: c958bd00
> [5.917507] CR2: 
> [5.918762] ---[ end trace 5cf4c4b3b93d7fdd ]---
> 
> Changing 

mv: cannot stat 'drivers/net/wireless/intel/iwlwifi/mvm/.tmp_mx_ops.o': No such file or directory

2018-08-22 Thread kbuild test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   815f0ddb346c196018d4d8f8f55c12b83da1de3f
commit: 0fbe9a245c60bedebb6dd329966f463bb724450a microblaze: add endianness 
options to LDFLAGS instead of LD
date:   4 weeks ago
config: microblaze-allmodconfig (attached as .config)
compiler: microblaze-linux-gcc (GCC) 8.1.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout 0fbe9a245c60bedebb6dd329966f463bb724450a
# save the attached .config to linux build tree
GCC_VERSION=8.1.0 make.cross ARCH=microblaze 

All errors (new ones prefixed by >>):

   microblaze-linux-ld: drivers/net/wireless/intel/iwlwifi/mvm/ops.o: compiled 
for a little endian system and target is big endian
   microblaze-linux-ld: failed to merge target specific data of file 
drivers/net/wireless/intel/iwlwifi/mvm/ops.o
   microblaze-linux-ld: drivers/net/wireless/intel/iwlwifi/mvm/.tmp_mc_ops.o: 
compiled for a little endian system and target is big endian
   microblaze-linux-ld: failed to merge target specific data of file 
drivers/net/wireless/intel/iwlwifi/mvm/.tmp_mc_ops.o
>> mv: cannot stat 'drivers/net/wireless/intel/iwlwifi/mvm/.tmp_mx_ops.o': No 
>> such file or directory

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH] x86, kdump: Fix efi=noruntime NULL pointer dereference

2018-08-22 Thread Dave Young
On 08/22/18 at 06:23pm, Dave Young wrote:
> On 08/21/18 at 03:39pm, Ard Biesheuvel wrote:
> > On 9 August 2018 at 11:13, Dave Young  wrote:
> > > On 08/09/18 at 09:33am, Mike Galbraith wrote:
> > >> On Thu, 2018-08-09 at 12:21 +0800, Dave Young wrote:
> > >> > Hi Mike,
> > >> >
> > >> > Thanks for the patch!
> > >> > On 08/08/18 at 04:03pm, Mike Galbraith wrote:
> > >> > > When booting with efi=noruntime, we call efi_runtime_map_copy() while
> > >> > > loading the kdump kernel, and trip over a NULL efi.memmap.map.  Avoid
> > >> > > that and a useless allocation when the only mapping we can use (1:1)
> > >> > > is not available.
> > >> >
> > >> > At first glance, efi_get_runtime_map_size should return 0 in case
> > >> > noruntime.
> > >>
> > >> What efi does internally at unmap time is to leave everything except
> > >> efi.mmap.map untouched, setting it to NULL and turning off EFI_MEMMAP,
> > >> rendering efi.mmap.map accessors useless/unsafe without first checking
> > >> EFI_MEMMAP.
> > >
> > > Probably the x86 efi_init should reset nr_map to zero in case runtime is
> > > disabled.  But let's see how Ard thinks about this and cc linux-efi.
> > >
> > > As for efi_get_runtime_map_size, it was introduced for x86 kexec use.
> > > for copying runtime maps,  so I think it is reasonable this function
> > > return zero in case no runtime.
> > >
> > 
> > I don't see the patch in the context so I cannot comment in great detail.
> 
> The patch is below:
> https://lore.kernel.org/lkml/1533737025.4936.3.ca...@gmx.de
> 
> > 
> > In any case, it is better to decouple EFI_MEMMAP from EFI_RUNTIME
> > dependencies. On x86, one may imply the other, but this is not
> > generally the case.
> > 
> > That means that efi_get_runtime_map_size() should probably check the
> > EFI_RUNTIME flag, and return 0 if it is cleared. Perhaps there are
> > other places where EFI_MEMMAP flag checks are missing, but I consider
> > that a separate issue.
> 
> Yes, I also agree with to check EFI_RUNTIME_SERVICES. There is no point for
> efi_get_runtime_map_size to return a value other than 0 in case 
> EFI_RUNTIME_SERVICES
> is not set ie. via efi=noruntime
> 
> Is below patch acceptable?  The copy function can be changed to return
> an error in case map size == 0, but that can be done later along with
> the caller size cleanups in kexec code

Forgot to add Mike's reported-by tag..

Mike, since we are going this way, I'm working on a kexec code cleanup,
but it needs careful testing so still need some time.

Can you help test below efi fix and provide you tested-by if it works?

> ---
> 
> efi: check EFI_RUNTIME_SERVICES flag in runtime map copying code
> 
> Mike reported a kexec_file_load NULL pointer dereference bug like below:
> [5.878262] BUG: unable to handle kernel NULL pointer dereference at 
> 
> [5.879868] PGD 80013c1f1067 P4D 80013c1f1067 PUD 13aea7067 PMD 0 
> [5.881225] Oops:  [#1] SMP PTI
> [5.882068] Modules linked in:
> [5.882851] CPU: 0 PID: 394 Comm: kexec Kdump: loaded Not tainted 
> 4.17.0-rc2+ #648
> [5.884333] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 
> 02/06/2015
> [5.885843] RIP: 0010:memcpy_erms+0x6/0x10
> [5.886789] RSP: 0018:c958bd00 EFLAGS: 00010246
> [5.887899] RAX: 880138e050b0 RBX: 000980b0 RCX: 
> 0ba0
> [5.889304] RDX: 0ba0 RSI:  RDI: 
> 880138e050b0
> [5.890977] RBP: 880138e04000 R08: 0017 R09: 
> 0002
> [5.892524] R10: 00099000 R11: 52d0 R12: 
> 39400200
> [5.893967] R13: 880138e05000 R14: 0ba0 R15: 
> c9a4d000
> [5.895378] FS:  7f167c9e6740() GS:88013fc0() 
> knlGS:
> [5.896953] CS:  0010 DS:  ES:  CR0: 80050033
> [5.898143] CR2:  CR3: 00013c3ec002 CR4: 
> 001606f0
> [5.899542] DR0:  DR1:  DR2: 
> 
> [5.900962] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [5.902552] Call Trace:
> [5.903267]  efi_runtime_map_copy+0x28/0x30
> [5.904956]  bzImage64_load+0x59d/0x736
> [5.906881]  ? arch_kexec_kernel_image_load+0x6d/0x70
> [5.908243]  ? __se_sys_kexec_file_load+0x24b/0x750
> [5.909352]  ? _cond_resched+0x19/0x30
> [5.910286]  ? do_syscall_64+0x65/0x180
> [5.911229]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [5.912365] Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 
> 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1  
> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 
> [5.916235] RIP: memcpy_erms+0x6/0x10 RSP: c958bd00
> [5.917507] CR2: 
> [5.918762] ---[ end trace 5cf4c4b3b93d7fdd ]---
> 
> Changing 

Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Nicholas Piggin
On Wed, 22 Aug 2018 17:55:27 +0200
Peter Zijlstra  wrote:

> On Wed, Aug 22, 2018 at 05:30:15PM +0200, Peter Zijlstra wrote:
> > ARM
> > which later used this put an explicit TLB invalidate in their
> > __p*_free_tlb() functions, and PowerPC-radix followed that example.  
> 
> > +/*
> > + * If we want tlb_remove_table() to imply TLB invalidates.
> > + */
> > +static inline void tlb_table_invalidate(struct mmu_gather *tlb)
> > +{
> > +#ifdef CONFIG_HAVE_RCU_TABLE_INVALIDATE
> > +   /*
> > +* Invalidate page-table caches used by hardware walkers. Then we still
> > +* need to RCU-sched wait while freeing the pages because software
> > +* walkers can still be in-flight.
> > +*/
> > +   __tlb_flush_mmu_tlbonly(tlb);
> > +#endif
> > +}  
> 
> 
> Nick, Will is already looking at using this to remove the synchronous
> invalidation from __p*_free_tlb() for ARM, could you have a look to see
> if PowerPC-radix could benefit from that too?

powerpc/radix has no such issue, it already does this tracking.

We were discussing this a couple of months ago, I wasn't aware of ARM's
issue but I suggested x86 could go the same way as powerpc. Would have
been good to get some feedback on some of the proposed approaches there.
Because it's not just pwc tracking but if you do this you don't need
those silly hacks in generic code to expand the TLB address range
either.

So powerpc has no fundamental problem with making this stuff generic.
If you need a fix for x86 and ARM for this merge window though, I would
suggest just copying what powerpc already has. Next time we can
consider consolidating them all into generic code.

Thanks,
Nick


> 
> Basically, using a patch like the below, would give your tlb_flush()
> information on if tables were removed or not.
> 
> ---
> 
> --- a/include/asm-generic/tlb.h
> +++ b/include/asm-generic/tlb.h
> @@ -96,12 +96,22 @@ struct mmu_gather {
>  #endif
>   unsigned long   start;
>   unsigned long   end;
> - /* we are in the middle of an operation to clear
> -  * a full mm and can make some optimizations */
> - unsigned intfullmm : 1,
> - /* we have performed an operation which
> -  * requires a complete flush of the tlb */
> - need_flush_all : 1;
> + /*
> +  * we are in the middle of an operation to clear
> +  * a full mm and can make some optimizations
> +  */
> + unsigned intfullmm : 1;
> +
> + /*
> +  * we have performed an operation which
> +  * requires a complete flush of the tlb
> +  */
> + unsigned intneed_flush_all : 1;
> +
> + /*
> +  * we have removed page directories
> +  */
> + unsigned intfreed_tables : 1;
>  
>   struct mmu_gather_batch *active;
>   struct mmu_gather_batch local;
> @@ -136,6 +146,7 @@ static inline void __tlb_reset_range(str
>   tlb->start = TASK_SIZE;
>   tlb->end = 0;
>   }
> + tlb->freed_tables = 0;
>  }
>  
>  static inline void tlb_remove_page_size(struct mmu_gather *tlb,
> @@ -269,6 +280,7 @@ static inline void tlb_remove_check_page
>  #define pte_free_tlb(tlb, ptep, address) \
>   do {\
>   __tlb_adjust_range(tlb, address, PAGE_SIZE);\
> + tlb->freed_tables = 1;  \
>   __pte_free_tlb(tlb, ptep, address); \
>   } while (0)
>  #endif
> @@ -276,7 +288,8 @@ static inline void tlb_remove_check_page
>  #ifndef pmd_free_tlb
>  #define pmd_free_tlb(tlb, pmdp, address) \
>   do {\
> - __tlb_adjust_range(tlb, address, PAGE_SIZE);\
> + __tlb_adjust_range(tlb, address, PAGE_SIZE);\
> + tlb->freed_tables = 1;  \
>   __pmd_free_tlb(tlb, pmdp, address); \
>   } while (0)
>  #endif
> @@ -286,6 +299,7 @@ static inline void tlb_remove_check_page
>  #define pud_free_tlb(tlb, pudp, address) \
>   do {\
>   __tlb_adjust_range(tlb, address, PAGE_SIZE);\
> + tlb->freed_tables = 1;  \
>   __pud_free_tlb(tlb, pudp, address); \
>   } while (0)
>  #endif
> @@ -295,7 +309,8 @@ static inline void tlb_remove_check_page
>  #ifndef p4d_free_tlb
>  #define p4d_free_tlb(tlb, pudp, address) \
>   do {\
> - __tlb_adjust_range(tlb, address, PAGE_SIZE);\
> + __tlb_adjust_range(tlb, address, PAGE_SIZE);\
> + tlb->freed_tables = 1;  \
>   __p4d_free_tlb(tlb, pudp, address); \
>   } while (0)
>  #endif



Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-22 Thread Nicholas Piggin
On Wed, 22 Aug 2018 17:55:27 +0200
Peter Zijlstra  wrote:

> On Wed, Aug 22, 2018 at 05:30:15PM +0200, Peter Zijlstra wrote:
> > ARM
> > which later used this put an explicit TLB invalidate in their
> > __p*_free_tlb() functions, and PowerPC-radix followed that example.  
> 
> > +/*
> > + * If we want tlb_remove_table() to imply TLB invalidates.
> > + */
> > +static inline void tlb_table_invalidate(struct mmu_gather *tlb)
> > +{
> > +#ifdef CONFIG_HAVE_RCU_TABLE_INVALIDATE
> > +   /*
> > +* Invalidate page-table caches used by hardware walkers. Then we still
> > +* need to RCU-sched wait while freeing the pages because software
> > +* walkers can still be in-flight.
> > +*/
> > +   __tlb_flush_mmu_tlbonly(tlb);
> > +#endif
> > +}  
> 
> 
> Nick, Will is already looking at using this to remove the synchronous
> invalidation from __p*_free_tlb() for ARM, could you have a look to see
> if PowerPC-radix could benefit from that too?

powerpc/radix has no such issue, it already does this tracking.

We were discussing this a couple of months ago, I wasn't aware of ARM's
issue but I suggested x86 could go the same way as powerpc. Would have
been good to get some feedback on some of the proposed approaches there.
Because it's not just pwc tracking but if you do this you don't need
those silly hacks in generic code to expand the TLB address range
either.

So powerpc has no fundamental problem with making this stuff generic.
If you need a fix for x86 and ARM for this merge window though, I would
suggest just copying what powerpc already has. Next time we can
consider consolidating them all into generic code.

Thanks,
Nick


> 
> Basically, using a patch like the below, would give your tlb_flush()
> information on if tables were removed or not.
> 
> ---
> 
> --- a/include/asm-generic/tlb.h
> +++ b/include/asm-generic/tlb.h
> @@ -96,12 +96,22 @@ struct mmu_gather {
>  #endif
>   unsigned long   start;
>   unsigned long   end;
> - /* we are in the middle of an operation to clear
> -  * a full mm and can make some optimizations */
> - unsigned intfullmm : 1,
> - /* we have performed an operation which
> -  * requires a complete flush of the tlb */
> - need_flush_all : 1;
> + /*
> +  * we are in the middle of an operation to clear
> +  * a full mm and can make some optimizations
> +  */
> + unsigned intfullmm : 1;
> +
> + /*
> +  * we have performed an operation which
> +  * requires a complete flush of the tlb
> +  */
> + unsigned intneed_flush_all : 1;
> +
> + /*
> +  * we have removed page directories
> +  */
> + unsigned intfreed_tables : 1;
>  
>   struct mmu_gather_batch *active;
>   struct mmu_gather_batch local;
> @@ -136,6 +146,7 @@ static inline void __tlb_reset_range(str
>   tlb->start = TASK_SIZE;
>   tlb->end = 0;
>   }
> + tlb->freed_tables = 0;
>  }
>  
>  static inline void tlb_remove_page_size(struct mmu_gather *tlb,
> @@ -269,6 +280,7 @@ static inline void tlb_remove_check_page
>  #define pte_free_tlb(tlb, ptep, address) \
>   do {\
>   __tlb_adjust_range(tlb, address, PAGE_SIZE);\
> + tlb->freed_tables = 1;  \
>   __pte_free_tlb(tlb, ptep, address); \
>   } while (0)
>  #endif
> @@ -276,7 +288,8 @@ static inline void tlb_remove_check_page
>  #ifndef pmd_free_tlb
>  #define pmd_free_tlb(tlb, pmdp, address) \
>   do {\
> - __tlb_adjust_range(tlb, address, PAGE_SIZE);\
> + __tlb_adjust_range(tlb, address, PAGE_SIZE);\
> + tlb->freed_tables = 1;  \
>   __pmd_free_tlb(tlb, pmdp, address); \
>   } while (0)
>  #endif
> @@ -286,6 +299,7 @@ static inline void tlb_remove_check_page
>  #define pud_free_tlb(tlb, pudp, address) \
>   do {\
>   __tlb_adjust_range(tlb, address, PAGE_SIZE);\
> + tlb->freed_tables = 1;  \
>   __pud_free_tlb(tlb, pudp, address); \
>   } while (0)
>  #endif
> @@ -295,7 +309,8 @@ static inline void tlb_remove_check_page
>  #ifndef p4d_free_tlb
>  #define p4d_free_tlb(tlb, pudp, address) \
>   do {\
> - __tlb_adjust_range(tlb, address, PAGE_SIZE);\
> + __tlb_adjust_range(tlb, address, PAGE_SIZE);\
> + tlb->freed_tables = 1;  \
>   __p4d_free_tlb(tlb, pudp, address); \
>   } while (0)
>  #endif



arch/powerpc/boot/../../../lib/xz/xz_crc32.c:18:10: fatal error: linux/crc32poly.h: No such file or directory

2018-08-22 Thread kbuild test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   815f0ddb346c196018d4d8f8f55c12b83da1de3f
commit: faa16bc404d72a5afb857c924c83a5f691f83386 lib: Use existing define with 
polynomial
date:   4 weeks ago
config: powerpc-skiroot_defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout faa16bc404d72a5afb857c924c83a5f691f83386
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   In file included from arch/powerpc/boot/../../../lib/decompress_unxz.c:233:0,
from arch/powerpc/boot/decompress.c:42:
>> arch/powerpc/boot/../../../lib/xz/xz_crc32.c:18:10: fatal error: 
>> linux/crc32poly.h: No such file or directory
#include 
 ^~~
   compilation terminated.

vim +18 arch/powerpc/boot/../../../lib/xz/xz_crc32.c

  > 18  #include 
19  #include "xz_private.h"
20  

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


arch/powerpc/boot/../../../lib/xz/xz_crc32.c:18:10: fatal error: linux/crc32poly.h: No such file or directory

2018-08-22 Thread kbuild test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   815f0ddb346c196018d4d8f8f55c12b83da1de3f
commit: faa16bc404d72a5afb857c924c83a5f691f83386 lib: Use existing define with 
polynomial
date:   4 weeks ago
config: powerpc-skiroot_defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout faa16bc404d72a5afb857c924c83a5f691f83386
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   In file included from arch/powerpc/boot/../../../lib/decompress_unxz.c:233:0,
from arch/powerpc/boot/decompress.c:42:
>> arch/powerpc/boot/../../../lib/xz/xz_crc32.c:18:10: fatal error: 
>> linux/crc32poly.h: No such file or directory
#include 
 ^~~
   compilation terminated.

vim +18 arch/powerpc/boot/../../../lib/xz/xz_crc32.c

  > 18  #include 
19  #include "xz_private.h"
20  

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH 2/4] mm/tlb: Remove tlb_remove_table() non-concurrent condition

2018-08-22 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 8:35 PM Linus Torvalds
 wrote:
>
> No. Because mm_users doesn't contain any lazy tlb users.

.. or, as it turns out, the use_mm() case either, which can do
gup_fast(). Oh well.

Linus


Re: [PATCH 2/4] mm/tlb: Remove tlb_remove_table() non-concurrent condition

2018-08-22 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 8:35 PM Linus Torvalds
 wrote:
>
> No. Because mm_users doesn't contain any lazy tlb users.

.. or, as it turns out, the use_mm() case either, which can do
gup_fast(). Oh well.

Linus


Re: [PATCH 2/4] mm/tlb: Remove tlb_remove_table() non-concurrent condition

2018-08-22 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 8:31 PM Nicholas Piggin  wrote:
>
>
> So that leaves speculative operations. I don't see where the problem is
> with those either -- this shortcut needs to ensure there are no other
> *non speculative* operations. mm_users is correct for that.

No. Because mm_users doesn't contain any lazy tlb users.

And yes, those lazy tlbs are all kernel threads, but they can still
speculatively load user addresses.

   Linus


Re: [PATCH 2/4] mm/tlb: Remove tlb_remove_table() non-concurrent condition

2018-08-22 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 8:31 PM Nicholas Piggin  wrote:
>
>
> So that leaves speculative operations. I don't see where the problem is
> with those either -- this shortcut needs to ensure there are no other
> *non speculative* operations. mm_users is correct for that.

No. Because mm_users doesn't contain any lazy tlb users.

And yes, those lazy tlbs are all kernel threads, but they can still
speculatively load user addresses.

   Linus


linux-next: Tree for Aug 23

2018-08-22 Thread Stephen Rothwell
Hi all,

Please do not add any v4.20 material to your linux-next included
branches until after v4.19-rc1 has been released.

Changes since 20180822:

New tree: at91-fixes

The kbuild tree gained a build failure so I used the version from
next-20180822.

Non-merge commits (relative to Linus' tree): 1609
 1899 files changed, 59754 insertions(+), 27656 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 287 trees (counting Linus' and 65 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (899fbc33fd77 Merge tag 'platform-drivers-x86-v4.19-1' of 
git://git.infradead.org/linux-platform-drivers-x86)
Merging fixes/master (147a89bc71e7 Merge tag 'kconfig-v4.17' of 
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild)
Merging kbuild-current/fixes (ad1d69735878 Merge tag 'fuse-update-4.19' of 
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse)
Merging arc-current/for-curr (fce0d0affdc5 ARC: cleanup show_faulting_vma())
Merging arm-current/fixes (afc9f65e01cd ARM: 8781/1: Fix Thumb-2 syscall return 
for binutils 2.29+)
Merging arm64-fixes/for-next/fixes (5ad356eabc47 arm64: mm: check for upper 
PAGE_SHIFT bits in pfn_valid())
Merging m68k-current/for-linus (71a896687b85 m68k/defconfig: Update defconfigs 
for v4.18-rc6)
Merging powerpc-fixes/fixes (cca19f0b684f powerpc/64s/radix: Fix missing global 
invalidations when removing copro)
Merging sparc/master (df2def49c57b Merge tag 'acpi-4.19-rc1-2' of 
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm)
Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2)
Merging net/master (78c452fe7068 Merge branch 'net_sched-fixes')
Merging bpf/master (9b2e0388bec8 bpf: sockmap: write_space events need to be 
passed to TCP handler)
Merging ipsec/master (25432eba9cd8 openvswitch: meter: Fix setting meter id for 
new entries)
Merging netfilter/master (9c86336c15db ip6_vti: fix a null pointer deference 
when destroy vti6 tunnel)
Merging ipvs/master (feb9f55c33e5 netfilter: nft_dynset: allow dynamic updates 
of non-anonymous set)
Merging wireless-drivers/master (299b6365a3b7 brcmfmac: fix regression in 
parsing NVRAM for multiple devices)
Merging mac80211/master (8a54d8fc160e cfg80211: remove division by size of 
sizeof(struct ieee80211_wmm_rule))
Merging rdma-fixes/for-rc (845b397a7771 IB/ucm: fix UCM link error)
Merging sound-current/for-linus (8a328ac1f9eb ALSA: hda/realtek - Fix HP 
Headset Mic can't record)
Merging sound-asoc-fixes/for-linus (ba095ffc9324 Merge branch 'asoc-4.18' into 
asoc-linus)
Merging regmap-fixes/for-linus (94710cac0ef4 Linux 4.18)
Merging regulator-fixes/for-linus (84d77c5ab32d Merge branch 'regulator-4.18' 
into regulator-linus)
Merging spi-fixes/for-linus (8a7c14551b9b Merge branch 'spi-4.18' into 
spi-linus)
Merging pci-current/for-linus (44bda4b7d26e PCI: Fix is_added/is_busmaster race 
condition)
Merging driver-core.current/driver-core-linus (ad1d69735878 Merge tag 
'fuse-update-4.19' of 
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse)
Merging tty.current/tty-linus (ad1d69735878 Merge tag 'fuse-update-4.19' of 
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse)
Merging usb.current/usb-linus (ad1d69735878 Merge tag 'fuse-update-4.19' of 
git://git.kernel.org/pub/scm/linux

linux-next: Tree for Aug 23

2018-08-22 Thread Stephen Rothwell
Hi all,

Please do not add any v4.20 material to your linux-next included
branches until after v4.19-rc1 has been released.

Changes since 20180822:

New tree: at91-fixes

The kbuild tree gained a build failure so I used the version from
next-20180822.

Non-merge commits (relative to Linus' tree): 1609
 1899 files changed, 59754 insertions(+), 27656 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 287 trees (counting Linus' and 65 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (899fbc33fd77 Merge tag 'platform-drivers-x86-v4.19-1' of 
git://git.infradead.org/linux-platform-drivers-x86)
Merging fixes/master (147a89bc71e7 Merge tag 'kconfig-v4.17' of 
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild)
Merging kbuild-current/fixes (ad1d69735878 Merge tag 'fuse-update-4.19' of 
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse)
Merging arc-current/for-curr (fce0d0affdc5 ARC: cleanup show_faulting_vma())
Merging arm-current/fixes (afc9f65e01cd ARM: 8781/1: Fix Thumb-2 syscall return 
for binutils 2.29+)
Merging arm64-fixes/for-next/fixes (5ad356eabc47 arm64: mm: check for upper 
PAGE_SHIFT bits in pfn_valid())
Merging m68k-current/for-linus (71a896687b85 m68k/defconfig: Update defconfigs 
for v4.18-rc6)
Merging powerpc-fixes/fixes (cca19f0b684f powerpc/64s/radix: Fix missing global 
invalidations when removing copro)
Merging sparc/master (df2def49c57b Merge tag 'acpi-4.19-rc1-2' of 
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm)
Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2)
Merging net/master (78c452fe7068 Merge branch 'net_sched-fixes')
Merging bpf/master (9b2e0388bec8 bpf: sockmap: write_space events need to be 
passed to TCP handler)
Merging ipsec/master (25432eba9cd8 openvswitch: meter: Fix setting meter id for 
new entries)
Merging netfilter/master (9c86336c15db ip6_vti: fix a null pointer deference 
when destroy vti6 tunnel)
Merging ipvs/master (feb9f55c33e5 netfilter: nft_dynset: allow dynamic updates 
of non-anonymous set)
Merging wireless-drivers/master (299b6365a3b7 brcmfmac: fix regression in 
parsing NVRAM for multiple devices)
Merging mac80211/master (8a54d8fc160e cfg80211: remove division by size of 
sizeof(struct ieee80211_wmm_rule))
Merging rdma-fixes/for-rc (845b397a7771 IB/ucm: fix UCM link error)
Merging sound-current/for-linus (8a328ac1f9eb ALSA: hda/realtek - Fix HP 
Headset Mic can't record)
Merging sound-asoc-fixes/for-linus (ba095ffc9324 Merge branch 'asoc-4.18' into 
asoc-linus)
Merging regmap-fixes/for-linus (94710cac0ef4 Linux 4.18)
Merging regulator-fixes/for-linus (84d77c5ab32d Merge branch 'regulator-4.18' 
into regulator-linus)
Merging spi-fixes/for-linus (8a7c14551b9b Merge branch 'spi-4.18' into 
spi-linus)
Merging pci-current/for-linus (44bda4b7d26e PCI: Fix is_added/is_busmaster race 
condition)
Merging driver-core.current/driver-core-linus (ad1d69735878 Merge tag 
'fuse-update-4.19' of 
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse)
Merging tty.current/tty-linus (ad1d69735878 Merge tag 'fuse-update-4.19' of 
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse)
Merging usb.current/usb-linus (ad1d69735878 Merge tag 'fuse-update-4.19' of 
git://git.kernel.org/pub/scm/linux

Re: [PATCH 2/4] mm/tlb: Remove tlb_remove_table() non-concurrent condition

2018-08-22 Thread Nicholas Piggin
On Wed, 22 Aug 2018 17:30:14 +0200
Peter Zijlstra  wrote:

> Will noted that only checking mm_users is incorrect; we should also
> check mm_count in order to cover CPUs that have a lazy reference to
> this mm (and could do speculative TLB operations).

Why is that incorrect?

This shortcut has nothing to do with no TLBs -- not sure about x86, but
other CPUs can certainly have remaining TLBs here, speculative
operations or not (even if they don't have an mm_count ref they can
have TLBs here).

So that leaves speculative operations. I don't see where the problem is
with those either -- this shortcut needs to ensure there are no other
*non speculative* operations. mm_users is correct for that.

If there is a speculation security problem here it should be carefully
documented otherwise it's going to be re-introduced...

I actually have a patch to extend this optimisation further that I'm
going to send out again today. It's nice to avoid the double handling
of the pages.

Thanks,
Nick

> 
> If removing this turns out to be a performance issue, we can
> re-instate a more complete check, but in tlb_table_flush() eliding the
> call_rcu_sched().
> 
> Cc: sta...@kernel.org
> Cc: Nicholas Piggin 
> Cc: David Miller 
> Cc: Will Deacon 
> Cc: Martin Schwidefsky 
> Cc: Michael Ellerman 
> Fixes: 267239116987 ("mm, powerpc: move the RCU page-table freeing into 
> generic code")
> Reported-by: Will Deacon 
> Signed-off-by: Peter Zijlstra (Intel) 
> ---
>  mm/memory.c |9 -
>  1 file changed, 9 deletions(-)
> 
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -375,15 +375,6 @@ void tlb_remove_table(struct mmu_gather
>  {
>   struct mmu_table_batch **batch = >batch;
>  
> - /*
> -  * When there's less then two users of this mm there cannot be a
> -  * concurrent page-table walk.
> -  */
> - if (atomic_read(>mm->mm_users) < 2) {
> - __tlb_remove_table(table);
> - return;
> - }
> -
>   if (*batch == NULL) {
>   *batch = (struct mmu_table_batch *)__get_free_page(GFP_NOWAIT | 
> __GFP_NOWARN);
>   if (*batch == NULL) {
> 
> 



Re: [PATCH 2/4] mm/tlb: Remove tlb_remove_table() non-concurrent condition

2018-08-22 Thread Nicholas Piggin
On Wed, 22 Aug 2018 17:30:14 +0200
Peter Zijlstra  wrote:

> Will noted that only checking mm_users is incorrect; we should also
> check mm_count in order to cover CPUs that have a lazy reference to
> this mm (and could do speculative TLB operations).

Why is that incorrect?

This shortcut has nothing to do with no TLBs -- not sure about x86, but
other CPUs can certainly have remaining TLBs here, speculative
operations or not (even if they don't have an mm_count ref they can
have TLBs here).

So that leaves speculative operations. I don't see where the problem is
with those either -- this shortcut needs to ensure there are no other
*non speculative* operations. mm_users is correct for that.

If there is a speculation security problem here it should be carefully
documented otherwise it's going to be re-introduced...

I actually have a patch to extend this optimisation further that I'm
going to send out again today. It's nice to avoid the double handling
of the pages.

Thanks,
Nick

> 
> If removing this turns out to be a performance issue, we can
> re-instate a more complete check, but in tlb_table_flush() eliding the
> call_rcu_sched().
> 
> Cc: sta...@kernel.org
> Cc: Nicholas Piggin 
> Cc: David Miller 
> Cc: Will Deacon 
> Cc: Martin Schwidefsky 
> Cc: Michael Ellerman 
> Fixes: 267239116987 ("mm, powerpc: move the RCU page-table freeing into 
> generic code")
> Reported-by: Will Deacon 
> Signed-off-by: Peter Zijlstra (Intel) 
> ---
>  mm/memory.c |9 -
>  1 file changed, 9 deletions(-)
> 
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -375,15 +375,6 @@ void tlb_remove_table(struct mmu_gather
>  {
>   struct mmu_table_batch **batch = >batch;
>  
> - /*
> -  * When there's less then two users of this mm there cannot be a
> -  * concurrent page-table walk.
> -  */
> - if (atomic_read(>mm->mm_users) < 2) {
> - __tlb_remove_table(table);
> - return;
> - }
> -
>   if (*batch == NULL) {
>   *batch = (struct mmu_table_batch *)__get_free_page(GFP_NOWAIT | 
> __GFP_NOWARN);
>   if (*batch == NULL) {
> 
> 



Re: KASAN: use-after-free Write in irq_bypass_register_consumer

2018-08-22 Thread Eric Biggers
On Sat, May 26, 2018 at 11:24:01AM +0200, Dmitry Vyukov wrote:
> On Sun, May 13, 2018 at 8:21 AM, Eric Biggers  wrote:
> > On Thu, Apr 05, 2018 at 08:15:24PM -0700, Eric Biggers wrote:
> >> On Mon, Jan 29, 2018 at 01:29:48PM +0800, Tianyu Lan wrote:
> >> >
> >> >
> >> > On 1/27/2018 7:27 AM, Eric Biggers wrote:
> >> > > On Sat, Dec 16, 2017 at 04:37:02PM +0800, Lan, Tianyu wrote:
> >> > > > The root cause is that kvm_irqfd_assign() and kvm_irqfd_deassign() 
> >> > > > can't
> >> > > > be run in parallel. Some data structure(e.g, irqfd->consumer) will be
> >> > > > crashed because irqfd may be freed in deassign path before they are 
> >> > > > used
> >> > > > in assign path. The other data maybe used in deassign path before
> >> > > > initialization. Syzbot test hit such case. Add mutx between
> >> > > > kvm_irqfd_assign() and kvm_irqfd_deassign() can fix such issue. Will
> >> > > > send patch to fix it.
> >> > > >
> >> > > > On 12/16/2017 12:53 PM, Tianyu Lan wrote:
> >> > > > > I reproduced the issue. Will have a look.
> >> > > > >
> >> > > > > -- Best regards Tianyu Lan 2017-12-15 18:14 GMT+08:00 syzbot
> >> > > > > :
> >> > > > > > syzkaller has found reproducer for the following crash on
> >> > > > > > 82bcf1def3b5f1251177ad47c44f7e17af039b4b
> >> > > > > > git://git.cmpxchg.org/linux-mmots.git/master
> >> > > > > > compiler: gcc (GCC) 7.1.1 20170620
> >> > > > > > .config is attached
> >> > > > > > Raw console output is attached.
> >> > > > > > C reproducer is attached
> >> > > > > > syzkaller reproducer is attached. Seehttps://goo.gl/kgGztJ
> >> > > > > > for information about syzkaller reproducers
> >> > > > > >
> >> > > > > >
> >> > > > > > ==
> >> > > > > > BUG: KASAN: use-after-free in __list_add include/linux/list.h:64 
> >> > > > > > [inline]
> >> > > > > > BUG: KASAN: use-after-free in list_add include/linux/list.h:79 
> >> > > > > > [inline]
> >> > > > > > BUG: KASAN: use-after-free in 
> >> > > > > > irq_bypass_register_consumer+0x4b4/0x500
> >> > > > > > virt/lib/irqbypass.c:217
> >> > > > > > Write of size 8 at addr 8801cdf51180 by task 
> >> > > > > > syzkaller436086/15031
> >> > > > > >
> >> > > > > > CPU: 1 PID: 15031 Comm: syzkaller436086 Not tainted 
> >> > > > > > 4.15.0-rc2-mm1+ #39
> >> > > > > > Hardware name: Google Google Compute Engine/Google Compute 
> >> > > > > > Engine, BIOS
> >> > > > > > Google 01/01/2011
> >> > > > > > Call Trace:
> >> > > > > >__dump_stack lib/dump_stack.c:17 [inline]
> >> > > > > >dump_stack+0x194/0x257 lib/dump_stack.c:53
> >> > > > > >print_address_description+0x73/0x250 mm/kasan/report.c:252
> >> > > > > >kasan_report_error mm/kasan/report.c:351 [inline]
> >> > > > > >kasan_report+0x25b/0x340 mm/kasan/report.c:409
> >> > > > > >__asan_report_store8_noabort+0x17/0x20 mm/kasan/report.c:435
> >> > > > > >__list_add include/linux/list.h:64 [inline]
> >> > > > > >list_add include/linux/list.h:79 [inline]
> >> > > > > >irq_bypass_register_consumer+0x4b4/0x500 
> >> > > > > > virt/lib/irqbypass.c:217
> >> > > > > >kvm_irqfd_assign arch/x86/kvm/../../../virt/kvm/eventfd.c:417 
> >> > > > > > [inline]
> >> > > > > >kvm_irqfd+0x137f/0x1d50 
> >> > > > > > arch/x86/kvm/../../../virt/kvm/eventfd.c:572
> >> > > > > >kvm_vm_ioctl+0x1079/0x1c40 
> >> > > > > > arch/x86/kvm/../../../virt/kvm/kvm_main.c:2992
> >> > > > > >vfs_ioctl fs/ioctl.c:46 [inline]
> >> > > > > >do_vfs_ioctl+0x1b1/0x1530 fs/ioctl.c:686
> >> > > > > >SYSC_ioctl fs/ioctl.c:701 [inline]
> >> > > > > >SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
> >> > > > > >entry_SYSCALL_64_fastpath+0x1f/0x96
> >> > > > > > RIP: 0033:0x44d379
> >> > > > > > RSP: 002b:7fc5ff9a9d08 EFLAGS: 0246 ORIG_RAX: 
> >> > > > > > 0010
> >> > > > > > RAX: ffda RBX: 7fc5ff9aa700 RCX: 0044d379
> >> > > > > > RDX: 20080fe0 RSI: 4020ae76 RDI: 0005
> >> > > > > > RBP: 007ff900 R08: 7fc5ff9aa700 R09: 7fc5ff9aa700
> >> > > > > > R10: 7fc5ff9aa700 R11: 0246 R12: 
> >> > > > > > R13: 007ff8ff R14: 7fc5ff9aa9c0 R15: 
> >> > > > > >
> >> > > > > > Allocated by task 15031:
> >> > > > > >save_stack+0x43/0xd0 mm/kasan/kasan.c:447
> >> > > > > >set_track mm/kasan/kasan.c:459 [inline]
> >> > > > > >kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
> >> > > > > >kmem_cache_alloc_trace+0x136/0x750 mm/slab.c:3614
> >> > > > > >kmalloc include/linux/slab.h:516 [inline]
> >> > > > > >kzalloc include/linux/slab.h:705 [inline]
> >> > > > > >kvm_irqfd_assign arch/x86/kvm/../../../virt/kvm/eventfd.c:296 
> >> > > > > > [inline]
> >> > > > > >kvm_irqfd+0x16c/0x1d50 
> >> > > > > > arch/x86/kvm/../../../virt/kvm/eventfd.c:572
> >> > > > > >kvm_vm_ioctl+0x1079/0x1c40 
> >> > > > > > arch/x86/kvm/../../../virt/kvm/kvm_main.c:2992
> >> > > > > >

Re: KASAN: use-after-free Write in irq_bypass_register_consumer

2018-08-22 Thread Eric Biggers
On Sat, May 26, 2018 at 11:24:01AM +0200, Dmitry Vyukov wrote:
> On Sun, May 13, 2018 at 8:21 AM, Eric Biggers  wrote:
> > On Thu, Apr 05, 2018 at 08:15:24PM -0700, Eric Biggers wrote:
> >> On Mon, Jan 29, 2018 at 01:29:48PM +0800, Tianyu Lan wrote:
> >> >
> >> >
> >> > On 1/27/2018 7:27 AM, Eric Biggers wrote:
> >> > > On Sat, Dec 16, 2017 at 04:37:02PM +0800, Lan, Tianyu wrote:
> >> > > > The root cause is that kvm_irqfd_assign() and kvm_irqfd_deassign() 
> >> > > > can't
> >> > > > be run in parallel. Some data structure(e.g, irqfd->consumer) will be
> >> > > > crashed because irqfd may be freed in deassign path before they are 
> >> > > > used
> >> > > > in assign path. The other data maybe used in deassign path before
> >> > > > initialization. Syzbot test hit such case. Add mutx between
> >> > > > kvm_irqfd_assign() and kvm_irqfd_deassign() can fix such issue. Will
> >> > > > send patch to fix it.
> >> > > >
> >> > > > On 12/16/2017 12:53 PM, Tianyu Lan wrote:
> >> > > > > I reproduced the issue. Will have a look.
> >> > > > >
> >> > > > > -- Best regards Tianyu Lan 2017-12-15 18:14 GMT+08:00 syzbot
> >> > > > > :
> >> > > > > > syzkaller has found reproducer for the following crash on
> >> > > > > > 82bcf1def3b5f1251177ad47c44f7e17af039b4b
> >> > > > > > git://git.cmpxchg.org/linux-mmots.git/master
> >> > > > > > compiler: gcc (GCC) 7.1.1 20170620
> >> > > > > > .config is attached
> >> > > > > > Raw console output is attached.
> >> > > > > > C reproducer is attached
> >> > > > > > syzkaller reproducer is attached. Seehttps://goo.gl/kgGztJ
> >> > > > > > for information about syzkaller reproducers
> >> > > > > >
> >> > > > > >
> >> > > > > > ==
> >> > > > > > BUG: KASAN: use-after-free in __list_add include/linux/list.h:64 
> >> > > > > > [inline]
> >> > > > > > BUG: KASAN: use-after-free in list_add include/linux/list.h:79 
> >> > > > > > [inline]
> >> > > > > > BUG: KASAN: use-after-free in 
> >> > > > > > irq_bypass_register_consumer+0x4b4/0x500
> >> > > > > > virt/lib/irqbypass.c:217
> >> > > > > > Write of size 8 at addr 8801cdf51180 by task 
> >> > > > > > syzkaller436086/15031
> >> > > > > >
> >> > > > > > CPU: 1 PID: 15031 Comm: syzkaller436086 Not tainted 
> >> > > > > > 4.15.0-rc2-mm1+ #39
> >> > > > > > Hardware name: Google Google Compute Engine/Google Compute 
> >> > > > > > Engine, BIOS
> >> > > > > > Google 01/01/2011
> >> > > > > > Call Trace:
> >> > > > > >__dump_stack lib/dump_stack.c:17 [inline]
> >> > > > > >dump_stack+0x194/0x257 lib/dump_stack.c:53
> >> > > > > >print_address_description+0x73/0x250 mm/kasan/report.c:252
> >> > > > > >kasan_report_error mm/kasan/report.c:351 [inline]
> >> > > > > >kasan_report+0x25b/0x340 mm/kasan/report.c:409
> >> > > > > >__asan_report_store8_noabort+0x17/0x20 mm/kasan/report.c:435
> >> > > > > >__list_add include/linux/list.h:64 [inline]
> >> > > > > >list_add include/linux/list.h:79 [inline]
> >> > > > > >irq_bypass_register_consumer+0x4b4/0x500 
> >> > > > > > virt/lib/irqbypass.c:217
> >> > > > > >kvm_irqfd_assign arch/x86/kvm/../../../virt/kvm/eventfd.c:417 
> >> > > > > > [inline]
> >> > > > > >kvm_irqfd+0x137f/0x1d50 
> >> > > > > > arch/x86/kvm/../../../virt/kvm/eventfd.c:572
> >> > > > > >kvm_vm_ioctl+0x1079/0x1c40 
> >> > > > > > arch/x86/kvm/../../../virt/kvm/kvm_main.c:2992
> >> > > > > >vfs_ioctl fs/ioctl.c:46 [inline]
> >> > > > > >do_vfs_ioctl+0x1b1/0x1530 fs/ioctl.c:686
> >> > > > > >SYSC_ioctl fs/ioctl.c:701 [inline]
> >> > > > > >SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
> >> > > > > >entry_SYSCALL_64_fastpath+0x1f/0x96
> >> > > > > > RIP: 0033:0x44d379
> >> > > > > > RSP: 002b:7fc5ff9a9d08 EFLAGS: 0246 ORIG_RAX: 
> >> > > > > > 0010
> >> > > > > > RAX: ffda RBX: 7fc5ff9aa700 RCX: 0044d379
> >> > > > > > RDX: 20080fe0 RSI: 4020ae76 RDI: 0005
> >> > > > > > RBP: 007ff900 R08: 7fc5ff9aa700 R09: 7fc5ff9aa700
> >> > > > > > R10: 7fc5ff9aa700 R11: 0246 R12: 
> >> > > > > > R13: 007ff8ff R14: 7fc5ff9aa9c0 R15: 
> >> > > > > >
> >> > > > > > Allocated by task 15031:
> >> > > > > >save_stack+0x43/0xd0 mm/kasan/kasan.c:447
> >> > > > > >set_track mm/kasan/kasan.c:459 [inline]
> >> > > > > >kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
> >> > > > > >kmem_cache_alloc_trace+0x136/0x750 mm/slab.c:3614
> >> > > > > >kmalloc include/linux/slab.h:516 [inline]
> >> > > > > >kzalloc include/linux/slab.h:705 [inline]
> >> > > > > >kvm_irqfd_assign arch/x86/kvm/../../../virt/kvm/eventfd.c:296 
> >> > > > > > [inline]
> >> > > > > >kvm_irqfd+0x16c/0x1d50 
> >> > > > > > arch/x86/kvm/../../../virt/kvm/eventfd.c:572
> >> > > > > >kvm_vm_ioctl+0x1079/0x1c40 
> >> > > > > > arch/x86/kvm/../../../virt/kvm/kvm_main.c:2992
> >> > > > > >

Compliment of the day to you Dear Friend.

2018-08-22 Thread Mrs. Amina Kadi
 Compliment of the day to you Dear Friend.

Dear Friend.
 
  I am Mrs. Amina Kadi. am sending this brief letter to solicit your
partnership to transfer $5.5 million US Dollars. I shall send you
more information and procedures when I receive positive response from
you.

Mrs. Amina Kadi









Compliment of the day to you Dear Friend.

2018-08-22 Thread Mrs. Amina Kadi
 Compliment of the day to you Dear Friend.

Dear Friend.
 
  I am Mrs. Amina Kadi. am sending this brief letter to solicit your
partnership to transfer $5.5 million US Dollars. I shall send you
more information and procedures when I receive positive response from
you.

Mrs. Amina Kadi









Re: [PATCH 1/9] of: add helper to lookup compatible child node

2018-08-22 Thread kbuild test robot
Hi Johan,

I love your patch! Yet something to improve:

[auto build test ERROR on robh/for-next]
[also build test ERROR on v4.18 next-20180822]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Johan-Hovold/of-fix-compatible-child-node-lookups/20180823-074211
base:   https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git for-next
config: mips-decstation_defconfig (attached as .config)
compiler: mipsel-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=mips 

All errors (new ones prefixed by >>):

   In file included from include/linux/irqdomain.h:35:0,
from arch/mips/include/asm/irq.h:14,
from include/linux/irq.h:23,
from include/asm-generic/hardirq.h:13,
from arch/mips/include/asm/hardirq.h:16,
from include/linux/hardirq.h:9,
from include/linux/interrupt.h:11,
from arch/mips/dec/ecc-berr.c:16:
>> include/linux/of.h:637:28: error: 'of_get_compatible_child' defined but not 
>> used [-Werror=unused-function]
static struct device_node *of_get_compatible_child(const struct device_node 
*parent,
   ^~~
   cc1: all warnings being treated as errors

vim +/of_get_compatible_child +637 include/linux/of.h

   636  
 > 637  static struct device_node *of_get_compatible_child(const struct 
 > device_node *parent,
   638  const char *compatible)
   639  {
   640  return NULL;
   641  }
   642  

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH 1/9] of: add helper to lookup compatible child node

2018-08-22 Thread kbuild test robot
Hi Johan,

I love your patch! Yet something to improve:

[auto build test ERROR on robh/for-next]
[also build test ERROR on v4.18 next-20180822]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Johan-Hovold/of-fix-compatible-child-node-lookups/20180823-074211
base:   https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git for-next
config: mips-decstation_defconfig (attached as .config)
compiler: mipsel-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=mips 

All errors (new ones prefixed by >>):

   In file included from include/linux/irqdomain.h:35:0,
from arch/mips/include/asm/irq.h:14,
from include/linux/irq.h:23,
from include/asm-generic/hardirq.h:13,
from arch/mips/include/asm/hardirq.h:16,
from include/linux/hardirq.h:9,
from include/linux/interrupt.h:11,
from arch/mips/dec/ecc-berr.c:16:
>> include/linux/of.h:637:28: error: 'of_get_compatible_child' defined but not 
>> used [-Werror=unused-function]
static struct device_node *of_get_compatible_child(const struct device_node 
*parent,
   ^~~
   cc1: all warnings being treated as errors

vim +/of_get_compatible_child +637 include/linux/of.h

   636  
 > 637  static struct device_node *of_get_compatible_child(const struct 
 > device_node *parent,
   638  const char *compatible)
   639  {
   640  return NULL;
   641  }
   642  

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


[lkp-robot] [x86] 19efe000d3: PANIC:double_fault

2018-08-22 Thread kernel test robot
FYI, we noticed the following commit (built with gcc-5):

commit: 19efe000d3258032d9a1dfb25313a092f9454da0 ("x86: Remap the IRQ stack so 
it has guard pages")
https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git x86/guard_pages

in testcase: trinity
with following parameters:

runtime: 300s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-x86_64 -enable-kvm -cpu IvyBridge -m 420M

caused below changes (please refer to attached dmesg/kmsg for entire 
log/backtrace):


+--+++
|  | 0d997f71d5 
| 19efe000d3 |
+--+++
| boot_successes   | 6  
| 0  |
| boot_failures| 5  
| 32 |
| invoked_oom-killer:gfp_mask=0x   | 5  
||
| Mem-Info | 5  
||
| Out_of_memory:Kill_process   | 3  
||
| Kernel_panic-not_syncing:Out_of_memory_and_no_killable_processes | 3  
||
| RIP:__put_user_4 | 1  
||
| PANIC:double_fault   | 0  
| 32 |
| RIP:trace_hardirqs_off_thunk | 0  
| 32 |
| Kernel_panic-not_syncing:Machine_halted  | 0  
| 32 |
| WARNING:kernel_stack | 0  
| 32 |
+--+++



[0.004000]  memory used by lock dependency info: 7871 kB
[0.004000]  per task-struct memory footprint: 2688 bytes
[0.004000] ACPI: Core revision 20180531
[0.004000] clocksource: hpet: mask: 0x max_cycles: 0x, 
max_idle_ns: 19112604467 ns
[0.004000] hpet clockevent registered
[0.004000] PANIC: double fault, error_code: 0x0
[0.004000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
4.18.0-rc4-00074-g19efe000 #2
[0.004000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1 04/01/2014
[0.004000] RIP: 0010:trace_hardirqs_off_thunk+0xb/0x1c
[0.004000] Code: 5f 5d c3 55 48 89 e5 57 56 52 51 50 41 50 41 51 41 52 41 
53 48 8b 7d 08 e8 f5 4e 2b 00 eb 34 55 48 89 e5 57 56 52 51 50 41 50 <41> 51 41 
52 41 53 48 8b 7d 08 e8 eb 52 2b 00 eb 18 55 48 89 e5 57 
[0.004000] RSP: :c900 EFLAGS: 00010087
[0.004000] RAX: 82800a97 RBX: 0001 RCX: 82800a97
[0.004000] RDX:  RSI: 82800f68 RDI: 83678c68
[0.004000] RBP: c930 R08:  R09: 
[0.004000] R10:  R11:  R12: 
[0.004000] R13:  R14:  R15: 
[0.004000] FS:  () GS:88001340() 
knlGS:
[0.004000] CS:  0010 DS:  ES:  CR0: 80050033
[0.004000] CR2: c8f8 CR3: 0366e000 CR4: 000406b0
[0.004000] Call Trace:
[0.004000] Kernel panic - not syncing: Machine halted.
[0.004000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
4.18.0-rc4-00074-g19efe000 #2
[0.004000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1 04/01/2014
[0.004000] Call Trace:
[0.004000]  <#DF>
[0.004000]  dump_stack+0x9b/0xe7
[0.004000]  panic+0x1ad/0x325
[0.004000]  ? refcount_error_report+0x267/0x267
[0.004000]  df_debug+0x32/0x32
[0.004000]  do_double_fault+0x1b3/0x1c7
[0.004000]  double_fault+0x23/0x30
[0.004000] RIP: 0010:trace_hardirqs_off_thunk+0xb/0x1c
[0.004000] Code: 5f 5d c3 55 48 89 e5 57 56 52 51 50 41 50 41 51 41 52 41 
53 48 8b 7d 08 e8 f5 4e 2b 00 eb 34 55 48 89 e5 57 56 52 51 50 41 50 <41> 51 41 
52 41 53 48 8b 7d 08 e8 eb 52 2b 00 eb 18 55 48 89 e5 57 
[0.004000] RSP: :c900 EFLAGS: 00010087
[0.004000] RAX: 82800a97 RBX: 0001 RCX: 82800a97
[0.004000] RDX:  RSI: 82800f68 RDI: 83678c68
[0.004000] RBP: c930 R08:  R09: 
[0.004000] R10:  R11:  R12: 
[0.004000] R13:  R14:  R15: 
[0.004000]  ? native_iret+0x7/0x7
[0.004000]  ? async_page_fault+0x8/0x30
[0.004000] WARNING: kernel stack regs at (ptrval) 

[lkp-robot] [x86] 19efe000d3: PANIC:double_fault

2018-08-22 Thread kernel test robot
FYI, we noticed the following commit (built with gcc-5):

commit: 19efe000d3258032d9a1dfb25313a092f9454da0 ("x86: Remap the IRQ stack so 
it has guard pages")
https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git x86/guard_pages

in testcase: trinity
with following parameters:

runtime: 300s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-x86_64 -enable-kvm -cpu IvyBridge -m 420M

caused below changes (please refer to attached dmesg/kmsg for entire 
log/backtrace):


+--+++
|  | 0d997f71d5 
| 19efe000d3 |
+--+++
| boot_successes   | 6  
| 0  |
| boot_failures| 5  
| 32 |
| invoked_oom-killer:gfp_mask=0x   | 5  
||
| Mem-Info | 5  
||
| Out_of_memory:Kill_process   | 3  
||
| Kernel_panic-not_syncing:Out_of_memory_and_no_killable_processes | 3  
||
| RIP:__put_user_4 | 1  
||
| PANIC:double_fault   | 0  
| 32 |
| RIP:trace_hardirqs_off_thunk | 0  
| 32 |
| Kernel_panic-not_syncing:Machine_halted  | 0  
| 32 |
| WARNING:kernel_stack | 0  
| 32 |
+--+++



[0.004000]  memory used by lock dependency info: 7871 kB
[0.004000]  per task-struct memory footprint: 2688 bytes
[0.004000] ACPI: Core revision 20180531
[0.004000] clocksource: hpet: mask: 0x max_cycles: 0x, 
max_idle_ns: 19112604467 ns
[0.004000] hpet clockevent registered
[0.004000] PANIC: double fault, error_code: 0x0
[0.004000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
4.18.0-rc4-00074-g19efe000 #2
[0.004000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1 04/01/2014
[0.004000] RIP: 0010:trace_hardirqs_off_thunk+0xb/0x1c
[0.004000] Code: 5f 5d c3 55 48 89 e5 57 56 52 51 50 41 50 41 51 41 52 41 
53 48 8b 7d 08 e8 f5 4e 2b 00 eb 34 55 48 89 e5 57 56 52 51 50 41 50 <41> 51 41 
52 41 53 48 8b 7d 08 e8 eb 52 2b 00 eb 18 55 48 89 e5 57 
[0.004000] RSP: :c900 EFLAGS: 00010087
[0.004000] RAX: 82800a97 RBX: 0001 RCX: 82800a97
[0.004000] RDX:  RSI: 82800f68 RDI: 83678c68
[0.004000] RBP: c930 R08:  R09: 
[0.004000] R10:  R11:  R12: 
[0.004000] R13:  R14:  R15: 
[0.004000] FS:  () GS:88001340() 
knlGS:
[0.004000] CS:  0010 DS:  ES:  CR0: 80050033
[0.004000] CR2: c8f8 CR3: 0366e000 CR4: 000406b0
[0.004000] Call Trace:
[0.004000] Kernel panic - not syncing: Machine halted.
[0.004000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
4.18.0-rc4-00074-g19efe000 #2
[0.004000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1 04/01/2014
[0.004000] Call Trace:
[0.004000]  <#DF>
[0.004000]  dump_stack+0x9b/0xe7
[0.004000]  panic+0x1ad/0x325
[0.004000]  ? refcount_error_report+0x267/0x267
[0.004000]  df_debug+0x32/0x32
[0.004000]  do_double_fault+0x1b3/0x1c7
[0.004000]  double_fault+0x23/0x30
[0.004000] RIP: 0010:trace_hardirqs_off_thunk+0xb/0x1c
[0.004000] Code: 5f 5d c3 55 48 89 e5 57 56 52 51 50 41 50 41 51 41 52 41 
53 48 8b 7d 08 e8 f5 4e 2b 00 eb 34 55 48 89 e5 57 56 52 51 50 41 50 <41> 51 41 
52 41 53 48 8b 7d 08 e8 eb 52 2b 00 eb 18 55 48 89 e5 57 
[0.004000] RSP: :c900 EFLAGS: 00010087
[0.004000] RAX: 82800a97 RBX: 0001 RCX: 82800a97
[0.004000] RDX:  RSI: 82800f68 RDI: 83678c68
[0.004000] RBP: c930 R08:  R09: 
[0.004000] R10:  R11:  R12: 
[0.004000] R13:  R14:  R15: 
[0.004000]  ? native_iret+0x7/0x7
[0.004000]  ? async_page_fault+0x8/0x30
[0.004000] WARNING: kernel stack regs at (ptrval) 

[PATCH] cpufreq: ti-cpufreq: Only register platform_device when supported

2018-08-22 Thread Dave Gerlach
Currently the ti-cpufreq driver blindly registers a 'ti-cpufreq' to force
the driver to probe on any platforms where the driver is built in.
However, this should only happen on platforms that actually can make use
of the driver. There is already functionality in place to match the
SoC compatible so let's factor this out into a separate call and
make sure we find a match before creating the ti-cpufreq driver device.

Signed-off-by: Dave Gerlach 
---
 drivers/cpufreq/ti-cpufreq.c | 25 -
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/cpufreq/ti-cpufreq.c b/drivers/cpufreq/ti-cpufreq.c
index 3f0e2a14895a..541fdcf17b57 100644
--- a/drivers/cpufreq/ti-cpufreq.c
+++ b/drivers/cpufreq/ti-cpufreq.c
@@ -201,19 +201,31 @@ static const struct of_device_id ti_cpufreq_of_match[] = {
{},
 };
 
+static const struct of_device_id *ti_cpufreq_match_node(void)
+{
+   struct device_node *np;
+   const struct of_device_id *match;
+
+   np = of_find_node_by_path("/");
+   match = of_match_node(ti_cpufreq_of_match, np);
+   of_node_put(np);
+
+   if (!match)
+   return NULL;
+   else
+   return match;
+}
+
 static int ti_cpufreq_probe(struct platform_device *pdev)
 {
u32 version[VERSION_COUNT];
-   struct device_node *np;
const struct of_device_id *match;
struct opp_table *ti_opp_table;
struct ti_cpufreq_data *opp_data;
const char * const reg_names[] = {"vdd", "vbb"};
int ret;
 
-   np = of_find_node_by_path("/");
-   match = of_match_node(ti_cpufreq_of_match, np);
-   of_node_put(np);
+   match = ti_cpufreq_match_node();
if (!match)
return -ENODEV;
 
@@ -290,7 +302,10 @@ static int ti_cpufreq_probe(struct platform_device *pdev)
 
 static int ti_cpufreq_init(void)
 {
-   platform_device_register_simple("ti-cpufreq", -1, NULL, 0);
+   /* Check to ensure we are on a compatible platform */
+   if (ti_cpufreq_match_node())
+   platform_device_register_simple("ti-cpufreq", -1, NULL, 0);
+
return 0;
 }
 module_init(ti_cpufreq_init);
-- 
2.16.1



[PATCH] cpufreq: ti-cpufreq: Only register platform_device when supported

2018-08-22 Thread Dave Gerlach
Currently the ti-cpufreq driver blindly registers a 'ti-cpufreq' to force
the driver to probe on any platforms where the driver is built in.
However, this should only happen on platforms that actually can make use
of the driver. There is already functionality in place to match the
SoC compatible so let's factor this out into a separate call and
make sure we find a match before creating the ti-cpufreq driver device.

Signed-off-by: Dave Gerlach 
---
 drivers/cpufreq/ti-cpufreq.c | 25 -
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/cpufreq/ti-cpufreq.c b/drivers/cpufreq/ti-cpufreq.c
index 3f0e2a14895a..541fdcf17b57 100644
--- a/drivers/cpufreq/ti-cpufreq.c
+++ b/drivers/cpufreq/ti-cpufreq.c
@@ -201,19 +201,31 @@ static const struct of_device_id ti_cpufreq_of_match[] = {
{},
 };
 
+static const struct of_device_id *ti_cpufreq_match_node(void)
+{
+   struct device_node *np;
+   const struct of_device_id *match;
+
+   np = of_find_node_by_path("/");
+   match = of_match_node(ti_cpufreq_of_match, np);
+   of_node_put(np);
+
+   if (!match)
+   return NULL;
+   else
+   return match;
+}
+
 static int ti_cpufreq_probe(struct platform_device *pdev)
 {
u32 version[VERSION_COUNT];
-   struct device_node *np;
const struct of_device_id *match;
struct opp_table *ti_opp_table;
struct ti_cpufreq_data *opp_data;
const char * const reg_names[] = {"vdd", "vbb"};
int ret;
 
-   np = of_find_node_by_path("/");
-   match = of_match_node(ti_cpufreq_of_match, np);
-   of_node_put(np);
+   match = ti_cpufreq_match_node();
if (!match)
return -ENODEV;
 
@@ -290,7 +302,10 @@ static int ti_cpufreq_probe(struct platform_device *pdev)
 
 static int ti_cpufreq_init(void)
 {
-   platform_device_register_simple("ti-cpufreq", -1, NULL, 0);
+   /* Check to ensure we are on a compatible platform */
+   if (ti_cpufreq_match_node())
+   platform_device_register_simple("ti-cpufreq", -1, NULL, 0);
+
return 0;
 }
 module_init(ti_cpufreq_init);
-- 
2.16.1



Re: [PATCH] include/linux/compiler*.h: make compiler-*.h mutually exclusive

2018-08-22 Thread Masahiro Yamada
Hi Nick

2018-08-23 8:37 GMT+09:00 Nick Desaulniers :
> Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6")
> recently exposed a brittle part of the build for supporting non-gcc
> compilers.
>
> Both Clang and ICC define __GNUC__, __GNUC_MINOR__, and
> __GNUC_PATCHLEVEL__ for quick compatibility with code bases that haven't
> added compiler specific checks for __clang__ or __INTEL_COMPILER.
>
> This is brittle, as they happened to get compatibility by posing as a
> certain version of GCC.  This broke when upgrading the minimal version
> of GCC required to build the kernel, to a version above what ICC and
> Clang claim to be.
>
> Rather than always including compiler-gcc.h then undefining or
> redefining macros in compiler-intel.h or compiler-clang.h, let's
> separate out the compiler specific macro definitions into mutually
> exclusive headers, do more proper compiler detection, and keep shared
> definitions in compiler_types.h.
>
> Reported-by: Masahiro Yamada 
> Suggested-by: Eli Friedman 
> Suggested-by: Joe Perches 
> Signed-off-by: Nick Desaulniers 
> ---
>  arch/arm/kernel/asm-offsets.c |  13 +-
>  drivers/gpu/drm/i915/i915_utils.h |   2 +-
>  drivers/watchdog/kempld_wdt.c |   5 -
>  include/linux/compiler-clang.h|  20 ++-
>  include/linux/compiler-gcc.h  |  88 ---
>  include/linux/compiler-intel.h|  13 +-
>  include/linux/compiler_types.h| 238 ++
>  mm/ksm.c  |   4 +-
>  mm/migrate.c  |   3 +-
>  9 files changed, 133 insertions(+), 253 deletions(-)



If I build ARM linux with Clang,
I see the following error.



arch/arm/firmware/trusted_foundations.c:34:13: error: variable has
incomplete type 'void'
static void __naked tf_generic_smc(u32 type, u32 arg1, u32 arg2)
^
arch/arm/firmware/trusted_foundations.c:34:20: error: expected ';'
after top level declarator
static void __naked tf_generic_smc(u32 type, u32 arg1, u32 arg2)
   ^
   ;
arch/arm/firmware/trusted_foundations.c:77:25: error: use of
undeclared identifier 'trusted_foundations_ops'
register_firmware_ops(_foundations_ops);
   ^
3 errors generated.
scripts/Makefile.build:307: recipe for target
'arch/arm/firmware/trusted_foundations.o' failed
make[2]: *** [arch/arm/firmware/trusted_foundations.o] Error 1
Makefile:1057: recipe for target 'arch/arm/firmware' failed
make[1]: *** [arch/arm/firmware] Error 2
make[1]: *** Waiting for unfinished jobs





__naked is defined in 


It was previous included by all compilers,
but now it is only by _true_ GCC.


Even if I move the __naked definition
to ,
I see a different error.



arch/arm/firmware/trusted_foundations.c:47:10: error: parameter
references not allowed in naked functions
: "r" (type), "r" (arg1), "r" (arg2)
   ^
arch/arm/firmware/trusted_foundations.c:34:13: note: attribute is here
static void __naked tf_generic_smc(u32 type, u32 arg1, u32 arg2)
^
./include/linux/compiler_types.h:277:33: note: expanded from macro '__naked'
#define __naked __attribute__((naked))
   ^



I do not know what a correct fix is.







-- 
Best Regards
Masahiro Yamada


Re: [PATCH] include/linux/compiler*.h: make compiler-*.h mutually exclusive

2018-08-22 Thread Masahiro Yamada
Hi Nick

2018-08-23 8:37 GMT+09:00 Nick Desaulniers :
> Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6")
> recently exposed a brittle part of the build for supporting non-gcc
> compilers.
>
> Both Clang and ICC define __GNUC__, __GNUC_MINOR__, and
> __GNUC_PATCHLEVEL__ for quick compatibility with code bases that haven't
> added compiler specific checks for __clang__ or __INTEL_COMPILER.
>
> This is brittle, as they happened to get compatibility by posing as a
> certain version of GCC.  This broke when upgrading the minimal version
> of GCC required to build the kernel, to a version above what ICC and
> Clang claim to be.
>
> Rather than always including compiler-gcc.h then undefining or
> redefining macros in compiler-intel.h or compiler-clang.h, let's
> separate out the compiler specific macro definitions into mutually
> exclusive headers, do more proper compiler detection, and keep shared
> definitions in compiler_types.h.
>
> Reported-by: Masahiro Yamada 
> Suggested-by: Eli Friedman 
> Suggested-by: Joe Perches 
> Signed-off-by: Nick Desaulniers 
> ---
>  arch/arm/kernel/asm-offsets.c |  13 +-
>  drivers/gpu/drm/i915/i915_utils.h |   2 +-
>  drivers/watchdog/kempld_wdt.c |   5 -
>  include/linux/compiler-clang.h|  20 ++-
>  include/linux/compiler-gcc.h  |  88 ---
>  include/linux/compiler-intel.h|  13 +-
>  include/linux/compiler_types.h| 238 ++
>  mm/ksm.c  |   4 +-
>  mm/migrate.c  |   3 +-
>  9 files changed, 133 insertions(+), 253 deletions(-)



If I build ARM linux with Clang,
I see the following error.



arch/arm/firmware/trusted_foundations.c:34:13: error: variable has
incomplete type 'void'
static void __naked tf_generic_smc(u32 type, u32 arg1, u32 arg2)
^
arch/arm/firmware/trusted_foundations.c:34:20: error: expected ';'
after top level declarator
static void __naked tf_generic_smc(u32 type, u32 arg1, u32 arg2)
   ^
   ;
arch/arm/firmware/trusted_foundations.c:77:25: error: use of
undeclared identifier 'trusted_foundations_ops'
register_firmware_ops(_foundations_ops);
   ^
3 errors generated.
scripts/Makefile.build:307: recipe for target
'arch/arm/firmware/trusted_foundations.o' failed
make[2]: *** [arch/arm/firmware/trusted_foundations.o] Error 1
Makefile:1057: recipe for target 'arch/arm/firmware' failed
make[1]: *** [arch/arm/firmware] Error 2
make[1]: *** Waiting for unfinished jobs





__naked is defined in 


It was previous included by all compilers,
but now it is only by _true_ GCC.


Even if I move the __naked definition
to ,
I see a different error.



arch/arm/firmware/trusted_foundations.c:47:10: error: parameter
references not allowed in naked functions
: "r" (type), "r" (arg1), "r" (arg2)
   ^
arch/arm/firmware/trusted_foundations.c:34:13: note: attribute is here
static void __naked tf_generic_smc(u32 type, u32 arg1, u32 arg2)
^
./include/linux/compiler_types.h:277:33: note: expanded from macro '__naked'
#define __naked __attribute__((naked))
   ^



I do not know what a correct fix is.







-- 
Best Regards
Masahiro Yamada


Re: [PATCH] irq: ingenic: Drop dependency on MACH_INGENIC

2018-08-22 Thread kbuild test robot
Hi Paul,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on tip/irq/core]
[also build test WARNING on v4.18 next-20180822]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Paul-Cercueil/irq-ingenic-Drop-dependency-on-MACH_INGENIC/20180822-083429
config: mips-allmodconfig (attached as .config)
compiler: mips-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=mips 

All warnings (new ones prefixed by >>):

   In file included from drivers//irqchip/irq-ingenic.c:30:0:
>> arch/mips/include/asm/mach-jz4740/irq.h:63:0: warning: "NR_IRQS" redefined
#define NR_IRQS (JZ4740_IRQ_ADC_BASE + 6)

   In file included from arch/mips/include/asm/irq.h:18:0,
from include/linux/irq.h:23,
from include/asm-generic/hardirq.h:13,
from arch/mips/include/asm/hardirq.h:16,
from include/linux/hardirq.h:9,
from include/linux/interrupt.h:11,
from drivers//irqchip/irq-ingenic.c:19:
   arch/mips/include/asm/mach-generic/irq.h:12:0: note: this is the location of 
the previous definition
#define NR_IRQS 128


vim +/NR_IRQS +63 arch/mips/include/asm/mach-jz4740/irq.h

9869848d Lars-Peter Clausen 2010-07-17  62  
9869848d Lars-Peter Clausen 2010-07-17 @63  #define NR_IRQS 
(JZ4740_IRQ_ADC_BASE + 6)
9869848d Lars-Peter Clausen 2010-07-17  64  

:: The code at line 63 was first introduced by commit
:: 9869848d12601cdddf097a36aebe0b10dc5d177b MIPS: JZ4740: Add IRQ handler 
code

:: TO: Lars-Peter Clausen 
:: CC: Ralf Baechle 

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH] irq: ingenic: Drop dependency on MACH_INGENIC

2018-08-22 Thread kbuild test robot
Hi Paul,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on tip/irq/core]
[also build test WARNING on v4.18 next-20180822]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Paul-Cercueil/irq-ingenic-Drop-dependency-on-MACH_INGENIC/20180822-083429
config: mips-allmodconfig (attached as .config)
compiler: mips-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=mips 

All warnings (new ones prefixed by >>):

   In file included from drivers//irqchip/irq-ingenic.c:30:0:
>> arch/mips/include/asm/mach-jz4740/irq.h:63:0: warning: "NR_IRQS" redefined
#define NR_IRQS (JZ4740_IRQ_ADC_BASE + 6)

   In file included from arch/mips/include/asm/irq.h:18:0,
from include/linux/irq.h:23,
from include/asm-generic/hardirq.h:13,
from arch/mips/include/asm/hardirq.h:16,
from include/linux/hardirq.h:9,
from include/linux/interrupt.h:11,
from drivers//irqchip/irq-ingenic.c:19:
   arch/mips/include/asm/mach-generic/irq.h:12:0: note: this is the location of 
the previous definition
#define NR_IRQS 128


vim +/NR_IRQS +63 arch/mips/include/asm/mach-jz4740/irq.h

9869848d Lars-Peter Clausen 2010-07-17  62  
9869848d Lars-Peter Clausen 2010-07-17 @63  #define NR_IRQS 
(JZ4740_IRQ_ADC_BASE + 6)
9869848d Lars-Peter Clausen 2010-07-17  64  

:: The code at line 63 was first introduced by commit
:: 9869848d12601cdddf097a36aebe0b10dc5d177b MIPS: JZ4740: Add IRQ handler 
code

:: TO: Lars-Peter Clausen 
:: CC: Ralf Baechle 

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


[PATCH] scsi/virio_scsi.c: do not call virtscsi_remove_vqs() in virtscsi_init() to avoid crash bug

2018-08-22 Thread piaojun
If some error happened before find_vqs, error branch will goto
virtscsi_remove_vqs to free vqs. Actually the vqs have not been allocated
successfully, so this will cause wild-pointer-free problem. So
virtscsi_remove_vqs could be deleted as no error will happen after
find_vqs.

Signed-off-by: Jun Piao 
---
 drivers/scsi/virtio_scsi.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 1c72db9..da0fd74 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -833,8 +833,6 @@ static int virtscsi_init(struct virtio_device *vdev,
kfree(names);
kfree(callbacks);
kfree(vqs);
-   if (err)
-   virtscsi_remove_vqs(vdev);
return err;
 }

-- 


[PATCH] scsi/virio_scsi.c: do not call virtscsi_remove_vqs() in virtscsi_init() to avoid crash bug

2018-08-22 Thread piaojun
If some error happened before find_vqs, error branch will goto
virtscsi_remove_vqs to free vqs. Actually the vqs have not been allocated
successfully, so this will cause wild-pointer-free problem. So
virtscsi_remove_vqs could be deleted as no error will happen after
find_vqs.

Signed-off-by: Jun Piao 
---
 drivers/scsi/virtio_scsi.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 1c72db9..da0fd74 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -833,8 +833,6 @@ static int virtscsi_init(struct virtio_device *vdev,
kfree(names);
kfree(callbacks);
kfree(vqs);
-   if (err)
-   virtscsi_remove_vqs(vdev);
return err;
 }

-- 


[GIT PULL] tracing: gcov and fixes to no kprobes on notrace functions

2018-08-22 Thread Steven Rostedt


Linus,

Masami found an off by one bug in the code that keeps "notrace" functions
from being traced by kprobes. During my testing, I found that there's places
that we may want to add kprobes to notrace, thus we may end up changing this
code before 4.19 is released. The history behind this change is that we
found that adding kprobes to various notrace functions caused the kernel to
crashed. We took the safe route and decided not to allow kprobes to trace
any notrace function. But because notrace is added to functions that just
cause weird side effects to the function tracer, but are still safe,
preventing kprobes for all notrace functions may be too much of a big
hammer. One such place is __schedule() is marked notrace, to keep
function tracer from doing strange recursive loops when it gets traced
with NEED_RESCHED set. With this change, one can not add kprobes to the
scheduler.

Masami also added code to use gcov on ftrace.


Please pull the latest trace-v4.19-1 tree, which can be found at:


  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
trace-v4.19-1

Tag SHA1: 808ce93b6ef6d4949eee7c48ab5e002142367393
Head SHA1: 9161a864ff88e800de50494da095af19832e9583


Masami Hiramatsu (2):
  tracing: Allow gcov profiling on only ftrace subsystem
  tracing/kprobes: Fix to check notrace function with correct range


 kernel/trace/Kconfig| 12 
 kernel/trace/Makefile   |  5 +
 kernel/trace/trace_kprobe.c |  9 -
 3 files changed, 25 insertions(+), 1 deletion(-)
---
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index fd6754b88f87..aec44d838283 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -774,6 +774,18 @@ config TRACING_EVENTS_GPIO
help
  Enable tracing events for gpio subsystem
 
+config GCOV_PROFILE_FTRACE
+   bool "Enable GCOV profiling on ftrace subsystem"
+   depends on GCOV_KERNEL
+   help
+ Enable GCOV profiling on ftrace subsystem for checking
+ which functions/lines are tested.
+
+ If unsure, say N.
+
+ Note that on a kernel compiled with this config, ftrace will
+ run significantly slower.
+
 endif # FTRACE
 
 endif # TRACING_SUPPORT
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 98d53b39a8ee..f81dadbc7c4a 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -23,6 +23,11 @@ ifdef CONFIG_TRACING_BRANCHES
 KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING
 endif
 
+# for GCOV coverage profiling
+ifdef CONFIG_GCOV_PROFILE_FTRACE
+GCOV_PROFILE := y
+endif
+
 CFLAGS_trace_benchmark.o := -I$(src)
 CFLAGS_trace_events_filter.o := -I$(src)
 
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 65a4157af851..ad384b31fe01 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -513,7 +513,14 @@ static bool within_notrace_func(struct trace_kprobe *tk)
if (!addr || !kallsyms_lookup_size_offset(addr, , ))
return false;
 
-   return !ftrace_location_range(addr - offset, addr - offset + size);
+   /* Get the entry address of the target function */
+   addr -= offset;
+
+   /*
+* Since ftrace_location_range() does inclusive range check, we need
+* to subtract 1 byte from the end address.
+*/
+   return !ftrace_location_range(addr, addr + size - 1);
 }
 #else
 #define within_notrace_func(tk)(false)


[GIT PULL] tracing: gcov and fixes to no kprobes on notrace functions

2018-08-22 Thread Steven Rostedt


Linus,

Masami found an off by one bug in the code that keeps "notrace" functions
from being traced by kprobes. During my testing, I found that there's places
that we may want to add kprobes to notrace, thus we may end up changing this
code before 4.19 is released. The history behind this change is that we
found that adding kprobes to various notrace functions caused the kernel to
crashed. We took the safe route and decided not to allow kprobes to trace
any notrace function. But because notrace is added to functions that just
cause weird side effects to the function tracer, but are still safe,
preventing kprobes for all notrace functions may be too much of a big
hammer. One such place is __schedule() is marked notrace, to keep
function tracer from doing strange recursive loops when it gets traced
with NEED_RESCHED set. With this change, one can not add kprobes to the
scheduler.

Masami also added code to use gcov on ftrace.


Please pull the latest trace-v4.19-1 tree, which can be found at:


  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
trace-v4.19-1

Tag SHA1: 808ce93b6ef6d4949eee7c48ab5e002142367393
Head SHA1: 9161a864ff88e800de50494da095af19832e9583


Masami Hiramatsu (2):
  tracing: Allow gcov profiling on only ftrace subsystem
  tracing/kprobes: Fix to check notrace function with correct range


 kernel/trace/Kconfig| 12 
 kernel/trace/Makefile   |  5 +
 kernel/trace/trace_kprobe.c |  9 -
 3 files changed, 25 insertions(+), 1 deletion(-)
---
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index fd6754b88f87..aec44d838283 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -774,6 +774,18 @@ config TRACING_EVENTS_GPIO
help
  Enable tracing events for gpio subsystem
 
+config GCOV_PROFILE_FTRACE
+   bool "Enable GCOV profiling on ftrace subsystem"
+   depends on GCOV_KERNEL
+   help
+ Enable GCOV profiling on ftrace subsystem for checking
+ which functions/lines are tested.
+
+ If unsure, say N.
+
+ Note that on a kernel compiled with this config, ftrace will
+ run significantly slower.
+
 endif # FTRACE
 
 endif # TRACING_SUPPORT
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 98d53b39a8ee..f81dadbc7c4a 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -23,6 +23,11 @@ ifdef CONFIG_TRACING_BRANCHES
 KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING
 endif
 
+# for GCOV coverage profiling
+ifdef CONFIG_GCOV_PROFILE_FTRACE
+GCOV_PROFILE := y
+endif
+
 CFLAGS_trace_benchmark.o := -I$(src)
 CFLAGS_trace_events_filter.o := -I$(src)
 
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 65a4157af851..ad384b31fe01 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -513,7 +513,14 @@ static bool within_notrace_func(struct trace_kprobe *tk)
if (!addr || !kallsyms_lookup_size_offset(addr, , ))
return false;
 
-   return !ftrace_location_range(addr - offset, addr - offset + size);
+   /* Get the entry address of the target function */
+   addr -= offset;
+
+   /*
+* Since ftrace_location_range() does inclusive range check, we need
+* to subtract 1 byte from the end address.
+*/
+   return !ftrace_location_range(addr, addr + size - 1);
 }
 #else
 #define within_notrace_func(tk)(false)


Re: linux-next: build warning after merge of the apparmor tree

2018-08-22 Thread John Johansen
On 08/22/2018 05:20 PM, Stephen Rothwell wrote:
> Hi John,
> 
> After merging the apparmor tree, today's linux-next build (x86_64
> allmodconfig) produced this warning:
> 
> security/apparmor/policy_unpack.c: In function 'unpack_dfa':
> security/apparmor/policy_unpack.c:426:1: warning: label 'fail' defined but 
> not used [-Wunused-label]
>  fail:
>  ^~~~
> 
> Introduced by commit
> 
>   fb5841091f28 ("apparmor: remove no-op permission check in policy_unpack")
> 
sorry
fixed


Re: linux-next: build warning after merge of the apparmor tree

2018-08-22 Thread John Johansen
On 08/22/2018 05:20 PM, Stephen Rothwell wrote:
> Hi John,
> 
> After merging the apparmor tree, today's linux-next build (x86_64
> allmodconfig) produced this warning:
> 
> security/apparmor/policy_unpack.c: In function 'unpack_dfa':
> security/apparmor/policy_unpack.c:426:1: warning: label 'fail' defined but 
> not used [-Wunused-label]
>  fail:
>  ^~~~
> 
> Introduced by commit
> 
>   fb5841091f28 ("apparmor: remove no-op permission check in policy_unpack")
> 
sorry
fixed


Re: Overview of performance improvements of recent SMB3 compounding patches

2018-08-22 Thread Ronnie Sahlberg
We also have stat() that drops from 6 to 2 roundtrips.

For metadata I think the only remaining low hanging fruit is readdir().
Currently in cifs.ko scanning a directory takes a minimum of 4 roundtrips :

open
query and get data
query and get the error STATUS_NO_MORE_FILES
close


For small-ish directories (the common case I expect will be that almost all 
directories will fit in one reply of data)
we can cut this down to just a single roundtrip by using the compound :

open; query; query; close

If the second query fails with the error above we know we are done and we 
finished in a single
roundtrip.

The drawback is for the rare cases where the directory did not fit in the 
reply, then
we will have to throw all the data away and re-start from scratch using the old 
loop thus making all large directory scans
even slower than they are right now, well just one roundtrip slower but still.
(Technically, the protocol does support re-opening a directory and resuming at 
a specific FileIndex  but it is not implemented server-side.)

I think it will still be worth it if we can get the common case to drop from 4 
to 1 roundtrip.

It will involve touching cifs_readdir() which is hairy code but if someone 
would want to tackle this I think there is opportunity to make
a huge impact on directory listing performance.


regards
ronnie sahlberg




- Original Message -
> From: "Steve French" 
> To: "CIFS" , "samba-technical" 
> , "LKML"
> , "linux-fsdevel" 
> , "ronnie sahlberg"
> 
> Sent: Thursday, 23 August, 2018 5:28:01 AM
> Subject: Re: Overview of performance improvements of recent SMB3 compounding 
> patches
> 
> Continuing the experiments with Ronnie's patches show additional
> promising performance results from other common scenarios:
> 
> Very good news that the number of roundtrips (request/response pairs
> to the server) has dropped so substantially.  Reducing latency, and
> allowing the server to more efficiently process the requests leads to
> much better performance for these common operations:
> 
> - rename goes from 9 request/response pairs to 5 ("mv /mnt/file /mnt/file1")
> - hardlink goes from 8 to only 3 (!)  ("ln /mnt/file1 /mnt/file2")
> - symlink (with mfsymlinks enabled) goes from 11 to 9 ("ln -s
> /mnt/file1 /mnt/file3")
> - touch (existing file) 6 down to 4
> 
> In current kernel we benefit from compounding now on stafs ("stat -f
> /mnt"), and in the earlier note I described the improvements in
> create, unlink, mkdir and rmdir which were also awesome.
> 
> This is very exciting.
> 
> On Tue, Aug 21, 2018 at 1:24 PM Steve French  wrote:
> >
> > In experiments today with Ronnie's most recent compounding patches I
> > see the expected significant improvements in create/mkdir/unlink/rmdir
> > operations over SMB3 mounts (tests were to Samba but would be similar
> > to all modern servers).  See below:
> >
> > "touch /mnt/file"  goes from 6 request/response pairs to 4 with
> > Ronnie's compounding patches
> > "rm /mnt/file" from 5 to 2 request/response pairs
> > "mkdir /mnt/newdir" 6 pairs to 3 pairs
> > "rmdir /mnt/newdir" 6 pairs down to 2 pairs
> >
> > Good job Ronnie!
> > --
> > Thanks,
> >
> > Steve
> 
> 
> 
> --
> Thanks,
> 
> Steve
> 


Re: Overview of performance improvements of recent SMB3 compounding patches

2018-08-22 Thread Ronnie Sahlberg
We also have stat() that drops from 6 to 2 roundtrips.

For metadata I think the only remaining low hanging fruit is readdir().
Currently in cifs.ko scanning a directory takes a minimum of 4 roundtrips :

open
query and get data
query and get the error STATUS_NO_MORE_FILES
close


For small-ish directories (the common case I expect will be that almost all 
directories will fit in one reply of data)
we can cut this down to just a single roundtrip by using the compound :

open; query; query; close

If the second query fails with the error above we know we are done and we 
finished in a single
roundtrip.

The drawback is for the rare cases where the directory did not fit in the 
reply, then
we will have to throw all the data away and re-start from scratch using the old 
loop thus making all large directory scans
even slower than they are right now, well just one roundtrip slower but still.
(Technically, the protocol does support re-opening a directory and resuming at 
a specific FileIndex  but it is not implemented server-side.)

I think it will still be worth it if we can get the common case to drop from 4 
to 1 roundtrip.

It will involve touching cifs_readdir() which is hairy code but if someone 
would want to tackle this I think there is opportunity to make
a huge impact on directory listing performance.


regards
ronnie sahlberg




- Original Message -
> From: "Steve French" 
> To: "CIFS" , "samba-technical" 
> , "LKML"
> , "linux-fsdevel" 
> , "ronnie sahlberg"
> 
> Sent: Thursday, 23 August, 2018 5:28:01 AM
> Subject: Re: Overview of performance improvements of recent SMB3 compounding 
> patches
> 
> Continuing the experiments with Ronnie's patches show additional
> promising performance results from other common scenarios:
> 
> Very good news that the number of roundtrips (request/response pairs
> to the server) has dropped so substantially.  Reducing latency, and
> allowing the server to more efficiently process the requests leads to
> much better performance for these common operations:
> 
> - rename goes from 9 request/response pairs to 5 ("mv /mnt/file /mnt/file1")
> - hardlink goes from 8 to only 3 (!)  ("ln /mnt/file1 /mnt/file2")
> - symlink (with mfsymlinks enabled) goes from 11 to 9 ("ln -s
> /mnt/file1 /mnt/file3")
> - touch (existing file) 6 down to 4
> 
> In current kernel we benefit from compounding now on stafs ("stat -f
> /mnt"), and in the earlier note I described the improvements in
> create, unlink, mkdir and rmdir which were also awesome.
> 
> This is very exciting.
> 
> On Tue, Aug 21, 2018 at 1:24 PM Steve French  wrote:
> >
> > In experiments today with Ronnie's most recent compounding patches I
> > see the expected significant improvements in create/mkdir/unlink/rmdir
> > operations over SMB3 mounts (tests were to Samba but would be similar
> > to all modern servers).  See below:
> >
> > "touch /mnt/file"  goes from 6 request/response pairs to 4 with
> > Ronnie's compounding patches
> > "rm /mnt/file" from 5 to 2 request/response pairs
> > "mkdir /mnt/newdir" 6 pairs to 3 pairs
> > "rmdir /mnt/newdir" 6 pairs down to 2 pairs
> >
> > Good job Ronnie!
> > --
> > Thanks,
> >
> > Steve
> 
> 
> 
> --
> Thanks,
> 
> Steve
> 


Re: [PATCH v5 0/4] x86/boot/KASLR: Parse ACPI table and limit kaslr in immovable memory.

2018-08-22 Thread Chao Fan
Hi all,

No reply for 2 weeks, any comments?

Thanks,
Chao Fan

On Tue, Aug 07, 2018 at 02:49:56PM +0800, Chao Fan wrote:
>***Background:
>People reported that kaslr may randomly chooses some positions
>which are located in movable memory regions. This will break memory
>hotplug feature and make the memory can't be removed.
>
>***Solutions:
>There should be a method to limit kaslr to choosing immovable memory
>regions, so there are 2 solutions:
>1) Add a kernel parameter to specify the memory regions.
>2) Get the information of memory hotremove, then kaslr will know the
>   right regions.
>In method 2, information about memory hot remove is in ACPI
>tables, which will be parsed after 'start_kernel', kaslr can't get
>the information.
>In method 1, users should know the regions address and specify in
>kernel parameter.
>
>In the earliest time, I tried to dig ACPI tabls to solve this problem.
>But I didn't splite the code in 'compressed/' and ACPI code, so the patch
>is hard to follow so refused by community.
>Somebody suggest to add a kernel parameter to specify the
>immovable memory so that limit kaslr in these regions. Then I make
>a patchset. After several versions, Ingo gave a suggestion:
>https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1634024.html
>Follow Ingo's suggestion, imitate the ACPI code to parse the acpi
>tables, so that the kaslr can get necessary memory information in
>ACPI tables.
>Since I think ACPI code is independent part, so copy the codes
>and functions to 'compressed/' directory, so that kaslr won't
>influence the initialization of ACPI.
>
>PATCH 1/4 Reuse the head file of linux/acpi.h, and copy a fcuntion from
>  ACPI code.
>PATCH 2/4 Functions to parse ACPI code.
>PATCH 3/4 If 'CONFIG_MEMORY_HOTREMOVE' specified, walk all nodes and
>  store the information of immovable memory regions.
>PATCH 4/4 According to the immovable memory regions, filter the
>  immovable regions which KASLR can choose.
>
>***Test results:
> - I did a very simple test, and it can get the memory information in
>   bios and efi KVM guest machine, and put it by early printk. But no
>   more tests, so it's with RFC tag.
>
>v1->v2:
> -  Simplify some code.
>Follow Baoquan He's suggestion:
> - Reuse the head file of acpi code.
>
>v2->v3:
> - Test in more conditions, so remove the 'RFC' tag.
> - Change some comments.
>
>v3->v4:
>Follow Thomas Gleixner's suggetsion:
> - Put the whole efi related function into #define CONFIG_EFI and return
>   false in the other stub.
> - Simplify two functions in head file.
>
>v4->v5:
>Follow Dou Liyang's suggestion:
> - Add more comments about some functions based on kernel code.
> - Change some typo in comments.
> - Clean useless variable.
> - Add check for the boundary of array.
> - Add check for 'movable_node' parameter
>
>Any comments will be welcome.
>
>
>Chao Fan (4):
>  x86/boot: Add acpitb.h to help parse acpi tables
>  x86/boot: Add acpitb.c to parse acpi tables
>  x86/boot/KASLR: Walk srat tables to filter immovable memory
>  x86/boot/KASLR: Limit kaslr to choosing the immovable memory
>
> arch/x86/boot/compressed/Makefile |   4 +
> arch/x86/boot/compressed/acpitb.c | 272 ++
> arch/x86/boot/compressed/acpitb.h |   7 +
> arch/x86/boot/compressed/kaslr.c  | 125 --
> 4 files changed, 397 insertions(+), 11 deletions(-)
> create mode 100644 arch/x86/boot/compressed/acpitb.c
> create mode 100644 arch/x86/boot/compressed/acpitb.h
>
>-- 
>2.17.1
>




Re: [PATCH v5 0/4] x86/boot/KASLR: Parse ACPI table and limit kaslr in immovable memory.

2018-08-22 Thread Chao Fan
Hi all,

No reply for 2 weeks, any comments?

Thanks,
Chao Fan

On Tue, Aug 07, 2018 at 02:49:56PM +0800, Chao Fan wrote:
>***Background:
>People reported that kaslr may randomly chooses some positions
>which are located in movable memory regions. This will break memory
>hotplug feature and make the memory can't be removed.
>
>***Solutions:
>There should be a method to limit kaslr to choosing immovable memory
>regions, so there are 2 solutions:
>1) Add a kernel parameter to specify the memory regions.
>2) Get the information of memory hotremove, then kaslr will know the
>   right regions.
>In method 2, information about memory hot remove is in ACPI
>tables, which will be parsed after 'start_kernel', kaslr can't get
>the information.
>In method 1, users should know the regions address and specify in
>kernel parameter.
>
>In the earliest time, I tried to dig ACPI tabls to solve this problem.
>But I didn't splite the code in 'compressed/' and ACPI code, so the patch
>is hard to follow so refused by community.
>Somebody suggest to add a kernel parameter to specify the
>immovable memory so that limit kaslr in these regions. Then I make
>a patchset. After several versions, Ingo gave a suggestion:
>https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1634024.html
>Follow Ingo's suggestion, imitate the ACPI code to parse the acpi
>tables, so that the kaslr can get necessary memory information in
>ACPI tables.
>Since I think ACPI code is independent part, so copy the codes
>and functions to 'compressed/' directory, so that kaslr won't
>influence the initialization of ACPI.
>
>PATCH 1/4 Reuse the head file of linux/acpi.h, and copy a fcuntion from
>  ACPI code.
>PATCH 2/4 Functions to parse ACPI code.
>PATCH 3/4 If 'CONFIG_MEMORY_HOTREMOVE' specified, walk all nodes and
>  store the information of immovable memory regions.
>PATCH 4/4 According to the immovable memory regions, filter the
>  immovable regions which KASLR can choose.
>
>***Test results:
> - I did a very simple test, and it can get the memory information in
>   bios and efi KVM guest machine, and put it by early printk. But no
>   more tests, so it's with RFC tag.
>
>v1->v2:
> -  Simplify some code.
>Follow Baoquan He's suggestion:
> - Reuse the head file of acpi code.
>
>v2->v3:
> - Test in more conditions, so remove the 'RFC' tag.
> - Change some comments.
>
>v3->v4:
>Follow Thomas Gleixner's suggetsion:
> - Put the whole efi related function into #define CONFIG_EFI and return
>   false in the other stub.
> - Simplify two functions in head file.
>
>v4->v5:
>Follow Dou Liyang's suggestion:
> - Add more comments about some functions based on kernel code.
> - Change some typo in comments.
> - Clean useless variable.
> - Add check for the boundary of array.
> - Add check for 'movable_node' parameter
>
>Any comments will be welcome.
>
>
>Chao Fan (4):
>  x86/boot: Add acpitb.h to help parse acpi tables
>  x86/boot: Add acpitb.c to parse acpi tables
>  x86/boot/KASLR: Walk srat tables to filter immovable memory
>  x86/boot/KASLR: Limit kaslr to choosing the immovable memory
>
> arch/x86/boot/compressed/Makefile |   4 +
> arch/x86/boot/compressed/acpitb.c | 272 ++
> arch/x86/boot/compressed/acpitb.h |   7 +
> arch/x86/boot/compressed/kaslr.c  | 125 --
> 4 files changed, 397 insertions(+), 11 deletions(-)
> create mode 100644 arch/x86/boot/compressed/acpitb.c
> create mode 100644 arch/x86/boot/compressed/acpitb.h
>
>-- 
>2.17.1
>




Re: [BUGFIX PATCH] tracing/kprobes: Fix to check notrace function with correct range

2018-08-22 Thread Steven Rostedt
On Thu, 23 Aug 2018 10:18:31 +0900
Masami Hiramatsu  wrote:

> On Wed, 22 Aug 2018 08:58:09 -0400
> Steven Rostedt  wrote:
> 
> > On Tue, 21 Aug 2018 09:42:49 -0400
> > Steven Rostedt  wrote:
> >   
> > > On Tue, 21 Aug 2018 22:04:57 +0900
> > > Masami Hiramatsu  wrote:
> > >   
> > > > Fix within_notrace_func() to check notrace function correctly.
> > > > 
> > > > Since the ftrace_location_range(start, end) function checks
> > > > the range inclusively (start <= ftrace-loc <= end), the end
> > > > address must not include the entry address of next function.
> > > > 
> > > > However, within_notrace_func() uses kallsyms_lookup_size_offset()
> > > > to get the function size and calculate the end address from
> > > > adding the size to the entry address. This means the end address
> > > > is the entry address of the next function.
> > > > 
> > > > In the result, within_notrace_func() fails to find notrace
> > > > function if the next function of the target function is
> > > > ftraced.
> > > > 
> > > > Let's subtract 1 from the end address so that ftrace_location_range()
> > > > can check it correctly.
> > > > 
> > > > Fixes: commit 45408c4f9250 ("tracing: kprobes: Prohibit probing on 
> > > > notrace function")
> > > > Signed-off-by: Masami Hiramatsu 
> > > > Reported-by: Michael Rodin 
> > > > ---
> > > >
> > > 
> > > Applied. Thanks Masami! I'll start testing this and send it upstream
> > > when it's finished.
> > >   
> > 
> > Hmm, this fix shows the extent of not tracing functions with notrace
> > attched much deeper. For one thing, we can't hook kprobes to the
> > __schedule() function (which is now what all the main schedule
> > functions call). One of my tests used this function to test kprobes and
> > it failed.
> > 
> > I'll push this to Linus, but I'm wondering if we want to perhaps add a
> > white list of functions marked "notrace" but still kprobes can trace?  
> 
> Yes, that is what I'm thinking about. Maybe most of notrace function is
> safe for kprobes (like __schedule()). And if it is really critical, those
> functions must be marked by NOKPROBE_SYMBOL().
> 

Correct. This notrace was added because we want to not trace any
function call between preempt_schedule() and where NEED_RESCHED is
cleared. Because the function tracer does a preempt_enable_notrace()
which triggers the preempt_schedule() call again, and we can end up in
a recursive loop. But that is fixed, but by tracing __schedule() we do
a song and dance of calling preempt_schedule_notrace and makes the
trace output weird.

See commit 499d79559ffe4.

-- Steve


Re: [BUGFIX PATCH] tracing/kprobes: Fix to check notrace function with correct range

2018-08-22 Thread Steven Rostedt
On Thu, 23 Aug 2018 10:18:31 +0900
Masami Hiramatsu  wrote:

> On Wed, 22 Aug 2018 08:58:09 -0400
> Steven Rostedt  wrote:
> 
> > On Tue, 21 Aug 2018 09:42:49 -0400
> > Steven Rostedt  wrote:
> >   
> > > On Tue, 21 Aug 2018 22:04:57 +0900
> > > Masami Hiramatsu  wrote:
> > >   
> > > > Fix within_notrace_func() to check notrace function correctly.
> > > > 
> > > > Since the ftrace_location_range(start, end) function checks
> > > > the range inclusively (start <= ftrace-loc <= end), the end
> > > > address must not include the entry address of next function.
> > > > 
> > > > However, within_notrace_func() uses kallsyms_lookup_size_offset()
> > > > to get the function size and calculate the end address from
> > > > adding the size to the entry address. This means the end address
> > > > is the entry address of the next function.
> > > > 
> > > > In the result, within_notrace_func() fails to find notrace
> > > > function if the next function of the target function is
> > > > ftraced.
> > > > 
> > > > Let's subtract 1 from the end address so that ftrace_location_range()
> > > > can check it correctly.
> > > > 
> > > > Fixes: commit 45408c4f9250 ("tracing: kprobes: Prohibit probing on 
> > > > notrace function")
> > > > Signed-off-by: Masami Hiramatsu 
> > > > Reported-by: Michael Rodin 
> > > > ---
> > > >
> > > 
> > > Applied. Thanks Masami! I'll start testing this and send it upstream
> > > when it's finished.
> > >   
> > 
> > Hmm, this fix shows the extent of not tracing functions with notrace
> > attched much deeper. For one thing, we can't hook kprobes to the
> > __schedule() function (which is now what all the main schedule
> > functions call). One of my tests used this function to test kprobes and
> > it failed.
> > 
> > I'll push this to Linus, but I'm wondering if we want to perhaps add a
> > white list of functions marked "notrace" but still kprobes can trace?  
> 
> Yes, that is what I'm thinking about. Maybe most of notrace function is
> safe for kprobes (like __schedule()). And if it is really critical, those
> functions must be marked by NOKPROBE_SYMBOL().
> 

Correct. This notrace was added because we want to not trace any
function call between preempt_schedule() and where NEED_RESCHED is
cleared. Because the function tracer does a preempt_enable_notrace()
which triggers the preempt_schedule() call again, and we can end up in
a recursive loop. But that is fixed, but by tracing __schedule() we do
a song and dance of calling preempt_schedule_notrace and makes the
trace output weird.

See commit 499d79559ffe4.

-- Steve


Re: [RFC PATCH 1/2] x86: WARN() when uaccess helpers fault on kernel addresses

2018-08-22 Thread Andy Lutomirski
On Wed, Aug 22, 2018 at 5:55 PM, Jann Horn  wrote:
> On Thu, Aug 23, 2018 at 2:28 AM Andy Lutomirski  wrote:
>>
>> On Wed, Aug 22, 2018 at 4:53 PM, Jann Horn  wrote:
>> > On Tue, Aug 7, 2018 at 4:55 AM Andy Lutomirski  wrote:
>> >> > On Aug 6, 2018, at 6:22 PM, Jann Horn  wrote:
>> >> > There have been multiple kernel vulnerabilities that permitted 
>> >> > userspace to
>> >> > pass completely unchecked pointers through to userspace accessors:
>> >> >
>> >> > - the waitid() bug - commit 96ca579a1ecc ("waitid(): Add missing
>> >> >  access_ok() checks")
>> >> > - the sg/bsg read/write APIs
>> >> > - the infiniband read/write APIs
>> >> >
>> >> > These don't happen all that often, but when they do happen, it is hard 
>> >> > to
>> >> > test for them properly; and it is probably also hard to discover them 
>> >> > with
>> >> > fuzzing. Even when an unmapped kernel address is supplied to such buggy
>> >> > code, it just returns -EFAULT instead of doing a proper BUG() or at 
>> >> > least
>> >> > WARN().
>> >> >
>> >> > This patch attempts to make such misbehaving code a bit more visible by
>> >> > WARN()ing in the pagefault handler code when a userspace accessor causes
>> >> > #PF on a kernel address and the current context isn't whitelisted.
>> >>
>> >> I like this a lot, and, in fact, I once wrote a patch to do something 
>> >> similar. It was before the fancy extable code, though, so it was a mess.  
>> >> Here are some thoughts:
>> >>
>> >> - It should be three patches. One patch to add the _UA annotations, one 
>> >> to improve the info passes to the handlers, and one to change behavior.
>> >>
>> >> - You should pass the vector, the error code, and the address to the 
>> >> handler.
>> >
>> > I'm polishing the patch a bit, and I've noticed that to plumb the
>> > error code and address through properly, I might need significantly
>> > more code churn because of kprobes - I want to make sure I'm not going
>> > down the completely wrong path here.
>> >
>> > I'm extending fixup_exception() to take two extra args "unsigned long
>> > error_code, unsigned long fault_addr". Most callers of
>> > fixup_exception() are fairly straightforward, but
>> > kprobe_fault_handler() has a dozen callchains from different exception
>> > handlers, and most of them are coming via notify_die().
>>
>> KILL IT WITH FIRE
>>
>> More seriously, kill kprobe_exceptions_notify() and just fold the
>> contents into do_general_protection().  This notifier chain crap is
>> incomprehensible.  I would love to see notify_die() removed entirely,
>> and crap like this is just more reason to want it gone.
>
> This isn't just do_general_protection() though, that's just one
> example. As far as I can tell, similar stuff happens everywhere where
> notify_die() is used - #DF, #BR, #BP, #MF and so on.

True.  But there aren't actually that many places in the kernel that
use die notifiers at all.  Here's the complete list, excluding non-x86
arch-specific ones, annotated a bit:

arch/x86/kernel/kgdb.c:retval = register_die_notifier(_notifier);
arch/x86/kernel/kgdb.c:unregister_die_notifier(_notifier);
arch/x86/kernel/kgdb.c:unregister_die_notifier(_notifier);

OK, maybe not totally crazy for kgdb.

arch/x86/mm/kasan_init_64.c:register_die_notifier(_die_notifier);

Should be in traps.c directly.

arch/x86/mm/kmmio.c:return register_die_notifier(_die);
arch/x86/mm/kmmio.c:unregister_die_notifier(_die);

Should probably be in traps.c directly.

drivers/bus/brcmstb_gisb.c:register_die_notifier(_die_notifier);

I don't know WTF this is, but it is certainly garbage if anyone ever
tries to build this on x86.

drivers/dma/sh/shdmac.c:int err =
register_die_notifier(_dmae_nmi_notifier);
drivers/dma/sh/shdmac.c:unregister_die_notifier(_dmae_nmi_notifier);

SH-specific, I think.  Not really sure.

drivers/firmware/google/gsmi.c:register_die_notifier(_die_notifier);
drivers/firmware/google/gsmi.c:unregister_die_notifier(_die_notifier);

This is actually an *oops* notifier.  That's totally reasonable, but
it should be a separate OOPS notification chain.

drivers/hv/vmbus_drv.c:register_die_notifier(_die_block);
drivers/hv/vmbus_drv.c:unregister_die_notifier(_die_block);

Appears to *want* to be an OOPS notifier, but it appears to be rather
buggy and to actually catch everything.

drivers/misc/sgi-xp/xpc_main.c:
(void)unregister_die_notifier(_die_notifier);
drivers/misc/sgi-xp/xpc_main.c:ret =
register_die_notifier(_die_notifier);
drivers/misc/sgi-xp/xpc_main.c:
(void)unregister_die_notifier(_die_notifier);

Haven't looked.

kernel/events/hw_breakpoint.c:return
register_die_notifier(_breakpoint_exceptions_nb);

This is an utter abomination and needs to be killed.  The #DB code is
gnarly enough without this particular indirection.  I want to kill it
on x86.  I have a big #DB cleanup series.  Maybe I'll do it.

kernel/events/uprobes.c:return register_die_notifier(_exception_nb);

For 

Re: [RFC PATCH 1/2] x86: WARN() when uaccess helpers fault on kernel addresses

2018-08-22 Thread Andy Lutomirski
On Wed, Aug 22, 2018 at 5:55 PM, Jann Horn  wrote:
> On Thu, Aug 23, 2018 at 2:28 AM Andy Lutomirski  wrote:
>>
>> On Wed, Aug 22, 2018 at 4:53 PM, Jann Horn  wrote:
>> > On Tue, Aug 7, 2018 at 4:55 AM Andy Lutomirski  wrote:
>> >> > On Aug 6, 2018, at 6:22 PM, Jann Horn  wrote:
>> >> > There have been multiple kernel vulnerabilities that permitted 
>> >> > userspace to
>> >> > pass completely unchecked pointers through to userspace accessors:
>> >> >
>> >> > - the waitid() bug - commit 96ca579a1ecc ("waitid(): Add missing
>> >> >  access_ok() checks")
>> >> > - the sg/bsg read/write APIs
>> >> > - the infiniband read/write APIs
>> >> >
>> >> > These don't happen all that often, but when they do happen, it is hard 
>> >> > to
>> >> > test for them properly; and it is probably also hard to discover them 
>> >> > with
>> >> > fuzzing. Even when an unmapped kernel address is supplied to such buggy
>> >> > code, it just returns -EFAULT instead of doing a proper BUG() or at 
>> >> > least
>> >> > WARN().
>> >> >
>> >> > This patch attempts to make such misbehaving code a bit more visible by
>> >> > WARN()ing in the pagefault handler code when a userspace accessor causes
>> >> > #PF on a kernel address and the current context isn't whitelisted.
>> >>
>> >> I like this a lot, and, in fact, I once wrote a patch to do something 
>> >> similar. It was before the fancy extable code, though, so it was a mess.  
>> >> Here are some thoughts:
>> >>
>> >> - It should be three patches. One patch to add the _UA annotations, one 
>> >> to improve the info passes to the handlers, and one to change behavior.
>> >>
>> >> - You should pass the vector, the error code, and the address to the 
>> >> handler.
>> >
>> > I'm polishing the patch a bit, and I've noticed that to plumb the
>> > error code and address through properly, I might need significantly
>> > more code churn because of kprobes - I want to make sure I'm not going
>> > down the completely wrong path here.
>> >
>> > I'm extending fixup_exception() to take two extra args "unsigned long
>> > error_code, unsigned long fault_addr". Most callers of
>> > fixup_exception() are fairly straightforward, but
>> > kprobe_fault_handler() has a dozen callchains from different exception
>> > handlers, and most of them are coming via notify_die().
>>
>> KILL IT WITH FIRE
>>
>> More seriously, kill kprobe_exceptions_notify() and just fold the
>> contents into do_general_protection().  This notifier chain crap is
>> incomprehensible.  I would love to see notify_die() removed entirely,
>> and crap like this is just more reason to want it gone.
>
> This isn't just do_general_protection() though, that's just one
> example. As far as I can tell, similar stuff happens everywhere where
> notify_die() is used - #DF, #BR, #BP, #MF and so on.

True.  But there aren't actually that many places in the kernel that
use die notifiers at all.  Here's the complete list, excluding non-x86
arch-specific ones, annotated a bit:

arch/x86/kernel/kgdb.c:retval = register_die_notifier(_notifier);
arch/x86/kernel/kgdb.c:unregister_die_notifier(_notifier);
arch/x86/kernel/kgdb.c:unregister_die_notifier(_notifier);

OK, maybe not totally crazy for kgdb.

arch/x86/mm/kasan_init_64.c:register_die_notifier(_die_notifier);

Should be in traps.c directly.

arch/x86/mm/kmmio.c:return register_die_notifier(_die);
arch/x86/mm/kmmio.c:unregister_die_notifier(_die);

Should probably be in traps.c directly.

drivers/bus/brcmstb_gisb.c:register_die_notifier(_die_notifier);

I don't know WTF this is, but it is certainly garbage if anyone ever
tries to build this on x86.

drivers/dma/sh/shdmac.c:int err =
register_die_notifier(_dmae_nmi_notifier);
drivers/dma/sh/shdmac.c:unregister_die_notifier(_dmae_nmi_notifier);

SH-specific, I think.  Not really sure.

drivers/firmware/google/gsmi.c:register_die_notifier(_die_notifier);
drivers/firmware/google/gsmi.c:unregister_die_notifier(_die_notifier);

This is actually an *oops* notifier.  That's totally reasonable, but
it should be a separate OOPS notification chain.

drivers/hv/vmbus_drv.c:register_die_notifier(_die_block);
drivers/hv/vmbus_drv.c:unregister_die_notifier(_die_block);

Appears to *want* to be an OOPS notifier, but it appears to be rather
buggy and to actually catch everything.

drivers/misc/sgi-xp/xpc_main.c:
(void)unregister_die_notifier(_die_notifier);
drivers/misc/sgi-xp/xpc_main.c:ret =
register_die_notifier(_die_notifier);
drivers/misc/sgi-xp/xpc_main.c:
(void)unregister_die_notifier(_die_notifier);

Haven't looked.

kernel/events/hw_breakpoint.c:return
register_die_notifier(_breakpoint_exceptions_nb);

This is an utter abomination and needs to be killed.  The #DB code is
gnarly enough without this particular indirection.  I want to kill it
on x86.  I have a big #DB cleanup series.  Maybe I'll do it.

kernel/events/uprobes.c:return register_die_notifier(_exception_nb);

For 

Re: [BUGFIX PATCH] tracing/kprobes: Fix to check notrace function with correct range

2018-08-22 Thread Masami Hiramatsu
On Wed, 22 Aug 2018 08:58:09 -0400
Steven Rostedt  wrote:

> On Tue, 21 Aug 2018 09:42:49 -0400
> Steven Rostedt  wrote:
> 
> > On Tue, 21 Aug 2018 22:04:57 +0900
> > Masami Hiramatsu  wrote:
> > 
> > > Fix within_notrace_func() to check notrace function correctly.
> > > 
> > > Since the ftrace_location_range(start, end) function checks
> > > the range inclusively (start <= ftrace-loc <= end), the end
> > > address must not include the entry address of next function.
> > > 
> > > However, within_notrace_func() uses kallsyms_lookup_size_offset()
> > > to get the function size and calculate the end address from
> > > adding the size to the entry address. This means the end address
> > > is the entry address of the next function.
> > > 
> > > In the result, within_notrace_func() fails to find notrace
> > > function if the next function of the target function is
> > > ftraced.
> > > 
> > > Let's subtract 1 from the end address so that ftrace_location_range()
> > > can check it correctly.
> > > 
> > > Fixes: commit 45408c4f9250 ("tracing: kprobes: Prohibit probing on 
> > > notrace function")
> > > Signed-off-by: Masami Hiramatsu 
> > > Reported-by: Michael Rodin 
> > > ---
> > >  
> > 
> > Applied. Thanks Masami! I'll start testing this and send it upstream
> > when it's finished.
> > 
> 
> Hmm, this fix shows the extent of not tracing functions with notrace
> attched much deeper. For one thing, we can't hook kprobes to the
> __schedule() function (which is now what all the main schedule
> functions call). One of my tests used this function to test kprobes and
> it failed.
> 
> I'll push this to Linus, but I'm wondering if we want to perhaps add a
> white list of functions marked "notrace" but still kprobes can trace?

Yes, that is what I'm thinking about. Maybe most of notrace function is
safe for kprobes (like __schedule()). And if it is really critical, those
functions must be marked by NOKPROBE_SYMBOL().

Thank you,

-- 
Masami Hiramatsu 


Re: [BUGFIX PATCH] tracing/kprobes: Fix to check notrace function with correct range

2018-08-22 Thread Masami Hiramatsu
On Wed, 22 Aug 2018 08:58:09 -0400
Steven Rostedt  wrote:

> On Tue, 21 Aug 2018 09:42:49 -0400
> Steven Rostedt  wrote:
> 
> > On Tue, 21 Aug 2018 22:04:57 +0900
> > Masami Hiramatsu  wrote:
> > 
> > > Fix within_notrace_func() to check notrace function correctly.
> > > 
> > > Since the ftrace_location_range(start, end) function checks
> > > the range inclusively (start <= ftrace-loc <= end), the end
> > > address must not include the entry address of next function.
> > > 
> > > However, within_notrace_func() uses kallsyms_lookup_size_offset()
> > > to get the function size and calculate the end address from
> > > adding the size to the entry address. This means the end address
> > > is the entry address of the next function.
> > > 
> > > In the result, within_notrace_func() fails to find notrace
> > > function if the next function of the target function is
> > > ftraced.
> > > 
> > > Let's subtract 1 from the end address so that ftrace_location_range()
> > > can check it correctly.
> > > 
> > > Fixes: commit 45408c4f9250 ("tracing: kprobes: Prohibit probing on 
> > > notrace function")
> > > Signed-off-by: Masami Hiramatsu 
> > > Reported-by: Michael Rodin 
> > > ---
> > >  
> > 
> > Applied. Thanks Masami! I'll start testing this and send it upstream
> > when it's finished.
> > 
> 
> Hmm, this fix shows the extent of not tracing functions with notrace
> attched much deeper. For one thing, we can't hook kprobes to the
> __schedule() function (which is now what all the main schedule
> functions call). One of my tests used this function to test kprobes and
> it failed.
> 
> I'll push this to Linus, but I'm wondering if we want to perhaps add a
> white list of functions marked "notrace" but still kprobes can trace?

Yes, that is what I'm thinking about. Maybe most of notrace function is
safe for kprobes (like __schedule()). And if it is really critical, those
functions must be marked by NOKPROBE_SYMBOL().

Thank you,

-- 
Masami Hiramatsu 


Re: [PATCH] include/linux/compiler*.h: make compiler-*.h mutually exclusive

2018-08-22 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 6:10 PM Dominique Martinet
 wrote:
>
> I'm not building linux directly, but BPF programs that indirectly uses
> clang with bcc

Oh, ok.

I _can_ test the basic build (just not get a working link and a
kernel) by just forcing it, so I guess I'll call that testing enough.

I'll push it out. It's just Nick's patch, but with the extra barrier
parenthesis removed.

  Linus


Re: [PATCH] include/linux/compiler*.h: make compiler-*.h mutually exclusive

2018-08-22 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 6:10 PM Dominique Martinet
 wrote:
>
> I'm not building linux directly, but BPF programs that indirectly uses
> clang with bcc

Oh, ok.

I _can_ test the basic build (just not get a working link and a
kernel) by just forcing it, so I guess I'll call that testing enough.

I'll push it out. It's just Nick's patch, but with the extra barrier
parenthesis removed.

  Linus


Re: [PATCH] include/linux/compiler*.h: make compiler-*.h mutually exclusive

2018-08-22 Thread Dominique Martinet
Linus Torvalds wrote on Wed, Aug 22, 2018:
> I've fixed that manually, but when I tried to test it I just hit the
> 
>   arch/x86/Makefile:179: *** Compiler lacks asm-goto support..  Stop.
> 
> error.
> 
> Do you have some experimental clang build with asm goto support? What
> version? Or is it just that you're building ARM, not x86?

I'm not building linux directly, but BPF programs that indirectly uses
clang with bcc

On fedora you can install bcc-tools and run
e.g. /usr/share/bcc/tools/execsnoop that will compile something at
runtime.
If the package isn't available for your distro you can just install bcc
and run the script from their repo[1]

[1] https://raw.githubusercontent.com/iovisor/bcc/master/tools/execsnoop.py


Thanks,
-- 
Dominique


Re: [PATCH] include/linux/compiler*.h: make compiler-*.h mutually exclusive

2018-08-22 Thread Dominique Martinet
Linus Torvalds wrote on Wed, Aug 22, 2018:
> I've fixed that manually, but when I tried to test it I just hit the
> 
>   arch/x86/Makefile:179: *** Compiler lacks asm-goto support..  Stop.
> 
> error.
> 
> Do you have some experimental clang build with asm goto support? What
> version? Or is it just that you're building ARM, not x86?

I'm not building linux directly, but BPF programs that indirectly uses
clang with bcc

On fedora you can install bcc-tools and run
e.g. /usr/share/bcc/tools/execsnoop that will compile something at
runtime.
If the package isn't available for your distro you can just install bcc
and run the script from their repo[1]

[1] https://raw.githubusercontent.com/iovisor/bcc/master/tools/execsnoop.py


Thanks,
-- 
Dominique


Re: please revert commit ce8556cca6 "kbuild: verify that $DEPMOD is installed" introduced in v4.18.4.

2018-08-22 Thread Randy Dunlap
On 08/22/2018 05:39 PM, Dmitry Torokhov wrote:
> On Wed, Aug 22, 2018 at 4:35 PM Randy Dunlap  wrote:
>>
>> On 08/22/2018 11:53 AM, H. Nikolaus Schaller wrote:
>>> This patch requires that /sbin/depmod is installed and installable on
>>> the build host.
>>>
>>> But not all build hosts for cross compiling Linux are Linux systems
>>> and are able to provide a working port of depmod, especially at the
>>> file patch /sbin/depmod.
>>>
>>> I use, for example, a Darwin system to cross compile Linux and I run
>>> depmod -a on the embedded system once, after installing a new Linux
>>> kernel there.
>>>
>>> I have no problem with seeing a warning, but aborting the build process
>>> is IMHO a bad idea since the previous behaviour didn't harm many people
>>> as far as I see. Probably 99% of people compiling Linux kernels do that
>>> on Linux and 99% of those have depmod installed for optimal operation of
>>> their build host. So IMHO printing the warning is good enough.
>>
>> Thanks for the report and sorry about the problem.
>>
>> I'm OK with changing the error to a warning.
>> Does the patch below work for you?
> 
> Why would one want a warning on a host that never runs "make modules_install"?
> Can this check be only done when we actually try to install modules?


So Nikolaus:  how do you provoke this problem that you are reporting?
It's not just a theoretical problem, is it?

The way that I read the top-level Makefile, this check for $DEPMOD only happens
when you run "make modules_install".

-- 
~Randy


Re: please revert commit ce8556cca6 "kbuild: verify that $DEPMOD is installed" introduced in v4.18.4.

2018-08-22 Thread Randy Dunlap
On 08/22/2018 05:39 PM, Dmitry Torokhov wrote:
> On Wed, Aug 22, 2018 at 4:35 PM Randy Dunlap  wrote:
>>
>> On 08/22/2018 11:53 AM, H. Nikolaus Schaller wrote:
>>> This patch requires that /sbin/depmod is installed and installable on
>>> the build host.
>>>
>>> But not all build hosts for cross compiling Linux are Linux systems
>>> and are able to provide a working port of depmod, especially at the
>>> file patch /sbin/depmod.
>>>
>>> I use, for example, a Darwin system to cross compile Linux and I run
>>> depmod -a on the embedded system once, after installing a new Linux
>>> kernel there.
>>>
>>> I have no problem with seeing a warning, but aborting the build process
>>> is IMHO a bad idea since the previous behaviour didn't harm many people
>>> as far as I see. Probably 99% of people compiling Linux kernels do that
>>> on Linux and 99% of those have depmod installed for optimal operation of
>>> their build host. So IMHO printing the warning is good enough.
>>
>> Thanks for the report and sorry about the problem.
>>
>> I'm OK with changing the error to a warning.
>> Does the patch below work for you?
> 
> Why would one want a warning on a host that never runs "make modules_install"?
> Can this check be only done when we actually try to install modules?


So Nikolaus:  how do you provoke this problem that you are reporting?
It's not just a theoretical problem, is it?

The way that I read the top-level Makefile, this check for $DEPMOD only happens
when you run "make modules_install".

-- 
~Randy


  1   2   3   4   5   6   7   8   9   >