Re: WARNING in apparmor_secid_to_secctx

2018-09-01 Thread syzbot

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger  
crash:


Reported-and-tested-by:  
syzbot+21016130b0580a9de...@syzkaller.appspotmail.com


Tested on:

commit: 22dad84baabf apparmor: fix apparmor_secid_to_secctx incorr..
git tree:
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor.git/4.18-syzbot-secid

kernel config:  https://syzkaller.appspot.com/x/.config?x=a7c5a36688323465
compiler:   gcc (GCC) 8.0.1 20180413 (experimental)

Note: testing is done by a robot and is best-effort only.


Re: Re: WARNING in apparmor_secid_to_secctx

2018-09-01 Thread Dmitry Vyukov
On Sun, Sep 2, 2018 at 7:03 AM, syzbot
 wrote:
>> On Sun, Sep 2, 2018 at 6:52 AM, John Johansen
>>  wrote:
>>>
>>> On 09/01/2018 09:33 PM, Dmitry Vyukov wrote:

 On Sat, Sep 1, 2018 at 11:18 AM, John Johansen
  wrote:
>
> On 08/29/2018 07:17 PM, syzbot wrote:
>>
>> Hello,
>
>
>> syzbot found the following crash on:
>
>
>> HEAD commit:817e60a7a2bb Merge branch 'nfp-add-NFP5000-support'
>> git tree:   net-next
>> console output:
>> https://syzkaller.appspot.com/x/log.txt?x=1536d29640
>> kernel config:
>> https://syzkaller.appspot.com/x/.config?x=531a917630d2a492
>> dashboard link:
>> https://syzkaller.appspot.com/bug?extid=21016130b0580a9de3b5
>> compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
>
>
>> Unfortunately, I don't have any reproducer for this crash yet.
>
>
>> IMPORTANT: if you fix the bug, please add the following tag to the
>> commit:
>> Reported-by: syzbot+21016130b0580a9de...@syzkaller.appspotmail.com
>
>
>
> << snip >>
>
>
> Patch sent directly to syzbot for testing
>
>
 Hi John,
>
>
 What do you mean? syzbot has not received any test requests for this,
 and it would reply within half an hour or so. Where is that patch?
>
>
>
>>> Hrmmm strange I followed the web instruction and attached the patch to
>>> the reply. The patch is below, it's also available at
>
>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
>>> 4.18-syzbot-secid
>
>
>> Humm.. Maybe you did not send it to syzbot?  The command should be just:
>
>
>> #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
>
>
> "git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor" does not
> look like a valid git repo address.
>
>
>> 4.18-syzbot-secid

I guess the repo is:

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor.git
4.18-syzbot-secid


>>> ---
>
>
>>>  From 22dad84baabf4174f11f5e9b34a05529084fa29c Mon Sep 17 00:00:00 2001
>>> From: John Johansen 
>>> Date: Sat, 1 Sep 2018 01:57:52 -0700
>>> Subject: [PATCH] apparmor: fix apparmor_secid_to_secctx incorrect debug
>>>   triggering  WARN_ON
>
>
>>> apparmor_secid_to_secctx() has a bad debug statement tripping on a
>>> condition handled by the code.  When kconfig SECURITY_APPARMOR_DEBUG is
>>> enabled the debug WARN_ON will trip when **secdata is NULL resulting
>>> in the following trace.
>
>
>>> [ cut here ]
>>> AppArmor WARN apparmor_secid_to_secctx: ((!secdata)):
>>> WARNING: CPU: 0 PID: 14826 at security/apparmor/secid.c:82
>>> apparmor_secid_to_secctx+0x2b5/0x2f0 security/apparmor/secid.c:82
>>> Kernel panic - not syncing: panic_on_warn set ...
>
>
>>> CPU: 0 PID: 14826 Comm: syz-executor1 Not tainted 4.19.0-rc1+ #193
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>>> Google 01/01/2011
>>> Call Trace:
>>>   __dump_stack lib/dump_stack.c:77 [inline]
>>>   dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
>>>   panic+0x238/0x4e7 kernel/panic.c:184
>>>   __warn.cold.8+0x163/0x1ba kernel/panic.c:536
>>>   report_bug+0x252/0x2d0 lib/bug.c:186
>>>   fixup_bug arch/x86/kernel/traps.c:178 [inline]
>>>   do_error_trap+0x1fc/0x4d0 arch/x86/kernel/traps.c:296
>>>   do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:316
>>>   invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:993
>>> RIP: 0010:apparmor_secid_to_secctx+0x2b5/0x2f0
>>> security/apparmor/secid.c:82
>>> Code: c7 c7 40 66 58 87 e8 6a 6d 0f fe 0f 0b e9 6c fe ff ff e8 3e aa 44
>>> fe 48 c7 c6 80 67 58 87 48 c7 c7 a0 65 58 87 e8 4b 6d 0f fe <0f> 0b e9 3f fe
>>> ff ff 48 89 df e8 fc a7 83 fe e9 ed fe ff ff bb f4
>>> RSP: 0018:8801ba1bed10 EFLAGS: 00010286
>>> RAX:  RBX: 8801ba1beed0 RCX: c9000227e000
>>> RDX: 00018482 RSI: 8163ac01 RDI: 0001
>>> RBP: 8801ba1bed30 R08: 8801b80ec080 R09: ed003b603eca
>>> R10: ed003b603eca R11: 8801db01f657 R12: 0001
>>> R13:  R14:  R15: 8801ba1beed0
>>>   security_secid_to_secctx+0x63/0xc0 security/security.c:1314
>>>   ctnetlink_secctx_size net/netfilter/nf_conntrack_netlink.c:621 [inline]
>>>   ctnetlink_nlmsg_size net/netfilter/nf_conntrack_netlink.c:659 [inline]
>>>   ctnetlink_conntrack_event+0x303/0x1470
>>> net/netfilter/nf_conntrack_netlink.c:706
>>>   nf_conntrack_eventmask_report+0x55f/0x930
>>> net/netfilter/nf_conntrack_ecache.c:151
>>>   nf_conntrack_event_report
>>> include/net/netfilter/nf_conntrack_ecache.h:112 [inline]
>>>   nf_ct_delete+0x33c/0x5d0 net/netfilter/nf_conntrack_core.c:601
>>>   nf_ct_iterate_cleanup+0x48c/0x5e0
>>> net/netfilter/nf_conntrack_core.c:1892
>>>   nf_ct_iterate_cleanup_net+0x23c/0x2d0
>>> net/netfilter/nf_conntrack_core.c:1974
>>>   ctnetlink_flush_conntrack net/netfilter/nf_conntrack_netlink.c:1226
>>> [inline]
>>>   ctnetlink_del_conntrack+0x66c/0x850
>>> net/netfilter/nf_conntrack_netlink.c:1258
>>>   

Re: [PATCH V3] spi: spi-geni-qcom: Add SPI driver support for GENI based QUP

2018-09-01 Thread Doug Anderson
Hi,

On Fri, Aug 24, 2018 at 3:42 AM, Dilip Kota  wrote:
> From: Girish Mahadevan 
>
> This driver supports GENI based SPI Controller in the Qualcomm SOCs. The
> Qualcomm Generic Interface (GENI) is a programmable module supporting a
> wide range of serial interfaces including SPI. This driver supports SPI
> operations using FIFO mode of transfer.
>
> Signed-off-by: Girish Mahadevan 
> Signed-off-by: Dilip Kota 
> ---
> Addressing all the reviewer comments given in Patchset1.
> Summarizing all the comments below:
>
> MAKEFILE: Arrange SPI-GENI driver in alphabetical order
> Kconfig: Mark SPI_GENI driver dependent on QCOM_GENI_SE
> Enable SPI core auto runtime pm, and remove runtime pm calls.
> Remove spi_geni_unprepare_message(), spi_geni_unprepare_transfer_hardware()
> Remove likely/unlikely keywords.
> Remove get_spi_master() and use dev_get_drvdata()
> Move request_irq to probe()
> Mark bus number assignment to -1 as SPI core framework will assign dynamically
> Use devm_spi_register_master()
> Include platform_device.h instead of of_platform.h
> Removing macros which are used only once:
> #define SPI_NUM_CHIPSELECT 4
> #define SPI_XFER_TIMEOUT_MS 250
> Place Register field definitions next to respective Register definitions.
> Replace int and u32 declarations with unsigned int.
> Remove Hex numbers in debug prints.
> Declare mode as u16 in spi_setup_word_len()
> Remove the labels: setup_fifo_params_exit: exit_prepare_transfer_hardware:
> Declaring struct spi_master as spi everywhere in the file.
> Calling spi_finalize_current_transfer() for end of transfer.
> Hard code the SPI controller max frequency instead of reading from DTSI node.
> Spinlock not required, removed it.
> Removed unnecessary error prints.
> Fix KASAN error in geni_spi_isr().
> Remove spi-geni-qcom.h
> Remove inter words delay and CS to Clock toggle delay logic in the driver, as of now no clients are using it.
> Will submit this logic in the next patchset.
> Use major, minor and step macros to read from hardware version register.
>
>  .../devicetree/bindings/soc/qcom/qcom,geni-se.txt  |   2 -
>  drivers/spi/Kconfig                                |  12 +
>  drivers/spi/Makefile                               |   1 +
>  drivers/spi/spi-geni-qcom.c                        | 678 +
>  4 files changed, 691 insertions(+), 2 deletions(-)

See below for comments.  In general I've tried to post patches to
address my own comments.  See the series ending at
.
From there you can download patch files by using the "DOWNLOAD" link
at the bottom.  Yell if you have problems.  Hopefully that's useful.
I expect that you can squash many of these into your patch to give you
a leg up on v3.

NOTE: I won't promise that I made no mistakes on my fixup patches nor
that I caught everything or did everything right.  I'll plan to take a
fresh look at the whole patch when I see your v3.


> --- a/Documentation/devicetree/bindings/soc/qcom/qcom,geni-se.txt
> +++ b/Documentation/devicetree/bindings/soc/qcom/qcom,geni-se.txt
> @@ -60,7 +60,6 @@ Required properties:
>  - interrupts:  Must contain SPI controller interrupts.
>  - clock-names: Must contain "se".
>  - clocks:  Serial engine core clock needed by the device.
> -- spi-max-frequency:   Specifies maximum SPI clock frequency, units - Hz.

As per Rob's feedback, please split the device tree change into a
separate patch and justify it.  Perhaps the commit message could be
something like:

---

dt-bindings: spi: Remove spi-max-frequency from qcom,geni-se controller

No other SPI controllers have a "spi-max-frequency" at the controller
level.  The normal "spi-max-frequency" property is something that is
used when defining the nodes for SPI slaves.  While it is possible
that someone might want to define a controller-level max frequency it
should be done in other ways (perhaps by keying off a compatible
string?)

---

I think in the past Mark Brown has also requested that the bindings
actually live under "Documentation/devicetree/bindings/spi/", so
perhaps you should also add a patch to your series that moves this
documentation there and changes the "soc/qcom/qcom,geni-se.txt" to
reference that.


> +static irqreturn_t geni_spi_isr(int irq, void *data);
> +
> +struct spi_geni_master {
> +   struct geni_se se;
> +   unsigned int irq;

In v1 Stephen requested that many things in this struct become
"unsigned int", but he didn't mean the "irq".  Please change this back
to an int.  As you have things right now the code "if (spi_geni->irq <
0)" you have below is a no-op.  :(

...oh, and as Stephen pointed out to me offline 

Re: Re: WARNING in apparmor_secid_to_secctx

2018-09-01 Thread syzbot

On Sun, Sep 2, 2018 at 6:52 AM, John Johansen
 wrote:

On 09/01/2018 09:33 PM, Dmitry Vyukov wrote:

On Sat, Sep 1, 2018 at 11:18 AM, John Johansen
 wrote:

On 08/29/2018 07:17 PM, syzbot wrote:

Hello,



syzbot found the following crash on:



HEAD commit:817e60a7a2bb Merge branch 'nfp-add-NFP5000-support'
git tree:   net-next
console output:  
https://syzkaller.appspot.com/x/log.txt?x=1536d29640
kernel config:   
https://syzkaller.appspot.com/x/.config?x=531a917630d2a492
dashboard link:  
https://syzkaller.appspot.com/bug?extid=21016130b0580a9de3b5

compiler:   gcc (GCC) 8.0.1 20180413 (experimental)



Unfortunately, I don't have any reproducer for this crash yet.


IMPORTANT: if you fix the bug, please add the following tag to the  
commit:

Reported-by: syzbot+21016130b0580a9de...@syzkaller.appspotmail.com




<< snip >>



Patch sent directly to syzbot for testing



Hi John,



What do you mean? syzbot has not received any test requests for this,
and it would reply within half an hour or so. Where is that patch?



Hrmmm strange I followed the web instruction and attached the patch to the
reply. The patch is below, it's also available at


git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor  
4.18-syzbot-secid



Humm.. Maybe you did not send it to syzbot?  The command should be just:



#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor


"git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor" does not  
look like a valid git repo address.



4.18-syzbot-secid




---



 From 22dad84baabf4174f11f5e9b34a05529084fa29c Mon Sep 17 00:00:00 2001
From: John Johansen 
Date: Sat, 1 Sep 2018 01:57:52 -0700
Subject: [PATCH] apparmor: fix apparmor_secid_to_secctx incorrect debug
  triggering  WARN_ON



apparmor_secid_to_secctx() has a bad debug statement tripping on a
condition handled by the code.  When kconfig SECURITY_APPARMOR_DEBUG is
enabled the debug WARN_ON will trip when **secdata is NULL resulting
in the following trace.



[ cut here ]
AppArmor WARN apparmor_secid_to_secctx: ((!secdata)):
WARNING: CPU: 0 PID: 14826 at security/apparmor/secid.c:82  
apparmor_secid_to_secctx+0x2b5/0x2f0 security/apparmor/secid.c:82

Kernel panic - not syncing: panic_on_warn set ...



CPU: 0 PID: 14826 Comm: syz-executor1 Not tainted 4.19.0-rc1+ #193
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
  panic+0x238/0x4e7 kernel/panic.c:184
  __warn.cold.8+0x163/0x1ba kernel/panic.c:536
  report_bug+0x252/0x2d0 lib/bug.c:186
  fixup_bug arch/x86/kernel/traps.c:178 [inline]
  do_error_trap+0x1fc/0x4d0 arch/x86/kernel/traps.c:296
  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:316
  invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:993
RIP: 0010:apparmor_secid_to_secctx+0x2b5/0x2f0  
security/apparmor/secid.c:82
Code: c7 c7 40 66 58 87 e8 6a 6d 0f fe 0f 0b e9 6c fe ff ff e8 3e aa 44  
fe 48 c7 c6 80 67 58 87 48 c7 c7 a0 65 58 87 e8 4b 6d 0f fe <0f> 0b e9  
3f fe ff ff 48 89 df e8 fc a7 83 fe e9 ed fe ff ff bb f4

RSP: 0018:8801ba1bed10 EFLAGS: 00010286
RAX:  RBX: 8801ba1beed0 RCX: c9000227e000
RDX: 00018482 RSI: 8163ac01 RDI: 0001
RBP: 8801ba1bed30 R08: 8801b80ec080 R09: ed003b603eca
R10: ed003b603eca R11: 8801db01f657 R12: 0001
R13:  R14:  R15: 8801ba1beed0
  security_secid_to_secctx+0x63/0xc0 security/security.c:1314
  ctnetlink_secctx_size net/netfilter/nf_conntrack_netlink.c:621 [inline]
  ctnetlink_nlmsg_size net/netfilter/nf_conntrack_netlink.c:659 [inline]
  ctnetlink_conntrack_event+0x303/0x1470  
net/netfilter/nf_conntrack_netlink.c:706
  nf_conntrack_eventmask_report+0x55f/0x930  
net/netfilter/nf_conntrack_ecache.c:151
  nf_conntrack_event_report  
include/net/netfilter/nf_conntrack_ecache.h:112 [inline]

  nf_ct_delete+0x33c/0x5d0 net/netfilter/nf_conntrack_core.c:601
  nf_ct_iterate_cleanup+0x48c/0x5e0 net/netfilter/nf_conntrack_core.c:1892
  nf_ct_iterate_cleanup_net+0x23c/0x2d0  
net/netfilter/nf_conntrack_core.c:1974
  ctnetlink_flush_conntrack net/netfilter/nf_conntrack_netlink.c:1226  
[inline]
  ctnetlink_del_conntrack+0x66c/0x850  
net/netfilter/nf_conntrack_netlink.c:1258

  nfnetlink_rcv_msg+0xd88/0x1070 net/netfilter/nfnetlink.c:228
  netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
  nfnetlink_rcv+0x1c0/0x4d0 net/netfilter/nfnetlink.c:560
  netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
  netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343
  netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
  sock_sendmsg_nosec net/socket.c:621 [inline]
  sock_sendmsg+0xd5/0x120 net/socket.c:631
  ___sys_sendmsg+0x7fd/0x930 net/socket.c:2114
  __sys_sendmsg+0x11d/0x290 net/socket.c:2152
  __do_sys_sendmsg net/socket.c:2161 

Re: WARNING in apparmor_secid_to_secctx

2018-09-01 Thread Dmitry Vyukov
On Sun, Sep 2, 2018 at 6:52 AM, John Johansen
 wrote:
> On 09/01/2018 09:33 PM, Dmitry Vyukov wrote:
>> On Sat, Sep 1, 2018 at 11:18 AM, John Johansen
>>  wrote:
>>> On 08/29/2018 07:17 PM, syzbot wrote:
 Hello,

 syzbot found the following crash on:

 HEAD commit:817e60a7a2bb Merge branch 'nfp-add-NFP5000-support'
 git tree:   net-next
 console output: https://syzkaller.appspot.com/x/log.txt?x=1536d29640
 kernel config:  https://syzkaller.appspot.com/x/.config?x=531a917630d2a492
 dashboard link: 
 https://syzkaller.appspot.com/bug?extid=21016130b0580a9de3b5
 compiler:   gcc (GCC) 8.0.1 20180413 (experimental)

 Unfortunately, I don't have any reproducer for this crash yet.

 IMPORTANT: if you fix the bug, please add the following tag to the commit:
 Reported-by: syzbot+21016130b0580a9de...@syzkaller.appspotmail.com

>>>
>>> << snip >>
>>>
>>> Patch sent directly to syzbot for testing
>>
>> Hi John,
>>
>> What do you mean? syzbot has not received any test requests for this,
>> and it would reply within half an hour or so. Where is that patch?
>>
>
> Hrmmm strange I followed the web instruction and attached the patch to the
> reply. The patch is below, it's also available at
>
> git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor 
> 4.18-syzbot-secid

Humm.. Maybe you did not send it to syzbot?  The command should be just:

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
4.18-syzbot-secid


> ---
>
> From 22dad84baabf4174f11f5e9b34a05529084fa29c Mon Sep 17 00:00:00 2001
> From: John Johansen 
> Date: Sat, 1 Sep 2018 01:57:52 -0700
> Subject: [PATCH] apparmor: fix apparmor_secid_to_secctx incorrect debug
>  triggering  WARN_ON
>
> apparmor_secid_to_secctx() has a bad debug statement tripping on a
> condition handled by the code.  When kconfig SECURITY_APPARMOR_DEBUG is
> enabled the debug WARN_ON will trip when **secdata is NULL resulting
> in the following trace.
>
> [ cut here ]
> AppArmor WARN apparmor_secid_to_secctx: ((!secdata)):
> WARNING: CPU: 0 PID: 14826 at security/apparmor/secid.c:82 
> apparmor_secid_to_secctx+0x2b5/0x2f0 security/apparmor/secid.c:82
> Kernel panic - not syncing: panic_on_warn set ...
>
> CPU: 0 PID: 14826 Comm: syz-executor1 Not tainted 4.19.0-rc1+ #193
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
>  panic+0x238/0x4e7 kernel/panic.c:184
>  __warn.cold.8+0x163/0x1ba kernel/panic.c:536
>  report_bug+0x252/0x2d0 lib/bug.c:186
>  fixup_bug arch/x86/kernel/traps.c:178 [inline]
>  do_error_trap+0x1fc/0x4d0 arch/x86/kernel/traps.c:296
>  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:316
>  invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:993
> RIP: 0010:apparmor_secid_to_secctx+0x2b5/0x2f0 security/apparmor/secid.c:82
> Code: c7 c7 40 66 58 87 e8 6a 6d 0f fe 0f 0b e9 6c fe ff ff e8 3e aa 44 fe 48 
> c7 c6 80 67 58 87 48 c7 c7 a0 65 58 87 e8 4b 6d 0f fe <0f> 0b e9 3f fe ff ff 
> 48 89 df e8 fc a7 83 fe e9 ed fe ff ff bb f4
> RSP: 0018:8801ba1bed10 EFLAGS: 00010286
> RAX:  RBX: 8801ba1beed0 RCX: c9000227e000
> RDX: 00018482 RSI: 8163ac01 RDI: 0001
> RBP: 8801ba1bed30 R08: 8801b80ec080 R09: ed003b603eca
> R10: ed003b603eca R11: 8801db01f657 R12: 0001
> R13:  R14:  R15: 8801ba1beed0
>  security_secid_to_secctx+0x63/0xc0 security/security.c:1314
>  ctnetlink_secctx_size net/netfilter/nf_conntrack_netlink.c:621 [inline]
>  ctnetlink_nlmsg_size net/netfilter/nf_conntrack_netlink.c:659 [inline]
>  ctnetlink_conntrack_event+0x303/0x1470 
> net/netfilter/nf_conntrack_netlink.c:706
>  nf_conntrack_eventmask_report+0x55f/0x930 
> net/netfilter/nf_conntrack_ecache.c:151
>  nf_conntrack_event_report include/net/netfilter/nf_conntrack_ecache.h:112 
> [inline]
>  nf_ct_delete+0x33c/0x5d0 net/netfilter/nf_conntrack_core.c:601
>  nf_ct_iterate_cleanup+0x48c/0x5e0 net/netfilter/nf_conntrack_core.c:1892
>  nf_ct_iterate_cleanup_net+0x23c/0x2d0 net/netfilter/nf_conntrack_core.c:1974
>  ctnetlink_flush_conntrack net/netfilter/nf_conntrack_netlink.c:1226 [inline]
>  ctnetlink_del_conntrack+0x66c/0x850 net/netfilter/nf_conntrack_netlink.c:1258
>  nfnetlink_rcv_msg+0xd88/0x1070 net/netfilter/nfnetlink.c:228
>  netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
>  nfnetlink_rcv+0x1c0/0x4d0 net/netfilter/nfnetlink.c:560
>  netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
>  netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343
>  netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
>  sock_sendmsg_nosec net/socket.c:621 [inline]
>  sock_sendmsg+0xd5/0x120 net/socket.c:631
>  ___sys_sendmsg+0x7fd/0x930 net/socket.c:2114
>  __sys_sendmsg+0x11d/0x290 net/socket.c:2152

Re: Re: WARNING in apparmor_secid_to_secctx

2018-09-01 Thread syzbot

On Sun, Sep 2, 2018 at 6:52 AM, John Johansen
 wrote:

On 09/01/2018 09:33 PM, Dmitry Vyukov wrote:

On Sat, Sep 1, 2018 at 11:18 AM, John Johansen
 wrote:

On 08/29/2018 07:17 PM, syzbot wrote:

Hello,



syzbot found the following crash on:



HEAD commit:817e60a7a2bb Merge branch 'nfp-add-NFP5000-support'
git tree:   net-next
console output:  
https://syzkaller.appspot.com/x/log.txt?x=1536d29640
kernel config:   
https://syzkaller.appspot.com/x/.config?x=531a917630d2a492
dashboard link:  
https://syzkaller.appspot.com/bug?extid=21016130b0580a9de3b5

compiler:   gcc (GCC) 8.0.1 20180413 (experimental)



Unfortunately, I don't have any reproducer for this crash yet.


IMPORTANT: if you fix the bug, please add the following tag to the  
commit:

Reported-by: syzbot+21016130b0580a9de...@syzkaller.appspotmail.com




<< snip >>



Patch sent directly to syzbot for testing



Hi John,



What do you mean? syzbot has not received any test requests for this,
and it would reply within half an hour or so. Where is that patch?



Hrmmm strange I followed the web instruction and attached the patch to  
the

reply. The patch is below, its also available at


git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor  
4.18-syzbot-secid



Humm.. Maybe you did not send it to syzbot?  The command should be just:



#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor


"git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor" does not  
look like a valid git repo address.



4.18-syzbot-secid




---

From 22dad84baabf4174f11f5e9b34a05529084fa29c Mon Sep 17 00:00:00 2001
From: John Johansen
Date: Sat, 1 Sep 2018 01:57:52 -0700
Subject: [PATCH] apparmor: fix apparmor_secid_to_secctx incorrect debug
 triggering  WARN_ON

apparmor_secid_to_secctx() has a bad debug statement tripping on a
condition handle by the code.  When kconfig SECURITY_APPARMOR_DEBUG is
enabled the debug WARN_ON will trip when **secdata is NULL resulting
in the following trace.

[ cut here ]
AppArmor WARN apparmor_secid_to_secctx: ((!secdata)):
WARNING: CPU: 0 PID: 14826 at security/apparmor/secid.c:82 apparmor_secid_to_secctx+0x2b5/0x2f0 security/apparmor/secid.c:82
Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 14826 Comm: syz-executor1 Not tainted 4.19.0-rc1+ #193
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
 panic+0x238/0x4e7 kernel/panic.c:184
 __warn.cold.8+0x163/0x1ba kernel/panic.c:536
 report_bug+0x252/0x2d0 lib/bug.c:186
 fixup_bug arch/x86/kernel/traps.c:178 [inline]
 do_error_trap+0x1fc/0x4d0 arch/x86/kernel/traps.c:296
 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:316
 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:993
RIP: 0010:apparmor_secid_to_secctx+0x2b5/0x2f0 security/apparmor/secid.c:82
Code: c7 c7 40 66 58 87 e8 6a 6d 0f fe 0f 0b e9 6c fe ff ff e8 3e aa 44 fe 48 c7 c6 80 67 58 87 48 c7 c7 a0 65 58 87 e8 4b 6d 0f fe <0f> 0b e9 3f fe ff ff 48 89 df e8 fc a7 83 fe e9 ed fe ff ff bb f4
RSP: 0018:8801ba1bed10 EFLAGS: 00010286
RAX:  RBX: 8801ba1beed0 RCX: c9000227e000
RDX: 00018482 RSI: 8163ac01 RDI: 0001
RBP: 8801ba1bed30 R08: 8801b80ec080 R09: ed003b603eca
R10: ed003b603eca R11: 8801db01f657 R12: 0001
R13:  R14:  R15: 8801ba1beed0
 security_secid_to_secctx+0x63/0xc0 security/security.c:1314
 ctnetlink_secctx_size net/netfilter/nf_conntrack_netlink.c:621 [inline]
 ctnetlink_nlmsg_size net/netfilter/nf_conntrack_netlink.c:659 [inline]
 ctnetlink_conntrack_event+0x303/0x1470 net/netfilter/nf_conntrack_netlink.c:706
 nf_conntrack_eventmask_report+0x55f/0x930 net/netfilter/nf_conntrack_ecache.c:151
 nf_conntrack_event_report include/net/netfilter/nf_conntrack_ecache.h:112 [inline]
 nf_ct_delete+0x33c/0x5d0 net/netfilter/nf_conntrack_core.c:601
 nf_ct_iterate_cleanup+0x48c/0x5e0 net/netfilter/nf_conntrack_core.c:1892
 nf_ct_iterate_cleanup_net+0x23c/0x2d0 net/netfilter/nf_conntrack_core.c:1974
 ctnetlink_flush_conntrack net/netfilter/nf_conntrack_netlink.c:1226 [inline]
 ctnetlink_del_conntrack+0x66c/0x850 net/netfilter/nf_conntrack_netlink.c:1258
 nfnetlink_rcv_msg+0xd88/0x1070 net/netfilter/nfnetlink.c:228
 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
 nfnetlink_rcv+0x1c0/0x4d0 net/netfilter/nfnetlink.c:560
 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
 netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343
 netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
 sock_sendmsg_nosec net/socket.c:621 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:631
 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2114
 __sys_sendmsg+0x11d/0x290 net/socket.c:2152
 __do_sys_sendmsg net/socket.c:2161

Re: WARNING in apparmor_secid_to_secctx

2018-09-01 Thread Dmitry Vyukov
On Sun, Sep 2, 2018 at 6:52 AM, John Johansen wrote:
> On 09/01/2018 09:33 PM, Dmitry Vyukov wrote:
>> On Sat, Sep 1, 2018 at 11:18 AM, John Johansen wrote:
>>> On 08/29/2018 07:17 PM, syzbot wrote:
>>>> Hello,
>>>>
>>>> syzbot found the following crash on:
>>>>
>>>> HEAD commit:817e60a7a2bb Merge branch 'nfp-add-NFP5000-support'
>>>> git tree:   net-next
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1536d29640
>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=531a917630d2a492
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=21016130b0580a9de3b5
>>>> compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
>>>>
>>>> Unfortunately, I don't have any reproducer for this crash yet.
>>>>
>>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>>> Reported-by: syzbot+21016130b0580a9de...@syzkaller.appspotmail.com
>>>>
>>>
>>> << snip >>
>>>
>>> Patch sent directly to syzbot for testing
>>
>> Hi John,
>>
>> What do you mean? syzbot has not received any test requests for this,
>> and it would reply within half an hour or so. Where is that patch?
>>
>
> Hrmmm strange I followed the web instruction and attached the patch to the
> reply. The patch is below, its also available at
>
> git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor 
> 4.18-syzbot-secid

Humm.. Maybe you did not send it to syzbot?  The command should be just:

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
4.18-syzbot-secid


> ---
>
> From 22dad84baabf4174f11f5e9b34a05529084fa29c Mon Sep 17 00:00:00 2001
> From: John Johansen 
> Date: Sat, 1 Sep 2018 01:57:52 -0700
> Subject: [PATCH] apparmor: fix apparmor_secid_to_secctx incorrect debug
>  triggering  WARN_ON
>
> apparmor_secid_to_secctx() has a bad debug statement tripping on a
> condition handle by the code.  When kconfig SECURITY_APPARMOR_DEBUG is
> enabled the debug WARN_ON will trip when **secdata is NULL resulting
> in the following trace.
>
> [ cut here ]
> AppArmor WARN apparmor_secid_to_secctx: ((!secdata)):
> WARNING: CPU: 0 PID: 14826 at security/apparmor/secid.c:82 apparmor_secid_to_secctx+0x2b5/0x2f0 security/apparmor/secid.c:82
> Kernel panic - not syncing: panic_on_warn set ...
>
> CPU: 0 PID: 14826 Comm: syz-executor1 Not tainted 4.19.0-rc1+ #193
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
>  panic+0x238/0x4e7 kernel/panic.c:184
>  __warn.cold.8+0x163/0x1ba kernel/panic.c:536
>  report_bug+0x252/0x2d0 lib/bug.c:186
>  fixup_bug arch/x86/kernel/traps.c:178 [inline]
>  do_error_trap+0x1fc/0x4d0 arch/x86/kernel/traps.c:296
>  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:316
>  invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:993
> RIP: 0010:apparmor_secid_to_secctx+0x2b5/0x2f0 security/apparmor/secid.c:82
> Code: c7 c7 40 66 58 87 e8 6a 6d 0f fe 0f 0b e9 6c fe ff ff e8 3e aa 44 fe 48 c7 c6 80 67 58 87 48 c7 c7 a0 65 58 87 e8 4b 6d 0f fe <0f> 0b e9 3f fe ff ff 48 89 df e8 fc a7 83 fe e9 ed fe ff ff bb f4
> RSP: 0018:8801ba1bed10 EFLAGS: 00010286
> RAX:  RBX: 8801ba1beed0 RCX: c9000227e000
> RDX: 00018482 RSI: 8163ac01 RDI: 0001
> RBP: 8801ba1bed30 R08: 8801b80ec080 R09: ed003b603eca
> R10: ed003b603eca R11: 8801db01f657 R12: 0001
> R13:  R14:  R15: 8801ba1beed0
>  security_secid_to_secctx+0x63/0xc0 security/security.c:1314
>  ctnetlink_secctx_size net/netfilter/nf_conntrack_netlink.c:621 [inline]
>  ctnetlink_nlmsg_size net/netfilter/nf_conntrack_netlink.c:659 [inline]
>  ctnetlink_conntrack_event+0x303/0x1470 net/netfilter/nf_conntrack_netlink.c:706
>  nf_conntrack_eventmask_report+0x55f/0x930 net/netfilter/nf_conntrack_ecache.c:151
>  nf_conntrack_event_report include/net/netfilter/nf_conntrack_ecache.h:112 [inline]
>  nf_ct_delete+0x33c/0x5d0 net/netfilter/nf_conntrack_core.c:601
>  nf_ct_iterate_cleanup+0x48c/0x5e0 net/netfilter/nf_conntrack_core.c:1892
>  nf_ct_iterate_cleanup_net+0x23c/0x2d0 net/netfilter/nf_conntrack_core.c:1974
>  ctnetlink_flush_conntrack net/netfilter/nf_conntrack_netlink.c:1226 [inline]
>  ctnetlink_del_conntrack+0x66c/0x850 net/netfilter/nf_conntrack_netlink.c:1258
>  nfnetlink_rcv_msg+0xd88/0x1070 net/netfilter/nfnetlink.c:228
>  netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
>  nfnetlink_rcv+0x1c0/0x4d0 net/netfilter/nfnetlink.c:560
>  netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
>  netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343
>  netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
>  sock_sendmsg_nosec net/socket.c:621 [inline]
>  sock_sendmsg+0xd5/0x120 net/socket.c:631
>  ___sys_sendmsg+0x7fd/0x930 net/socket.c:2114
>  __sys_sendmsg+0x11d/0x290 net/socket.c:2152

Re: WARNING in apparmor_secid_to_secctx

2018-09-01 Thread John Johansen
On 09/01/2018 09:33 PM, Dmitry Vyukov wrote:
> On Sat, Sep 1, 2018 at 11:18 AM, John Johansen wrote:
>> On 08/29/2018 07:17 PM, syzbot wrote:
>>> Hello,
>>>
>>> syzbot found the following crash on:
>>>
>>> HEAD commit:817e60a7a2bb Merge branch 'nfp-add-NFP5000-support'
>>> git tree:   net-next
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1536d29640
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=531a917630d2a492
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=21016130b0580a9de3b5
>>> compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
>>>
>>> Unfortunately, I don't have any reproducer for this crash yet.
>>>
>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>> Reported-by: syzbot+21016130b0580a9de...@syzkaller.appspotmail.com
>>>
>>
>> << snip >>
>>
>> Patch sent directly to syzbot for testing
> 
> Hi John,
> 
> What do you mean? syzbot has not received any test requests for this,
> and it would reply within half an hour or so. Where is that patch?
> 

Hrmmm, strange. I followed the web instructions and attached the patch to the
reply. The patch is below; it's also available at

git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor 
4.18-syzbot-secid

---

From 22dad84baabf4174f11f5e9b34a05529084fa29c Mon Sep 17 00:00:00 2001
From: John Johansen 
Date: Sat, 1 Sep 2018 01:57:52 -0700
Subject: [PATCH] apparmor: fix apparmor_secid_to_secctx incorrect debug
 triggering  WARN_ON

apparmor_secid_to_secctx() has a bad debug statement tripping on a
condition handled by the code.  When kconfig SECURITY_APPARMOR_DEBUG is
enabled the debug WARN_ON will trip when **secdata is NULL resulting
in the following trace.

[ cut here ]
AppArmor WARN apparmor_secid_to_secctx: ((!secdata)):
WARNING: CPU: 0 PID: 14826 at security/apparmor/secid.c:82 apparmor_secid_to_secctx+0x2b5/0x2f0 security/apparmor/secid.c:82
Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 14826 Comm: syz-executor1 Not tainted 4.19.0-rc1+ #193
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
 panic+0x238/0x4e7 kernel/panic.c:184
 __warn.cold.8+0x163/0x1ba kernel/panic.c:536
 report_bug+0x252/0x2d0 lib/bug.c:186
 fixup_bug arch/x86/kernel/traps.c:178 [inline]
 do_error_trap+0x1fc/0x4d0 arch/x86/kernel/traps.c:296
 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:316
 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:993
RIP: 0010:apparmor_secid_to_secctx+0x2b5/0x2f0 security/apparmor/secid.c:82
Code: c7 c7 40 66 58 87 e8 6a 6d 0f fe 0f 0b e9 6c fe ff ff e8 3e aa 44 fe 48 c7 c6 80 67 58 87 48 c7 c7 a0 65 58 87 e8 4b 6d 0f fe <0f> 0b e9 3f fe ff ff 48 89 df e8 fc a7 83 fe e9 ed fe ff ff bb f4
RSP: 0018:8801ba1bed10 EFLAGS: 00010286
RAX:  RBX: 8801ba1beed0 RCX: c9000227e000
RDX: 00018482 RSI: 8163ac01 RDI: 0001
RBP: 8801ba1bed30 R08: 8801b80ec080 R09: ed003b603eca
R10: ed003b603eca R11: 8801db01f657 R12: 0001
R13:  R14:  R15: 8801ba1beed0
 security_secid_to_secctx+0x63/0xc0 security/security.c:1314
 ctnetlink_secctx_size net/netfilter/nf_conntrack_netlink.c:621 [inline]
 ctnetlink_nlmsg_size net/netfilter/nf_conntrack_netlink.c:659 [inline]
 ctnetlink_conntrack_event+0x303/0x1470 net/netfilter/nf_conntrack_netlink.c:706
 nf_conntrack_eventmask_report+0x55f/0x930 net/netfilter/nf_conntrack_ecache.c:151
 nf_conntrack_event_report include/net/netfilter/nf_conntrack_ecache.h:112 [inline]
 nf_ct_delete+0x33c/0x5d0 net/netfilter/nf_conntrack_core.c:601
 nf_ct_iterate_cleanup+0x48c/0x5e0 net/netfilter/nf_conntrack_core.c:1892
 nf_ct_iterate_cleanup_net+0x23c/0x2d0 net/netfilter/nf_conntrack_core.c:1974
 ctnetlink_flush_conntrack net/netfilter/nf_conntrack_netlink.c:1226 [inline]
 ctnetlink_del_conntrack+0x66c/0x850 net/netfilter/nf_conntrack_netlink.c:1258
 nfnetlink_rcv_msg+0xd88/0x1070 net/netfilter/nfnetlink.c:228
 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
 nfnetlink_rcv+0x1c0/0x4d0 net/netfilter/nfnetlink.c:560
 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
 netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343
 netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
 sock_sendmsg_nosec net/socket.c:621 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:631
 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2114
 __sys_sendmsg+0x11d/0x290 net/socket.c:2152
 __do_sys_sendmsg net/socket.c:2161 [inline]
 __se_sys_sendmsg net/socket.c:2159 [inline]
 __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2159
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457089
Code: fd b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 
89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 
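
As a side note on the commit message in this mail, here is a minimal userspace
sketch of the pattern it describes: a caller may pass a NULL buffer pointer
just to learn the length, so a debug assertion on that pointer fires on a case
the function already handles. The trace arriving through ctnetlink_secctx_size
suggests such a length-only query, but that reading is an assumption. All
toy_* names are hypothetical and this is not the AppArmor code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical lookup standing in for a secid -> label mapping. */
static const char *toy_label_for(unsigned int secid)
{
	switch (secid) {
	case 1: return "unconfined";
	case 2: return "profile_a";
	default: return NULL;
	}
}

/*
 * Toy analogue of a secid-to-context hook.
 * - seclen is mandatory, so asserting on it is reasonable.
 * - secdata may be NULL when the caller only wants the length, so an
 *   assertion on secdata would trip on a legitimate, handled case.
 * Returns 0 on success or a negative errno value.
 */
static int toy_secid_to_secctx(unsigned int secid, char **secdata,
			       unsigned int *seclen)
{
	const char *label = toy_label_for(secid);
	size_t len;

	assert(seclen != NULL);	/* the only argument that must be set */

	if (!label)
		return -EINVAL;

	len = strlen(label);
	if (secdata) {
		/* Caller wants a copy; it owns and frees the buffer. */
		*secdata = malloc(len + 1);
		if (!*secdata)
			return -ENOMEM;
		memcpy(*secdata, label, len + 1);
	}
	*seclen = (unsigned int)len;
	return 0;
}
```

Calling toy_secid_to_secctx(1, NULL, &len) is the length-only case; had the
assertion been written against secdata instead of seclen, that call would
abort even though the body copes with it.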

Re: WARNING in apparmor_secid_to_secctx

2018-09-01 Thread Dmitry Vyukov
On Sat, Sep 1, 2018 at 11:18 AM, John Johansen wrote:
> On 08/29/2018 07:17 PM, syzbot wrote:
>> Hello,
>>
>> syzbot found the following crash on:
>>
>> HEAD commit:817e60a7a2bb Merge branch 'nfp-add-NFP5000-support'
>> git tree:   net-next
>> console output: https://syzkaller.appspot.com/x/log.txt?x=1536d29640
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=531a917630d2a492
>> dashboard link: https://syzkaller.appspot.com/bug?extid=21016130b0580a9de3b5
>> compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
>>
>> Unfortunately, I don't have any reproducer for this crash yet.
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+21016130b0580a9de...@syzkaller.appspotmail.com
>>
>
> << snip >>
>
> Patch sent directly to syzbot for testing

Hi John,

What do you mean? syzbot has not received any test requests for this,
and it would reply within half an hour or so. Where is that patch?


Re: 4.19-rc1: ./include/linux/rcupdate.h:631 rcu_read_lock() used illegally while idle!

2018-09-01 Thread Paul E. McKenney
On Sat, Sep 01, 2018 at 06:45:31PM -0400, Steven Rostedt wrote:
> On Sat, 1 Sep 2018 10:54:42 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Sat, Sep 01, 2018 at 07:35:59PM +0200, Borislav Petkov wrote:
> > > This is a huge splat! It haz some perf* and sched* in it, I guess for
> > > peterz to stare at. And lemme add Paul for good measure too :)
> > > 
> > > Kernel is -rc1 + 3 microcode loader patches ontop which should not be
> > > related.  
> > 
> > It really is tracing from the idle loop.  But I thought that the event
> > tracing took care of that.  Adding Steve and Joel for their thoughts.
> > 
> > Thanx, Paul
> > 
> > > Thx.
> > > 
> > > ---
> > > [   62.409125] =
> > > [   62.409129] WARNING: suspicious RCU usage
> > > [   62.409133] 4.19.0-rc1+ #1 Not tainted
> > > [   62.409136] -
> > > [   62.409140] ./include/linux/rcupdate.h:631 rcu_read_lock() used illegally while idle!
> > > [   62.409143] 
> > >other info that might help us debug this:
> > > 
> > > [   62.409147] 
> > >RCU used illegally from idle CPU!
> > >rcu_scheduler_active = 2, debug_locks = 1
> > > [   62.409151] RCU used illegally from extended quiescent state!
> > > [   62.409155] 1 lock held by swapper/0/0:
> > > [   62.409158]  #0: 4557ee0e (rcu_read_lock){}, at: perf_event_output_forward+0x0/0x130
> > > [   62.409175] 
> > >stack backtrace:
> > > [   62.409180] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc1+ #1
> > > [   62.409183] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
> > > [   62.409187] Call Trace:
> > > [   62.409196]  dump_stack+0x85/0xcb
> > > [   62.409203]  perf_event_output_forward+0xf6/0x130
> 
> I think this is because we switched the trace point code to be
> protected by srcu instead of rcu_lock_sched() and a song and dance to
> "make RCU watch again" if it is not, but perf is using normal
> rcu_read_lock() internally even though it is hooked into the
> tracepoint code. Should perf switch to SRCU, or perhaps it can do the
> song and dance to make RCU watch again?

Well, this is a regression, so in theory we could push my three SRCU
patches into the current merge window, which would allow perf going
to SRCU, thus fixing the above splat.  I am OK either way.  What would
you prefer?

Thanx, Paul

> -- Steve
> 
> 
> > > [   62.409215]  __perf_event_overflow+0x52/0xe0
> > > [   62.409223]  perf_swevent_overflow+0x91/0xb0
> > > [   62.409229]  perf_tp_event+0x11a/0x350
> > > [   62.409235]  ? find_held_lock+0x2d/0x90
> > > [   62.409251]  ? __lock_acquire+0x2ce/0x1350
> > > [   62.409263]  ? __lock_acquire+0x2ce/0x1350
> > > [   62.409270]  ? retint_kernel+0x2d/0x2d
> > > [   62.409278]  ? find_held_lock+0x2d/0x90
> > > [   62.409285]  ? tick_nohz_get_sleep_length+0x83/0xb0
> > > [   62.409299]  ? perf_trace_cpu+0xbb/0xd0
> > > [   62.409306]  ? perf_trace_buf_alloc+0x5a/0xa0
> > > [   62.409311]  perf_trace_cpu+0xbb/0xd0
> > > [   62.409323]  cpuidle_enter_state+0x185/0x340
> > > [   62.409332]  do_idle+0x1eb/0x260
> > > [   62.409340]  cpu_startup_entry+0x5f/0x70
> > > [   62.409347]  start_kernel+0x49b/0x4a6
> > > 
> > > [   62.409357]  secondary_startup_64+0xa4/0xb0
> 
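
For readers less familiar with the splat, a toy userspace model of the
situation discussed above: a read-side lock taken while RCU is not watching
(idle) gets flagged, and one of the options Steve mentions is to make RCU
watch again around the handler. Nothing here is kernel API. The toy_* names
are hypothetical, and real RCU tracks this state per CPU rather than in a
single flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Single global flag standing in for "RCU is watching this CPU". */
static bool toy_rcu_watching = true;
static int toy_splats;	/* how many illegal uses were flagged */

static void toy_idle_enter(void) { toy_rcu_watching = false; }
static void toy_idle_exit(void)  { toy_rcu_watching = true; }

/* Models the debug check behind "used illegally while idle". */
static void toy_rcu_read_lock(void)
{
	if (!toy_rcu_watching)
		toy_splats++;
}

static void toy_rcu_read_unlock(void)
{
}

/* Stand-in for a handler that uses plain read-side locking internally. */
static void toy_perf_handler(void)
{
	toy_rcu_read_lock();
	toy_rcu_read_unlock();
}

/* "Song and dance": make RCU watch for the duration of the callback. */
static void toy_call_with_rcu_watching(void (*fn)(void))
{
	bool was_watching = toy_rcu_watching;

	toy_rcu_watching = true;
	fn();
	toy_rcu_watching = was_watching;
}
```

Calling toy_perf_handler() directly between toy_idle_enter() and
toy_idle_exit() bumps toy_splats, while routing it through
toy_call_with_rcu_watching() does not. The SRCU alternative discussed in the
thread is not modelled here.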



[PATCH] x86: fix pti Section Mismatch warning/error

2018-09-01 Thread Randy Dunlap
From: Randy Dunlap 

Fix the section mismatch warning in arch/x86/mm/pti.c:

WARNING: vmlinux.o(.text+0x6972a): Section mismatch in reference from the function pti_clone_pgtable() to the function .init.text:pti_user_pagetable_walk_pte()
The function pti_clone_pgtable() references
the function __init pti_user_pagetable_walk_pte().
This is often because pti_clone_pgtable lacks a __init
annotation or the annotation of pti_user_pagetable_walk_pte is wrong.
FATAL: modpost: Section mismatches detected.

Fixes: 85900ea51577 ("x86/pti: Map the vsyscall page if needed")

Reported-by: kbuild test robot 
Signed-off-by: Randy Dunlap 
Cc: Andy Lutomirski 
Cc: x...@kernel.org
---
Applies to mainline.

 arch/x86/mm/pti.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-next-20180830.orig/arch/x86/mm/pti.c
+++ linux-next-20180830/arch/x86/mm/pti.c
@@ -248,7 +248,7 @@ static pmd_t *pti_user_pagetable_walk_pm
  *
  * Returns a pointer to a PTE on success, or NULL on failure.
  */
-static __init pte_t *pti_user_pagetable_walk_pte(unsigned long address)
+static pte_t *pti_user_pagetable_walk_pte(unsigned long address)
 {
gfp_t gfp = (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO);
pmd_t *pmd;
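
To illustrate what the warning above is checking, here is a small userspace
sketch of the rule: a reference from ordinary text into init text is suspect,
because init text can be discarded after boot. This is only a model of the
idea, not how modpost is implemented; struct toy_sym, struct toy_ref and
toy_count_mismatches() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_sym {
	const char *name;
	bool in_init_text;	/* true if the symbol is placed in .init.text */
};

struct toy_ref {
	size_t from;	/* index of the referencing symbol */
	size_t to;	/* index of the referenced symbol */
};

/* Count references that go from non-init code into init code. */
static size_t toy_count_mismatches(const struct toy_sym *syms,
				   const struct toy_ref *refs, size_t nrefs)
{
	size_t hits = 0;

	for (size_t i = 0; i < nrefs; i++) {
		/* .text -> .init.text is the problematic direction. */
		if (!syms[refs[i].from].in_init_text &&
		    syms[refs[i].to].in_init_text)
			hits++;
	}
	return hits;
}
```

With pti_clone_pgtable marked as regular text and
pti_user_pagetable_walk_pte marked as init text, the single reference counts
as one mismatch. Clearing the init flag on the callee, which is what dropping
__init does in the patch, brings the count to zero.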



[PATCH] pci: dwc: pcie_designware: Fix a sleep-in-atomic-context bug in dw_pcie_prog_outbound_atu

2018-09-01 Thread Jia-Ju Bai
The driver may sleep while holding a spinlock or in an interrupt handler.

The function call paths (from bottom to top) in Linux-4.16 are:

[FUNC] usleep_range
drivers/pci/dwc/pcie-designware.c, 181: 
usleep_range in dw_pcie_prog_outbound_atu
drivers/pci/dwc/pcie-designware-host.c, 479: 
dw_pcie_prog_outbound_atu in dw_pcie_rd_other_conf
drivers/pci/dwc/pcie-designware-host.c, 561: 
dw_pcie_rd_other_conf in dw_pcie_rd_conf
drivers/pci/access.c, 66: 
[FUNC_PTR]dw_pcie_rd_conf in pci_bus_read_config_word
drivers/pci/access.c, 918: 
pci_bus_read_config_word in pci_read_config_word
drivers/block/umem.c, 630: 
pci_read_config_word in mm_interrupt (interrupt handler)

[FUNC] usleep_range
drivers/pci/dwc/pcie-designware.c, 181: 
usleep_range in dw_pcie_prog_outbound_atu
drivers/pci/dwc/pcie-designware-host.c, 479: 
dw_pcie_prog_outbound_atu in dw_pcie_rd_other_conf
drivers/pci/dwc/pcie-designware-host.c, 561: 
dw_pcie_rd_other_conf in dw_pcie_rd_conf
drivers/pci/access.c, 66: 
[FUNC_PTR]dw_pcie_rd_conf in pci_bus_read_config_word
drivers/pci/access.c, 918: 
pci_bus_read_config_word in pci_read_config_word
drivers/ata/pata_efar.c, 115: 
pci_read_config_word in efar_set_piomode
drivers/ata/pata_efar.c, 113: 
_raw_spin_lock_irqsave in efar_set_piomode

[FUNC] usleep_range
drivers/pci/dwc/pcie-designware.c, 181: 
usleep_range in dw_pcie_prog_outbound_atu
drivers/pci/dwc/pcie-designware-host.c, 479: 
dw_pcie_prog_outbound_atu in dw_pcie_rd_other_conf
drivers/pci/dwc/pcie-designware-host.c, 561: 
dw_pcie_rd_other_conf in dw_pcie_rd_conf
drivers/pci/access.c, 66: 
[FUNC_PTR]dw_pcie_rd_conf in pci_bus_read_config_word
drivers/pci/access.c, 918: 
pci_bus_read_config_word in pci_read_config_word
drivers/block/mtip32xx/mtip32xx.c, 158: 
pci_read_config_word in mtip_check_surprise_removal
drivers/block/mtip32xx/mtip32xx.c, 843: 
mtip_check_surprise_removal in mtip_handle_irq
drivers/block/mtip32xx/mtip32xx.c, 879: 
mtip_handle_irq in mtip_irq_handler (interrupt handler)

[FUNC] usleep_range
drivers/pci/dwc/pcie-designware.c, 181: 
usleep_range in dw_pcie_prog_outbound_atu
drivers/pci/dwc/pcie-designware-host.c, 479: 
dw_pcie_prog_outbound_atu in dw_pcie_rd_other_conf
drivers/pci/dwc/pcie-designware-host.c, 561: 
dw_pcie_rd_other_conf in dw_pcie_rd_conf
drivers/pci/access.c, 66: 
[FUNC_PTR]dw_pcie_rd_conf in pci_bus_read_config_word
drivers/pci/access.c, 918: 
pci_bus_read_config_word in pci_read_config_word
drivers/gpu/vga/vgaarb.c, 645: 
pci_read_config_word in vga_arbiter_add_pci_device
drivers/gpu/vga/vgaarb.c, 629: 
_raw_spin_lock_irqsave in vga_arbiter_add_pci_device

[FUNC] usleep_range
drivers/pci/dwc/pcie-designware.c, 181: 
usleep_range in dw_pcie_prog_outbound_atu
drivers/pci/dwc/pcie-designware-host.c, 479: 
dw_pcie_prog_outbound_atu in dw_pcie_rd_other_conf
drivers/pci/dwc/pcie-designware-host.c, 561: 
dw_pcie_rd_other_conf in dw_pcie_rd_conf
drivers/pci/access.c, 66: 
[FUNC_PTR]dw_pcie_rd_conf in pci_bus_read_config_word
drivers/pci/access.c, 918: 
pci_bus_read_config_word in pci_read_config_word
drivers/pci/ats.c, 139: 
pci_read_config_word in pci_ats_queue_depth
drivers/iommu/intel-iommu.c, 1519: 
pci_ats_queue_depth in iommu_enable_dev_iotlb
drivers/iommu/intel-iommu.c, 5295: 
iommu_enable_dev_iotlb in intel_iommu_enable_pasid
drivers/iommu/intel-iommu.c, 5241: 
_raw_spin_lock_irqsave in intel_iommu_enable_pasid

To fix this bug, usleep_range() is replaced with udelay().

This bug was found by my static analysis tool DSAC.

Signed-off-by: Jia-Ju Bai 
---
 drivers/pci/controller/dwc/pcie-designware.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 778c4f76a884..7f50f7e51543 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -135,7 +135,7 @@ static void dw_pcie_prog_outbound_atu_unroll(struct dw_pcie *pci, int index,
 		if (val & PCIE_ATU_ENABLE)
 			return;
 
-		usleep_range(LINK_WAIT_IATU_MIN, LINK_WAIT_IATU_MAX);
+		udelay(LINK_WAIT_IATU_MAX);
 	}
 	dev_err(pci->dev, "Outbound iATU is not being enabled\n");
 }
-- 
2.17.0
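
For readers following the fix, here is a rough userspace sketch of the
pattern involved: a bounded retry loop that busy-waits between polls
instead of sleeping, which is the property that matters when the caller
holds a spinlock or runs in an interrupt handler. This is not the driver
code. busy_wait_us(), poll_enabled(), the fake register readers and all
constants are hypothetical stand-ins; in the kernel the busy-wait would be
udelay().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define ATU_ENABLE   (1u << 31)  /* hypothetical "enabled" bit */
#define MAX_RETRIES  5           /* hypothetical retry budget */
#define WAIT_US      10          /* hypothetical delay between polls */

/* Spin until roughly `us` microseconds of CPU time have passed; never yields. */
static void busy_wait_us(long us)
{
	clock_t ticks = (clock_t)((double)us * CLOCKS_PER_SEC / 1e6);
	clock_t start = clock();

	while (clock() - start < ticks)
		;	/* spin */
}

/* Poll read_ctrl() up to MAX_RETRIES times; 0 on success, -1 on timeout. */
static int poll_enabled(uint32_t (*read_ctrl)(void))
{
	int retries;

	assert(read_ctrl != NULL);
	for (retries = 0; retries < MAX_RETRIES; retries++) {
		if (read_ctrl() & ATU_ENABLE)
			return 0;
		busy_wait_us(WAIT_US);
	}
	return -1;
}

/* Fake register that reports "enabled" from the third read onward. */
static int fake_reads;
static uint32_t fake_read_ctrl(void)
{
	return ++fake_reads >= 3 ? ATU_ENABLE : 0;
}

/* Fake register that never becomes enabled. */
static uint32_t fake_read_never(void)
{
	return 0;
}
```

The trade-off is that the CPU is burned for the whole delay, so the wait
and the retry budget should stay small.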



Re: [PATCH] arm64: defconfig: enable EFI_ARMSTUB_DTB_LOADER

2018-09-01 Thread Olof Johansson
On Thu, Aug 30, 2018 at 9:23 AM, Ard Biesheuvel
 wrote:
> On 30 August 2018 at 17:06, Olof Johansson  wrote:
>> On Wed, Aug 29, 2018 at 10:54 PM, Ard Biesheuvel
>>  wrote:
>>> On 29 August 2018 at 20:59, Scott Branden  
>>> wrote:
>>>> Hi Olof,
>>>>
>>>>
>>>> On 18-08-29 11:44 AM, Olof Johansson wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> On Wed, Aug 29, 2018 at 10:21 AM, Scott Branden
>>>>>  wrote:
>>>>>>
>>>>>> Enable EFI_ARMSTUB_DTB_LOADER to add support for the dtb= command line
>>>>>> parameter to function with efi loader.
>>>>>>
>>>>>> Required to boot on existing bootloaders that do not support devicetree
>>>>>> provided by the platform or by the bootloader.
>>>>>>
>>>>>> Fixes: 3d7ee348aa41 ("efi/libstub/arm: Add opt-in Kconfig option for the
>>>>>> DTB loader")
>>>>>> Signed-off-by: Scott Branden 
>>>>>
>>>>> Why did Ard create an option for this if it's just going to be turned on
>>>>> in default configs? Doesn't make sense to me.
>>>>>
>>>>> It would help to know what firmware still is crippled and how common
>>>>> it is, since it's been a few years that this has been a requirement by
>>>>> now.
>>>>
>>>> Broadcom NS2 and Stingray in current development and production need this
>>>> option in the kernel enabled in order to boot.
>>>
>>> And these production systems run mainline kernels in a defconfig 
>>> configuration?
>>>
>>> The simple reality is that the DTB loader has been deprecated for a
>>> good reason: it was only ever intended as a development hack anyway,
>>> and if we need to treat the EFI stub provided DTB as a first class
>>> citizen, there are things we need to fix to make things work as
>>> expected. For instance, GRUB will put a property in the /chosen node
>>> for the initramfs which will get dropped if you boot with dtb=.
>>>
>>> Don't be surprised if some future enhancements of the EFI stub code
>>> depend on !EFI_ARMSTUB_DTB_LOADER. On UEFI systems, DTBs [or ACPI
>>> tables] are used by the firmware to describe itself and the underlying
>>> platform to the OS, and the practice of booting with DTB file images
>>> (taken from the kernel build as well) conflicts with that view. Note
>>> that GRUB still permits you to load DTBs from files (and supports more
>>> sources than just the file system the kernel Image was loaded from).
>>
>> Ard,
>>
>> Maybe a WARN() splat would be more useful as a phasing-out method than
>> removing functionality for them that needs to be reinstated through
>> changing the config?
>>
>
> We don't have any of that in the stub, and inventing new ways to pass
> such information between the stub and the kernel proper seems like a
> cart-before-horse kind of thing to me. The EFI stub diagnostic
> messages you get on the serial console are not recorded in the kernel
> log buffer, so they only appear if you actually look at the serial
> output.

Ah yeah. I suppose you could do it in the kernel later if you detect
you've booted through EFI with dtb= on the command line though.
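
To make that idea concrete, here is a small userspace sketch of the
detection step only: checking whether a command line string carries a
dtb= option at a token boundary. cmdline_has_dtb() is a hypothetical
helper, not an existing kernel function, and it assumes options are
separated by single spaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if cmdline contains a "dtb=" option at a token boundary. */
static bool cmdline_has_dtb(const char *cmdline)
{
	const char *p = cmdline;

	assert(cmdline != NULL);
	while ((p = strstr(p, "dtb=")) != NULL) {
		if (p == cmdline || p[-1] == ' ')
			return true;
		p += 4;	/* skip this match, e.g. the tail of "foodtb=" */
	}
	return false;
}
```

A kernel-side version would additionally need to know that the boot went
through the EFI stub before warning.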

>
>> Once the stub and the boot method are there, it's hard to undo as we
>> can see here. Being loud and warning might be more useful, along with
>> setting a timeline for hard removal (12 months?).
>>
>
> The dtb= handling is still there, it is just not enabled by default.
> We can keep it around if people are still using it. But as I pointed
> out, we may decide to make new functionality available only if it is
> disabled, and at that point, we'll have to choose between one or the
> other in defconfig, which is annoying.
>
>> Scott; an alternative for you is to do a boot wrapper that bundles a
>> DT and kernel, and boot that instead of the kernel image (outside of
>> the kernel tree). Some 32-bit platforms from Marvell use that. That
>> way the kernel will just see it as a normally passed in DT.
>>
>
> Or use GRUB. It comes wired up in all the distros, and lets you load
> a DT binary from anywhere you can imagine, as opposed to the EFI stub
> which can only load it if it happens to reside in the same file system
> (or even directory - I can't remember) as the kernel image. Note that
> the same reservations apply to doing that - the firmware is no longer
> able to describe itself to the OS via the DT, which is really the only
> conduit it has available on an arm64 system.

So, I've looked at the history here a bit, and dtb= support was
introduced in 2014. Nowhere does it say that it isn't a recommended
way of booting.

There are some firmware stacks today that modify and provide a
runtime-updated devicetree to the kernel, but there are also a bunch
that don't. Most "real" products will want a firmware that knows how to
pass in things such as firmware environment variables, or MAC
addresses, etc, to the kernel, but not all of them need it.

In particular, in a world where you want EFI to be used on embedded
platforms, requiring another bootloader step such as GRUB to be able
to reasonably boot said platforms seems like a significant and
unfortunate new limitation. Documentation/efi-stub.txt has absolutely
no 

[GIT PULL] ARM: SoC fixes

2018-09-01 Thread Olof Johansson
Hi Linus,

The following changes since commit 5b394b2ddf0347bef56e50c69a58773c94343ff3:

  Linux 4.19-rc1 (2018-08-26 14:11:59 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc.git tags/armsoc-fixes

for you to fetch changes up to a72b44a871c218e2a0580e68affa1d3528c0587a:

  Merge tag 'omap-for-v4.19/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes (2018-09-01 18:22:19 -0700)


ARM: SoC fixes

First batch of fixes post-merge window:

 - A handful of devicetree changes for i.MX2{3,8} to change over to new
   panel bindings. The platforms were moved from legacy framebuffers
   to DRM and some development board panels hadn't yet been converted.
 - OMAP fixes related to ti-sysc driver conversion fallout, fixing some
   register offsets, no_console_suspend fixes, etc.
 - Droid4 changes to fix flaky eMMC probing and vibrator DTS mismerge.
 - Fixed 0755->0644 permissions on a newly added file.
 - Defconfig changes to make ARM Versatile more useful with QEMU
   (helps testing).
 - Enable defconfig options for new TI SoC platform that was merged this
   window (AM6).


Fabio Estevam (6):
  ARM: dts: imx28-evk: Move regulators outside simple-bus
  ARM: dts: imx28-evk: Convert to the new display bindings
  ARM: dts: imx23-evk: Move regulators outside simple-bus
  ARM: dts: imx23-evk: Convert to the new display bindings
  ARM: mxs_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G
  ARM: imx_v6_v7_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G

Keerthy (1):
  arm: dts: am4372: setup rtc as system-power-controller

Leonard Crestez (1):
  Revert "ARM: dts: imx7d: Invert legacy PCI irq mapping"

Linus Walleij (1):
  ARM: defconfig: Update the ARM Versatile defconfig

Neeraj Dantu (1):
  ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts

Nishanth Menon (1):
  arm64: defconfig: Enable TI's AM6 SoC platform

Olof Johansson (2):
  Merge tag 'imx-fixes-4.19' of git://git.kernel.org/.../shawnguo/linux into fixes
  Merge tag 'omap-for-v4.19/fixes-v2-signed' of git://git.kernel.org/.../tmlind/linux-omap into fixes

Pavel Machek (1):
  ARM: dts: omap4-droid4: fix vibrations on Droid 4

Tony Lindgren (6):
  ARM: OMAP2+: Fix null hwmod for ti-sysc debug
  ARM: OMAP2+: Fix module address for modules using mpu_rt_idx
  bus: ti-sysc: Fix module register ioremap for larger offsets
  bus: ti-sysc: Fix no_console_suspend handling
  Merge branch 'perm-fix' into omap-for-v4.19/fixes-v2
  ARM: dts: omap4-droid4: Fix emmc errors seen on some devices

 arch/arm/boot/dts/am335x-osd3358-sm-red.dts |   0
 arch/arm/boot/dts/am4372.dtsi               |   1 +
 arch/arm/boot/dts/imx23-evk.dts             |  90 +++---
 arch/arm/boot/dts/imx28-evk.dts             | 183 +---
 arch/arm/boot/dts/imx7d.dtsi                |  12 +-
 arch/arm/boot/dts/omap4-droid4-xt894.dts    |  20 +--
 arch/arm/configs/imx_v6_v7_defconfig        |   1 +
 arch/arm/configs/mxs_defconfig              |   1 +
 arch/arm/configs/versatile_defconfig        |  14 ++-
 arch/arm/mach-omap2/omap_hwmod.c            |  39 +-
 arch/arm64/configs/defconfig                |   3 +
 drivers/bus/ti-sysc.c                       |  37 +++---
 12 files changed, 213 insertions(+), 188 deletions(-)
 mode change 100755 => 100644 arch/arm/boot/dts/am335x-osd3358-sm-red.dts


[RFC][PATCH 2/5] [PATCH 2/5] proc: introduce /proc/PID/idle_bitmap

2018-09-01 Thread Fengguang Wu
This will be similar to /sys/kernel/mm/page_idle/bitmap documented in
Documentation/admin-guide/mm/idle_page_tracking.rst, however indexed
by process virtual address.

When using the global PFN indexed idle bitmap, we find two kinds of
overhead:

- to track a task's working set, Brendan Gregg ended up writing wss-v1
  for small tasks and wss-v2 for large tasks:

  https://github.com/brendangregg/wss

  That's because VAs may point to random PAs throughout the physical
  address space. So we either query /proc/pid/pagemap first and then
  access lots of random PFNs (with lots of syscalls) in the bitmap, or
  write+read the whole system idle bitmap beforehand.

- walking page tables by PFN has much more overhead than walking a
  page table in its natural order:
  - rmap queries
  - more locking
  - random memory reads/writes

This interface provides a cheap path for the majority of pages, i.e.
those in non-shared mappings. To walk 1TB memory of 4k active pages, it
costs 2s vs 15s of system time to scan the per-task vs global idle
bitmaps, which means a ~7x speedup. The gap widens further if we consider

- the extra /proc/pid/pagemap walk
- natural page table walks can skip the whole 512 PTEs if PMD is idle

OTOH, the per-task idle bitmap is not suitable in some situations:

- not accurate for shared pages
- doesn't work with non-mapped file pages
- doesn't perform well for sparse page tables (pointed out by Huang Ying)

So it's more about complementing the existing global idle bitmap.
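
As a back-of-the-envelope check of the scale discussed above, the sketch
below computes how many 4k pages cover 1TB, how many 512-PTE groups that
is (the unit a natural walk can skip when a PMD is idle), and how large a
bitmap with one bit per page would be. The one-bit-per-page layout is an
assumption borrowed from the global page_idle bitmap, and the helper names
and constants are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT    12   /* 4k pages */
#define PTES_PER_PMD  512  /* PTEs covered by one PMD entry (x86-64, 4k pages) */

/* Number of pages needed to cover `bytes` (bytes assumed page aligned). */
static uint64_t nr_pages(uint64_t bytes)
{
	return bytes >> PAGE_SHIFT;
}

/* Bytes needed for a bitmap holding one bit per page, rounded up. */
static uint64_t bitmap_bytes(uint64_t pages)
{
	return (pages + 7) / 8;
}

/* Number of PMD-sized groups of PTEs covering `pages`, rounded up. */
static uint64_t nr_pmd_groups(uint64_t pages)
{
	return (pages + PTES_PER_PMD - 1) / PTES_PER_PMD;
}
```

For 1TB this gives 2^28 pages, 2^19 PMD-sized groups and a 32MB bitmap,
which is why skipping whole idle PMDs can save a large share of the walk.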

CC: Huang Ying 
CC: Brendan Gregg 
Signed-off-by: Fengguang Wu 
---
 fs/proc/base.c |  2 ++
 fs/proc/internal.h |  1 +
 fs/proc/task_mmu.c | 63 ++
 3 files changed, 66 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index aaffc0c30216..d81322b5b8d2 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2942,6 +2942,7 @@ static const struct pid_entry tgid_base_stuff[] = {
 	REG("smaps", S_IRUGO, proc_pid_smaps_operations),
 	REG("smaps_rollup", S_IRUGO, proc_pid_smaps_rollup_operations),
 	REG("pagemap", S_IRUSR, proc_pagemap_operations),
+	REG("idle_bitmap", S_IRUSR|S_IWUSR, proc_mm_idle_operations),
 #endif
 #ifdef CONFIG_SECURITY
 	DIR("attr", S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations),
@@ -3327,6 +3328,7 @@ static const struct pid_entry tid_base_stuff[] = {
 	REG("smaps", S_IRUGO, proc_tid_smaps_operations),
 	REG("smaps_rollup", S_IRUGO, proc_pid_smaps_rollup_operations),
 	REG("pagemap", S_IRUSR, proc_pagemap_operations),
+	REG("idle_bitmap", S_IRUSR|S_IWUSR, proc_mm_idle_operations),
 #endif
 #ifdef CONFIG_SECURITY
 	DIR("attr", S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations),
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index da3dbfa09e79..732a502acc27 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -305,6 +305,7 @@ extern const struct file_operations proc_pid_smaps_rollup_operations;
 extern const struct file_operations proc_tid_smaps_operations;
 extern const struct file_operations proc_clear_refs_operations;
 extern const struct file_operations proc_pagemap_operations;
+extern const struct file_operations proc_mm_idle_operations;
 
 extern unsigned long task_vsize(struct mm_struct *);
 extern unsigned long task_statm(struct mm_struct *,
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index dfd73a4616ce..376406a9cf45 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1564,6 +1564,69 @@ const struct file_operations proc_pagemap_operations = {
 	.open		= pagemap_open,
 	.release	= pagemap_release,
 };
+
+/* will be filled when kvm_ept_idle module loads */
+struct file_operations proc_ept_idle_operations = {
+};
+EXPORT_SYMBOL_GPL(proc_ept_idle_operations);
+
+static ssize_t mm_idle_read(struct file *file, char __user *buf,
+			    size_t count, loff_t *ppos)
+{
+	struct task_struct *task = file->private_data;
+	ssize_t ret = -ESRCH;
+
+	// TODO: implement mm_walk for normal tasks
+
+	if (task_kvm(task)) {
+		if (proc_ept_idle_operations.read)
+			return proc_ept_idle_operations.read(file, buf, count, ppos);
+	}
+
+	return ret;
+}
+
+
+static int mm_idle_open(struct inode *inode, struct file *file)
+{
+	struct task_struct *task = get_proc_task(inode);
+
+	if (!task)
+		return -ESRCH;
+
+	file->private_data = task;
+
+	if (task_kvm(task)) {
+		if (proc_ept_idle_operations.open)
+			return proc_ept_idle_operations.open(inode, file);
+	}
+
+	return 0;
+}
+
+static int mm_idle_release(struct inode *inode, struct file *file)
+{
+	struct task_struct *task = file->private_data;
+
+	if (!task)
+		return 0;
+
+	if (task_kvm(task)) {
+		if (proc_ept_idle_operations.release)
+

[RFC][PATCH 3/5] [PATCH 3/5] kvm-ept-idle: HVA indexed EPT read

2018-09-01 Thread Fengguang Wu
For virtual machines, "accessed" bits will be set in guest page tables
and EPT/NPT. So for a qemu-kvm process, convert HVA to GFN to GPA, then do
EPT/NPT walks. Thanks to the in-memslot linear HVA-GPA mapping, the conversion
can be done efficiently, outside of the loops for page table walks.

In this manner, we provide a uniform interface for both virtual machines and
normal processes.

The use scenario would be per task/VM working set tracking and migration,
which is very convenient for applying task/vma and VM granularity policies.
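
The linear in-memslot mapping mentioned above can be illustrated with a
small userspace sketch. struct memslot here is a hypothetical, simplified
stand-in for struct kvm_memory_slot (only the three fields the conversion
needs), and hva_to_gfn()/gfn_to_gpa() are illustrative helpers rather than
the kernel's own functions.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* 4k pages */

/* Hypothetical, simplified stand-in for struct kvm_memory_slot. */
struct memslot {
	uint64_t base_gfn;        /* first guest frame number in the slot */
	uint64_t npages;          /* slot length in pages */
	uint64_t userspace_addr;  /* HVA where the slot starts */
};

/* HVA -> GFN: the page offset into the slot, added to the slot's base GFN. */
static uint64_t hva_to_gfn(uint64_t hva, const struct memslot *slot)
{
	assert(hva >= slot->userspace_addr);
	assert(hva < slot->userspace_addr + (slot->npages << PAGE_SHIFT));
	return slot->base_gfn + ((hva - slot->userspace_addr) >> PAGE_SHIFT);
}

/* GFN -> GPA: a frame number is the guest physical address in page units. */
static uint64_t gfn_to_gpa(uint64_t gfn)
{
	return gfn << PAGE_SHIFT;
}
```

Because the mapping is a constant offset per memslot, the conversion only
has to be done once per HVA range, and the page table walk itself can then
run purely on GFNs.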

Signed-off-by: Peng DongX 
Signed-off-by: Fengguang Wu 
---
 arch/x86/kvm/ept_idle.c | 118 
 arch/x86/kvm/ept_idle.h |  24 ++
 2 files changed, 142 insertions(+)
 create mode 100644 arch/x86/kvm/ept_idle.c
 create mode 100644 arch/x86/kvm/ept_idle.h

diff --git a/arch/x86/kvm/ept_idle.c b/arch/x86/kvm/ept_idle.c
new file mode 100644
index ..5b97dd01011b
--- /dev/null
+++ b/arch/x86/kvm/ept_idle.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: GPL-2.0
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ept_idle.h"
+
+
+// mindless copy from kvm_handle_hva_range().
+// TODO: handle order and hole.
+static int ept_idle_walk_hva_range(struct ept_idle_ctrl *eic,
+  unsigned long start,
+  unsigned long end)
+{
+   struct kvm_memslots *slots;
+   struct kvm_memory_slot *memslot;
+   int ret = 0;
+
+   slots = kvm_memslots(eic->kvm);
+   kvm_for_each_memslot(memslot, slots) {
+   unsigned long hva_start, hva_end;
+   gfn_t gfn_start, gfn_end;
+
+   hva_start = max(start, memslot->userspace_addr);
+   hva_end = min(end, memslot->userspace_addr +
+ (memslot->npages << PAGE_SHIFT));
+   if (hva_start >= hva_end)
+   continue;
+   /*
+* {gfn(page) | page intersects with [hva_start, hva_end)} =
+* {gfn_start, gfn_start+1, ..., gfn_end-1}.
+*/
+   gfn_start = hva_to_gfn_memslot(hva_start, memslot);
+   gfn_end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, memslot);
+
+   ret = ept_idle_walk_gfn_range(eic, gfn_start, gfn_end);
+   if (ret)
+   return ret;
+   }
+
+   return ret;
+}
+
+static ssize_t ept_idle_read(struct file *file, char *buf,
+size_t count, loff_t *ppos)
+{
+   struct task_struct *task = file->private_data;
+   struct ept_idle_ctrl *eic;
+   unsigned long hva_start = *ppos << BITMAP_BYTE2PVA_SHIFT;
+   unsigned long hva_end = hva_start + (count << BITMAP_BYTE2PVA_SHIFT);
+   int ret;
+
+   if (*ppos % IDLE_BITMAP_CHUNK_SIZE ||
+   count % IDLE_BITMAP_CHUNK_SIZE)
+   return -EINVAL;
+
+   eic = kzalloc(sizeof(*eic), GFP_KERNEL);
+   if (!eic)
+   return -ENOMEM;
+
+   eic->buf = buf;
+   eic->buf_size = count;
+   eic->kvm = task_kvm(task);
+   if (!eic->kvm) {
+   ret = -EINVAL;
+   goto out_free;
+   }
+
+   ret = ept_idle_walk_hva_range(eic, hva_start, hva_end);
+   if (ret)
+   goto out_free;
+
+   ret = eic->bytes_copied;
+   *ppos += ret;
+out_free:
+   kfree(eic);
+
+   return ret;
+}
+
+static int ept_idle_open(struct inode *inode, struct file *file)
+{
+   if (!try_module_get(THIS_MODULE))
+   return -EBUSY;
+
+   return 0;
+}
+
+static int ept_idle_release(struct inode *inode, struct file *file)
+{
+   module_put(THIS_MODULE);
+   return 0;
+}
+
+extern struct file_operations proc_ept_idle_operations;
+
+static int ept_idle_entry(void)
+{
+   proc_ept_idle_operations.owner = THIS_MODULE;
+   proc_ept_idle_operations.read = ept_idle_read;
+   proc_ept_idle_operations.open = ept_idle_open;
+   proc_ept_idle_operations.release = ept_idle_release;
+
+   return 0;
+}
+
+static void ept_idle_exit(void)
+{
+   memset(&proc_ept_idle_operations, 0, sizeof(proc_ept_idle_operations));
+}
+
+MODULE_LICENSE("GPL");
+module_init(ept_idle_entry);
+module_exit(ept_idle_exit);
diff --git a/arch/x86/kvm/ept_idle.h b/arch/x86/kvm/ept_idle.h
new file mode 100644
index ..e0b9dcecf50b
--- /dev/null
+++ b/arch/x86/kvm/ept_idle.h
@@ -0,0 +1,24 @@
+#ifndef _EPT_IDLE_H
+#define _EPT_IDLE_H
+
+#define IDLE_BITMAP_CHUNK_SIZE sizeof(u64)
+#define IDLE_BITMAP_CHUNK_BITS (IDLE_BITMAP_CHUNK_SIZE * BITS_PER_BYTE)
+
+#define BITMAP_BYTE2PVA_SHIFT  (3 + PAGE_SHIFT)
+
+#define EPT_IDLE_KBUF_FULL 1
+#define EPT_IDLE_KBUF_BYTES 8000
+#define EPT_IDLE_KBUF_BITS  (EPT_IDLE_KBUF_BYTES * 8)
+
+struct ept_idle_ctrl {
+   struct kvm *kvm;
+
+   u64 kbuf[EPT_IDLE_KBUF_BITS / IDLE_BITMAP_CHUNK_BITS];
+   int bits_read;
+
+   void __user *buf;
+   int 

[RFC][PATCH 5/5] [PATCH 5/5] kvm-ept-idle: enable module

2018-09-01 Thread Fengguang Wu
Signed-off-by: Fengguang Wu 
---
 arch/x86/kvm/Kconfig  | 11 +++
 arch/x86/kvm/Makefile |  4 
 2 files changed, 15 insertions(+)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 1bbec387d289..4c6dec47fac6 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -96,6 +96,17 @@ config KVM_MMU_AUDIT
 This option adds a R/W kVM module parameter 'mmu_audit', which allows
 auditing of KVM MMU events at runtime.
 
+config KVM_EPT_IDLE
+   tristate "KVM EPT idle page tracking"
+   depends on KVM_INTEL
+   depends on PROC_PAGE_MONITOR
+   ---help---
+ Provides support for walking EPT to get the A bits on Intel
+ processors equipped with the VT extensions.
+
+ To compile this as a module, choose M here: the module
+ will be called kvm-ept-idle.
+
 # OK, it's a little counter-intuitive to do this, but it puts it neatly under
 # the virtualization menu.
 source drivers/vhost/Kconfig
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index dc4f2fdf5e57..5cad0590205d 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -19,6 +19,10 @@ kvm-y+= x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
 kvm-intel-y+= vmx.o pmu_intel.o
 kvm-amd-y  += svm.o pmu_amd.o
 
+kvm-ept-idle-y += ept_idle.o
+
 obj-$(CONFIG_KVM)  += kvm.o
 obj-$(CONFIG_KVM_INTEL)+= kvm-intel.o
 obj-$(CONFIG_KVM_AMD)  += kvm-amd.o
+
+obj-$(CONFIG_KVM_EPT_IDLE) += kvm-ept-idle.o
-- 
2.15.0





[RFC][PATCH 0/5] introduce /proc/PID/idle_bitmap

2018-09-01 Thread Fengguang Wu
This new /proc/PID/idle_bitmap interface aims to complement the current global
/sys/kernel/mm/page_idle/bitmap, to enable efficient user space driven
migrations.

The pros and cons will be discussed in the changelog of "[PATCH] proc: introduce
/proc/PID/idle_bitmap". The driving force is to improve efficiency by 10+
times, so that hot/cold page tracking can be done at regular intervals in
user space without too much overhead. That makes it possible for a user space
daemon to do regular page migration between NUMA nodes of different speeds.

Note it's not about NUMA migration between local and remote nodes -- we already
have NUMA balancing for that. This interface and the user space migration daemon
target NUMA nodes made of different media -- i.e. DIMM and NVDIMM(*) --
with larger performance gaps. The basic policy will be "move hot pages to DIMM;
cold pages to NVDIMM".
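
To make the basic policy concrete, here is a minimal user space sketch of how
a daemon might interpret a bitmap buffer it has already read. It is not taken
from the patches: the demo_* names are hypothetical, and it assumes that bit i
describes page i and that a set bit means "idle", as with the existing
page_idle bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: bit i of the buffer describes page i; a set bit means idle. */
static int demo_page_is_idle(const uint64_t *bitmap, size_t page)
{
    return (int)((bitmap[page / 64] >> (page % 64)) & 1);
}

/* Count the idle pages among the first npages pages of the buffer. */
static size_t demo_count_idle(const uint64_t *bitmap, size_t npages)
{
    size_t i, idle = 0;

    assert(bitmap != NULL);
    for (i = 0; i < npages; i++)
        idle += (size_t)demo_page_is_idle(bitmap, i);
    return idle;
}

enum demo_target { DEMO_TARGET_DIMM, DEMO_TARGET_NVDIMM };

/* Basic policy: hot (accessed) pages go to DIMM, cold (idle) ones to NVDIMM. */
static enum demo_target demo_pick_target(const uint64_t *bitmap, size_t page)
{
    return demo_page_is_idle(bitmap, page) ? DEMO_TARGET_NVDIMM
                                           : DEMO_TARGET_DIMM;
}
```

A real daemon would likely look at several scan rounds before migrating
anything; a single snapshot is used here only to keep the sketch short.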

Since NVDIMM sizes can easily reach several terabytes, working set tracking
efficiency will matter and be challenging.

(*) Here we use persistent memory (PMEM) w/o using its persistence.
Persistence is good to have, however it requires modifying applications.
Upcoming NVDIMM products like Intel Apache Pass (AEP) will be more cost and
energy effective than DRAM, but slower. Merely using it in the form of a NUMA
memory node could immediately benefit many workloads. For example, warm but not
hot apps, workloads with a sharp hot/cold page distribution (good for
migration), or workloads that rely more on memory size than on latency and
bandwidth and do more reads than writes.

This is an early RFC version to collect feedback. It's complete enough to demo
the basic ideas and performance, but it is not usable yet.

Regards,
Fengguang



[RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct

2018-09-01 Thread Fengguang Wu
The added pointer will be used by the /proc/PID/idle_bitmap code to
quickly identify a QEMU task and walk EPT/NPT accordingly. For virtual
machines, the A bits will be set in guest page tables and EPT/NPT,
rather than the QEMU task page table.

This costs 8 bytes in task_struct, which could be wasteful for the
majority of normal tasks. The alternative is to add a flag only, and
let it find the corresponding VM in kvm vm_list.

Signed-off-by: Fengguang Wu 
---
 include/linux/sched.h | 10 ++
 virt/kvm/kvm_main.c   |  1 +
 2 files changed, 11 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 43731fe51c97..26c8549bbc28 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -38,6 +38,7 @@ struct cfs_rq;
 struct fs_struct;
 struct futex_pi_state;
 struct io_context;
+struct kvm;
 struct mempolicy;
 struct nameidata;
 struct nsproxy;
@@ -1179,6 +1180,9 @@ struct task_struct {
/* Used by LSM modules for access restriction: */
void*security;
 #endif
+#if IS_ENABLED(CONFIG_KVM)
+struct kvm  *kvm;
+#endif
 
/*
 * New fields for task_struct should be added above here, so that
@@ -1898,4 +1902,10 @@ static inline void rseq_syscall(struct pt_regs *regs)
 
 #endif
 
+#if IS_ENABLED(CONFIG_KVM)
+static inline struct kvm *task_kvm(struct task_struct *t) { return t->kvm; }
+#else
+static inline struct kvm *task_kvm(struct task_struct *t) { return NULL; }
+#endif
+
 #endif
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 8b47507faab5..0c483720de8d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3892,6 +3892,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
if (type == KVM_EVENT_CREATE_VM) {
add_uevent_var(env, "EVENT=create");
kvm->userspace_pid = task_pid_nr(current);
+   current->kvm = kvm;
} else if (type == KVM_EVENT_DESTROY_VM) {
add_uevent_var(env, "EVENT=destroy");
}
-- 
2.15.0





[RFC][PATCH 4/5] [PATCH 4/5] kvm-ept-idle: EPT page table walk for A bits

2018-09-01 Thread Fengguang Wu
This borrows the host page table walk macros/functions to do the EPT walk.
So it depends on them using the same number of levels.

Dave Hansen raised the concern that the hottest pages may be cached in the TLB
and so don't frequently set the accessed bits. The solution would be to
invalidate the TLB for the mm being walked after finishing one round of scanning.

Warning: read() also clears the accessed bit, in order to avoid one more
page table walk for write(). That may not be desirable for some use cases, so
we can avoid clearing the accessed bit when the file is opened in read-only mode.

The interface should be further improved to

1) report holes and huge pages in one go
2) represent huge pages and sparse page tables efficiently

(1) can be trivially fixed by extending the bitmap to more bits per PAGE_SIZE.

(2) would need fundamental changes to the interface. It seems existing solutions
for sparse files like SEEK_HOLE/SEEK_DATA and FIEMAP ioctl may not serve this
situation well. The most efficient way could be to fill user space read()
buffer with an array of small extents:

struct idle_extent {
unsigned type :  4; 
unsigned nr   :  4; 
};

where type can be one of

4K_HOLE
4K_IDLE
4K_ACCESSED
2M_HOLE
2M_IDLE
2M_ACCESSED
1G_OR_LARGER_PAGE
...

There can be up to 16 types, so more page sizes can be defined. The above names
are just to make the typical case easy to understand. It's also possible that
PAGE_SIZE is not 4K, or that a PMD represents 4M pages, in which case we change
the type names to more suitable ones like PTE_HOLE, PMD_ACCESSED. Since it's page
table walking, user space had better know the exact page sizes. Both the
accessed bit and page migration are tied to the real page size.
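
A minimal sketch of what packing such extents could look like is below. It is
only an illustration of the proposal, with several assumptions: the numeric
type codes are made up (the text above only names them), type is placed in the
high nibble, nr is a plain count from 1 to 15, and the demo_* helpers are
hypothetical. Explicit shifts on a byte are used instead of C bit-fields,
since bit-field layout is implementation-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type codes; only the names come from the proposal. */
enum demo_extent_type {
    DEMO_PTE_HOLE = 0,
    DEMO_PTE_IDLE,
    DEMO_PTE_ACCESSED,
    DEMO_PMD_HOLE,
    DEMO_PMD_IDLE,
    DEMO_PMD_ACCESSED
};

#define DEMO_EXTENT_MAX_NR 15u  /* largest value a 4-bit nr field can hold */

/* Pack one extent into a byte: type in the high nibble, nr in the low one. */
static uint8_t demo_extent_pack(unsigned type, unsigned nr)
{
    assert(type < 16 && nr >= 1 && nr <= DEMO_EXTENT_MAX_NR);
    return (uint8_t)((type << 4) | nr);
}

static unsigned demo_extent_type(uint8_t e) { return e >> 4; }
static unsigned demo_extent_nr(uint8_t e)   { return e & 0x0f; }

/*
 * Emit extents for `count` consecutive pages of the same type.
 * Returns the number of bytes written; 0 means count was 0 or out[] is
 * too small to hold the whole run.
 */
static size_t demo_extent_emit_run(uint8_t *out, size_t out_len,
                                   unsigned type, size_t count)
{
    size_t needed = (count + DEMO_EXTENT_MAX_NR - 1) / DEMO_EXTENT_MAX_NR;
    size_t i;

    if (needed > out_len)
        return 0;
    for (i = 0; i < needed; i++) {
        unsigned nr = count > DEMO_EXTENT_MAX_NR ?
                      DEMO_EXTENT_MAX_NR : (unsigned)count;

        out[i] = demo_extent_pack(type, nr);
        count -= nr;
    }
    return needed;
}
```

With this encoding, 40 consecutive idle 4K pages take 3 bytes (15 + 15 + 10),
versus 5 bytes in the one-bit-per-page bitmap; the gain grows for long runs of
huge pages or holes.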

Anyone interested in adding PTE_DIRTY or more types?

The main problem with such an extent reporting interface is that the number of
bytes returned by read() (variable extents) will mismatch the advanced file
position (fixed VA indexes), which is not POSIX compliant. Simple cp/cat may
still work, as they don't lseek based on the read return value. If that's really
a concern, we may use ioctl() instead.

CC: Dave Hansen 
Signed-off-by: Fengguang Wu 
---
 arch/x86/kvm/ept_idle.c | 211 
 arch/x86/kvm/ept_idle.h |  55 +
 2 files changed, 266 insertions(+)

diff --git a/arch/x86/kvm/ept_idle.c b/arch/x86/kvm/ept_idle.c
index 5b97dd01011b..8a233ab8656d 100644
--- a/arch/x86/kvm/ept_idle.c
+++ b/arch/x86/kvm/ept_idle.c
@@ -9,6 +9,217 @@
 
 #include "ept_idle.h"
 
+static int add_to_idle_bitmap(struct ept_idle_ctrl *eic,
+ int idle, unsigned long addr_range)
+{
+   int nbits = addr_range >> PAGE_SHIFT;
+   int bits_left = EPT_IDLE_KBUF_BITS - eic->bits_read;
+   int ret = 0;
+
+   if (nbits >= bits_left) {
+   ret = EPT_IDLE_KBUF_FULL;
+   nbits = bits_left;
+   }
+
+   // TODO: this assumes u64 == unsigned long
+   if (!idle)
+   __bitmap_clear((unsigned long *)eic->kbuf, eic->bits_read, nbits);
+   eic->bits_read += nbits;
+
+   return ret;
+}
+
+static int ept_pte_range(struct ept_idle_ctrl *eic,
+pmd_t *pmd, unsigned long addr, unsigned long end)
+{
+   pte_t *pte;
+   int err = 0;
+   int idle;
+
+   pte = pte_offset_kernel(pmd, addr);
+   do {
+   if (!ept_pte_present(*pte) ||
+   !ept_pte_accessed(*pte))
+   idle = 1;
+   else {
+   idle = 0;
+   pte_clear_flags(*pte, _PAGE_EPT_ACCESSED);
+   }
+
+   err = add_to_idle_bitmap(eic, idle, PAGE_SIZE);
+   if (err)
+   break;
+   } while (pte++, addr += PAGE_SIZE, addr != end);
+
+   return err;
+}
+
+static int ept_pmd_range(struct ept_idle_ctrl *eic,
+pud_t *pud, unsigned long addr, unsigned long end)
+{
+   pmd_t *pmd;
+   unsigned long next;
+   int err = 0;
+   int idle;
+
+   pmd = pmd_offset(pud, addr);
+   do {
+   next = pmd_addr_end(addr, end);
+   idle = -1;
+   if (!ept_pmd_present(*pmd) ||
+   !ept_pmd_accessed(*pmd)) {
+   idle = 1;
+   } else if (pmd_large(*pmd)) {
+   idle = 0;
+   pmd_clear_flags(*pmd, _PAGE_EPT_ACCESSED);
+   }
+   if (idle >= 0)
+   err = add_to_idle_bitmap(eic, idle, next - addr);
+   else
+   err = ept_pte_range(eic, pmd, addr, next);
+   if (err)
+   break;
+   } while (pmd++, addr = next, addr != end);
+
+   return err;
+}
+
+static int ept_pud_range(struct ept_idle_ctrl *eic,
+p4d_t *p4d, unsigned long addr, unsigned 

[RFC PATCH] spi: at91-usart: at91_usart_spi_transfer_one() can be static

2018-09-01 Thread kbuild test robot


Fixes: 5890bab41187 ("spi: at91-usart: add driver for at91-usart as spi")
Signed-off-by: kbuild test robot 
---
 spi-at91-usart.c |   14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/spi/spi-at91-usart.c b/drivers/spi/spi-at91-usart.c
index 4712bd4..a924657 100644
--- a/drivers/spi/spi-at91-usart.c
+++ b/drivers/spi/spi-at91-usart.c
@@ -215,9 +215,9 @@ static int at91_usart_spi_setup(struct spi_device *spi)
return 0;
 }
 
-int at91_usart_spi_transfer_one(struct spi_controller *ctlr,
-   struct spi_device *spi,
-   struct spi_transfer *xfer)
+static int at91_usart_spi_transfer_one(struct spi_controller *ctlr,
+  struct spi_device *spi,
+  struct spi_transfer *xfer)
 {
struct at91_usart_spi *aus = spi_master_get_devdata(ctlr);
 
@@ -242,8 +242,8 @@ int at91_usart_spi_transfer_one(struct spi_controller *ctlr,
return 0;
 }
 
-int at91_usart_spi_prepare_message(struct spi_controller *ctlr,
-  struct spi_message *message)
+static int at91_usart_spi_prepare_message(struct spi_controller *ctlr,
+ struct spi_message *message)
 {
struct at91_usart_spi *aus = spi_master_get_devdata(ctlr);
struct spi_device *spi = message->spi;
@@ -256,8 +256,8 @@ int at91_usart_spi_prepare_message(struct spi_controller *ctlr,
return 0;
 }
 
-int at91_usart_spi_unprepare_message(struct spi_controller *ctlr,
-struct spi_message *message)
+static int at91_usart_spi_unprepare_message(struct spi_controller *ctlr,
+   struct spi_message *message)
 {
struct at91_usart_spi *aus = spi_master_get_devdata(ctlr);
 


WARNING: vmlinux.o(.text+0xc816b): Section mismatch in reference from the function pti_clone_pgtable() to the function .init.text:pti_user_pagetable_walk_pte()

2018-09-01 Thread kbuild test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   420f51f4ab6bce6e580390729fadb89c31123636
commit: 269777aa530f3438ec1781586cdac0b5fe47b061 cpu/hotplug: Non-SMP machines do not make use of booted_once
date:   3 weeks ago
config: i386-randconfig-b0-09011544 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
git checkout 269777aa530f3438ec1781586cdac0b5fe47b061
# save the attached .config to linux build tree
make ARCH=i386 

All warnings (new ones prefixed by >>):

>> WARNING: vmlinux.o(.text+0xc816b): Section mismatch in reference from the function pti_clone_pgtable() to the function .init.text:pti_user_pagetable_walk_pte()
   The function pti_clone_pgtable() references
   the function __init pti_user_pagetable_walk_pte().
   This is often because pti_clone_pgtable lacks a __init
   annotation or the annotation of pti_user_pagetable_walk_pte is wrong.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip


Re: [RESEND PATCH v11 5/6] spi: at91-usart: add driver for at91-usart as spi

2018-09-01 Thread kbuild test robot
Hi Radu,

I love your patch! Perhaps something to improve:

[auto build test WARNING on ljones-mfd/for-mfd-next]
[also build test WARNING on v4.19-rc1 next-20180831]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Radu-Pirea/Driver-for-at91-usart-in-spi-mode/20180901-165150
base:   https://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd.git for-mfd-next
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

>> drivers/spi/spi-at91-usart.c:218:5: sparse: symbol 'at91_usart_spi_transfer_one' was not declared. Should it be static?
>> drivers/spi/spi-at91-usart.c:245:5: sparse: symbol 'at91_usart_spi_prepare_message' was not declared. Should it be static?
>> drivers/spi/spi-at91-usart.c:259:5: sparse: symbol 'at91_usart_spi_unprepare_message' was not declared. Should it be static?
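
As an illustration of what sparse is asking for, here is a minimal
sketch under stated assumptions: struct demo_ops, demo_prepare(),
demo_unprepare(), demo_driver_ops and demo_run() are hypothetical names,
not the driver's. It shows that callbacks used only through an ops table
can be given internal linkage with "static" and still be called by
whoever holds the table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table, standing in for the callbacks a driver
 * registers with a framework. */
struct demo_ops {
    int (*prepare)(int msg);
    int (*unprepare)(int msg);
};

/* File-local callbacks: 'static' gives them internal linkage, so no
 * header declaration is needed and sparse has nothing to flag. */
static int demo_prepare(int msg)
{
    return msg + 1;
}

static int demo_unprepare(int msg)
{
    return msg - 1;
}

/* The callbacks stay reachable through the table even though their
 * names are not visible outside this translation unit. */
const struct demo_ops demo_driver_ops = {
    .prepare   = demo_prepare,
    .unprepare = demo_unprepare,
};

/* Stand-in for the framework side: it only ever sees the table. */
int demo_run(const struct demo_ops *ops, int msg)
{
    assert(ops != NULL && ops->prepare != NULL && ops->unprepare != NULL);
    return ops->unprepare(ops->prepare(msg));
}
```

Only the table itself needs external linkage here; in a real driver even
that is often static and handed over through a registration call.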

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


[PATCH] PCI/AER: Fix an AER enabling/disabling race

2018-09-01 Thread Jon Derrick
With non-ACPI root ports, the AER driver can enable error reporting on
the tree before the port drivers have bound to the ports on that tree.
The port driver assumes the AER driver will set up error reporting
afterwards and disables it when it binds, which undoes the earlier
enable. Add a check so the port driver leaves error reporting alone if
the AER driver has already set it up.

Example:
[  343.790573] pcieport 1:00:00.0: pci_disable_pcie_error_reporting
[  343.809812] pcieport 1:00:00.0: pci_enable_pcie_error_reporting
[  343.819506] pci 1:01:00.0: pci_enable_pcie_error_reporting
[  343.828814] pci 1:02:00.0: pci_enable_pcie_error_reporting
[  343.838089] pci 1:02:01.0: pci_enable_pcie_error_reporting
[  343.847478] pci 1:02:02.0: pci_enable_pcie_error_reporting
[  343.856659] pci 1:02:03.0: pci_enable_pcie_error_reporting
[  343.865794] pci 1:02:04.0: pci_enable_pcie_error_reporting
[  343.874875] pci 1:02:05.0: pci_enable_pcie_error_reporting
[  343.883918] pci 1:02:06.0: pci_enable_pcie_error_reporting
[  343.892922] pci 1:02:07.0: pci_enable_pcie_error_reporting
[  343.918900] pcieport 1:01:00.0: pci_disable_pcie_error_reporting
[  343.968426] pcieport 1:02:00.0: pci_disable_pcie_error_reporting
[  344.028179] pcieport 1:02:01.0: pci_disable_pcie_error_reporting
[  344.091269] pcieport 1:02:02.0: pci_disable_pcie_error_reporting
[  344.156473] pcieport 1:02:03.0: pci_disable_pcie_error_reporting
[  344.238042] pcieport 1:02:04.0: pci_disable_pcie_error_reporting
[  344.321864] pcieport 1:02:05.0: pci_disable_pcie_error_reporting
[  344.411601] pcieport 1:02:06.0: pci_disable_pcie_error_reporting
[  344.505332] pcieport 1:02:07.0: pci_disable_pcie_error_reporting
[  344.621824] nvme 1:06:00.0: pci_enable_pcie_error_reporting

Signed-off-by: Jon Derrick 
---
 drivers/pci/pcie/aer.c  | 1 +
 drivers/pci/pcie/portdrv_core.c | 5 -
 include/linux/pci.h | 1 +
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 83180ed..a4e36b6 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1333,6 +1333,7 @@ static int set_device_error_reporting(struct pci_dev 
*dev, void *data)
if (enable)
pcie_set_ecrc_checking(dev);
 
+   dev->aer_configured = 1;
return 0;
 }
 
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 7c37d81..f5de554 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -224,8 +224,11 @@ static int get_port_device_capability(struct pci_dev *dev)
/*
 * Disable AER on this port in case it's been enabled by the
 * BIOS (the AER service driver will enable it when necessary).
+* Don't disable it if the AER service driver has already
+* enabled it from the root port bus walking
 */
-   pci_disable_pcie_error_reporting(dev);
+   if (!dev->aer_configured)
+   pci_disable_pcie_error_reporting(dev);
}
 #endif
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e72ca8d..c071622 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -402,6 +402,7 @@ struct pci_dev {
unsigned inthas_secondary_link:1;
unsigned intnon_compliant_bars:1;   /* Broken BARs; ignore them */
unsigned intis_probed:1;/* Device probing in progress */
+   unsigned intaer_configured:1;   /* AER configured for device */
pci_dev_flags_t dev_flags;
atomic_tenable_cnt; /* pci_enable_device has been called */
 
-- 
1.8.3.1
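
The ordering problem the patch addresses can be modelled in user space.
This is a sketch under stated assumptions rather than the kernel code:
struct demo_port and the demo_* functions are hypothetical, and only the
aer_configured bit mirrors a name from the patch. It models "AER walk
first, port driver bind second" with and without the added check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a port: only the two bits that matter here. */
struct demo_port {
    bool reporting_enabled;         /* models the error-reporting state */
    unsigned int aer_configured:1;  /* mirrors the flag the patch adds */
};

/* Models the AER driver's bus walk: enable reporting and record it. */
void demo_aer_walk(struct demo_port *p)
{
    assert(p);
    p->reporting_enabled = true;
    p->aer_configured = 1;
}

/* Models the port driver binding. Without the check it always disables
 * reporting; with the check it skips ports the AER walk already set up. */
void demo_port_bind(struct demo_port *p, bool patched)
{
    assert(p);
    if (patched && p->aer_configured)
        return;
    p->reporting_enabled = false;
}

/* Final reporting state when the AER walk runs before the bind. */
bool demo_walk_then_bind(bool patched)
{
    struct demo_port p = { 0 };

    demo_aer_walk(&p);
    demo_port_bind(&p, patched);
    return p.reporting_enabled;
}
```

In this model demo_walk_then_bind(false) leaves reporting disabled, which
matches the disable lines that follow the enable lines in the example
log, while demo_walk_then_bind(true) keeps it enabled.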



Re: [PATCH v5 2/2] clk: Add functions to get optional clocks

2018-09-01 Thread kbuild test robot
Hi Phil,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on clk/clk-next]
[also build test WARNING on v4.19-rc1 next-20180831]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Phil-Edworthy/clk-Add-functions-to-get-optional-clocks/20180901-154009
base:   https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git clk-next
reproduce: make htmldocs

All warnings (new ones prefixed by >>):

   WARNING: convert(1) not found, for SVG to PDF conversion install ImageMagick 
(https://www.imagemagick.org)
>> include/linux/clk.h:368: warning: Function parameter or member 'dev' not 
>> described in 'devm_clk_get_optional'
>> include/linux/clk.h:368: warning: Function parameter or member 'id' not 
>> described in 'devm_clk_get_optional'
   include/linux/srcu.h:175: warning: Function parameter or member 'p' not 
described in 'srcu_dereference_notrace'
   include/linux/srcu.h:175: warning: Function parameter or member 'sp' not 
described in 'srcu_dereference_notrace'
   include/linux/gfp.h:1: warning: no structured comments found
   include/net/cfg80211.h:4381: warning: Function parameter or member 
'wext.ibss' not described in 'wireless_dev'
   include/net/cfg80211.h:4381: warning: Function parameter or member 
'wext.connect' not described in 'wireless_dev'
   include/net/cfg80211.h:4381: warning: Function parameter or member 
'wext.keys' not described in 'wireless_dev'
   include/net/cfg80211.h:4381: warning: Function parameter or member 'wext.ie' 
not described in 'wireless_dev'
   include/net/cfg80211.h:4381: warning: Function parameter or member 
'wext.ie_len' not described in 'wireless_dev'
   include/net/cfg80211.h:4381: warning: Function parameter or member 
'wext.bssid' not described in 'wireless_dev'
   include/net/cfg80211.h:4381: warning: Function parameter or member 
'wext.ssid' not described in 'wireless_dev'
   include/net/cfg80211.h:4381: warning: Function parameter or member 
'wext.default_key' not described in 'wireless_dev'
   include/net/cfg80211.h:4381: warning: Function parameter or member 
'wext.default_mgmt_key' not described in 'wireless_dev'
   include/net/cfg80211.h:4381: warning: Function parameter or member 
'wext.prev_bssid_valid' not described in 'wireless_dev'
   include/net/mac80211.h:2328: warning: Function parameter or member 
'radiotap_timestamp.units_pos' not described in 'ieee80211_hw'
   include/net/mac80211.h:2328: warning: Function parameter or member 
'radiotap_timestamp.accuracy' not described in 'ieee80211_hw'
   include/net/mac80211.h:977: warning: Function parameter or member 
'control.rates' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 
'control.rts_cts_rate_idx' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 
'control.use_rts' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 
'control.use_cts_prot' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 
'control.short_preamble' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 
'control.skip_table' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 
'control.jiffies' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 
'control.vif' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 
'control.hw_key' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 
'control.flags' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 
'control.enqueue_time' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 'ack' not 
described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 
'ack.cookie' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 
'status.rates' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 
'status.ack_signal' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 
'status.ampdu_ack_len' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 
'status.ampdu_len' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 
'status.antenna' not described in 'ieee80211_tx_info'
   include/net/mac80211.h:977: warning: Function parameter or member 

Re: 4.19-rc1: ./include/linux/rcupdate.h:631 rcu_read_lock() used illegally while idle!

2018-09-01 Thread Steven Rostedt
On Sat, 1 Sep 2018 10:54:42 -0700
"Paul E. McKenney"  wrote:

> On Sat, Sep 01, 2018 at 07:35:59PM +0200, Borislav Petkov wrote:
> > This is a huge splat! It haz some perf* and sched* in it, I guess for
> > peterz to stare at. And lemme add Paul for good measure too :)
> > 
> > Kernel is -rc1 + 3 microcode loader patches ontop which should not be
> > related.  
> 
> It really is tracing from the idle loop.  But I thought that the event
> tracing took care of that.  Adding Steve and Joel for their thoughts.
> 
>   Thanx, Paul
> 
> > Thx.
> > 
> > ---
> > [   62.409125] =
> > [   62.409129] WARNING: suspicious RCU usage
> > [   62.409133] 4.19.0-rc1+ #1 Not tainted
> > [   62.409136] -
> > [   62.409140] ./include/linux/rcupdate.h:631 rcu_read_lock() used 
> > illegally while idle!
> > [   62.409143] 
> >other info that might help us debug this:
> > 
> > [   62.409147] 
> >RCU used illegally from idle CPU!
> >rcu_scheduler_active = 2, debug_locks = 1
> > [   62.409151] RCU used illegally from extended quiescent state!
> > [   62.409155] 1 lock held by swapper/0/0:
> > [   62.409158]  #0: 4557ee0e (rcu_read_lock){}, at: 
> > perf_event_output_forward+0x0/0x130
> > [   62.409175] 
> >stack backtrace:
> > [   62.409180] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc1+ #1
> > [   62.409183] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 
> > 11/13/2012
> > [   62.409187] Call Trace:
> > [   62.409196]  dump_stack+0x85/0xcb
> > [   62.409203]  perf_event_output_forward+0xf6/0x130

I think this is because we switched the trace point code to be
protected by srcu instead of rcu_lock_sched() and a song and dance to
"make RCU watch again" if it is not, but perf is using normal
rcu_read_lock() internally even though it is hooked into the
tracepoint code. Should perf switch to SRCU, or perhaps it can do the
song and dance to make RCU watch again?

-- Steve


> > [   62.409215]  __perf_event_overflow+0x52/0xe0
> > [   62.409223]  perf_swevent_overflow+0x91/0xb0
> > [   62.409229]  perf_tp_event+0x11a/0x350
> > [   62.409235]  ? find_held_lock+0x2d/0x90
> > [   62.409251]  ? __lock_acquire+0x2ce/0x1350
> > [   62.409263]  ? __lock_acquire+0x2ce/0x1350
> > [   62.409270]  ? retint_kernel+0x2d/0x2d
> > [   62.409278]  ? find_held_lock+0x2d/0x90
> > [   62.409285]  ? tick_nohz_get_sleep_length+0x83/0xb0
> > [   62.409299]  ? perf_trace_cpu+0xbb/0xd0
> > [   62.409306]  ? perf_trace_buf_alloc+0x5a/0xa0
> > [   62.409311]  perf_trace_cpu+0xbb/0xd0
> > [   62.409323]  cpuidle_enter_state+0x185/0x340
> > [   62.409332]  do_idle+0x1eb/0x260
> > [   62.409340]  cpu_startup_entry+0x5f/0x70
> > [   62.409347]  start_kernel+0x49b/0x4a6
> > 
> > [   62.409357]  secondary_startup_64+0xa4/0xb0


[PATCH] arm64: dts: qcom: sdm845: Add adsp, cdsp and slpi smp2p

2018-09-01 Thread Bjorn Andersson
Add the SMP2P nodes for the remoteproc states for adsp, cdsp and slpi.

Signed-off-by: Bjorn Andersson 
---
 arch/arm64/boot/dts/qcom/sdm845.dtsi | 88 
 1 file changed, 88 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 0c9a2aa6a1b5..d977117acac4 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -230,6 +230,94 @@
hwlocks = <_mutex 3>;
};
 
+   smp2p-cdsp {
+   compatible = "qcom,smp2p";
+   qcom,smem = <94>, <432>;
+
+   interrupts = ;
+
+   mboxes = <_shared 6>;
+
+   qcom,local-pid = <0>;
+   qcom,remote-pid = <5>;
+
+   cdsp_smp2p_out: master-kernel {
+   qcom,entry-name = "master-kernel";
+   #qcom,smem-state-cells = <1>;
+   };
+
+   cdsp_smp2p_in: slave-kernel {
+   qcom,entry-name = "slave-kernel";
+
+   interrupt-controller;
+   #interrupt-cells = <2>;
+   };
+   };
+
+   smp2p-lpass {
+   compatible = "qcom,smp2p";
+   qcom,smem = <443>, <429>;
+
+   interrupts = ;
+
+   mboxes = <_shared 10>;
+
+   qcom,local-pid = <0>;
+   qcom,remote-pid = <2>;
+
+   adsp_smp2p_out: master-kernel {
+   qcom,entry-name = "master-kernel";
+   #qcom,smem-state-cells = <1>;
+   };
+
+   adsp_smp2p_in: slave-kernel {
+   qcom,entry-name = "slave-kernel";
+
+   interrupt-controller;
+   #interrupt-cells = <2>;
+   };
+   };
+
+   smp2p-mpss {
+   compatible = "qcom,smp2p";
+   qcom,smem = <435>, <428>;
+   interrupts = ;
+   mboxes = <_shared 14>;
+   qcom,local-pid = <0>;
+   qcom,remote-pid = <1>;
+
+   modem_smp2p_out: master-kernel {
+   qcom,entry-name = "master-kernel";
+   #qcom,smem-state-cells = <1>;
+   };
+
+   modem_smp2p_in: slave-kernel {
+   qcom,entry-name = "slave-kernel";
+   interrupt-controller;
+   #interrupt-cells = <2>;
+   };
+   };
+
+   smp2p-slpi {
+   compatible = "qcom,smp2p";
+   qcom,smem = <481>, <430>;
+   interrupts = ;
+   mboxes = <_shared 26>;
+   qcom,local-pid = <0>;
+   qcom,remote-pid = <3>;
+
+   slpi_smp2p_out: master-kernel {
+   qcom,entry-name = "master-kernel";
+   #qcom,smem-state-cells = <1>;
+   };
+
+   slpi_smp2p_in: slave-kernel {
+   qcom,entry-name = "slave-kernel";
+   interrupt-controller;
+   #interrupt-cells = <2>;
+   };
+   };
+
psci {
compatible = "arm,psci-1.0";
method = "smc";
-- 
2.18.0



[PATCH] arm64: dts: qcom: Add AOSS reset driver node for SDM845

2018-09-01 Thread Bjorn Andersson
From: Sibi Sankar 

This patch adds the node to support the AOSS reset driver on
SDM845.

Signed-off-by: Sibi Sankar 
[bjorn: Updated addresses to match the binding that was merged]
Signed-off-by: Bjorn Andersson 
---
 arch/arm64/boot/dts/qcom/sdm845.dtsi | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 0c9a2aa6a1b5..077760792cf0 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 / {
interrupt-parent = <>;
@@ -978,6 +979,12 @@
#thermal-sensor-cells = <1>;
};
 
+   aoss_reset: reset-controller@c2a {
+   compatible = "qcom,sdm845-aoss-cc";
+   reg = <0xc2a 0x31000>;
+   #reset-cells = <1>;
+   };
+
spmi_bus: spmi@c44 {
compatible = "qcom,spmi-pmic-arb";
reg = <0xc44 0x1100>,
-- 
2.18.0



[PATCH] arm64: dts: qcom: sdm845-mtp: pm8998 and pmi8998 regulators

2018-09-01 Thread Bjorn Andersson
Add regulator definitions for pm8998 and pmi8998 regulators on the MTP.

Signed-off-by: Bjorn Andersson 
---
 arch/arm64/boot/dts/qcom/sdm845-mtp.dts | 216 
 1 file changed, 216 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sdm845-mtp.dts 
b/arch/arm64/boot/dts/qcom/sdm845-mtp.dts
index 6d651f314193..d9fbdcb39a4d 100644
--- a/arch/arm64/boot/dts/qcom/sdm845-mtp.dts
+++ b/arch/arm64/boot/dts/qcom/sdm845-mtp.dts
@@ -7,6 +7,7 @@
 
 /dts-v1/;
 
+#include 
 #include "sdm845.dtsi"
 
 / {
@@ -20,6 +21,221 @@
chosen {
stdout-path = "serial0:115200n8";
};
+
+   vph_pwr: vph-pwr-regulator {
+   compatible = "regulator-fixed";
+   regulator-name = "vph_pwr";
+   regulator-always-on;
+   regulator-boot-on;
+   };
+
+   /* S4 is always-on 1.8V and not controllable through RPMh */
+   vreg_s4a_1p8: vreg-s4a-regulator {
+   compatible = "regulator-fixed";
+   regulator-name = "vreg_s4a_1p8";
+
+   regulator-min-microvolt = <180>;
+   regulator-max-microvolt = <180>;
+
+   regulator-always-on;
+   regulator-boot-on;
+
+   vin-supply = <_pwr>;
+   };
+};
+
+_rsc {
+   pm8998-rpmh-regulators {
+   compatible = "qcom,pm8998-rpmh-regulators";
+   qcom,pmic-id = "a";
+
+   vdd-s3-supply = <_pwr>;
+   vdd-s5-supply = <_pwr>;
+   vdd-s7-supply = <_pwr>;
+   vdd-l1-l27-supply = <_s7a_1p025>;
+   vdd-l10-l23-l25-supply = <_bob>;
+   vdd-l13-l19-l21-supply = <_bob>;
+   vdd-l16-l28-supply = <_bob>;
+   vdd-l18-l22-supply = <_bob>;
+   vdd-l2-l8-l17-supply = <_s3a_1p35>;
+   vdd-l20-l24-supply = <_bob>;
+   vdd-l26-supply = <_s3a_1p35>;
+   vdd-l3-l11-supply = <_s7a_1p025>;
+   vdd-l4-l5-supply = <_s7a_1p025>;
+   vdd-l6-supply = <_pwr>;
+   vdd-l7-l12-l14-l15-supply = <_s5a_2p04>;
+   vdd-l9-supply = <_bob>;
+   vin-lvs-1-2-supply = <_s4a_1p8>;
+
+   vreg_s3a_1p35: smps3 {
+   regulator-min-microvolt = <1352000>;
+   regulator-max-microvolt = <1352000>;
+   };
+
+   vreg_s5a_2p04: smps5 {
+   regulator-min-microvolt = <1904000>;
+   regulator-max-microvolt = <204>;
+   };
+
+   vreg_s7a_1p025: smps7 {
+   regulator-min-microvolt = <90>;
+   regulator-max-microvolt = <1028000>;
+   };
+
+   vreg_l1a_0p875: ldo1 {
+   regulator-min-microvolt = <88>;
+   regulator-max-microvolt = <88>;
+   regulator-allow-set-load;
+   };
+
+   vreg_l2a_1p2: ldo2 {
+   regulator-min-microvolt = <120>;
+   regulator-max-microvolt = <120>;
+   regulator-always-on;
+   };
+
+   vreg_l3a_1p0: ldo3 {
+   regulator-min-microvolt = <100>;
+   regulator-max-microvolt = <100>;
+   };
+
+   vreg_l5a_0p8: ldo5 {
+   regulator-min-microvolt = <80>;
+   regulator-max-microvolt = <80>;
+   };
+
+   vreg_l6a_1p8: ldo6 {
+   regulator-min-microvolt = <1856000>;
+   regulator-max-microvolt = <1856000>;
+   };
+
+   vreg_l7a_1p8: ldo7 {
+   regulator-min-microvolt = <180>;
+   regulator-max-microvolt = <180>;
+   };
+
+   vreg_l8a_1p2: ldo8 {
+regulator-min-microvolt = <120>;
+regulator-max-microvolt = <1248000>;
+   };
+
+   vreg_l9a_1p8: ldo9 {
+   regulator-min-microvolt = <1704000>;
+   regulator-max-microvolt = <2928000>;
+   };
+
+   vreg_l10a_1p8: ldo10 {
+   regulator-min-microvolt = <1704000>;
+   regulator-max-microvolt = <2928000>;
+   };
+
+   vreg_l11a_1p0: ldo11 {
+   regulator-min-microvolt = <100>;
+   regulator-max-microvolt = <1048000>;
+   };
+
+   vreg_12a_1p8: ldo12 {
+   regulator-min-microvolt = <180>;
+   regulator-max-microvolt = <180>;
+   };
+
+   vreg_l13a_2p95: ldo13 {
+   regulator-min-microvolt = <180>;
+   regulator-max-microvolt = <296>;
+

Re: Access to non-RAM pages

2018-09-01 Thread Jiri Kosina
On Sat, 1 Sep 2018, Al Viro wrote:

> IMO that's crap.  In absolute majority of cases there is a guaranteed gap
> between the end of accessed object and the next page boundary.  

So if that's the case, you're absolutely right. But I am unable to find 
any such guarantee in our current code though.

Thanks,

-- 
Jiri Kosina
SUSE Labs



Re: [PATCH] staging: android: ion: fix ION_IOC_{MAP,SHARE} use-after-free

2018-09-01 Thread Greg Kroah-Hartman
On Fri, Aug 31, 2018 at 01:30:01PM -0700, Greg Hackmann wrote:
> On 08/31/2018 01:27 PM, Greg Hackmann wrote:
> > Change-Id: Ia0542dd8134e81cd5e1412e126545303c766f738
> 
> Sorry, please disregard the Change-Id line.  This is what I get for
> forgetting to re-run checkpatch after amending my commit message.  :/

Can you please resend with that fixed up.  Having to hand-edit patches
on my end is a royal pain...

thanks,

greg k-h


Re: Redoing eXclusive Page Frame Ownership (XPFO) with isolated CPUs in mind (for KVM to isolate its guests per CPU)

2018-09-01 Thread Linus Torvalds
On Fri, Aug 31, 2018 at 12:45 AM Julian Stecklina  wrote:
>
> I've been spending some cycles on the XPFO patch set this week. For the
> patch set as it was posted for v4.13, the performance overhead of
> compiling a Linux kernel is ~40% on x86_64[1]. The overhead comes almost
> completely from TLB flushing. If we can live with stale TLB entries
> allowing temporary access (which I think is reasonable), we can remove
> all TLB flushing (on x86). This reduces the overhead to 2-3% for
> kernel compile.

I have to say, even 2-3% for a kernel compile sounds absolutely horrendous.

Kernel builds are 90% user space at least for me, so a 2-3% slowdown
from a kernel is not some small unnoticeable thing.

   Linus


Re: [PATCH 2/3] arm64: dts: meson-axg: s400: add dmic codec

2018-09-01 Thread Fabio Estevam
Hi Jerome,

On Fri, Aug 31, 2018 at 12:02 PM, Jerome Brunet  wrote:
> There are 7 digital mics on the MIC daughter board attached
> to the s400 board, so add the digital microphone codec to
> its DTS
>
> Signed-off-by: Jerome Brunet 
> ---
>  arch/arm64/boot/dts/amlogic/meson-axg-s400.dts | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/amlogic/meson-axg-s400.dts 
> b/arch/arm64/boot/dts/amlogic/meson-axg-s400.dts
> index ff64c429d432..f3e16cbbc61e 100644
> --- a/arch/arm64/boot/dts/amlogic/meson-axg-s400.dts
> +++ b/arch/arm64/boot/dts/amlogic/meson-axg-s400.dts
> @@ -86,6 +86,15 @@
> sound-name-prefix = "DIT";
> };
>
> +   dmics: audio-codec@3 {

You pass @3 without a corresponding reg = <3>, which causes dtc
warnings with W=1.


[tip:x86/urgent] x86/vdso: Fix lsl operand order

2018-09-01 Thread tip-bot for Samuel Neves
Commit-ID:  e78e5a91456fcecaa2efbb3706572fe043766f4d
Gitweb: https://git.kernel.org/tip/e78e5a91456fcecaa2efbb3706572fe043766f4d
Author: Samuel Neves 
AuthorDate: Sat, 1 Sep 2018 21:14:52 +0100
Committer:  Thomas Gleixner 
CommitDate: Sat, 1 Sep 2018 23:01:56 +0200

x86/vdso: Fix lsl operand order

In the __getcpu function, lsl is using the wrong target and destination
registers. Luckily, the compiler tends to choose %eax for both variables,
so it has been working so far.

Fixes: a582c540ac1b ("x86/vdso: Use RDPID in preference to LSL when available")
Signed-off-by: Samuel Neves 
Signed-off-by: Thomas Gleixner 
Acked-by: Andy Lutomirski 
Cc: sta...@vger.kernel.org
Link: https://lkml.kernel.org/r/20180901201452.27828-1-sne...@dei.uc.pt

---
 arch/x86/include/asm/vgtod.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
index fb856c9f0449..53748541c487 100644
--- a/arch/x86/include/asm/vgtod.h
+++ b/arch/x86/include/asm/vgtod.h
@@ -93,7 +93,7 @@ static inline unsigned int __getcpu(void)
 *
 * If RDPID is available, use it.
 */
-   alternative_io ("lsl %[p],%[seg]",
+   alternative_io ("lsl %[seg],%[p]",
".byte 0xf3,0x0f,0xc7,0xf8", /* RDPID %eax/rax */
X86_FEATURE_RDPID,
[p] "=a" (p), [seg] "r" (__PER_CPU_SEG));


Re: [PATCH] x86/vdso: fix lsl operand order

2018-09-01 Thread Andy Lutomirski
On Sat, Sep 1, 2018 at 1:14 PM, Samuel Neves  wrote:
> In the __getcpu function, lsl was using the wrong target
> and destination registers. Luckily, the compiler tends to
> choose %eax for both variables, so it has been working
> so far.
>
> Cc: x...@kernel.org
> Cc: sta...@vger.kernel.org
> Signed-off-by: Samuel Neves 

Acked-by: Andy Lutomirski 
Fixes: a582c540ac1b ("x86/vdso: Use RDPID in preference to LSL when available")

Whoops!  I even wrote a selftest just for the offending commit, but,
of course, the selftest passes :(  I tested this by giving gcc some
gentle encouragement to allocate different registers, and the existing
code is indeed wrong and the fix indeed fixes it.

--Andy
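
The point about register allocation can be illustrated with a small user-space sketch. It is not the kernel's __getcpu code: it uses movl in place of lsl so that it runs anywhere on x86, and the names copy_att_order and check_copy_att_order are made up for this example. It only shows the AT&T operand order (source first, destination second) and one way to stop the compiler from placing both operands in the same register, which is the condition that hid the original bug.

```c
#include <assert.h>

/*
 * Hypothetical helper (not kernel code): copy `src` into a fresh variable
 * with a single instruction. AT&T syntax puts the source operand first and
 * the destination second, the same order as the fixed "lsl %[seg],%[p]".
 */
static unsigned int copy_att_order(unsigned int src)
{
    unsigned int dst;

#if defined(__x86_64__) || defined(__i386__)
    /*
     * "=&r" marks dst as early-clobber, so the compiler must not give it
     * the same register as src. With plain "=r" the two operands may share
     * a register, and a swapped operand order would then go unnoticed.
     */
    __asm__ ("movl %[src], %[dst]"
             : [dst] "=&r" (dst)
             : [src] "r" (src));
#else
    /* Fallback so the sketch still compiles on non-x86 targets. */
    dst = src;
#endif
    return dst;
}

/* Small self-check: the copy must return its input unchanged. */
static void check_copy_att_order(void)
{
    assert(copy_att_order(0x7b) == 0x7b);
    assert(copy_att_order(0) == 0);
}
```

Swapping the two operands in the template would write into the register holding src and leave dst uninitialized, so the self-check would no longer be guaranteed to pass once the registers differ.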


[PATCH] x86/vdso: fix lsl operand order

2018-09-01 Thread Samuel Neves
In the __getcpu function, lsl was using the wrong target
and destination registers. Luckily, the compiler tends to
choose %eax for both variables, so it has been working
so far.

Cc: x...@kernel.org
Cc: sta...@vger.kernel.org
Signed-off-by: Samuel Neves 
---
 arch/x86/include/asm/vgtod.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
index fb856c9f0449..53748541c487 100644
--- a/arch/x86/include/asm/vgtod.h
+++ b/arch/x86/include/asm/vgtod.h
@@ -93,7 +93,7 @@ static inline unsigned int __getcpu(void)
 *
 * If RDPID is available, use it.
 */
-   alternative_io ("lsl %[p],%[seg]",
+   alternative_io ("lsl %[seg],%[p]",
".byte 0xf3,0x0f,0xc7,0xf8", /* RDPID %eax/rax */
X86_FEATURE_RDPID,
[p] "=a" (p), [seg] "r" (__PER_CPU_SEG));
-- 
2.17.1



[PATCH] iio: pressure: ms5611: switch to SPDX identifier

2018-09-01 Thread Tomasz Duszynski
Drop boilerplate license text and use SPDX identifier instead.

Signed-off-by: Tomasz Duszynski 
---
 drivers/iio/pressure/ms5611.h  | 5 +
 drivers/iio/pressure/ms5611_core.c | 5 +
 drivers/iio/pressure/ms5611_i2c.c  | 5 +
 drivers/iio/pressure/ms5611_spi.c  | 5 +
 4 files changed, 4 insertions(+), 16 deletions(-)

diff --git a/drivers/iio/pressure/ms5611.h b/drivers/iio/pressure/ms5611.h
index ead9e9f85894..c9dd86de3ade 100644
--- a/drivers/iio/pressure/ms5611.h
+++ b/drivers/iio/pressure/ms5611.h
@@ -1,12 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
  * MS5611 pressure and temperature sensor driver
  *
  * Copyright (c) Tomasz Duszynski 
  *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
  */

 #ifndef _MS5611_H
diff --git a/drivers/iio/pressure/ms5611_core.c 
b/drivers/iio/pressure/ms5611_core.c
index f950cfde5db9..2f598ad91621 100644
--- a/drivers/iio/pressure/ms5611_core.c
+++ b/drivers/iio/pressure/ms5611_core.c
@@ -1,12 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * MS5611 pressure and temperature sensor driver
  *
  * Copyright (c) Tomasz Duszynski 
  *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
  * Data sheet:
  *  http://www.meas-spec.com/downloads/MS5611-01BA03.pdf
  *  http://www.meas-spec.com/downloads/MS5607-02BA03.pdf
diff --git a/drivers/iio/pressure/ms5611_i2c.c 
b/drivers/iio/pressure/ms5611_i2c.c
index 0469c8ae1134..8089c59adce5 100644
--- a/drivers/iio/pressure/ms5611_i2c.c
+++ b/drivers/iio/pressure/ms5611_i2c.c
@@ -1,12 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * MS5611 pressure and temperature sensor driver (I2C bus)
  *
  * Copyright (c) Tomasz Duszynski 
  *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
  * 7-bit I2C slave addresses:
  *
  * 0x77 (CSB pin low)
diff --git a/drivers/iio/pressure/ms5611_spi.c 
b/drivers/iio/pressure/ms5611_spi.c
index cd11d022208e..b463eaa799ab 100644
--- a/drivers/iio/pressure/ms5611_spi.c
+++ b/drivers/iio/pressure/ms5611_spi.c
@@ -1,12 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * MS5611 pressure and temperature sensor driver (SPI bus)
  *
  * Copyright (c) Tomasz Duszynski 
  *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
  */

 #include 
--
2.18.0



[PATCH] iio: light: bh1750: switch to SPDX identifier

2018-09-01 Thread Tomasz Duszynski
Drop boilerplate license text and use SPDX identifier instead.

Signed-off-by: Tomasz Duszynski 
---
 drivers/iio/light/bh1750.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/iio/light/bh1750.c b/drivers/iio/light/bh1750.c
index a814828e69f5..493ca7420602 100644
--- a/drivers/iio/light/bh1750.c
+++ b/drivers/iio/light/bh1750.c
@@ -1,12 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * ROHM BH1710/BH1715/BH1721/BH1750/BH1751 ambient light sensor driver
  *
  * Copyright (c) Tomasz Duszynski 
  *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
  * Data sheets:
  *  
http://rohmfs.rohm.com/en/products/databook/datasheet/ic/sensor/light/bh1710fvc-e.pdf
  *  
http://rohmfs.rohm.com/en/products/databook/datasheet/ic/sensor/light/bh1715fvc-e.pdf
-- 
2.18.0



Re: [PATCH 7/7] Compiler Attributes: use feature checks instead of version checks

2018-09-01 Thread Miguel Ojeda
Hi Greg,

On Sat, Sep 1, 2018 at 8:39 PM, Greg KH  wrote:
> On Sat, Sep 01, 2018 at 03:38:13PM +0200, Miguel Ojeda wrote:
>> Hi Nick,
>>
>> On Sat, Sep 1, 2018 at 1:07 AM, Nick Desaulniers
>>  wrote:
>> > Overall, pretty happy with this patch.  Still some thoughts for a v3,
>> >> -#define __visible  __attribute__((externally_visible))
>> >> diff --git a/include/linux/compiler_attributes.h 
>> >> b/include/linux/compiler_attributes.h
>> >> new file mode 100644
>> >> index ..a9dfafc8fd19
>> >> --- /dev/null
>> >> +++ b/include/linux/compiler_attributes.h
>> >> @@ -0,0 +1,226 @@
>> >
>> > New file needs an SPDX license identifier right?
>>
>> Yeah, but I wasn't sure of adding it, since the code I moved (even if
>> rearranged) from _types.h does not have it either. Any legal expert
>> here? Is _types.h implicitly GPL 2? We should add the identifier to
>> both if so.
>
> It looks like we missed that file in the big "properly add SPDX
> identifiers to all files without a license" commit as it came in from a
> different tree.
>
> But yes, it is GPLv2 only implicitly, so you can add that.  I should
> sweep the tree again to see if anything else has been added accidentally
> with that same problem.
>

Thanks Greg! Will do for v3.

Cheers,
Miguel


Re: [PATCH 7/7] Compiler Attributes: use feature checks instead of version checks

2018-09-01 Thread Greg KH
On Sat, Sep 01, 2018 at 03:38:13PM +0200, Miguel Ojeda wrote:
> Hi Nick,
> 
> On Sat, Sep 1, 2018 at 1:07 AM, Nick Desaulniers
>  wrote:
> > Overall, pretty happy with this patch.  Still some thoughts for a v3,
> >> -#define __visible  __attribute__((externally_visible))
> >> diff --git a/include/linux/compiler_attributes.h 
> >> b/include/linux/compiler_attributes.h
> >> new file mode 100644
> >> index ..a9dfafc8fd19
> >> --- /dev/null
> >> +++ b/include/linux/compiler_attributes.h
> >> @@ -0,0 +1,226 @@
> >
> > New file needs an SPDX license identifier right?
> 
> Yeah, but I wasn't sure of adding it, since the code I moved (even if
> rearranged) from _types.h does not have it either. Any legal expert
> here? Is _types.h implicitly GPL 2? We should add the identifier to
> both if so.

It looks like we missed that file in the big "properly add SPDX
identifiers to all files without a license" commit as it came in from a
different tree.

But yes, it is GPLv2 only implicitly, so you can add that.  I should
sweep the tree again to see if anything else has been added accidentally
with that same problem.

thanks,

greg k-h


Re: Access to non-RAM pages

2018-09-01 Thread Linus Torvalds
[ Adding a few new people to the cc.

  The issue is the worry about software-speculative accesses (ie
things like CONFIG_DCACHE_WORD_ACCESS - not talking about the hw
speculation now) accessing past RAM into possibly contiguous IO ]

On Sat, Sep 1, 2018 at 10:27 AM Linus Torvalds
 wrote:
>
> If you have a machine with RAM that touches IO, you need to disable
> the last page, exactly the same way we disable and mark as reserved the
> first page at zero.
>
> I thought we already did that.

We don't seem to do that.

And it's not just the last page, it's _any_ last page in a region that
bumps up to IO. That's actually much more common in the low 4G area on
PCs, I suspect, although the reserved BIOS ranges always tend to be
there.

I suspect it should be trivial to do - maybe in
e820__memblock_setup()? That's where we already trim partial pages
etc.

In fact, I think this might be done as an extension of commit
124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into
memblock.reserved"), except making sure that non-RAM regions mark one
page _previous_ as reserved too.

I assume memory hotplug might have the same issue, and it would be worth
checking whether ARM64 and powerpc have already done something like this
(or might need to add it).

We discussed long ago the case of user space mapping IO in user space,
and decided we didn't care. But the kernel should probably explicitly
make sure we don't either, even if I can't recall having ever seen a
machine that actually maps IO contiguously to RAM. The layout always
tends to end up having holes anyway.

  Linus
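
A minimal user-space sketch of the idea above, not the real
e820__memblock_setup() code: given a sorted, page-aligned region table, find
the last RAM page in front of every non-RAM region that directly follows RAM.
All names (sketch_region, sketch_guard_pages, SKETCH_PAGE_SIZE) are
hypothetical, and a 4 KiB page size is assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed page size */

enum sketch_type { SKETCH_RAM, SKETCH_NON_RAM };

struct sketch_region {
	uint64_t start;		/* page-aligned start address */
	uint64_t size;		/* page-aligned size in bytes */
	enum sketch_type type;
};

/*
 * Walk a sorted, non-overlapping map. For every RAM region whose end
 * touches the start of a non-RAM region, record the address of the last
 * RAM page so a caller could mark it reserved. Returns the number of such
 * pages found; at most out_cap addresses are stored in out[].
 */
size_t sketch_guard_pages(const struct sketch_region *map, size_t n,
			  uint64_t *out, size_t out_cap)
{
	size_t count = 0;

	for (size_t i = 0; i + 1 < n; i++) {
		const struct sketch_region *cur = &map[i];
		const struct sketch_region *next = &map[i + 1];

		if (cur->type != SKETCH_RAM || next->type == SKETCH_RAM)
			continue;
		/* A hole between the two regions needs no guard page. */
		if (cur->start + cur->size != next->start)
			continue;
		if (cur->size < SKETCH_PAGE_SIZE)
			continue;
		if (count < out_cap)
			out[count] = next->start - SKETCH_PAGE_SIZE;
		count++;
	}
	return count;
}
```

A real implementation would presumably reserve the page through memblock
rather than collect addresses; that part is left out here.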


[PATCH] ACPI / LPSS: Ensure LPIOEP is always set on resume

2018-09-01 Thread William Lieurance
For some number of systems with lpss_quirks enabled, on boot the system
goes through an acpi_lpss_resume() without a corresponding
acpi_lpss_suspend() having been called.  In that case, it requires the
IOSF write to LPSS_IOSF_UNIT_LPIOEP / LPSS_IOSF_GPIODEF0 in order to
continue booting successfully.

Signed-off-by: William Lieurance 
---
 drivers/acpi/acpi_lpss.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/acpi_lpss.c b/drivers/acpi/acpi_lpss.c
index 9706613eecf9..c7790ba943d4 100644
--- a/drivers/acpi/acpi_lpss.c
+++ b/drivers/acpi/acpi_lpss.c
@@ -939,14 +939,14 @@ static void lpss_iosf_exit_d3_state(void)
 
 	mutex_lock(&lpss_iosf_mutex);
 
+	iosf_mbi_modify(LPSS_IOSF_UNIT_LPIOEP, MBI_CR_WRITE,
+			LPSS_IOSF_GPIODEF0, value1, mask1);
+
 	if (!lpss_iosf_d3_entered)
 		goto exit;
 
 	lpss_iosf_d3_entered = false;
 
-	iosf_mbi_modify(LPSS_IOSF_UNIT_LPIOEP, MBI_CR_WRITE,
-			LPSS_IOSF_GPIODEF0, value1, mask1);
-
 	iosf_mbi_modify(LPSS_IOSF_UNIT_LPIO2, MBI_CFG_WRITE,
 			LPSS_IOSF_PMCSR, value2, mask2);
 
-- 
2.17.1
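
The control-flow change in the hunk above can be modelled in user space
roughly as follows. This is only a sketch: the counters stand in for the
iosf_mbi_modify() calls, locking is omitted, and sketch_state and
sketch_exit_d3() are hypothetical names.

```c
#include <stdbool.h>

struct sketch_state {
	bool d3_entered;	/* models lpss_iosf_d3_entered */
	int lpioep_writes;	/* models the LPIOEP / GPIODEF0 write */
	int lpio2_writes;	/* models the LPIO2 / PMCSR write */
};

/* Models lpss_iosf_exit_d3_state() after the patch is applied. */
void sketch_exit_d3(struct sketch_state *s)
{
	/* Now unconditional: runs even if suspend was never called. */
	s->lpioep_writes++;

	if (!s->d3_entered)
		return;

	s->d3_entered = false;

	/* Still done only when D3 was actually entered before. */
	s->lpio2_writes++;
}
```

With d3_entered false, which is the boot case described in the commit
message, only the LPIOEP write happens.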



Re: 4.19-rc1: ./include/linux/rcupdate.h:631 rcu_read_lock() used illegally while idle!

2018-09-01 Thread Paul E. McKenney
On Sat, Sep 01, 2018 at 07:35:59PM +0200, Borislav Petkov wrote:
> This is a huge splat! It haz some perf* and sched* in it, I guess for
> peterz to stare at. And lemme add Paul for good measure too :)
> 
> Kernel is -rc1 + 3 microcode loader patches on top which should not be
> related.

It really is tracing from the idle loop.  But I thought that the event
tracing took care of that.  Adding Steve and Joel for their thoughts.

Thanx, Paul
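
As a rough mental model of what "took care of that" would mean, here is a toy
user-space sketch. It is not kernel code and assumes a much simplified picture
in which each CPU has one flag saying whether RCU is watching it. The idea it
illustrates, as in the _rcuidle tracepoint variants, is that a trace event
fired from idle first makes RCU watch the CPU and only then takes the read
lock. Every name below is hypothetical.

```c
#include <stdbool.h>

struct sketch_cpu {
	bool rcu_watching;	/* false while in an extended quiescent state */
	int violations;		/* "used illegally while idle" reports */
	int events;		/* trace events delivered */
};

/* Models the debug check behind rcu_read_lock(). */
void sketch_rcu_read_lock(struct sketch_cpu *cpu)
{
	if (!cpu->rcu_watching)
		cpu->violations++;
}

/* A plain trace event: takes the read lock wherever it is called from. */
void sketch_trace_event(struct sketch_cpu *cpu)
{
	sketch_rcu_read_lock(cpu);
	cpu->events++;
}

/* Idle-safe variant: make RCU watch the CPU around the event. */
void sketch_trace_event_rcuidle(struct sketch_cpu *cpu)
{
	bool was_watching = cpu->rcu_watching;

	cpu->rcu_watching = true;
	sketch_trace_event(cpu);
	cpu->rcu_watching = was_watching;
}
```

In this model the splat corresponds to the plain variant being reached from
the idle loop.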

> Thx.
> 
> ---
> [   62.409125] =
> [   62.409129] WARNING: suspicious RCU usage
> [   62.409133] 4.19.0-rc1+ #1 Not tainted
> [   62.409136] -
> [   62.409140] ./include/linux/rcupdate.h:631 rcu_read_lock() used illegally 
> while idle!
> [   62.409143] 
>other info that might help us debug this:
> 
> [   62.409147] 
>RCU used illegally from idle CPU!
>rcu_scheduler_active = 2, debug_locks = 1
> [   62.409151] RCU used illegally from extended quiescent state!
> [   62.409155] 1 lock held by swapper/0/0:
> [   62.409158]  #0: 4557ee0e (rcu_read_lock){}, at: 
> perf_event_output_forward+0x0/0x130
> [   62.409175] 
>stack backtrace:
> [   62.409180] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc1+ #1
> [   62.409183] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 
> 11/13/2012
> [   62.409187] Call Trace:
> [   62.409196]  dump_stack+0x85/0xcb
> [   62.409203]  perf_event_output_forward+0xf6/0x130
> [   62.409215]  __perf_event_overflow+0x52/0xe0
> [   62.409223]  perf_swevent_overflow+0x91/0xb0
> [   62.409229]  perf_tp_event+0x11a/0x350
> [   62.409235]  ? find_held_lock+0x2d/0x90
> [   62.409251]  ? __lock_acquire+0x2ce/0x1350
> [   62.409263]  ? __lock_acquire+0x2ce/0x1350
> [   62.409270]  ? retint_kernel+0x2d/0x2d
> [   62.409278]  ? find_held_lock+0x2d/0x90
> [   62.409285]  ? tick_nohz_get_sleep_length+0x83/0xb0
> [   62.409299]  ? perf_trace_cpu+0xbb/0xd0
> [   62.409306]  ? perf_trace_buf_alloc+0x5a/0xa0
> [   62.409311]  perf_trace_cpu+0xbb/0xd0
> [   62.409323]  cpuidle_enter_state+0x185/0x340
> [   62.409332]  do_idle+0x1eb/0x260
> [   62.409340]  cpu_startup_entry+0x5f/0x70
> [   62.409347]  start_kernel+0x49b/0x4a6
> 
> [   62.409357]  secondary_startup_64+0xa4/0xb0
> 
> [   62.409374] =
> [   62.409375] WARNING: suspicious RCU usage
> [   62.409377] 4.19.0-rc1+ #1 Not tainted
> [   62.409378] -
> [   62.409380] kernel/events/ring_buffer.c:138 suspicious 
> rcu_dereference_check() usage!
> [   62.409381] 
>other info that might help us debug this:
> 
> [   62.409382] 
>RCU used illegally from idle CPU!
>rcu_scheduler_active = 2, debug_locks = 1
> [   62.409384] RCU used illegally from extended quiescent state!
> [   62.409386] 2 locks held by swapper/0/0:
> [   62.409387]  #0: 4557ee0e (
> [   62.409390] =
> [   62.409391] WARNING: suspicious RCU usage
> [   62.409393] rcu_read_lock){}, at: perf_event_output_forward+0x0/0x130
> [   62.409398] 4.19.0-rc1+ #1 Not tainted
> [   62.409399] -
> [   62.409400]  #1: 4557ee0e
> [   62.409403] ./include/linux/rcupdate.h:631 rcu_read_lock() used illegally 
> while idle!
> [   62.409403]  (rcu_read_lock){}
> [   62.409406] 
>other info that might help us debug this:
> 
> [   62.409408] , at: perf_output_begin_forward+0x5/0x320
> [   62.409409] 
>stack backtrace:
> [   62.409412] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc1+ #1
> [   62.409413] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 
> 11/13/2012
> [   62.409414] Call Trace:
> [   62.409418]  dump_stack+0x85/0xcb
> [   62.409422]  perf_output_begin_forward+0x2d2/0x320
> [   62.409423] 
>RCU used illegally from idle CPU!
>rcu_scheduler_active = 2, debug_locks = 1
> [   62.409424] RCU used illegally from extended quiescent state!
> [   62.409428]  ? find_held_lock+0x2d/0x90
> [   62.409433]  ? vprintk_emit+0x2ce/0x340
> [   62.409434] 2 locks held by swapper/2/0:
> [   62.409435]  #0: 4557ee0e (rcu_read_lock){}, at: 
> perf_event_output_forward+0x0/0x130
> [   62.409445]  ? find_held_lock+0x2d/0x90
> [   62.409449]  ? is_bpf_text_address+0x65/0xe0
> [   62.409450]  #1: 4557ee0e (rcu_read_lock){}, at: 
> perf_output_begin_forward+0x5/0x320
> [   62.409457] 
>stack backtrace:
> [   62.409462]  ? rcu_dynticks_eqs_enter+0x12/0x30
> [   62.409466]  ? kernel_text_address+0x8f/0xf0
> [   62.409472]  ? __kernel_text_address+0xe/0x30
> [   62.409477]  ? show_trace_log_lvl+0x19f/0x3d0
> [   62.409484]  ? secondary_startup_64+0xa4/0xb0
> 
> [   62.409492] =
> [   62.409494]  ? sched_clock+0x5/0x10
> [   62.409496]  ? sched_clock+0x5/0x10

Re: [PATCH] y2038: Remove newstat family from default syscall set

2018-09-01 Thread Guenter Roeck
Hi Arnd,

On Fri, Apr 13, 2018 at 11:50:12AM +0200, Arnd Bergmann wrote:
> We have four generations of stat() syscalls:
> - the oldstat syscalls that are only used on the older architectures
> - the newstat family that is used on all 64-bit architectures but
>   lacked support for large files on 32-bit architectures.
> - the stat64 family that is used mostly on 32-bit architectures to
>   replace newstat
> - statx() to replace all of the above, adding 64-bit timestamps among
>   other things.
> 
> We already compile stat64 only on those architectures that need it,
> but newstat is always built, including on those that don't reference
> it. This adds a new __ARCH_WANT_NEW_STAT symbol along the lines of
> __ARCH_WANT_OLD_STAT and __ARCH_WANT_STAT64 to control compilation of
> newstat. All architectures that need it use an explicit define, the
> others now get a little bit smaller, and future architectures (including
> 64-bit targets) won't ever see it.
> 

This patch causes my riscv boot tests to crash in -next:

sbin/init: error while loading shared libraries: libc.so.6: cannot stat shared object: Error 38
Kernel panic - not syncing: Attempted to kill init! exitcode=0x7f00

The following change fixes the problem for me, but of course I have no idea
if it is correct. Copying RISC-V maintainers for input.

Guenter

---
diff --git a/arch/riscv/include/asm/unistd.h b/arch/riscv/include/asm/unistd.h
index 0caea01d5cca..eff7aa9aa163 100644
--- a/arch/riscv/include/asm/unistd.h
+++ b/arch/riscv/include/asm/unistd.h
@@ -16,6 +16,7 @@
  * be included multiple times.  See uapi/asm/syscalls.h for more info.
  */
 
+#define __ARCH_WANT_NEW_STAT
 #define __ARCH_WANT_SYS_CLONE
 #include 
 #include 
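
For context, error 38 is ENOSYS in the generic Linux errno numbering, which
fits a syscall that is no longer compiled in. The sketch below shows the
general gating pattern in user space only; SKETCH_ARCH_WANT_NEW_STAT and
sketch_sys_newstat() are hypothetical stand-ins and not the kernel's actual
syscall plumbing.

```c
#include <errno.h>

/*
 * Build with -DSKETCH_ARCH_WANT_NEW_STAT to model an architecture that
 * opts in. Without the define the entry point is only a stub.
 */
#ifdef SKETCH_ARCH_WANT_NEW_STAT
long sketch_sys_newstat(void)
{
	return 0;		/* pretend the stat call succeeded */
}
#else
long sketch_sys_newstat(void)
{
	return -ENOSYS;		/* what user space sees as "Error 38" */
}
#endif
```

An architecture that forgets the define ends up on the stub path, which is
what the one-line change above avoids for riscv.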


Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO

2018-09-01 Thread Andy Lutomirski
On Sat, Sep 1, 2018 at 2:33 AM, Florian Weimer  wrote:
> On 09/01/2018 05:39 AM, Andy Lutomirski wrote:
>>
>> Florian, do you think
>> glibc would be willing to add some magic to turn
>> clock_gettime(CLOCK_MONOTONIC, t) into
>> __vdso_clock_gettime_monotonic(t) when CLOCK_MONOTONIC is a constant?
>
>
> What's the goal here?  Turn the indirect call/conditional jump/indirect call
> sequence into a single indirect call, purely for performance reasons?

Almost.  It's to bypass some of the branches in
__vdso_clock_gettime(), which is supposed to be very fast.  AFAIK most
user code that uses clock_gettime() passes a constant for the first
argument, and we can squeeze out some performance by optimizing that
case.  The indirect branches internal to the vDSO are a separate issue
and should be solved separately.

(It's really too bad that x86 doesn't have a 64-bit call instruction.
If it did, then the PLT could get rewritten at dynamic link time to
avoid indirect calls entirely, and presumably glibc could use the same
technique to call into the vDSO without indirect calls.)
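
A user-space sketch of the constant-argument idea, assuming GCC or Clang for
__builtin_constant_p(). The per-clock function here simply calls the generic
clock_gettime(); it stands in for a dedicated entry point such as the
proposed __vdso_clock_gettime_monotonic(), which does not exist today. All
sketch_ names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical counter so the fast path is observable in this sketch. */
static int sketch_fast_path_hits;

/* Stand-in for a per-clock entry point with no clock id to branch on. */
static inline int sketch_gettime_monotonic(struct timespec *ts)
{
	sketch_fast_path_hits++;
	return clock_gettime(CLOCK_MONOTONIC, ts);
}

/*
 * Route compile-time constant CLOCK_MONOTONIC requests to the dedicated
 * function and everything else to the generic call. Note that the macro
 * evaluates id more than once.
 */
#define sketch_clock_gettime(id, ts)					\
	((__builtin_constant_p(id) && (id) == CLOCK_MONOTONIC)		\
		? sketch_gettime_monotonic(ts)				\
		: clock_gettime((id), (ts)))
```

A caller that passes the literal CLOCK_MONOTONIC takes the dedicated path;
a clock id that is not a compile-time constant falls through to the generic
call.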


4.19-rc1: ./include/linux/rcupdate.h:631 rcu_read_lock() used illegally while idle!

2018-09-01 Thread Borislav Petkov
This is a huge splat! It haz some perf* and sched* in it, I guess for
peterz to stare at. And lemme add Paul for good measure too :)

Kernel is -rc1 + 3 microcode loader patches on top which should not be
related.

Thx.

---
[   62.409125] =
[   62.409129] WARNING: suspicious RCU usage
[   62.409133] 4.19.0-rc1+ #1 Not tainted
[   62.409136] -
[   62.409140] ./include/linux/rcupdate.h:631 rcu_read_lock() used illegally 
while idle!
[   62.409143] 
   other info that might help us debug this:

[   62.409147] 
   RCU used illegally from idle CPU!
   rcu_scheduler_active = 2, debug_locks = 1
[   62.409151] RCU used illegally from extended quiescent state!
[   62.409155] 1 lock held by swapper/0/0:
[   62.409158]  #0: 4557ee0e (rcu_read_lock){}, at: 
perf_event_output_forward+0x0/0x130
[   62.409175] 
   stack backtrace:
[   62.409180] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc1+ #1
[   62.409183] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 
11/13/2012
[   62.409187] Call Trace:
[   62.409196]  dump_stack+0x85/0xcb
[   62.409203]  perf_event_output_forward+0xf6/0x130
[   62.409215]  __perf_event_overflow+0x52/0xe0
[   62.409223]  perf_swevent_overflow+0x91/0xb0
[   62.409229]  perf_tp_event+0x11a/0x350
[   62.409235]  ? find_held_lock+0x2d/0x90
[   62.409251]  ? __lock_acquire+0x2ce/0x1350
[   62.409263]  ? __lock_acquire+0x2ce/0x1350
[   62.409270]  ? retint_kernel+0x2d/0x2d
[   62.409278]  ? find_held_lock+0x2d/0x90
[   62.409285]  ? tick_nohz_get_sleep_length+0x83/0xb0
[   62.409299]  ? perf_trace_cpu+0xbb/0xd0
[   62.409306]  ? perf_trace_buf_alloc+0x5a/0xa0
[   62.409311]  perf_trace_cpu+0xbb/0xd0
[   62.409323]  cpuidle_enter_state+0x185/0x340
[   62.409332]  do_idle+0x1eb/0x260
[   62.409340]  cpu_startup_entry+0x5f/0x70
[   62.409347]  start_kernel+0x49b/0x4a6

[   62.409357]  secondary_startup_64+0xa4/0xb0

[   62.409374] =
[   62.409375] WARNING: suspicious RCU usage
[   62.409377] 4.19.0-rc1+ #1 Not tainted
[   62.409378] -
[   62.409380] kernel/events/ring_buffer.c:138 suspicious 
rcu_dereference_check() usage!
[   62.409381] 
   other info that might help us debug this:

[   62.409382] 
   RCU used illegally from idle CPU!
   rcu_scheduler_active = 2, debug_locks = 1
[   62.409384] RCU used illegally from extended quiescent state!
[   62.409386] 2 locks held by swapper/0/0:
[   62.409387]  #0: 4557ee0e (
[   62.409390] =
[   62.409391] WARNING: suspicious RCU usage
[   62.409393] rcu_read_lock){}, at: perf_event_output_forward+0x0/0x130
[   62.409398] 4.19.0-rc1+ #1 Not tainted
[   62.409399] -
[   62.409400]  #1: 4557ee0e
[   62.409403] ./include/linux/rcupdate.h:631 rcu_read_lock() used illegally 
while idle!
[   62.409403]  (rcu_read_lock){}
[   62.409406] 
   other info that might help us debug this:

[   62.409408] , at: perf_output_begin_forward+0x5/0x320
[   62.409409] 
   stack backtrace:
[   62.409412] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc1+ #1
[   62.409413] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 
11/13/2012
[   62.409414] Call Trace:
[   62.409418]  dump_stack+0x85/0xcb
[   62.409422]  perf_output_begin_forward+0x2d2/0x320
[   62.409423] 
   RCU used illegally from idle CPU!
   rcu_scheduler_active = 2, debug_locks = 1
[   62.409424] RCU used illegally from extended quiescent state!
[   62.409428]  ? find_held_lock+0x2d/0x90
[   62.409433]  ? vprintk_emit+0x2ce/0x340
[   62.409434] 2 locks held by swapper/2/0:
[   62.409435]  #0: 4557ee0e (rcu_read_lock){}, at: 
perf_event_output_forward+0x0/0x130
[   62.409445]  ? find_held_lock+0x2d/0x90
[   62.409449]  ? is_bpf_text_address+0x65/0xe0
[   62.409450]  #1: 4557ee0e (rcu_read_lock){}, at: 
perf_output_begin_forward+0x5/0x320
[   62.409457] 
   stack backtrace:
[   62.409462]  ? rcu_dynticks_eqs_enter+0x12/0x30
[   62.409466]  ? kernel_text_address+0x8f/0xf0
[   62.409472]  ? __kernel_text_address+0xe/0x30
[   62.409477]  ? show_trace_log_lvl+0x19f/0x3d0
[   62.409484]  ? secondary_startup_64+0xa4/0xb0

[   62.409492] =
[   62.409494]  ? sched_clock+0x5/0x10
[   62.409496]  ? sched_clock+0x5/0x10
[   62.409500]  ? sched_clock_cpu+0x10/0xd0
[   62.409504] WARNING: suspicious RCU usage
[   62.409506]  ? perf_event_output_forward+0x70/0x130
[   62.409508]  ? perf_prepare_sample+0x53/0x460
[   62.409513] 4.19.0-rc1+ #1 Not tainted
[   62.409514]  perf_event_output_forward+0x70/0x130
[   62.409518] -
[   62.409522] ./include/linux/rcupdate.h:680 rcu_read_unlock() used illegally 
while idle!
[   62.409523]  __perf_event_overflow+0x52/0xe0
[   62.409528]  

Re: [PATCH 2/3] x86/entry/64: Use the TSS sp2 slot for rsp_scratch

2018-09-01 Thread Andy Lutomirski
On Sat, Sep 1, 2018 at 9:33 AM, Linus Torvalds
 wrote:
> On Fri, Aug 31, 2018 at 3:21 PM Andy Lutomirski  wrote:
>>
>>  #ifdef CONFIG_X86_64
>>  # define cpu_current_top_of_stack (cpu_tss_rw + TSS_sp1)
>> +# define rsp_scratch (cpu_tss_rw + TSS_sp2)
>>  #endif
>
> Ugh. The above gets used by *assembler* code. I was really confused by how 
> this:
>
>
>> --- a/arch/x86/kernel/process_64.c
>> +++ b/arch/x86/kernel/process_64.c
>> @@ -59,8 +59,6 @@
>>  #include 
>>  #endif
>>
>> -__visible DEFINE_PER_CPU(unsigned long, rsp_scratch);
>> -
>
> could continue to work despite the accesses to "rsp_scratch" still
> remaining in the asm files.
>
> Can you humor me, and just not do something quite that subtle? I must
> have missed this the first time around.
>
> Please get rid of the define, and just make the asm code spell out
> what it actually does.

Done for v2.

>
> We already do that for TSS_sp0 for the normal case:
>
>   movq    PER_CPU_VAR(cpu_tss_rw + TSS_sp0), %rsp
>
> so I think this should just change
>
> - movq    %rsp, PER_CPU_VAR(rsp_scratch)
> + movq    %rsp, PER_CPU_VAR(cpu_tss_rw + TSS_sp2)
>
> instead of having that subtle rsp_scratch thing.
>
> And honestly, I think we should strive to do the same thing with
> cpu_current_top_of_stack. There at least the #define currently makes
> sense (because on 32-bit, it's actually a percpu variable, on 64-bit
> it's that sp1 field).
>
> But wouldn't it be nice to just unify 32-bit and 64-bit in this
> respect, and get rid of that subtle difference?
>

Yes.  But ugh, the way that thing has worked has changed so many times
on 32-bit and 64-bit that I've lost track a little bit.  I'll put it
on my long list of things to clean up.
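
The objection to the #define can be illustrated in user space. What follows is only a minimal sketch under stated assumptions: struct sketch_tss, SKETCH_TSS_sp2 and both helper functions are hypothetical stand-ins invented for illustration, not the kernel's cpu_tss_rw layout or its per-cpu accessors. Both helpers store to the same slot, but only the explicit one shows at the point of use which field is written.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the three TSS stack-pointer slots. */
struct sketch_tss {
    uint64_t sp0;
    uint64_t sp1;
    uint64_t sp2;
};

/* Byte offset of the sp2 slot, analogous in spirit to an asm-offsets constant. */
#define SKETCH_TSS_sp2 offsetof(struct sketch_tss, sp2)

/* Aliased form: the name hides the fact that the storage is the sp2 slot. */
#define sketch_rsp_scratch(tss) ((tss)->sp2)

static void save_scratch_aliased(struct sketch_tss *tss, uint64_t rsp)
{
    sketch_rsp_scratch(tss) = rsp;
}

/* Explicit form: base plus named offset, so the reader sees the slot used. */
static void save_scratch_explicit(struct sketch_tss *tss, uint64_t rsp)
{
    memcpy((char *)tss + SKETCH_TSS_sp2, &rsp, sizeof(rsp));
}
```

The two forms behave identically; the difference is purely in what a reader of the call site can see.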


Re: Access to non-RAM pages

2018-09-01 Thread Linus Torvalds
On Fri, Aug 31, 2018 at 2:18 PM Jiri Kosina  wrote:
>
> If no one has any clever idea how to work this around (I don't), I am
> afraid we'd have to ditch the whole DCACHE_WORD_ACCESS optimization, as
> it's silently dangerous.

No way in hell will I apply such a stupid patch.

It is NOT dangerous.

If you have a machine with RAM that touches IO, you need to disable
the last page, exactly the same way we disable and mark as reserved the
first page at zero.

I thought we already did that.

I suspect this is a Xen bug, where the fake BIOS sets up a garbage
description of the hardware that is simply not realistic. I don't
think I've ever seen a machine that didn't have some reserved memory
at the top, but hey, if we don't expressly mark the last page reserved
already, doing so should be trivial.

No way do we disable the word accesses just because of some crazy
corner case that doesn't matter and doesn't happen in reality.

   Linus
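
For readers unfamiliar with the word accesses under discussion, here is a minimal user-space sketch of scanning a string one 64-bit word at a time. It is not the kernel's load_unaligned_zeropad() or its dcache code; the names word_has_zero and wordwise_strlen are made up for illustration, and the caller is assumed to pass a buffer whose full capacity is readable. The point it shows is that the word loop reads the bytes that follow the terminating NUL within the same word, which is why those bytes must be safe to read.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ONES  0x0101010101010101ULL
#define SKETCH_HIGHS 0x8080808080808080ULL

/* Nonzero if any of the eight bytes in x is zero (classic bit trick). */
static int word_has_zero(uint64_t x)
{
    return ((x - SKETCH_ONES) & ~x & SKETCH_HIGHS) != 0;
}

/*
 * Length of the NUL-terminated string in buf, scanning eight bytes at a
 * time. Only bytes inside [buf, buf + cap) are read, but whole words are
 * loaded, so up to seven bytes past the terminator are touched.
 */
static size_t wordwise_strlen(const char *buf, size_t cap)
{
    size_t i = 0;

    for (; i + 8 <= cap; i += 8) {
        uint64_t w;

        memcpy(&w, buf + i, sizeof(w)); /* alignment-safe word load */
        if (word_has_zero(w))
            break;
    }
    /* Finish byte by byte inside the word that contains the NUL. */
    while (i < cap && buf[i] != '\0')
        i++;
    return i;
}
```

In this sketch the over-read is harmless because the buffer capacity is known; the thread is about the case where the following bytes belong to a different page.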


Re: Access to non-RAM pages

2018-09-01 Thread Al Viro
On Sat, Sep 01, 2018 at 12:47:48PM +0200, Juergen Gross wrote:
> On 31/08/18 23:18, Jiri Kosina wrote:
> > On Wed, 29 Aug 2018, Juergen Gross wrote:
> > 
> >> While being very unlikely I still believe this is possible. Any
> >> thoughts?
> > 
> > So in theory we should somehow test whether the next page is some form of 
> > mmio/gart/... mapping, but I guess that by itself would kill the 
> > performance advantage of the whole load_unaligned_zeropad() trick.
> > 
> > And yes, the side effects of reading from an mmio mapping are a real thing --
> > see for example the issue fixed by 2a3e83c6f9.
> >
> > If no one has any clever idea how to work this around (I don't), I am
> > afraid we'd have to ditch the whole DCACHE_WORD_ACCESS optimization, as 
> > it's silently dangerous.
> 
> Are there any architectures which can still use DCACHE_WORD_ACCESS?
> 
> x86 shouldn't, what about powerpc, arm and arm64?

IMO that's crap.  In the absolute majority of cases there is a guaranteed gap
between the end of accessed object and the next page boundary.  Penalizing
each syscall that does pathname resolution to deal with something that
might only happen when the pathname length is just under 4Kb looks like
a bloody bad idea.
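
The arithmetic behind "only near the page boundary" can be sketched as follows. This is an illustration under stated assumptions, not kernel code: a 4096-byte page and the helper names below are chosen for the example only. A word load spills into the next page only when it starts within the last (word size - 1) bytes of a page.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this sketch. */
#define SKETCH_PAGE_SIZE 4096u

/* Nonzero if a load of word_size bytes starting at addr spans two pages. */
static int word_load_crosses_page(uintptr_t addr, size_t word_size)
{
    uintptr_t off = addr & (SKETCH_PAGE_SIZE - 1);

    return off + word_size > SKETCH_PAGE_SIZE;
}

/* Count how many starting offsets within one page make the load cross. */
static unsigned int count_crossing_offsets(size_t word_size)
{
    unsigned int n = 0;
    uintptr_t off;

    for (off = 0; off < SKETCH_PAGE_SIZE; off++)
        n += word_load_crosses_page(off, word_size) ? 1u : 0u;
    return n;
}
```

For an 8-byte word, count_crossing_offsets() finds 7 crossing offsets out of 4096, which is the small window the message refers to.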


Re: [PATCH 1/1] Update AMD cpu microcode for family 15h

2018-09-01 Thread Rudolf Marek
Hi again,

Here is a short summary of what is missing in the microcode containers [1] [2].
I only included AMD family 15h and 17h.
Something similar could be done for Intel CPUs.

I do believe that having the latest microcode is vital for userspace security,
because it provides the IBPB barrier.

Family 15h [1] container parsed with [4] (with some lines omitted)

-- Processor Signature:   : 0x00600f20
-- Processor Revision ID: : 0x6020

-- Processor Signature:   : 0x00610f01
-- Processor Revision ID: : 0x6101

-- Processor Signature:   : 0x00600f12
-- Processor Revision ID: : 0x6012

It contains the following microcodes:

| # | eqrev | urev     | date       | latest |
| 1 | 6012  | 0600063E | 2018/02/07 | yes    |
| 2 | 6020  | 06000852 | 2018/02/06 | yes    |
| 3 | 6101  | 06001119 | 2012/07/13 | no     |

Note that #3 is what I have been complaining about.

Family 17h [2] parsed with [4]

The container seems to include the equivalent versions for various CPUs (some
not even family 17h), but microcode only for "Naples/EPYC" chips.

Container Processor Signature Table: 
-- Processor Signature:   : 0x00600f20 (not even a fam17h)
-- Processor Revision ID: : 0x6020

-- Processor Signature:   : 0x00610f01 (not even a fam17h)
-- Processor Revision ID: : 0x6101

-- Processor Signature:   : 0x00700f01 (not even a fam17h)
-- Processor Revision ID: : 0x7001

-- Processor Signature:   : 0x00800f12 (update is OK)
-- Processor Revision ID: : 0x8012

-- Processor Signature:   : 0x00800f11 (update is missing!)
-- Processor Revision ID: : 0x8011

-- Processor Signature:   : 0x00600f12 (not even a fam17h)
-- Processor Revision ID: : 0x6012

-- Processor Signature:   : 0x00800f13 (future CPU?)
-- Processor Revision ID: : 0x8013

-- Processor Signature:   : 0x00800f00 (perhaps ES?)
-- Processor Revision ID: : 0x8000

Microcode Type : 0x0001
Microcode Size : 0x0c80
Date           : 2018/02/09
Patch ID       : 0x08001227
Patch Data ID  : 0x8004

| # | eqrev | urev     | date       | latest |
| 1 | 8004  | 08001227 | 2018/02/09 | yes    |

It is missing the microcode update for 00800F11 (latest known should be
2018/02/14), and updates for other CPUs such as Pinnacle Ridge 00800F82
(latest known should be 2018/02/12) or Ryzen mobile 00810F10, etc.

Thanks
Rudolf

Resources used to construct these tables:

[1] https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amd-ucode/microcode_amd_fam15h.bin
[2] https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amd-ucode/microcode_amd_fam17h.bin
[3] http://users.atw.hu/instlatx64/
[4] https://github.com/ddcc/microparse
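
The "(not even a fam17h)" annotations in the signature table can be checked mechanically from the processor signature, which is the value CPUID leaf 1 returns in EAX. The following is a minimal sketch, assuming the standard CPUID encoding; the function names are made up for illustration and are not taken from microparse [4] or from the kernel's microcode loader. The model decoding shown here is only exercised for signatures whose base family is 0xF, which covers every signature listed above.

```c
#include <stdint.h>

/*
 * Display family: base family is bits 11:8; when it is 0xF, the extended
 * family (bits 27:20) is added to it.
 */
static unsigned int cpuid_display_family(uint32_t sig)
{
    unsigned int base = (sig >> 8) & 0xFu;
    unsigned int ext = (sig >> 20) & 0xFFu;

    return base == 0xFu ? base + ext : base;
}

/*
 * Display model: base model is bits 7:4; for base family 0xF the extended
 * model (bits 19:16) supplies the upper nibble.
 */
static unsigned int cpuid_display_model(uint32_t sig)
{
    unsigned int base_family = (sig >> 8) & 0xFu;
    unsigned int base_model = (sig >> 4) & 0xFu;
    unsigned int ext_model = (sig >> 16) & 0xFu;

    return base_family == 0xFu ? (ext_model << 4) | base_model : base_model;
}

/* Stepping is the low nibble of the signature. */
static unsigned int cpuid_stepping(uint32_t sig)
{
    return sig & 0xFu;
}
```

With this decoding, 0x00600f20 and 0x00610f01 come out as family 15h, 0x00700f01 as family 16h, and 0x00800f12 and 0x00810f10 as family 17h.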



Re: [PATCH RFC LKMM 7/7] EXP tools/memory-model: Add .cfg and .cat files for s390

2018-09-01 Thread Paul E. McKenney
On Fri, Aug 31, 2018 at 05:06:30PM +0100, Will Deacon wrote:
> Hi Paul,
> 
> On Wed, Aug 29, 2018 at 02:10:53PM -0700, Paul E. McKenney wrote:
> > This commit adds s390.cat and s390.cfg files to allow users to check
> > litmus tests for s390-specific code.  Note that this change only enables
> > herd7 checking of C-language litmus tests.  Larger changes are required
> > to enable the litmus7 and klitmus7 tools to check litmus tests on real
> > hardware.
> > 
> > Suggested-by: Martin Schwidefsky 
> > Suggested-by: Christian Borntraeger 
> > Signed-off-by: Paul E. McKenney 
> > [ paulmck: Add fixes suggested by Alan Stern. ]
> > ---
> >  tools/memory-model/s390.cat | 18 ++
> >  tools/memory-model/s390.cfg | 21 +
> >  2 files changed, 39 insertions(+)
> >  create mode 100644 tools/memory-model/s390.cat
> >  create mode 100644 tools/memory-model/s390.cfg
> 
> As I said before, I'd *much* prefer this to be part of the upstream
> herdtools7 repository. It's not really anything to do with the Linux
> kernel, so I don't think it belongs in the source tree.

Agreed.  As the cover letter says, "Add .cfg and .cat files for s390,
which is a not-for-mainline placeholder."

Thanx, Paul



[PATCH] uio: convert to vm_fault_t

2018-09-01 Thread Souptick Joarder
Commit 9b85e95a3080 ("uio: Change return
type to vm_fault_t") in 4.19-rc1 missed this conversion.
Convert 'ret' to the vm_fault_t type.

Signed-off-by: Souptick Joarder 
---
 drivers/uio/uio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index 70a7981..8fae1a4 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -668,7 +668,7 @@ static vm_fault_t uio_vma_fault(struct vm_fault *vmf)
struct page *page;
unsigned long offset;
void *addr;
-   int ret = 0;
+   vm_fault_t ret = 0;
int mi;
 
mutex_lock(>info_lock);
-- 
1.9.1


