date:20180118

Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing

2018-01-18 Thread Keith Busch

On Thu, Jan 18, 2018 at 06:10:00PM +0800, Jianchao Wang wrote:
> Hello
> 
> Please consider the following scenario.
> nvme_reset_ctrl
>   -> set state to RESETTING
>   -> queue reset_work
>                         (scheduling)
>                         nvme_reset_work
>                           -> nvme_dev_disable
>                             -> quiesce queues
>                             -> nvme_cancel_request
>                                on outstanding requests
> ------------------------_boundary_
>                           -> nvme initializing (issue request on adminq)
> 
> Before the _boundary_, we not only quiesce the queues but also cancel
> all the outstanding requests.
> 
> A request could expire when the ctrl state is RESETTING.
>  - If the timeout occurs before the _boundary_, the expired requests
>    are from the previous work.
>  - Otherwise, the expired requests are from the controller initializing
>    procedure, such as sending cq/sq create commands to adminq to set up
>    io queues.
> In the current implementation, nvme_timeout cannot identify the _boundary_,
> so it only handles the second case above.

Bear with me a moment, as I'm only just now getting a real chance to look
at this, and I'm not quite sure I follow what problem this is solving.

The nvme_dev_disable routine makes forward progress without depending on
timeout handling to complete expired commands. Once controller disabling
completes, there can't possibly be any started requests that can expire.
So we don't need nvme_timeout to do anything for requests above the
boundary.

Re: [PATCH 6/6] s390: scrub registers on kernel entry and KVM exit

2018-01-18 Thread Christian Borntraeger

On 01/19/2018 07:29 AM, QingFeng Hao wrote:
> 
> 
> On 2018/1/17 17:48, Martin Schwidefsky wrote:
>> Clear all user space registers on entry to the kernel and all KVM guest
>> registers on KVM guest exit if the register does not contain either a
>> parameter or a result value.
> I am not sure I understand this, but will it be safer?

It is similar to commit 0cb5b30698fd ("kvm: vmx: Scrub hardware GPRs at
VM-exit").
The idea is to minimize potential payload channels.

> And can we abstract the operations to be a macro like CLEAR_REG_7?

No, please.
	xgr %r7,%r7
is absolutely clear about what it does; a macro often is not.
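
Archive note: the reason `xgr %r7,%r7` scrubs the register is that XORing a value with itself yields zero regardless of what the register held. A minimal user-space C sketch of that identity follows; the helper name `scrub` is hypothetical and is not part of the patch, which does this directly in s390 assembly.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: models the effect of
 * "xgr %r7,%r7" on a 64-bit general purpose register.
 */
static uint64_t scrub(uint64_t reg)
{
	return reg ^ reg;	/* x XOR x is 0 for every x */
}
```

Whatever stale user-space or guest value goes in, zero comes out, so nothing is left in the register for later speculative use.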

[PATCH][V2] mtd: nand: marvell: fix spelling mistake: "suceed"-> "succeed"

2018-01-18 Thread Colin King

From: Colin Ian King 

Trivial fix to spelling mistakes in dev_err error message text.

Signed-off-by: Colin Ian King 
---
 drivers/mtd/nand/marvell_nand.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mtd/nand/marvell_nand.c b/drivers/mtd/nand/marvell_nand.c
index b8fec6093b75..4bd53b360277 100644
--- a/drivers/mtd/nand/marvell_nand.c
+++ b/drivers/mtd/nand/marvell_nand.c
@@ -517,7 +517,7 @@ static int marvell_nfc_prepare_cmd(struct nand_chip *chip)
 	/* Poll ND_RUN and clear NDSR before issuing any command */
 	ret = marvell_nfc_wait_ndrun(chip);
 	if (ret) {
-		dev_err(nfc->dev, "Last operation did not suceed\n");
+		dev_err(nfc->dev, "Last operation did not succeed\n");
 		return ret;
 	}
 
-- 
2.15.1

Re: [PATCH 4.4 045/115] sched/deadline: Throttle a constrained deadline task activated after the deadline

2018-01-18 Thread Greg Kroah-Hartman

On Fri, Jan 19, 2018 at 01:00:45AM +, Ben Hutchings wrote:
> On Mon, 2017-12-18 at 16:48 +0100, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch.  If anyone has any objections, please let me
> > know.
> > 
> > --
> > 
> > From: Daniel Bristot de Oliveira 
> > 
> > 
> > [ Upstream commit df8eac8cafce7d086be3bd5cf5a838fa37594dfb ]
> [...]
> 
> I think this needs another fix on top:
> 
> commit ae83b56a56f8d9643dedbee86b457fa1c5d42f59
> Author: Xunlei Pang 
> Date:   Wed May 10 21:03:37 2017 +0800
> 
> sched/deadline: Zero out positive runtime after throttling constrained 
> tasks

Now queued up, thanks.

> There's another fix related to this, but it doesn't appear to fix a
> regression and I don't know how critical it is:
> 
> commit 3effcb4247e74a51f5d8b775a1ee4abf87cc089a
> Author: Daniel Bristot de Oliveira 
> Date:   Mon May 29 16:24:03 2017 +0200
> 
> sched/deadline: Use the revised wakeup rule for suspending constrained dl 
> tasks

I'll hold off on this one until someone actually asks for it, as it's a
big change.

thanks again for the review,

greg k-h

Re: [PATCH 4.4 040/115] scsi: hpsa: update check for logical volume status

2018-01-18 Thread Greg Kroah-Hartman

On Fri, Jan 19, 2018 at 12:29:12AM +, Ben Hutchings wrote:
> On Mon, 2017-12-18 at 16:48 +0100, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch.  If anyone has any objections, please let me know.
> > 
> > --
> > 
> > From: Don Brace 
> > 
> > 
> > [ Upstream commit 85b29008d8af6d94a0723aaa8d93cfb6e041158b ]
> > 
> >  - Add in a new case for volume offline. Resolves internal testing bug
> >    for multilun array management.
> >  - Return correct status for failed TURs.
> [...]
> 
> This apparently caused a regression that is fixed by:
> 
> commit eb94588dabec82e012281608949a860f64752914
> Author: Tomas Henzl 
> Date:   Mon Mar 20 16:42:48 2017 +0100
> 
> scsi: hpsa: fix volume offline state

Many thanks, also now queued up for 4.9 which needs this too.

greg k-h

Re: [PATCH] general protection fault in sock_has_perm

2018-01-18 Thread Greg KH

On Thu, Jan 18, 2018 at 01:58:45PM -0800, Mark Salyzyn wrote:
> general protection fault:  [#1] PREEMPT SMP KASAN
> CPU: 1 PID: 14233 Comm: syz-executor2 Not tainted 4.4.112-g5f6325b #28
> task: 8801d1095f00 task.stack: 8800b595
> RIP: 0010:[]  [] 
> sock_has_perm+0x1fe/0x3e0 security/selinux/hooks.c:4069
> RSP: 0018:8800b5957ce0  EFLAGS: 00010202
> RAX: dc00 RBX: 110016b2af9f RCX: 81b69b51
> RDX: 0002 RSI:  RDI: 0010
> RBP: 8800b5957de0 R08: 0001 R09: 0001
> R10:  R11: 110016b2af68 R12: 8800b5957db8
> R13:  R14: 8800b7259f40 R15: 00d7
> FS:  7f72f5ae2700() GS:8801db30() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 00a2fa38 CR3: 0001d798 CR4: 00160670
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Stack:
>  81b69a1f 8800b5957d58 8000b5957d30 41b58ab3
>  83fc82f2 81b69980 0246 8801d1096770
>  8801d3165668 8157844b 8801d1095f00
>  8801
> Call Trace:
> [] selinux_socket_setsockopt+0x4d/0x80 
> security/selinux/hooks.c:4338
> [] security_socket_setsockopt+0x7d/0xb0 
> security/security.c:1257
> [] SYSC_setsockopt net/socket.c:1757 [inline]
> [] SyS_setsockopt+0xe8/0x250 net/socket.c:1746
> [] entry_SYSCALL_64_fastpath+0x16/0x92
> Code: c2 42 9b b6 81 be 01 00 00 00 48 c7 c7 a0 cb 2b 84 e8
> f7 2f 6d ff 49 8d 7d 10 48 b8 00 00 00 00 00 fc ff df 48 89
> fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 83 01 00
> 00 41 8b 75 10 31
> RIP  [] sock_has_perm+0x1fe/0x3e0 
> security/selinux/hooks.c:4069
> RSP 
> ---[ end trace 7b5aaf788fef6174 ]---
> 
> In the absence of commit a4298e4522d6 ("net: add SOCK_RCU_FREE socket
> flag") and all the associated infrastructure changes to take advantage
> of a RCU grace period before freeing, there is a heightened
> possibility that a security check is performed while an ill-timed
> setsockopt call races in from user space.  It then is prudent to null
> check sk_security, and if the case, reject the permissions.
> 
> This adjustment is orthogonal to infrastructure improvements that may
> nullify the needed check, but should be added as good code hygiene.
> 
> Signed-off-by: Mark Salyzyn 
> Cc: Paul Moore 
> Cc: Stephen Smalley 
> Cc: Eric Paris 
> Cc: James Morris 
> Cc: "Serge E. Hallyn" 
> Cc: seli...@tycho.nsa.gov
> Cc: linux-security-mod...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: sta...@vger.kernel.org
> ---
> This patch should be applied to all stable trees (author wants
> minimum of 3.18, 4.4, 4.9 and 4.14)

Note, if you want this type of thing to show up in the patch itself, so
I will see it when it hits Linus's tree, you can just change the stable
line to be:
cc: stable  # 3.18+

thanks,

greg k-h
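
Archive note: the check described in the quoted changelog (null check sk_security and reject the permission if it is missing) can be sketched in user space as below. This is only an illustration under stated assumptions: the struct definitions are cut-down stand-ins for the kernel types, `check_sid` is a hypothetical placeholder for the real SELinux permission lookup, and the `-EACCES` return value is an assumption, since the quoted mail does not show which error the patch returns.

```c
#include <errno.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures (assumption). */
struct sk_security_struct {
	unsigned int sid;
};

struct sock {
	void *sk_security;
};

/* Hypothetical placeholder for the real permission lookup. */
static int check_sid(unsigned int sid, unsigned int perms)
{
	(void)sid;
	(void)perms;
	return 0;
}

static int sock_has_perm_sketch(const struct sock *sk, unsigned int perms)
{
	const struct sk_security_struct *sksec = sk->sk_security;

	/* The socket may already have been torn down by a racing caller. */
	if (!sksec)
		return -EACCES;	/* assumed error code: reject the permission */

	return check_sid(sksec->sid, perms);
}
```

The point of the sketch is the ordering: the pointer is tested before any field of the security blob is read, which is the dereference that faulted in the oops above.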

Re: Re: Re: [PATCH v6] mfd: Add support for RTS5250S power saving

2018-01-18 Thread 冯锐

> On Wed, Dec 27, 2017 at 05:37:50PM -0600, Bjorn Helgaas wrote:
> > On Tue, Dec 19, 2017 at 08:15:24AM +, 冯锐 wrote:
> > > > On Fri, Dec 15, 2017 at 09:42:45AM +, 冯锐 wrote:
> > > > > > [+cc Hans, Dave, linux-pci]
> > > > > >
> > > > > > On Thu, Sep 07, 2017 at 04:26:39PM +0800,
> > > > > > rui_f...@realsil.com.cn
> > > > wrote:
> > > > > > > From: Rui Feng 
> > > > > >
> > > > > > I wish this had been posted to linux-pci before being merged.
> > > > > >
> > > > > > I'm concerned because some of this appears to overlap and
> > > > > > conflict with PCI core management of ASPM.
> > > > > >
> > > > > > I assume these devices advertise ASPM support in their Link
> > > > > > Capabilities registers, right?  If so, why isn't the existing
> > > > > > PCI core ASPM support sufficient?
> > > > > >
> > > > > When L1SS is configured, the device (hardware) can't enter L1SS
> > > > > status automatically; it needs the driver (software) to do some
> > > > > work to achieve the function.
> > > >
> > > > So this is a hardware defect in the device?  As far as I know,
> > > > ASPM and L1SS are specified such that they should work without
> > > > special driver support.
> > > >
> > > Yes, you can say that.
> > >
> > > > > > > Enable power saving for RTS5250S as following steps:
> > > > > > > 1.Set 0xFE58 to enable clock power management.
> > > > > >
> > > > > > Is this clock power management something specific to RTS5250S,
> > > > > > or is it standard PCIe architected stuff?
> > > > > >
> > > > > 0xFE58 is a register specific to RTS5250S, not standard PCIe
> > > > > architected stuff.
> > > >
> > > > OK.  I asked because devices often mirror architected PCIe config
> > > > things in device-specific MMIO space, and if I squint just right,
> > > > I can sort of match up the register bits you used with things in the
> > > > PCIe spec.
> > > >
> > > > > > > 2.Check cfg space whether support L1SS or not.
> > > > > >
> > > > > > This sounds like standard PCIe ASPM L1 Substates, right?
> > > > > >
> > > > > Yes.
> > > > >
> > > > > > > 3.If support L1SS, set 0xFF03 to free clkreq.
> > > > > > > 4.When entering idle status, enable aspm
> > > > > > >   and set parameters for L1SS and LTR.
> > > > > > > 5.When entering run status, disable aspm
> > > > > > >   and set parameters for L1SS and LTR.
> > > > > >
> > > > > > In general, drivers should not configure ASPM, L1SS, and LTR
> > > > > > themselves; the PCI core should do that.
> > > > > >
> > > > > > If a driver needs to tweak ASPM at run-time, it should use
> > > > > > interfaces exported by the PCI core to do so.
> > > > > >
> > > > > Which interface can I use to set ASPM? I use "pci_write_config_byte"
> > > > > now.
> > > >
> > > > What do you need to do?  include/linux/pci-aspm.h exports
> > > > pci_disable_link_state(), which is mainly used to avoid ASPM
> > > > states that have hardware errata.
> > > >
> > > I want to enable ASPM(L0 -> L1) and disable ASPM(L1 -> L0), which
> > > interface can I use?
> >
> > You can use pci_disable_link_state() to disable usage of L1.
> >
> > Currently there is no corresponding pci_enable_link_state().  What if
> > we added something like the following (untested)?  Would that work for
> > you?
> 
> Hi Rui,
> 
> Any thoughts on the patch below?

I'm busy with other work; the patch seems OK, and I will test it later.
> 
> > commit 209930d809fa602b8aafdd171b26719cee6c6649
> > Author: Bjorn Helgaas 
> > Date:   Wed Dec 27 16:56:26 2017 -0600
> >
> > PCI/ASPM: Add pci_enable_link_state()
> >
> > Some drivers want control over the ASPM states their device is allowed
> > to use.  We already have a pci_disable_link_state(), and drivers can use
> > that to prevent the device from entering L0s or L1.
> >
> > Add a corresponding pci_enable_link_state() so a driver can enable use
> > of L0s or L1 again.
> >
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index 3b9b4d50cd98..ca217195f800 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -1028,6 +1028,67 @@ void pcie_aspm_powersave_config_link(struct pci_dev *pdev)
> >  	up_read(&pci_bus_sem);
> >  }
> >
> > +/**
> > + * pci_enable_link_state - Enable device's link state, so the link may
> > + * enter specific states.  Note that if the BIOS didn't grant ASPM
> > + * control to the OS, this does nothing because we can't touch the LNKCTL
> > + * register.
> > + *
> > + * @pdev: PCI device
> > + * @state: ASPM link state to enable
> > + */
> > +void pci_enable_link_state(struct pci_dev *pdev, int state)
> > +{
> > +	struct pci_dev *parent = pdev->bus->self;
> > +	struct pcie_link_state *link;
> > +	u32 lnkcap;
> > +
> > +	if (!pci_is_pcie(pdev))
> > +		return;
> > +
> > +	if (pdev->has_secondary_link)
> > +		parent = pdev;
> > +	if (!parent || !parent->link_state)
> > +		return;
> > +
> > +	/*
> > +	 * A driver requested that ASPM be enabled on this device, but
> > +	 * if we don't have

Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

2018-01-18 Thread Ming Lei

On Fri, Jan 19, 2018 at 05:09:46AM +, Bart Van Assche wrote:
> On Fri, 2018-01-19 at 10:32 +0800, Ming Lei wrote:
> > Now most of the time both NVMe and SCSI won't return BLK_STS_RESOURCE, and
> > it should be only DM that returns STS_RESOURCE so often.
> 
> That's wrong at least for SCSI. See also 
> https://marc.info/?l=linux-block&m=151578329417076.
> 

> For other scenarios, e.g. if a SCSI initiator submits a
> SCSI request over a fabric and the SCSI target replies with "BUSY" then the

Could you explain a bit about when a SCSI target replies with BUSY very often?

Inside the initiator, we already limit the max per-LUN requests and per-host
requests before calling .queue_rq().

> SCSI core will end the I/O request with status BLK_STS_RESOURCE after the
> maximum number of retries has been reached (see also scsi_io_completion()).
> In that last case, if a SCSI target sends a "BUSY" reply over the wire back
> to the initiator, there is no other approach for the SCSI initiator to
> figure out whether it can queue another request than to resubmit the
> request. The worst possible strategy is to resubmit a request immediately
> because that will cause a significant fraction of the fabric bandwidth to
> be used just for replying "BUSY" to requests that can't be processed
> immediately.


-- 
Ming
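
Archive note: Bart's quoted argument is that resubmitting immediately after a "BUSY" reply burns fabric bandwidth on further "BUSY" replies. One common alternative is to wait a little longer before each retry. The sketch below computes such a delay as a capped exponential backoff. It is an illustration only, not what the SCSI core or blk-mq does; the function name `retry_delay_ms` and both constants are assumptions chosen for the example.

```c
#include <stdint.h>

#define BASE_DELAY_MS	3u	/* assumed delay before the first retry */
#define MAX_DELAY_MS	100u	/* assumed upper bound on any delay */

/*
 * Delay in milliseconds before retry number `attempt` (0-based):
 * BASE_DELAY_MS doubled once per earlier attempt, capped at MAX_DELAY_MS.
 */
static uint32_t retry_delay_ms(unsigned int attempt)
{
	uint32_t delay = BASE_DELAY_MS;

	/* Stop doubling once the cap is reached, so nothing can overflow. */
	while (attempt-- > 0 && delay < MAX_DELAY_MS)
		delay *= 2;

	return delay > MAX_DELAY_MS ? MAX_DELAY_MS : delay;
}
```

With these constants the first retries wait 3, 6 and 12 ms, and every retry from the seventh onward waits the capped 100 ms, so a persistently busy target sees a bounded retry rate instead of a tight loop.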

Re: [patch v17 2/4] drivers: jtag: Add Aspeed SoC 24xx and 25xx families JTAG master driver

2018-01-18 Thread kbuild test robot

Hi Oleksandr,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.15-rc8]
[cannot apply to next-20180118]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Oleksandr-Shamray/drivers-jtag-Add-JTAG-core-driver/20180119-123719
config: ia64-allmodconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=ia64

Note: the linux-review/Oleksandr-Shamray/drivers-jtag-Add-JTAG-core-driver/20180119-123719 HEAD b9c3d4721186f8264960ad87c6c499cdd1b6c2e8 builds fine.
      It only hurts bisectability.

All error/warnings (new ones prefixed by >>):

   drivers/jtag/jtag-aspeed.c: In function 'aspeed_jtag_init':
>> drivers/jtag/jtag-aspeed.c:657:21: error: implicit declaration of function 'devm_reset_control_get_shared'; did you mean 'devm_pinctrl_get_select'? [-Werror=implicit-function-declaration]
     aspeed_jtag->rst = devm_reset_control_get_shared(aspeed_jtag->dev,
                        ^
                        devm_pinctrl_get_select
>> drivers/jtag/jtag-aspeed.c:657:19: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
     aspeed_jtag->rst = devm_reset_control_get_shared(aspeed_jtag->dev,
                      ^
>> drivers/jtag/jtag-aspeed.c:664:2: error: implicit declaration of function 'reset_control_deassert' [-Werror=implicit-function-declaration]
     reset_control_deassert(aspeed_jtag->rst);
     ^~
   drivers/jtag/jtag-aspeed.c: In function 'aspeed_jtag_deinit':
>> drivers/jtag/jtag-aspeed.c:707:2: error: implicit declaration of function 'reset_control_assert' [-Werror=implicit-function-declaration]
     reset_control_assert(aspeed_jtag->rst);
     ^~~~
   cc1: some warnings being treated as errors

vim +657 drivers/jtag/jtag-aspeed.c

   631  
   632  int aspeed_jtag_init(struct platform_device *pdev,
   633   struct aspeed_jtag *aspeed_jtag)
   634  {
   635  struct resource *res;
   636  int err;
   637  
   638  res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
   639  aspeed_jtag->reg_base = devm_ioremap_resource(aspeed_jtag->dev, 
res);
   640  if (IS_ERR(aspeed_jtag->reg_base))
   641  return -ENOMEM;
   642  
   643  aspeed_jtag->pclk = devm_clk_get(aspeed_jtag->dev, NULL);
   644  if (IS_ERR(aspeed_jtag->pclk)) {
   645  dev_err(aspeed_jtag->dev, "devm_clk_get failed\n");
   646  return PTR_ERR(aspeed_jtag->pclk);
   647  }
   648  
   649  aspeed_jtag->irq = platform_get_irq(pdev, 0);
   650  if (aspeed_jtag->irq < 0) {
   651  dev_err(aspeed_jtag->dev, "no irq specified\n");
   652  return -ENOENT;
   653  }
   654  
   655  clk_prepare_enable(aspeed_jtag->pclk);
   656  
 > 657  aspeed_jtag->rst = 
 > devm_reset_control_get_shared(aspeed_jtag->dev,
   658   NULL);
   659  if (IS_ERR(aspeed_jtag->rst)) {
   660  dev_err(aspeed_jtag->dev,
   661  "missing or invalid reset controller device 
tree entry");
   662  return PTR_ERR(aspeed_jtag->rst);
   663  }
 > 664  reset_control_deassert(aspeed_jtag->rst);
   665  
   666  /* Enable clock */
   667  aspeed_jtag_write(aspeed_jtag, ASPEED_JTAG_CTL_ENG_EN |
   668ASPEED_JTAG_CTL_ENG_OUT_EN, ASPEED_JTAG_CTRL);
   669  aspeed_jtag_write(aspeed_jtag, ASPEED_JTAG_SW_MODE_EN |
   670ASPEED_JTAG_SW_MODE_TDIO, ASPEED_JTAG_SW);
   671  
   672  err = devm_request_irq(aspeed_jtag->dev, aspeed_jtag->irq,
   673 aspeed_jtag_interrupt, 0,
   674 "aspeed-jtag", aspeed_jtag);
   675  if (err) {
   676  dev_err(aspeed_jtag->dev, "unable to get IRQ");
   677  goto clk_unprep;
   678  }
   679  dev_dbg(&pdev->dev, "IRQ %d.\n", aspeed_jtag->irq);
   680  
   681  aspeed_jtag_write(aspeed_jtag, ASPEED_JTAG_ISR_INST_PAUSE |
   682ASPEED_JTAG_ISR_INST_COMPLETE |
   683ASPEED_JTAG_ISR_DATA_PAUSE |
   684ASPEED_JTAG

Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

2018-01-18 Thread Ming Lei

On Thu, Jan 18, 2018 at 09:02:45PM -0700, Jens Axboe wrote:
> On 1/18/18 7:32 PM, Ming Lei wrote:
> > On Thu, Jan 18, 2018 at 01:11:01PM -0700, Jens Axboe wrote:
> >> On 1/18/18 11:47 AM, Bart Van Assche wrote:
> >>>> This is all very tiresome.
> >>>
> >>> Yes, this is tiresome. It is very annoying to me that others keep
> >>> introducing so many regressions in such important parts of the kernel.
> >>> It is also annoying to me that I get blamed if I report a regression
> >>> instead of seeing that the regression gets fixed.
> >>
> >> I agree, it sucks that any change there introduces the regression. I'm
> >> fine with doing the delay insert again until a new patch is proven to be
> >> better.
> > 
> > That way is still buggy as I explained, since rerun queue before adding
> > request to hctx->dispatch_list isn't correct. Who can make sure the request
> > is visible when __blk_mq_run_hw_queue() is called?
> 
> That race basically doesn't exist for a 10ms gap.
> 
> > Not to mention this way will cause a performance regression again.
> 
> How so? It's _exactly_ the same as what you are proposing, except mine
> will potentially run the queue when it need not do so. But given that
> these are random 10ms queue kicks because we are screwed, it should not
> matter. The key point is that it only should be if we have NO better
> options. If it's a frequently occurring event that we have to return
> BLK_STS_RESOURCE, then we need to get a way to register an event for
> when that condition clears. That event will then kick the necessary
> queue(s).

Please see queue_delayed_work_on(): hctx->run_work is shared by all
queue-run scheduling, so once blk_mq_delay_run_hw_queue(100ms) returns, no
new scheduling can make progress during those 100ms.
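
To illustrate the point about the shared work item, here is a small
user-space model (not kernel code) of the one property that matters here:
queuing a delayed work item that is already pending does nothing and
returns false, so a later request for an immediate run does not shorten an
earlier long delay. The names model_dwork, model_queue_delayed and
model_tick are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of one delayed work item (like hctx->run_work). */
struct model_dwork {
	bool pending;          /* models the "work is already queued" state */
	unsigned long expires; /* time (ms) at which the work will run */
};

/*
 * Models the queue_delayed_work_on() behaviour discussed above: if the
 * work is already pending, do nothing and return false; otherwise arm
 * it and return true.
 */
static bool model_queue_delayed(struct model_dwork *w, unsigned long now,
				unsigned long delay_ms)
{
	assert(w);
	if (w->pending)
		return false;
	w->pending = true;
	w->expires = now + delay_ms;
	return true;
}

/* Models the timer firing: the work runs once its expiry time is reached. */
static bool model_tick(struct model_dwork *w, unsigned long now)
{
	assert(w);
	if (w->pending && now >= w->expires) {
		w->pending = false;
		return true; /* the queue was run */
	}
	return false;
}
```

In this model, a request with delay 0 issued at t=1ms after a 100ms delay
was armed at t=0 is dropped, and the queue only runs at t=100ms.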

> 
> >> From the original topic of this email, we have conditions that can cause
> >> the driver to not be able to submit an IO. A set of those conditions can
> >> only happen if IO is in flight, and those cases we have covered just
> >> fine. Another set can potentially trigger without IO being in flight.
> >> These are cases where a non-device resource is unavailable at the time
> >> of submission. This might be iommu running out of space, for instance,
> >> or it might be a memory allocation of some sort. For these cases, we
> >> don't get any notification when the shortage clears. All we can do is
> >> ensure that we restart operations at some point in the future. We're SOL
> >> at that point, but we have to ensure that we make forward progress.
> > 
> > Right, it is a generic issue, not DM-specific one, almost all drivers
> > call kmalloc(GFP_ATOMIC) in IO path.
> 
> GFP_ATOMIC basically never fails, unless we are out of memory. The

I guess GFP_KERNEL may never fail, but a GFP_ATOMIC failure is possible:
[1] mentions that there is code like the following in the mm allocation
path, and OOM can happen too.

  if (some randomly generated condition) && (request is atomic)
  return NULL;

[1] https://lwn.net/Articles/276731/
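
A minimal user-space sketch of that idea, assuming a deterministic
"fail every Nth atomic request" knob in place of the random condition:
only atomic requests are ever failed, and the caller turns a NULL into a
"resource busy, run me again later" status. All names here (model_alloc,
model_queue_rq, MODEL_STS_RESOURCE, fail_every_nth) are hypothetical and
are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum model_status { MODEL_STS_OK, MODEL_STS_RESOURCE };

/* Hypothetical fault-injection knob: fail every Nth atomic allocation (0 = never). */
static unsigned int fail_every_nth;
static unsigned int atomic_calls;

/* Allocation wrapper: injected failures apply to atomic requests only. */
static void *model_alloc(size_t size, bool atomic)
{
	assert(size > 0);
	if (atomic && fail_every_nth && (++atomic_calls % fail_every_nth) == 0)
		return NULL;
	return malloc(size);
}

/* Models a driver's submission path that allocates atomically. */
static enum model_status model_queue_rq(size_t size)
{
	void *ctx = model_alloc(size, true);

	if (!ctx)
		return MODEL_STS_RESOURCE; /* someone must re-run the queue later */
	free(ctx);
	return MODEL_STS_OK;
}
```

The sketch shows why this case needs a restart mechanism: nothing in the
failing path itself produces an event when the shortage clears.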

> exception is higher order allocations. If a driver has a higher order
> atomic allocation in its IO path, the device driver writer needs to be
> taken out behind the barn and shot. Simple as that. It will NEVER work
> well in a production environment. Witness the disaster that so many NIC
> driver writers have learned.
> 
> This is NOT the case we care about here. It's resources that are more
> readily depleted because other devices are using them. If it's a high
> frequency or generally occurring event, then we simply must have a
> callback to restart the queue from that. The condition then becomes
> identical to device private starvation, the only difference being from
> where we restart the queue.
> 
> > IMO, there is enough time for figuring out a generic solution before
> > 4.16 release.
> 
> I would hope so, but the proposed solutions have not filled me with
> a lot of confidence in the end result so far.
> 
> >> That last set of conditions better not be a common occurrence, since
> >> performance is down the toilet at that point. I don't want to introduce
> >> hot path code to rectify it. Have the driver return if that happens in a
> >> way that is DIFFERENT from needing a normal restart. The driver knows if
> >> this is a resource that will become available when IO completes on this
> >> device or not. If we get that return, we have a generic run-again delay.
> > 
> > Now most of times both NVMe and SCSI won't return BLK_STS_RESOURCE, and
> > it should be DM-only which returns STS_RESOURCE so often.
> 
> Where does the dm STS_RESOURCE error usually come from - what exact
> resource are we running out of?

It is from blk_get_request(underlying queue), see multipath_clone_and_map().

Thanks,
Ming

Re: [RESEND PATCH 3/3] x86/apic: Clean up the names of legacy irq mode setting related functions

2018-01-18 Thread Baoquan He

On 01/19/18 at 02:42pm, Dou Liyang wrote:
> Hi Baoquan,
> 
> At 01/05/2018 12:39 PM, Baoquan He wrote:
> [...]
> >   /*
> > - * Not an __init, needed by kexec/kdump code.
> > - * For safety IO-APIC and Local APIC need be cleared before this.
> > + * In legacy irq mode, full DOS compatibility with the uniprocessor PC/AT is
> > + * provided by using the APICs in conjunction with standard 8259A-equivalent
> > + * programmable interrupt controllers (PICs). It's necessary to deliver legacy
> > + * interrupts even when APIC mode is not enabled. This is required by kexec/
> > + * kdump before enter into the 2nd kernel.
> >    */
> >   void switch_to_legacy_irq_mode(void)
> >   {
> > if (!nr_legacy_irqs())
> > return;
> > -   x86_io_apic_ops.disable();
> > +   ioapic_set_virtual_wire_mode();
> > +
> > +   if (boot_cpu_has(X86_FEATURE_APIC) || apic_from_smp_config())
> > +   lapic_set_legacy_irq_mode(ioapic_i8259.pin != -1);
> 
> Seems these two function, ioapic/lapic_set_legacy_irq_mode should be
> exclusive.

Thanks for looking into this, dou!

They might not be exclusive. See the Virtual Wire Mode subsection
(3.6.2.2) of mp_spec: there are two kinds of virtual wire mode. In one,
the 8259A-equivalent PICs are connected to lint0 of the boot CPU's LAPIC;
in the other, the 8259A-equivalent PICs go through the IO-APIC, which is
then connected to lint0 of the LAPIC. In either case, the LAPIC needs to
be set as through-lapic.

The above is what I got from mp_spec. But the code in
native_disable_io_apic() and disconnect_bsp_APIC() seems to say that if
the IO-APIC is connected to the 8259A-equivalent PICs, we need to mask
lvt0 of the LAPIC. This conflicts with mp_spec 3.6.2.2.

Thanks
Baoquan
> 
> But We do that because both the through-lapic and through-ioapic virtual
> wire mode need setup the APIC_SPIV_APIC_ENABLED which is only located in
> the lapic_set_legacy_irq_mode(). So we need call them both.
> 
> IMO, this cleanup may not make it clear. we can separate these two mode
> totally or just keep it like before.
> 
> Thanks,
>   dou.
> >   }
> >   #ifdef CONFIG_X86_32
> > diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
> > index 1151ccd72ce9..c30f0f273dbd 100644
> > --- a/arch/x86/kernel/x86_init.c
> > +++ b/arch/x86/kernel/x86_init.c
> > @@ -148,5 +148,5 @@ void arch_restore_msi_irqs(struct pci_dev *dev)
> >   struct x86_io_apic_ops x86_io_apic_ops __ro_after_init = {
> > .read   = native_io_apic_read,
> > -   .disable= native_disable_io_apic,
> > +   .disable= switch_to_legacy_irq_mode,
> >   };
> > diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
> > index 49721b4e1975..751472ddf536 100644
> > --- a/drivers/iommu/irq_remapping.c
> > +++ b/drivers/iommu/irq_remapping.c
> > @@ -37,7 +37,7 @@ static void irq_remapping_disable_io_apic(void)
> >  * now.
> >  */
> > if (boot_cpu_has(X86_FEATURE_APIC) || apic_from_smp_config())
> > -   disconnect_bsp_APIC(0);
> > +   lapic_set_legacy_irq_mode(0);
> >   }
> >   static void __init irq_remapping_modify_x86_ops(void)
> > 
> 
>

Re: [PATCH v4] perf report: Fix regression when decoding intelPT traces

2018-01-18 Thread Adrian Hunter

On 18/01/18 18:29, Arnaldo Carvalho de Melo wrote:
> Em Wed, Jan 10, 2018 at 01:31:52PM -0700, Mathieu Poirier escreveu:
>> Commit (93d10af26bb7 perf tools: Optimize sample parsing for ordered
>> events) breaks intelPT trace decoding by invariably returning an error if
>> the event type isn't a PERF_SAMPLE_TIME.
> 
> Adrian, have you had the chance of looking at this?
> 
> I'm tentatively applying with Jiri's ack.

Yes, it is fine.  FWIW

Acked-by: Adrian Hunter 

> 
> - Arnaldo
>  
>> With this patch the timestamp is initialised and processing is allowed to
>> continue if the error returned by function
>> perf_evlist__parse_sample_timestamp() is not a fault.
>>
>> Signed-off-by: Mathieu Poirier 
>> Acked-by: Jiri Olsa 
>> ---
>> Changes for v4:
>> - Rebased to latest perf/core branch
>> - Added Jiri's ACK
>> ---
>>  tools/perf/util/session.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
>> index 54e30f1bcbd7..07221884f725 100644
>> --- a/tools/perf/util/session.c
>> +++ b/tools/perf/util/session.c
>> @@ -1508,10 +1508,10 @@ static s64 perf_session__process_event(struct perf_session *session,
>>  		return perf_session__process_user_event(session, event, file_offset);
>>  
>>  	if (tool->ordered_events) {
>> -		u64 timestamp;
>> +		u64 timestamp = -1ULL;
>>  
>>  		ret = perf_evlist__parse_sample_timestamp(evlist, event, &timestamp);
>> -		if (ret)
>> +		if (ret && ret != -1)
>>  			return ret;
>>  
>>  		ret = perf_session__queue_event(session, event, timestamp, file_offset);
>> -- 
>> 2.7.4
>
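
The control flow of the quoted hunk can be sketched in user space as
follows. This is only a model under stated assumptions: the parser
stand-in returns 0 on success, -1 when the event carries no timestamp,
and -14 (assumed here to represent a fault such as -EFAULT) for a corrupt
event. The names model_parse_timestamp, model_process_event and the
model_event enum are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum model_event { EV_WITH_TIME, EV_NO_TIME, EV_CORRUPT };

/* Hypothetical stand-in for perf_evlist__parse_sample_timestamp(). */
static int model_parse_timestamp(enum model_event ev, uint64_t *ts)
{
	assert(ts);
	switch (ev) {
	case EV_WITH_TIME:
		*ts = 1234;
		return 0;
	case EV_NO_TIME:
		return -1;  /* not a fault: the event simply has no timestamp */
	default:
		return -14; /* assumed fault code for this sketch */
	}
}

/* Mirrors the patched logic: pre-initialise, bail out only on a real fault. */
static int model_process_event(enum model_event ev, uint64_t *queued_ts)
{
	uint64_t timestamp = UINT64_MAX; /* same bit pattern as -1ULL */
	int ret = model_parse_timestamp(ev, &timestamp);

	if (ret && ret != -1)
		return ret;

	*queued_ts = timestamp; /* stands in for queueing the event */
	return 0;
}
```

With the pre-patch check (`if (ret)`), the EV_NO_TIME case would have
returned -1 instead of queueing the event with the sentinel timestamp.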

[PATCH] Fix explanation of lower bits in the SPARSEMEM mem_map pointer

2018-01-18 Thread Petr Tesarik

The comment is confusing. On the one hand, it refers to 32-bit
alignment (struct page alignment on 32-bit platforms), but this
would only guarantee that the 2 lowest bits must be zero. On the
other hand, it claims that at least 3 bits are available, and 3 bits
are actually used.

This is not broken, because there is a stronger alignment guarantee,
just less obvious. Let's fix the comment to make it clear how many
bits are available and why.

Signed-off-by: Petr Tesarik 
---
 include/linux/mmzone.h | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 67f2e3c38939..7522a6987595 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1166,8 +1166,16 @@ extern unsigned long usemap_size(void);
 
 /*
  * We use the lower bits of the mem_map pointer to store
- * a little bit of information.  There should be at least
- * 3 bits here due to 32-bit alignment.
+ * a little bit of information.  The pointer is calculated
+ * as mem_map - section_nr_to_pfn(pnum).  The result is
+ * aligned to the minimum alignment of the two values:
+ *   1. All mem_map arrays are page-aligned.
+ *   2. section_nr_to_pfn() always clears PFN_SECTION_SHIFT
+ *  lowest bits.  PFN_SECTION_SHIFT is arch-specific
+ *  (equal SECTION_SIZE_BITS - PAGE_SHIFT), and the
+ *  worst combination is powerpc with 256k pages,
+ *  which results in PFN_SECTION_SHIFT equal 6.
+ * To sum it up, at least 6 bits are available.
  */
 #define SECTION_MARKED_PRESENT  (1UL<<0)
 #define SECTION_HAS_MEM_MAP     (1UL<<1)
-- 
2.13.6
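
The arithmetic behind the "at least 6 bits" figure can be checked with a
small user-space sketch. It assumes the worst case named in the new
comment: 256k pages give PAGE_SHIFT = 18, and SECTION_SIZE_BITS = 24 is
inferred from the stated result (6 + 18), not taken from the kernel
headers. The MODEL_* macros and model_* helpers are hypothetical, and the
sketch only looks at the PFN side of the calculation, not at the pointer
arithmetic on the mem_map array.

```c
#include <assert.h>

/* Assumed worst-case values (see the lead-in); not copied from kernel headers. */
#define MODEL_PAGE_SHIFT        18
#define MODEL_SECTION_SIZE_BITS 24
#define MODEL_PFN_SECTION_SHIFT (MODEL_SECTION_SIZE_BITS - MODEL_PAGE_SHIFT)

/* The two flag bits visible in the quoted hunk. */
#define MODEL_SECTION_MARKED_PRESENT (1UL << 0)
#define MODEL_SECTION_HAS_MEM_MAP    (1UL << 1)

/* Models section_nr_to_pfn(): the low PFN_SECTION_SHIFT bits are always zero. */
static unsigned long model_section_nr_to_pfn(unsigned long sec)
{
	return sec << MODEL_PFN_SECTION_SHIFT;
}

/* Counts trailing zero bits of a nonzero value. */
static unsigned int model_low_zero_bits(unsigned long v)
{
	unsigned int n = 0;

	assert(v != 0);
	while (!(v & 1UL)) {
		v >>= 1;
		n++;
	}
	return n;
}
```

For any nonzero section number the result has at least
MODEL_PFN_SECTION_SHIFT low zero bits, which is more than the flag bits
need.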

Re: [PATCH v5 0/2] kprobes: improve error handling when arming/disarming kprobes

2018-01-18 Thread Masami Hiramatsu

Hi Ingo,

Could you pick this up for the tip tree?

Thank you,

On Wed, 10 Jan 2018 00:51:22 +0100
Jessica Yu  wrote:

> Hi,
> 
> This patchset attempts to improve error handling when arming or disarming
> ftrace-based kprobes. The current behavior is to simply WARN when ftrace
> (un-)registration fails, without propagating the error code. This can lead
> to confusing situations where, for example, register_kprobe()/enable_kprobe()
> would return 0 indicating success even if arming via ftrace had failed. In
> this scenario we'd end up with a non-functioning kprobe even though kprobe
> registration (or enablement) returned success. In this patchset, we take
> errors from ftrace into account and propagate the error when we cannot arm
> or disarm a kprobe.
> 
> Below is an example that illustrates the problem using livepatch and
> systemtap (which uses kprobes underneath). Both livepatch and kprobes use
> ftrace ops with the IPMODIFY flag set, so registration at the same
> function entry is limited to only one ftrace user. 
> 
> Before
> ------
> # modprobe livepatch-sample   # patches cmdline_proc_show, ftrace ops has IPMODIFY set
> # stap -e 'probe kernel.function("cmdline_proc_show").call { printf ("cmdline_proc_show\n"); }'
> 
>    .. (nothing prints after reading /proc/cmdline) ..
> 
> The systemtap handler doesn't execute due to a kprobe arming failure caused
> by a ftrace IPMODIFY conflict with livepatch, and there isn't an obvious
> indication of error from systemtap (because register_kprobe() returned
> success) unless the user inspects dmesg.
> 
> After
> -----
> # modprobe livepatch-sample 
> # stap -e 'probe kernel.function("cmdline_proc_show").call { printf ("cmdline_proc_show\n"); }'
> WARNING: probe kernel.function("cmdline_proc_show@/home/jeyu/work/linux-next/fs/proc/cmdline.c:6").call (address 0xa82fe910) registration error (rc -16)
> 
> Although the systemtap handler doesn't execute (as it shouldn't), the
> ftrace error is propagated and now systemtap prints a visible error message
> stating that (kprobe) registration had failed (because register_kprobe()
> returned an error), along with the propagated error code.
> 
> This patchset was based on Petr Mladek's original patchset (patches 2 and 3)
> back in 2015, which improved kprobes error handling, found here:
> 
>https://lkml.org/lkml/2015/2/26/452
> 
> However, further work on this had been paused since then and the patches
> were not upstreamed.
> 
> This patchset has been lightly sanity-tested (on linux-next) with kprobes,
> kretprobes, and optimized kprobes. It passes the kprobes smoke test, but
> more testing is greatly appreciated.
> 
> Changes from v4:
>  - Switch from WARN() to pr_debug() in arm_kprobe_ftrace() so the stack
>dumps don't pollute dmesg, as IPMODIFY conflicts can occur in normal usage
>  - Added Masami's ack to the first patch
> 
> Changes from v3:
>  - Have (dis)arm_kprobe_ftrace() return -ENODEV instead of 0 in case of
>!CONFIG_KPROBES_ON_FTRACE
>  - Add total count of all probes tried in (dis)arm_all_kprobes()
> 
> Changes from v2:
>  - Add missing synchronize rcu in register_aggr_kprobe()
>  - s/kprobes/probes/ on error message in (dis)arm_all_kprobes()
> 
> Changes from v1:
> - Don't arm the kprobe before adding it to the kprobe table, otherwise
>   we'll temporarily see a stray breakpoint.
> - Remove kprobe from the kprobe_table and call synchronize_sched() if
>   arming during register_kprobe() fails.
> - add Masami's ack on the 2nd patch (unchanged from v1)
> 
> ---
> Jessica Yu (2):
>   kprobes: propagate error from arm_kprobe_ftrace()
>   kprobes: propagate error from disarm_kprobe_ftrace()
> 
>  kernel/kprobes.c | 178 
> +++
>  1 file changed, 128 insertions(+), 50 deletions(-)
> 
> -- 
> 2.13.6
> 


-- 
Masami Hiramatsu
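
The before/after difference described in the cover letter can be modelled
in user space roughly as below. This is a sketch, not the patch itself:
it assumes the reported rc -16 corresponds to -EBUSY (as on Linux), and
the names ipmodify_taken, model_arm_kprobe_ftrace, model_register_before
and model_register_after are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Models another IPMODIFY ftrace user (e.g. livepatch) owning the function entry. */
static bool ipmodify_taken;

/* Models arming via ftrace: fails when the entry is already claimed. */
static int model_arm_kprobe_ftrace(void)
{
	return ipmodify_taken ? -EBUSY : 0;
}

/* Old behaviour: the arming error is only warned about; the caller sees success. */
static int model_register_before(bool *armed)
{
	int ret = model_arm_kprobe_ftrace();

	assert(armed);
	*armed = (ret == 0);
	return 0;
}

/* New behaviour: the arming error is propagated to the caller. */
static int model_register_after(bool *armed)
{
	int ret = model_arm_kprobe_ftrace();

	assert(armed);
	*armed = (ret == 0);
	return ret;
}
```

In the "before" model the probe is not armed yet registration reports 0,
which matches the silent systemtap run; in the "after" model the caller
gets the error and can report it.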

Re: [PATCH] xhci:Fix NULL pointer in xhci debugfs

2018-01-18 Thread Mathias Nyman


On 19.01.2018 04:13, Zhengjun Xing wrote:

Commit dde634057da7 ("xhci: Fix use-after-free in xhci debugfs") causes a
null pointer dereference while fixing xhci-debugfs usage of ring pointers
that were freed during hibernate.

The fix passed addresses to ring pointers instead, but forgot to do this
change for the xhci_ring_trb_show function.

The address of the ring pointer passed to xhci-debugfs was of a temporary
ring pointer "new_ring" instead of the actual ring "ring" pointer. The
temporary new_ring pointer will be set to NULL later causing the NULL
pointer dereference.
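
A minimal userspace sketch of the mistake described above, under stated assumptions: the types and function names are hypothetical and much simpler than the xhci driver's real structures, and the reader returns -1 where the real code would dereference NULL.

```c
#include <stddef.h>

/* Hypothetical, simplified types; not the xhci driver's structures. */
struct ring { int num_trbs; };

struct endpoint {
    struct ring *ring;      /* long-lived pointer to the active ring */
    struct ring *new_ring;  /* temporary; cleared once the ring is installed */
};

/* What a debugfs file keeps as private data: the address of a ring pointer. */
struct debug_file { struct ring **ringp; };

/* Reader in the spirit of a show callback. It returns -1 instead of
 * dereferencing NULL so the sketch can be run safely. */
int debug_file_num_trbs(const struct debug_file *f)
{
    struct ring *r = *f->ringp;

    return r ? r->num_trbs : -1;
}

/* Move the temporary ring into place and clear the temporary pointer. */
void endpoint_install_ring(struct endpoint *ep)
{
    ep->ring = ep->new_ring;
    ep->new_ring = NULL;
}
```

A file that stored `&ep.new_ring` sees NULL after `endpoint_install_ring()`, while one that stored `&ep.ring` still reaches the ring.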

This issue was seen when reading xhci related files in debugfs:

cat /sys/kernel/debug/usb/xhci/*/devices/*/ep*/trbs

[  184.604861] BUG: unable to handle kernel NULL pointer dereference at (null)
[  184.613776] IP: xhci_ring_trb_show+0x3a/0x890
[  184.618733] PGD 264193067 P4D 264193067 PUD 263238067 PMD 0
[  184.625184] Oops:  [#1] SMP
[  184.726410] RIP: 0010:xhci_ring_trb_show+0x3a/0x890
[  184.731944] RSP: 0018:ba8243c0fd90 EFLAGS: 00010246
[  184.737880] RAX:  RBX:  RCX: 000295d6
[  184.746020] RDX: 000295d5 RSI: 0001 RDI: 971a6418d400
[  184.754121] RBP:  R08:  R09: 
[  184.76] R10: 971a64c98a80 R11: 971a62a00e40 R12: 971a62a85500
[  184.770325] R13: 0002 R14: 971a6418d400 R15: 971a6418d400
[  184.778448] FS:  7fe725a79700() GS:971a6ec0() knlGS:
[  184.787644] CS:  0010 DS:  ES:  CR0: 80050033
[  184.794168] CR2:  CR3: 00025f365005 CR4: 003606f0
[  184.802318] Call Trace:
[  184.805094]  ? seq_read+0x281/0x3b0
[  184.809068]  seq_read+0xeb/0x3b0
[  184.812735]  full_proxy_read+0x4d/0x70
[  184.817007]  __vfs_read+0x23/0x120
[  184.820870]  vfs_read+0x91/0x130
[  184.824538]  SyS_read+0x42/0x90
[  184.828106]  entry_SYSCALL_64_fastpath+0x1a/0x7d

Fixes: dde634057da7 ("xhci: Fix use-after-free in xhci debugfs")
Signed-off-by: Zhengjun Xing 
---


Thanks, adding to queue

-Mathias

linux-next: Tree for Jan 19

2018-01-18 Thread Stephen Rothwell

Hi all,

News: there will probably be very few, if any, releases next week as LCA
is on (unfortunate clash with the merge window).

Changes since 20180118:

The powerpc tree gained a build failure due to an interaction with Linus'
tree, so I applied a merge fix patch.  It gained another for which I
applied a supplied fix patch.

The f2fs tree gained a build failure due to an interaction with the
btrfs tree for which I reverted a commit.

The net-next tree gained a conflict against the net tree.

Non-merge commits (relative to Linus' tree): 9833
 9793 files changed, 406830 insertions(+), 263432 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 256 trees (counting Linus' and 44 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (dda3e15231b3 Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm)
Merging fixes/master (820bf5c419e4 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi)
Merging kbuild-current/fixes (36c1681678b5 genksyms: drop *.hash.c from .gitignore)
Merging arc-current/for-curr (8ff3afc159f2 ARC: Enable fatal signals on boot for dev platforms)
Merging arm-current/fixes (091f02483df7 ARM: net: bpf: clarify tail_call index)
Merging m68k-current/for-linus (5e387199c17c m68k/defconfig: Update defconfigs for v4.14-rc7)
Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups)
Merging powerpc-fixes/fixes (1b689a95ce74 powerpc/pseries: include linux/types.h in asm/hvcall.h)
Merging sparc/master (59585b4be9ae sparc64: repair calling incorrect hweight function from stubs)
Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2)
Merging net/master (b200bfd6112a fm10k: mark PM functions as __maybe_unused)
Merging bpf/master (7155f8f39157 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf)
Merging ipsec/master (ad9294dbc227 bpf: fix cls_bpf on filter replace)
Merging netfilter/master (889c604fd0b5 netfilter: x_tables: fix int overflow in xt_alloc_table_info())
Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook mask only if set)
Merging wireless-drivers/master (cc124d5cc8d8 brcmfmac: fix CLM load error for legacy chips when user helper is enabled)
Merging mac80211/master (59b179b48ce2 cfg80211: check dev_set_name() return value)
Merging rdma-fixes/for-rc (ae59c3f0b6cf RDMA/mlx5: Fix out-of-bound access while querying AH)
Merging sound-current/for-linus (b3defb791b26 ALSA: seq: Make ioctls race-free)
Merging pci-current/for-linus (d6c1efecd1e1 x86/PCI: Enable AMD 64-bit window on resume)
Merging driver-core.current/driver-core-linus (30a7acd57389 Linux 4.15-rc6)
Merging tty.current/tty-linus (30a7acd57389 Linux 4.15-rc6)
Merging usb.current/usb-linus (a8750ddca918 Linux 4.15-rc8)
Merging usb-gadget-fixes/fixes (b2cd1df66037 Linux 4.15-rc7)
Merging usb-serial-fixes/usb-linus (d14ac576d10f USB: serial: cp210x: add new device ID ELV ALC 8xxx)
Merging usb-chipidea-fixes/ci-for-usb-stable (964728f9f407 USB: chipidea: msm: fix ulpi-node lookup)
Merging phy/fixes (2b88212c4cc6 phy: rcar-gen3-usb2: select USB_COMMON)
Merging staging.current/staging-linus (a8750ddca918 Linux 4.15-rc8)
Merging char-misc.current/char-misc-linus (a8750ddca918 Linux 4.15-rc8)
Merging input-current/for-linus

Re: [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing

2018-01-18 Thread jianchao.wang

Hi Keith

Thanks for the kind reminder.

On 01/19/2018 02:05 PM, Keith Busch wrote:
>>> The driver may be giving up on the command here, but that doesn't mean
>>> the controller has. We can't just end the request like this because that
>>> will release the memory the controller still owns. We must wait until
>>> after nvme_dev_disable clears bus master because we can't say for sure
>>> the controller isn't going to write to that address right after we end
>>> the request.
>>>
>> Yes, but the controller is going to be reset or shut down at that moment,
>> even if the controller accesses a bad address and goes wrong, everything will
>> be ok after reset or shutdown. :)
> Hm, I don't follow. DMA access after free is never okay.
Yes, this may cause unexpected memory corruption.

Thanks
Jianchao
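
The ordering constraint Keith describes (do not end a request while the controller may still DMA into its memory) can be sketched in userspace as follows. This is only an illustration under stated assumptions: `struct controller`, `struct request` and all three functions are hypothetical names, not nvme driver code, and clearing bus master is reduced to a boolean flag.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model; none of these names come from the nvme driver. */
struct controller {
    bool bus_master;    /* true: the device may still DMA */
};

struct request {
    void *dma_buf;      /* memory the device may write to */
    bool timed_out;
};

/* Stand-in for disabling the device: afterwards it can no longer DMA. */
void controller_disable(struct controller *c)
{
    c->bus_master = false;
}

/* Timeout path: only mark the request; completion is deferred. */
void request_timeout(struct request *rq)
{
    rq->timed_out = true;
}

/* End a request and release its buffer. Refuses while DMA is still
 * possible. Returns true if the request was ended. */
bool request_end(struct controller *c, struct request *rq)
{
    if (c->bus_master)
        return false;   /* the device still owns dma_buf */
    free(rq->dma_buf);
    rq->dma_buf = NULL;
    return true;
}
```

In this model a timed-out request keeps its buffer until `controller_disable()` has run, which mirrors the point that giving up on a command in the driver does not mean the controller has given up on it.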

[git pull] drm fixes for 4.15 final

2018-01-18 Thread Dave Airlie

Hi Linus,

This is a set of drm regression fixes that I'd like to get into 4.15
final, but I understand if it's too much too late, and am happy to
drop these into -next and make people chase the stable monkey.

The i915 change fixes a display corruption problem introduced in 4.15,
the nouveau changes are for regressions in 4.15, one of the vmwgfx
fixes goes back a little further, the other is a 4.15 regression fix,
the 3 sun4i changes fix blank HDMI output on those devices.

Again happy if you don't take these, just let me know, I suspect 4.15
will have a lot of stable backports for security things over time!

Thanks,
Dave.


The following changes since commit a8750ddca918032d6349adbf9a4b6555e7db20da:

  Linux 4.15-rc8 (2018-01-14 15:32:30 -0800)

are available in the git repository at:

  git://people.freedesktop.org/~airlied/linux tags/drm-fixes-for-v4.15-rc9

for you to fetch changes up to 04cef3eadcf0bf9783a985286cc5f48c5d33fd7a:

  Merge tag 'drm-intel-fixes-2018-01-18' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes (2018-01-19 12:40:07 +1000)


nouveau, i915, vmwgfx and sun4i regression fixes


Ben Skeggs (1):
  drm/nouveau/mmu/mcp77: fix regressions in stolen memory handling

Dave Airlie (4):
  Merge branch 'vmwgfx-fixes-4.15' of git://people.freedesktop.org/~thomash/linux into drm-fixes
  Merge tag 'drm-misc-fixes-2018-01-17' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
  Merge branch 'linux-4.15' of git://github.com/skeggsb/linux into drm-fixes
  Merge tag 'drm-intel-fixes-2018-01-18' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

Jon Hunter (1):
  drm/nouveau/bar/gk20a: Avoid bar teardown during init

Jonathan Liu (3):
  drm/sun4i: hdmi: Check for unset best_parent in sun4i_tmds_determine_rate
  drm/sun4i: hdmi: Fix incorrect assignment in sun4i_tmds_determine_rate
  drm/sun4i: hdmi: Add missing rate halving check in sun4i_tmds_determine_rate

Rob Clark (1):
  drm/vmwgfx: fix memory corruption with legacy/sou connectors

Thierry Reding (1):
  drm/nouveau/drm/nouveau: Pass the proper arguments to nvif_object_map_handle()

Ville Syrjälä (3):
  drm/i915: Add .get_hw_state() method for planes
  drm/i915: Redo plane sanitation during readout
  drm/i915: Fix deadlock in i830_disable_pipe()

Woody Suwalski (1):
  drm/vmwgfx: Fix a boot time warning

 drivers/gpu/drm/i915/intel_display.c   | 303 +++--
 drivers/gpu/drm/i915/intel_drv.h   |   2 +
 drivers/gpu/drm/i915/intel_sprite.c|  83 ++
 drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h  |   1 +
 drivers/gpu/drm/nouveau/nouveau_bo.c   |   4 +-
 drivers/gpu/drm/nouveau/nvkm/engine/device/base.c  |   4 +-
 drivers/gpu/drm/nouveau/nvkm/subdev/bar/base.c |   3 +-
 drivers/gpu/drm/nouveau/nvkm/subdev/bar/gk20a.c|   1 -
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/Kbuild |   2 +
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/mcp77.c|  41 +++
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h  |  10 +
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmmcp77.c |  45 +++
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c  |  16 +-
 drivers/gpu/drm/sun4i/sun4i_hdmi_tmds_clk.c|   9 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.c|   2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c|   4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c   |   4 +-
 17 files changed, 367 insertions(+), 167 deletions(-)
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/mcp77.c
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmmcp77.c

Re: [RESEND PATCH 3/3] x86/apic: Clean up the names of legacy irq mode setting related functions

2018-01-18 Thread Dou Liyang


Hi Baoquan,

At 01/05/2018 12:39 PM, Baoquan He wrote:
[...]

  /*
- * Not an __init, needed by kexec/kdump code.
- * For safety IO-APIC and Local APIC need be cleared before this.
+ * In legacy irq mode, full DOS compatibility with the uniprocessor PC/AT is
+ * provided by using the APICs in conjunction with standard 8259A-equivalent
+ * programmable interrupt controllers (PICs). It's necessary to deliver legacy
+ * interrupts even when APIC mode is not enabled. This is required by kexec/
+ * kdump before enter into the 2nd kernel.
   */
  void switch_to_legacy_irq_mode(void)
  {
if (!nr_legacy_irqs())
return;
  
-	x86_io_apic_ops.disable();

+   ioapic_set_virtual_wire_mode();
+
+   if (boot_cpu_has(X86_FEATURE_APIC) || apic_from_smp_config())
+   lapic_set_legacy_irq_mode(ioapic_i8259.pin != -1);


It seems these two functions, ioapic/lapic_set_legacy_irq_mode, should be
mutually exclusive.

But we call both because the through-lapic and the through-ioapic virtual
wire modes both need to set APIC_SPIV_APIC_ENABLED, which is done only in
lapic_set_legacy_irq_mode(). So we need to call them both.

IMO, this cleanup may not make things clearer. We can either separate these
two modes completely or just keep it as it was before.


Thanks,
dou.

  }
  
  #ifdef CONFIG_X86_32

diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 1151ccd72ce9..c30f0f273dbd 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -148,5 +148,5 @@ void arch_restore_msi_irqs(struct pci_dev *dev)
  
  struct x86_io_apic_ops x86_io_apic_ops __ro_after_init = {

.read   = native_io_apic_read,
-   .disable= native_disable_io_apic,
+   .disable= switch_to_legacy_irq_mode,
  };
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 49721b4e1975..751472ddf536 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -37,7 +37,7 @@ static void irq_remapping_disable_io_apic(void)
 * now.
 */
if (boot_cpu_has(X86_FEATURE_APIC) || apic_from_smp_config())
-   disconnect_bsp_APIC(0);
+   lapic_set_legacy_irq_mode(0);
  }
  
  static void __init irq_remapping_modify_x86_ops(void)

Re: [RESEND] phy: sun4i-usb: add support for R40 USB PHY

2018-01-18 Thread Icenowy Zheng



On January 19, 2018 at 2:25:09 PM GMT+08:00, Chen-Yu Tsai wrote:
>Hi Kishon,
>
>On Mon, Jan 15, 2018 at 11:06 PM, Hermann Lauer wrote:
>> On Wed, Jan 03, 2018 at 04:49:44PM +0800, Icenowy Zheng wrote:
>>> Allwinner R40 features a USB PHY like the one in A64, but with 3 PHYs.
>>>
>>> Add support for it.
>>>
>>> Signed-off-by: Icenowy Zheng 
>>> Acked-by: Maxime Ripard 
>>> Acked-by: Rob Herring 
>>
>> You may add
>>
>> Tested-by: hermann.la...@iwr.uni-heidelberg.de
>
>Gentle ping for this patch to be included in 4.16

I think maybe I forgot PATCH in title so it didn't enter patchwork?

>
>ChenYu
>

Re: [PATCH 6/6] s390: scrub registers on kernel entry and KVM exit

2018-01-18 Thread QingFeng Hao




On 2018/1/17 17:48, Martin Schwidefsky wrote:

Clear all user space registers on entry to the kernel and all KVM guest
registers on KVM guest exit if the register does not contain either a
parameter or a result value.

I am not sure I understand this correctly: is the idea that this will be safer?
And could we abstract these operations into a macro such as CLEAR_REG_7?
Thanks


Suggested-by: Christian Borntraeger 
Reviewed-by: Christian Borntraeger 
Signed-off-by: Martin Schwidefsky 
---
  arch/s390/kernel/entry.S | 41 +
  1 file changed, 41 insertions(+)

diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index 2a22c03..47227d3 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -322,6 +322,12 @@ ENTRY(sie64a)
  sie_exit:
lg  %r14,__SF_EMPTY+8(%r15) # load guest register save area
stmg%r0,%r13,0(%r14)# save guest gprs 0-13
+   xgr %r0,%r0 # clear guest registers
+   xgr %r1,%r1
+   xgr %r2,%r2
+   xgr %r3,%r3
+   xgr %r4,%r4
+   xgr %r5,%r5
lmg %r6,%r14,__SF_GPRS(%r15)# restore kernel registers
lg  %r2,__SF_EMPTY+16(%r15) # return exit reason code
br  %r14
@@ -358,6 +364,7 @@ ENTRY(system_call)
UPDATE_VTIME %r8,%r9,__LC_SYNC_ENTER_TIMER
BPENTER __TI_flags(%r12),_TIF_NOBP
stmg%r0,%r7,__PT_R0(%r11)
+   xgr %r0,%r0
mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC
mvc __PT_PSW(16,%r11),__LC_SVC_OLD_PSW
mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC
@@ -640,6 +647,14 @@ ENTRY(pgm_check_handler)
  4:lgr %r13,%r11
la  %r11,STACK_FRAME_OVERHEAD(%r15)
stmg%r0,%r7,__PT_R0(%r11)
+   xgr %r0,%r0 # clear user space registers
+   xgr %r1,%r1
+   xgr %r2,%r2
+   xgr %r3,%r3
+   xgr %r4,%r4
+   xgr %r5,%r5
+   xgr %r6,%r6
+   xgr %r7,%r7
mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC
stmg%r8,%r9,__PT_PSW(%r11)
mvc __PT_INT_CODE(4,%r11),__LC_PGM_ILC
@@ -706,6 +721,15 @@ ENTRY(io_int_handler)
lmg %r8,%r9,__LC_IO_OLD_PSW
SWITCH_ASYNC __LC_SAVE_AREA_ASYNC,__LC_ASYNC_ENTER_TIMER
stmg%r0,%r7,__PT_R0(%r11)
+   xgr %r0,%r0 # clear user space registers
+   xgr %r1,%r1
+   xgr %r2,%r2
+   xgr %r3,%r3
+   xgr %r4,%r4
+   xgr %r5,%r5
+   xgr %r6,%r6
+   xgr %r7,%r7
+   xgr %r10,%r10
mvc __PT_R8(64,%r11),__LC_SAVE_AREA_ASYNC
stmg%r8,%r9,__PT_PSW(%r11)
mvc __PT_INT_CODE(12,%r11),__LC_SUBCHANNEL_ID
@@ -924,6 +948,15 @@ ENTRY(ext_int_handler)
lmg %r8,%r9,__LC_EXT_OLD_PSW
SWITCH_ASYNC __LC_SAVE_AREA_ASYNC,__LC_ASYNC_ENTER_TIMER
stmg%r0,%r7,__PT_R0(%r11)
+   xgr %r0,%r0 # clear user space registers
+   xgr %r1,%r1
+   xgr %r2,%r2
+   xgr %r3,%r3
+   xgr %r4,%r4
+   xgr %r5,%r5
+   xgr %r6,%r6
+   xgr %r7,%r7
+   xgr %r10,%r10
mvc __PT_R8(64,%r11),__LC_SAVE_AREA_ASYNC
stmg%r8,%r9,__PT_PSW(%r11)
lghi%r1,__LC_EXT_PARAMS2
@@ -1133,6 +1166,14 @@ ENTRY(mcck_int_handler)
  .Lmcck_skip:
lghi%r14,__LC_GPREGS_SAVE_AREA+64
stmg%r0,%r7,__PT_R0(%r11)
+   xgr %r0,%r0 # clear user space registers
+   xgr %r2,%r2
+   xgr %r3,%r3
+   xgr %r4,%r4
+   xgr %r5,%r5
+   xgr %r6,%r6
+   xgr %r7,%r7
+   xgr %r10,%r10
mvc __PT_R8(64,%r11),0(%r14)
stmg%r8,%r9,__PT_PSW(%r11)
xc  __PT_FLAGS(8,%r11),__PT_FLAGS(%r11)


--
Regards
QingFeng Hao

Re: [RESEND] phy: sun4i-usb: add support for R40 USB PHY

2018-01-18 Thread Chen-Yu Tsai

Hi Kishon,

On Mon, Jan 15, 2018 at 11:06 PM, Hermann Lauer
 wrote:
> On Wed, Jan 03, 2018 at 04:49:44PM +0800, Icenowy Zheng wrote:
>> Allwinner R40 features a USB PHY like the one in A64, but with 3 PHYs.
>>
>> Add support for it.
>>
>> Signed-off-by: Icenowy Zheng 
>> Acked-by: Maxime Ripard 
>> Acked-by: Rob Herring 
>
> You may add
>
> Tested-by: hermann.la...@iwr.uni-heidelberg.de

Gentle ping for this patch to be included in 4.16

ChenYu

Re: [RESEND PATCH 2/3] x86/apic/kexec: Enable legacy irq mode before jump to kexec/kdump kernel

2018-01-18 Thread Dou Liyang


Hi Baoquan,

At 01/17/2018 06:08 PM, Baoquan He wrote:

On 01/17/18 at 05:47pm, Dou Liyang wrote:

Hi Baoquan,

At 01/05/2018 12:38 PM, Baoquan He wrote:

In commit

commit 522e66464467 ("x86/apic: Disable I/O APIC before shutdown of the local APIC").

lapic_shutdown() invocation is moved after disable_IO_APIC(). In fact,
disable_IO_APIC() not only calls clear_IO_APIC() to disable the
IO-APIC, it also sets up the LAPIC and IO-APIC so that the system is in PIC
or Virtual Wire mode. Because the above commit puts disable_IO_APIC() earlier,
the local APIC ends up completely disabled afterwards. So the legacy irq mode
is disabled too before the jump to the kexec/kdump kernel.


I have a question:

As you said, because disable_IO_APIC() is triggered before
lapic_shutdown(), the interrupt virtual wire mode will be disabled.

But I found that:

After machine_crash_shutdown() is executed, Linux calls
machine_kexec(), and in machine_kexec() disable_IO_APIC() is
called again. Why can't it switch to virtual wire mode successfully then? Or
is my understanding wrong?

The disable_IO_APIC() calling has a condition check,

if (image->preserve_context) {
disable_IO_APIC();
}

The preserve_context case comes from kernel_kexec(). You can check
it in the kexec man page; it is another scenario we use kexec for, distinct
from the normal kexec and kdump paths.



Understood!

This patch looks good to me and I also tested it, it's OK.

Thanks,
dou.


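+---------------+
| __crash_kexec |
+---------------+
        |
        |    +------------------------+
        +--> | machine_crash_shutdown |
        |    +-----------+------------+
        |                |
        |                |   +-----------------+
        |                +-> | disable_IO_APIC |
        |                |   +-----------------+
        |                |
        |                |   +----------------+
        |                +-> | lapic_shutdown |
        |                    +----------------+
        |
        |    +---------------+
        +--> | machine_kexec |
        |    +-------+-------+
        |            |
        |            |   +-----------------+
        |            +-> | disable_IO_APIC |
        |                +-----------------+
        |
        v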

Thanks,
dou.

In a normal boot, the system defaults to PIC mode or Virtual Wire mode during
initialization, before APIC mode is enabled; this is set up by the BIOS.
But the kexec/kdump kernel won't go through the BIOS, so we should put the
system into PIC or Virtual Wire mode before jumping directly to the kdump
kernel code.

So let's take clear_IO_APIC out of disable_IO_APIC and rename
disable_IO_APIC to switch_to_legacy_irq_mode. Then only call clear_IO_APIC
when the IO-APIC needs to be disabled, and call switch_to_legacy_irq_mode
before the kexec/kdump jump.
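
To make the ordering problem concrete, here is a toy userspace model of the
crash path before and after this change. It is only a sketch: struct irq_model
and the model_* helpers are invented for illustration and merely record state.
They are not the kernel functions with similar names, and the assumption that
shutting down the local APIC also drops legacy mode is taken from the
description above, not verified against hardware.

```c
#include <stdbool.h>

/* Toy state: is the interrupt hardware left in legacy (PIC/virtual wire) mode? */
struct irq_model {
	bool legacy_mode;
};

/* Stand-in for the legacy-mode setup half of the old disable_IO_APIC(). */
static void model_switch_to_legacy(struct irq_model *m)
{
	m->legacy_mode = true;
}

/* Stand-in for lapic_shutdown(); assumed to drop legacy mode as well. */
static void model_lapic_shutdown(struct irq_model *m)
{
	m->legacy_mode = false;
}

/* Old crash flow: legacy setup happens first and is then undone.
 * machine_kexec() does not repeat it, because preserve_context is
 * not set on this path. Returns the mode seen at the jump. */
static bool model_old_crash_flow(void)
{
	struct irq_model m = { .legacy_mode = false };

	model_switch_to_legacy(&m);	/* inside old disable_IO_APIC() */
	model_lapic_shutdown(&m);
	return m.legacy_mode;
}

/* New crash flow: only clear the IO-APIC early (no effect in this model),
 * shut down the local APIC, then switch to legacy mode right before the jump. */
static bool model_new_crash_flow(void)
{
	struct irq_model m = { .legacy_mode = false };

	model_lapic_shutdown(&m);
	model_switch_to_legacy(&m);	/* switch_to_legacy_irq_mode() */
	return m.legacy_mode;
}
```

In this model the old flow reaches the jump with legacy mode off, and the new
flow reaches it with legacy mode on, which is the behaviour the patch is after.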

Signed-off-by: Baoquan He 
---
   arch/x86/include/asm/io_apic.h |  3 ++-
   arch/x86/kernel/apic/io_apic.c | 12 
   arch/x86/kernel/crash.c|  2 +-
   arch/x86/kernel/machine_kexec_32.c | 15 +--
   arch/x86/kernel/machine_kexec_64.c | 15 +--
   arch/x86/kernel/reboot.c   |  2 +-
   6 files changed, 18 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index a8834dd546cd..e38ad3863a2c 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -192,7 +192,8 @@ static inline unsigned int io_apic_read(unsigned int apic, unsigned int reg)
   extern void setup_IO_APIC(void);
   extern void enable_IO_APIC(void);
-extern void disable_IO_APIC(void);
+extern void clear_IO_APIC (void);
+extern void switch_to_legacy_irq_mode(void);
   extern int IO_APIC_get_PCI_irq_vector(int bus, int devfn, int pin);
   extern void print_IO_APICs(void);
   #else  /* !CONFIG_X86_IO_APIC */
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 8a7963421460..a47aa915d18c 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -587,7 +587,7 @@ static void clear_IO_APIC_pin(unsigned int apic, unsigned int pin)
   mpc_ioapic_id(apic), pin);
   }
-static void clear_IO_APIC (void)
+void clear_IO_APIC (void)
   {
int apic, pin;
@@ -1439,15 +1439,11 @@ void native_disable_io_apic(void)
   }
   /*
- * Not an __init, needed by the reboot code
+ * Not an __init, needed by kexec/kdump code.
+ * For safety IO-APIC and Local APIC need be cleared before this.
*/
-void disable_IO_APIC(void)
+void switch_to_legacy_irq_mode(void)
   {
-   /*
-* Clear the IO-APIC before rebooting:
-*/
-   clear_IO_APIC();
-
if (!nr_legacy_irqs())
return;
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 10e74d4778a1..318ffeaaf55a 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -199,7 +199,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
   #ifdef CONFIG_X86_IO_APIC
/* Prevent crash_kexec() from deadlocking on ioapic_lock. */
ioapic_zap_locks();
-   disable_IO_APIC();
+   clear_IO_APIC();
   #endif
lapic_shutdown();
   #ifdef CONFIG_HPET_TIMER
diff --git a/arch/x86/kernel/machine_kexec_32.c

Re: [PATCH v22 2/3] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_VQ

2018-01-18 Thread Wei Wang


On 01/18/2018 12:44 AM, Michael S. Tsirkin wrote:

On Wed, Jan 17, 2018 at 01:10:11PM +0800, Wei Wang wrote:
  
+static void virtballoon_changed(struct virtio_device *vdev)

+{
+   struct virtio_balloon *vb = vdev->priv;
+   unsigned long flags;
+   __u32 cmd_id;
+   s64 diff = towards_target(vb);
+
+   if (diff) {
+   spin_lock_irqsave(&vb->stop_update_lock, flags);
+   if (!vb->stop_update)

Why do you ignore stop_update for freeze?
This means new wq entries can be added during remove
causing use after free issues.
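
The concern here is about the usual flag-under-lock guard. As a rough
userspace sketch of that pattern (a pthread mutex stands in for the spinlock;
struct dev_model, model_try_queue() and model_stop() are made-up names for
illustration, not driver code):

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state; not struct virtio_balloon. */
struct dev_model {
	pthread_mutex_t stop_update_lock;
	bool stop_update;
	int queued;		/* number of work items accepted */
};

#define DEV_MODEL_INIT { PTHREAD_MUTEX_INITIALIZER, false, 0 }

/* Config-change path: accept new work only while teardown has not started. */
static bool model_try_queue(struct dev_model *d)
{
	bool accepted = false;

	pthread_mutex_lock(&d->stop_update_lock);
	if (!d->stop_update) {
		d->queued++;
		accepted = true;
	}
	pthread_mutex_unlock(&d->stop_update_lock);
	return accepted;
}

/* Teardown path: set the flag under the lock so no later caller can queue. */
static void model_stop(struct dev_model *d)
{
	pthread_mutex_lock(&d->stop_update_lock);
	d->stop_update = true;
	pthread_mutex_unlock(&d->stop_update_lock);
	/* A real driver would cancel or flush outstanding work after this point. */
}
```

Cancelling work only drains what is already queued; the flag is what keeps a
late config-change callback from queueing something new after that.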


I think stop_update isn't needed, because the locking is already
handled internally by the workqueue APIs. A similar example is
mem_cgroup_css_free() in "mm/memcontrol.c", where no such lock is used
around cancel_work_sync(&memcg->high_work).


Best,
Wei

Re: [PATCH v5 20/44] dt-bindings: clock: Add bindings for TI DA8XX USB PHY clocks

2018-01-18 Thread Sekhar Nori

On Friday 19 January 2018 12:30 AM, David Lechner wrote:
> On 01/18/2018 06:10 AM, Sekhar Nori wrote:
>> On Monday 08 January 2018 07:47 AM, David Lechner wrote:
>>> This adds a new binding for TI DA8XX USB PHY clocks. These clocks are
>>> part
>>> of a syscon register called CFGCHIP3.
>>
>> CFGCHIP2
>>
>>>
>>> Signed-off-by: David Lechner 
>>
>>> +Examples:
>>> +
>>> +    cfgchip: syscon@1417c {
>>> +    compatible = "ti,da830-cfgchip", "syscon", "simple-mfd";
>>> +    reg = <0x1417c 0x14>;
>>> +
>>> +    usb0_phy_clk: usb0-phy-clock {
>>> +    compatible = "ti,da830-usb0-phy-clock";
>>> +    #clock-cells = <0>;
>>> +    clocks = <_refclkin>, <_aux_clk>, < 1>;
>>> +    clock-names = "usb_refclkin", "auxclk", "usb0_lpsc";
>>> +    clock-output-names = "usb0_phy_clk";
>>
>> Probably call this "usb0_phy" to match with the input name used for
>> usb1_phy_clk?
> 
> I was planning on just dropping clock-output-names altogether actually
> since they don't really do anything useful.
> 
> Also, I was considering sending a series to change the con_id for the
> PHY clocks.
> 
> My current revision of the device tree bindings is looking like this:
> 
> usb_phy: usb-phy {
>     compatible = "ti,da830-usb-phy";
>     #phy-cells = <1>;
>     clocks = <&usb_phy_clk 0>, <&usb_phy_clk 1>;
>     clock-names = "usb20_phy", "usb11_phy";
>     status = "disabled";
> };
> usb_phy_clk: usb-phy-clocks {
>     compatible = "ti,da830-usb-phy-clocks";
>     #clock-cells = <1>;
>     clocks = < 1>, <_refclkin>, <_auxclk>;
>     clock-names = "fck", "usb_refclkin", "auxclk";
> };
> 
> The clock-names = "usb20_phy", "usb11_phy" comes from the existing con_ids
> in the PHY driver's clk_get()s.
> 
> However, in device tree, we are usually referring to the USB devices as
> usb0 and usb1 instead of usb20 and usb11, respectively. Figure 6-2 "USB
> Clocking Diagram" in spruh82c.pdf (AM1808 TRM) calls these clocks "CLK48"
> and "CLK48MHz from USB 2.0 PHY", so I was thinking of changing the con_ids
> (and therefore also clock-names) to "usb0_clk48" and "usb1_clk48".
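
For anyone following the con_id discussion: the name a driver passes to
clk_get() is matched against the clock-names list, and the position of the
match selects the entry in the parallel clocks list. A minimal sketch of that
lookup, with model_clock_index() being a hypothetical helper written for this
mail rather than a kernel function:

```c
#include <stddef.h>
#include <string.h>

/* Return the position of con_id in a clock-names style list, or -1 if absent.
 * The position selects the matching entry in the parallel "clocks" list. */
static int model_clock_index(const char *const *names, size_t count,
			     const char *con_id)
{
	for (size_t i = 0; i < count; i++) {
		if (strcmp(names[i], con_id) == 0)
			return (int)i;
	}
	return -1;
}
```

With the proposed names, a driver that still asks for "usb20_phy" would no
longer find a match, so the con_id change and the clock-names change have to
land together.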

This is fine with me.

Thanks,
Sekhar

Re: [PATCH v5 43/44] ARM: da8xx-dt: switch to device tree clocks

2018-01-18 Thread Sekhar Nori

On Friday 19 January 2018 12:10 AM, David Lechner wrote:
> On 01/18/2018 09:27 AM, Sekhar Nori wrote:
>> On Monday 08 January 2018 07:55 AM, David Lechner wrote:
>>> This removes all of the clock init code from da8xx-dt.c. This includes
>>> all of the OF_DEV_AUXDATA that was just used for looking up clocks.
>>>
>>> Note: You need to have clocks defined in your device tree or your system
>>> won't boot after this patch.
>>
>> I am not sure we can do this then, as we cannot break DT compatibility.
>>
> 
> In the past, you have told me that you don't want the .dts changes and code
> changes in the same patch. In this case, if you apply either one

That's still true.

> separately,
> it will break clocks. It does not matter which one is first.
> 
> So either we have to squash [PATCH v5 44/44] ARM: dts: da850: Add clocks
> into this patch or deal with the breakage.

I am not so much concerned about temporary breakage in the middle of the
series, but more about DT compatibility after the entire series is applied.

Thanks,
Sekhar

[PATCH V2 net-next 1/4] net: hns3: add support for get_regs

2018-01-18 Thread Peng Li

From: Fuyun Liang 

This patch adds get_regs support for ethtool cmd.
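
The ethtool wrappers in this patch follow an optional-callback dispatch: the
backend ops table may leave a hook unset, and the wrapper reports that instead
of calling through a NULL pointer. A standalone sketch of the idea, where
struct model_ops, struct model_handle and the model_* functions are
placeholders for illustration and not the hnae3 types:

```c
#include <errno.h>
#include <stddef.h>

struct model_handle;

/* Hypothetical ops table; a backend may leave any callback NULL. */
struct model_ops {
	int (*get_regs_len)(struct model_handle *h);
};

struct model_handle {
	const struct model_ops *ops;
};

/* Wrapper: report "not supported" when the backend has no hook. */
static int model_get_regs_len(struct model_handle *h)
{
	if (!h->ops->get_regs_len)
		return -EOPNOTSUPP;

	return h->ops->get_regs_len(h);
}

/* Example backend hook returning a fixed dump size in bytes. */
static int model_fixed_len(struct model_handle *h)
{
	(void)h;
	return 128;
}
```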

Signed-off-by: Fuyun Liang 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hnae3.h|   3 +-
 drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c |  23 +++
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h |   4 +
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c| 176 +
 4 files changed, 205 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index 634e932..d104ce5 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -356,7 +356,8 @@ struct hnae3_ae_ops {
u32 stringset, u8 *data);
int (*get_sset_count)(struct hnae3_handle *handle, int stringset);
 
-   void (*get_regs)(struct hnae3_handle *handle, void *data);
+   void (*get_regs)(struct hnae3_handle *handle, u32 *version,
+void *data);
int (*get_regs_len)(struct hnae3_handle *handle);
 
u32 (*get_rss_key_size)(struct hnae3_handle *handle);
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
index 358f780..1c8b293 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
@@ -1063,6 +1063,27 @@ static int hns3_set_coalesce(struct net_device *netdev,
return 0;
 }
 
+static int hns3_get_regs_len(struct net_device *netdev)
+{
+   struct hnae3_handle *h = hns3_get_handle(netdev);
+
+   if (!h->ae_algo->ops->get_regs_len)
+   return -EOPNOTSUPP;
+
+   return h->ae_algo->ops->get_regs_len(h);
+}
+
+static void hns3_get_regs(struct net_device *netdev,
+ struct ethtool_regs *cmd, void *data)
+{
+   struct hnae3_handle *h = hns3_get_handle(netdev);
+
+   if (!h->ae_algo->ops->get_regs)
+   return;
+
+   h->ae_algo->ops->get_regs(h, &cmd->version, data);
+}
+
 static const struct ethtool_ops hns3vf_ethtool_ops = {
.get_drvinfo = hns3_get_drvinfo,
.get_ringparam = hns3_get_ringparam,
@@ -1103,6 +1124,8 @@ static const struct ethtool_ops hns3_ethtool_ops = {
.set_channels = hns3_set_channels,
.get_coalesce = hns3_get_coalesce,
.set_coalesce = hns3_set_coalesce,
+   .get_regs_len = hns3_get_regs_len,
+   .get_regs = hns3_get_regs,
 };
 
 void hns3_ethtool_set_ops(struct net_device *netdev)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
index 3c3159b..2561e7a 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
@@ -102,6 +102,10 @@ enum hclge_opcode_type {
HCLGE_OPC_STATS_64_BIT  = 0x0030,
HCLGE_OPC_STATS_32_BIT  = 0x0031,
HCLGE_OPC_STATS_MAC = 0x0032,
+
+   HCLGE_OPC_QUERY_REG_NUM = 0x0040,
+   HCLGE_OPC_QUERY_32_BIT_REG  = 0x0041,
+   HCLGE_OPC_QUERY_64_BIT_REG  = 0x0042,
/* Device management command */
 
/* MAC commond */
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 27f0ab6..c3d2cca 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -5544,6 +5544,180 @@ static int hclge_set_channels(struct hnae3_handle *handle, u32 new_tqps_num)
return ret;
 }
 
+static int hclge_get_regs_num(struct hclge_dev *hdev, u32 *regs_num_32_bit,
+ u32 *regs_num_64_bit)
+{
+   struct hclge_desc desc;
+   u32 total_num;
+   int ret;
+
+   hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_QUERY_REG_NUM, true);
+   ret = hclge_cmd_send(&hdev->hw, &desc, 1);
+   if (ret) {
+   dev_err(&hdev->pdev->dev,
+   "Query register number cmd failed, ret = %d.\n", ret);
+   return ret;
+   }
+
+   *regs_num_32_bit = le32_to_cpu(desc.data[0]);
+   *regs_num_64_bit = le32_to_cpu(desc.data[1]);
+
+   total_num = *regs_num_32_bit + *regs_num_64_bit;
+   if (!total_num)
+   return -EINVAL;
+
+   return 0;
+}
+
+static int hclge_get_32_bit_regs(struct hclge_dev *hdev, u32 regs_num,
+void *data)
+{
+#define HCLGE_32_BIT_REG_RTN_DATANUM 8
+
+   struct hclge_desc *desc;
+   u32 *reg_val = data;
+   __le32 *desc_data;
+   int cmd_num;
+   int i, k, n;
+   int ret;
+
+   if (regs_num == 0)
+   return 0;
+
+   cmd_num = DIV_ROUND_UP(regs_num + 2, HCLGE_32_BIT_REG_RTN_DATANUM);
+   desc = kcalloc(cmd_num, sizeof(struct hclge_desc), GFP_KERNEL);
+   if (!desc)
+   return -ENOMEM;
+
+

[PATCH V2 net-next 2/4] net: hns3: add manager table initialization for hardware

2018-01-18 Thread Peng Li

From: Fuyun Liang 

The manager table is empty by default. If it is not initialized,
management packets such as LLDP will be dropped by hardware. Default
entries need to be added to the manager table.
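
For illustration: the default entry in this patch matches the LLDP
nearest-bridge multicast address 01:80:C2:00:00:0E, stored as a 32-bit
high part and a 16-bit low part. A minimal userspace sketch of that split
follows. struct mac_split and split_mac() are hypothetical names used only
for this example and are not part of the driver; the byte-order
conversions the patch applies (htonl()/htons() and cpu_to_le32()/
cpu_to_le16()) are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a MAC address split the way the table entry
 * stores it, as a 32-bit high part and a 16-bit low part. */
struct mac_split {
	uint32_t hi32;	/* bytes 0..3 of the address */
	uint16_t lo16;	/* bytes 4..5 of the address */
};

/* Pack the six address bytes, most significant byte first. */
static struct mac_split split_mac(const uint8_t mac[6])
{
	struct mac_split s;

	assert(mac != 0);
	s.hi32 = ((uint32_t)mac[0] << 24) | ((uint32_t)mac[1] << 16) |
		 ((uint32_t)mac[2] << 8) | (uint32_t)mac[3];
	s.lo16 = (uint16_t)(((uint16_t)mac[4] << 8) | mac[5]);
	return s;
}
```

Passing the bytes 01 80 C2 00 00 0E gives the two constants that appear in
hclge_mgr_table, 0x0180C200 and 0x000E.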

Signed-off-by: Fuyun Liang 
Signed-off-by: Peng Li 
---
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h |  22 +
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c| 101 +
 2 files changed, 123 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
index 2561e7a..1cd28e0 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
@@ -605,6 +605,28 @@ struct hclge_mac_vlan_mask_entry_cmd {
u8 rsv2[14];
 };
 
+#define HCLGE_MAC_MGR_MASK_VLAN_B  BIT(0)
+#define HCLGE_MAC_MGR_MASK_MAC_B   BIT(1)
+#define HCLGE_MAC_MGR_MASK_ETHERTYPE_B BIT(2)
+#define HCLGE_MAC_ETHERTYPE_LLDP   0x88cc
+
+struct hclge_mac_mgr_tbl_entry_cmd {
+   u8  flags;
+   u8  resp_code;
+   __le16  vlan_tag;
+   __le32  mac_addr_hi32;
+   __le16  mac_addr_lo16;
+   __le16  rsv1;
+   __le16  ethter_type;
+   __le16  egress_port;
+   __le16  egress_queue;
+   u8  sw_port_id_aware;
+   u8  rsv2;
+   u8  i_port_bitmap;
+   u8  i_port_direction;
+   u8  rsv3[2];
+};
+
 #define HCLGE_CFG_MTA_MAC_SEL_S        0x0
 #define HCLGE_CFG_MTA_MAC_SEL_M        GENMASK(1, 0)
 #define HCLGE_CFG_MTA_MAC_EN_B 0x7
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index c3d2cca..6e64bed 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -392,6 +392,16 @@ static const struct hclge_comm_stats_str g_mac_stats_string[] = {
HCLGE_MAC_STATS_FIELD_OFF(mac_rx_send_app_bad_pkt_num)}
 };
 
+static const struct hclge_mac_mgr_tbl_entry_cmd hclge_mgr_table[] = {
+   {
+   .flags = HCLGE_MAC_MGR_MASK_VLAN_B,
+   .ethter_type = cpu_to_le16(HCLGE_MAC_ETHERTYPE_LLDP),
+   .mac_addr_hi32 = cpu_to_le32(htonl(0x0180C200)),
+   .mac_addr_lo16 = cpu_to_le16(htons(0x000E)),
+   .i_port_bitmap = 0x1,
+   },
+};
+
 static int hclge_64_bit_update_stats(struct hclge_dev *hdev)
 {
 #define HCLGE_64_BIT_CMD_NUM 5
@@ -4249,6 +4259,91 @@ int hclge_rm_mc_addr_common(struct hclge_vport *vport,
return status;
 }
 
+static int hclge_get_mac_ethertype_cmd_status(struct hclge_dev *hdev,
+ u16 cmdq_resp, u8 resp_code)
+{
+#define HCLGE_ETHERTYPE_SUCCESS_ADD        0
+#define HCLGE_ETHERTYPE_ALREADY_ADD        1
+#define HCLGE_ETHERTYPE_MGR_TBL_OVERFLOW   2
+#define HCLGE_ETHERTYPE_KEY_CONFLICT       3
+
+   int return_status;
+
+   if (cmdq_resp) {
+   dev_err(&hdev->pdev->dev,
+   "cmdq execute failed for get_mac_ethertype_cmd_status, status=%d.\n",
+   cmdq_resp);
+   return -EIO;
+   }
+
+   switch (resp_code) {
+   case HCLGE_ETHERTYPE_SUCCESS_ADD:
+   case HCLGE_ETHERTYPE_ALREADY_ADD:
+   return_status = 0;
+   break;
+   case HCLGE_ETHERTYPE_MGR_TBL_OVERFLOW:
+   dev_err(&hdev->pdev->dev,
+   "add mac ethertype failed for manager table overflow.\n");
+   return_status = -EIO;
+   break;
+   case HCLGE_ETHERTYPE_KEY_CONFLICT:
+   dev_err(&hdev->pdev->dev,
+   "add mac ethertype failed for key conflict.\n");
+   return_status = -EIO;
+   break;
+   default:
+   dev_err(&hdev->pdev->dev,
+   "add mac ethertype failed for undefined, code=%d.\n",
+   resp_code);
+   return_status = -EIO;
+   }
+
+   return return_status;
+}
+
+static int hclge_add_mgr_tbl(struct hclge_dev *hdev,
+const struct hclge_mac_mgr_tbl_entry_cmd *req)
+{
+   struct hclge_desc desc;
+   u8 resp_code;
+   u16 retval;
+   int ret;
+
+   hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_MAC_ETHTYPE_ADD, false);
+   memcpy(desc.data, req, sizeof(struct hclge_mac_mgr_tbl_entry_cmd));
+
+   ret = hclge_cmd_send(&hdev->hw, &desc, 1);
+   if (ret) {
+   dev_err(&hdev->pdev->dev,
+   "add mac ethertype failed for cmd_send, ret =%d.\n",
+   ret);
+   return ret;
+   }
+
+   resp_code = (le32_to_cpu(desc.data[0]) >> 8) & 0xff;
+   retval = le16_to_cpu(desc.retval);
+
+   return hclge_get_mac_ethertype_cmd_status(hdev, retval, resp_code);
+}
+
+static int init_mgr_tbl(struct hclge_dev *hdev)
+{
+   int

[PATCH V2 net-next 3/4] net: hns3: add ethtool -p support for fiber port

2018-01-18 Thread Peng Li

From: Jian Shen 

Add LED location support for fiber ports. The LED keeps blinking
while locating.

Signed-off-by: Jian Shen 
Signed-off-by: Peng Li 
---
 drivers/net/ethernet/hisilicon/hns3/hnae3.h|  2 +
 drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 12 
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h | 20 +++
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c| 70 ++
 4 files changed, 104 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index d104ce5..fd06bc7 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -405,6 +405,8 @@ struct hnae3_ae_ops {
int (*set_channels)(struct hnae3_handle *handle, u32 new_tqps_num);
void (*get_flowctrl_adv)(struct hnae3_handle *handle,
 u32 *flowctrl_adv);
+   int (*set_led_id)(struct hnae3_handle *handle,
+ enum ethtool_phys_id_state status);
 };
 
 struct hnae3_dcb_ops {
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
index 1c8b293..7410205 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
@@ -1084,6 +1084,17 @@ static void hns3_get_regs(struct net_device *netdev,
h->ae_algo->ops->get_regs(h, &cmd->version, data);
 }
 
+static int hns3_set_phys_id(struct net_device *netdev,
+   enum ethtool_phys_id_state state)
+{
+   struct hnae3_handle *h = hns3_get_handle(netdev);
+
+   if (!h->ae_algo || !h->ae_algo->ops || !h->ae_algo->ops->set_led_id)
+   return -EOPNOTSUPP;
+
+   return h->ae_algo->ops->set_led_id(h, state);
+}
+
 static const struct ethtool_ops hns3vf_ethtool_ops = {
.get_drvinfo = hns3_get_drvinfo,
.get_ringparam = hns3_get_ringparam,
@@ -1126,6 +1137,7 @@ static const struct ethtool_ops hns3_ethtool_ops = {
.set_coalesce = hns3_set_coalesce,
.get_regs_len = hns3_get_regs_len,
.get_regs = hns3_get_regs,
+   .set_phys_id = hns3_set_phys_id,
 };
 
 void hns3_ethtool_set_ops(struct net_device *netdev)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
index 1cd28e0..122f862 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
@@ -227,6 +227,9 @@ enum hclge_opcode_type {
 
/* Mailbox cmd */
HCLGEVF_OPC_MBX_PF_TO_VF= 0x2000,
+
+   /* Led command */
+   HCLGE_OPC_LED_STATUS_CFG= 0xB000,
 };
 
 #define HCLGE_TQP_REG_OFFSET   0x8
@@ -807,6 +810,23 @@ struct hclge_reset_cmd {
 #define HCLGE_NIC_CMQ_DESC_NUM 1024
 #define HCLGE_NIC_CMQ_DESC_NUM_S   3
 
+#define HCLGE_LED_PORT_SPEED_STATE_S   0
+#define HCLGE_LED_PORT_SPEED_STATE_M   GENMASK(5, 0)
+#define HCLGE_LED_ACTIVITY_STATE_S 0
+#define HCLGE_LED_ACTIVITY_STATE_M GENMASK(1, 0)
+#define HCLGE_LED_LINK_STATE_S 0
+#define HCLGE_LED_LINK_STATE_M GENMASK(1, 0)
+#define HCLGE_LED_LOCATE_STATE_S   0
+#define HCLGE_LED_LOCATE_STATE_M   GENMASK(1, 0)
+
+struct hclge_set_led_state_cmd {
+   u8 port_speed_led_config;
+   u8 link_led_config;
+   u8 activity_led_config;
+   u8 locate_led_config;
+   u8 rsv[20];
+};
+
 int hclge_cmd_init(struct hclge_dev *hdev);
 static inline void hclge_write_reg(void __iomem *base, u32 reg, u32 value)
 {
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 6e64bed..12150f2 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -5819,6 +5819,75 @@ static void hclge_get_regs(struct hnae3_handle *handle, u32 *version,
"Get 64 bit register failed, ret = %d.\n", ret);
 }
 
+static int hclge_set_led_status_sfp(struct hclge_dev *hdev, u8 speed_led_status,
+   u8 act_led_status, u8 link_led_status,
+   u8 locate_led_status)
+{
+   struct hclge_set_led_state_cmd *req;
+   struct hclge_desc desc;
+   int ret;
+
+   hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_LED_STATUS_CFG, false);
+
+   req = (struct hclge_set_led_state_cmd *)desc.data;
+   hnae_set_field(req->port_speed_led_config, HCLGE_LED_PORT_SPEED_STATE_M,
+  HCLGE_LED_PORT_SPEED_STATE_S, speed_led_status);
+   hnae_set_field(req->link_led_config, HCLGE_LED_ACTIVITY_STATE_M,
+  HCLGE_LED_ACTIVITY_STATE_S, act_led_status);
+   hnae_set_field(req->activity_led_config, HCLGE_LED_LINK_STATE_M,
+  HCLGE_LED_LINK_STATE_S, link_led_status);
+

[PATCH V2 net-next 4/4] net: hns3: add net status led support for fiber port

2018-01-18 Thread Peng Li

From: Jian Shen 

Check the net status every second, including port speed, total rx/tx
packets and link status, and update the LED status for fiber ports.

Signed-off-by: Jian Shen 
Signed-off-by: Peng Li 
---
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h |   1 +
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c| 109 +
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h|   3 +
 3 files changed, 113 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
index 122f862..3fd10a6 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
@@ -115,6 +115,7 @@ enum hclge_opcode_type {
HCLGE_OPC_QUERY_LINK_STATUS = 0x0307,
HCLGE_OPC_CONFIG_MAX_FRM_SIZE   = 0x0308,
HCLGE_OPC_CONFIG_SPEED_DUP  = 0x0309,
+   HCLGE_OPC_STATS_MAC_TRAFFIC = 0x0314,
/* MACSEC command */
 
/* PFC/Pause CMD*/
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 12150f2..32bc6f6 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -39,6 +39,7 @@ static int hclge_set_mta_filter_mode(struct hclge_dev *hdev,
 static int hclge_set_mtu(struct hnae3_handle *handle, int new_mtu);
 static int hclge_init_vlan_config(struct hclge_dev *hdev);
 static int hclge_reset_ae_dev(struct hnae3_ae_dev *ae_dev);
+static int hclge_update_led_status(struct hclge_dev *hdev);
 
 static struct hnae3_ae_algo ae_algo;
 
@@ -505,6 +506,38 @@ static int hclge_32_bit_update_stats(struct hclge_dev *hdev)
return 0;
 }
 
+static int hclge_mac_get_traffic_stats(struct hclge_dev *hdev)
+{
+   struct hclge_mac_stats *mac_stats = &hdev->hw_stats.mac_stats;
+   struct hclge_desc desc;
+   __le64 *desc_data;
+   int ret;
+
+   /* For fiber ports, query the total rx/tx packet statistics,
+* which are used to check whether data is being transferred.
+*/
+   if (hdev->hw.mac.media_type != HNAE3_MEDIA_TYPE_FIBER)
+   return 0;
+
+   if (test_bit(HCLGE_STATE_STATISTICS_UPDATING, &hdev->state))
+   return 0;
+
+   hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_STATS_MAC_TRAFFIC, true);
+   ret = hclge_cmd_send(&hdev->hw, &desc, 1);
+   if (ret) {
+   dev_err(&hdev->pdev->dev,
+   "Get MAC total pkt stats fail, ret = %d\n", ret);
+
+   return ret;
+   }
+
+   desc_data = (__le64 *)(&desc.data[0]);
+   mac_stats->mac_tx_total_pkt_num += le64_to_cpu(*desc_data++);
+   mac_stats->mac_rx_total_pkt_num += le64_to_cpu(*desc_data);
+
+   return 0;
+}
+
 static int hclge_mac_update_stats(struct hclge_dev *hdev)
 {
 #define HCLGE_MAC_CMD_NUM 21
@@ -2846,13 +2879,20 @@ static void hclge_service_task(struct work_struct *work)
struct hclge_dev *hdev =
container_of(work, struct hclge_dev, service_task);
 
+   /* The total rx/tx packet statistics should be updated every
+* second. Both hclge_update_stats_for_all() and
+* hclge_mac_get_traffic_stats() can do it.
+*/
if (hdev->hw_stats.stats_timer >= HCLGE_STATS_TIMER_INTERVAL) {
hclge_update_stats_for_all(hdev);
hdev->hw_stats.stats_timer = 0;
+   } else {
+   hclge_mac_get_traffic_stats(hdev);
}
 
hclge_update_speed_duplex(hdev);
hclge_update_link_status(hdev);
+   hclge_update_led_status(hdev);
hclge_service_complete(hdev);
 }
 
@@ -5888,6 +5928,75 @@ static int hclge_set_led_id(struct hnae3_handle *handle,
return ret;
 }
 
+enum hclge_led_port_speed {
+   HCLGE_SPEED_LED_FOR_1G,
+   HCLGE_SPEED_LED_FOR_10G,
+   HCLGE_SPEED_LED_FOR_25G,
+   HCLGE_SPEED_LED_FOR_40G,
+   HCLGE_SPEED_LED_FOR_50G,
+   HCLGE_SPEED_LED_FOR_100G,
+};
+
+static u8 hclge_led_get_speed_status(u32 speed)
+{
+   u8 speed_led;
+
+   switch (speed) {
+   case HCLGE_MAC_SPEED_1G:
+   speed_led = HCLGE_SPEED_LED_FOR_1G;
+   break;
+   case HCLGE_MAC_SPEED_10G:
+   speed_led = HCLGE_SPEED_LED_FOR_10G;
+   break;
+   case HCLGE_MAC_SPEED_25G:
+   speed_led = HCLGE_SPEED_LED_FOR_25G;
+   break;
+   case HCLGE_MAC_SPEED_40G:
+   speed_led = HCLGE_SPEED_LED_FOR_40G;
+   break;
+   case HCLGE_MAC_SPEED_50G:
+   speed_led = HCLGE_SPEED_LED_FOR_50G;
+   break;
+   case HCLGE_MAC_SPEED_100G:
+   speed_led = HCLGE_SPEED_LED_FOR_100G;
+   break;
+   default:
+   speed_led = HCLGE_LED_NO_CHANGE;
+   }
+
+   return speed_led;
+}
+
+static int hclge_update_led_status(struct hclge_dev *hdev)
+{
+   u8 port_speed_status, link_status,

[PATCH V2 net-next 0/4] add some features to hns3 driver

2018-01-18 Thread Peng Li

This patchset adds some features to the hns3 driver, including support
for the ethtool commands -d and -p and support for the manager table.

[Patch 1/4] adds support for ethtool command -d; its op is get_regs.
The driver sends a command to the command queue and gets the number of
registers and the register values back from the command queue.
[Patch 2/4] adds manager table initialization for hardware.
[Patch 3/4] adds support for ethtool command -p. For fiber ports, the
driver sends a command to the command queue, and the IMP writes the SGPIO
registers to control the LEDs.
[Patch 4/4] adds support for the net status LED for fiber ports. Net status
includes port speed, total rx/tx packets and link status. The driver sends
the status to the command queue, and the IMP writes SGPIO to control the LEDs.
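
As a rough illustration of the register query in Patch 1/4: the patch
sizes its descriptor array with
DIV_ROUND_UP(regs_num + 2, HCLGE_32_BIT_REG_RTN_DATANUM). The sketch below
reproduces only that arithmetic in userspace C. The names
desc_count_32bit(), REGS_PER_DESC and HEADER_SLOTS are made up for this
example, and reading the "+ 2" as two value slots taken up by the first
descriptor's header is an assumption, not something the posted hunk states.

```c
#include <assert.h>
#include <stdint.h>

#define REGS_PER_DESC	8u	/* mirrors HCLGE_32_BIT_REG_RTN_DATANUM */
#define HEADER_SLOTS	2u	/* assumption: slots lost to the first header */

/* Integer division rounding up, like the kernel's DIV_ROUND_UP(). */
static uint32_t div_round_up(uint32_t n, uint32_t d)
{
	assert(d != 0);
	return (n + d - 1) / d;
}

/* Number of descriptors needed to read regs_num 32-bit registers. */
static uint32_t desc_count_32bit(uint32_t regs_num)
{
	if (regs_num == 0)
		return 0;	/* the patch returns early in this case */
	return div_round_up(regs_num + HEADER_SLOTS, REGS_PER_DESC);
}
```

Under that reading, up to 6 registers fit in one descriptor and each
further descriptor adds room for 8 more, so 7 registers need two
descriptors and 15 need three.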

---
Change log:
V1 -> V2:
1, fix comments from Andrew Lunn, remove the patch "net: hns3: add
ethtool -p support for phy device".
---

Fuyun Liang (2):
  net: hns3: add support for get_regs
  net: hns3: add manager table initialization for hardware

Jian Shen (2):
  net: hns3: add ethtool -p support for fiber port
  net: hns3: add net status led support for fiber port

 drivers/net/ethernet/hisilicon/hns3/hnae3.h|   5 +-
 drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c |  35 ++
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h |  47 +++
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c| 456 +
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h|   3 +
 5 files changed, 545 insertions(+), 1 deletion(-)

-- 
2.9.3

Re: [PATCH 3/4] drm/gem: adjust per file OOM badness on handling buffers

2018-01-18 Thread Chunming Zhou




On 2018-01-19 00:47, Andrey Grodzovsky wrote:

Large amounts of VRAM are usually not CPU accessible, so they are not mapped
into the process's address space. But since device drivers usually support
swapping buffers from VRAM to system memory, we can still run into an out of
memory situation when userspace starts to allocate too much.

This patch gives the OOM killer another hint about which process is
holding how many resources.
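
The accounting described above can be sketched in userspace C roughly as
follows. This is only an illustration of the idea, not the kernel code:
struct file_acct and the acct_*() helpers are hypothetical, C11 atomics
stand in for the kernel's atomic_long_t, and a 4 KiB page size
(SKETCH_PAGE_SHIFT = 12) is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical per-file state: pages held through buffer handles. */
struct file_acct {
	atomic_long oom_badness;
};

static void acct_init(struct file_acct *f)
{
	assert(f != NULL);
	atomic_init(&f->oom_badness, 0);
}

/* A handle to a buffer of `size` bytes is created for this file. */
static void acct_handle_create(struct file_acct *f, size_t size)
{
	atomic_fetch_add(&f->oom_badness, (long)(size >> SKETCH_PAGE_SHIFT));
}

/* The handle is deleted or released again. */
static void acct_handle_delete(struct file_acct *f, size_t size)
{
	atomic_fetch_sub(&f->oom_badness, (long)(size >> SKETCH_PAGE_SHIFT));
}

/* Value reported as the extra badness hint; 0 when no state exists. */
static long acct_badness(struct file_acct *f)
{
	if (f)
		return atomic_load(&f->oom_badness);
	return 0;
}
```

The counter is kept in pages rather than bytes, matching the
obj->size >> PAGE_SHIFT expression used in the diff below.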

Signed-off-by: Andrey Grodzovsky 
---
  drivers/gpu/drm/drm_file.c | 12 
  drivers/gpu/drm/drm_gem.c  |  8 
  include/drm/drm_file.h |  4 
  3 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index b3c6e99..626cc76 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -747,3 +747,15 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e)
  	spin_unlock_irqrestore(&dev->event_lock, irqflags);
  }
  EXPORT_SYMBOL(drm_send_event);
+
+long drm_oom_badness(struct file *f)
+{
+
+   struct drm_file *file_priv = f->private_data;
+
+   if (file_priv)
+       return atomic_long_read(&file_priv->f_oom_badness);
+
+   return 0;
+}
+EXPORT_SYMBOL(drm_oom_badness);
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 01f8d94..ffbadc8 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -264,6 +264,9 @@ drm_gem_object_release_handle(int id, void *ptr, void *data)
  	drm_gem_remove_prime_handles(obj, file_priv);
  	drm_vma_node_revoke(&obj->vma_node, file_priv);
  
+	atomic_long_sub(obj->size >> PAGE_SHIFT,
+			&file_priv->f_oom_badness);
+
  	drm_gem_object_handle_put_unlocked(obj);
  
  	return 0;

@@ -299,6 +302,8 @@ drm_gem_handle_delete(struct drm_file *filp, u32 handle)
  	idr_remove(&filp->object_idr, handle);
  	spin_unlock(&filp->table_lock);
  
+	atomic_long_sub(obj->size >> PAGE_SHIFT, &filp->f_oom_badness);
+
  	return 0;
  }
  EXPORT_SYMBOL(drm_gem_handle_delete);
@@ -417,6 +422,9 @@ drm_gem_handle_create_tail(struct drm_file *file_priv,
}
  
  	*handlep = handle;

+
+	atomic_long_add(obj->size >> PAGE_SHIFT,
+			&file_priv->f_oom_badness);
For the VRAM case, it should be counted only when the VRAM BO is evicted to
system memory.
For example, if total VRAM is 8GB and total system memory is 8GB, an
application may allocate 7GB of VRAM and 7GB of system memory, which is
allowed. But if we follow your idea, this application will be killed by the
OOM killer, right?


Regards,
David Zhou

return 0;
  
  err_revoke:

diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index 0e0c868..ac3aa75 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -317,6 +317,8 @@ struct drm_file {
  
  	/* private: */

unsigned long lock_count; /* DRI1 legacy lock count */
+
+   atomic_long_t   f_oom_badness;
  };
  
  /**

@@ -378,4 +380,6 @@ void drm_event_cancel_free(struct drm_device *dev,
  void drm_send_event_locked(struct drm_device *dev, struct drm_pending_event *e);
  void drm_send_event(struct drm_device *dev, struct drm_pending_event *e);
  
+long drm_oom_badness(struct file *f);

+
  #endif /* _DRM_FILE_H_ */
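
David's objection in the reply above can be made concrete with a small userspace model. This is only a sketch under stated assumptions, not code from the patch: `struct bo`, `enum placement`, `badness_all()` and `badness_system_only()` are hypothetical names, and a 4 KiB page size is assumed. It compares counting every buffer at handle creation (what the patch does) with counting only buffers currently resident in system memory (what the reply suggests).

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/* Hypothetical placement of a buffer object (not a real DRM type). */
enum placement { PLACEMENT_VRAM, PLACEMENT_SYSTEM };

struct bo {
	uint64_t size;            /* size in bytes */
	enum placement placement; /* where the backing store currently lives */
};

/* Badness as in the patch: every handle counts, regardless of placement. */
static long badness_all(const struct bo *bos, size_t n)
{
	long pages = 0;

	for (size_t i = 0; i < n; i++)
		pages += (long)(bos[i].size >> PAGE_SHIFT);
	return pages;
}

/* Badness as suggested in the reply: only system-memory residents count. */
static long badness_system_only(const struct bo *bos, size_t n)
{
	long pages = 0;

	for (size_t i = 0; i < n; i++)
		if (bos[i].placement == PLACEMENT_SYSTEM)
			pages += (long)(bos[i].size >> PAGE_SHIFT);
	return pages;
}
```

With one 7GB buffer in VRAM and one 7GB buffer in system memory, the first function charges the process for 14GB worth of pages on an 8GB machine, while the second charges only the 7GB that actually occupies system memory.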

Re: [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing

2018-01-18 Thread Keith Busch

On Fri, Jan 19, 2018 at 01:55:29PM +0800, jianchao.wang wrote:
> On 01/19/2018 12:59 PM, Keith Busch wrote:
> > On Thu, Jan 18, 2018 at 06:10:02PM +0800, Jianchao Wang wrote:
> >> +   * - When the ctrl.state is NVME_CTRL_RESETTING, the expired
> >> +   *   request should come from the previous work and we handle
> >> +   *   it as nvme_cancel_request.
> >> +   * - When the ctrl.state is NVME_CTRL_RECONNECTING, the expired
> >> +   *   request should come from the initializing procedure such as
> >> +   *   setup io queues, because all the previous outstanding
> >> +   *   requests should have been cancelled.
> >> */
> >> -  if (dev->ctrl.state == NVME_CTRL_RESETTING) {
> >> -  dev_warn(dev->ctrl.device,
> >> -   "I/O %d QID %d timeout, disable controller\n",
> >> -   req->tag, nvmeq->qid);
> >> -  nvme_dev_disable(dev, false);
> >> +  switch (dev->ctrl.state) {
> >> +  case NVME_CTRL_RESETTING:
> >> +  nvme_req(req)->status = NVME_SC_ABORT_REQ;
> >> +  return BLK_EH_HANDLED;
> >> +  case NVME_CTRL_RECONNECTING:
> >> +  WARN_ON_ONCE(nvmeq->qid);
> >>nvme_req(req)->flags |= NVME_REQ_CANCELLED;
> >>return BLK_EH_HANDLED;
> >> +  default:
> >> +  break;
> >>}
> > 
> > The driver may be giving up on the command here, but that doesn't mean
> > the controller has. We can't just end the request like this because that
> > will release the memory the controller still owns. We must wait until
> > after nvme_dev_disable clears bus master because we can't say for sure
> > the controller isn't going to write to that address right after we end
> > the request.
> > 
> Yes, but the controller is about to be reset or shut down at that moment,
> so even if the controller accesses a bad address and goes wrong, everything
> will be OK after the reset or shutdown. :)

Hm, I don't follow. DMA access after free is never okay.
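
The ordering constraint described above can be sketched as a tiny state model. This is an illustration only, assuming a simplified device with a single "bus master" flag; `struct model_dev`, `struct model_req` and the `model_*()` helpers are hypothetical and are not nvme-pci types or functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; none of these are real nvme-pci types. */
struct model_dev {
	bool bus_master;   /* device may still DMA while this is true */
};

struct model_req {
	bool cancelled;    /* driver has given up on the command */
	bool buf_released; /* DMA buffer handed back to the allocator */
};

/* Timeout path: record the cancellation, but keep the buffer mapped. */
static void model_timeout(struct model_req *req)
{
	req->cancelled = true;
}

/* Models the point in the disable path where bus mastering is cleared. */
static void model_disable(struct model_dev *dev)
{
	dev->bus_master = false;
}

/* Returns true if the buffer was released, false if that was unsafe. */
static bool model_complete(const struct model_dev *dev, struct model_req *req)
{
	if (dev->bus_master)
		return false;  /* device could still write to the buffer */
	req->buf_released = true;
	return true;
}
```

In this model the timeout handler may mark a request as cancelled at any time, but the buffer is only released once the device can no longer initiate DMA, which is the ordering the review asks for.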

RE: [RFC] Per file OOM badness

2018-01-18 Thread He, Roger

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Michal Hocko
Sent: Friday, January 19, 2018 1:14 AM
To: Grodzovsky, Andrey 
Cc: linux...@kvack.org; amd-...@lists.freedesktop.org; 
linux-kernel@vger.kernel.org; dri-de...@lists.freedesktop.org; Koenig, 
Christian 
Subject: Re: [RFC] Per file OOM badness

On Thu 18-01-18 18:00:06, Michal Hocko wrote:
> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
> > Hi, this series is a revised version of an RFC sent by Christian
> > König a few years ago. The original RFC can be found at
> > https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
> > 
> > This is the same idea and I've just addressed his concern from the
> > original RFC and switched to a callback into file_ops instead of a new
> > member in struct file.
> 
> Please add the full description to the cover letter and do not make 
> people hunt links.
> 
> Here is the original cover letter text
> : I'm currently working on the issue that when device drivers allocate memory on
> : behalf of an application the OOM killer usually doesn't know about that unless
> : the application also gets this memory mapped into its address space.
> : 
> : This is especially annoying for graphics drivers where a lot of the VRAM
> : usually isn't CPU accessible and so doesn't make sense to map into the
> : address space of the process using it.
> : 
> : The problem now is that when an application starts to use a lot of VRAM those
> : buffer objects sooner or later get swapped out to system memory, but when we
> : now run into an out of memory situation the OOM killer obviously doesn't know
> : anything about that memory and so usually kills the wrong process.

OK, but how do you attribute that memory to a particular OOM killable 
entity? And how do you actually enforce that those resources get freed on the 
oom killer action?

Here I think we need finer granularity to distinguish whether the buffer is
occupying VRAM or system memory.

> : The following set of patches tries to address this problem by introducing a per
> : file OOM badness score, which device drivers can use to give the OOM killer a
> : hint how many resources are bound to a file descriptor so that it can make
> : better decisions which process to kill.

But files are not killable, they can be shared... In other words this doesn't 
help the oom killer to make an educated guess at all.
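
The sharing problem raised here can be illustrated with a small model. It is a sketch under an explicit assumption: that a task's extra badness is computed as the sum of the per-file scores over its open files. `struct model_file`, `struct model_task` and `model_task_badness()` are hypothetical names, not kernel structures or functions.

```c
#include <stddef.h>

/* Hypothetical model types; not kernel structures. */
struct model_file {
	long badness_pages;  /* pages the driver attributes to this file */
};

struct model_task {
	const struct model_file *files[4];  /* open files (may be shared) */
	size_t nr_files;
};

/* Assumed scoring rule: a task's extra badness is the sum over its files. */
static long model_task_badness(const struct model_task *t)
{
	long sum = 0;

	for (size_t i = 0; i < t->nr_files; i++)
		sum += t->files[i]->badness_pages;
	return sum;
}
```

If two tasks hold the same file, both are charged the full amount under this rule, and killing either one frees nothing while the other still holds the file open.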

> : 
> : So question to everyone: What do you think about this approach?

I think it is just wrong semantically. Non-reclaimable memory is a
pain, especially when there is way too much of it. If you can free that memory
somehow then you can hook into the slab shrinker API and react to memory
pressure. If you can account such memory to a particular process and
make sure that the consumption is bound by the process lifetime then we can
think of an accounting that oom_badness can consider when selecting a victim.

I think there is a misunderstanding here.
Right now, the memory in the TTM pools already has an mm shrinker, which is
set up in ttm_pool_mm_shrink_init.
The memory we want to contribute to OOM badness is not in the TTM pools,
because once a TTM buffer allocation succeeds, that memory has already been
removed from the TTM pools.

Thanks
Roger(Hongbo.He)

--
Michal Hocko
SUSE Labs

Re: [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing

2018-01-18 Thread jianchao.wang

Hi Keith

Thanks for your kind response and guidance.

On 01/19/2018 12:59 PM, Keith Busch wrote:
> On Thu, Jan 18, 2018 at 06:10:02PM +0800, Jianchao Wang wrote:
>> + * - When the ctrl.state is NVME_CTRL_RESETTING, the expired
>> + *   request should come from the previous work and we handle
>> + *   it as nvme_cancel_request.
>> + * - When the ctrl.state is NVME_CTRL_RECONNECTING, the expired
>> + *   request should come from the initializing procedure such as
>> + *   setup io queues, because all the previous outstanding
>> + *   requests should have been cancelled.
>>   */
>> -if (dev->ctrl.state == NVME_CTRL_RESETTING) {
>> -dev_warn(dev->ctrl.device,
>> - "I/O %d QID %d timeout, disable controller\n",
>> - req->tag, nvmeq->qid);
>> -nvme_dev_disable(dev, false);
>> +switch (dev->ctrl.state) {
>> +case NVME_CTRL_RESETTING:
>> +nvme_req(req)->status = NVME_SC_ABORT_REQ;
>> +return BLK_EH_HANDLED;
>> +case NVME_CTRL_RECONNECTING:
>> +WARN_ON_ONCE(nvmeq->qid);
>>  nvme_req(req)->flags |= NVME_REQ_CANCELLED;
>>  return BLK_EH_HANDLED;
>> +default:
>> +break;
>>  }
> 
> The driver may be giving up on the command here, but that doesn't mean
> the controller has. We can't just end the request like this because that
> will release the memory the controller still owns. We must wait until
> after nvme_dev_disable clears bus master because we can't say for sure
> the controller isn't going to write to that address right after we end
> the request.
> 
Yes, but the controller is about to be reset or shut down at that moment,
so even if the controller accesses a bad address and goes wrong, everything
will be OK after the reset or shutdown. :)

Thanks
Jianchao

linux-next: build failure after merge of the powerpc tree

2018-01-18 Thread Stephen Rothwell

Hi all,

After merging the powerpc tree, today's linux-next build (powerpc64
allnoconfig) failed like this:

arch/powerpc/kernel/mce_power.o: In function `.mce_handle_error':
mce_power.c:(.text+0x5a8): undefined reference to `.hash__tlbiel_all'
mce_power.c:(.text+0x6b8): undefined reference to `.hash__tlbiel_all'
arch/powerpc/mm/hash_utils_64.o: In function `.hash__early_init_mmu':
hash_utils_64.c:(.init.text+0x9d0): undefined reference to `.hash__tlbiel_all'

Caused by commit

  d4748276ae14 ("powerpc/64s: Improve local TLB flush for boot and MCE on POWER9")

The definition of hash__tlbiel_all() is in
arch/powerpc/mm/hash_native_64.c which is only built if CONFIG_PPC_NATIVE
is set, which it is not for this build.

I applied a supplied fix patch.

-- 
Cheers,
Stephen Rothwell
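
For readers unfamiliar with this class of link failure, one common way to keep always-built callers linking when an optional file is left out is a config-guarded stub. The sketch below is a generic illustration and not necessarily the fix that was applied; `MODEL_PPC_NATIVE` stands in for CONFIG_PPC_NATIVE, and `model_tlbiel_all()` and `model_handle_error()` are hypothetical names.

```c
#include <stdbool.h>

/*
 * MODEL_PPC_NATIVE stands in for CONFIG_PPC_NATIVE; leave it undefined to
 * mimic the failing allnoconfig build. All names here are hypothetical.
 */
#ifdef MODEL_PPC_NATIVE
/* In the kernel this definition would live in a separately built file. */
static bool model_tlbiel_all(void)
{
	return true;   /* pretend the native flush ran */
}
#else
/* Stub so that callers still link when the native code is not built. */
static inline bool model_tlbiel_all(void)
{
	return false;  /* nothing flushed; a real stub might BUG() instead */
}
#endif

/* A caller that is always built, like the MCE handler in the report. */
static bool model_handle_error(void)
{
	return model_tlbiel_all();
}
```

Without the `#else` branch, the caller would reference a symbol that no object file defines, which is the "undefined reference" seen in the log above.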

Re: [PATCH 2/4] dmaengine: qcom: bam_dma: add num-channels binding for remotely controlled

2018-01-18 Thread Vinod Koul

On Tue, Jan 16, 2018 at 07:02:34PM +, srinivas.kandaga...@linaro.org wrote:
> From: Srinivas Kandagatla 
> 
> When Linux is master of BAM, it can directly read registers to know number
> of supported channels, however when it's remotely controlled reading these
> registers would trigger a crash if the BAM is not yet initialized/powered up
> on the remote side.
> 
> This patch adds num-channels binding to specify number of supported
> dma channels on remotely controlled BAM.
> 
> Signed-off-by: Srinivas Kandagatla 
> ---
>  Documentation/devicetree/bindings/dma/qcom_bam_dma.txt |  2 ++
>  drivers/dma/qcom/bam_dma.c | 13 +++--
>  2 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt b/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt
> index 9cbf5d9df8fd..aa6822cbb230 100644
> --- a/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt
> +++ b/Documentation/devicetree/bindings/dma/qcom_bam_dma.txt
> @@ -15,6 +15,8 @@ Required properties:
>the secure world.
>  - qcom,controlled-remotely : optional, indicates that the bam is controlled 
> by
>remote proccessor i.e. execution environment.
> +- num-channels : optional, indicates supported number of DMA channels in a
> +  remotely controlled bam.
>  
>  Example:
>  
> diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
> index 78e488e8f96d..523bd178047a 100644
> --- a/drivers/dma/qcom/bam_dma.c
> +++ b/drivers/dma/qcom/bam_dma.c
> @@ -1083,8 +1083,10 @@ static int bam_init(struct bam_device *bdev)
>   if (bdev->ee >= val)
>   return -EINVAL;
>  
> - val = readl_relaxed(bam_addr(bdev, 0, BAM_NUM_PIPES));
> - bdev->num_channels = val & BAM_NUM_PIPES_MASK;
> + if (!bdev->num_channels) {
> + val = readl_relaxed(bam_addr(bdev, 0, BAM_NUM_PIPES));
> + bdev->num_channels = val & BAM_NUM_PIPES_MASK;
> + }
>  
>   if (bdev->controlled_remotely)
>   return 0;
> @@ -1179,6 +1181,13 @@ static int bam_dma_probe(struct platform_device *pdev)
>   bdev->controlled_remotely = of_property_read_bool(pdev->dev.of_node,
>   "qcom,controlled-remotely");
>  
> + if (bdev->controlled_remotely) {

Hmm, so if we remove the remotely controlled instances from DT, Linux
won't see them and won't do anything. Do we need to configure these
instances too?

> + ret = of_property_read_u32(pdev->dev.of_node, "num-channels",
> +                            &bdev->num_channels);
> + if (ret)
> + dev_err(bdev->dev, "num-channels unspecified in dt\n");
> + }
> +
>   bdev->bamclk = devm_clk_get(bdev->dev, "bam_clk");
>   if (IS_ERR(bdev->bamclk)) {
>   bdev->bamclk = NULL;
> -- 
> 2.15.1
> 

-- 
~Vinod
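
The fallback logic in the quoted bam_init() hunk can be modelled in userspace as follows. This is a sketch, not driver code: `struct model_bam`, `model_read_num_pipes()` and `model_init_channels()` are hypothetical, the register is simulated by a plain field, and the mask value is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NUM_PIPES_MASK 0xffu /* assumption: low bits hold the count */

struct model_bam {
	bool controlled_remotely;
	uint32_t num_channels;   /* 0 means "not provided by DT" */
	uint32_t num_pipes_reg;  /* stands in for the BAM_NUM_PIPES register */
	unsigned int reg_reads;  /* counts simulated register accesses */
};

/* Simulated register read; on a real remote BAM this access may crash. */
static uint32_t model_read_num_pipes(struct model_bam *bam)
{
	bam->reg_reads++;
	return bam->num_pipes_reg;
}

/* Mirrors the patched logic: only touch hardware if DT gave no value. */
static void model_init_channels(struct model_bam *bam)
{
	if (!bam->num_channels)
		bam->num_channels = model_read_num_pipes(bam) & MODEL_NUM_PIPES_MASK;
}
```

Note that in the quoted patch a remotely controlled BAM without a num-channels property only logs an error, so it would still fall through to the register read; in this model that corresponds to `controlled_remotely` being true while `num_channels` is 0.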

RE: [PATCH] USB TYPEC: RT1711H Type-C Chip Driver

2018-01-18 Thread 李書帆

Hi Jun,

  For now, RT1711H is not fully compatible with TCPCI. So the existing tcpci.c 
may not work for it.

Best Regards,
*
Shu-Fan Lee
Richtek Technology Corporation
TEL: +886-3-5526789 #2359
FAX: +886-3-5526612
*

-Original Message-
From: Jun Li [mailto:jun...@nxp.com]
Sent: Friday, January 19, 2018 11:10 AM
To: ShuFanLee; heikki.kroge...@linux.intel.com
Cc: cy_huang(黃啟原); shufan_lee(李書帆); linux-kernel@vger.kernel.org; 
linux-...@vger.kernel.org; Guenter Roeck
Subject: RE: [PATCH] USB TYPEC: RT1711H Type-C Chip Driver

Hi
> -Original Message-
> From: linux-usb-ow...@vger.kernel.org [mailto:linux-usb-
> ow...@vger.kernel.org] On Behalf Of ShuFanLee
> Sent: Wednesday, January 10, 2018 2:59 PM
> To: heikki.kroge...@linux.intel.com
> Cc: cy_hu...@richtek.com; shufan_...@richtek.com; linux-
> ker...@vger.kernel.org; linux-...@vger.kernel.org
> Subject: [PATCH] USB TYPEC: RT1711H Type-C Chip Driver
>
> From: ShuFanLee 
>
> Richtek RT1711H Type-C chip driver that works with Type-C Port
> Controller Manager to provide USB PD and USB Type-C functionalities.

A general question: is this RT1711H Type-C chip compatible with TCPCI
(Universal Serial Bus Type-C Port Controller Interface Specification)?
It looks like it has the same register map plus some extensions. Can the
existing ./drivers/staging/typec/tcpci.c basically work for you?

+Guenter

Li Jun

>
> Signed-off-by: ShuFanLee 
> ---
>  .../devicetree/bindings/usb/richtek,rt1711h.txt|   38 +
>  arch/arm64/boot/dts/hisilicon/rt1711h.dtsi |   11 +
>  drivers/usb/typec/Kconfig  |2 +
>  drivers/usb/typec/Makefile |1 +
>  drivers/usb/typec/rt1711h/Kconfig  |7 +
>  drivers/usb/typec/rt1711h/Makefile |2 +
>  drivers/usb/typec/rt1711h/rt1711h.c| 2241 
> 
>  drivers/usb/typec/rt1711h/rt1711h.h|  300 +++
>  8 files changed, 2602 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/usb/richtek,rt1711h.txt
>  create mode 100644 arch/arm64/boot/dts/hisilicon/rt1711h.dtsi
>  create mode 100644 drivers/usb/typec/rt1711h/Kconfig
>  create mode 100644 drivers/usb/typec/rt1711h/Makefile
>  create mode 100644 drivers/usb/typec/rt1711h/rt1711h.c
>  create mode 100644 drivers/usb/typec/rt1711h/rt1711h.h
>

RE: [PATCH] USB TYPEC: RT1711H Type-C Chip Driver

2018-01-18 Thread 李書帆

Hi Jun,

  For now, RT1711H is not fully compatible with TCPCI. So the existing tcpci.c 
may not work for it.

Best Regards,
*
Shu-Fan Lee
Richtek Technology Corporation
TEL: +886-3-5526789 #2359
FAX: +886-3-5526612
*

-Original Message-
From: Jun Li [mailto:jun...@nxp.com]
Sent: Friday, January 19, 2018 11:10 AM
To: ShuFanLee; heikki.kroge...@linux.intel.com
Cc: cy_huang(黃啟原); shufan_lee(李書帆); linux-kernel@vger.kernel.org; 
linux-...@vger.kernel.org; Guenter Roeck
Subject: RE: [PATCH] USB TYPEC: RT1711H Type-C Chip Driver

Hi
> -Original Message-
> From: linux-usb-ow...@vger.kernel.org [mailto:linux-usb-
> ow...@vger.kernel.org] On Behalf Of ShuFanLee
> Sent: Wednesday, January 10, 2018 2:59 PM
> To: heikki.kroge...@linux.intel.com
> Cc: cy_hu...@richtek.com; shufan_...@richtek.com; linux-
> ker...@vger.kernel.org; linux-...@vger.kernel.org
> Subject: [PATCH] USB TYPEC: RT1711H Type-C Chip Driver
>
> From: ShuFanLee 
>
> Richtek RT1711H Type-C chip driver that works with Type-C Port
> Controller Manager to provide USB PD and USB Type-C functionalities.

A general question, is this Rt1711h type-c chip compatible with TCPCI 
(Universal Serial Bus Type-C Port Controller Interface Specification)?
looks like it has the same register map and has some extension, can the 
existing ./drivers/staging/typec/tcpic.c basically work for you?

+Guenter

Li Jun

>
> Signed-off-by: ShuFanLee 
> ---
>  .../devicetree/bindings/usb/richtek,rt1711h.txt|   38 +
>  arch/arm64/boot/dts/hisilicon/rt1711h.dtsi |   11 +
>  drivers/usb/typec/Kconfig  |2 +
>  drivers/usb/typec/Makefile |1 +
>  drivers/usb/typec/rt1711h/Kconfig  |7 +
>  drivers/usb/typec/rt1711h/Makefile |2 +
>  drivers/usb/typec/rt1711h/rt1711h.c| 2241 
> 
>  drivers/usb/typec/rt1711h/rt1711h.h|  300 +++
>  8 files changed, 2602 insertions(+)
>  create mode 100644
> Documentation/devicetree/bindings/usb/richtek,rt1711h.txt
>  create mode 100644 arch/arm64/boot/dts/hisilicon/rt1711h.dtsi
>  create mode 100644 drivers/usb/typec/rt1711h/Kconfig  create mode
> 100644 drivers/usb/typec/rt1711h/Makefile
>  create mode 100644 drivers/usb/typec/rt1711h/rt1711h.c
>  create mode 100644 drivers/usb/typec/rt1711h/rt1711h.h
>

Re: [PATCH 1/4] dmaengine: qcom: bam_dma: make bam clk optional

2018-01-18 Thread Vinod Koul

On Tue, Jan 16, 2018 at 07:02:33PM +, srinivas.kandaga...@linaro.org wrote:
> From: Srinivas Kandagatla 
> 
> When BAM is remotely controlled it does not sound correct to control
> its clk on Linux side. Make it optional, so that its not madatory

s/madatory/mandatory

> for remote controlled BAM instances.
> 
> Signed-off-by: Srinivas Kandagatla 
> ---
>  drivers/dma/qcom/bam_dma.c | 15 ---
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
> index 03c4eb3fd314..78e488e8f96d 100644
> --- a/drivers/dma/qcom/bam_dma.c
> +++ b/drivers/dma/qcom/bam_dma.c
> @@ -1180,13 +1180,14 @@ static int bam_dma_probe(struct platform_device *pdev)
>   "qcom,controlled-remotely");
>  
>   bdev->bamclk = devm_clk_get(bdev->dev, "bam_clk");

but you still do clk_get unconditionally?

> - if (IS_ERR(bdev->bamclk))
> - return PTR_ERR(bdev->bamclk);
> -
> - ret = clk_prepare_enable(bdev->bamclk);
> - if (ret) {
> - dev_err(bdev->dev, "failed to prepare/enable clock\n");
> - return ret;
> + if (IS_ERR(bdev->bamclk)) {
> + bdev->bamclk = NULL;
> + } else {
> + ret = clk_prepare_enable(bdev->bamclk);
> + if (ret) {
> + dev_err(bdev->dev, "failed to prepare/enable clock\n");
> + return ret;
> + }

wouldn't it be better to set that an instance is remote controlled and thus
not at all visible to Linux?

>   }
>  
>   ret = bam_init(bdev);
> -- 
> 2.15.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe dmaengine" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
~Vinod

Re: [PATCH] print kdump kernel loaded status in stack dump

2018-01-18 Thread Sergey Senozhatsky

On (01/18/18 10:02), Andi Kleen wrote:
> Dave Young  writes:
> > printk("%sHardware name: %s\n",
> >log_lvl, dump_stack_arch_desc_str);
> > +   if (kexec_crash_loaded())
> > +   printk("%skdump kernel loaded\n", log_lvl);
> 
> Oops/warnings are getting longer and longer, often scrolling away
> from the screen, and if the kernel crashes backscroll does not work
> anymore, so precious information is lost.

true. I even ended up having a console_reflush_on_panic() function. it
simply re-prints with a delay [so I can at least read the oops] logbuf
entries every once in a while, starting with the first oops_in_progress
record.

something like below [it's completely hacked up, but at least gives
an idea]

---

 include/linux/console.h |  1 +
 kernel/panic.c  |  7 +++
 kernel/printk/printk.c  | 39 ++-
 3 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/include/linux/console.h b/include/linux/console.h
index b8920a031a3e..502e3f539448 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -168,6 +168,7 @@ extern void console_unlock(void);
 extern void console_conditional_schedule(void);
 extern void console_unblank(void);
 extern void console_flush_on_panic(void);
+extern void console_reflush_on_panic(void);
 extern struct tty_driver *console_device(int *);
 extern void console_stop(struct console *);
 extern void console_start(struct console *);
diff --git a/kernel/panic.c b/kernel/panic.c
index 2cfef408fec9..39cd59bbfaab 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -137,6 +137,7 @@ void panic(const char *fmt, ...)
va_list args;
long i, i_next = 0;
int state = 0;
+   int reflush_tick = 0;
int old_cpu, this_cpu;
bool _crash_kexec_post_notifiers = crash_kexec_post_notifiers;
 
@@ -298,6 +299,12 @@ void panic(const char *fmt, ...)
i_next = i + 3600 / PANIC_BLINK_SPD;
}
mdelay(PANIC_TIMER_STEP);
+
+   reflush_tick++;
+   if (reflush_tick == 32) { /* don't reflush too often */
+   console_reflush_on_panic();
+   reflush_tick = 0;
+   }
}
 }
 
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 9cb943c90d98..ef3f28d4c741 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -426,6 +426,10 @@ static u32 log_next_idx;
 static u64 console_seq;
 static u32 console_idx;
 
+/* index and sequence number of the record which started the oops print out */
+static u64 log_oops_seq;
+static u32 log_oops_idx;
+
 /* the next printk record to read after the last 'clear' command */
 static u64 clear_seq;
 static u32 clear_idx;
@@ -1736,6 +1740,15 @@ static inline void printk_delay(void)
}
 }
 
+/*
+ * Why do we have printk_delay() in vprintk_emit()
+ * and not in console_unlock()?
+ */
+static inline void console_unlock_delay(void)
+{
+   printk_delay();
+}
+
 /*
  * Continuation lines are buffered, and not committed to the record buffer
  * until the line is complete, or a race forces it. The line fragments
@@ -1849,6 +1862,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 
/* This stops the holder of console_sem just where we want him */
logbuf_lock_irqsave(flags);
+
/*
 * The printf needs to come first; we need the syslog
 * prefix which might be passed-in as a parameter.
@@ -1890,7 +1904,11 @@ asmlinkage int vprintk_emit(int facility, int level,
lflags |= LOG_PREFIX|LOG_NEWLINE;
 
printed_len = log_output(facility, level, lflags, dict, dictlen, text, 
text_len);
-
+   /* Oops... */
+   if (oops_in_progress && !log_oops_seq) {
+   log_oops_seq = log_next_seq;
+   log_oops_idx = log_next_idx;
+   }
logbuf_unlock_irqrestore(flags);
 
/* If called from the scheduler, we can not call up(). */
@@ -2396,6 +2414,7 @@ void console_unlock(void)
 
stop_critical_timings();/* don't trace print latency */
call_console_drivers(ext_text, ext_len, text, len);
+   console_unlock_delay();
start_critical_timings();
 
if (console_lock_spinning_disable_and_check()) {
@@ -2495,6 +2514,24 @@ void console_flush_on_panic(void)
console_unlock();
 }
 
+/**
+ * console_reflush_on_panic - re-flush console content starting from the
+ * first oops_in_progress record
+ */
+void console_reflush_on_panic(void)
+{
+   unsigned long flags;
+
+   logbuf_lock_irqsave(flags);
+   console_seq = log_oops_seq;
+   console_idx = log_oops_idx;
+   logbuf_unlock_irqrestore(flags);
+
+   if (!printk_delay_msec)
+   printk_delay_msec = 273; /* I can't read any faster */
+   console_flush_on_panic();
+}
+
 /*
  * Return the console tty driver structure and its associated index
  */
--
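
The idea in the hack above can be modelled in userspace. This is only a sketch under my own assumptions: struct log_model, log_store(), console_flush() and console_reflush() are hypothetical names, the log is a fixed array instead of the real ring buffer, and an explicit oops_seen flag replaces the "!log_oops_seq" test. It marks the first record stored while an oops is in progress and lets a later re-flush rewind the console cursor to that record.

```c
#include <assert.h>
#include <stddef.h>

#define LOG_CAP 16	/* hypothetical capacity; the real logbuf is a ring */

struct log_model {
	const char *rec[LOG_CAP];
	size_t next_seq;	/* sequence number of the next record to store */
	size_t console_seq;	/* next record the console will print */
	size_t oops_seq;	/* first record stored during the oops */
	int oops_seen;		/* explicit flag, an assumption of this sketch */
};

/* Store one record and remember where the oops output began. */
static void log_store(struct log_model *l, const char *msg, int oops_in_progress)
{
	assert(l->next_seq < LOG_CAP);
	if (oops_in_progress && !l->oops_seen) {
		l->oops_seq = l->next_seq;
		l->oops_seen = 1;
	}
	l->rec[l->next_seq++] = msg;
}

/* "Print" every record the console has not shown yet; return the count. */
static size_t console_flush(struct log_model *l)
{
	size_t printed = l->next_seq - l->console_seq;

	l->console_seq = l->next_seq;
	return printed;
}

/* Rewind the console cursor to the first oops record, then flush again. */
static size_t console_reflush(struct log_model *l)
{
	if (l->oops_seen)
		l->console_seq = l->oops_seq;
	return console_flush(l);
}
```

With two boot records followed by two oops records, the first flush prints four records and a re-flush prints only the two oops records again. In the patch the re-flush is additionally throttled to every 32nd iteration of the panic loop.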

Re: [PATCH] cpufreq: remove at32ap-cpufreq

2018-01-18 Thread Viresh Kumar

On 18-01-18, 21:02, Corentin Labbe wrote:
> Since AVR32 arch was removed, at32ap-cpufreq is useless.
> Remove this driver.
> 
> Signed-off-by: Corentin Labbe 
> ---
>  drivers/cpufreq/Kconfig  |  10 ---
>  drivers/cpufreq/Makefile |   1 -
>  drivers/cpufreq/at32ap-cpufreq.c | 127 
> ---
>  3 files changed, 138 deletions(-)
>  delete mode 100644 drivers/cpufreq/at32ap-cpufreq.c

Acked-by: Viresh Kumar 

-- 
viresh

Re: [PATCH 0/7] PM /Domain/OPP: Add support to get performance state from DT

2018-01-18 Thread Viresh Kumar

On 18-01-18, 20:24, Rafael J. Wysocki wrote:
> On Thursday, January 18, 2018 7:34:04 AM CET Viresh Kumar wrote:
> > On 22-12-17, 12:56, Viresh Kumar wrote:
> > > Hi,
> > > 
> > > Now that the DT bindings [1] are already Reviewed/Acked by respective
> > > maintainers, here is the code to start using them.
> > > 
> > > The first two patches provide helpers in the OPP core, [3-5]/7 update
> > > the PM domain core to start supporting domain OPP tables, etc, 6/7
> > > updates the OPP core to use the new callback provided by the PM domains
> > > to get performance state and the last one removes the unused helpers
> > > now.
> > > 
> > > This is tested on Hikey620 and works just fine.
> > 
> > Ping !
> 
> Well, whom are you pinging exactly and why?

Ulf and Kevin, as it's been almost a month since this series was posted
and it has received no comments at all.

-- 
viresh

RE: [RFC] Per file OOM badness

2018-01-18 Thread He, Roger

Basically the idea looks right to me.

1. But we need finer granularity to control the contribution to OOM badness.
 When the TTM buffer resides in VRAM rather than being evicted to system 
memory, we should not count it toward badness.
 But I think that is not easy to implement.

2. If the TTM buffer (GTT here) is mapped to user space for CPU access, I am 
not sure whether the kernel already accounts for the buffer size.
 If it does, your patches will end up counting the size a second time.

So, I am wondering whether we can count the TTM buffer size in: 
struct mm_rss_stat {
atomic_long_t count[NR_MM_COUNTERS];
};
which the kernel maintains based on the CPU VM (page table).

Something like this:
When a GTT allocation succeeds:
add_mm_counter(vma->vm_mm, MM_ANONPAGES, buffer_size);

When GTT is swapped out:
dec_mm_counter from MM_ANONPAGES first, then 
add_mm_counter(vma->vm_mm, MM_SWAPENTS, buffer_size);  // or MM_SHMEMPAGES, or 
add a new item.

Always update the corresponding item in mm_rss_stat.
That way, we can control the status update accurately. 
What do you think about that?
And is there any side effect to this approach?
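
To make the proposal concrete, here is a userspace sketch of the bookkeeping. It is not kernel code: the gtt_account_*() hooks and counted_pages() are hypothetical names, plain long stands in for atomic_long_t, and sizes are assumed to be in pages. As far as I recall, dec_mm_counter() in the kernel decrements by exactly one, so a bulk move is modelled here as an add with a negative value.

```c
#include <assert.h>

/* Userspace stand-ins for the kernel's per-mm counter indices. */
enum {
	MM_FILEPAGES,
	MM_ANONPAGES,
	MM_SWAPENTS,
	MM_SHMEMPAGES,
	NR_MM_COUNTERS
};

/* Simplified: the kernel uses atomic_long_t for these counters. */
struct mm_rss_stat {
	long count[NR_MM_COUNTERS];
};

/* Add a (possibly negative) number of pages to one counter. */
static void add_counter(struct mm_rss_stat *st, int member, long pages)
{
	assert(member >= 0 && member < NR_MM_COUNTERS);
	st->count[member] += pages;
}

/* Hypothetical hook: a GTT buffer of 'pages' pages was allocated. */
static void gtt_account_alloc(struct mm_rss_stat *st, long pages)
{
	add_counter(st, MM_ANONPAGES, pages);
}

/* Hypothetical hook: the buffer was swapped out, so move it between counters. */
static void gtt_account_swapout(struct mm_rss_stat *st, long pages)
{
	add_counter(st, MM_ANONPAGES, -pages);
	add_counter(st, MM_SWAPENTS, pages);
}

/* Hypothetical hook: the buffer left system memory (freed or moved to VRAM). */
static void gtt_account_release(struct mm_rss_stat *st, int member, long pages)
{
	add_counter(st, member, -pages);
}

/* Sum of all counters, i.e. what this model would charge to the process. */
static long counted_pages(const struct mm_rss_stat *st)
{
	long sum = 0;
	int i;

	for (i = 0; i < NR_MM_COUNTERS; i++)
		sum += st->count[i];
	return sum;
}
```

In this model a swap-out moves the pages from one counter to another without changing the total, and only a release (free, or a move to VRAM) lowers what is charged to the process.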


Thanks
Roger(Hongbo.He)

-Original Message-
From: dri-devel [mailto:dri-devel-boun...@lists.freedesktop.org] On Behalf Of 
Andrey Grodzovsky
Sent: Friday, January 19, 2018 12:48 AM
To: linux-kernel@vger.kernel.org; linux...@kvack.org; 
dri-de...@lists.freedesktop.org; amd-...@lists.freedesktop.org
Cc: Koenig, Christian 
Subject: [RFC] Per file OOM badness

Hi, this series is a revised version of an RFC sent by Christian König a few 
years ago. The original RFC can be found at 
https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html

This is the same idea and I've just addressed his concern from the original RFC 
and switched to a callback into file_ops instead of a new member in struct file.

Thanks,
Andrey

___
dri-devel mailing list
dri-de...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [RFC PATCH] e1000e: Remove Other from EIAC.

2018-01-18 Thread Benjamin Poirier

On 2018/01/18 18:42, Shrikrishna Khare wrote:
> 
> 
> On Thu, 18 Jan 2018, Benjamin Poirier wrote:
> 
> > On 2018/01/18 15:50, Benjamin Poirier wrote:
> > > It was reported that emulated e1000e devices in vmware esxi 6.5 Build
> > > 7526125 do not link up after commit 4aea7a5c5e94 ("e1000e: Avoid receiver
> > > overrun interrupt bursts", v4.15-rc1). Some tracing shows that after
> > > e1000e_trigger_lsc() is called, ICR reads out as 0x0 in e1000_msix_other()
> > > on emulated e1000e devices. In comparison, on real e1000e 82574 hardware,
> > > icr=0x8004 (_INT_ASSERTED | _OTHER) in the same situation.
> > > 
> > > Some experimentation showed that this flaw in vmware e1000e emulation can
> > > be worked around by not setting Other in EIAC. This is how it was before
> > > 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt", v4.5-rc1).
> > 
> > vmware folks, please comment.
> 
> Thank you for bringing this to our attention.
> 
> Using the reported build (ESX 6.5, 7526125) and 4.15.0-rc8+ kernel (which 
> has the said patch), I could bring up e1000e interface (version: 3.2.6-k),
> get dhcp address and even do large file downloads without difficulty.
> 
> Could you give us more pointers on how we may be able to reproduce this 
> locally? Was there anything different with the configuration when the 
> issue was observed? Is the issue consistently reproducible?

It's consistently reproducible, however I noticed that once in a while
there is a genuine "Other" interrupt that comes in and triggers the link
status change. The problem is with interrupts that are triggered via a
write to ICS (such as in e1000e_trigger_lsc()). Can you reproduce a
problem if you do:
ip link set ethX down
ip link set ethX up

If you're building your own kernel, you can add the following patch and
cat /sys/kernel/debug/tracing/trace_pipe

For me it shows on v4.15-rc8:
   <...>-2578  [000]  83527.938321: e1000e_trigger_lsc: trigger_lsc
   <...>-2578  [000] d.h. 83527.938398: e1000_msix_other: icr 0x0

With the patch that I submitted, it shows:
 wickedd-1329  [002] .N..    20.123545: e1000e_trigger_lsc: trigger_lsc
  <idle>-0     [000] d.h.    20.123630: e1000_msix_other: icr 0x8104
  <idle>-0     [000] d.h.    20.123654: e1000_msix_other: lsc
  <idle>-0     [000] d.h.    20.123676: e1000_msix_other: mod_timer
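
Why an ICR read of 0x0 keeps the link down can be seen from a small userspace model of the handler's decisions. This is a sketch, not the driver: msix_other_actions() and the ACT_* flags are hypothetical, the napi_schedule_prep() check is left out, and the ICR_* bit values are written from my recollection of the e1000e register definitions, so verify them against the driver headers before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as recalled from the e1000e definitions (assumption). */
#define ICR_LSC			0x00000004u	/* link status change */
#define ICR_RXO			0x00000040u	/* receiver overrun */
#define ICR_OTHER		0x01000000u	/* "Other" interrupt cause */
#define ICR_INT_ASSERTED	0x80000000u	/* an interrupt is asserted */

/* Hypothetical result flags for this model. */
#define ACT_SCHEDULE_NAPI	0x1u
#define ACT_ARM_WATCHDOG	0x2u

/*
 * Model of the branches in e1000_msix_other(): which follow-up actions
 * would be taken for a given ICR value.
 */
static unsigned int msix_other_actions(uint32_t icr, int going_down)
{
	unsigned int actions = 0;

	assert(going_down == 0 || going_down == 1);

	if (icr & ICR_RXO)
		actions |= ACT_SCHEDULE_NAPI;
	if ((icr & ICR_LSC) && !going_down)
		actions |= ACT_ARM_WATCHDOG;
	return actions;
}
```

If ICR reads back as 0, neither branch is taken, so the watchdog that picks up the link state is never armed. With LSC visible in ICR the watchdog is armed, which matches the "lsc" and "mod_timer" lines in the second trace.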


diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c 
b/drivers/net/ethernet/intel/e1000e/netdev.c
index 9f18d39bdc8f..16620ce840fc 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -1918,22 +1918,29 @@ static irqreturn_t e1000_msix_other(int __always_unused 
irq, void *data)
bool enable = true;
 
icr = er32(ICR);
+   trace_printk("icr 0x%x\n", icr);
+
if (icr & E1000_ICR_RXO) {
+   trace_printk("rxo\n");
ew32(ICR, E1000_ICR_RXO);
enable = false;
/* napi poll will re-enable Other, make sure it runs */
if (napi_schedule_prep(&adapter->napi)) {
+   trace_printk("napi schedule\n");
adapter->total_rx_bytes = 0;
adapter->total_rx_packets = 0;
__napi_schedule(&adapter->napi);
}
}
if (icr & E1000_ICR_LSC) {
+   trace_printk("lsc\n");
ew32(ICR, E1000_ICR_LSC);
hw->mac.get_link_status = true;
/* guard against interrupt when we're going down */
-   if (!test_bit(__E1000_DOWN, &adapter->state))
+   if (!test_bit(__E1000_DOWN, &adapter->state)) {
+   trace_printk("mod_timer\n");
mod_timer(&adapter->watchdog_timer, jiffies + 1);
+   }
}
 
if (enable && !test_bit(__E1000_DOWN, &adapter->state))
@@ -4221,6 +4228,8 @@ static void e1000e_trigger_lsc(struct e1000_adapter 
*adapter)
 {
struct e1000_hw *hw = &adapter->hw;
 
+   trace_printk("trigger_lsc\n");
+
if (adapter->msix_entries)
ew32(ICS, E1000_ICS_LSC | E1000_ICS_OTHER);
else

Re: [PATCH v8 5/5] document: add document for kaslr_mem

2018-01-18 Thread Chao Fan

On Fri, Jan 19, 2018 at 11:53:31AM +0800, Baoquan He wrote:
>On 01/19/18 at 11:36am, Chao Fan wrote:
>> Signed-off-by: Chao Fan 
>> ---
>>  Documentation/admin-guide/kernel-parameters.txt | 10 ++
>>  1 file changed, 10 insertions(+)
>> 
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
>> b/Documentation/admin-guide/kernel-parameters.txt
>> index e2de7c006a74..28a879f62560 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -2350,6 +2350,16 @@
>>  allocations which rules out almost all kernel
>>  allocations. Use with caution!
>>  
>> +kaslr_mem=nn[KMG][@ss[KMG]]
>> +[KNL] Force usage of a specific region of memory
>> +for KASLR during kernel decompression stage.
>> +Region of usable memory is from ss to ss+nn. If ss
>> +is omitted, it is equivalent to kaslr_mem=nn[KMG]@0.
>> +Multiple regions can be specified, comma delimited.
>> +Notice: we support 4 regions at most now.
>
>Better not use 'we' here. You can refer to kernel-parameter.txt.

You are right, so I have resent this part and added several Cc's.

Thanks,
Chao Fan
>
>> +Example:
>> +kaslr_mem=1G,500M@2G,1G@4G
>> +
>>  MTD_Partition=  [MTD]
>>  Format: ,,,
>>  
>> -- 
>> 2.14.3
>> 
>> 
>> 
>
>

[RESEND PATCH v8 5/5] document: add document for kaslr_mem

2018-01-18 Thread Chao Fan

Cc: linux-...@vger.kernel.org
Cc: Jonathan Corbet 
Cc: Randy Dunlap 
Signed-off-by: Chao Fan 
---
 Documentation/admin-guide/kernel-parameters.txt | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index e2de7c006a74..2e3d5fb13f7f 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2350,6 +2350,16 @@
allocations which rules out almost all kernel
allocations. Use with caution!
 
+   kaslr_mem=nn[KMG][@ss[KMG]]
+   [KNL] Force usage of a specific region of memory
+   for KASLR during kernel decompression stage.
+   Region of usable memory is from ss to ss+nn. If ss
+   is omitted, it is equivalent to kaslr_mem=nn[KMG]@0.
+   Multiple regions can be specified, comma delimited.
+   Note: at most 4 regions are currently supported.
+   Example:
+   kaslr_mem=1G,500M@2G,1G@4G
+
MTD_Partition=  [MTD]
Format: ,,,
 
-- 
2.14.3
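
The syntax described in the patch text above (nn[KMG][@ss[KMG]], comma delimited, at most 4 regions, ss defaulting to 0) can be illustrated with a small userspace sketch. This is not the code from the patch series: the names parse_size, parse_kaslr_mem, struct mem_region and MAX_KASLR_REGIONS are hypothetical, chosen only for this example, and the sketch does no overflow or overlap checking.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_KASLR_REGIONS 4	/* limit stated in the documentation text */

/* Hypothetical container for one region: [start, start + size). */
struct mem_region {
	uint64_t start;	/* "ss" in the documented syntax, 0 if omitted */
	uint64_t size;	/* "nn" in the documented syntax */
};

/* Parse "<number>[K|M|G]" at *s and advance *s past it. 0 on success. */
static int parse_size(const char **s, uint64_t *out)
{
	char *end;
	unsigned long long v;

	if (**s < '0' || **s > '9')
		return -1;
	v = strtoull(*s, &end, 0);
	switch (*end) {
	case 'G':
		v <<= 10;
		/* fall through */
	case 'M':
		v <<= 10;
		/* fall through */
	case 'K':
		v <<= 10;
		end++;
		break;
	default:
		break;
	}
	*s = end;
	*out = v;
	return 0;
}

/*
 * Parse the value of kaslr_mem= (for example "1G,500M@2G,1G@4G").
 * Returns the number of regions stored in out[], or -1 on a syntax
 * error or when more than MAX_KASLR_REGIONS regions are given.
 */
int parse_kaslr_mem(const char *arg, struct mem_region out[MAX_KASLR_REGIONS])
{
	int n = 0;

	assert(arg != NULL && out != NULL);
	for (;;) {
		struct mem_region r = { 0, 0 };

		if (n == MAX_KASLR_REGIONS)
			return -1;
		if (parse_size(&arg, &r.size) != 0)
			return -1;
		if (*arg == '@') {
			arg++;
			if (parse_size(&arg, &r.start) != 0)
				return -1;
		}
		out[n++] = r;
		if (*arg == '\0')
			return n;
		if (*arg != ',')
			return -1;
		arg++;
	}
}
```

With the example from the documentation, parse_kaslr_mem("1G,500M@2G,1G@4G", regions) stores three regions: 1G at offset 0, 500M at 2G, and 1G at 4G. A fifth region or a trailing '@' is rejected. Real decompression-stage code cannot rely on libc, so it would need its own number parsing; strtoull is used here only to keep the sketch short.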

Re: [PATCH v4 07/13] ARM: dts: rockchip: add clocks in vop iommu nodes

2018-01-18 Thread Tomasz Figa

On Fri, Jan 19, 2018 at 1:55 PM, JeffyChen  wrote:
> Hi Tomasz,
>
> Thanks for your reply.
>
>
> On 01/19/2018 11:23 AM, Tomasz Figa wrote:
>>
>> On Thu, Jan 18, 2018 at 8:52 PM, Jeffy Chen 
>> wrote:
>>>
>>> Add clocks in vop iommu nodes, since we are going to control clocks in
>>> rockchip iommu driver.
>>>
>>> Signed-off-by: Jeffy Chen 
>>> ---
>>>
>>> Changes in v4: None
>>> Changes in v3: None
>>> Changes in v2: None
>>>
>>>   arch/arm/boot/dts/rk3036.dtsi | 2 ++
>>>   arch/arm/boot/dts/rk3288.dtsi | 4 
>>>   2 files changed, 6 insertions(+)
>>>
>>> diff --git a/arch/arm/boot/dts/rk3036.dtsi
>>> b/arch/arm/boot/dts/rk3036.dtsi
>>> index 3b704cfed69a..95b0ebc7a40f 100644
>>> --- a/arch/arm/boot/dts/rk3036.dtsi
>>> +++ b/arch/arm/boot/dts/rk3036.dtsi
>>> @@ -197,6 +197,8 @@
>>>  reg = <0x10118300 0x100>;
>>>  interrupts = ;
>>>  interrupt-names = "vop_mmu";
>>> +   clocks = < ACLK_LCDC>, < SCLK_LCDC>, <
>>> HCLK_LCDC>;
>>> +   clock-names = "aclk_vop", "dclk_vop", "hclk_vop";
>>
>>
>> We should remove clock-names from IOMMU nodes. The Rockchip IOMMU
>> bindings don't define clock names and only the clocks property should
>> be given.
>>
> hmmm, i'm trying to switch to clk_bulk APIs, the get and put are name based.
> or maybe i can use clk_get/put along with other clk_bulk APIs

I think it should be possible to just put the clock pointers into the
clk_bulk_data struct manually. Otherwise, I'm not sure what names we
could use for clock-names, since the clocks depend on the master.
(Something like "clock0, clock1, clock2, ..., clockN" could work, but
it doesn't add any value IMHO...).

Re: linux-next: build warning after merge of the crypto tree

2018-01-18 Thread Herbert Xu

On Fri, Jan 19, 2018 at 09:51:43AM +0530, Harsh Jain wrote:
> Hi Herbert,
> 
> It's an indentation issue. It seems checkpatch and the default compile
> options do not report this warning.
> 
> How would you like to take the fix? Should I send the whole series again
> with the fix, or only the indentation patch?

Please send an incremental patch.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

2018-01-18 Thread Bart Van Assche

On Fri, 2018-01-19 at 10:32 +0800, Ming Lei wrote:
> Now most of the time both NVMe and SCSI won't return BLK_STS_RESOURCE, and
> it should be only DM that returns STS_RESOURCE so often.

That's wrong at least for SCSI. See also
https://marc.info/?l=linux-block&m=151578329417076.

Bart.

Re: [PATCH v5 29/44] ARM: da8xx: add new USB PHY clock init using common clock framework

2018-01-18 Thread Sekhar Nori

On Friday 19 January 2018 12:13 AM, David Lechner wrote:
> On 01/18/2018 09:14 AM, Sekhar Nori wrote:
>> On Monday 08 January 2018 07:47 AM, David Lechner wrote:
>>> +int __init da8xx_register_usb20_phy_clk(bool use_usb_refclkin)
>>> +{
>>> +    struct regmap *cfgchip;
>>> +    struct clk *usb0_psc_clk, *clk;
>>> +    struct clk_hw *parent;
>>> +
>>> +    cfgchip = syscon_regmap_lookup_by_compatible("ti,da830-cfgchip");
>>
>> Am I right in understanding that this API is only called for non-DT
>> boot? If yes, do we really need the lookup by compatible?
> 
> This code is used in DT boot until [PATCH v5 43/44] "ARM: da8xx-dt:
> switch to device tree clocks". So, yes it is needed temporarily to
> prevent breaking USB.

Alright, so this line should probably be dropped either as part of 43/44
or later.

Thanks,
Sekhar
