Re: [PATCH] audit: add containerid support for IMA-audit

2018-03-12 Thread Richard Guy Briggs
On 2018-03-08 13:02, Mimi Zohar wrote:
> On Thu, 2018-03-08 at 06:21 -0500, Richard Guy Briggs wrote:
> > On 2018-03-05 09:24, Mimi Zohar wrote:
> > > On Mon, 2018-03-05 at 08:50 -0500, Richard Guy Briggs wrote:
> > > > On 2018-03-05 08:43, Mimi Zohar wrote:
> > > > > Hi Richard,
> > > > > 
> > > > > This patch has been compiled, but not runtime tested.
> > > > 
> > > > Ok, great, thank you.  I assume you are offering this patch to be
> > > > included in this patchset?
> > > 
> > > Yes, thank you.
> > > 
> > > > I'll have a look to see where it fits in the
> > > > IMA record.  It might be better if it were an AUDIT_CONTAINER_INFO
> > > > auxiliary record, but I'll have a look at the circumstances of the
> > > > event.  
> > 
> > I had a look at the context of this record to see if adding the contid
> > field to it made sense.  I think the only records for which the contid
> > field makes sense are the two newly proposed records, AUDIT_CONTAINER
> > which introduces the container ID and the and AUDIT_CONTAINER_INFO which
> > documents the presence of the container ID in a process event (or
> > process-less network event).  All others should use the auxiliary record
> > AUDIT_CONTAINER_INFO rather than include the contid field directly
> > itself.  There are several reasons for this including record length, the
> > ability to filter unwanted records, the difficulty of changing the order
> > of or removing fields in the future.
> > 
> > Syscalls get this information automatically if the container ID is set
> > for a task via the AUDIT_CONTAINER_INFO auxiliary record.  Generally a
> > syscall event is one that uses the task's audit_context while a
> > standalone event uses NULL or builds a local audit_context that is
> > discarded immediately after the local use.
> > 
> > Looking at the two cases of AUDIT_INTEGRITY_RULE record generation, it
> > appears that they should be split into two distinct audit record types.
> > 
> > The record created in ima_audit_measurement() is a syscall record that
> > could possibly stand on its own since the subject attributes are
> > present.  If it remains a syscall auxiliary record it will automatically
> > have the AUDIT_CONTAINER_INFO record accompany it anyways.  If it is
> > decided to detach it (which would save cpu/netlink/disk bandwidth but is
> > not recommended due to not wanting to throw away any other syscall
> > information or other involved records (PATH, CWD, etc...) then a local
> > audit_context would be created for the AUDIT_INTEGRITY_RULE and
> > AUDIT_CONTAINERID_INFO records only and immediately discarded.
> > 
> > The record created in ima_parse_rule() is not currently a syscall record
> > since it is passed an audit_context of NULL and it has a very different
> > format that does not include any subject attributes (except subj_*=).
> > At first glance it appears this one should be a syscall accompanied
> > auxiliary record.  Either way it should have an AUDIT_CONTAINER_INFO
> > auxiliary record either by being converted to a syscall auxiliary record
> > by using current->audit_context rather than NULL when calling
> > audit_log_start(), or creating a local audit_context and calling
> > audit_log_container_info() then releasing the local context.  This
> > version of the record has additional concerns covered here:
> > https://github.com/linux-audit/audit-kernel/issues/52
> > 
> > Can you briefly describe the circumstances under which these two
> > different identically-numbered records are produced as a first step
> > towards splitting them into two distict records?
> 
> Agreed, the two uses should really be separated.  ima_parse_rule()
> generates audit messages, when the IMA policy is initially loaded,
> replaced, or extended, the policy rules are included in the audit log.
> When IMA is namespaced, there will be a host policy and namespace
> policies.  We'll need to be able differentiate between the host policy
> rules and IMA namespaced policy rules, and between IMA namespaced
> policy rules.

I would argue this type of message/record should be converted to an
accompanied syscall record, or have the subject attributes added to the
record so that it is clear what user/process initiated this action.  It
now occurs to me that to save audit communications and disk bandwidth,
one syscall record could accompany all the rule records, but if the list
is long enough it might overwhelm userspace audit event parsing code.
Steve?

Regardless, the ima_parse_rule() record format needs to be fixed to
address the non-standard use of "<" and ">" operators instead of the
standard "=" field/value separator.

> The audit messages produced by ima_audit_measurement() were originally
> upstreamed for forensics, and as seen by the FireEye blog are now used
> to augment existing security analytics.  These records are probably
> being used independently of any other audit records.  A single record
> is generated per file, per system.  With IMA namespacing, these
> records 

Re: [PATCH] audit: add containerid support for IMA-audit

2018-03-12 Thread Richard Guy Briggs
On 2018-03-08 13:02, Mimi Zohar wrote:
> On Thu, 2018-03-08 at 06:21 -0500, Richard Guy Briggs wrote:
> > On 2018-03-05 09:24, Mimi Zohar wrote:
> > > On Mon, 2018-03-05 at 08:50 -0500, Richard Guy Briggs wrote:
> > > > On 2018-03-05 08:43, Mimi Zohar wrote:
> > > > > Hi Richard,
> > > > > 
> > > > > This patch has been compiled, but not runtime tested.
> > > > 
> > > > Ok, great, thank you.  I assume you are offering this patch to be
> > > > included in this patchset?
> > > 
> > > Yes, thank you.
> > > 
> > > > I'll have a look to see where it fits in the
> > > > IMA record.  It might be better if it were an AUDIT_CONTAINER_INFO
> > > > auxiliary record, but I'll have a look at the circumstances of the
> > > > event.  
> > 
> > I had a look at the context of this record to see if adding the contid
> > field to it made sense.  I think the only records for which the contid
> > field makes sense are the two newly proposed records, AUDIT_CONTAINER
> > which introduces the container ID and the and AUDIT_CONTAINER_INFO which
> > documents the presence of the container ID in a process event (or
> > process-less network event).  All others should use the auxiliary record
> > AUDIT_CONTAINER_INFO rather than include the contid field directly
> > itself.  There are several reasons for this including record length, the
> > ability to filter unwanted records, the difficulty of changing the order
> > of or removing fields in the future.
> > 
> > Syscalls get this information automatically if the container ID is set
> > for a task via the AUDIT_CONTAINER_INFO auxiliary record.  Generally a
> > syscall event is one that uses the task's audit_context while a
> > standalone event uses NULL or builds a local audit_context that is
> > discarded immediately after the local use.
> > 
> > Looking at the two cases of AUDIT_INTEGRITY_RULE record generation, it
> > appears that they should be split into two distinct audit record types.
> > 
> > The record created in ima_audit_measurement() is a syscall record that
> > could possibly stand on its own since the subject attributes are
> > present.  If it remains a syscall auxiliary record it will automatically
> > have the AUDIT_CONTAINER_INFO record accompany it anyways.  If it is
> > decided to detach it (which would save cpu/netlink/disk bandwidth but is
> > not recommended due to not wanting to throw away any other syscall
> > information or other involved records (PATH, CWD, etc...) then a local
> > audit_context would be created for the AUDIT_INTEGRITY_RULE and
> > AUDIT_CONTAINERID_INFO records only and immediately discarded.
> > 
> > The record created in ima_parse_rule() is not currently a syscall record
> > since it is passed an audit_context of NULL and it has a very different
> > format that does not include any subject attributes (except subj_*=).
> > At first glance it appears this one should be a syscall accompanied
> > auxiliary record.  Either way it should have an AUDIT_CONTAINER_INFO
> > auxiliary record either by being converted to a syscall auxiliary record
> > by using current->audit_context rather than NULL when calling
> > audit_log_start(), or creating a local audit_context and calling
> > audit_log_container_info() then releasing the local context.  This
> > version of the record has additional concerns covered here:
> > https://github.com/linux-audit/audit-kernel/issues/52
> > 
> > Can you briefly describe the circumstances under which these two
> > different identically-numbered records are produced as a first step
> > towards splitting them into two distict records?
> 
> Agreed, the two uses should really be separated.  ima_parse_rule()
> generates audit messages, when the IMA policy is initially loaded,
> replaced, or extended, the policy rules are included in the audit log.
> When IMA is namespaced, there will be a host policy and namespace
> policies.  We'll need to be able differentiate between the host policy
> rules and IMA namespaced policy rules, and between IMA namespaced
> policy rules.

I would argue this type of message/record should be converted to an
accompanied syscall record, or have the subject attributes added to the
record so that it is clear what user/process initiated this action.  It
now occurs to me that to save audit communications and disk bandwidth,
one syscall record could accompany all the rule records, but if the list
is long enough it might overwhelm userspace audit event parsing code.
Steve?

Regardless, the ima_parse_rule() record format needs to be fixed to
address the non-standard use of "<" and ">" operators instead of the
standard "=" field/value separator.

> The audit messages produced by ima_audit_measurement() were originally
> upstreamed for forensics, and as seen by the FireEye blog are now used
> to augment existing security analytics.  These records are probably
> being used independently of any other audit records.  A single record
> is generated per file, per system.  With IMA namespacing, these
> records 

Re: Dell Inc. XPS 13 9343/0TM99H fails to boot v4.16-rc5

2018-03-12 Thread Dominik Brodowski
On Mon, Mar 12, 2018 at 10:42:01PM +, mario.limoncie...@dell.com wrote:
> 
> 
> > -Original Message-
> > From: Dominik Brodowski [mailto:li...@dominikbrodowski.net]
> > Sent: Tuesday, March 13, 2018 2:54 AM
> > To: dvh...@infradead.org; Limonciello, Mario 
> > Cc: platform-driver-...@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: Dell Inc. XPS 13 9343/0TM99H fails to boot v4.16-rc5
> > 
> > Mario,
> > 
> > unfortunately, my Dell Inc. XPS 13 9343/0TM99H, BIOS A11 12/08/2016 fails to
> > boot v4.16-rc5. More exactly, I could bisect it down to commit 25d47027e10
> > ("platform/x86: dell-smbios: Link all dell-smbios-* modules together").
> > Usually, I have enabled
> > 
> > CONFIG_SENSORS_DELL_SMM=y
> > CONFIG_DELL_SMBIOS=y
> > CONFIG_DELL_SMBIOS_WMI=y
> > CONFIG_DELL_SMBIOS_SMM=y
> > CONFIG_DELL_LAPTOP=y
> > CONFIG_DELL_WMI=y
> > CONFIG_DELL_WMI_DESCRIPTOR=y
> > # CONFIG_DELL_WMI_AIO is not set
> > # CONFIG_DELL_WMI_LED is not set
> > # CONFIG_DELL_SMO8800 is not set
> > # CONFIG_DELL_RBTN is not set
> > # CONFIG_DELL_RBU is not set
> > 
> > For v4.16-rc5 to work, I need to manually disable DELL_SMBIOS_WMI:
> > 
> > -CONFIG_DELL_SMBIOS_WMI=y
> > +# CONFIG_DELL_SMBIOS_WMI is not set
> > 
> > Any ideas?
> > 
> Dominick,
> 
> Interesting.  Can you please change CONFIG_DELL_SMBIOS to a module
> and see if that behavior persists?  If it does, can you please blacklist it on
> the kernel command line and try to load it manually and share any
> backtrace?

Mario,

building and running it as a *module* works flawlessly. But that was
actually expected after a 'grep "initcall"' in drivers/platform/x86:

As Darren pointed out, DELL_SMBIOS_WMI depends on ACPI_WMI, so probably
ACPI_WMI needs to be initialized first. However, the all-in-one
dell-smbios.o is run as subsys_initcall(), same as wmi.o
(subsys_initcall_sync() there).

If both are built-ins, that means that dell-smbios.o is run first, and wmi.o
second. Changing dell-smbios.o to run at the later fs_initcall() level
instead lets me boot the kernel. HOWEVER:

1) Is there a reason why both the core and the dell-smbios-smm driver have
   to run already at subsys_initcall() time? They did so previous to your
   patch. Is it OK to defer these parts opf the all-in-one dell-smbios.o
   to fs_initcall(), or even to the default device_initcall()?

2) dell-smbios-wmi depends on (well, selects) DELL_WMI_DESCRIPTOR. The
   dell-smbios-wmi is running at the default device_initcall() time, but
   (AFAICS) probably later than the initialization of dell-smbios-wmi.o.
   May I presume that this poses no additional problem?

Thanks,
Dominik


Re: Dell Inc. XPS 13 9343/0TM99H fails to boot v4.16-rc5

2018-03-12 Thread Dominik Brodowski
On Mon, Mar 12, 2018 at 10:42:01PM +, mario.limoncie...@dell.com wrote:
> 
> 
> > -Original Message-
> > From: Dominik Brodowski [mailto:li...@dominikbrodowski.net]
> > Sent: Tuesday, March 13, 2018 2:54 AM
> > To: dvh...@infradead.org; Limonciello, Mario 
> > Cc: platform-driver-...@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: Dell Inc. XPS 13 9343/0TM99H fails to boot v4.16-rc5
> > 
> > Mario,
> > 
> > unfortunately, my Dell Inc. XPS 13 9343/0TM99H, BIOS A11 12/08/2016 fails to
> > boot v4.16-rc5. More exactly, I could bisect it down to commit 25d47027e10
> > ("platform/x86: dell-smbios: Link all dell-smbios-* modules together").
> > Usually, I have enabled
> > 
> > CONFIG_SENSORS_DELL_SMM=y
> > CONFIG_DELL_SMBIOS=y
> > CONFIG_DELL_SMBIOS_WMI=y
> > CONFIG_DELL_SMBIOS_SMM=y
> > CONFIG_DELL_LAPTOP=y
> > CONFIG_DELL_WMI=y
> > CONFIG_DELL_WMI_DESCRIPTOR=y
> > # CONFIG_DELL_WMI_AIO is not set
> > # CONFIG_DELL_WMI_LED is not set
> > # CONFIG_DELL_SMO8800 is not set
> > # CONFIG_DELL_RBTN is not set
> > # CONFIG_DELL_RBU is not set
> > 
> > For v4.16-rc5 to work, I need to manually disable DELL_SMBIOS_WMI:
> > 
> > -CONFIG_DELL_SMBIOS_WMI=y
> > +# CONFIG_DELL_SMBIOS_WMI is not set
> > 
> > Any ideas?
> > 
> Dominick,
> 
> Interesting.  Can you please change CONFIG_DELL_SMBIOS to a module
> and see if that behavior persists?  If it does, can you please blacklist it on
> the kernel command line and try to load it manually and share any
> backtrace?

Mario,

building and running it as a *module* works flawlessly. But that was
actually expected after a 'grep "initcall"' in drivers/platform/x86:

As Darren pointed out, DELL_SMBIOS_WMI depends on ACPI_WMI, so probably
ACPI_WMI needs to be initialized first. However, the all-in-one
dell-smbios.o is run as subsys_initcall(), same as wmi.o
(subsys_initcall_sync() there).

If both are built-ins, that means that dell-smbios.o is run first, and wmi.o
second. Changing dell-smbios.o to run at the later fs_initcall() level
instead lets me boot the kernel. HOWEVER:

1) Is there a reason why both the core and the dell-smbios-smm driver have
   to run already at subsys_initcall() time? They did so previous to your
   patch. Is it OK to defer these parts opf the all-in-one dell-smbios.o
   to fs_initcall(), or even to the default device_initcall()?

2) dell-smbios-wmi depends on (well, selects) DELL_WMI_DESCRIPTOR. The
   dell-smbios-wmi is running at the default device_initcall() time, but
   (AFAICS) probably later than the initialization of dell-smbios-wmi.o.
   May I presume that this poses no additional problem?

Thanks,
Dominik


Re: [RESEND PATCH v3 1/2] sched/deadline: Add cpudl_maximum_dl() for clean-up

2018-03-12 Thread Byungchul Park

On 1/11/2018 6:07 PM, Peter Zijlstra wrote:



Sorry for the huge delay on this, but I'll have to postpone further.
Still busy with meltdown/spectre stuff.


Could you review the patch?

--
Thanks,
Byungchul


Re: [RESEND PATCH v3 1/2] sched/deadline: Add cpudl_maximum_dl() for clean-up

2018-03-12 Thread Byungchul Park

On 1/11/2018 6:07 PM, Peter Zijlstra wrote:



Sorry for the huge delay on this, but I'll have to postpone further.
Still busy with meltdown/spectre stuff.


Could you review the patch?

--
Thanks,
Byungchul


Re: [PATCH v2 1/1] net: check before dereferencing netdev_ops during busy poll

2018-03-12 Thread Eric Dumazet



On 03/12/2018 10:32 PM, Josh Elsasser wrote:

init_dummy_netdev() leaves its netdev_ops pointer zeroed. This leads
to a NULL pointer dereference when sk_busy_loop fires against an iwlwifi
wireless adapter and checks napi->dev->netdev_ops->ndo_busy_poll.

Avoid this by ensuring napi->dev->netdev_ops is valid before following
the pointer, avoiding the following panic when busy polling on a dummy
netdev:






Fixes: 060212928670 ("net: add low latency socket poll")
Fixes: ce6aea93f751 ("net: network drivers no longer need to implement 
ndo_busy_poll()") - 4.9.y
Signed-off-by: Josh Elsasser 
---
  net/core/dev.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 8898618bf341..1f50c131ed15 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5042,7 +5042,10 @@ bool sk_busy_loop(struct sock *sk, int nonblock)
goto out;
  
  	/* Note: ndo_busy_poll method is optional in linux-4.5 */

-   busy_poll = napi->dev->netdev_ops->ndo_busy_poll;
+   if (napi->dev->netdev_ops)
+   busy_poll = napi->dev->netdev_ops->ndo_busy_poll;
+   else
+   busy_poll = NULL;
  
  	do {

rc = 0;

We could instead setup a non NULL netdev_ops pointer on these 'dummy' 
devices to not add a check in fast path, but I presume we do
not really care since this fix is for old kernels, and considering how 
long it took to discover this bug.


Reviewed-by: Eric Dumazet 


RE: [PATCH v4 2/6] staging: fsl-dpaa2/ethsw: Add Freescale DPAA2 Ethernet Switch driver

2018-03-12 Thread Razvan Stefanescu


> -Original Message-
> From: Andrew Lunn [mailto:and...@lunn.ch]
> Sent: Monday, March 12, 2018 4:37 PM
> To: Razvan Stefanescu 
> Cc: gre...@linuxfoundation.org; de...@driverdev.osuosl.org; linux-
> ker...@vger.kernel.org; net...@vger.kernel.org; Alexander Graf
> ; a...@arndb.de; Alexandru Marginean
> ; Ruxandra Ioana Ciocoi Radulescu
> ; Ioana Ciornei ;
> Laurentiu Tudor ; stuyo...@gmail.com
> Subject: Re: [PATCH v4 2/6] staging: fsl-dpaa2/ethsw: Add Freescale DPAA2
> Ethernet Switch driver
> 
> > +static int port_netdevice_event(struct notifier_block *unused,
> > +   unsigned long event, void *ptr)
> > +{
> > +   struct net_device *netdev = netdev_notifier_info_to_dev(ptr);
> > +   struct netdev_notifier_changeupper_info *info = ptr;
> > +   struct net_device *upper_dev;
> > +   int err = 0;
> > +
> > +   if (netdev->netdev_ops != _port_ops)
> > +   return NOTIFY_DONE;
> > +
> > +   /* Handle just upper dev link/unlink for the moment */
> > +   if (event == NETDEV_CHANGEUPPER) {
> > +   upper_dev = info->upper_dev;
> > +   if (netif_is_bridge_master(upper_dev)) {
> > +   if (info->linking)
> > +   err = port_bridge_join(netdev);
> > +   else
> > +   err = port_bridge_leave(netdev);
> > +   }
> > +   }
> > +
> > +   return notifier_from_errno(err);
> > +}
> 
> I could be missing something here, but don't you need to pass to
> port_bridge_join() which bridge the port is joining. There can be
> multiple bridges, so you need to ensure the port joins the correct
> bridge.
> 
Thank you for noticing this. I'll add proper checks in next version.

Razvan

>   Andrew


Re: [PATCH v2 1/1] net: check before dereferencing netdev_ops during busy poll

2018-03-12 Thread Eric Dumazet



On 03/12/2018 10:32 PM, Josh Elsasser wrote:

init_dummy_netdev() leaves its netdev_ops pointer zeroed. This leads
to a NULL pointer dereference when sk_busy_loop fires against an iwlwifi
wireless adapter and checks napi->dev->netdev_ops->ndo_busy_poll.

Avoid this by ensuring napi->dev->netdev_ops is valid before following
the pointer, avoiding the following panic when busy polling on a dummy
netdev:






Fixes: 060212928670 ("net: add low latency socket poll")
Fixes: ce6aea93f751 ("net: network drivers no longer need to implement 
ndo_busy_poll()") - 4.9.y
Signed-off-by: Josh Elsasser 
---
  net/core/dev.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 8898618bf341..1f50c131ed15 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5042,7 +5042,10 @@ bool sk_busy_loop(struct sock *sk, int nonblock)
goto out;
  
  	/* Note: ndo_busy_poll method is optional in linux-4.5 */

-   busy_poll = napi->dev->netdev_ops->ndo_busy_poll;
+   if (napi->dev->netdev_ops)
+   busy_poll = napi->dev->netdev_ops->ndo_busy_poll;
+   else
+   busy_poll = NULL;
  
  	do {

rc = 0;

We could instead setup a non NULL netdev_ops pointer on these 'dummy' 
devices to not add a check in fast path, but I presume we do
not really care since this fix is for old kernels, and considering how 
long it took to discover this bug.


Reviewed-by: Eric Dumazet 


RE: [PATCH v4 2/6] staging: fsl-dpaa2/ethsw: Add Freescale DPAA2 Ethernet Switch driver

2018-03-12 Thread Razvan Stefanescu


> -Original Message-
> From: Andrew Lunn [mailto:and...@lunn.ch]
> Sent: Monday, March 12, 2018 4:37 PM
> To: Razvan Stefanescu 
> Cc: gre...@linuxfoundation.org; de...@driverdev.osuosl.org; linux-
> ker...@vger.kernel.org; net...@vger.kernel.org; Alexander Graf
> ; a...@arndb.de; Alexandru Marginean
> ; Ruxandra Ioana Ciocoi Radulescu
> ; Ioana Ciornei ;
> Laurentiu Tudor ; stuyo...@gmail.com
> Subject: Re: [PATCH v4 2/6] staging: fsl-dpaa2/ethsw: Add Freescale DPAA2
> Ethernet Switch driver
> 
> > +static int port_netdevice_event(struct notifier_block *unused,
> > +   unsigned long event, void *ptr)
> > +{
> > +   struct net_device *netdev = netdev_notifier_info_to_dev(ptr);
> > +   struct netdev_notifier_changeupper_info *info = ptr;
> > +   struct net_device *upper_dev;
> > +   int err = 0;
> > +
> > +   if (netdev->netdev_ops != _port_ops)
> > +   return NOTIFY_DONE;
> > +
> > +   /* Handle just upper dev link/unlink for the moment */
> > +   if (event == NETDEV_CHANGEUPPER) {
> > +   upper_dev = info->upper_dev;
> > +   if (netif_is_bridge_master(upper_dev)) {
> > +   if (info->linking)
> > +   err = port_bridge_join(netdev);
> > +   else
> > +   err = port_bridge_leave(netdev);
> > +   }
> > +   }
> > +
> > +   return notifier_from_errno(err);
> > +}
> 
> I could be missing something here, but don't you need to pass to
> port_bridge_join() which bridge the port is joining. There can be
> multiple bridges, so you need to ensure the port joins the correct
> bridge.
> 
Thank you for noticing this. I'll add proper checks in next version.

Razvan

>   Andrew


Re: [RFC PATCH 00/35] remove in-kernel syscall invocations

2018-03-12 Thread Ingo Molnar

* Dominik Brodowski  wrote:

> On Mon, Mar 12, 2018 at 08:32:56AM +0100, Ingo Molnar wrote:
> > 
> > * Dominik Brodowski  wrote:
> > 
> > > I'm a bit more unsure about these remaining patches. They use inline stubs
> > > named ksys_xyzzy() which (mostly) call fs-internal functions. Another
> > > alternative would be to define these in fs/*, but then we'd get more and
> > > more indirections.
> > > 
> > >   syscalls: do not call sys_unlink() within the kernel
> > >   syscalls: do not call sys_rmdir() within the kernel
> > >   syscalls: do not call sys_mkdir{,at}() within the kernel
> > >   syscalls: do not call sys_symlink{,at}() within the kernel
> > >   syscalls: do not call sys_mknod{,at}() within the kernel
> > >   syscalls: do not call sys_link{,at}() within the kernel
> > >   syscalls: do not call sys_{f,}chmod{at,}() within the kernel
> > >   syscalls: do not call sys_{f,}access{,at}() within the kernel
> > >   syscalls: do not call sys_ftruncate() within the kernel
> > >   syscalls: do not call sys_{,l,f}chown() within the kernel
> > >   syscalls: do not call sys_close() within the kernel
> > 
> > >  72 files changed, 572 insertions(+), 274 deletions(-)
> > 
> > Just curious, have you done a before/after vmlinux /usr/bin/size comparison?
> > How do these changes impact generated code?
> > 
> > My expectation would be for there to be a noticeable decrease in text size.
> 
> Unfortunately, no:
> 
>text  data bss dec hex filename
> 10352515  6598692 13344972302961791ce4873 
> vmlinux-v4.16-rc5
> 10352845  6598852 13344972302966691ce4a5d 
> vmlinux-v4.16-rc5-patch23
> 10352942  6598948 13344972302968621ce4b1e 
> vmlinux-v4.16-rc5-patch35

Maybe it's due to the new helpers - still, the changes in size are very small 
so 
this isn't a showstopper.

Thanks,

Ingo


Re: [RFC PATCH 00/35] remove in-kernel syscall invocations

2018-03-12 Thread Ingo Molnar

* Dominik Brodowski  wrote:

> On Mon, Mar 12, 2018 at 08:32:56AM +0100, Ingo Molnar wrote:
> > 
> > * Dominik Brodowski  wrote:
> > 
> > > I'm a bit more unsure about these remaining patches. They use inline stubs
> > > named ksys_xyzzy() which (mostly) call fs-internal functions. Another
> > > alternative would be to define these in fs/*, but then we'd get more and
> > > more indirections.
> > > 
> > >   syscalls: do not call sys_unlink() within the kernel
> > >   syscalls: do not call sys_rmdir() within the kernel
> > >   syscalls: do not call sys_mkdir{,at}() within the kernel
> > >   syscalls: do not call sys_symlink{,at}() within the kernel
> > >   syscalls: do not call sys_mknod{,at}() within the kernel
> > >   syscalls: do not call sys_link{,at}() within the kernel
> > >   syscalls: do not call sys_{f,}chmod{at,}() within the kernel
> > >   syscalls: do not call sys_{f,}access{,at}() within the kernel
> > >   syscalls: do not call sys_ftruncate() within the kernel
> > >   syscalls: do not call sys_{,l,f}chown() within the kernel
> > >   syscalls: do not call sys_close() within the kernel
> > 
> > >  72 files changed, 572 insertions(+), 274 deletions(-)
> > 
> > Just curious, have you done a before/after vmlinux /usr/bin/size comparison?
> > How do these changes impact generated code?
> > 
> > My expectation would be for there to be a noticeable decrease in text size.
> 
> Unfortunately, no:
> 
>text  data bss dec hex filename
> 10352515  6598692 13344972302961791ce4873 
> vmlinux-v4.16-rc5
> 10352845  6598852 13344972302966691ce4a5d 
> vmlinux-v4.16-rc5-patch23
> 10352942  6598948 13344972302968621ce4b1e 
> vmlinux-v4.16-rc5-patch35

Maybe it's due to the new helpers - still, the changes in size are very small 
so 
this isn't a showstopper.

Thanks,

Ingo


linux-next: build failure after merge of the drm tree

2018-03-12 Thread Stephen Rothwell
Hi all,

After merging the drm tree, today's linux-next build (powerpc
allyesconfig) failed like this:

drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/smu7_hwmgr.c: In function 
'smu7_notify_link_speed_change_after_state_change':
drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/smu7_hwmgr.c:3830:7: error: 
implicit declaration of function 'amdgpu_acpi_pcie_performance_request'; did 
you mean 'smu7_pcie_performance_request'? 
[-Werror=implicit-function-declaration]
   if (amdgpu_acpi_pcie_performance_request(hwmgr->adev, request, false)) {
   ^~~~
   smu7_pcie_performance_request

Caused by commit

  e1deba285156 ("drm/amd/pp: Use amdgpu acpi helper functions in powerplay")

and commit

  37a94791a097 ("drm/amd/pp: Add #ifdef checks for CONFIG_ACPI")

missed this instance.

I added this hack patch for today:

From: Stephen Rothwell 
Date: Tue, 13 Mar 2018 16:24:31 +1100
Subject: [PATCH] drm/amd/pp: add another CONFIG_ACPI check

Signed-off-by: Stephen Rothwell 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index d4d1d2e7e233..df2a312ca6c9 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -3827,12 +3827,14 @@ static int 
smu7_notify_link_speed_change_after_state_change(
smu7_get_current_pcie_speed(hwmgr) > 0)
return 0;
 
+#ifdef CONFIG_ACPI
if (amdgpu_acpi_pcie_performance_request(hwmgr->adev, request, 
false)) {
if (PP_PCIEGen2 == target_link_speed)
pr_info("PSPP request to switch to Gen2 from 
Gen3 Failed!");
else
pr_info("PSPP request to switch to Gen1 from 
Gen2 Failed!");
}
+#endif
}
 
return 0;
-- 
2.16.1

-- 
Cheers,
Stephen Rothwell


pgp2E_BXOA3LI.pgp
Description: OpenPGP digital signature


linux-next: build failure after merge of the drm tree

2018-03-12 Thread Stephen Rothwell
Hi all,

After merging the drm tree, today's linux-next build (powerpc
allyesconfig) failed like this:

drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/smu7_hwmgr.c: In function 
'smu7_notify_link_speed_change_after_state_change':
drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/smu7_hwmgr.c:3830:7: error: 
implicit declaration of function 'amdgpu_acpi_pcie_performance_request'; did 
you mean 'smu7_pcie_performance_request'? 
[-Werror=implicit-function-declaration]
   if (amdgpu_acpi_pcie_performance_request(hwmgr->adev, request, false)) {
   ^~~~
   smu7_pcie_performance_request

Caused by commit

  e1deba285156 ("drm/amd/pp: Use amdgpu acpi helper functions in powerplay")

and commit

  37a94791a097 ("drm/amd/pp: Add #ifdef checks for CONFIG_ACPI")

missed this instance.

I added this hack patch for today:

From: Stephen Rothwell 
Date: Tue, 13 Mar 2018 16:24:31 +1100
Subject: [PATCH] drm/amd/pp: add another CONFIG_ACPI check

Signed-off-by: Stephen Rothwell 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index d4d1d2e7e233..df2a312ca6c9 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -3827,12 +3827,14 @@ static int 
smu7_notify_link_speed_change_after_state_change(
smu7_get_current_pcie_speed(hwmgr) > 0)
return 0;
 
+#ifdef CONFIG_ACPI
if (amdgpu_acpi_pcie_performance_request(hwmgr->adev, request, 
false)) {
if (PP_PCIEGen2 == target_link_speed)
pr_info("PSPP request to switch to Gen2 from 
Gen3 Failed!");
else
pr_info("PSPP request to switch to Gen1 from 
Gen2 Failed!");
}
+#endif
}
 
return 0;
-- 
2.16.1

-- 
Cheers,
Stephen Rothwell


pgp2E_BXOA3LI.pgp
Description: OpenPGP digital signature


Re: [RFC PATCH 35/35] syscalls: do not call sys_close() within the kernel

2018-03-12 Thread Ingo Molnar

* Dominik Brodowski  wrote:

> On Mon, Mar 12, 2018 at 08:37:19AM +0100, Ingo Molnar wrote:
> > 
> > * Dominik Brodowski  wrote:
> > 
> > > --- a/fs/open.c
> > > +++ b/fs/open.c
> > > @@ -1200,7 +1200,7 @@ SYSCALL_DEFINE1(close, unsigned int, fd)
> > >  
> > >   return retval;
> > >  }
> > > -EXPORT_SYMBOL(sys_close);
> > > +
> > >  
> > >  /*
> > 
> > Nit: this introduces a stray newline.
> > 
> > > --- a/include/linux/syscalls.h
> > > +++ b/include/linux/syscalls.h
> > > @@ -1042,4 +1042,10 @@ static inline long ksys_lchown(const char __user 
> > > *filename, uid_t user,
> > >   return do_fchownat(AT_FDCWD, filename, user, group,
> > >AT_SYMLINK_NOFOLLOW);
> > >  }
> > > +
> > > +extern int __close_fd(struct files_struct *files, unsigned int fd);
> > > +static inline int ksys_close(unsigned int fd)
> > > +{
> > > + return __close_fd(current->files, fd);
> > > +}
> > 
> > Would be nice to reuse that stray newline after the __close_fd() prototype 
> > for the 
> > canonical stylistic separation of declarations from definitions.
> > 
> > It would also be very nice to add a short comment before ksys_close() that 
> > explains how it differs from sys_close(). This should reduce the amount of 
> > cargo-cult copying of existing ksys_close()/sys_close() patterns.
> 
> Thanks, have done that:
> 
>   /*
>* In contrast to sys_close(), this stub does not check whether the 
> syscall
>* should or should not be restarted, but returns the raw error codes 
> from
>* __close_fd().
>*/
> 
> Is that ok?

Yeah, looks good to me!

Thanks,

Ingo


Re: [RFC PATCH 35/35] syscalls: do not call sys_close() within the kernel

2018-03-12 Thread Ingo Molnar

* Dominik Brodowski  wrote:

> On Mon, Mar 12, 2018 at 08:37:19AM +0100, Ingo Molnar wrote:
> > 
> > * Dominik Brodowski  wrote:
> > 
> > > --- a/fs/open.c
> > > +++ b/fs/open.c
> > > @@ -1200,7 +1200,7 @@ SYSCALL_DEFINE1(close, unsigned int, fd)
> > >  
> > >   return retval;
> > >  }
> > > -EXPORT_SYMBOL(sys_close);
> > > +
> > >  
> > >  /*
> > 
> > Nit: this introduces a stray newline.
> > 
> > > --- a/include/linux/syscalls.h
> > > +++ b/include/linux/syscalls.h
> > > @@ -1042,4 +1042,10 @@ static inline long ksys_lchown(const char __user 
> > > *filename, uid_t user,
> > >   return do_fchownat(AT_FDCWD, filename, user, group,
> > >AT_SYMLINK_NOFOLLOW);
> > >  }
> > > +
> > > +extern int __close_fd(struct files_struct *files, unsigned int fd);
> > > +static inline int ksys_close(unsigned int fd)
> > > +{
> > > + return __close_fd(current->files, fd);
> > > +}
> > 
> > Would be nice to reuse that stray newline after the __close_fd() prototype 
> > for the 
> > canonical stylistic separation of declarations from definitions.
> > 
> > It would also be very nice to add a short comment before ksys_close() that 
> > explains how it differs from sys_close(). This should reduce the amount of 
> > cargo-cult copying of existing ksys_close()/sys_close() patterns.
> 
> Thanks, have done that:
> 
>   /*
>* In contrast to sys_close(), this stub does not check whether the 
> syscall
>* should or should not be restarted, but returns the raw error codes 
> from
>* __close_fd().
>*/
> 
> Is that ok?

Yeah, looks good to me!

Thanks,

Ingo


Re: [PATCH v2 2/2] clk: aspeed: Prevent reset if clock is enabled

2018-03-12 Thread Joel Stanley
On Fri, Mar 9, 2018 at 7:27 AM, Eddie James  wrote:
> According to the Aspeed specification, the reset and enable sequence
> should be done when the clock is stopped. The specification doesn't
> define behavior if the reset is done while the clock is enabled.
>
> From testing on the AST2500, the LPC Controller has problems if the
> clock is reset while enabled.
>
> Therefore, check whether the clock is enabled or not before performing
> the reset and enable sequence in the Aspeed clock driver.
>
> Root-caused-by: Lei Yu 
> Signed-off-by: Eddie James 

Fixes: 15ed8ce5f84e ("clk: aspeed: Register gated clocks")

Reviewed-by: Joel Stanley 

Cheers,

Joel


> ---
>  drivers/clk/clk-aspeed.c | 29 +
>  1 file changed, 17 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/clk/clk-aspeed.c b/drivers/clk/clk-aspeed.c
> index 1687771..5eb50c3 100644
> --- a/drivers/clk/clk-aspeed.c
> +++ b/drivers/clk/clk-aspeed.c
> @@ -205,6 +205,18 @@ struct aspeed_clk_soc_data {
> .calc_pll = aspeed_ast2400_calc_pll,
>  };
>
> +static int aspeed_clk_is_enabled(struct clk_hw *hw)
> +{
> +   struct aspeed_clk_gate *gate = to_aspeed_clk_gate(hw);
> +   u32 clk = BIT(gate->clock_idx);
> +   u32 enval = (gate->flags & CLK_GATE_SET_TO_DISABLE) ? 0 : clk;
> +   u32 reg;
> +
> +   regmap_read(gate->map, ASPEED_CLK_STOP_CTRL, );
> +
> +   return ((reg & clk) == enval) ? 1 : 0;
> +}
> +
>  static int aspeed_clk_enable(struct clk_hw *hw)
>  {
> struct aspeed_clk_gate *gate = to_aspeed_clk_gate(hw);
> @@ -215,6 +227,11 @@ static int aspeed_clk_enable(struct clk_hw *hw)
>
> spin_lock_irqsave(gate->lock, flags);
>
> +   if (aspeed_clk_is_enabled(hw)) {
> +   spin_unlock_irqrestore(gate->lock, flags);
> +   return 0;
> +   }
> +
> if (gate->reset_idx >= 0) {
> /* Put IP in reset */
> regmap_update_bits(gate->map, ASPEED_RESET_CTRL, rst, rst);
> @@ -255,18 +272,6 @@ static void aspeed_clk_disable(struct clk_hw *hw)
> spin_unlock_irqrestore(gate->lock, flags);
>  }
>
> -static int aspeed_clk_is_enabled(struct clk_hw *hw)
> -{
> -   struct aspeed_clk_gate *gate = to_aspeed_clk_gate(hw);
> -   u32 clk = BIT(gate->clock_idx);
> -   u32 enval = (gate->flags & CLK_GATE_SET_TO_DISABLE) ? 0 : clk;
> -   u32 reg;
> -
> -   regmap_read(gate->map, ASPEED_CLK_STOP_CTRL, );
> -
> -   return ((reg & clk) == enval) ? 1 : 0;
> -}
> -
>  static const struct clk_ops aspeed_clk_gate_ops = {
> .enable = aspeed_clk_enable,
> .disable = aspeed_clk_disable,
> --
> 1.8.3.1
>


Re: [PATCH v2 2/2] clk: aspeed: Prevent reset if clock is enabled

2018-03-12 Thread Joel Stanley
On Fri, Mar 9, 2018 at 7:27 AM, Eddie James  wrote:
> According to the Aspeed specification, the reset and enable sequence
> should be done when the clock is stopped. The specification doesn't
> define behavior if the reset is done while the clock is enabled.
>
> From testing on the AST2500, the LPC Controller has problems if the
> clock is reset while enabled.
>
> Therefore, check whether the clock is enabled or not before performing
> the reset and enable sequence in the Aspeed clock driver.
>
> Root-caused-by: Lei Yu 
> Signed-off-by: Eddie James 

Fixes: 15ed8ce5f84e ("clk: aspeed: Register gated clocks")

Reviewed-by: Joel Stanley 

Cheers,

Joel


> ---
>  drivers/clk/clk-aspeed.c | 29 +
>  1 file changed, 17 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/clk/clk-aspeed.c b/drivers/clk/clk-aspeed.c
> index 1687771..5eb50c3 100644
> --- a/drivers/clk/clk-aspeed.c
> +++ b/drivers/clk/clk-aspeed.c
> @@ -205,6 +205,18 @@ struct aspeed_clk_soc_data {
> .calc_pll = aspeed_ast2400_calc_pll,
>  };
>
> +static int aspeed_clk_is_enabled(struct clk_hw *hw)
> +{
> +   struct aspeed_clk_gate *gate = to_aspeed_clk_gate(hw);
> +   u32 clk = BIT(gate->clock_idx);
> +   u32 enval = (gate->flags & CLK_GATE_SET_TO_DISABLE) ? 0 : clk;
> +   u32 reg;
> +
> +   regmap_read(gate->map, ASPEED_CLK_STOP_CTRL, );
> +
> +   return ((reg & clk) == enval) ? 1 : 0;
> +}
> +
>  static int aspeed_clk_enable(struct clk_hw *hw)
>  {
> struct aspeed_clk_gate *gate = to_aspeed_clk_gate(hw);
> @@ -215,6 +227,11 @@ static int aspeed_clk_enable(struct clk_hw *hw)
>
> spin_lock_irqsave(gate->lock, flags);
>
> +   if (aspeed_clk_is_enabled(hw)) {
> +   spin_unlock_irqrestore(gate->lock, flags);
> +   return 0;
> +   }
> +
> if (gate->reset_idx >= 0) {
> /* Put IP in reset */
> regmap_update_bits(gate->map, ASPEED_RESET_CTRL, rst, rst);
> @@ -255,18 +272,6 @@ static void aspeed_clk_disable(struct clk_hw *hw)
> spin_unlock_irqrestore(gate->lock, flags);
>  }
>
> -static int aspeed_clk_is_enabled(struct clk_hw *hw)
> -{
> -   struct aspeed_clk_gate *gate = to_aspeed_clk_gate(hw);
> -   u32 clk = BIT(gate->clock_idx);
> -   u32 enval = (gate->flags & CLK_GATE_SET_TO_DISABLE) ? 0 : clk;
> -   u32 reg;
> -
> -   regmap_read(gate->map, ASPEED_CLK_STOP_CTRL, );
> -
> -   return ((reg & clk) == enval) ? 1 : 0;
> -}
> -
>  static const struct clk_ops aspeed_clk_gate_ops = {
> .enable = aspeed_clk_enable,
> .disable = aspeed_clk_disable,
> --
> 1.8.3.1
>


Re: [PATCH v3] powerpc/kernel/sysfs: Export ldbar spr to sysfs

2018-03-12 Thread Anju T Sudhakar

Hi,


On Tuesday 06 March 2018 04:35 PM, Michael Ellerman wrote:

Anju T Sudhakar  writes:


diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 4437c70..caefb64 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -757,6 +759,9 @@ static int register_cpu_online(unsigned int cpu)
device_create_file(s, _attrs[i]);
  
  #ifdef CONFIG_PPC64

+   if (cpu_has_feature(CPU_FTR_ARCH_300))
+   device_create_file(s, _attr_ldbar);

Is this register readable in supervisor state?


This is a nice catch, thanks. :)
The guest kernel can not access the register, it is only readable in the 
hypervisor state.
I will resend the patch with a condition check so that this spr will not 
get registered for guest kernel.



Regards,
Anju


cheers





Re: [PATCH v3] powerpc/kernel/sysfs: Export ldbar spr to sysfs

2018-03-12 Thread Anju T Sudhakar

Hi,


On Tuesday 06 March 2018 04:35 PM, Michael Ellerman wrote:

Anju T Sudhakar  writes:


diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 4437c70..caefb64 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -757,6 +759,9 @@ static int register_cpu_online(unsigned int cpu)
device_create_file(s, _attrs[i]);
  
  #ifdef CONFIG_PPC64

+   if (cpu_has_feature(CPU_FTR_ARCH_300))
+   device_create_file(s, _attr_ldbar);

Is this register readable in supervisor state?


This is a nice catch, thanks. :)
The guest kernel can not access the register, it is only readable in the 
hypervisor state.
I will resend the patch with a condition check so that this spr will not 
get registered for guest kernel.



Regards,
Anju


cheers





Re: [PATCH v2 1/2] clk: aspeed: Fix is_enabled for certain clocks

2018-03-12 Thread Joel Stanley
On Fri, Mar 9, 2018 at 7:27 AM, Eddie James  wrote:
> Some of the Aspeed clocks are disabled by setting the relevant bit in
> the "clock stop control" register to one, while others are disabled by
> setting their bit to zero. The driver already uses a flag per gate  to
> identify this behavior, but doesn't apply it in the clock is_enabled
> function.
>
> Use the existing gate flag to correctly return whether or not a clock
> is enabled in the aspeed_clk_is_enabled function.
>
> Signed-off-by: Eddie James 

Fixes: 6671507f0fbd ("clk: aspeed: Handle inverse polarity of USB port
1 clock gate")

Reviewed-by: Joel Stanley 

Cheers,

Joel

> ---
>  drivers/clk/clk-aspeed.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/clk/clk-aspeed.c b/drivers/clk/clk-aspeed.c
> index 9f7f931..1687771 100644
> --- a/drivers/clk/clk-aspeed.c
> +++ b/drivers/clk/clk-aspeed.c
> @@ -259,11 +259,12 @@ static int aspeed_clk_is_enabled(struct clk_hw *hw)
>  {
> struct aspeed_clk_gate *gate = to_aspeed_clk_gate(hw);
> u32 clk = BIT(gate->clock_idx);
> +   u32 enval = (gate->flags & CLK_GATE_SET_TO_DISABLE) ? 0 : clk;
> u32 reg;
>
> regmap_read(gate->map, ASPEED_CLK_STOP_CTRL, );
>
> -   return (reg & clk) ? 0 : 1;
> +   return ((reg & clk) == enval) ? 1 : 0;
>  }
>
>  static const struct clk_ops aspeed_clk_gate_ops = {
> --
> 1.8.3.1
>


Re: [PATCH v2 1/2] clk: aspeed: Fix is_enabled for certain clocks

2018-03-12 Thread Joel Stanley
On Fri, Mar 9, 2018 at 7:27 AM, Eddie James  wrote:
> Some of the Aspeed clocks are disabled by setting the relevant bit in
> the "clock stop control" register to one, while others are disabled by
> setting their bit to zero. The driver already uses a flag per gate  to
> identify this behavior, but doesn't apply it in the clock is_enabled
> function.
>
> Use the existing gate flag to correctly return whether or not a clock
> is enabled in the aspeed_clk_is_enabled function.
>
> Signed-off-by: Eddie James 

Fixes: 6671507f0fbd ("clk: aspeed: Handle inverse polarity of USB port
1 clock gate")

Reviewed-by: Joel Stanley 

Cheers,

Joel

> ---
>  drivers/clk/clk-aspeed.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/clk/clk-aspeed.c b/drivers/clk/clk-aspeed.c
> index 9f7f931..1687771 100644
> --- a/drivers/clk/clk-aspeed.c
> +++ b/drivers/clk/clk-aspeed.c
> @@ -259,11 +259,12 @@ static int aspeed_clk_is_enabled(struct clk_hw *hw)
>  {
> struct aspeed_clk_gate *gate = to_aspeed_clk_gate(hw);
> u32 clk = BIT(gate->clock_idx);
> +   u32 enval = (gate->flags & CLK_GATE_SET_TO_DISABLE) ? 0 : clk;
> u32 reg;
>
> regmap_read(gate->map, ASPEED_CLK_STOP_CTRL, );
>
> -   return (reg & clk) ? 0 : 1;
> +   return ((reg & clk) == enval) ? 1 : 0;
>  }
>
>  static const struct clk_ops aspeed_clk_gate_ops = {
> --
> 1.8.3.1
>


Hello

2018-03-12 Thread Bauer M


Good day dear, i hope this mail meets you well? I know this may seem 
inappropriate so
i ask for your forgiveness but i wish to get to know you better, if I may be so 
bold.
I consider myself an easy-going man, adventurous, honest and fun loving person 
but I
am currently looking for a relationship in which I will feel loved. I promise to
answer any question that you may want to ask me...all i need is just your 
attention
and the chance to know you more.

Please tell me more about yourself, if you do not mind. Hope to hear back from 
you
soon.
Bauer


Hello

2018-03-12 Thread Bauer M


Good day dear, i hope this mail meets you well? I know this may seem 
inappropriate so
i ask for your forgiveness but i wish to get to know you better, if I may be so 
bold.
I consider myself an easy-going man, adventurous, honest and fun loving person 
but I
am currently looking for a relationship in which I will feel loved. I promise to
answer any question that you may want to ask me...all i need is just your 
attention
and the chance to know you more.

Please tell me more about yourself, if you do not mind. Hope to hear back from 
you
soon.
Bauer


[PATCH v2 1/1] net: check before dereferencing netdev_ops during busy poll

2018-03-12 Thread Josh Elsasser
init_dummy_netdev() leaves its netdev_ops pointer zeroed. This leads
to a NULL pointer dereference when sk_busy_loop fires against an iwlwifi
wireless adapter and checks napi->dev->netdev_ops->ndo_busy_poll.

Avoid this by ensuring napi->dev->netdev_ops is valid before following
the pointer, avoiding the following panic when busy polling on a dummy
netdev:

  BUG: unable to handle kernel NULL pointer dereference at 00c8
  IP: [] sk_busy_loop+0x92/0x2f0
  Call Trace:
   [] ? uart_write_room+0x74/0xf0
   [] sock_poll+0x99/0xa0
   [] do_sys_poll+0x2e2/0x520
   [] ? get_page_from_freelist+0x3bc/0xa30
   [] ? update_curr+0x62/0x140
   [] ? __slab_free+0xa1/0x2a0
   [] ? __slab_free+0xa1/0x2a0
   [] ? skb_free_head+0x21/0x30
   [] ? poll_initwait+0x50/0x50
   [] ? kmem_cache_free+0x1c6/0x1e0
   [] ? uart_write+0x124/0x1d0
   [] ? remove_wait_queue+0x4d/0x60
   [] ? __wake_up+0x44/0x50
   [] ? tty_write_unlock+0x31/0x40
   [] ? tty_ldisc_deref+0x16/0x20
   [] ? tty_write+0x1e0/0x2f0
   [] ? process_echoes+0x80/0x80
   [] ? __vfs_write+0x2b/0x130
   [] ? vfs_write+0x15a/0x1a0
   [] SyS_poll+0x75/0x100
   [] entry_SYSCALL_64_fastpath+0x24/0xcf

Commit 79e7fff47b7b ("net: remove support for per driver ndo_busy_poll()")
indirectly fixed this upstream in linux-4.11 by removing the offending
pointer usage. No other users of napi->dev touch its netdev_ops.

Fixes: 060212928670 ("net: add low latency socket poll")
Fixes: ce6aea93f751 ("net: network drivers no longer need to implement 
ndo_busy_poll()") - 4.9.y
Signed-off-by: Josh Elsasser 
---
 net/core/dev.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 8898618bf341..1f50c131ed15 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5042,7 +5042,10 @@ bool sk_busy_loop(struct sock *sk, int nonblock)
goto out;
 
/* Note: ndo_busy_poll method is optional in linux-4.5 */
-   busy_poll = napi->dev->netdev_ops->ndo_busy_poll;
+   if (napi->dev->netdev_ops)
+   busy_poll = napi->dev->netdev_ops->ndo_busy_poll;
+   else
+   busy_poll = NULL;
 
do {
rc = 0;
-- 
2.11.0



[PATCH v2 1/1] net: check before dereferencing netdev_ops during busy poll

2018-03-12 Thread Josh Elsasser
init_dummy_netdev() leaves its netdev_ops pointer zeroed. This leads
to a NULL pointer dereference when sk_busy_loop fires against an iwlwifi
wireless adapter and checks napi->dev->netdev_ops->ndo_busy_poll.

Avoid this by ensuring napi->dev->netdev_ops is valid before following
the pointer, avoiding the following panic when busy polling on a dummy
netdev:

  BUG: unable to handle kernel NULL pointer dereference at 00c8
  IP: [] sk_busy_loop+0x92/0x2f0
  Call Trace:
   [] ? uart_write_room+0x74/0xf0
   [] sock_poll+0x99/0xa0
   [] do_sys_poll+0x2e2/0x520
   [] ? get_page_from_freelist+0x3bc/0xa30
   [] ? update_curr+0x62/0x140
   [] ? __slab_free+0xa1/0x2a0
   [] ? __slab_free+0xa1/0x2a0
   [] ? skb_free_head+0x21/0x30
   [] ? poll_initwait+0x50/0x50
   [] ? kmem_cache_free+0x1c6/0x1e0
   [] ? uart_write+0x124/0x1d0
   [] ? remove_wait_queue+0x4d/0x60
   [] ? __wake_up+0x44/0x50
   [] ? tty_write_unlock+0x31/0x40
   [] ? tty_ldisc_deref+0x16/0x20
   [] ? tty_write+0x1e0/0x2f0
   [] ? process_echoes+0x80/0x80
   [] ? __vfs_write+0x2b/0x130
   [] ? vfs_write+0x15a/0x1a0
   [] SyS_poll+0x75/0x100
   [] entry_SYSCALL_64_fastpath+0x24/0xcf

Commit 79e7fff47b7b ("net: remove support for per driver ndo_busy_poll()")
indirectly fixed this upstream in linux-4.11 by removing the offending
pointer usage. No other users of napi->dev touch its netdev_ops.

Fixes: 060212928670 ("net: add low latency socket poll")
Fixes: ce6aea93f751 ("net: network drivers no longer need to implement 
ndo_busy_poll()") - 4.9.y
Signed-off-by: Josh Elsasser 
---
 net/core/dev.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 8898618bf341..1f50c131ed15 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5042,7 +5042,10 @@ bool sk_busy_loop(struct sock *sk, int nonblock)
goto out;
 
/* Note: ndo_busy_poll method is optional in linux-4.5 */
-   busy_poll = napi->dev->netdev_ops->ndo_busy_poll;
+   if (napi->dev->netdev_ops)
+   busy_poll = napi->dev->netdev_ops->ndo_busy_poll;
+   else
+   busy_poll = NULL;
 
do {
rc = 0;
-- 
2.11.0



[PATCH 0/1] net: avoid a kernel panic during sk_busy_loop

2018-03-12 Thread Josh Elsasser
V2: just check napi->dev->netdev_ops instead of getting clever with the
netdev registration state.

Original cover letter:

Hi Dave,

I stumbled across a reproducible kernel panic while playing around with
busy_poll on a Linux 4.9.86 kernel. There's an unfortunate interaction
between init_dummy_netdev, which doesn't bother to fill in netdev_ops, and
sk_busy_loop, which assumed netdev_ops is a valid pointer.

To reproduce on the device under test (DUT), I did:

  $ ip addr show dev wlan0
  8: wlan0:  mtu 1500 qdisc mq [...]
  inet 172.16.122.6/23 brd 172.16.123.255 scope global wlan0
  $ sysctl -w net.core.busy_read=50
  $ nc -l 172.16.122.6 5001

Then transmitted some data to this socket from a second host:

  $ echo "foo" | nc 172.16.122.6 5001

The DUT immediately hits a kernel panic.

I've attached a patch that applies cleanly to the 4.9.87 stable release.
This fix isn't necessary for net/net-next (ndo_busy_poll was removed in
linux-4.11), but a further backport of this commit is likely required for
any stable releases older than linux-4.5.

I hope this is the right way to raise something like this. I couldn't find
a clear answer from the -stable and netdev on how to handle bugs in features
that no longer exist in mainline.

Thanks,
Josh


[PATCH 0/1] net: avoid a kernel panic during sk_busy_loop

2018-03-12 Thread Josh Elsasser
V2: just check napi->dev->netdev_ops instead of getting clever with the
netdev registration state.

Original cover letter:

Hi Dave,

I stumbled across a reproducible kernel panic while playing around with
busy_poll on a Linux 4.9.86 kernel. There's an unfortunate interaction
between init_dummy_netdev, which doesn't bother to fill in netdev_ops, and
sk_busy_loop, which assumed netdev_ops is a valid pointer.

To reproduce on the device under test (DUT), I did:

  $ ip addr show dev wlan0
  8: wlan0:  mtu 1500 qdisc mq [...]
  inet 172.16.122.6/23 brd 172.16.123.255 scope global wlan0
  $ sysctl -w net.core.busy_read=50
  $ nc -l 172.16.122.6 5001

Then transmitted some data to this socket from a second host:

  $ echo "foo" | nc 172.16.122.6 5001

The DUT immediately hits a kernel panic.

I've attached a patch that applies cleanly to the 4.9.87 stable release.
This fix isn't necessary for net/net-next (ndo_busy_poll was removed in
linux-4.11), but a further backport of this commit is likely required for
any stable releases older than linux-4.5.

I hope this is the right way to raise something like this. I couldn't find
a clear answer from the -stable and netdev on how to handle bugs in features
that no longer exist in mainline.

Thanks,
Josh


RE: [PATCH 1/1] x86/platform/x86: Fix count of CHas on multi-pci-segment arches

2018-03-12 Thread Kroening, Gary
Kan -- sorry for my late-night typo. My description for "hubless" should have 
said "a single segment/domain" rather than single bus. 

> "hubless" -- equivalent to "glueless" or "white box" where our BIOS sets
> things up with a single bus for all sockets.

 single segment/domain *
Sorry for the spam!
Gary


> -Original Message-
> From: Kroening, Gary
> Sent: Tuesday, March 13, 2018 12:07 AM
> To: 'Liang, Kan'; mi...@redhat.com; h...@zytor.com; t...@linutronix.de;
> pet...@infradead.org
> Cc: Travis, Mike; Banman, Andrew; Sivanich, Dimitri; Anderson, Russ;
> x...@kernel.org; linux-kernel@vger.kernel.org
> Subject: RE: [PATCH 1/1] x86/platform/x86: Fix count of CHas on multi-pci-
> segment arches
> 
> Thanks, Kan -- your patch looks good, and cleaner than the previous
> method!
> 
> Tonight, I've tested it on our in-house simulator for the following
> configurations. The simulator has been matching hardware well for this
> issue -- we'll do more testing on real hardware, but I'm confident you
> have it right.
> 
> Terminology:
> 
> "hubless" -- equivalent to "glueless" or "white box" where our BIOS sets
> things up with a single bus for all sockets.
> 
> "scalable" -- BIOS assigns a new segment/domain for each socket
> 
> Configurations tested so far:
> - single-socket hubless/scalable
> - two sockets, scalable
> - four sockets, hubless
> - eight sockets, scalable
> 
> In all cases, skx_count_chabox() is returning 28 for Skylake server.
> Thanks for the quick response!
> Gary
> 
> > -Original Message-
> > From: Liang, Kan [mailto:kan.li...@linux.intel.com]
> > Sent: Monday, March 12, 2018 8:43 PM
> > To: Kroening, Gary; mi...@redhat.com; h...@zytor.com; t...@linutronix.de;
> > pet...@infradead.org
> > Cc: Travis, Mike; Banman, Andrew; Sivanich, Dimitri; Anderson, Russ;
> > x...@kernel.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH 1/1] x86/platform/x86: Fix count of CHas on multi-
> pci-
> > segment arches
> >
> >
> >
> > On 3/7/2018 3:33 PM, Kroening, Gary wrote:
> > > For systems with a single PCI segment, it is sufficient to look for
> the
> > > bus number to change in order to determine that all of the CHa's have
> > > been counted for a single socket.
> > >
> > > However, for multi PCI segment systems, each socket is given a new
> > > segment and the bus number does NOT change.  So looking only for the
> > > bus number to change ends up counting all of the CHa's on all sockets
> > > in the system.  This leads to writing CPU MSRs beyond a valid range
> and
> > > causes an error in ivbep_uncore_msr_init_box().
> > >
> > > The fix is to check for either the bus number or segment number to
> > change.
> > >
> >
> > Hi Gary,
> >
> > There is a recommended way in uncore document to query the number of
> > CHAs on Skylake server.
> > I have a patch to implement the new way.
> >
> > Could you please take a look at the patch and see if it can fix your
> > issue?
> >
> >
> > Thanks,
> > Kan
> >
> > --
> >  From 55f54b2fa3021c691c2fd4f5cfc8f441fd104e91 Mon Sep 17 00:00:00 2001
> > From: Kan Liang 
> > Date: Mon, 12 Mar 2018 13:03:40 -0700
> > Subject: [PATCH] perf/x86/intel/uncore: Querying number of CHAs from
> > CAPID6 register
> >
> > The number of CHAs is miscalculated on multi PCI domain systems on
> > Skylake server
> >
> > (From Kroening, Gary:
> >
> > For systems with a single PCI segment, it is sufficient to look for the
> > bus number to change in order to determine that all of the CHa's have
> > been counted for a single socket.
> > However, for multi PCI segment systems, each socket is given a new
> > segment and the bus number does NOT change.  So looking only for the
> > bus number to change ends up counting all of the CHa's on all sockets
> > in the system.  This leads to writing CPU MSRs beyond a valid range and
> > causes an error in ivbep_uncore_msr_init_box().)
> >
> > To determine the number of CHAs, it should read bits 27:0 in the CAPID6
> > register located at Device 30, Function 3, Offset 0x9C. These 28 bits
> > form a bit vector of available LLC slices and the CHAs that manage those
> > slices.
> >
> > Fixes: cd34cd97b7b4 ("perf/x86/intel/uncore: Add Skylake server uncore
> > support")
> > Reported-by: Kroening, Gary 
> > Signed-off-by: Kan Liang 
> > ---
> >   arch/x86/events/intel/uncore_snbep.c | 24 ++--
> >   1 file changed, 10 insertions(+), 14 deletions(-)
> >
> > diff --git a/arch/x86/events/intel/uncore_snbep.c
> > b/arch/x86/events/intel/uncore_snbep.c
> > index d4672ed..a42715b 100644
> > --- a/arch/x86/events/intel/uncore_snbep.c
> > +++ b/arch/x86/events/intel/uncore_snbep.c
> > @@ -3575,24 +3575,20 @@ static struct intel_uncore_type
> > *skx_msr_uncores[] = {
> > NULL,
> >   };
> >
> > +#define SKX_CAPID6 0x9c
> > +#define SKX_CHA_BIT_WIDTH  28
> > +
> >   static int skx_count_chabox(void)
> >   {
> > -   struct pci_dev 

RE: [PATCH 1/1] x86/platform/x86: Fix count of CHas on multi-pci-segment arches

2018-03-12 Thread Kroening, Gary
Kan -- sorry for my late-night typo. My description for "hubless" should have 
said "a single segment/domain" rather than single bus. 

> "hubless" -- equivalent to "glueless" or "white box" where our BIOS sets
> things up with a single bus for all sockets.

 single segment/domain *
Sorry for the spam!
Gary


> -Original Message-
> From: Kroening, Gary
> Sent: Tuesday, March 13, 2018 12:07 AM
> To: 'Liang, Kan'; mi...@redhat.com; h...@zytor.com; t...@linutronix.de;
> pet...@infradead.org
> Cc: Travis, Mike; Banman, Andrew; Sivanich, Dimitri; Anderson, Russ;
> x...@kernel.org; linux-kernel@vger.kernel.org
> Subject: RE: [PATCH 1/1] x86/platform/x86: Fix count of CHas on multi-pci-
> segment arches
> 
> Thanks, Kan -- your patch looks good, and cleaner than the previous
> method!
> 
> Tonight, I've tested it on our in-house simulator for the following
> configurations. The simulator has been matching hardware well for this
> issue -- we'll do more testing on real hardware, but I'm confident you
> have it right.
> 
> Terminology:
> 
> "hubless" -- equivalent to "glueless" or "white box" where our BIOS sets
> things up with a single bus for all sockets.
> 
> "scalable" -- BIOS assigns a new segment/domain for each socket
> 
> Configurations tested so far:
> - single-socket hubless/scalable
> - two sockets, scalable
> - four sockets, hubless
> - eight sockets, scalable
> 
> In all cases, skx_count_chabox() is returning 28 for Skylake server.
> Thanks for the quick response!
> Gary
> 
> > -Original Message-
> > From: Liang, Kan [mailto:kan.li...@linux.intel.com]
> > Sent: Monday, March 12, 2018 8:43 PM
> > To: Kroening, Gary; mi...@redhat.com; h...@zytor.com; t...@linutronix.de;
> > pet...@infradead.org
> > Cc: Travis, Mike; Banman, Andrew; Sivanich, Dimitri; Anderson, Russ;
> > x...@kernel.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH 1/1] x86/platform/x86: Fix count of CHas on multi-
> pci-
> > segment arches
> >
> >
> >
> > On 3/7/2018 3:33 PM, Kroening, Gary wrote:
> > > For systems with a single PCI segment, it is sufficient to look for
> the
> > > bus number to change in order to determine that all of the CHa's have
> > > been counted for a single socket.
> > >
> > > However, for multi PCI segment systems, each socket is given a new
> > > segment and the bus number does NOT change.  So looking only for the
> > > bus number to change ends up counting all of the CHa's on all sockets
> > > in the system.  This leads to writing CPU MSRs beyond a valid range
> and
> > > causes an error in ivbep_uncore_msr_init_box().
> > >
> > > The fix is to check for either the bus number or segment number to
> > change.
> > >
> >
> > Hi Gary,
> >
> > There is a recommended way in uncore document to query the number of
> > CHAs on Skylake server.
> > I have a patch to implement the new way.
> >
> > Could you please take a look at the patch and see if it can fix your
> > issue?
> >
> >
> > Thanks,
> > Kan
> >
> > --
> >  From 55f54b2fa3021c691c2fd4f5cfc8f441fd104e91 Mon Sep 17 00:00:00 2001
> > From: Kan Liang 
> > Date: Mon, 12 Mar 2018 13:03:40 -0700
> > Subject: [PATCH] perf/x86/intel/uncore: Querying number of CHAs from
> > CAPID6 register
> >
> > The number of CHAs is miscalculated on multi PCI domain systems on
> > Skylake server
> >
> > (From Kroening, Gary:
> >
> > For systems with a single PCI segment, it is sufficient to look for the
> > bus number to change in order to determine that all of the CHa's have
> > been counted for a single socket.
> > However, for multi PCI segment systems, each socket is given a new
> > segment and the bus number does NOT change.  So looking only for the
> > bus number to change ends up counting all of the CHa's on all sockets
> > in the system.  This leads to writing CPU MSRs beyond a valid range and
> > causes an error in ivbep_uncore_msr_init_box().)
> >
> > To determine the number of CHAs, it should read bits 27:0 in the CAPID6
> > register located at Device 30, Function 3, Offset 0x9C. These 28 bits
> > form a bit vector of available LLC slices and the CHAs that manage those
> > slices.
> >
> > Fixes: cd34cd97b7b4 ("perf/x86/intel/uncore: Add Skylake server uncore
> > support")
> > Reported-by: Kroening, Gary 
> > Signed-off-by: Kan Liang 
> > ---
> >   arch/x86/events/intel/uncore_snbep.c | 24 ++--
> >   1 file changed, 10 insertions(+), 14 deletions(-)
> >
> > diff --git a/arch/x86/events/intel/uncore_snbep.c
> > b/arch/x86/events/intel/uncore_snbep.c
> > index d4672ed..a42715b 100644
> > --- a/arch/x86/events/intel/uncore_snbep.c
> > +++ b/arch/x86/events/intel/uncore_snbep.c
> > @@ -3575,24 +3575,20 @@ static struct intel_uncore_type
> > *skx_msr_uncores[] = {
> > NULL,
> >   };
> >
> > +#define SKX_CAPID6 0x9c
> > +#define SKX_CHA_BIT_WIDTH  28
> > +
> >   static int skx_count_chabox(void)
> >   {
> > -   struct pci_dev *chabox_dev = NULL;
> > -   int bus, count = 0;
> > +   struct pci_dev *dev = 

Re: [PATCH] arm: ubsan: select ARCH_HAS_UBSAN_SANITIZE_ALL

2018-03-12 Thread Jaehoon Chung
On 03/13/2018 01:53 PM, Jinbum Park wrote:
> To enable UBSAN on arm, ARCH_HAS_UBSAN_SANITIZE_ALL is needed to be selected.
> 
> Basic test has passed on Raspberry Pi2, Raspbian jessi lite with
> CONFIG_UBSAN_SANITIZE_ALL, CONFIG_UBSAN_NULL.

This patch had been already sent from Seungwoo.

https://patchwork.kernel.org/patch/9344477/

Best Regards,
Jaehoon Chung

> 
> Signed-off-by: Jinbum Park 
> ---
>  arch/arm/Kconfig  | 1 +
>  arch/arm/boot/compressed/Makefile | 1 +
>  arch/arm/vdso/Makefile| 1 +
>  3 files changed, 3 insertions(+)
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 1878083..bdd1561 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -7,6 +7,7 @@ config ARM
>   select ARCH_HAS_DEBUG_VIRTUAL if MMU
>   select ARCH_HAS_DEVMEM_IS_ALLOWED
>   select ARCH_HAS_ELF_RANDOMIZE
> + select ARCH_HAS_UBSAN_SANITIZE_ALL
>   select ARCH_HAS_SET_MEMORY
>   select ARCH_HAS_PHYS_TO_DMA
>   select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
> diff --git a/arch/arm/boot/compressed/Makefile 
> b/arch/arm/boot/compressed/Makefile
> index 45a6b9b..1b374ba 100644
> --- a/arch/arm/boot/compressed/Makefile
> +++ b/arch/arm/boot/compressed/Makefile
> @@ -24,6 +24,7 @@ OBJS+= hyp-stub.o
>  endif
>  
>  GCOV_PROFILE := n
> +UBSAN_SANITIZE   := n
>  
>  #
>  # Architecture dependencies
> diff --git a/arch/arm/vdso/Makefile b/arch/arm/vdso/Makefile
> index bb411821..05597f7 100644
> --- a/arch/arm/vdso/Makefile
> +++ b/arch/arm/vdso/Makefile
> @@ -29,6 +29,7 @@ CFLAGS_vgettimeofday.o = -O2
>  
>  # Disable gcov profiling for VDSO code
>  GCOV_PROFILE := n
> +UBSAN_SANITIZE := n
>  
>  # Force dependency
>  $(obj)/vdso.o : $(obj)/vdso.so
> 



Re: [PATCH] arm: ubsan: select ARCH_HAS_UBSAN_SANITIZE_ALL

2018-03-12 Thread Jaehoon Chung
On 03/13/2018 01:53 PM, Jinbum Park wrote:
> To enable UBSAN on arm, ARCH_HAS_UBSAN_SANITIZE_ALL is needed to be selected.
> 
> Basic test has passed on Raspberry Pi2, Raspbian jessi lite with
> CONFIG_UBSAN_SANITIZE_ALL, CONFIG_UBSAN_NULL.

This patch had been already sent from Seungwoo.

https://patchwork.kernel.org/patch/9344477/

Best Regards,
Jaehoon Chung

> 
> Signed-off-by: Jinbum Park 
> ---
>  arch/arm/Kconfig  | 1 +
>  arch/arm/boot/compressed/Makefile | 1 +
>  arch/arm/vdso/Makefile| 1 +
>  3 files changed, 3 insertions(+)
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 1878083..bdd1561 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -7,6 +7,7 @@ config ARM
>   select ARCH_HAS_DEBUG_VIRTUAL if MMU
>   select ARCH_HAS_DEVMEM_IS_ALLOWED
>   select ARCH_HAS_ELF_RANDOMIZE
> + select ARCH_HAS_UBSAN_SANITIZE_ALL
>   select ARCH_HAS_SET_MEMORY
>   select ARCH_HAS_PHYS_TO_DMA
>   select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
> diff --git a/arch/arm/boot/compressed/Makefile 
> b/arch/arm/boot/compressed/Makefile
> index 45a6b9b..1b374ba 100644
> --- a/arch/arm/boot/compressed/Makefile
> +++ b/arch/arm/boot/compressed/Makefile
> @@ -24,6 +24,7 @@ OBJS+= hyp-stub.o
>  endif
>  
>  GCOV_PROFILE := n
> +UBSAN_SANITIZE   := n
>  
>  #
>  # Architecture dependencies
> diff --git a/arch/arm/vdso/Makefile b/arch/arm/vdso/Makefile
> index bb411821..05597f7 100644
> --- a/arch/arm/vdso/Makefile
> +++ b/arch/arm/vdso/Makefile
> @@ -29,6 +29,7 @@ CFLAGS_vgettimeofday.o = -O2
>  
>  # Disable gcov profiling for VDSO code
>  GCOV_PROFILE := n
> +UBSAN_SANITIZE := n
>  
>  # Force dependency
>  $(obj)/vdso.o : $(obj)/vdso.so
> 



Re: [PATCH 1/1] net: check dev->reg_state before deref of napi netdev_ops

2018-03-12 Thread Josh Elsasser
On Mon, Mar 12, 2018 at 4:17 PM, Cong Wang  wrote:
> On Sun, Mar 11, 2018 at 12:22 PM, Josh Elsasser  wrote:
>> init_dummy_netdev() leaves its netdev_ops pointer zeroed. This leads
>> to a NULL pointer dereference when sk_busy_loop fires against an iwlwifi
>> wireless adapter and checks napi->dev->netdev_ops->ndo_busy_poll.
>>
>> Avoid this by ensuring that napi->dev is not a dummy device before
>> dereferencing napi dev's netdev_ops, preventing the following panic:
>
> Hmm, how about just checking ->netdev_ops? Checking reg_state looks
> odd, although works.

Fair point. I was trying to differentiate between an unexpected
NULL pointer and a dummy netdev, but I guess it was clever
at the expense of readability.

I'll push up a v2 that just does the obvious.


Re: [PATCH 1/1] net: check dev->reg_state before deref of napi netdev_ops

2018-03-12 Thread Josh Elsasser
On Mon, Mar 12, 2018 at 4:17 PM, Cong Wang  wrote:
> On Sun, Mar 11, 2018 at 12:22 PM, Josh Elsasser  wrote:
>> init_dummy_netdev() leaves its netdev_ops pointer zeroed. This leads
>> to a NULL pointer dereference when sk_busy_loop fires against an iwlwifi
>> wireless adapter and checks napi->dev->netdev_ops->ndo_busy_poll.
>>
>> Avoid this by ensuring that napi->dev is not a dummy device before
>> dereferencing napi dev's netdev_ops, preventing the following panic:
>
> Hmm, how about just checking ->netdev_ops? Checking reg_state looks
> odd, although works.

Fair point. I was trying to differentiate between an unexpected
NULL pointer and a dummy netdev, but I guess it was clever
at the expense of readability.

I'll push up a v2 that just does the obvious.


Re: [PATCH v2] exec: Set file unwritable before LSM check

2018-03-12 Thread James Morris
On Fri, 9 Mar 2018, Kees Cook wrote:

> The LSM check should happen after the file has been confirmed to be
> unchanging. Without this, we could have a race between the Time of Check
> (the call to security_kernel_read_file() which could read the file and
> make access policy decisions) and the Time of Use (starting with
> kernel_read_file()'s reading of the file contents). In theory, file
> contents could change between the two.
> 
> Signed-off-by: Kees Cook 
> ---
> v2: Clarify the ToC/ToU race (Linus)


I'll merge this unless Al objects (cc'd).

> 
> Only loadpin and SELinux currently implement this hook. From what
> I can see, this won't change anything for either of them. IMA calls
> kernel_read_file(), but looking there it seems those callers won't be
> negatively impacted either. Can folks double-check this and send an
> Ack please?
> ---
>  fs/exec.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index 7eb8d21bcab9..a919a827d181 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -895,13 +895,13 @@ int kernel_read_file(struct file *file, void **buf, 
> loff_t *size,
>   if (!S_ISREG(file_inode(file)->i_mode) || max_size < 0)
>   return -EINVAL;
>  
> - ret = security_kernel_read_file(file, id);
> + ret = deny_write_access(file);
>   if (ret)
>   return ret;
>  
> - ret = deny_write_access(file);
> + ret = security_kernel_read_file(file, id);
>   if (ret)
> - return ret;
> + goto out;
>  
>   i_size = i_size_read(file_inode(file));
>   if (max_size > 0 && i_size > max_size) {
> 

-- 
James Morris




Re: [PATCH v2] exec: Set file unwritable before LSM check

2018-03-12 Thread James Morris
On Fri, 9 Mar 2018, Kees Cook wrote:

> The LSM check should happen after the file has been confirmed to be
> unchanging. Without this, we could have a race between the Time of Check
> (the call to security_kernel_read_file() which could read the file and
> make access policy decisions) and the Time of Use (starting with
> kernel_read_file()'s reading of the file contents). In theory, file
> contents could change between the two.
> 
> Signed-off-by: Kees Cook 
> ---
> v2: Clarify the ToC/ToU race (Linus)


I'll merge this unless Al objects (cc'd).

> 
> Only loadpin and SELinux currently implement this hook. From what
> I can see, this won't change anything for either of them. IMA calls
> kernel_read_file(), but looking there it seems those callers won't be
> negatively impacted either. Can folks double-check this and send an
> Ack please?
> ---
>  fs/exec.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index 7eb8d21bcab9..a919a827d181 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -895,13 +895,13 @@ int kernel_read_file(struct file *file, void **buf, 
> loff_t *size,
>   if (!S_ISREG(file_inode(file)->i_mode) || max_size < 0)
>   return -EINVAL;
>  
> - ret = security_kernel_read_file(file, id);
> + ret = deny_write_access(file);
>   if (ret)
>   return ret;
>  
> - ret = deny_write_access(file);
> + ret = security_kernel_read_file(file, id);
>   if (ret)
> - return ret;
> + goto out;
>  
>   i_size = i_size_read(file_inode(file));
>   if (max_size > 0 && i_size > max_size) {
> 

-- 
James Morris




Re: [PATCH] arm: ubsan: select ARCH_HAS_UBSAN_SANITIZE_ALL

2018-03-12 Thread Jaehoon Chung
On 03/13/2018 01:53 PM, Jinbum Park wrote:
> To enable UBSAN on arm, ARCH_HAS_UBSAN_SANITIZE_ALL is needed to be selected.
> 
> Basic test has passed on Raspberry Pi2, Raspbian jessi lite with
> CONFIG_UBSAN_SANITIZE_ALL, CONFIG_UBSAN_NULL.
> 
> Signed-off-by: Jinbum Park 
> ---
>  arch/arm/Kconfig  | 1 +
>  arch/arm/boot/compressed/Makefile | 1 +
>  arch/arm/vdso/Makefile| 1 +
>  3 files changed, 3 insertions(+)
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 1878083..bdd1561 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -7,6 +7,7 @@ config ARM
>   select ARCH_HAS_DEBUG_VIRTUAL if MMU
>   select ARCH_HAS_DEVMEM_IS_ALLOWED
>   select ARCH_HAS_ELF_RANDOMIZE
> + select ARCH_HAS_UBSAN_SANITIZE_ALL
>   select ARCH_HAS_SET_MEMORY
>   select ARCH_HAS_PHYS_TO_DMA
>   select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
> diff --git a/arch/arm/boot/compressed/Makefile 
> b/arch/arm/boot/compressed/Makefile
> index 45a6b9b..1b374ba 100644
> --- a/arch/arm/boot/compressed/Makefile
> +++ b/arch/arm/boot/compressed/Makefile
> @@ -24,6 +24,7 @@ OBJS+= hyp-stub.o
>  endif
>  
>  GCOV_PROFILE := n
> +UBSAN_SANITIZE   := n
>  
>  #
>  # Architecture dependencies
> diff --git a/arch/arm/vdso/Makefile b/arch/arm/vdso/Makefile
> index bb411821..05597f7 100644
> --- a/arch/arm/vdso/Makefile
> +++ b/arch/arm/vdso/Makefile
> @@ -29,6 +29,7 @@ CFLAGS_vgettimeofday.o = -O2
>  
>  # Disable gcov profiling for VDSO code
>  GCOV_PROFILE := n
> +UBSAN_SANITIZE := n
>  
>  # Force dependency
>  $(obj)/vdso.o : $(obj)/vdso.so
> 



Re: [PATCH] arm: ubsan: select ARCH_HAS_UBSAN_SANITIZE_ALL

2018-03-12 Thread Jaehoon Chung
On 03/13/2018 01:53 PM, Jinbum Park wrote:
> To enable UBSAN on arm, ARCH_HAS_UBSAN_SANITIZE_ALL is needed to be selected.
> 
> Basic test has passed on Raspberry Pi2, Raspbian jessi lite with
> CONFIG_UBSAN_SANITIZE_ALL, CONFIG_UBSAN_NULL.
> 
> Signed-off-by: Jinbum Park 
> ---
>  arch/arm/Kconfig  | 1 +
>  arch/arm/boot/compressed/Makefile | 1 +
>  arch/arm/vdso/Makefile| 1 +
>  3 files changed, 3 insertions(+)
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 1878083..bdd1561 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -7,6 +7,7 @@ config ARM
>   select ARCH_HAS_DEBUG_VIRTUAL if MMU
>   select ARCH_HAS_DEVMEM_IS_ALLOWED
>   select ARCH_HAS_ELF_RANDOMIZE
> + select ARCH_HAS_UBSAN_SANITIZE_ALL
>   select ARCH_HAS_SET_MEMORY
>   select ARCH_HAS_PHYS_TO_DMA
>   select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
> diff --git a/arch/arm/boot/compressed/Makefile 
> b/arch/arm/boot/compressed/Makefile
> index 45a6b9b..1b374ba 100644
> --- a/arch/arm/boot/compressed/Makefile
> +++ b/arch/arm/boot/compressed/Makefile
> @@ -24,6 +24,7 @@ OBJS+= hyp-stub.o
>  endif
>  
>  GCOV_PROFILE := n
> +UBSAN_SANITIZE   := n
>  
>  #
>  # Architecture dependencies
> diff --git a/arch/arm/vdso/Makefile b/arch/arm/vdso/Makefile
> index bb411821..05597f7 100644
> --- a/arch/arm/vdso/Makefile
> +++ b/arch/arm/vdso/Makefile
> @@ -29,6 +29,7 @@ CFLAGS_vgettimeofday.o = -O2
>  
>  # Disable gcov profiling for VDSO code
>  GCOV_PROFILE := n
> +UBSAN_SANITIZE := n
>  
>  # Force dependency
>  $(obj)/vdso.o : $(obj)/vdso.so
> 



[PATCH v10] mmc: Export host capabilities to debugfs.

2018-03-12 Thread Harish Jenny K N
This patch exports the host capabilities to debugfs

This idea of sharing host capabilities over debugfs
came up from Abbas Raza 
Earlier discussions:
https://lkml.org/lkml/2018/3/5/357
https://www.spinics.net/lists/linux-mmc/msg48219.html

Reviewed-by: Andy Shevchenko 
Signed-off-by: Harish Jenny K N 
---

Changes in v10:
- minor review comment addressed. 
- Added "Reviewed-by" line

Changes in v9
- More code cleanup as suggested by Andy Shevchenko.

Changes in v8
- Changes to use for_each_set_bit as suggested by Andy Shevchenko.

Changes in v7
- Moved additional capabilities also to caps file as mentioned by Ulf Hansson
- compacting the code with macros

Changes in v6:
- Used DEFINE_SHOW_ATTRIBUTE

Changes in v5:
- Added parser logic in kernel by using debugfs_create_file  for caps and caps2 
instead of debugfs_create_x32
- Changed Author

Changes in v4:
- Moved the creation of nodes to mmc_add_host_debugfs
- Exported caps2
- Renamed host_caps to caps

Changes in v3:
- Removed typecasting of >caps to (u32 *)

Changes in v2:
- Changed Author

 drivers/mmc/core/debugfs.c | 107 +
 1 file changed, 107 insertions(+)

diff --git a/drivers/mmc/core/debugfs.c b/drivers/mmc/core/debugfs.c
index c51e0c0..826d361 100644
--- a/drivers/mmc/core/debugfs.c
+++ b/drivers/mmc/core/debugfs.c
@@ -225,6 +225,110 @@ static int mmc_clock_opt_set(void *data, u64 val)
 DEFINE_SIMPLE_ATTRIBUTE(mmc_clock_fops, mmc_clock_opt_get, mmc_clock_opt_set,
"%llu\n");
 
+/*
+ * mmc_host_capabilities - MMC host capabilities
+ *
+ * This must be in sync with caps definitions in the mmc/host.h
+ */
+static const char * const mmc_host_capabilities[] = {
+   "4-bit transfers allowed",
+   "Supports MMC high-speed timing",
+   "Supports SD high-speed timing",
+   "Can signal pending SDIO IRQs",
+   "Talks only SPI protocols",
+   "Needs polling for card-detection",
+   "8 bit transfers allowed",
+   "Suspends (e)MMC/SD at idle",
+   "Nonremovable",
+   "Waits while card is busy",
+   "Allows erase/trim commands",
+   "Supports DDR mode at 3.3V",
+   "Supports DDR mode at 1.8V",
+   "Supports DDR mode at 1.2V",
+   "Can power off after boot",
+   "CMD14/CMD19 bus width ok",
+   "Supports UHS SDR12 mode",
+   "Supports UHS SDR25 mode",
+   "Supports UHS SDR50 mode",
+   "Supports UHS SDR104 mode",
+   "Supports UHS DDR50 mode",
+   "Unknown (bit 21)",
+   "Unknown (bit 22)",
+   "Supports Driver Type A",
+   "Supports Driver Type C",
+   "Supports Driver Type D",
+   "Unknown (bit 26)",
+   "RW reqs can be completed within mmc_request_done()",
+   "Supports Enable card detect wake",
+   "Can send commands during data transfer",
+   "CMD23 supported",
+   "Supports Hardware reset"
+};
+
+/*
+ * mmc_host_capabilities2 - MMC host additional capabilities
+ *
+ * This must be in sync with caps2 definitions in the mmc/host.h
+ */
+static const char * const mmc_host_capabilities2[] = {
+   "No access to Boot partition",
+   "Unknown (bit 1)",
+   "Can do full power cycle",
+   "Unknown (bit 3)",
+   "Unknown (bit 4)",
+   "Supports HS200 1.8V SDR",
+   "Supports HS200 1.2V SDR",
+   "Unknown (bit 7)",
+   "Unknown (bit 8)",
+   "Unknown (bit 9)",
+   "Card-detect signal active high",
+   "Write-protect signal active high",
+   "Unknown (bit 12)",
+   "Unknown (bit 13)",
+   "Can do complete power cycle of the card",
+   "Supports HS400 1.8V",
+   "Support HS400 1.2V",
+   "SDIO IRQ - Nothread",
+   "No physical write protect pin, assume always read-write",
+   "Do not send SDIO commands during initialization",
+   "Supports enhanced strobe",
+   "Do not send SD commands during initialization",
+   "Do not send (e)MMC commands during initialization",
+   "Has eMMC command queue engine",
+   "CQE can issue a direct command",
+   "Unknown (bit 25)",
+   "Unknown (bit 26)",
+   "Unknown (bit 27)",
+   "Unknown (bit 28)",
+   "Unknown (bit 29)",
+   "Unknown (bit 30)",
+   "Unknown (bit 31)"
+};
+
+static int mmc_caps_show(struct seq_file *s, void *unused)
+{
+   struct mmc_host *host = s->private;
+   unsigned long caps = host->caps;
+   unsigned long caps2 = host->caps2;
+   int bit;
+
+   seq_puts(s, "MMC Host capabilities are:\n");
+   seq_puts(s, "=\n");
+
+   for_each_set_bit(bit, , ARRAY_SIZE(mmc_host_capabilities))
+   seq_printf(s, "%s\n", mmc_host_capabilities[bit]);
+
+   seq_puts(s, "=\n");
+   seq_puts(s, "MMC Host additional capabilities are:\n");
+   seq_puts(s, "=\n");
+
+ 

[PATCH v10] mmc: Export host capabilities to debugfs.

2018-03-12 Thread Harish Jenny K N
This patch exports the host capabilities to debugfs

This idea of sharing host capabilities over debugfs
came up from Abbas Raza 
Earlier discussions:
https://lkml.org/lkml/2018/3/5/357
https://www.spinics.net/lists/linux-mmc/msg48219.html

Reviewed-by: Andy Shevchenko 
Signed-off-by: Harish Jenny K N 
---

Changes in v10:
- minor review comment addressed. 
- Added "Reviewed-by" line

Changes in v9
- More code cleanup as suggested by Andy Shevchenko.

Changes in v8
- Changes to use for_each_set_bit as suggested by Andy Shevchenko.

Changes in v7
- Moved additional capabilities also to caps file as mentioned by Ulf Hansson
- compacting the code with macros

Changes in v6:
- Used DEFINE_SHOW_ATTRIBUTE

Changes in v5:
- Added parser logic in kernel by using debugfs_create_file  for caps and caps2 
instead of debugfs_create_x32
- Changed Author

Changes in v4:
- Moved the creation of nodes to mmc_add_host_debugfs
- Exported caps2
- Renamed host_caps to caps

Changes in v3:
- Removed typecasting of >caps to (u32 *)

Changes in v2:
- Changed Author

 drivers/mmc/core/debugfs.c | 107 +
 1 file changed, 107 insertions(+)

diff --git a/drivers/mmc/core/debugfs.c b/drivers/mmc/core/debugfs.c
index c51e0c0..826d361 100644
--- a/drivers/mmc/core/debugfs.c
+++ b/drivers/mmc/core/debugfs.c
@@ -225,6 +225,110 @@ static int mmc_clock_opt_set(void *data, u64 val)
 DEFINE_SIMPLE_ATTRIBUTE(mmc_clock_fops, mmc_clock_opt_get, mmc_clock_opt_set,
"%llu\n");
 
+/*
+ * mmc_host_capabilities - MMC host capabilities
+ *
+ * This must be in sync with caps definitions in the mmc/host.h
+ */
+static const char * const mmc_host_capabilities[] = {
+   "4-bit transfers allowed",
+   "Supports MMC high-speed timing",
+   "Supports SD high-speed timing",
+   "Can signal pending SDIO IRQs",
+   "Talks only SPI protocols",
+   "Needs polling for card-detection",
+   "8 bit transfers allowed",
+   "Suspends (e)MMC/SD at idle",
+   "Nonremovable",
+   "Waits while card is busy",
+   "Allows erase/trim commands",
+   "Supports DDR mode at 3.3V",
+   "Supports DDR mode at 1.8V",
+   "Supports DDR mode at 1.2V",
+   "Can power off after boot",
+   "CMD14/CMD19 bus width ok",
+   "Supports UHS SDR12 mode",
+   "Supports UHS SDR25 mode",
+   "Supports UHS SDR50 mode",
+   "Supports UHS SDR104 mode",
+   "Supports UHS DDR50 mode",
+   "Unknown (bit 21)",
+   "Unknown (bit 22)",
+   "Supports Driver Type A",
+   "Supports Driver Type C",
+   "Supports Driver Type D",
+   "Unknown (bit 26)",
+   "RW reqs can be completed within mmc_request_done()",
+   "Supports Enable card detect wake",
+   "Can send commands during data transfer",
+   "CMD23 supported",
+   "Supports Hardware reset"
+};
+
+/*
+ * mmc_host_capabilities2 - MMC host additional capabilities
+ *
+ * This must be in sync with caps2 definitions in the mmc/host.h
+ */
+static const char * const mmc_host_capabilities2[] = {
+   "No access to Boot partition",
+   "Unknown (bit 1)",
+   "Can do full power cycle",
+   "Unknown (bit 3)",
+   "Unknown (bit 4)",
+   "Supports HS200 1.8V SDR",
+   "Supports HS200 1.2V SDR",
+   "Unknown (bit 7)",
+   "Unknown (bit 8)",
+   "Unknown (bit 9)",
+   "Card-detect signal active high",
+   "Write-protect signal active high",
+   "Unknown (bit 12)",
+   "Unknown (bit 13)",
+   "Can do complete power cycle of the card",
+   "Supports HS400 1.8V",
+   "Support HS400 1.2V",
+   "SDIO IRQ - Nothread",
+   "No physical write protect pin, assume always read-write",
+   "Do not send SDIO commands during initialization",
+   "Supports enhanced strobe",
+   "Do not send SD commands during initialization",
+   "Do not send (e)MMC commands during initialization",
+   "Has eMMC command queue engine",
+   "CQE can issue a direct command",
+   "Unknown (bit 25)",
+   "Unknown (bit 26)",
+   "Unknown (bit 27)",
+   "Unknown (bit 28)",
+   "Unknown (bit 29)",
+   "Unknown (bit 30)",
+   "Unknown (bit 31)"
+};
+
+static int mmc_caps_show(struct seq_file *s, void *unused)
+{
+   struct mmc_host *host = s->private;
+   unsigned long caps = host->caps;
+   unsigned long caps2 = host->caps2;
+   int bit;
+
+   seq_puts(s, "MMC Host capabilities are:\n");
+   seq_puts(s, "=\n");
+
+   for_each_set_bit(bit, , ARRAY_SIZE(mmc_host_capabilities))
+   seq_printf(s, "%s\n", mmc_host_capabilities[bit]);
+
+   seq_puts(s, "=\n");
+   seq_puts(s, "MMC Host additional capabilities are:\n");
+   seq_puts(s, "=\n");
+
+   for_each_set_bit(bit, , ARRAY_SIZE(mmc_host_capabilities2))
+   

RE: [PATCH 1/1] x86/platform/x86: Fix count of CHas on multi-pci-segment arches

2018-03-12 Thread Kroening, Gary
Thanks, Kan -- your patch looks good, and cleaner than the previous method!

Tonight, I've tested it on our in-house simulator for the following 
configurations. The simulator has been matching hardware well for this issue -- 
we'll do more testing on real hardware, but I'm confident you have it right.

Terminology:

"hubless" -- equivalent to "glueless" or "white box" where our BIOS sets things 
up with a single bus for all sockets.

"scalable" -- BIOS assigns a new segment/domain for each socket

Configurations tested so far:
- single-socket hubless/scalable
- two sockets, scalable
- four sockets, hubless
- eight sockets, scalable

In all cases, skx_count_chabox() is returning 28 for Skylake server.
Thanks for the quick response!
Gary

> -Original Message-
> From: Liang, Kan [mailto:kan.li...@linux.intel.com]
> Sent: Monday, March 12, 2018 8:43 PM
> To: Kroening, Gary; mi...@redhat.com; h...@zytor.com; t...@linutronix.de;
> pet...@infradead.org
> Cc: Travis, Mike; Banman, Andrew; Sivanich, Dimitri; Anderson, Russ;
> x...@kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 1/1] x86/platform/x86: Fix count of CHas on multi-pci-
> segment arches
> 
> 
> 
> On 3/7/2018 3:33 PM, Kroening, Gary wrote:
> > For systems with a single PCI segment, it is sufficient to look for the
> > bus number to change in order to determine that all of the CHa's have
> > been counted for a single socket.
> >
> > However, for multi PCI segment systems, each socket is given a new
> > segment and the bus number does NOT change.  So looking only for the
> > bus number to change ends up counting all of the CHa's on all sockets
> > in the system.  This leads to writing CPU MSRs beyond a valid range and
> > causes an error in ivbep_uncore_msr_init_box().
> >
> > The fix is to check for either the bus number or segment number to
> change.
> >
> 
> Hi Gary,
> 
> There is a recommended way in uncore document to query the number of
> CHAs on Skylake server.
> I have a patch to implement the new way.
> 
> Could you please take a look at the patch and see if it can fix your
> issue?
> 
> 
> Thanks,
> Kan
> 
> --
>  From 55f54b2fa3021c691c2fd4f5cfc8f441fd104e91 Mon Sep 17 00:00:00 2001
> From: Kan Liang 
> Date: Mon, 12 Mar 2018 13:03:40 -0700
> Subject: [PATCH] perf/x86/intel/uncore: Querying number of CHAs from
> CAPID6 register
> 
> The number of CHAs is miscalculated on multi PCI domain systems on
> Skylake server
> 
> (From Kroening, Gary:
> 
> For systems with a single PCI segment, it is sufficient to look for the
> bus number to change in order to determine that all of the CHa's have
> been counted for a single socket.
> However, for multi PCI segment systems, each socket is given a new
> segment and the bus number does NOT change.  So looking only for the
> bus number to change ends up counting all of the CHa's on all sockets
> in the system.  This leads to writing CPU MSRs beyond a valid range and
> causes an error in ivbep_uncore_msr_init_box().)
> 
> To determine the number of CHAs, it should read bits 27:0 in the CAPID6
> register located at Device 30, Function 3, Offset 0x9C. These 28 bits
> form a bit vector of available LLC slices and the CHAs that manage those
> slices.
> 
> Fixes: cd34cd97b7b4 ("perf/x86/intel/uncore: Add Skylake server uncore
> support")
> Reported-by: Kroening, Gary 
> Signed-off-by: Kan Liang 
> ---
>   arch/x86/events/intel/uncore_snbep.c | 24 ++--
>   1 file changed, 10 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/x86/events/intel/uncore_snbep.c
> b/arch/x86/events/intel/uncore_snbep.c
> index d4672ed..a42715b 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -3575,24 +3575,20 @@ static struct intel_uncore_type
> *skx_msr_uncores[] = {
>   NULL,
>   };
> 
> +#define SKX_CAPID6   0x9c
> +#define SKX_CHA_BIT_WIDTH28
> +
>   static int skx_count_chabox(void)
>   {
> - struct pci_dev *chabox_dev = NULL;
> - int bus, count = 0;
> + struct pci_dev *dev = NULL;
> + u32 val = 0;
> 
> - while (1) {
> - chabox_dev = pci_get_device(PCI_VENDOR_ID_INTEL, 0x208d,
> chabox_dev);
> - if (!chabox_dev)
> - break;
> - if (count == 0)
> - bus = chabox_dev->bus->number;
> - if (bus != chabox_dev->bus->number)
> - break;
> - count++;
> - }
> + dev = pci_get_device(PCI_VENDOR_ID_INTEL, 0x2083, dev);
> + if (!dev)
> + return 0;
> 
> - pci_dev_put(chabox_dev);
> - return count;
> + pci_read_config_dword(dev, SKX_CAPID6, );
> + return bitmap_weight((unsigned long *), SKX_CHA_BIT_WIDTH);
>   }
> 
>   void skx_uncore_cpu_init(void)
> --
> 2.7.4



RE: [PATCH 1/1] x86/platform/x86: Fix count of CHas on multi-pci-segment arches

2018-03-12 Thread Kroening, Gary
Thanks, Kan -- your patch looks good, and cleaner than the previous method!

Tonight, I've tested it on our in-house simulator for the following 
configurations. The simulator has been matching hardware well for this issue -- 
we'll do more testing on real hardware, but I'm confident you have it right.

Terminology:

"hubless" -- equivalent to "glueless" or "white box" where our BIOS sets things 
up with a single bus for all sockets.

"scalable" -- BIOS assigns a new segment/domain for each socket

Configurations tested so far:
- single-socket hubless/scalable
- two sockets, scalable
- four sockets, hubless
- eight sockets, scalable

In all cases, skx_count_chabox() is returning 28 for Skylake server.
Thanks for the quick response!
Gary

> -Original Message-
> From: Liang, Kan [mailto:kan.li...@linux.intel.com]
> Sent: Monday, March 12, 2018 8:43 PM
> To: Kroening, Gary; mi...@redhat.com; h...@zytor.com; t...@linutronix.de;
> pet...@infradead.org
> Cc: Travis, Mike; Banman, Andrew; Sivanich, Dimitri; Anderson, Russ;
> x...@kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 1/1] x86/platform/x86: Fix count of CHas on multi-pci-
> segment arches
> 
> 
> 
> On 3/7/2018 3:33 PM, Kroening, Gary wrote:
> > For systems with a single PCI segment, it is sufficient to look for the
> > bus number to change in order to determine that all of the CHa's have
> > been counted for a single socket.
> >
> > However, for multi PCI segment systems, each socket is given a new
> > segment and the bus number does NOT change.  So looking only for the
> > bus number to change ends up counting all of the CHa's on all sockets
> > in the system.  This leads to writing CPU MSRs beyond a valid range and
> > causes an error in ivbep_uncore_msr_init_box().
> >
> > The fix is to check for either the bus number or segment number to
> change.
> >
> 
> Hi Gary,
> 
> There is a recommended way in uncore document to query the number of
> CHAs on Skylake server.
> I have a patch to implement the new way.
> 
> Could you please take a look at the patch and see if it can fix your
> issue?
> 
> 
> Thanks,
> Kan
> 
> --
>  From 55f54b2fa3021c691c2fd4f5cfc8f441fd104e91 Mon Sep 17 00:00:00 2001
> From: Kan Liang 
> Date: Mon, 12 Mar 2018 13:03:40 -0700
> Subject: [PATCH] perf/x86/intel/uncore: Querying number of CHAs from
> CAPID6 register
> 
> The number of CHAs is miscalculated on multi PCI domain systems on
> Skylake server
> 
> (From Kroening, Gary:
> 
> For systems with a single PCI segment, it is sufficient to look for the
> bus number to change in order to determine that all of the CHa's have
> been counted for a single socket.
> However, for multi PCI segment systems, each socket is given a new
> segment and the bus number does NOT change.  So looking only for the
> bus number to change ends up counting all of the CHa's on all sockets
> in the system.  This leads to writing CPU MSRs beyond a valid range and
> causes an error in ivbep_uncore_msr_init_box().)
> 
> To determine the number of CHAs, it should read bits 27:0 in the CAPID6
> register located at Device 30, Function 3, Offset 0x9C. These 28 bits
> form a bit vector of available LLC slices and the CHAs that manage those
> slices.
> 
> Fixes: cd34cd97b7b4 ("perf/x86/intel/uncore: Add Skylake server uncore
> support")
> Reported-by: Kroening, Gary 
> Signed-off-by: Kan Liang 
> ---
>   arch/x86/events/intel/uncore_snbep.c | 24 ++--
>   1 file changed, 10 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/x86/events/intel/uncore_snbep.c
> b/arch/x86/events/intel/uncore_snbep.c
> index d4672ed..a42715b 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -3575,24 +3575,20 @@ static struct intel_uncore_type
> *skx_msr_uncores[] = {
>   NULL,
>   };
> 
> +#define SKX_CAPID6   0x9c
> +#define SKX_CHA_BIT_WIDTH28
> +
>   static int skx_count_chabox(void)
>   {
> - struct pci_dev *chabox_dev = NULL;
> - int bus, count = 0;
> + struct pci_dev *dev = NULL;
> + u32 val = 0;
> 
> - while (1) {
> - chabox_dev = pci_get_device(PCI_VENDOR_ID_INTEL, 0x208d,
> chabox_dev);
> - if (!chabox_dev)
> - break;
> - if (count == 0)
> - bus = chabox_dev->bus->number;
> - if (bus != chabox_dev->bus->number)
> - break;
> - count++;
> - }
> + dev = pci_get_device(PCI_VENDOR_ID_INTEL, 0x2083, dev);
> + if (!dev)
> + return 0;
> 
> - pci_dev_put(chabox_dev);
> - return count;
> + pci_read_config_dword(dev, SKX_CAPID6, );
> + return bitmap_weight((unsigned long *), SKX_CHA_BIT_WIDTH);
>   }
> 
>   void skx_uncore_cpu_init(void)
> --
> 2.7.4



[PATCH] arm: ubsan: select ARCH_HAS_UBSAN_SANITIZE_ALL

2018-03-12 Thread Jinbum Park
To enable UBSAN on arm, ARCH_HAS_UBSAN_SANITIZE_ALL is needed to be selected.

Basic test has passed on Raspberry Pi2, Raspbian jessi lite with
CONFIG_UBSAN_SANITIZE_ALL, CONFIG_UBSAN_NULL.

Signed-off-by: Jinbum Park 
---
 arch/arm/Kconfig  | 1 +
 arch/arm/boot/compressed/Makefile | 1 +
 arch/arm/vdso/Makefile| 1 +
 3 files changed, 3 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 1878083..bdd1561 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -7,6 +7,7 @@ config ARM
select ARCH_HAS_DEBUG_VIRTUAL if MMU
select ARCH_HAS_DEVMEM_IS_ALLOWED
select ARCH_HAS_ELF_RANDOMIZE
+   select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_PHYS_TO_DMA
select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
diff --git a/arch/arm/boot/compressed/Makefile 
b/arch/arm/boot/compressed/Makefile
index 45a6b9b..1b374ba 100644
--- a/arch/arm/boot/compressed/Makefile
+++ b/arch/arm/boot/compressed/Makefile
@@ -24,6 +24,7 @@ OBJS  += hyp-stub.o
 endif
 
 GCOV_PROFILE   := n
+UBSAN_SANITIZE := n
 
 #
 # Architecture dependencies
diff --git a/arch/arm/vdso/Makefile b/arch/arm/vdso/Makefile
index bb411821..05597f7 100644
--- a/arch/arm/vdso/Makefile
+++ b/arch/arm/vdso/Makefile
@@ -29,6 +29,7 @@ CFLAGS_vgettimeofday.o = -O2
 
 # Disable gcov profiling for VDSO code
 GCOV_PROFILE := n
+UBSAN_SANITIZE := n
 
 # Force dependency
 $(obj)/vdso.o : $(obj)/vdso.so
-- 
1.9.1


[PATCH] arm: ubsan: select ARCH_HAS_UBSAN_SANITIZE_ALL

2018-03-12 Thread Jinbum Park
To enable UBSAN on arm, ARCH_HAS_UBSAN_SANITIZE_ALL is needed to be selected.

Basic test has passed on Raspberry Pi2, Raspbian jessi lite with
CONFIG_UBSAN_SANITIZE_ALL, CONFIG_UBSAN_NULL.

Signed-off-by: Jinbum Park 
---
 arch/arm/Kconfig  | 1 +
 arch/arm/boot/compressed/Makefile | 1 +
 arch/arm/vdso/Makefile| 1 +
 3 files changed, 3 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 1878083..bdd1561 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -7,6 +7,7 @@ config ARM
select ARCH_HAS_DEBUG_VIRTUAL if MMU
select ARCH_HAS_DEVMEM_IS_ALLOWED
select ARCH_HAS_ELF_RANDOMIZE
+   select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_PHYS_TO_DMA
select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
diff --git a/arch/arm/boot/compressed/Makefile 
b/arch/arm/boot/compressed/Makefile
index 45a6b9b..1b374ba 100644
--- a/arch/arm/boot/compressed/Makefile
+++ b/arch/arm/boot/compressed/Makefile
@@ -24,6 +24,7 @@ OBJS  += hyp-stub.o
 endif
 
 GCOV_PROFILE   := n
+UBSAN_SANITIZE := n
 
 #
 # Architecture dependencies
diff --git a/arch/arm/vdso/Makefile b/arch/arm/vdso/Makefile
index bb411821..05597f7 100644
--- a/arch/arm/vdso/Makefile
+++ b/arch/arm/vdso/Makefile
@@ -29,6 +29,7 @@ CFLAGS_vgettimeofday.o = -O2
 
 # Disable gcov profiling for VDSO code
 GCOV_PROFILE := n
+UBSAN_SANITIZE := n
 
 # Force dependency
 $(obj)/vdso.o : $(obj)/vdso.so
-- 
1.9.1


Re: [PATCH v3 20/20] mt7601u: use request_firmware_cache() to address cache on reboot

2018-03-12 Thread Jakub Kicinski
On Sat, 10 Mar 2018 06:15:01 -0800, Luis R. Rodriguez wrote:
> request_firmware_cache() will ensure the firmware is available on resume
> from suspend if on reboot the device retains the firmware.
> 
> This optimization is in place given otherwise on reboot we have to
> reload the firmware, the opmization saves us about max 1s, minimum 10ms.
> 
> Cantabile has reported back this fixes his woes with both suspend and
> hibernation.
> 
> Reported-by: Cantabile 
> Tested-by: Cantabile 
> Signed-off-by: Luis R. Rodriguez 

FWIW:

Acked-by: Jakub Kicinski 

Thanks Luis!!


Re: [PATCH v3 20/20] mt7601u: use request_firmware_cache() to address cache on reboot

2018-03-12 Thread Jakub Kicinski
On Sat, 10 Mar 2018 06:15:01 -0800, Luis R. Rodriguez wrote:
> request_firmware_cache() will ensure the firmware is available on resume
> from suspend if on reboot the device retains the firmware.
> 
> This optimization is in place given otherwise on reboot we have to
> reload the firmware, the opmization saves us about max 1s, minimum 10ms.
> 
> Cantabile has reported back this fixes his woes with both suspend and
> hibernation.
> 
> Reported-by: Cantabile 
> Tested-by: Cantabile 
> Signed-off-by: Luis R. Rodriguez 

FWIW:

Acked-by: Jakub Kicinski 

Thanks Luis!!


[PATCH 2/2] dh key: get rid of stack array allocation

2018-03-12 Thread Tycho Andersen
Similarly to the previous patch, we would like to get rid of stack
allocated arrays: https://lkml.org/lkml/2018/3/7/621

In this case, we can also use a malloc style approach to free the temporary
buffer, being careful to also use kzfree to free them (indeed, at least one
of these has a memzero_explicit, but it seems like maybe they both
should?).

Signed-off-by: Tycho Andersen 
CC: David Howells 
CC: James Morris 
CC: "Serge E. Hallyn" 
---
 security/keys/dh.c | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/security/keys/dh.c b/security/keys/dh.c
index d1ea9f325f94..f02261b24759 100644
--- a/security/keys/dh.c
+++ b/security/keys/dh.c
@@ -162,19 +162,27 @@ static int kdf_ctr(struct kdf_sdesc *sdesc, const u8 
*src, unsigned int slen,
goto err;
 
if (zlen && h) {
-   u8 tmpbuffer[h];
+   u8 *tmpbuffer;
size_t chunk = min_t(size_t, zlen, h);
-   memset(tmpbuffer, 0, chunk);
+
+   err = -ENOMEM;
+   tmpbuffer = kzalloc(chunk, GFP_KERNEL);
+   if (!tmpbuffer)
+   goto err;
 
do {
err = crypto_shash_update(desc, tmpbuffer,
  chunk);
-   if (err)
+   if (err) {
+   kzfree(tmpbuffer);
goto err;
+   }
 
zlen -= chunk;
chunk = min_t(size_t, zlen, h);
} while (zlen);
+
+   kzfree(tmpbuffer);
}
 
if (src && slen) {
@@ -184,13 +192,20 @@ static int kdf_ctr(struct kdf_sdesc *sdesc, const u8 
*src, unsigned int slen,
}
 
if (dlen < h) {
-   u8 tmpbuffer[h];
+   u8 *tmpbuffer;
+
+   err = -ENOMEM;
+   tmpbuffer = kzalloc(h, GFP_KERNEL);
+   if (!tmpbuffer)
+   goto err;
 
err = crypto_shash_final(desc, tmpbuffer);
-   if (err)
+   if (err) {
+   kzfree(tmpbuffer);
goto err;
+   }
memcpy(dst, tmpbuffer, dlen);
-   memzero_explicit(tmpbuffer, h);
+   kzfree(tmpbuffer);
return 0;
} else {
err = crypto_shash_final(desc, dst);
-- 
2.15.1



[PATCH 2/2] dh key: get rid of stack array allocation

2018-03-12 Thread Tycho Andersen
Similarly to the previous patch, we would like to get rid of stack
allocated arrays: https://lkml.org/lkml/2018/3/7/621

In this case, we can also use a malloc style approach to free the temporary
buffer, being careful to also use kzfree to free them (indeed, at least one
of these has a memzero_explicit, but it seems like maybe they both
should?).

Signed-off-by: Tycho Andersen 
CC: David Howells 
CC: James Morris 
CC: "Serge E. Hallyn" 
---
 security/keys/dh.c | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/security/keys/dh.c b/security/keys/dh.c
index d1ea9f325f94..f02261b24759 100644
--- a/security/keys/dh.c
+++ b/security/keys/dh.c
@@ -162,19 +162,27 @@ static int kdf_ctr(struct kdf_sdesc *sdesc, const u8 
*src, unsigned int slen,
goto err;
 
if (zlen && h) {
-   u8 tmpbuffer[h];
+   u8 *tmpbuffer;
size_t chunk = min_t(size_t, zlen, h);
-   memset(tmpbuffer, 0, chunk);
+
+   err = -ENOMEM;
+   tmpbuffer = kzalloc(chunk, GFP_KERNEL);
+   if (!tmpbuffer)
+   goto err;
 
do {
err = crypto_shash_update(desc, tmpbuffer,
  chunk);
-   if (err)
+   if (err) {
+   kzfree(tmpbuffer);
goto err;
+   }
 
zlen -= chunk;
chunk = min_t(size_t, zlen, h);
} while (zlen);
+
+   kzfree(tmpbuffer);
}
 
if (src && slen) {
@@ -184,13 +192,20 @@ static int kdf_ctr(struct kdf_sdesc *sdesc, const u8 
*src, unsigned int slen,
}
 
if (dlen < h) {
-   u8 tmpbuffer[h];
+   u8 *tmpbuffer;
+
+   err = -ENOMEM;
+   tmpbuffer = kzalloc(h, GFP_KERNEL);
+   if (!tmpbuffer)
+   goto err;
 
err = crypto_shash_final(desc, tmpbuffer);
-   if (err)
+   if (err) {
+   kzfree(tmpbuffer);
goto err;
+   }
memcpy(dst, tmpbuffer, dlen);
-   memzero_explicit(tmpbuffer, h);
+   kzfree(tmpbuffer);
return 0;
} else {
err = crypto_shash_final(desc, dst);
-- 
2.15.1



[PATCH 1/2] big key: get rid of stack array allocation

2018-03-12 Thread Tycho Andersen
We're interested in getting rid of all of the stack allocated arrays in the
kernel [1]. This patch removes one in keys by switching to malloc/free.
Note that we use kzalloc, to avoid leaking the nonce. I'm not sure this is
really necessary, but extra paranoia seems prudent.

Manually tested using the program from the add_key man page to trigger
big_key.

[1]: https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Tycho Andersen 
CC: David Howells 
CC: James Morris 
CC: "Serge E. Hallyn" 
CC: Jason A. Donenfeld 
---
 security/keys/big_key.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/security/keys/big_key.c b/security/keys/big_key.c
index fa728f662a6f..70f9f785c59d 100644
--- a/security/keys/big_key.c
+++ b/security/keys/big_key.c
@@ -108,13 +108,18 @@ static int big_key_crypt(enum big_key_op op, struct 
big_key_buf *buf, size_t dat
 * an .update function, so there's no chance we'll wind up reusing the
 * key to encrypt updated data. Simply put: one key, one encryption.
 */
-   u8 zero_nonce[crypto_aead_ivsize(big_key_aead)];
+   u8 *zero_nonce;
+
+   zero_nonce = kzalloc(crypto_aead_ivsize(big_key_aead), GFP_KERNEL);
+   if (!zero_nonce)
+   return -ENOMEM;
 
aead_req = aead_request_alloc(big_key_aead, GFP_KERNEL);
-   if (!aead_req)
+   if (!aead_req) {
+   kfree(zero_nonce);
return -ENOMEM;
+   }
 
-   memset(zero_nonce, 0, sizeof(zero_nonce));
aead_request_set_crypt(aead_req, buf->sg, buf->sg, datalen, zero_nonce);
aead_request_set_callback(aead_req, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, 
NULL);
aead_request_set_ad(aead_req, 0);
@@ -131,6 +136,7 @@ static int big_key_crypt(enum big_key_op op, struct 
big_key_buf *buf, size_t dat
 error:
mutex_unlock(_key_aead_lock);
aead_request_free(aead_req);
+   kzfree(zero_nonce);
return ret;
 }
 
-- 
2.15.1



[PATCH 1/2] big key: get rid of stack array allocation

2018-03-12 Thread Tycho Andersen
We're interested in getting rid of all of the stack allocated arrays in the
kernel [1]. This patch removes one in keys by switching to malloc/free.
Note that we use kzalloc, to avoid leaking the nonce. I'm not sure this is
really necessary, but extra paranoia seems prudent.

Manually tested using the program from the add_key man page to trigger
big_key.

[1]: https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Tycho Andersen 
CC: David Howells 
CC: James Morris 
CC: "Serge E. Hallyn" 
CC: Jason A. Donenfeld 
---
 security/keys/big_key.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/security/keys/big_key.c b/security/keys/big_key.c
index fa728f662a6f..70f9f785c59d 100644
--- a/security/keys/big_key.c
+++ b/security/keys/big_key.c
@@ -108,13 +108,18 @@ static int big_key_crypt(enum big_key_op op, struct 
big_key_buf *buf, size_t dat
 * an .update function, so there's no chance we'll wind up reusing the
 * key to encrypt updated data. Simply put: one key, one encryption.
 */
-   u8 zero_nonce[crypto_aead_ivsize(big_key_aead)];
+   u8 *zero_nonce;
+
+   zero_nonce = kzalloc(crypto_aead_ivsize(big_key_aead), GFP_KERNEL);
+   if (!zero_nonce)
+   return -ENOMEM;
 
aead_req = aead_request_alloc(big_key_aead, GFP_KERNEL);
-   if (!aead_req)
+   if (!aead_req) {
+   kfree(zero_nonce);
return -ENOMEM;
+   }
 
-   memset(zero_nonce, 0, sizeof(zero_nonce));
aead_request_set_crypt(aead_req, buf->sg, buf->sg, datalen, zero_nonce);
aead_request_set_callback(aead_req, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, 
NULL);
aead_request_set_ad(aead_req, 0);
@@ -131,6 +136,7 @@ static int big_key_crypt(enum big_key_op op, struct 
big_key_buf *buf, size_t dat
 error:
mutex_unlock(_key_aead_lock);
aead_request_free(aead_req);
+   kzfree(zero_nonce);
return ret;
 }
 
-- 
2.15.1



Re: [PATCH 2/2] watchdog: aspeed: Allow configuring for alternate boot

2018-03-12 Thread Joel Stanley
On Sat, Mar 10, 2018 at 8:28 AM, Eddie James  wrote:
> From: Milton Miller 
>
> Allow the device tree to specify a watchdog to fallover to
> the alternate boot source.
>
> The aspeeed watchdog can set a latch directing flash chip select 0 to
> chip select 1, allowing boot from an alternate media if the watchdog
> is not reset in time.  On the ast2400 bank 1 also goes to flash bank 1,
> while on the ast2500 the chip selects are swapped.
>
> Signed-off-by: Milton Miller 
> Signed-off-by: Eddie James 
> ---
>  drivers/watchdog/aspeed_wdt.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/watchdog/aspeed_wdt.c b/drivers/watchdog/aspeed_wdt.c
> index d1987d6..f41d246 100644
> --- a/drivers/watchdog/aspeed_wdt.c
> +++ b/drivers/watchdog/aspeed_wdt.c
> @@ -46,6 +46,7 @@ struct aspeed_wdt_config {
>  #define WDT_RELOAD_VALUE   0x04
>  #define WDT_RESTART0x08
>  #define WDT_CTRL   0x0C
> +#define   WDT_CTRL_BOOT_SECONDARY  BIT(7)
>  #define   WDT_CTRL_RESET_MODE_SOC  (0x00 << 5)
>  #define   WDT_CTRL_RESET_MODE_FULL_CHIP(0x01 << 5)
>  #define   WDT_CTRL_RESET_MODE_ARM_CPU  (0x10 << 5)
> @@ -245,6 +246,8 @@ static int aspeed_wdt_probe(struct platform_device *pdev)
> }
> if (of_property_read_bool(np, "aspeed,external-signal"))
> wdt->ctrl |= WDT_CTRL_WDT_EXT;
> +   if (of_property_read_bool(np, "aspeed,alt-boot"))
> +   wdt->ctrl |= WDT_CTRL_BOOT_SECONDARY;

If a user sets this property on the only watchdog in the system, then
they will trigger this behaviour when doing a normal 'reboot'. That
would not be desirable.

We could mitigate this by:

- not registering the watchdog as a reboot device if this property is enabled

- clearing the WDT_CTRL_BOOT_SECONDARY bit in the aspeed_wdt_restart path

The second option is neater as it allows the watchdog to behave
normally when only one is enabled. The first would be confusing if a
system didn't have more than one watchdog enabled.

Cheers,

Joel


Re: [PATCH 2/2] watchdog: aspeed: Allow configuring for alternate boot

2018-03-12 Thread Joel Stanley
On Sat, Mar 10, 2018 at 8:28 AM, Eddie James  wrote:
> From: Milton Miller 
>
> Allow the device tree to specify a watchdog to fallover to
> the alternate boot source.
>
> The aspeeed watchdog can set a latch directing flash chip select 0 to
> chip select 1, allowing boot from an alternate media if the watchdog
> is not reset in time.  On the ast2400 bank 1 also goes to flash bank 1,
> while on the ast2500 the chip selects are swapped.
>
> Signed-off-by: Milton Miller 
> Signed-off-by: Eddie James 
> ---
>  drivers/watchdog/aspeed_wdt.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/watchdog/aspeed_wdt.c b/drivers/watchdog/aspeed_wdt.c
> index d1987d6..f41d246 100644
> --- a/drivers/watchdog/aspeed_wdt.c
> +++ b/drivers/watchdog/aspeed_wdt.c
> @@ -46,6 +46,7 @@ struct aspeed_wdt_config {
>  #define WDT_RELOAD_VALUE   0x04
>  #define WDT_RESTART0x08
>  #define WDT_CTRL   0x0C
> +#define   WDT_CTRL_BOOT_SECONDARY  BIT(7)
>  #define   WDT_CTRL_RESET_MODE_SOC  (0x00 << 5)
>  #define   WDT_CTRL_RESET_MODE_FULL_CHIP(0x01 << 5)
>  #define   WDT_CTRL_RESET_MODE_ARM_CPU  (0x10 << 5)
> @@ -245,6 +246,8 @@ static int aspeed_wdt_probe(struct platform_device *pdev)
> }
> if (of_property_read_bool(np, "aspeed,external-signal"))
> wdt->ctrl |= WDT_CTRL_WDT_EXT;
> +   if (of_property_read_bool(np, "aspeed,alt-boot"))
> +   wdt->ctrl |= WDT_CTRL_BOOT_SECONDARY;

If a user sets this property on the only watchdog in the system, then
they will trigger this behaviour when doing a normal 'reboot'. That
would not be desirable.

We could mitigate this by:

- not registering the watchdog as a reboot device if this property is enabled

- clearing the WDT_CTRL_BOOT_SECONDARY bit in the aspeed_wdt_restart path

The second option is neater as it allows the watchdog to behave
normally when only one is enabled. The first would be confusing if a
system didn't have more than one watchdog enabled.

Cheers,

Joel


Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()

2018-03-12 Thread Kees Cook
On Mon, Mar 12, 2018 at 4:57 PM, Linus Torvalds
 wrote:
> On Mon, Mar 12, 2018 at 3:55 PM, Andrew Morton
>  wrote:
>>
>> Replacing the __builtin_choose_expr() with ?: works of course.
>
> Hmm. That sounds like the right thing to do. We were so myopically
> staring at the __builtin_choose_expr() problem that we overlooked the
> obvious solution.
>
> Using __builtin_constant_p() together with a ?: is in fact our common
> pattern, so that should be fine. The only real reason to use
> __builtin_choose_expr() is if you want to get the *type* to vary
> depending on which side you choose, but that's not an issue for
> min/max.

This doesn't solve it for -Wvla, unfortunately. That was the point of
Josh's original suggestion of __builtin_choose_expr().

Try building with KCFLAGS=-Wval and checking net/ipv6/proc.c:

net/ipv6/proc.c: In function ‘snmp6_seq_show_item’:
net/ipv6/proc.c:198:2: warning: ISO C90 forbids array ‘buff’ whose
size can’t be evaluated [-Wvla]
  unsigned long buff[SNMP_MIB_MAX];
  ^~~~


-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()

2018-03-12 Thread Kees Cook
On Mon, Mar 12, 2018 at 4:57 PM, Linus Torvalds
 wrote:
> On Mon, Mar 12, 2018 at 3:55 PM, Andrew Morton
>  wrote:
>>
>> Replacing the __builtin_choose_expr() with ?: works of course.
>
> Hmm. That sounds like the right thing to do. We were so myopically
> staring at the __builtin_choose_expr() problem that we overlooked the
> obvious solution.
>
> Using __builtin_constant_p() together with a ?: is in fact our common
> pattern, so that should be fine. The only real reason to use
> __builtin_choose_expr() is if you want to get the *type* to vary
> depending on which side you choose, but that's not an issue for
> min/max.

This doesn't solve it for -Wvla, unfortunately. That was the point of
Josh's original suggestion of __builtin_choose_expr().

Try building with KCFLAGS=-Wval and checking net/ipv6/proc.c:

net/ipv6/proc.c: In function ‘snmp6_seq_show_item’:
net/ipv6/proc.c:198:2: warning: ISO C90 forbids array ‘buff’ whose
size can’t be evaluated [-Wvla]
  unsigned long buff[SNMP_MIB_MAX];
  ^~~~


-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH ghak21 V2 2/4] audit: link denied should not directly generate PATH record

2018-03-12 Thread Richard Guy Briggs
On 2018-03-13 02:22, kbuild test robot wrote:
> Hi Richard,
> 
> Thank you for the patch! Yet something to improve:
> 
> [auto build test ERROR on linus/master]
> [also build test ERROR on v4.16-rc5 next-20180309]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]

My fault.  It was applied to a stale tree:
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git

A self-NACKed patch wasn't removed from the upstream tree as hoped.  A
new patch is already in the works.

> url:
> https://github.com/0day-ci/linux/commits/Richard-Guy-Briggs/audit-address-ANOM_LINK-excess-records/20180313-015527
> Note: the 
> linux-review/Richard-Guy-Briggs/audit-address-ANOM_LINK-excess-records/20180313-015527
>  HEAD 12e8c56bcd359f7d20d4ae011674d37bc832bc4c builds fine.
>   It only hurts bisectibility.

- RGB

--
Richard Guy Briggs 
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635


Re: [PATCH ghak21 V2 2/4] audit: link denied should not directly generate PATH record

2018-03-12 Thread Richard Guy Briggs
On 2018-03-13 02:22, kbuild test robot wrote:
> Hi Richard,
> 
> Thank you for the patch! Yet something to improve:
> 
> [auto build test ERROR on linus/master]
> [also build test ERROR on v4.16-rc5 next-20180309]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]

My fault.  It was applied to a stale tree:
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git

A self-NACKed patch wasn't removed from the upstream tree as hoped.  A
new patch is already in the works.

> url:
> https://github.com/0day-ci/linux/commits/Richard-Guy-Briggs/audit-address-ANOM_LINK-excess-records/20180313-015527
> Note: the 
> linux-review/Richard-Guy-Briggs/audit-address-ANOM_LINK-excess-records/20180313-015527
>  HEAD 12e8c56bcd359f7d20d4ae011674d37bc832bc4c builds fine.
>   It only hurts bisectibility.

- RGB

--
Richard Guy Briggs 
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635


linux-next: manual merge of the staging tree with the vfs tree

2018-03-12 Thread Stephen Rothwell
Hi Greg,

Today's linux-next merge of the staging tree got conflicts in:

  drivers/staging/lustre/lnet/libcfs/linux/linux-curproc.c
  drivers/staging/lustre/lnet/libcfs/linux/linux-prim.c

between commit:

  304ec482f562 ("get rid of pointless includes of fs_struct.h")

from the vfs tree and commits:

  37d3b407dc14 ("staging: lustre: remove linux-curproc.c")
  6b7936ceefa7 ("staging: lustre: make signal-blocking functions inline")

from the staging tree.

I fixed it up (I just removed the files) and can carry the fix as
necessary. This is now fixed as far as linux-next is concerned, but any
non trivial conflicts should be mentioned to your upstream maintainer
when your tree is submitted for merging.  You may also want to consider
cooperating with the maintainer of the conflicting tree to minimise any
particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell


pgpBXiGKxsNUk.pgp
Description: OpenPGP digital signature


linux-next: manual merge of the staging tree with the vfs tree

2018-03-12 Thread Stephen Rothwell
Hi Greg,

Today's linux-next merge of the staging tree got conflicts in:

  drivers/staging/lustre/lnet/libcfs/linux/linux-curproc.c
  drivers/staging/lustre/lnet/libcfs/linux/linux-prim.c

between commit:

  304ec482f562 ("get rid of pointless includes of fs_struct.h")

from the vfs tree and commits:

  37d3b407dc14 ("staging: lustre: remove linux-curproc.c")
  6b7936ceefa7 ("staging: lustre: make signal-blocking functions inline")

from the staging tree.

I fixed it up (I just removed the files) and can carry the fix as
necessary. This is now fixed as far as linux-next is concerned, but any
non trivial conflicts should be mentioned to your upstream maintainer
when your tree is submitted for merging.  You may also want to consider
cooperating with the maintainer of the conflicting tree to minimise any
particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell


pgpBXiGKxsNUk.pgp
Description: OpenPGP digital signature


RE: [PATCH] dma-mapping: move dma configuration to bus infrastructure

2018-03-12 Thread Nipun Gupta


> -Original Message-
> From: Sinan Kaya [mailto:ok...@codeaurora.org]
> Sent: Monday, March 12, 2018 22:14
> To: Nipun Gupta ; h...@lst.de;
> robin.mur...@arm.com; li...@armlinux.org.uk; gre...@linuxfoundation.org;
> m.szyprow...@samsung.com; bhelg...@google.com
> Cc: dmitry.torok...@gmail.com; rafael.j.wyso...@intel.com;
> jarkko.sakki...@linux.intel.com; linus.wall...@linaro.org; jo...@kernel.org;
> msucha...@suse.de; linux-kernel@vger.kernel.org; iommu@lists.linux-
> foundation.org; linux-...@vger.kernel.org
> Subject: Re: [PATCH] dma-mapping: move dma configuration to bus
> infrastructure
> 
> On 3/12/2018 11:24 AM, Nipun Gupta wrote:
> > +   if (dma_dev->of_node) {
> > +   ret = of_dma_configure(dev, dma_dev->of_node);
> > +   } else if (has_acpi_companion(dma_dev)) {
> > +   attr = acpi_get_dma_attr(to_acpi_device_node(dma_dev-
> >fwnode));
> > +   if (attr != DEV_DMA_NOT_SUPPORTED)
> > +   ret = acpi_dma_configure(dev, attr);
> > +   }
> > +
> > +   pci_put_host_bridge_device(bridge);
> > +
> > +   return ret;
> > +}
> > +
> > +void pci_dma_deconfigure(struct device *dev)
> > +{
> > +   of_dma_deconfigure(dev);
> > +   acpi_dma_deconfigure(dev);
> > +}
> 
> Isn't this one or the other one but not both?
> 
> Something like:
> 
> if (dev->of_node)
>   of_dma_deconfigure(dev);
> else
>   acpi_dma_deconfigure(dev);
> 
> should work.

I understand your point. Seems reasonable as we should not expect
the 'of/acpi DMA deconfigure' API to not fail when they are not configured.

But, here we would also need to get dma_device (just as we get in
'pci_dma_configure') and need a check on it as for PCI there 'of_node'
is present in the dma_dev.

Ill update this in v2, and also make similar changes for platform and AMBA bus.

Thanks,
Nipun

> 
> --
> Sinan Kaya
> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
> Technologies, Inc.
> Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux
> Foundation Collaborative Project.


RE: [PATCH] dma-mapping: move dma configuration to bus infrastructure

2018-03-12 Thread Nipun Gupta


> -Original Message-
> From: Sinan Kaya [mailto:ok...@codeaurora.org]
> Sent: Monday, March 12, 2018 22:14
> To: Nipun Gupta ; h...@lst.de;
> robin.mur...@arm.com; li...@armlinux.org.uk; gre...@linuxfoundation.org;
> m.szyprow...@samsung.com; bhelg...@google.com
> Cc: dmitry.torok...@gmail.com; rafael.j.wyso...@intel.com;
> jarkko.sakki...@linux.intel.com; linus.wall...@linaro.org; jo...@kernel.org;
> msucha...@suse.de; linux-kernel@vger.kernel.org; iommu@lists.linux-
> foundation.org; linux-...@vger.kernel.org
> Subject: Re: [PATCH] dma-mapping: move dma configuration to bus
> infrastructure
> 
> On 3/12/2018 11:24 AM, Nipun Gupta wrote:
> > +   if (dma_dev->of_node) {
> > +   ret = of_dma_configure(dev, dma_dev->of_node);
> > +   } else if (has_acpi_companion(dma_dev)) {
> > +   attr = acpi_get_dma_attr(to_acpi_device_node(dma_dev-
> >fwnode));
> > +   if (attr != DEV_DMA_NOT_SUPPORTED)
> > +   ret = acpi_dma_configure(dev, attr);
> > +   }
> > +
> > +   pci_put_host_bridge_device(bridge);
> > +
> > +   return ret;
> > +}
> > +
> > +void pci_dma_deconfigure(struct device *dev)
> > +{
> > +   of_dma_deconfigure(dev);
> > +   acpi_dma_deconfigure(dev);
> > +}
> 
> Isn't this one or the other one but not both?
> 
> Something like:
> 
> if (dev->of_node)
>   of_dma_deconfigure(dev);
> else
>   acpi_dma_deconfigure(dev);
> 
> should work.

I understand your point. Seems reasonable as we should not expect
the 'of/acpi DMA deconfigure' API to not fail when they are not configured.

But, here we would also need to get dma_device (just as we get in
'pci_dma_configure') and need a check on it as for PCI there 'of_node'
is present in the dma_dev.

Ill update this in v2, and also make similar changes for platform and AMBA bus.

Thanks,
Nipun

> 
> --
> Sinan Kaya
> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
> Technologies, Inc.
> Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux
> Foundation Collaborative Project.


Re: Timing performance regression 4.15 to 4.16-rc1

2018-03-12 Thread Al Viro
On Sun, Mar 11, 2018 at 02:02:03PM +0100, Zdenek Kabelac wrote:
> Hi
> 
> I've noticed quite some drop of performance (around 15% in some cases) where
> execution of lvm2 tests took longer time - and while tests itself should not
> really load CPU system  - the overall running time just got bigger.
> 
> Running bisect game pointed clearly to this commit:
> 
> ---
> 
> commit 44c02a2c3dc55835e9f0d8ef73966406cd805001
> Author: Al Viro 
> Date:   Thu Oct 5 12:59:44 2017 -0400
> 
> dev_ioctl(): move copyin/copyout to callers
> 
> 
> ---
> 
> clear revert of this commit on top of 3266b5bd97eaa72793df0b6e5a106c69ccc166c4
> (recent  ~4.16-rc4)  restored timing of tests back.
> 
> 
> I'm not sure why - so at this moment this is just a patch causing 15% longer
> running times on lvm2 test suite.

I find it very hard to believe, TBH.  It doesn't go anywhere near drivers/md;
could you profile the damn thing and see where does it spend that extra time?

It really doesn't touch anything on non-sockets...


Re: dcache: remove trylock loops (was Re: [BUG] lock_parent() breakage when used from shrink_dentry_list())

2018-03-12 Thread Eric W. Biederman
Al Viro  writes:

> On Tue, Mar 13, 2018 at 12:37:51AM +, Al Viro wrote:
>> On Mon, Mar 12, 2018 at 06:52:31PM -0500, Eric W. Biederman wrote:
>> 
>> > Ah.  I see now there is now the s_roots list that handles
>> > that bit of strangeness.
>> > 
>> > So one path is to simply remove the heuristic from
>> > path_connected.
>> > 
>> > Another path is to have nfsv2 and nfsv3 not set s_root at all.
>> > Leaving the heuristic working for the rest of the filesystems,
>> > and generally simplifying the code.
>> > 
>> > Something like the diff below I should think.
>> 
>> > +  /* Leave nfsv2 and nfsv3 s_root == NULL */
>> 
>> Now, grep fs/super.c for s_root.  Or try to boot it, for that
>> matter...
>
> BTW, if rename happens on server and we step into directory
> we'd already seen in one subtree while doing a lookup in
> another, we will get it moved around.  Without having the
> subtrees ever connected in dcache on client.  So adding
> && IS_ROOT(sb->s_root) to the test also won't work.

Nope.  We fundamentally need to call is_subdir in the nfs case
to ensure we don't have crazy problems.

I believe below is the obviously correct fix (that still preserves some
caching).  I need to look at nilfs as it also calls d_obtain_root.  I am
also wondering if some of the other network filesystems might be
susceptible to problems caused by renames on the server.

It is tempting to be more clever and not consider NFS_MOUNT_UNSHARED
mounts or mounts without mulitple s_roots but there can be renames
on the server that should trip up even those cases.  At least if
anything figures out how to trigger the dentries in the path to ever get
revalidated.

Eric

diff --git a/fs/namei.c b/fs/namei.c
index 921ae32dbc80..cafa365eeb70 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -559,9 +559,10 @@ static int __nd_alloc_stack(struct nameidata *nd)
 static bool path_connected(const struct path *path)
 {
struct vfsmount *mnt = path->mnt;
+   struct super_block *sb = mnt->mnt_sb;
 
-   /* Only bind mounts can have disconnected paths */
-   if (mnt->mnt_root == mnt->mnt_sb->s_root)
+   /* Bind mounts and multi-root filesystems can have disconnected paths */
+   if (!(sb->s_iflags & SB_I_MULTIROOT) && (mnt->mnt_root == sb->s_root))
return true;
 
return is_subdir(path->dentry, mnt->mnt_root);
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 29bacdc56f6a..64129a72f312 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -2631,6 +2631,7 @@ struct dentry *nfs_fs_mount_common(struct nfs_server 
*server,
/* initial superblock/root creation */
mount_info->fill_super(s, mount_info);
nfs_get_cache_cookie(s, mount_info->parsed, mount_info->cloned);
+   s->s_iflags |= SB_I_MULTIROOT;
}
 
mntroot = nfs_get_root(s, mount_info->mntfh, dev_name);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2a815560fda0..0430e03febaa 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1317,6 +1317,7 @@ extern int send_sigurg(struct fown_struct *fown);
 #define SB_I_CGROUPWB  0x0001  /* cgroup-aware writeback enabled */
 #define SB_I_NOEXEC0x0002  /* Ignore executables on this fs */
 #define SB_I_NODEV 0x0004  /* Ignore devices on this fs */
+#define SB_I_MULTIROOT 0x0008  /* Multiple roots to the dentry tree */
 
 /* sb->s_iflags to limit user namespace mounts */
 #define SB_I_USERNS_VISIBLE0x0010 /* fstype already mounted */


Re: Timing performance regression 4.15 to 4.16-rc1

2018-03-12 Thread Al Viro
On Sun, Mar 11, 2018 at 02:02:03PM +0100, Zdenek Kabelac wrote:
> Hi
> 
> I've noticed quite some drop of performance (around 15% in some cases) where
> execution of lvm2 tests took longer time - and while tests itself should not
> really load CPU system  - the overall running time just got bigger.
> 
> Running bisect game pointed clearly to this commit:
> 
> ---
> 
> commit 44c02a2c3dc55835e9f0d8ef73966406cd805001
> Author: Al Viro 
> Date:   Thu Oct 5 12:59:44 2017 -0400
> 
> dev_ioctl(): move copyin/copyout to callers
> 
> 
> ---
> 
> clear revert of this commit on top of 3266b5bd97eaa72793df0b6e5a106c69ccc166c4
> (recent  ~4.16-rc4)  restored timing of tests back.
> 
> 
> I'm not sure why - so at this moment this is just a patch causing 15% longer
> running times on lvm2 test suite.

I find it very hard to believe, TBH.  It doesn't go anywhere near drivers/md;
could you profile the damn thing and see where does it spend that extra time?

It really doesn't touch anything on non-sockets...


Re: dcache: remove trylock loops (was Re: [BUG] lock_parent() breakage when used from shrink_dentry_list())

2018-03-12 Thread Eric W. Biederman
Al Viro  writes:

> On Tue, Mar 13, 2018 at 12:37:51AM +, Al Viro wrote:
>> On Mon, Mar 12, 2018 at 06:52:31PM -0500, Eric W. Biederman wrote:
>> 
>> > Ah.  I see now there is now the s_roots list that handles
>> > that bit of strangeness.
>> > 
>> > So one path is to simply remove the heuristic from
>> > path_connected.
>> > 
>> > Another path is to have nfsv2 and nfsv3 not set s_root at all.
>> > Leaving the heuristic working for the rest of the filesystems,
>> > and generally simplifying the code.
>> > 
>> > Something like the diff below I should think.
>> 
>> > +  /* Leave nfsv2 and nfsv3 s_root == NULL */
>> 
>> Now, grep fs/super.c for s_root.  Or try to boot it, for that
>> matter...
>
> BTW, if rename happens on server and we step into directory
> we'd already seen in one subtree while doing a lookup in
> another, we will get it moved around.  Without having the
> subtrees ever connected in dcache on client.  So adding
> && IS_ROOT(sb->s_root) to the test also won't work.

Nope.  We fundamentally need to call is_subdir in the nfs case
to ensure we don't have crazy problems.

I believe below is the obviously correct fix (that still preserves some
caching).  I need to look at nilfs as it also calls d_obtain_root.  I am
also wondering if some of the other network filesystems might be
susceptible to problems caused by renames on the server.

It is tempting to be more clever and not consider NFS_MOUNT_UNSHARED
mounts or mounts without mulitple s_roots but there can be renames
on the server that should trip up even those cases.  At least if
anything figures out how to trigger the dentries in the path to ever get
revalidated.

Eric

diff --git a/fs/namei.c b/fs/namei.c
index 921ae32dbc80..cafa365eeb70 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -559,9 +559,10 @@ static int __nd_alloc_stack(struct nameidata *nd)
 static bool path_connected(const struct path *path)
 {
struct vfsmount *mnt = path->mnt;
+   struct super_block *sb = mnt->mnt_sb;
 
-   /* Only bind mounts can have disconnected paths */
-   if (mnt->mnt_root == mnt->mnt_sb->s_root)
+   /* Bind mounts and multi-root filesystems can have disconnected paths */
+   if (!(sb->s_iflags & SB_I_MULTIROOT) && (mnt->mnt_root == sb->s_root))
return true;
 
return is_subdir(path->dentry, mnt->mnt_root);
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 29bacdc56f6a..64129a72f312 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -2631,6 +2631,7 @@ struct dentry *nfs_fs_mount_common(struct nfs_server 
*server,
/* initial superblock/root creation */
mount_info->fill_super(s, mount_info);
nfs_get_cache_cookie(s, mount_info->parsed, mount_info->cloned);
+   s->s_iflags |= SB_I_MULTIROOT;
}
 
mntroot = nfs_get_root(s, mount_info->mntfh, dev_name);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2a815560fda0..0430e03febaa 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1317,6 +1317,7 @@ extern int send_sigurg(struct fown_struct *fown);
 #define SB_I_CGROUPWB  0x0001  /* cgroup-aware writeback enabled */
 #define SB_I_NOEXEC0x0002  /* Ignore executables on this fs */
 #define SB_I_NODEV 0x0004  /* Ignore devices on this fs */
+#define SB_I_MULTIROOT 0x0008  /* Multiple roots to the dentry tree */
 
 /* sb->s_iflags to limit user namespace mounts */
 #define SB_I_USERNS_VISIBLE0x0010 /* fstype already mounted */


Re: [RESEND PATCH v3 1/3] drm/panel: refactor INNOLUX P079ZCA panel driver

2018-03-12 Thread hl

Hi Thierry Reding,


On Monday, March 12, 2018 05:21 PM, Thierry Reding wrote:

On Mon, Dec 04, 2017 at 03:17:48PM +0800, Lin Huang wrote:

Refactor Innolux P079ZCA panel driver, let it support
multi panel.

Signed-off-by: Lin Huang 
---
Changes in v2:
- Change regulator property name to meet the panel datasheet
Changes in v3:
- this patch only refactor P079ZCA panel to support multi panel, support 
P097PFG panel in another patch

  drivers/gpu/drm/panel/panel-innolux-p079zca.c | 147 ++
  1 file changed, 105 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/panel/panel-innolux-p079zca.c 
b/drivers/gpu/drm/panel/panel-innolux-p079zca.c
index 6ba9344..1597744 100644
--- a/drivers/gpu/drm/panel/panel-innolux-p079zca.c
+++ b/drivers/gpu/drm/panel/panel-innolux-p079zca.c
@@ -20,12 +20,32 @@
  
  #include 
  
+struct panel_desc {

+   const struct drm_display_mode *modes;
+   unsigned int bpc;
+   struct {
+   unsigned int width;
+   unsigned int height;
+   } size;
+};
+
+struct panel_desc_dsi {
+   struct panel_desc desc;
+
+   unsigned long flags;
+   enum mipi_dsi_pixel_format format;
+   unsigned int lanes;
+};

There's no need for the two layers here. Just move everything from
struct panel_desc_dsi into struct panel_desc.

Okay, will fix it.



+
  struct innolux_panel {
struct drm_panel base;
struct mipi_dsi_device *link;
+   const struct panel_desc_dsi *dsi_desc;

And then this can just become:

const struct panel_desc *desc;

The _dsi suffix is redundant because this driver is exclusively for DSI
devices.


Got it.

-static int innolux_panel_add(struct innolux_panel *innolux)
+static int innolux_panel_add(struct mipi_dsi_device *dsi,
+const struct panel_desc_dsi *desc)
  {
-   struct device *dev = >link->dev;
+   struct innolux_panel *innolux;
+   struct device *dev = >dev;
struct device_node *np;
int err;
  
-	innolux->supply = devm_regulator_get(dev, "power");

-   if (IS_ERR(innolux->supply))
-   return PTR_ERR(innolux->supply);
+   innolux = devm_kzalloc(dev, sizeof(*innolux), GFP_KERNEL);
+   if (!innolux)
+   return -ENOMEM;
+
+   innolux->dsi_desc = desc;
+   innolux->vddi = devm_regulator_get(dev, "power");
+   if (IS_ERR(innolux->vddi))
+   return PTR_ERR(innolux->vddi);
+
+   innolux->avdd = devm_regulator_get(dev, "avdd");
+   if (IS_ERR(innolux->avdd))
+   return PTR_ERR(innolux->avdd);
+
+   innolux->avee = devm_regulator_get(dev, "avee");
+   if (IS_ERR(innolux->avee))
+   return PTR_ERR(innolux->avee);

According to the device tree bindings these regulators are all optional.
Now devm_regulator_get() will return a dummy regulator if one has not
been specified in DT, so this seems like it should work fine, even for
existing DTBs that don't have the avdd and avee regulators. However, the
DT bindings seem to be wrong, because these are in fact required
regulators.
Actually, the vddi is request, and avdd and avee is optional, i will 
only check

vddi error later.
  
  	innolux->enable_gpio = devm_gpiod_get_optional(dev, "enable",

   GPIOD_OUT_HIGH);
@@ -243,14 +306,16 @@ static int innolux_panel_add(struct innolux_panel 
*innolux)
  
  	drm_panel_init(>base);

innolux->base.funcs = _panel_funcs;
-   innolux->base.dev = >link->dev;
+   innolux->base.dev = dev;
  
  	err = drm_panel_add(>base);

if (err < 0)
goto put_backlight;
  
-	return 0;

+   dev_set_drvdata(dev, innolux);
+   innolux->link = dsi;
  
+	return 0;

  put_backlight:
put_device(>backlight->dev);
  
@@ -267,28 +332,25 @@ static void innolux_panel_del(struct innolux_panel *innolux)
  
  static int innolux_panel_probe(struct mipi_dsi_device *dsi)

  {
-   struct innolux_panel *innolux;
-   int err;
  
-	dsi->lanes = 4;

-   dsi->format = MIPI_DSI_FMT_RGB888;
-   dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_SYNC_PULSE |
- MIPI_DSI_MODE_LPM;
-
-   innolux = devm_kzalloc(>dev, sizeof(*innolux), GFP_KERNEL);
-   if (!innolux)
-   return -ENOMEM;
+   const struct panel_desc_dsi *dsi_desc;
+   const struct of_device_id *id;
+   int err;
  
-	mipi_dsi_set_drvdata(dsi, innolux);

+   id = of_match_node(innolux_of_match, dsi->dev.of_node);
+   if (!id)
+   return -ENODEV;
  
-	innolux->link = dsi;

+   dsi_desc = id->data;
+   dsi->mode_flags = dsi_desc->flags;
+   dsi->format = dsi_desc->format;
+   dsi->lanes = dsi_desc->lanes;
  
-	err = innolux_panel_add(innolux);

+   err = innolux_panel_add(dsi, dsi_desc);
if (err < 0)
return err;
  
-	err = mipi_dsi_attach(dsi);

-   return err;
+  

Re: [RESEND PATCH v3 1/3] drm/panel: refactor INNOLUX P079ZCA panel driver

2018-03-12 Thread hl

Hi Thierry Reding,


On Monday, March 12, 2018 05:21 PM, Thierry Reding wrote:

On Mon, Dec 04, 2017 at 03:17:48PM +0800, Lin Huang wrote:

Refactor Innolux P079ZCA panel driver, let it support
multi panel.

Signed-off-by: Lin Huang 
---
Changes in v2:
- Change regulator property name to meet the panel datasheet
Changes in v3:
- this patch only refactor P079ZCA panel to support multi panel, support 
P097PFG panel in another patch

  drivers/gpu/drm/panel/panel-innolux-p079zca.c | 147 ++
  1 file changed, 105 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/panel/panel-innolux-p079zca.c 
b/drivers/gpu/drm/panel/panel-innolux-p079zca.c
index 6ba9344..1597744 100644
--- a/drivers/gpu/drm/panel/panel-innolux-p079zca.c
+++ b/drivers/gpu/drm/panel/panel-innolux-p079zca.c
@@ -20,12 +20,32 @@
  
  #include 
  
+struct panel_desc {

+   const struct drm_display_mode *modes;
+   unsigned int bpc;
+   struct {
+   unsigned int width;
+   unsigned int height;
+   } size;
+};
+
+struct panel_desc_dsi {
+   struct panel_desc desc;
+
+   unsigned long flags;
+   enum mipi_dsi_pixel_format format;
+   unsigned int lanes;
+};

There's no need for the two layers here. Just move everything from
struct panel_desc_dsi into struct panel_desc.

Okay, will fix it.



+
  struct innolux_panel {
struct drm_panel base;
struct mipi_dsi_device *link;
+   const struct panel_desc_dsi *dsi_desc;

And then this can just become:

const struct panel_desc *desc;

The _dsi suffix is redundant because this driver is exclusively for DSI
devices.


Got it.

-static int innolux_panel_add(struct innolux_panel *innolux)
+static int innolux_panel_add(struct mipi_dsi_device *dsi,
+const struct panel_desc_dsi *desc)
  {
-   struct device *dev = >link->dev;
+   struct innolux_panel *innolux;
+   struct device *dev = >dev;
struct device_node *np;
int err;
  
-	innolux->supply = devm_regulator_get(dev, "power");

-   if (IS_ERR(innolux->supply))
-   return PTR_ERR(innolux->supply);
+   innolux = devm_kzalloc(dev, sizeof(*innolux), GFP_KERNEL);
+   if (!innolux)
+   return -ENOMEM;
+
+   innolux->dsi_desc = desc;
+   innolux->vddi = devm_regulator_get(dev, "power");
+   if (IS_ERR(innolux->vddi))
+   return PTR_ERR(innolux->vddi);
+
+   innolux->avdd = devm_regulator_get(dev, "avdd");
+   if (IS_ERR(innolux->avdd))
+   return PTR_ERR(innolux->avdd);
+
+   innolux->avee = devm_regulator_get(dev, "avee");
+   if (IS_ERR(innolux->avee))
+   return PTR_ERR(innolux->avee);

According to the device tree bindings these regulators are all optional.
Now devm_regulator_get() will return a dummy regulator if one has not
been specified in DT, so this seems like it should work fine, even for
existing DTBs that don't have the avdd and avee regulators. However, the
DT bindings seem to be wrong, because these are in fact required
regulators.
Actually, the vddi is request, and avdd and avee is optional, i will 
only check

vddi error later.
  
  	innolux->enable_gpio = devm_gpiod_get_optional(dev, "enable",

   GPIOD_OUT_HIGH);
@@ -243,14 +306,16 @@ static int innolux_panel_add(struct innolux_panel 
*innolux)
  
  	drm_panel_init(>base);

innolux->base.funcs = _panel_funcs;
-   innolux->base.dev = >link->dev;
+   innolux->base.dev = dev;
  
  	err = drm_panel_add(>base);

if (err < 0)
goto put_backlight;
  
-	return 0;

+   dev_set_drvdata(dev, innolux);
+   innolux->link = dsi;
  
+	return 0;

  put_backlight:
put_device(>backlight->dev);
  
@@ -267,28 +332,25 @@ static void innolux_panel_del(struct innolux_panel *innolux)
  
  static int innolux_panel_probe(struct mipi_dsi_device *dsi)

  {
-   struct innolux_panel *innolux;
-   int err;
  
-	dsi->lanes = 4;

-   dsi->format = MIPI_DSI_FMT_RGB888;
-   dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_SYNC_PULSE |
- MIPI_DSI_MODE_LPM;
-
-   innolux = devm_kzalloc(>dev, sizeof(*innolux), GFP_KERNEL);
-   if (!innolux)
-   return -ENOMEM;
+   const struct panel_desc_dsi *dsi_desc;
+   const struct of_device_id *id;
+   int err;
  
-	mipi_dsi_set_drvdata(dsi, innolux);

+   id = of_match_node(innolux_of_match, dsi->dev.of_node);
+   if (!id)
+   return -ENODEV;
  
-	innolux->link = dsi;

+   dsi_desc = id->data;
+   dsi->mode_flags = dsi_desc->flags;
+   dsi->format = dsi_desc->format;
+   dsi->lanes = dsi_desc->lanes;
  
-	err = innolux_panel_add(innolux);

+   err = innolux_panel_add(dsi, dsi_desc);
if (err < 0)
return err;
  
-	err = mipi_dsi_attach(dsi);

-   return err;
+   return 

Re: [PATCH v12 0/6] Address error and recovery for AER and DPC

2018-03-12 Thread Sinan Kaya
On 3/12/2018 7:26 PM, Keith Busch wrote:
> On Mon, Mar 12, 2018 at 02:47:30PM -0500, Bjorn Helgaas wrote:
>> [+cc Alex]
>>
>> On Mon, Mar 12, 2018 at 08:25:51AM -0600, Keith Busch wrote:
>>> On Sun, Mar 11, 2018 at 11:03:58PM -0400, Sinan Kaya wrote:
 On 3/11/2018 6:03 PM, Bjorn Helgaas wrote:
> On Wed, Feb 28, 2018 at 10:34:11PM +0530, Oza Pawandeep wrote:

> That difference has been there since the beginning of DPC, so it has
> nothing to do with *this* series EXCEPT for the fact that it really
> complicates the logic you're adding to reset_link() and
> broadcast_error_message().
>
> We ought to be able to simplify that somehow because the only real
> difference between AER and DPC should be that DPC automatically
> disables the link and AER does it in software.

 I agree this should be possible. Code execution path should be almost
 identical to fatal error case.

 Is there any reason why you went to stop driver path, Keith?
>>>
>>> The fact is the link is truly down during a DPC event. When the link
>>> is enabled again, you don't know at that point if the device(s) on the
>>> other side have changed.
>>
>> When DPC is triggered, the port takes the link down.  When we handle
>> an uncorrectable (nonfatal or fatal) AER event, software takes the
>> link down.
>>
>> In both cases, devices on the other side are at least reset.  Whenever
>> the link goes down, it's conceivable the device could be replaced with
>> a different one before the link comes back up.  Is this why you remove
>> and re-enumerate?  (See tangent [1] below.)
> 
> Yes. Truthfully, DPC events due to changing topologies was the motivating
> use case when we initially developed this. We were also going for
> simplicity (at least initially), and remove + re-enumerate seemed
> safe without concerning this driver with other capability regsiters, or
> coordinating with/depending on other drivers. For example, a successful
> reset may depend on any particular driver calling pci_restore_state from
> a good saved state.

The spec is recommending code to use "Hotplug Surprise" to differentiate
these two cases we are looking for. 

The use case Keith is looking for is for hotplug support. 
The case I and Oza are more interested is for error handling on platforms
with no hotplug support.

According to the spec, if "Hotplug Surprise" is set in slot capabilities,
then hotplug driver handles link up and DPC driver doesn't interfere with
its operation. Hotplug driver observes link up interrupt like it is doing today.
When link up event is observed, hotplug driver will do the enumeration.

If "Hotplug Surprise" bit is not set, it is the job of the DPC driver to
bring up the link. I believe this path should follow the AER driver path
as there is a very well defined error reporting and recovery framework
in the code.

The link comes back up automatically when DPC driver handles its interrupt
very similar to what secondary bus reset does for AER. I don't believe
there is a hotplug possibility under this condition since it is not supported
to begin with.

Should we plumb the "Hotplug Surprise" condition into the code to satisfy
both cases and leave the error handling path according to this code series?

> 
>> The point is that from the device's hardware perspective, these
>> scenarios are the same (it sent a ERR_NONFATAL or ERR_FATAL message
>> and it sees the link go down).  I think we should make them the same
>> on the software side, too: the driver should see the same callbacks,
>> in the same order, whether we're doing AER or DPC.
>>
>> If removing and re-enumerating is the right thing for DPC, I think
>> that means it's also the right thing for AER.
>>
-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.


Re: [PATCH v12 0/6] Address error and recovery for AER and DPC

2018-03-12 Thread Sinan Kaya
On 3/12/2018 7:26 PM, Keith Busch wrote:
> On Mon, Mar 12, 2018 at 02:47:30PM -0500, Bjorn Helgaas wrote:
>> [+cc Alex]
>>
>> On Mon, Mar 12, 2018 at 08:25:51AM -0600, Keith Busch wrote:
>>> On Sun, Mar 11, 2018 at 11:03:58PM -0400, Sinan Kaya wrote:
 On 3/11/2018 6:03 PM, Bjorn Helgaas wrote:
> On Wed, Feb 28, 2018 at 10:34:11PM +0530, Oza Pawandeep wrote:

> That difference has been there since the beginning of DPC, so it has
> nothing to do with *this* series EXCEPT for the fact that it really
> complicates the logic you're adding to reset_link() and
> broadcast_error_message().
>
> We ought to be able to simplify that somehow because the only real
> difference between AER and DPC should be that DPC automatically
> disables the link and AER does it in software.

 I agree this should be possible. Code execution path should be almost
 identical to fatal error case.

 Is there any reason why you went to stop driver path, Keith?
>>>
>>> The fact is the link is truly down during a DPC event. When the link
>>> is enabled again, you don't know at that point if the device(s) on the
>>> other side have changed.
>>
>> When DPC is triggered, the port takes the link down.  When we handle
>> an uncorrectable (nonfatal or fatal) AER event, software takes the
>> link down.
>>
>> In both cases, devices on the other side are at least reset.  Whenever
>> the link goes down, it's conceivable the device could be replaced with
>> a different one before the link comes back up.  Is this why you remove
>> and re-enumerate?  (See tangent [1] below.)
> 
> Yes. Truthfully, DPC events due to changing topologies was the motivating
> use case when we initially developed this. We were also going for
> simplicity (at least initially), and remove + re-enumerate seemed
> safe without concerning this driver with other capability regsiters, or
> coordinating with/depending on other drivers. For example, a successful
> reset may depend on any particular driver calling pci_restore_state from
> a good saved state.

The spec is recommending code to use "Hotplug Surprise" to differentiate
these two cases we are looking for. 

The use case Keith is looking for is for hotplug support. 
The case I and Oza are more interested is for error handling on platforms
with no hotplug support.

According to the spec, if "Hotplug Surprise" is set in slot capabilities,
then hotplug driver handles link up and DPC driver doesn't interfere with
its operation. Hotplug driver observes link up interrupt like it is doing today.
When link up event is observed, hotplug driver will do the enumeration.

If "Hotplug Surprise" bit is not set, it is the job of the DPC driver to
bring up the link. I believe this path should follow the AER driver path
as there is a very well defined error reporting and recovery framework
in the code.

The link comes back up automatically when DPC driver handles its interrupt
very similar to what secondary bus reset does for AER. I don't believe
there is a hotplug possibility under this condition since it is not supported
to begin with.

Should we plumb the "Hotplug Surprise" condition into the code to satisfy
both cases and leave the error handling path according to this code series?

> 
>> The point is that from the device's hardware perspective, these
>> scenarios are the same (it sent a ERR_NONFATAL or ERR_FATAL message
>> and it sees the link go down).  I think we should make them the same
>> on the software side, too: the driver should see the same callbacks,
>> in the same order, whether we're doing AER or DPC.
>>
>> If removing and re-enumerating is the right thing for DPC, I think
>> that means it's also the right thing for AER.
>>
-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.


Re: [PATCH v4 3/3 update] mm/free_pcppages_bulk: prefetch buddy while not holding lock

2018-03-12 Thread Aaron Lu
On Mon, Mar 12, 2018 at 10:32:32AM -0700, Dave Hansen wrote:
> On 03/09/2018 12:24 AM, Aaron Lu wrote:
> > +   /*
> > +* We are going to put the page back to the global
> > +* pool, prefetch its buddy to speed up later access
> > +* under zone->lock. It is believed the overhead of
> > +* an additional test and calculating buddy_pfn here
> > +* can be offset by reduced memory latency later. To
> > +* avoid excessive prefetching due to large count, only
> > +* prefetch buddy for the last pcp->batch nr of pages.
> > +*/
> > +   if (count > pcp->batch)
> > +   continue;
> > +   pfn = page_to_pfn(page);
> > +   buddy_pfn = __find_buddy_pfn(pfn, 0);
> > +   buddy = page + (buddy_pfn - pfn);
> > +   prefetch(buddy);
> 
> FWIW, I think this needs to go into a helper function.  Is that possible?

I'll give it a try.

> 
> There's too much logic happening here.  Also, 'count' going from
> batch_size->0 is totally non-obvious from the patch context.  It makes
> this hunk look totally wrong by itself.


Re: [PATCH v4 2/3] mm/free_pcppages_bulk: do not hold lock when picking pages to free

2018-03-12 Thread Aaron Lu
On Mon, Mar 12, 2018 at 03:22:53PM +0100, Vlastimil Babka wrote:
> On 03/01/2018 07:28 AM, Aaron Lu wrote:
> > When freeing a batch of pages from Per-CPU-Pages(PCP) back to buddy,
> > the zone->lock is held and then pages are chosen from PCP's migratetype
> > list. While there is actually no need to do this 'choose part' under
> > lock since it's PCP pages, the only CPU that can touch them is us and
> > irq is also disabled.
> > 
> > Moving this part outside could reduce lock held time and improve
> > performance. Test with will-it-scale/page_fault1 full load:
> > 
> > kernel  Broadwell(2S)  Skylake(2S)   Broadwell(4S)  Skylake(4S)
> > v4.16-rc2+  90342157971818   13667135   15677465
> > this patch  9536374 +5.6%  8314710 +4.3% 14070408 +3.0% 16675866 +6.4%
> > 
> > What the test does is: starts $nr_cpu processes and each will repeatedly
> > do the following for 5 minutes:
> > 1 mmap 128M anonymouse space;
> > 2 write access to that space;
> > 3 munmap.
> > The score is the aggregated iteration.
> > 
> > https://github.com/antonblanchard/will-it-scale/blob/master/tests/page_fault1.c
> > 
> > Acked-by: Mel Gorman 
> > Signed-off-by: Aaron Lu 
> > ---
> >  mm/page_alloc.c | 39 +++
> >  1 file changed, 23 insertions(+), 16 deletions(-)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index faa33eac1635..dafdcdec9c1f 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1116,12 +1116,10 @@ static void free_pcppages_bulk(struct zone *zone, 
> > int count,
> > int migratetype = 0;
> > int batch_free = 0;
> > bool isolated_pageblocks;
> > -
> > -   spin_lock(>lock);
> > -   isolated_pageblocks = has_isolate_pageblock(zone);
> > +   struct page *page, *tmp;
> > +   LIST_HEAD(head);
> >  
> > while (count) {
> > -   struct page *page;
> > struct list_head *list;
> >  
> > /*
> > @@ -1143,27 +1141,36 @@ static void free_pcppages_bulk(struct zone *zone, 
> > int count,
> > batch_free = count;
> >  
> > do {
> > -   int mt; /* migratetype of the to-be-freed page */
> > -
> > page = list_last_entry(list, struct page, lru);
> > -   /* must delete as __free_one_page list manipulates */
> > +   /* must delete to avoid corrupting pcp list */
> > list_del(>lru);
> 
> Well, since bulkfree_pcp_prepare() doesn't care about page->lru, you
> could maybe use list_move_tail() instead of list_del() +
> list_add_tail()? That avoids temporarily writing poison values.

Good point, except bulkfree_pcp_prepare() could return error and then
the page will need to be removed from the to-be-freed list, like this:

do {
page = list_last_entry(list, struct page, lru);
list_move_tail(>lru, );
pcp->count--;

if (bulkfree_pcp_prepare(page))
list_del(>lru);
} while (--count && --batch_free && !list_empty(list));

Considering bulkfree_pcp_prepare() returning error is the rare case,
this list_del() should rarely happen. At the same time, this part is
outside of zone->lock and can hardly impact performance...so I'm not
sure.
 
> Hm actually, you are reversing the list in the process, because page is
> obtained by list_last_entry and you use list_add_tail. That could have
> unintended performance consequences?

True the order is changed when these to-be-freed pages are in this
temporary list, but then they are iterated and freed one by one from
head to tail so the order they landed in free_list is the same as
before the patch(also the same as they are in pcp list).

> 
> Also maybe list_cut_position() could be faster than shuffling pages one
> by one? I guess not really, because batch_free will be generally low?

We will need to know where to cut if list_cut_position() is to be used
and to find that out, the list will need to be iterated first. I guess
that's too much trouble.

Since this part of code is per-cpu(outside of zone->lock) and these
pages are in pcp(meaning their cachelines are not likely in remote),
I didn't worry too much about not being able to list_cut_position() but
iterate. On allocation side though, when manipulating the global
free_list under zone->lock, this is a big problem since pages there are
freed from different CPUs and the cache could be cold for the allocating
CPU. That is why I'm proposing clusted allocation sometime ago as an RFC
patch where list_cut_position() is so good that it could eliminate the
cacheline miss issue since we do not need to iterate cold pages one by
one.

I wish there is a data structure that has the flexibility of list while
at the same time we can locate the Nth element in the list without the
need to iterate. That's what I'm looking for when 

Re: [PATCH v4 3/3 update] mm/free_pcppages_bulk: prefetch buddy while not holding lock

2018-03-12 Thread Aaron Lu
On Mon, Mar 12, 2018 at 10:32:32AM -0700, Dave Hansen wrote:
> On 03/09/2018 12:24 AM, Aaron Lu wrote:
> > +   /*
> > +* We are going to put the page back to the global
> > +* pool, prefetch its buddy to speed up later access
> > +* under zone->lock. It is believed the overhead of
> > +* an additional test and calculating buddy_pfn here
> > +* can be offset by reduced memory latency later. To
> > +* avoid excessive prefetching due to large count, only
> > +* prefetch buddy for the last pcp->batch nr of pages.
> > +*/
> > +   if (count > pcp->batch)
> > +   continue;
> > +   pfn = page_to_pfn(page);
> > +   buddy_pfn = __find_buddy_pfn(pfn, 0);
> > +   buddy = page + (buddy_pfn - pfn);
> > +   prefetch(buddy);
> 
> FWIW, I think this needs to go into a helper function.  Is that possible?

I'll give it a try.

> 
> There's too much logic happening here.  Also, 'count' going from
> batch_size->0 is totally non-obvious from the patch context.  It makes
> this hunk look totally wrong by itself.


Re: [PATCH v4 2/3] mm/free_pcppages_bulk: do not hold lock when picking pages to free

2018-03-12 Thread Aaron Lu
On Mon, Mar 12, 2018 at 03:22:53PM +0100, Vlastimil Babka wrote:
> On 03/01/2018 07:28 AM, Aaron Lu wrote:
> > When freeing a batch of pages from Per-CPU-Pages(PCP) back to buddy,
> > the zone->lock is held and then pages are chosen from PCP's migratetype
> > list. While there is actually no need to do this 'choose part' under
> > lock since it's PCP pages, the only CPU that can touch them is us and
> > irq is also disabled.
> > 
> > Moving this part outside could reduce lock held time and improve
> > performance. Test with will-it-scale/page_fault1 full load:
> > 
> > kernel  Broadwell(2S)  Skylake(2S)   Broadwell(4S)  Skylake(4S)
> > v4.16-rc2+  90342157971818   13667135   15677465
> > this patch  9536374 +5.6%  8314710 +4.3% 14070408 +3.0% 16675866 +6.4%
> > 
> > What the test does is: starts $nr_cpu processes and each will repeatedly
> > do the following for 5 minutes:
> > 1 mmap 128M anonymouse space;
> > 2 write access to that space;
> > 3 munmap.
> > The score is the aggregated iteration.
> > 
> > https://github.com/antonblanchard/will-it-scale/blob/master/tests/page_fault1.c
> > 
> > Acked-by: Mel Gorman 
> > Signed-off-by: Aaron Lu 
> > ---
> >  mm/page_alloc.c | 39 +++
> >  1 file changed, 23 insertions(+), 16 deletions(-)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index faa33eac1635..dafdcdec9c1f 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1116,12 +1116,10 @@ static void free_pcppages_bulk(struct zone *zone, 
> > int count,
> > int migratetype = 0;
> > int batch_free = 0;
> > bool isolated_pageblocks;
> > -
> > -   spin_lock(>lock);
> > -   isolated_pageblocks = has_isolate_pageblock(zone);
> > +   struct page *page, *tmp;
> > +   LIST_HEAD(head);
> >  
> > while (count) {
> > -   struct page *page;
> > struct list_head *list;
> >  
> > /*
> > @@ -1143,27 +1141,36 @@ static void free_pcppages_bulk(struct zone *zone, 
> > int count,
> > batch_free = count;
> >  
> > do {
> > -   int mt; /* migratetype of the to-be-freed page */
> > -
> > page = list_last_entry(list, struct page, lru);
> > -   /* must delete as __free_one_page list manipulates */
> > +   /* must delete to avoid corrupting pcp list */
> > list_del(>lru);
> 
> Well, since bulkfree_pcp_prepare() doesn't care about page->lru, you
> could maybe use list_move_tail() instead of list_del() +
> list_add_tail()? That avoids temporarily writing poison values.

Good point, except bulkfree_pcp_prepare() could return error and then
the page will need to be removed from the to-be-freed list, like this:

do {
page = list_last_entry(list, struct page, lru);
list_move_tail(>lru, );
pcp->count--;

if (bulkfree_pcp_prepare(page))
list_del(>lru);
} while (--count && --batch_free && !list_empty(list));

Considering bulkfree_pcp_prepare() returning error is the rare case,
this list_del() should rarely happen. At the same time, this part is
outside of zone->lock and can hardly impact performance...so I'm not
sure.
 
> Hm actually, you are reversing the list in the process, because page is
> obtained by list_last_entry and you use list_add_tail. That could have
> unintended performance consequences?

True the order is changed when these to-be-freed pages are in this
temporary list, but then they are iterated and freed one by one from
head to tail so the order they landed in free_list is the same as
before the patch(also the same as they are in pcp list).

> 
> Also maybe list_cut_position() could be faster than shuffling pages one
> by one? I guess not really, because batch_free will be generally low?

We will need to know where to cut if list_cut_position() is to be used
and to find that out, the list will need to be iterated first. I guess
that's too much trouble.

Since this part of code is per-cpu(outside of zone->lock) and these
pages are in pcp(meaning their cachelines are not likely in remote),
I didn't worry too much about not being able to list_cut_position() but
iterate. On allocation side though, when manipulating the global
free_list under zone->lock, this is a big problem since pages there are
freed from different CPUs and the cache could be cold for the allocating
CPU. That is why I'm proposing clusted allocation sometime ago as an RFC
patch where list_cut_position() is so good that it could eliminate the
cacheline miss issue since we do not need to iterate cold pages one by
one.

I wish there is a data structure that has the flexibility of list while
at the same time we can locate the Nth element in the list without the
need to iterate. That's what I'm looking for when developing clustered
allocation for order 0 pages. In 

[RFC/PATCH] risc-v: Fix issue ignoring CONFIG_CMDLINE_OVERRIDE

2018-03-12 Thread Moritz Fischer
Fix issue where CONFIG_CMDLINE_OVERRIDE is being
ignored.

Signed-off-by: Trung Tran 
Signed-off-by: Moritz Fischer 
---

Hi all,

not sure this is the right fix, hopefully someone
can chime in and come up with the correct solution.

Thanks,

Moritz
---
 arch/riscv/kernel/setup.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 09f7064e898c..e24191a0c244 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -38,6 +38,8 @@
 #include 
 #include 
 
+static char default_command_line[COMMAND_LINE_SIZE] __initdata = 
CONFIG_CMDLINE;
+
 #ifdef CONFIG_DUMMY_CONSOLE
 struct screen_info screen_info = {
.orig_video_lines   = 30,
@@ -194,7 +196,11 @@ static void __init setup_bootmem(void)
 
 void __init setup_arch(char **cmdline_p)
 {
+#if defined(CONFIG_CMDLINE_OVERRIDE)
+   *cmdline_p = default_command_line;
+#else
*cmdline_p = boot_command_line;
+#endif
 
parse_early_param();
 
-- 
2.16.2



[RFC/PATCH] risc-v: Fix issue ignoring CONFIG_CMDLINE_OVERRIDE

2018-03-12 Thread Moritz Fischer
Fix issue where CONFIG_CMDLINE_OVERRIDE is being
ignored.

Signed-off-by: Trung Tran 
Signed-off-by: Moritz Fischer 
---

Hi all,

not sure this is the right fix, hopefully someone
can chime in and come up with the correct solution.

Thanks,

Moritz
---
 arch/riscv/kernel/setup.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 09f7064e898c..e24191a0c244 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -38,6 +38,8 @@
 #include 
 #include 
 
+static char default_command_line[COMMAND_LINE_SIZE] __initdata = 
CONFIG_CMDLINE;
+
 #ifdef CONFIG_DUMMY_CONSOLE
 struct screen_info screen_info = {
.orig_video_lines   = 30,
@@ -194,7 +196,11 @@ static void __init setup_bootmem(void)
 
 void __init setup_arch(char **cmdline_p)
 {
+#if defined(CONFIG_CMDLINE_OVERRIDE)
+   *cmdline_p = default_command_line;
+#else
*cmdline_p = boot_command_line;
+#endif
 
parse_early_param();
 
-- 
2.16.2



Re: [PATCH v2 2/2] mfd: rk808: Add restart functionality

2018-03-12 Thread Joseph Chen

Hi, Daniel:

        On Rockchip platforms, we always use CRU to restart system. 
From your patch, I think you would like to support restart by CRU or 
PMIC, right ?


        I think restart by PMIC is not accepted, because PMIC restart 
means all regualtors are reset (voltage drops to 0mv and then power on 
again), including vdd_arm, vdd_logic, vcc_ddr, etc. This is like a cold 
boot for machine.


       We use CRU reset, because we want to keep status of last log, 
GRF, CRU or other registers, but PMIC restart can make them lost.



在 2018/3/12 22:10, Lee Jones 写道:

Rockchip guys,

I'd really appreciate your input on these two patches please.

Please provide Reviewed-by/Tested-by tags.

On Wed, 07 Mar 2018, Daniel Schultz wrote:

When using Rockchip SoCs with rk805/808/818 PMICs, restarts are realized by
setting the reset registers in the "Clock and Reset Unit".

Since all three shutdown functions have almost the same code, all logic
from the shutdown functions can be refactored to a new function
"rk808_update_bits", which can update a register by a given address and
bitmask.

After that, notifier blocks and the restart functions were added to the
driver. Like the shutdown function, the restart is bound to the
"rockchip,system-power-controller" device tree property and can easily
revoked to the software restart by removing the property.

Signed-off-by: Daniel Schultz 
---
Changes:
v2: Re-submit with recipients from Rockchip.

  drivers/mfd/rk808.c   | 97 ++-
  include/linux/mfd/rk808.h |  1 +
  2 files changed, 63 insertions(+), 35 deletions(-)

diff --git a/drivers/mfd/rk808.c b/drivers/mfd/rk808.c
index d138721..2c68f8c 100644
--- a/drivers/mfd/rk808.c
+++ b/drivers/mfd/rk808.c
@@ -27,6 +27,7 @@
  #include 
  #include 
  #include 
+#include 
  
  struct rk808_reg_data {

int addr;
@@ -369,59 +370,73 @@ static const struct regmap_irq_chip rk818_irq_chip = {
  
  static struct i2c_client *rk808_i2c_client;
  
-static void rk805_device_shutdown(void)

+static void rk808_update_bits(unsigned int reg, unsigned int bit_mask)
  {
int ret;
struct rk808 *rk808 = i2c_get_clientdata(rk808_i2c_client);
  
  	if (!rk808) {

dev_warn(_i2c_client->dev,
-"have no rk805, so do nothing here\n");
+"have no %s, so do nothing here\n",
+rk808->regmap_irq_chip->name);
return;
}
  
  	ret = regmap_update_bits(rk808->regmap,

-RK805_DEV_CTRL_REG,
-DEV_OFF, DEV_OFF);
+reg,
+bit_mask, bit_mask);
if (ret)
-   dev_err(_i2c_client->dev, "power off error!\n");
+   dev_err(_i2c_client->dev, "can't write to DEVCTRL: %x!\n",
+   ret);
  }
  
-static void rk808_device_shutdown(void)

+static void rk805_device_shutdown(void)
  {
-   int ret;
-   struct rk808 *rk808 = i2c_get_clientdata(rk808_i2c_client);
-
-   if (!rk808) {
-   dev_warn(_i2c_client->dev,
-"have no rk808, so do nothing here\n");
-   return;
-   }
+   rk808_update_bits(RK805_DEV_CTRL_REG, DEV_OFF);
+}
+static int rk805_restart_notify(struct notifier_block *this,
+  unsigned long mode, void *cmd)
+{
+   rk808_update_bits(RK805_DEV_CTRL_REG, DEV_OFF_RST);
+   return NOTIFY_DONE;
+}
  
-	ret = regmap_update_bits(rk808->regmap,

-RK808_DEVCTRL_REG,
-DEV_OFF_RST, DEV_OFF_RST);
-   if (ret)
-   dev_err(_i2c_client->dev, "power off error!\n");
+static void rk808_device_shutdown(void)
+{
+   rk808_update_bits(RK808_DEVCTRL_REG, DEV_OFF_RST);
+}
+static int rk808_restart_notify(struct notifier_block *this,
+  unsigned long mode, void *cmd)
+{
+   rk808_update_bits(RK808_DEVCTRL_REG, DEV_OFF);
+   return NOTIFY_DONE;
  }
  
  static void rk818_device_shutdown(void)

  {
-   int ret;
-   struct rk808 *rk808 = i2c_get_clientdata(rk808_i2c_client);
+   rk808_update_bits(RK818_DEVCTRL_REG, DEV_OFF_RST);
+}
+static int rk818_restart_notify(struct notifier_block *this,
+  unsigned long mode, void *cmd)
+{
+   rk808_update_bits(RK818_DEVCTRL_REG, DEV_OFF);
+   return NOTIFY_DONE;
+}
  
-	if (!rk808) {

-   dev_warn(_i2c_client->dev,
-"have no rk818, so do nothing here\n");
-   return;
-   }
+static struct notifier_block rk805_restart_handler = {
+   .notifier_call = rk805_restart_notify,
+   .priority = 196,
+};
  
-	ret = regmap_update_bits(rk808->regmap,

-RK818_DEVCTRL_REG,
-DEV_OFF_RST, 

Re: [PATCH v2 2/2] mfd: rk808: Add restart functionality

2018-03-12 Thread Joseph Chen

Hi, Daniel:

        On Rockchip platforms, we always use CRU to restart system. 
From your patch, I think you would like to support restart by CRU or 
PMIC, right ?


        I think restart by PMIC is not accepted, because PMIC restart 
means all regualtors are reset (voltage drops to 0mv and then power on 
again), including vdd_arm, vdd_logic, vcc_ddr, etc. This is like a cold 
boot for machine.


       We use CRU reset, because we want to keep status of last log, 
GRF, CRU or other registers, but PMIC restart can make them lost.



在 2018/3/12 22:10, Lee Jones 写道:

Rockchip guys,

I'd really appreciate your input on these two patches please.

Please provide Reviewed-by/Tested-by tags.

On Wed, 07 Mar 2018, Daniel Schultz wrote:

When using Rockchip SoCs with rk805/808/818 PMICs, restarts are realized by
setting the reset registers in the "Clock and Reset Unit".

Since all three shutdown functions have almost the same code, all logic
from the shutdown functions can be refactored to a new function
"rk808_update_bits", which can update a register by a given address and
bitmask.

After that, notifier blocks and the restart functions were added to the
driver. Like the shutdown function, the restart is bound to the
"rockchip,system-power-controller" device tree property and can easily
revoked to the software restart by removing the property.

Signed-off-by: Daniel Schultz 
---
Changes:
v2: Re-submit with recipients from Rockchip.

  drivers/mfd/rk808.c   | 97 ++-
  include/linux/mfd/rk808.h |  1 +
  2 files changed, 63 insertions(+), 35 deletions(-)

diff --git a/drivers/mfd/rk808.c b/drivers/mfd/rk808.c
index d138721..2c68f8c 100644
--- a/drivers/mfd/rk808.c
+++ b/drivers/mfd/rk808.c
@@ -27,6 +27,7 @@
  #include 
  #include 
  #include 
+#include 
  
  struct rk808_reg_data {

int addr;
@@ -369,59 +370,73 @@ static const struct regmap_irq_chip rk818_irq_chip = {
  
  static struct i2c_client *rk808_i2c_client;
  
-static void rk805_device_shutdown(void)

+static void rk808_update_bits(unsigned int reg, unsigned int bit_mask)
  {
int ret;
struct rk808 *rk808 = i2c_get_clientdata(rk808_i2c_client);
  
  	if (!rk808) {

dev_warn(_i2c_client->dev,
-"have no rk805, so do nothing here\n");
+"have no %s, so do nothing here\n",
+rk808->regmap_irq_chip->name);
return;
}
  
  	ret = regmap_update_bits(rk808->regmap,

-RK805_DEV_CTRL_REG,
-DEV_OFF, DEV_OFF);
+reg,
+bit_mask, bit_mask);
if (ret)
-   dev_err(_i2c_client->dev, "power off error!\n");
+   dev_err(_i2c_client->dev, "can't write to DEVCTRL: %x!\n",
+   ret);
  }
  
-static void rk808_device_shutdown(void)

+static void rk805_device_shutdown(void)
  {
-   int ret;
-   struct rk808 *rk808 = i2c_get_clientdata(rk808_i2c_client);
-
-   if (!rk808) {
-   dev_warn(_i2c_client->dev,
-"have no rk808, so do nothing here\n");
-   return;
-   }
+   rk808_update_bits(RK805_DEV_CTRL_REG, DEV_OFF);
+}
+static int rk805_restart_notify(struct notifier_block *this,
+  unsigned long mode, void *cmd)
+{
+   rk808_update_bits(RK805_DEV_CTRL_REG, DEV_OFF_RST);
+   return NOTIFY_DONE;
+}
  
-	ret = regmap_update_bits(rk808->regmap,

-RK808_DEVCTRL_REG,
-DEV_OFF_RST, DEV_OFF_RST);
-   if (ret)
-   dev_err(_i2c_client->dev, "power off error!\n");
+static void rk808_device_shutdown(void)
+{
+   rk808_update_bits(RK808_DEVCTRL_REG, DEV_OFF_RST);
+}
+static int rk808_restart_notify(struct notifier_block *this,
+  unsigned long mode, void *cmd)
+{
+   rk808_update_bits(RK808_DEVCTRL_REG, DEV_OFF);
+   return NOTIFY_DONE;
  }
  
  static void rk818_device_shutdown(void)

  {
-   int ret;
-   struct rk808 *rk808 = i2c_get_clientdata(rk808_i2c_client);
+   rk808_update_bits(RK818_DEVCTRL_REG, DEV_OFF_RST);
+}
+static int rk818_restart_notify(struct notifier_block *this,
+  unsigned long mode, void *cmd)
+{
+   rk808_update_bits(RK818_DEVCTRL_REG, DEV_OFF);
+   return NOTIFY_DONE;
+}
  
-	if (!rk808) {

-   dev_warn(_i2c_client->dev,
-"have no rk818, so do nothing here\n");
-   return;
-   }
+static struct notifier_block rk805_restart_handler = {
+   .notifier_call = rk805_restart_notify,
+   .priority = 196,
+};
  
-	ret = regmap_update_bits(rk808->regmap,

-RK818_DEVCTRL_REG,
-DEV_OFF_RST, DEV_OFF_RST);
-   if (ret)

Re: [PATCH] video: fbdev: via: remove VLA usage

2018-03-12 Thread Gustavo A. R. Silva



On 03/09/2018 05:02 AM, Emil Velikov wrote:

On 9 March 2018 at 10:59, Eric Engestrom  wrote:

On Thursday, 2018-03-08 11:39:49 -0600, Gustavo A. R. Silva wrote:

In preparation to enabling -Wvla, remove VLA usage.

Also, fixed as part of the directive to remove all VLAs from
the kernel: https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Gustavo A. R. Silva 


Sorry for my reply on v1, I missed that Emil already replied to another
post of the same patch (side note on that, making use of `-vX` and
`--in-reply-to` helps track the evolution of patches)



Yep. I know. It is just that some people prefer vX in a new thread. 
Anyway, I will take this into account for the next time.



This looks good to me:
Reviewed-by: Eric Engestrom 


Same here
Reviewed-by: Emil Velikov 

-Emil



Thanks for reviewing it, guys.
--
Gustavo A. R. Silva


Re: [PATCH] video: fbdev: via: remove VLA usage

2018-03-12 Thread Gustavo A. R. Silva



On 03/09/2018 05:02 AM, Emil Velikov wrote:

On 9 March 2018 at 10:59, Eric Engestrom  wrote:

On Thursday, 2018-03-08 11:39:49 -0600, Gustavo A. R. Silva wrote:

In preparation to enabling -Wvla, remove VLA usage.

Also, fixed as part of the directive to remove all VLAs from
the kernel: https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Gustavo A. R. Silva 


Sorry for my reply on v1, I missed that Emil already replied to another
post of the same patch (side note on that, making use of `-vX` and
`--in-reply-to` helps track the evolution of patches)



Yep. I know. It is just that some people prefer vX in a new thread. 
Anyway, I will take this into account for the next time.



This looks good to me:
Reviewed-by: Eric Engestrom 


Same here
Reviewed-by: Emil Velikov 

-Emil



Thanks for reviewing it, guys.
--
Gustavo A. R. Silva


Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-12 Thread Sinan Kaya
On 3/12/2018 3:35 PM, Logan Gunthorpe wrote:
> +int pci_p2pdma_add_client(struct list_head *head, struct device *dev)

It feels like code tried to be a generic p2pdma provider first. Then got
converted to PCI, yet all dev parameters are still struct device.

Maybe, dev parameter should also be struct pci_dev so that you can get rid of
all to_pci_dev() calls in this code including find_parent_pci_dev() function.

Regarding the switch business, It is amazing how much trouble you went into
limit this functionality into very specific hardware.

I thought that we reached to an agreement that code would not impose
any limits on what user wants.

What happened to all the emails we exchanged?

I understand you are coming from what you tested. Considering that you
are singing up for providing a generic PCI functionality into the kernel,
why don't you just blacklist the products that you had problems with and
yet still allow other architectures to use your code with their root ports?

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.


Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-12 Thread Sinan Kaya
On 3/12/2018 3:35 PM, Logan Gunthorpe wrote:
> +int pci_p2pdma_add_client(struct list_head *head, struct device *dev)

It feels like code tried to be a generic p2pdma provider first. Then got
converted to PCI, yet all dev parameters are still struct device.

Maybe, dev parameter should also be struct pci_dev so that you can get rid of
all to_pci_dev() calls in this code including find_parent_pci_dev() function.

Regarding the switch business, It is amazing how much trouble you went into
limit this functionality into very specific hardware.

I thought that we reached to an agreement that code would not impose
any limits on what user wants.

What happened to all the emails we exchanged?

I understand you are coming from what you tested. Considering that you
are singing up for providing a generic PCI functionality into the kernel,
why don't you just blacklist the products that you had problems with and
yet still allow other architectures to use your code with their root ports?

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.


Re: [PATCH] perf stat: Add support for s390 transaction counters

2018-03-12 Thread Andi Kleen
Thomas Richter  writes:

> Right now there is only hard coded support for x86.

That's not true. There is support for generic transaction events in perf.

As far as I can tell your events would map 1:1 to the generic tx-* events.

-Andi


Re: [PATCH] perf stat: Add support for s390 transaction counters

2018-03-12 Thread Andi Kleen
Thomas Richter  writes:

> Right now there is only hard coded support for x86.

That's not true. There is support for generic transaction events in perf.

As far as I can tell your events would map 1:1 to the generic tx-* events.

-Andi


Re: [PATCH 1/2] rtc: s5m: Move enum from rtc.h to rtc-s5m.c

2018-03-12 Thread Gustavo A. R. Silva



On 03/11/2018 11:39 AM, Krzysztof Kozlowski wrote:

On Sat, Mar 10, 2018 at 7:27 AM, Gustavo A. R. Silva
 wrote:

Move this enum to rtc-s5m.c once it is meaningless to others drivers [1].

[1] https://marc.info/?l=linux-rtc=152060068925948=2


Instead of external link (which might or might not work soon) you can
just put "Suggested-by: Krzysztof Kozlowski ".



I got it.
I'll do that next time.


Anyway:
Reviewed-by: Krzysztof Kozlowski 



Thanks for reviewing it.
--
Gustavo


Re: [PATCH 1/2] rtc: s5m: Move enum from rtc.h to rtc-s5m.c

2018-03-12 Thread Gustavo A. R. Silva



On 03/11/2018 11:39 AM, Krzysztof Kozlowski wrote:

On Sat, Mar 10, 2018 at 7:27 AM, Gustavo A. R. Silva
 wrote:

Move this enum to rtc-s5m.c once it is meaningless to others drivers [1].

[1] https://marc.info/?l=linux-rtc=152060068925948=2


Instead of external link (which might or might not work soon) you can
just put "Suggested-by: Krzysztof Kozlowski ".



I got it.
I'll do that next time.


Anyway:
Reviewed-by: Krzysztof Kozlowski 



Thanks for reviewing it.
--
Gustavo


[PATCH v2] netfilter: nf_tables: remove VLA usage

2018-03-12 Thread Gustavo A. R. Silva
In preparation to enabling -Wvla, remove VLA and replace it
with dynamic memory allocation.

>From a security viewpoint, the use of Variable Length Arrays can be
a vector for stack overflow attacks. Also, in general, as the code
evolves it is easy to lose track of how big a VLA can get. Thus, we
can end up having segfaults that are hard to debug.

Also, fixed as part of the directive to remove all VLAs from
the kernel: https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Gustavo A. R. Silva 
---
Changes in v2:
 - Use kmalloc_array instead of kcalloc.

 net/netfilter/nf_tables_api.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 3f815b6..5a42e97 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4357,16 +4357,20 @@ static struct nft_object *nft_obj_init(const struct 
nft_ctx *ctx,
   const struct nft_object_type *type,
   const struct nlattr *attr)
 {
-   struct nlattr *tb[type->maxattr + 1];
+   struct nlattr **tb;
const struct nft_object_ops *ops;
struct nft_object *obj;
-   int err;
+   int err = -ENOMEM;
+
+   tb = kmalloc_array(type->maxattr + 1, sizeof(*tb), GFP_KERNEL);
+   if (!tb)
+   goto err1;
 
if (attr) {
err = nla_parse_nested(tb, type->maxattr, attr, type->policy,
   NULL);
if (err < 0)
-   goto err1;
+   goto err2;
} else {
memset(tb, 0, sizeof(tb[0]) * (type->maxattr + 1));
}
@@ -4375,7 +4379,7 @@ static struct nft_object *nft_obj_init(const struct 
nft_ctx *ctx,
ops = type->select_ops(ctx, (const struct nlattr * const *)tb);
if (IS_ERR(ops)) {
err = PTR_ERR(ops);
-   goto err1;
+   goto err2;
}
} else {
ops = type->ops;
@@ -4383,18 +4387,21 @@ static struct nft_object *nft_obj_init(const struct 
nft_ctx *ctx,
 
err = -ENOMEM;
obj = kzalloc(sizeof(*obj) + ops->size, GFP_KERNEL);
-   if (obj == NULL)
-   goto err1;
+   if (!obj)
+   goto err2;
 
err = ops->init(ctx, (const struct nlattr * const *)tb, obj);
if (err < 0)
-   goto err2;
+   goto err3;
 
obj->ops = ops;
 
+   kfree(tb);
return obj;
-err2:
+err3:
kfree(obj);
+err2:
+   kfree(tb);
 err1:
return ERR_PTR(err);
 }
-- 
2.7.4



[PATCH v2] netfilter: nf_tables: remove VLA usage

2018-03-12 Thread Gustavo A. R. Silva
In preparation to enabling -Wvla, remove VLA and replace it
with dynamic memory allocation.

>From a security viewpoint, the use of Variable Length Arrays can be
a vector for stack overflow attacks. Also, in general, as the code
evolves it is easy to lose track of how big a VLA can get. Thus, we
can end up having segfaults that are hard to debug.

Also, fixed as part of the directive to remove all VLAs from
the kernel: https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Gustavo A. R. Silva 
---
Changes in v2:
 - Use kmalloc_array instead of kcalloc.

 net/netfilter/nf_tables_api.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 3f815b6..5a42e97 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4357,16 +4357,20 @@ static struct nft_object *nft_obj_init(const struct 
nft_ctx *ctx,
   const struct nft_object_type *type,
   const struct nlattr *attr)
 {
-   struct nlattr *tb[type->maxattr + 1];
+   struct nlattr **tb;
const struct nft_object_ops *ops;
struct nft_object *obj;
-   int err;
+   int err = -ENOMEM;
+
+   tb = kmalloc_array(type->maxattr + 1, sizeof(*tb), GFP_KERNEL);
+   if (!tb)
+   goto err1;
 
if (attr) {
err = nla_parse_nested(tb, type->maxattr, attr, type->policy,
   NULL);
if (err < 0)
-   goto err1;
+   goto err2;
} else {
memset(tb, 0, sizeof(tb[0]) * (type->maxattr + 1));
}
@@ -4375,7 +4379,7 @@ static struct nft_object *nft_obj_init(const struct 
nft_ctx *ctx,
ops = type->select_ops(ctx, (const struct nlattr * const *)tb);
if (IS_ERR(ops)) {
err = PTR_ERR(ops);
-   goto err1;
+   goto err2;
}
} else {
ops = type->ops;
@@ -4383,18 +4387,21 @@ static struct nft_object *nft_obj_init(const struct 
nft_ctx *ctx,
 
err = -ENOMEM;
obj = kzalloc(sizeof(*obj) + ops->size, GFP_KERNEL);
-   if (obj == NULL)
-   goto err1;
+   if (!obj)
+   goto err2;
 
err = ops->init(ctx, (const struct nlattr * const *)tb, obj);
if (err < 0)
-   goto err2;
+   goto err3;
 
obj->ops = ops;
 
+   kfree(tb);
return obj;
-err2:
+err3:
kfree(obj);
+err2:
+   kfree(tb);
 err1:
return ERR_PTR(err);
 }
-- 
2.7.4



linux-next: manual merge of the tip tree with the arm-soc tree

2018-03-12 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the tip tree got a conflict in:

  drivers/bus/arm-cci.c

between commit:

  3de6be7a3dd8 ("drivers/bus: Split Arm CCI driver")

from the arm-soc tree and commit:

  8343aae66167 ("perf/core: Remove perf_event::group_entry")

from the tip tree.

I fixed it up (I have added the following merge fix patch) and can
carry the fix as necessary. This is now fixed as far as linux-next is
concerned, but any non trivial conflicts should be mentioned to your
upstream maintainer when your tree is submitted for merging.  You may
also want to consider cooperating with the maintainer of the conflicting
tree to minimise any particularly complex conflicts.

From: Stephen Rothwell 
Date: Tue, 13 Mar 2018 14:08:01 +1100
Subject: [PATCH] perf/core: merge fix for "drivers/bus: Split Arm CCI driver"

Signed-off-by: Stephen Rothwell 
---
 drivers/perf/arm-cci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/perf/arm-cci.c b/drivers/perf/arm-cci.c
index 67a74c48c7c2..192d23b760a0 100644
--- a/drivers/perf/arm-cci.c
+++ b/drivers/perf/arm-cci.c
@@ -1265,7 +1265,7 @@ static int validate_group(struct perf_event *event)
if (!validate_event(event->pmu, _pmu, leader))
return -EINVAL;
 
-   list_for_each_entry(sibling, >sibling_list, group_entry) {
+   list_for_each_entry(sibling, >sibling_list, sibling_list) {
if (!validate_event(event->pmu, _pmu, sibling))
return -EINVAL;
}
-- 
2.16.1

-- 
Cheers,
Stephen Rothwell


pgpOwcB5NDoK8.pgp
Description: OpenPGP digital signature


linux-next: manual merge of the tip tree with the arm-soc tree

2018-03-12 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the tip tree got a conflict in:

  drivers/bus/arm-cci.c

between commit:

  3de6be7a3dd8 ("drivers/bus: Split Arm CCI driver")

from the arm-soc tree and commit:

  8343aae66167 ("perf/core: Remove perf_event::group_entry")

from the tip tree.

I fixed it up (I have added the following merge fix patch) and can
carry the fix as necessary. This is now fixed as far as linux-next is
concerned, but any non trivial conflicts should be mentioned to your
upstream maintainer when your tree is submitted for merging.  You may
also want to consider cooperating with the maintainer of the conflicting
tree to minimise any particularly complex conflicts.

From: Stephen Rothwell 
Date: Tue, 13 Mar 2018 14:08:01 +1100
Subject: [PATCH] perf/core: merge fix for "drivers/bus: Split Arm CCI driver"

Signed-off-by: Stephen Rothwell 
---
 drivers/perf/arm-cci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/perf/arm-cci.c b/drivers/perf/arm-cci.c
index 67a74c48c7c2..192d23b760a0 100644
--- a/drivers/perf/arm-cci.c
+++ b/drivers/perf/arm-cci.c
@@ -1265,7 +1265,7 @@ static int validate_group(struct perf_event *event)
if (!validate_event(event->pmu, _pmu, leader))
return -EINVAL;
 
-   list_for_each_entry(sibling, >sibling_list, group_entry) {
+   list_for_each_entry(sibling, >sibling_list, sibling_list) {
if (!validate_event(event->pmu, _pmu, sibling))
return -EINVAL;
}
-- 
2.16.1

-- 
Cheers,
Stephen Rothwell


pgpOwcB5NDoK8.pgp
Description: OpenPGP digital signature


Re: [PATCH V3 0/4] genirq/affinity: irq vector spread among online CPUs as far as possible

2018-03-12 Thread Dou Liyang

Hi Thomas,

At 03/09/2018 11:08 PM, Thomas Gleixner wrote:
[...]


I'm not sure if there is a clear indicator whether physcial hotplug is
supported or not, but the ACPI folks (x86) and architecture maintainers

+cc Rafael


should be able to answer that question. I have a machine which says:

smpboot: Allowing 128 CPUs, 96 hotplug CPUs

There is definitely no way to hotplug anything on that machine and sure the


AFAIK, in ACPI based dynamic reconfiguration, there is no clear
indicator. In theory, If the ACPI tables have the hotpluggable
CPU resources, the OS can support physical hotplug.

For your machine, Did your CPUs support multi-threading, but not enable
it?

And, sometimes we should not trust the number of possible CPUs. I also
met the situation that BIOS told to ACPI that it could support physical
CPUs hotplug, But actually, there was no hardware slots in the machine.
the ACPI tables like user inputs which should be validated when we use.


existing spread algorithm will waste vectors to no end.

Sure then there is virt, which can pretend to have a gazillion of possible
hotpluggable CPUs, but virt is an insanity on its own. Though someone might
come up with reasonable heuristics for that as well.

Thoughts?


Do we have to map the vectors to CPU statically? Can we map them when
we hotplug/enable the possible CPU?

Thanks,

dou




Re: [PATCH V3 0/4] genirq/affinity: irq vector spread among online CPUs as far as possible

2018-03-12 Thread Dou Liyang

Hi Thomas,

At 03/09/2018 11:08 PM, Thomas Gleixner wrote:
[...]


I'm not sure if there is a clear indicator whether physcial hotplug is
supported or not, but the ACPI folks (x86) and architecture maintainers

+cc Rafael


should be able to answer that question. I have a machine which says:

smpboot: Allowing 128 CPUs, 96 hotplug CPUs

There is definitely no way to hotplug anything on that machine and sure the


AFAIK, in ACPI based dynamic reconfiguration, there is no clear
indicator. In theory, If the ACPI tables have the hotpluggable
CPU resources, the OS can support physical hotplug.

For your machine, Did your CPUs support multi-threading, but not enable
it?

And, sometimes we should not trust the number of possible CPUs. I also
met the situation that BIOS told to ACPI that it could support physical
CPUs hotplug, But actually, there was no hardware slots in the machine.
the ACPI tables like user inputs which should be validated when we use.


existing spread algorithm will waste vectors to no end.

Sure then there is virt, which can pretend to have a gazillion of possible
hotpluggable CPUs, but virt is an insanity on its own. Though someone might
come up with reasonable heuristics for that as well.

Thoughts?


Do we have to map the vectors to CPU statically? Can we map them when
we hotplug/enable the possible CPU?

Thanks,

dou




Re: [PATCH] net: hns: use put_device() if device_register fail

2018-03-12 Thread arvindY



On Monday 12 March 2018 10:59 PM, Richard Weinberger wrote:

On Mon, Mar 12, 2018 at 5:27 PM, arvindY  wrote:


On Monday 12 March 2018 08:13 PM, David Miller wrote:

From: Arvind Yadav 
Date: Fri,  9 Mar 2018 16:11:17 +0530


if device_register() returned an error! Always use put_device()
to give up the reference initialized.

Signed-off-by: Arvind Yadav 

I do not see anything giving cls_dev an initial non-zero reference
count before this device_register() call.

Yes,  you are correct there is nothing to release (hnae_release).

Wait, this is not what DaveM said.
Since you sent also patches to MTD I care about this too.

Do we have to call put_device() in any case after a failure of
device_register() or not?
In this case the release function is empty, so nothing is going to released.
But technically a put_device() is needed and correct according to your
change log.

I have to admit I don't know all details of the driver core, maybe you
can help me.
device_register() calls device_initialize() followed by device_add().
device_initialize() does a kobject_init() which again sets the
reference counter to 1 via kref_init().
In the next step device_add() does a get_device() --> reference
counter goes up to 2.
But in the exit path of the function a put_device() is done, also in
case of an error.
So device_register() always returns with reference count 1.

If I understand this correctly a put_device() is mandatory.
Also in this driver.
Can you please enlighten me? :-)


Oh, I just miss understood david question.

But what ever you are telling that is correct.
device_initialize() will call kref_init() which is setting
refcount 1 by refcount_set(>refcount, 1).
and device_add() will call kref_get() to increment refcount
by calling refcount_inc(>refcount).
So we need put_device() to decrement and release the
kboject.
Internally it'll call kref_put() -> refcount_dec_and_test(>refcount)
to decrement refcount.

~arvind



Re: [PATCH] net: hns: use put_device() if device_register fail

2018-03-12 Thread arvindY



On Monday 12 March 2018 10:59 PM, Richard Weinberger wrote:

On Mon, Mar 12, 2018 at 5:27 PM, arvindY  wrote:


On Monday 12 March 2018 08:13 PM, David Miller wrote:

From: Arvind Yadav 
Date: Fri,  9 Mar 2018 16:11:17 +0530


if device_register() returned an error! Always use put_device()
to give up the reference initialized.

Signed-off-by: Arvind Yadav 

I do not see anything giving cls_dev an initial non-zero reference
count before this device_register() call.

Yes,  you are correct there is nothing to release (hnae_release).

Wait, this is not what DaveM said.
Since you sent also patches to MTD I care about this too.

Do we have to call put_device() in any case after a failure of
device_register() or not?
In this case the release function is empty, so nothing is going to released.
But technically a put_device() is needed and correct according to your
change log.

I have to admit I don't know all details of the driver core, maybe you
can help me.
device_register() calls device_initialize() followed by device_add().
device_initialize() does a kobject_init() which again sets the
reference counter to 1 via kref_init().
In the next step device_add() does a get_device() --> reference
counter goes up to 2.
But in the exit path of the function a put_device() is done, also in
case of an error.
So device_register() always returns with reference count 1.

If I understand this correctly a put_device() is mandatory.
Also in this driver.
Can you please enlighten me? :-)


Oh, I just miss understood david question.

But what ever you are telling that is correct.
device_initialize() will call kref_init() which is setting
refcount 1 by refcount_set(>refcount, 1).
and device_add() will call kref_get() to increment refcount
by calling refcount_inc(>refcount).
So we need put_device() to decrement and release the
kboject.
Internally it'll call kref_put() -> refcount_dec_and_test(>refcount)
to decrement refcount.

~arvind



Re: [PATCH][drm-next] drm/i915/gvt: fix spelling mistake: "destoried" -> "destroyed"

2018-03-12 Thread Zhenyu Wang
On 2018.03.12 12:43:58 +0100, Colin King wrote:
> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in gvt_err error message text.
> 
> Signed-off-by: Colin Ian King 
> ---

Thanks Colin, will pick up.

>  drivers/gpu/drm/i915/gvt/gtt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c
> index 0a100a288e6d..8eb76becd676 100644
> --- a/drivers/gpu/drm/i915/gvt/gtt.c
> +++ b/drivers/gpu/drm/i915/gvt/gtt.c
> @@ -2046,7 +2046,7 @@ static void intel_vgpu_destroy_all_ppgtt_mm(struct 
> intel_vgpu *vgpu)
>   }
>  
>   if (GEM_WARN_ON(!list_empty(>gtt.ppgtt_mm_list_head)))
> - gvt_err("vgpu ppgtt mm is not fully destoried\n");
> + gvt_err("vgpu ppgtt mm is not fully destroyed\n");
>  
>   if (GEM_WARN_ON(!radix_tree_empty(>gtt.spt_tree))) {
>   gvt_err("Why we still has spt not freed?\n");
> -- 
> 2.15.1
> 

-- 
Open Source Technology Center, Intel ltd.

$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827


signature.asc
Description: PGP signature


Re: [PATCH][drm-next] drm/i915/gvt: fix spelling mistake: "destoried" -> "destroyed"

2018-03-12 Thread Zhenyu Wang
On 2018.03.12 12:43:58 +0100, Colin King wrote:
> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in gvt_err error message text.
> 
> Signed-off-by: Colin Ian King 
> ---

Thanks Colin, will pick up.

>  drivers/gpu/drm/i915/gvt/gtt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c
> index 0a100a288e6d..8eb76becd676 100644
> --- a/drivers/gpu/drm/i915/gvt/gtt.c
> +++ b/drivers/gpu/drm/i915/gvt/gtt.c
> @@ -2046,7 +2046,7 @@ static void intel_vgpu_destroy_all_ppgtt_mm(struct 
> intel_vgpu *vgpu)
>   }
>  
>   if (GEM_WARN_ON(!list_empty(>gtt.ppgtt_mm_list_head)))
> - gvt_err("vgpu ppgtt mm is not fully destoried\n");
> + gvt_err("vgpu ppgtt mm is not fully destroyed\n");
>  
>   if (GEM_WARN_ON(!radix_tree_empty(>gtt.spt_tree))) {
>   gvt_err("Why we still has spt not freed?\n");
> -- 
> 2.15.1
> 

-- 
Open Source Technology Center, Intel ltd.

$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827


signature.asc
Description: PGP signature


Re: [PATCH] device_handler: remove VLAs

2018-03-12 Thread Martin K. Petersen

Stephen,

> In preparation to enabling -Wvla, remove VLAs and replace them with
> fixed-length arrays instead.
>
> scsi_dh_{alua,emc,rdac} use variable-length array declarations to
> store command blocks, with the appropriate size as determined by
> COMMAND_SIZE. This patch replaces these with fixed-sized arrays using
> MAX_COMMAND_SIZE, so that the array size can be determined at compile
> time.

Applied to 4.17/scsi-queue. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] device_handler: remove VLAs

2018-03-12 Thread Martin K. Petersen

Stephen,

> In preparation to enabling -Wvla, remove VLAs and replace them with
> fixed-length arrays instead.
>
> scsi_dh_{alua,emc,rdac} use variable-length array declarations to
> store command blocks, with the appropriate size as determined by
> COMMAND_SIZE. This patch replaces these with fixed-sized arrays using
> MAX_COMMAND_SIZE, so that the array size can be determined at compile
> time.

Applied to 4.17/scsi-queue. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: eata: drop VLA in reorder()

2018-03-12 Thread Martin K. Petersen

Linus,

> That said, I wonder if the solution to this particular driver is
> "delete it". Because the hardware is truly ancient and nobody sane
> would use it any more.

I'm not aware of anybody actively using these anymore. They are
mid-nineties vintage with an M68K processor onboard. I ran a couple of
these when they were new but haven't had a working board in probably a
decade.

No objections to Salvatore's patch but I have a slight affinity for
retiring unused code over patching it. So unless there are objections...

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: eata: drop VLA in reorder()

2018-03-12 Thread Martin K. Petersen

Linus,

> That said, I wonder if the solution to this particular driver is
> "delete it". Because the hardware is truly ancient and nobody sane
> would use it any more.

I'm not aware of anybody actively using these anymore. They are
mid-nineties vintage with an M68K processor onboard. I ran a couple of
these when they were new but haven't had a working board in probably a
decade.

No objections to Salvatore's patch but I have a slight affinity for
retiring unused code over patching it. So unless there are objections...

-- 
Martin K. Petersen  Oracle Linux Engineering


[PATCH v2] kbuild: check for pkg-config on make {menu,n,g,x}config

2018-03-12 Thread Randy Dunlap
From: Randy Dunlap 

Each of 'make {menu,n,g,x}config' uses (needs) pkg-config to make sure
that other required files are present, but none of these check that
pkg-config itself is present.  Add a check for all 4 of these targets.

Fixes kernel bugzilla #77511:
https://bugzilla.kernel.org/show_bug.cgi?id=77511

Signed-off-by: Randy Dunlap 
---
v2: use 'command -v' instead of 'which'

I'm also OK with just documenting the pkg-config requirement in
Documentation/Changes (= Documentation/process/changes.rst).

 scripts/kconfig/Makefile   |   15 ++-
 scripts/kconfig/check-pkgconfig.sh |   12 
 2 files changed, 26 insertions(+), 1 deletion(-)

--- lnx-416-rc3.orig/scripts/kconfig/Makefile
+++ lnx-416-rc3/scripts/kconfig/Makefile
@@ -160,6 +160,9 @@ help:
@echo  '  xenconfig   - Enable additional options for xen dom0 and 
guest kernel support'
@echo  '  tinyconfig  - Configure the tiniest possible kernel'
 
+# pkg-config check
+check-pkgconfig  := $(srctree)/$(src)/check-pkgconfig.sh
+
 # lxdialog stuff
 check-lxdialog  := $(srctree)/$(src)/lxdialog/check-lxdialog.sh
 
@@ -205,7 +208,17 @@ $(addprefix $(obj)/, mconf.o $(lxdialog)
 $(obj)/dochecklxdialog:
$(Q)$(CONFIG_SHELL) $(check-lxdialog) -check $(HOSTCC) 
$(HOST_EXTRACFLAGS) $(HOSTLOADLIBES_mconf)
 
-always := dochecklxdialog
+# Check that we have pkg-config (used by each of menu/n/g/xconfig)
+PHONY += $(obj)/docheckpkgconfig
+$(addprefix $(obj)/, mconf.o): $(obj)/docheckpkgconfig
+$(addprefix $(obj)/, nconf.o): $(obj)/docheckpkgconfig
+$(addprefix $(obj)/, gconf.o): $(obj)/docheckpkgconfig
+$(addprefix $(obj)/, qconf.o): $(obj)/docheckpkgconfig
+
+$(obj)/docheckpkgconfig:
+   $(Q)$(CONFIG_SHELL) $(check-pkgconfig)
+
+always := docheckpkgconfig dochecklxdialog
 
 # Add environment specific flags
 HOST_EXTRACFLAGS += $(shell $(CONFIG_SHELL) $(srctree)/$(src)/check.sh 
$(HOSTCC) $(HOSTCFLAGS))
--- /dev/null
+++ lnx-416-rc3/scripts/kconfig/check-pkgconfig.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# Check for pkg-config presence
+
+pkgcfg=`command -v pkg-config`
+
+if [ "$pkgcfg" = "" ]; then
+   echo "'make *config' requires 'pkg-config'. Please install it." 1>&2
+   exit 1  # error
+fi
+
+exit 0




[PATCH v2] kbuild: check for pkg-config on make {menu,n,g,x}config

2018-03-12 Thread Randy Dunlap
From: Randy Dunlap 

Each of 'make {menu,n,g,x}config' uses (needs) pkg-config to make sure
that other required files are present, but none of these check that
pkg-config itself is present.  Add a check for all 4 of these targets.

Fixes kernel bugzilla #77511:
https://bugzilla.kernel.org/show_bug.cgi?id=77511

Signed-off-by: Randy Dunlap 
---
v2: use 'command -v' instead of 'which'

I'm also OK with just documenting the pkg-config requirement in
Documentation/Changes (= Documentation/process/changes.rst).

 scripts/kconfig/Makefile   |   15 ++-
 scripts/kconfig/check-pkgconfig.sh |   12 
 2 files changed, 26 insertions(+), 1 deletion(-)

--- lnx-416-rc3.orig/scripts/kconfig/Makefile
+++ lnx-416-rc3/scripts/kconfig/Makefile
@@ -160,6 +160,9 @@ help:
@echo  '  xenconfig   - Enable additional options for xen dom0 and 
guest kernel support'
@echo  '  tinyconfig  - Configure the tiniest possible kernel'
 
+# pkg-config check
+check-pkgconfig  := $(srctree)/$(src)/check-pkgconfig.sh
+
 # lxdialog stuff
 check-lxdialog  := $(srctree)/$(src)/lxdialog/check-lxdialog.sh
 
@@ -205,7 +208,17 @@ $(addprefix $(obj)/, mconf.o $(lxdialog)
 $(obj)/dochecklxdialog:
$(Q)$(CONFIG_SHELL) $(check-lxdialog) -check $(HOSTCC) 
$(HOST_EXTRACFLAGS) $(HOSTLOADLIBES_mconf)
 
-always := dochecklxdialog
+# Check that we have pkg-config (used by each of menu/n/g/xconfig)
+PHONY += $(obj)/docheckpkgconfig
+$(addprefix $(obj)/, mconf.o): $(obj)/docheckpkgconfig
+$(addprefix $(obj)/, nconf.o): $(obj)/docheckpkgconfig
+$(addprefix $(obj)/, gconf.o): $(obj)/docheckpkgconfig
+$(addprefix $(obj)/, qconf.o): $(obj)/docheckpkgconfig
+
+$(obj)/docheckpkgconfig:
+   $(Q)$(CONFIG_SHELL) $(check-pkgconfig)
+
+always := docheckpkgconfig dochecklxdialog
 
 # Add environment specific flags
 HOST_EXTRACFLAGS += $(shell $(CONFIG_SHELL) $(srctree)/$(src)/check.sh 
$(HOSTCC) $(HOSTCFLAGS))
--- /dev/null
+++ lnx-416-rc3/scripts/kconfig/check-pkgconfig.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# Check for pkg-config presence
+
+pkgcfg=`command -v pkg-config`
+
+if [ "$pkgcfg" = "" ]; then
+   echo "'make *config' requires 'pkg-config'. Please install it." 1>&2
+   exit 1  # error
+fi
+
+exit 0




  1   2   3   4   5   6   7   8   9   10   >