date:20201102

Re: [PATCH] powerpc/32s: Setup the early hash table at all time.

2020-11-02 Thread Christophe Leroy


Hi Andreas,

Le 30/10/2020 à 14:11, Andreas Schwab a écrit :

#
# Automatically generated file; DO NOT EDIT.
# Linux/powerpc 5.10.0-rc1 Kernel Configuration
#


I tried again on QEMU with both pmac32_defconfig and your config, and it boots.

I really can't understand what the problem is, because that patch only activates at all time 
something that has been working well when CONFIG_KASAN is set.


Would you mind checking that with that patch reverted, you are able to boot a kernel built with 
CONFIG_KASAN ?


Thanks
Christophe

[Bug 209869] Kernel 5.10-rc1 fails to boot on a PowerMac G4 3,6 at an early stage

2020-11-02 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=209869

Christophe Leroy (christophe.le...@csgroup.eu) changed:

   What|Removed |Added

 CC||christophe.le...@csgroup.eu

--- Comment #1 from Christophe Leroy (christophe.le...@csgroup.eu) ---
Could you try reverting commit
https://github.com/linuxppc/linux/commit/69a1593abdbcf03a76367320d929a8ae7a5e3d71
?

I got another report from someone who has the same problem and bisected it to
that commit.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[PATCH 2/2] powerpc/eeh: Add a debugfs interface to check if a driver supports recovery

2020-11-02 Thread Oliver O'Halloran

If a PCI device's current driver implements the error handling callbacks
EEH can use them to recover the device after an error occurs. For devices
without the error handling callbacks we recover them by removing the device
and re-scanning it so the PCI core puts the device back into a known good
state.

Currently there's no way for userspace to determine if the driver supports
recovery or not which makes it difficult to write automated tests for EEH.
This patch addressing that by adding a debugfs interface for querying if
a specific device can be recovered or not.

Signed-off-by: Oliver O'Halloran 
---
 arch/powerpc/kernel/eeh.c | 50 +++
 1 file changed, 50 insertions(+)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index f9182ff57804..cd60bc1c8701 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1868,6 +1868,53 @@ static const struct file_operations eeh_dev_break_fops = 
{
.read   = eeh_debugfs_dev_usage,
 };
 
+static ssize_t eeh_dev_can_recover(struct file *filp,
+  const char __user *user_buf,
+  size_t count, loff_t *ppos)
+{
+   struct pci_driver *drv;
+   struct pci_dev *pdev;
+   size_t ret;
+
+   pdev = eeh_debug_lookup_pdev(filp, user_buf, count, ppos);
+   if (IS_ERR(pdev))
+   return PTR_ERR(pdev);
+
+   /*
+* In order for error recovery to work the driver needs to implement
+* .error_detected(), so it can quiesce IO to the device, and
+* .slot_reset() so it can re-initialise the device after a reset.
+*
+* Ideally they'd implement .resume() too, but some drivers which
+* we need to support (notably IPR) don't so I guess we can tolerate
+* that.
+*
+* .mmio_enabled() is mostly there as a work-around for devices which
+* take forever to re-init after a hot reset. Implementing that is
+* strictly optional.
+*/
+   drv = pci_dev_driver(pdev);
+   if (drv &&
+   drv->err_handler &&
+   drv->err_handler->error_detected &&
+   drv->err_handler->slot_reset) {
+   ret = count;
+   } else {
+   ret = -EOPNOTSUPP;
+   }
+
+   pci_dev_put(pdev);
+
+   return ret;
+}
+
+static const struct file_operations eeh_dev_can_recover_fops = {
+   .open   = simple_open,
+   .llseek = no_llseek,
+   .write  = eeh_dev_can_recover,
+   .read   = eeh_debugfs_dev_usage,
+};
+
 #endif
 
 static int __init eeh_init_proc(void)
@@ -1892,6 +1939,9 @@ static int __init eeh_init_proc(void)
debugfs_create_file_unsafe("eeh_force_recover", 0600,
powerpc_debugfs_root, NULL,
&eeh_force_recover_fops);
+   debugfs_create_file_unsafe("eeh_dev_can_recover", 0600,
+   powerpc_debugfs_root, NULL,
+   &eeh_dev_can_recover_fops);
eeh_cache_debugfs_init();
 #endif
}
-- 
2.26.2

[PATCH 1/2] powerpc/eeh: Rework pci_dev lookup in debugfs attributes

2020-11-02 Thread Oliver O'Halloran

Pull the string -> pci_dev lookup stuff into a helper function. No functional 
change.

Signed-off-by: Oliver O'Halloran 
---
 arch/powerpc/kernel/eeh.c | 71 ---
 1 file changed, 37 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 813713c9120c..f9182ff57804 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1596,6 +1596,35 @@ static int proc_eeh_show(struct seq_file *m, void *v)
 }
 
 #ifdef CONFIG_DEBUG_FS
+
+
+static struct pci_dev *eeh_debug_lookup_pdev(struct file *filp,
+const char __user *user_buf,
+size_t count, loff_t *ppos)
+{
+   uint32_t domain, bus, dev, fn;
+   struct pci_dev *pdev;
+   char buf[20];
+   int ret;
+
+   memset(buf, 0, sizeof(buf));
+   ret = simple_write_to_buffer(buf, sizeof(buf)-1, ppos, user_buf, count);
+   if (!ret)
+   return ERR_PTR(-EFAULT);
+
+   ret = sscanf(buf, "%x:%x:%x.%x", &domain, &bus, &dev, &fn);
+   if (ret != 4) {
+   pr_err("%s: expected 4 args, got %d\n", __func__, ret);
+   return ERR_PTR(-EINVAL);
+   }
+
+   pdev = pci_get_domain_bus_and_slot(domain, bus, (dev << 3) | fn);
+   if (!pdev)
+   return ERR_PTR(-ENODEV);
+
+   return pdev;
+}
+
 static int eeh_enable_dbgfs_set(void *data, u64 val)
 {
if (val)
@@ -1688,26 +1717,13 @@ static ssize_t eeh_dev_check_write(struct file *filp,
const char __user *user_buf,
size_t count, loff_t *ppos)
 {
-   uint32_t domain, bus, dev, fn;
struct pci_dev *pdev;
struct eeh_dev *edev;
-   char buf[20];
int ret;
 
-   memset(buf, 0, sizeof(buf));
-   ret = simple_write_to_buffer(buf, sizeof(buf)-1, ppos, user_buf, count);
-   if (!ret)
-   return -EFAULT;
-
-   ret = sscanf(buf, "%x:%x:%x.%x", &domain, &bus, &dev, &fn);
-   if (ret != 4) {
-   pr_err("%s: expected 4 args, got %d\n", __func__, ret);
-   return -EINVAL;
-   }
-
-   pdev = pci_get_domain_bus_and_slot(domain, bus, (dev << 3) | fn);
-   if (!pdev)
-   return -ENODEV;
+   pdev = eeh_debug_lookup_pdev(filp, user_buf, count, ppos);
+   if (IS_ERR(pdev))
+   return PTR_ERR(pdev);
 
edev = pci_dev_to_eeh_dev(pdev);
if (!edev) {
@@ -1717,8 +1733,8 @@ static ssize_t eeh_dev_check_write(struct file *filp,
}
 
ret = eeh_dev_check_failure(edev);
-   pci_info(pdev, "eeh_dev_check_failure(%04x:%02x:%02x.%01x) = %d\n",
-   domain, bus, dev, fn, ret);
+   pci_info(pdev, "eeh_dev_check_failure(%s) = %d\n",
+   pci_name(pdev), ret);
 
pci_dev_put(pdev);
 
@@ -1829,25 +1845,12 @@ static ssize_t eeh_dev_break_write(struct file *filp,
const char __user *user_buf,
size_t count, loff_t *ppos)
 {
-   uint32_t domain, bus, dev, fn;
struct pci_dev *pdev;
-   char buf[20];
int ret;
 
-   memset(buf, 0, sizeof(buf));
-   ret = simple_write_to_buffer(buf, sizeof(buf)-1, ppos, user_buf, count);
-   if (!ret)
-   return -EFAULT;
-
-   ret = sscanf(buf, "%x:%x:%x.%x", &domain, &bus, &dev, &fn);
-   if (ret != 4) {
-   pr_err("%s: expected 4 args, got %d\n", __func__, ret);
-   return -EINVAL;
-   }
-
-   pdev = pci_get_domain_bus_and_slot(domain, bus, (dev << 3) | fn);
-   if (!pdev)
-   return -ENODEV;
+   pdev = eeh_debug_lookup_pdev(filp, user_buf, count, ppos);
+   if (IS_ERR(pdev))
+   return PTR_ERR(pdev);
 
ret = eeh_debugfs_break_device(pdev);
pci_dev_put(pdev);
-- 
2.26.2

[PATCH 3/3] selftests/powerpc: Add VF recovery tests

2020-11-02 Thread Oliver O'Halloran

The basic EEH test ignores VFs since we the way the eeh_dev_break debugfs
interface works means that if multiple VFs are enabled we may cause errors
on all them them. However, we can work around that by only enabling a
single VF at a time.

This patch adds some infrastructure for finding SR-IOV capable devices and
enabling / disabling VFs so we can exercise the VF specific EEH recovery
paths. Two new tests are added, one for testing EEH aware devices and one
for EEH un-aware VFs.

Signed-off-by: Oliver O'Halloran 
---
 .../selftests/powerpc/eeh/eeh-functions.sh| 108 ++
 .../selftests/powerpc/eeh/eeh-vf-aware.sh |  45 
 .../selftests/powerpc/eeh/eeh-vf-unaware.sh   |  35 ++
 3 files changed, 188 insertions(+)
 create mode 100755 tools/testing/selftests/powerpc/eeh/eeh-vf-aware.sh
 create mode 100755 tools/testing/selftests/powerpc/eeh/eeh-vf-unaware.sh

diff --git a/tools/testing/selftests/powerpc/eeh/eeh-functions.sh 
b/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
index 32e5b7fbf18a..70daa3925dcb 100644
--- a/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
+++ b/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
@@ -135,3 +135,111 @@ eeh_one_dev() {
return 0;
 }
 
+eeh_has_driver() {
+   test -e /sys/bus/pci/devices/$1/driver;
+   return $?
+}
+
+eeh_can_recover() {
+   # we'll get an IO error if the device's current driver doesn't support
+   # error recovery
+   echo $1 > '/sys/kernel/debug/powerpc/eeh_dev_can_recover' 2>/dev/null
+
+   return $?
+}
+
+eeh_find_all_pfs() {
+   devices=""
+
+   # SR-IOV on pseries requires hypervisor support, so check for that
+   is_pseries=""
+   if grep -q pSeries /proc/cpuinfo ; then
+   if [ ! -f /proc/device-tree/rtas/ibm,open-sriov-allow-unfreeze 
] ||
+  [ ! -f /proc/device-tree/rtas/ibm,open-sriov-map-pe-number ] 
; then
+   return 1;
+   fi
+
+   is_pseries="true"
+   fi
+
+   for dev in `ls -1 /sys/bus/pci/devices/` ; do
+   sysfs="/sys/bus/pci/devices/$dev"
+   if [ ! -e "$sysfs/sriov_numvfs" ] ; then
+   continue
+   fi
+
+   # skip unsupported PFs on pseries
+   if [ -z "$is_pseries" ] &&
+  [ ! -f "$sysfs/of_node/ibm,is-open-sriov-pf" ] &&
+  [ ! -f "$sysfs/of_node/ibm,open-sriov-vf-bar-info" ] ; then
+   continue;
+   fi
+
+   # no driver, no vfs
+   if ! eeh_has_driver $dev ; then
+   continue
+   fi
+
+   devices="$devices $dev"
+   done
+
+   if [ -z "$devices" ] ; then
+   return 1;
+   fi
+
+   echo $devices
+   return 0;
+}
+
+# attempts to enable one VF on each PF so we can do VF specific tests.
+# stdout: list of enabled VFs, one per line
+# return code: 0 if vfs are found, 1 otherwise
+eeh_enable_vfs() {
+   pf_list="$(eeh_find_all_pfs)"
+
+   vfs=0
+   for dev in $pf_list ; do
+   pf_sysfs="/sys/bus/pci/devices/$dev"
+
+   # make sure we have a single VF
+   echo 0 > "$pf_sysfs/sriov_numvfs"
+   echo 1 > "$pf_sysfs/sriov_numvfs"
+   if [ "$?" != 0 ] ; then
+   log "Unable to enable VFs on $pf, skipping"
+   continue;
+   fi
+
+   vf="$(basename $(realpath "$pf_sysfs/virtfn0"))"
+   if [ $? != 0 ] ; then
+   log "unable to find enabled vf on $pf"
+   echo 0 > "$pf_sysfs/sriov_numvfs"
+   continue;
+   fi
+
+   if ! eeh_can_break $vf ; then
+   log "skipping "
+
+   echo 0 > "$pf_sysfs/sriov_numvfs"
+   continue;
+   fi
+
+   vfs="$((vfs + 1))"
+   echo $vf
+   done
+
+   test "$vfs" != 0
+   return $?
+}
+
+eeh_disable_vfs() {
+   pf_list="$(eeh_find_all_pfs)"
+   if [ -z "$pf_list" ] ; then
+   return 1;
+   fi
+
+   for dev in $pf_list ; do
+   echo 0 > "/sys/bus/pci/devices/$dev/sriov_numvfs"
+   done
+
+   return 0;
+}
diff --git a/tools/testing/selftests/powerpc/eeh/eeh-vf-aware.sh 
b/tools/testing/selftests/powerpc/eeh/eeh-vf-aware.sh
new file mode 100755
index ..874c11953bb6
--- /dev/null
+++ b/tools/testing/selftests/powerpc/eeh/eeh-vf-aware.sh
@@ -0,0 +1,45 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0-only
+
+. ./eeh-functions.sh
+
+eeh_test_prep # NB: may exit
+
+vf_list="$(eeh_enable_vfs)";
+if $? != 0 ; then
+   log "No usable VFs found. Skipping EEH unaware VF test"
+   exit $KSELFTESTS_SKIP;
+fi
+
+log "Enabled VFs: $vf_list"
+
+tested=0
+passed=0
+for vf in $vf_list ; do
+   log "Testing $vf"
+
+   if ! e

[PATCH 2/3] selftests/powerpc: Use stderr for debug messages in eeh-functions

2020-11-02 Thread Oliver O'Halloran

We want to use stdout to return lists of devices, etc so log debug / status
messages to stderr rather than stdout.

Signed-off-by: Oliver O'Halloran 
---
 .../selftests/powerpc/eeh/eeh-functions.sh| 20 +++
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/powerpc/eeh/eeh-functions.sh 
b/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
index 9b1bcc1fd4ad..32e5b7fbf18a 100644
--- a/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
+++ b/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
@@ -3,6 +3,10 @@
 
 export KSELFTESTS_SKIP=4
 
+log() {
+   echo >/dev/stderr $*
+}
+
 pe_ok() {
local dev="$1"
local path="/sys/bus/pci/devices/$dev/eeh_pe_state"
@@ -49,7 +53,7 @@ eeh_test_prep() {
 
if [ ! -e "/sys/kernel/debug/powerpc/eeh_dev_check" ] && \
   [ ! -e "/sys/kernel/debug/powerpc/eeh_dev_break" ] ; then
-   echo "debugfs EEH testing files are missing. Is debugfs 
mounted?"
+   log "debugfs EEH testing files are missing. Is debugfs mounted?"
exit $KSELFTESTS_SKIP;
fi
 
@@ -61,7 +65,7 @@ eeh_test_prep() {
 eeh_can_break() {
# skip bridges since we can't recover them (yet...)
if [ -e "/sys/bus/pci/devices/$dev/pci_bus" ] ; then
-   echo "$dev, Skipped: bridge"
+   log "$dev, Skipped: bridge"
return 1;
fi
 
@@ -70,7 +74,7 @@ eeh_can_break() {
# it the system will generally go down. We should probably fix that
# at some point
if [ "ahci" = "$(basename $(realpath 
/sys/bus/pci/devices/$dev/driver))" ] ; then
-   echo "$dev, Skipped: ahci doesn't support recovery"
+   log "$dev, Skipped: ahci doesn't support recovery"
return 1;
fi
 
@@ -80,7 +84,7 @@ eeh_can_break() {
# result in the recovery failing and the device being marked as
# failed.
if ! pe_ok $dev ; then
-   echo "$dev, Skipped: Bad initial PE state"
+   log "$dev, Skipped: Bad initial PE state"
return 1;
fi
 
@@ -94,7 +98,7 @@ eeh_one_dev() {
# testing so check that the argument is a well-formed sysfs device
# name.
if ! test -e /sys/bus/pci/devices/$dev/ ; then
-   echo "Error: '$dev' must be a sysfs device name (:BB:DD.F)"
+   log "Error: '$dev' must be a sysfs device name (:BB:DD.F)"
return 1;
fi
 
@@ -118,16 +122,16 @@ eeh_one_dev() {
if pe_ok $dev ; then
break;
fi
-   echo "$dev, waited $i/${max_wait}"
+   log "$dev, waited $i/${max_wait}"
sleep 1
done
 
if ! pe_ok $dev ; then
-   echo "$dev, Failed to recover!"
+   log "$dev, Failed to recover!"
return 1;
fi
 
-   echo "$dev, Recovered after $i seconds"
+   log "$dev, Recovered after $i seconds"
return 0;
 }
 
-- 
2.26.2

[PATCH 1/3] selftests/powerpc: Hoist helper code out of eeh-basic

2020-11-02 Thread Oliver O'Halloran

Hoist some of the useful test environment checking and prep code into
eeh-functions.sh so they can be reused in other tests.

Signed-off-by: Oliver O'Halloran 
---
 .../selftests/powerpc/eeh/eeh-basic.sh| 39 ++-
 .../selftests/powerpc/eeh/eeh-functions.sh| 48 +++
 2 files changed, 51 insertions(+), 36 deletions(-)
 mode change 100755 => 100644 
tools/testing/selftests/powerpc/eeh/eeh-functions.sh

diff --git a/tools/testing/selftests/powerpc/eeh/eeh-basic.sh 
b/tools/testing/selftests/powerpc/eeh/eeh-basic.sh
index 0d783e1065c8..16d00555f13e 100755
--- a/tools/testing/selftests/powerpc/eeh/eeh-basic.sh
+++ b/tools/testing/selftests/powerpc/eeh/eeh-basic.sh
@@ -1,28 +1,13 @@
 #!/bin/sh
 # SPDX-License-Identifier: GPL-2.0-only
 
-KSELFTESTS_SKIP=4
-
 . ./eeh-functions.sh
 
-if ! eeh_supported ; then
-   echo "EEH not supported on this system, skipping"
-   exit $KSELFTESTS_SKIP;
-fi
-
-if [ ! -e "/sys/kernel/debug/powerpc/eeh_dev_check" ] && \
-   [ ! -e "/sys/kernel/debug/powerpc/eeh_dev_break" ] ; then
-   echo "debugfs EEH testing files are missing. Is debugfs mounted?"
-   exit $KSELFTESTS_SKIP;
-fi
+eeh_test_prep # NB: may exit
 
 pre_lspci=`mktemp`
 lspci > $pre_lspci
 
-# Bump the max freeze count to something absurd so we don't
-# trip over it while breaking things.
-echo 5000 > /sys/kernel/debug/powerpc/eeh_max_freezes
-
 # record the devices that we break in here. Assuming everything
 # goes to plan we should get them back once the recover process
 # is finished.
@@ -30,34 +15,16 @@ devices=""
 
 # Build up a list of candidate devices.
 for dev in `ls -1 /sys/bus/pci/devices/ | grep '\.0$'` ; do
-   # skip bridges since we can't recover them (yet...)
-   if [ -e "/sys/bus/pci/devices/$dev/pci_bus" ] ; then
-   echo "$dev, Skipped: bridge"
+   if ! eeh_can_break $dev ; then
continue;
fi
 
-   # Skip VFs for now since we don't have a reliable way
-   # to break them.
+   # Skip VFs for now since we don't have a reliable way to break them.
if [ -e "/sys/bus/pci/devices/$dev/physfn" ] ; then
echo "$dev, Skipped: virtfn"
continue;
fi
 
-   if [ "ahci" = "$(basename $(realpath 
/sys/bus/pci/devices/$dev/driver))" ] ; then
-   echo "$dev, Skipped: ahci doesn't support recovery"
-   continue
-   fi
-
-   # Don't inject errosr into an already-frozen PE. This happens with
-   # PEs that contain multiple PCI devices (e.g. multi-function cards)
-   # and injecting new errors during the recovery process will probably
-   # result in the recovery failing and the device being marked as
-   # failed.
-   if ! pe_ok $dev ; then
-   echo "$dev, Skipped: Bad initial PE state"
-   continue;
-   fi
-
echo "$dev, Added"
 
# Add to this list of device to check
diff --git a/tools/testing/selftests/powerpc/eeh/eeh-functions.sh 
b/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
old mode 100755
new mode 100644
index 00dc32c0ed75..9b1bcc1fd4ad
--- a/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
+++ b/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 # SPDX-License-Identifier: GPL-2.0-only
 
+export KSELFTESTS_SKIP=4
+
 pe_ok() {
local dev="$1"
local path="/sys/bus/pci/devices/$dev/eeh_pe_state"
@@ -39,6 +41,52 @@ eeh_supported() {
grep -q 'EEH Subsystem is enabled' /proc/powerpc/eeh
 }
 
+eeh_test_prep() {
+   if ! eeh_supported ; then
+   echo "EEH not supported on this system, skipping"
+   exit $KSELFTESTS_SKIP;
+   fi
+
+   if [ ! -e "/sys/kernel/debug/powerpc/eeh_dev_check" ] && \
+  [ ! -e "/sys/kernel/debug/powerpc/eeh_dev_break" ] ; then
+   echo "debugfs EEH testing files are missing. Is debugfs 
mounted?"
+   exit $KSELFTESTS_SKIP;
+   fi
+
+   # Bump the max freeze count to something absurd so we don't
+   # trip over it while breaking things.
+   echo 5000 > /sys/kernel/debug/powerpc/eeh_max_freezes
+}
+
+eeh_can_break() {
+   # skip bridges since we can't recover them (yet...)
+   if [ -e "/sys/bus/pci/devices/$dev/pci_bus" ] ; then
+   echo "$dev, Skipped: bridge"
+   return 1;
+   fi
+
+   # The ahci driver doesn't support error recovery. If the ahci device
+   # happens to be hosting the root filesystem, and then we go and break
+   # it the system will generally go down. We should probably fix that
+   # at some point
+   if [ "ahci" = "$(basename $(realpath 
/sys/bus/pci/devices/$dev/driver))" ] ; then
+   echo "$dev, Skipped: ahci doesn't support recovery"
+   return 1;
+   fi
+
+   # Don't inject errosr into an already-frozen PE. This happens with
+   # PEs that contain multiple PCI devices (e.g. multi-functi

Re: [PATCH v2] powerpc/pci: unmap legacy INTx interrupts when a PHB is removed

2020-11-02 Thread Oliver O'Halloran

On Tue, Nov 3, 2020 at 1:39 AM Cédric Le Goater  wrote:
>
> On 10/14/20 4:55 AM, Alexey Kardashevskiy wrote:
> >
> > How do you remove PHBs exactly? There is no such thing in the powernv 
> > platform, I thought someone added this and you are fixing it but no. PHBs 
> > on powernv are created at the boot time and there is no way to remove them, 
> > you can only try removing all the bridges.
>
> yes. I noticed that later when proposing the fix for the double
> free.
>
> > So what exactly are you doing?
>
> What you just said above, with the commands :
>
>   echo 1 >  /sys/devices/pci0031\:00/0031\:00\:00.0/remove
>   echo 1 >  /sys/devices/pci0031\:00/pci_bus/0031\:00/rescan

Right, so that'll remove the root port device (and Bus 01 beneath it),
but the PHB itself is still there. If it was removed the root bus
would also disappear.

[PATCH 18/18] powerpc/powermac: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Signed-off-by: Oliver O'Halloran 
---
compile tested with pmac32_defconfig and g5_defconfig
---
 arch/powerpc/platforms/powermac/setup.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/powermac/setup.c 
b/arch/powerpc/platforms/powermac/setup.c
index 2e2cc0c75d87..86aee3f2483f 100644
--- a/arch/powerpc/platforms/powermac/setup.c
+++ b/arch/powerpc/platforms/powermac/setup.c
@@ -298,9 +298,6 @@ static void __init pmac_setup_arch(void)
of_node_put(ic);
}
 
-   /* Lookup PCI hosts */
-   pmac_pci_init();
-
 #ifdef CONFIG_PPC32
ohare_init();
l2cr_init();
@@ -600,6 +597,7 @@ define_machine(powermac) {
.name   = "PowerMac",
.probe  = pmac_probe,
.setup_arch = pmac_setup_arch,
+   .discover_phbs  = pmac_pci_init,
.show_cpuinfo   = pmac_show_cpuinfo,
.init_IRQ   = pmac_pic_init,
.get_irq= NULL, /* changed later */
-- 
2.26.2

[PATCH 17/18] powerpc/pasemi: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Signed-off-by: Oliver O'Halloran 
---
compile tested with pasemi_defconfig
---
 arch/powerpc/platforms/pasemi/setup.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pasemi/setup.c 
b/arch/powerpc/platforms/pasemi/setup.c
index b612474f8f8e..376797eb7894 100644
--- a/arch/powerpc/platforms/pasemi/setup.c
+++ b/arch/powerpc/platforms/pasemi/setup.c
@@ -144,8 +144,6 @@ static void __init pas_setup_arch(void)
/* Setup SMP callback */
smp_ops = &pas_smp_ops;
 #endif
-   /* Lookup PCI hosts */
-   pas_pci_init();
 
/* Remap SDC register for doing reset */
/* XXXOJN This should maybe come out of the device tree */
@@ -446,6 +444,7 @@ define_machine(pasemi) {
.name   = "PA Semi PWRficient",
.probe  = pas_probe,
.setup_arch = pas_setup_arch,
+   .discover_phbs  = pas_pci_init,
.init_IRQ   = pas_init_IRQ,
.get_irq= mpic_get_irq,
.restart= pas_restart,
-- 
2.26.2

[PATCH 16/18] powerpc/embedded6xx/mve5100: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Signed-off-by: Oliver O'Halloran 
---
compile tested with mvme5100_defconfig
---
 arch/powerpc/platforms/embedded6xx/mvme5100.c   | 13 -
 arch/powerpc/platforms/embedded6xx/storcenter.c |  8 ++--
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/embedded6xx/mvme5100.c 
b/arch/powerpc/platforms/embedded6xx/mvme5100.c
index 1cd488daa0bf..c06a0490d157 100644
--- a/arch/powerpc/platforms/embedded6xx/mvme5100.c
+++ b/arch/powerpc/platforms/embedded6xx/mvme5100.c
@@ -154,17 +154,19 @@ static const struct of_device_id mvme5100_of_bus_ids[] 
__initconst = {
  */
 static void __init mvme5100_setup_arch(void)
 {
-   struct device_node *np;
-
if (ppc_md.progress)
ppc_md.progress("mvme5100_setup_arch()", 0);
 
-   for_each_compatible_node(np, "pci", "hawk-pci")
-   mvme5100_add_bridge(np);
-
restart = ioremap(BOARD_MODRST_REG, 4);
 }
 
+static void __init mvme5100_setup_pci(void)
+{
+   struct device_node *np;
+
+   for_each_compatible_node(np, "pci", "hawk-pci")
+   mvme5100_add_bridge(np);
+}
 
 static void mvme5100_show_cpuinfo(struct seq_file *m)
 {
@@ -205,6 +207,7 @@ define_machine(mvme5100) {
.name   = "MVME5100",
.probe  = mvme5100_probe,
.setup_arch = mvme5100_setup_arch,
+   .discover_phbs  = mvme5100_setup_pci,
.init_IRQ   = mvme5100_pic_init,
.show_cpuinfo   = mvme5100_show_cpuinfo,
.get_irq= mpic_get_irq,
diff --git a/arch/powerpc/platforms/embedded6xx/storcenter.c 
b/arch/powerpc/platforms/embedded6xx/storcenter.c
index e346ddcef45e..e188b90f7016 100644
--- a/arch/powerpc/platforms/embedded6xx/storcenter.c
+++ b/arch/powerpc/platforms/embedded6xx/storcenter.c
@@ -65,14 +65,17 @@ static int __init storcenter_add_bridge(struct device_node 
*dev)
 }
 
 static void __init storcenter_setup_arch(void)
+{
+   printk(KERN_INFO "IOMEGA StorCenter\n");
+}
+
+static void __init storcenter_setup_pci(void)
 {
struct device_node *np;
 
/* Lookup PCI host bridges */
for_each_compatible_node(np, "pci", "mpc10x-pci")
storcenter_add_bridge(np);
-
-   printk(KERN_INFO "IOMEGA StorCenter\n");
 }
 
 /*
@@ -117,6 +120,7 @@ define_machine(storcenter){
.name   = "IOMEGA StorCenter",
.probe  = storcenter_probe,
.setup_arch = storcenter_setup_arch,
+   .discover_phbs  = storcenter_setup_pci,
.init_IRQ   = storcenter_init_IRQ,
.get_irq= mpic_get_irq,
.restart= storcenter_restart,
-- 
2.26.2

[PATCH 15/18] powerpc/embedded6xx/mpc7448: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Signed-off-by: Oliver O'Halloran 
---
compile tested with mpc7448_hpc2_defconfig
---
 arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c 
b/arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c
index b95c3380d2b5..5565647dc879 100644
--- a/arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c
+++ b/arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c
@@ -58,16 +58,14 @@ int mpc7448_hpc2_exclude_device(struct pci_controller *hose,
return PCIBIOS_SUCCESSFUL;
 }
 
-static void __init mpc7448_hpc2_setup_arch(void)
+static void __init mpc7448_hpc2_setup_pci(void)
 {
+#ifdef CONFIG_PCI
struct device_node *np;
if (ppc_md.progress)
-   ppc_md.progress("mpc7448_hpc2_setup_arch():set_bridge", 0);
-
-   tsi108_csr_vir_base = get_vir_csrbase();
+   ppc_md.progress("mpc7448_hpc2_setup_pci():set_bridge", 0);
 
/* setup PCI host bridge */
-#ifdef CONFIG_PCI
for_each_compatible_node(np, "pci", "tsi108-pci")
tsi108_setup_pci(np, MPC7448HPC2_PCI_CFG_PHYS, 0);
 
@@ -75,6 +73,11 @@ static void __init mpc7448_hpc2_setup_arch(void)
if (ppc_md.progress)
ppc_md.progress("tsi108: resources set", 0x100);
 #endif
+}
+
+static void __init mpc7448_hpc2_setup_arch(void)
+{
+   tsi108_csr_vir_base = get_vir_csrbase();
 
printk(KERN_INFO "MPC7448HPC2 (TAIGA) Platform\n");
printk(KERN_INFO
@@ -181,6 +184,7 @@ define_machine(mpc7448_hpc2){
.name   = "MPC7448 HPC2",
.probe  = mpc7448_hpc2_probe,
.setup_arch = mpc7448_hpc2_setup_arch,
+   .discover_phbs  = mpc7448_hpc2_setup_pci,
.init_IRQ   = mpc7448_hpc2_init_IRQ,
.show_cpuinfo   = mpc7448_hpc2_show_cpuinfo,
.get_irq= mpic_get_irq,
-- 
2.26.2

[PATCH 14/18] powerpc/embedded6xx/linkstation: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Signed-off-by: Oliver O'Halloran 
---
compile tested with linkstation_defconfig
---
 arch/powerpc/platforms/embedded6xx/linkstation.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/embedded6xx/linkstation.c 
b/arch/powerpc/platforms/embedded6xx/linkstation.c
index f514d5d28cd4..eb8342e7f84e 100644
--- a/arch/powerpc/platforms/embedded6xx/linkstation.c
+++ b/arch/powerpc/platforms/embedded6xx/linkstation.c
@@ -63,15 +63,18 @@ static int __init linkstation_add_bridge(struct device_node 
*dev)
 }
 
 static void __init linkstation_setup_arch(void)
+{
+   printk(KERN_INFO "BUFFALO Network Attached Storage Series\n");
+   printk(KERN_INFO "(C) 2002-2005 BUFFALO INC.\n");
+}
+
+static void __init linkstation_setup_pci(void)
 {
struct device_node *np;
 
/* Lookup PCI host bridges */
for_each_compatible_node(np, "pci", "mpc10x-pci")
linkstation_add_bridge(np);
-
-   printk(KERN_INFO "BUFFALO Network Attached Storage Series\n");
-   printk(KERN_INFO "(C) 2002-2005 BUFFALO INC.\n");
 }
 
 /*
@@ -153,6 +156,7 @@ define_machine(linkstation){
.name   = "Buffalo Linkstation",
.probe  = linkstation_probe,
.setup_arch = linkstation_setup_arch,
+   .discover_phbs  = linkstation_setup_pci,
.init_IRQ   = linkstation_init_IRQ,
.show_cpuinfo   = linkstation_show_cpuinfo,
.get_irq= mpic_get_irq,
-- 
2.26.2

[PATCH 13/18] powerpc/embedded6xx/holly: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Signed-off-by: Oliver O'Halloran 
---
compile tested with holly_defconfig
---
 arch/powerpc/platforms/embedded6xx/holly.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/embedded6xx/holly.c 
b/arch/powerpc/platforms/embedded6xx/holly.c
index d8f2e2c737bb..53065d564161 100644
--- a/arch/powerpc/platforms/embedded6xx/holly.c
+++ b/arch/powerpc/platforms/embedded6xx/holly.c
@@ -108,15 +108,13 @@ static void holly_remap_bridge(void)
tsi108_write_reg(TSI108_PCI_P2O_BAR2, 0x0);
 }
 
-static void __init holly_setup_arch(void)
+static void __init holly_init_pci(void)
 {
struct device_node *np;
 
if (ppc_md.progress)
ppc_md.progress("holly_setup_arch():set_bridge", 0);
 
-   tsi108_csr_vir_base = get_vir_csrbase();
-
/* setup PCI host bridge */
holly_remap_bridge();
 
@@ -127,6 +125,11 @@ static void __init holly_setup_arch(void)
ppc_md.pci_exclude_device = holly_exclude_device;
if (ppc_md.progress)
ppc_md.progress("tsi108: resources set", 0x100);
+}
+
+static void __init holly_setup_arch(void)
+{
+   tsi108_csr_vir_base = get_vir_csrbase();
 
printk(KERN_INFO "PPC750GX/CL Platform\n");
 }
@@ -259,6 +262,7 @@ define_machine(holly){
.name   = "PPC750 GX/CL TSI",
.probe  = holly_probe,
.setup_arch = holly_setup_arch,
+   .discover_phbs  = holly_init_pci,
.init_IRQ   = holly_init_IRQ,
.show_cpuinfo   = holly_show_cpuinfo,
.get_irq= mpic_get_irq,
-- 
2.26.2

[PATCH 12/18] powerpc/chrp: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Signed-off-by: Oliver O'Halloran 
---
compile tested with chrp32_defconfig
---
 arch/powerpc/platforms/chrp/pci.c   |  8 
 arch/powerpc/platforms/chrp/setup.c | 12 +---
 2 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/platforms/chrp/pci.c 
b/arch/powerpc/platforms/chrp/pci.c
index b2c2bf35b76c..8c421dc78b28 100644
--- a/arch/powerpc/platforms/chrp/pci.c
+++ b/arch/powerpc/platforms/chrp/pci.c
@@ -314,6 +314,14 @@ chrp_find_bridges(void)
}
}
of_node_put(root);
+
+   /*
+*  "Temporary" fixes for PCI devices.
+*  -- Geert
+*/
+   hydra_init();   /* Mac I/O */
+
+   pci_create_OF_bus_map();
 }
 
 /* SL82C105 IDE Control/Status Register */
diff --git a/arch/powerpc/platforms/chrp/setup.c 
b/arch/powerpc/platforms/chrp/setup.c
index c45435aa5e36..3cfc382841e5 100644
--- a/arch/powerpc/platforms/chrp/setup.c
+++ b/arch/powerpc/platforms/chrp/setup.c
@@ -334,22 +334,11 @@ static void __init chrp_setup_arch(void)
/* On pegasos, enable the L2 cache if not already done by OF */
pegasos_set_l2cr();
 
-   /* Lookup PCI host bridges */
-   chrp_find_bridges();
-
-   /*
-*  Temporary fixes for PCI devices.
-*  -- Geert
-*/
-   hydra_init();   /* Mac I/O */
-
/*
 *  Fix the Super I/O configuration
 */
sio_init();
 
-   pci_create_OF_bus_map();
-
/*
 * Print the banner, then scroll down so boot progress
 * can be printed.  -- Cort
@@ -582,6 +571,7 @@ define_machine(chrp) {
.name   = "CHRP",
.probe  = chrp_probe,
.setup_arch = chrp_setup_arch,
+   .discover_phbs  = chrp_find_bridges,
.init   = chrp_init2,
.show_cpuinfo   = chrp_show_cpuinfo,
.init_IRQ   = chrp_init_IRQ,
-- 
2.26.2

[PATCH 11/18] powerpc/amigaone: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Signed-off-by: Oliver O'Halloran 
---
compile tested with amigaone_defconfig
---
 arch/powerpc/platforms/amigaone/setup.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/amigaone/setup.c 
b/arch/powerpc/platforms/amigaone/setup.c
index f5d0bf999759..b25ddf39dd43 100644
--- a/arch/powerpc/platforms/amigaone/setup.c
+++ b/arch/powerpc/platforms/amigaone/setup.c
@@ -65,6 +65,12 @@ static int __init amigaone_add_bridge(struct device_node 
*dev)
 }
 
 void __init amigaone_setup_arch(void)
+{
+   if (ppc_md.progress)
+   ppc_md.progress("Linux/PPC "UTS_RELEASE"\n", 0);
+}
+
+void __init amigaone_discover_phbs(void)
 {
struct device_node *np;
int phb = -ENODEV;
@@ -74,9 +80,6 @@ void __init amigaone_setup_arch(void)
phb = amigaone_add_bridge(np);
 
BUG_ON(phb != 0);
-
-   if (ppc_md.progress)
-   ppc_md.progress("Linux/PPC "UTS_RELEASE"\n", 0);
 }
 
 void __init amigaone_init_IRQ(void)
@@ -159,6 +162,7 @@ define_machine(amigaone) {
.name   = "AmigaOne",
.probe  = amigaone_probe,
.setup_arch = amigaone_setup_arch,
+   .discover_phbs  = amigaone_discover_phbs,
.show_cpuinfo   = amigaone_show_cpuinfo,
.init_IRQ   = amigaone_init_IRQ,
.restart= amigaone_restart,
-- 
2.26.2

[PATCH 10/18] powerpc/83xx: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Signed-off-by: Oliver O'Halloran 
---
compile tested with mpc83xx_defconfig
---
 arch/powerpc/platforms/83xx/asp834x.c | 1 +
 arch/powerpc/platforms/83xx/km83xx.c  | 1 +
 arch/powerpc/platforms/83xx/misc.c| 2 --
 arch/powerpc/platforms/83xx/mpc830x_rdb.c | 1 +
 arch/powerpc/platforms/83xx/mpc831x_rdb.c | 1 +
 arch/powerpc/platforms/83xx/mpc832x_mds.c | 1 +
 arch/powerpc/platforms/83xx/mpc832x_rdb.c | 1 +
 arch/powerpc/platforms/83xx/mpc834x_itx.c | 1 +
 arch/powerpc/platforms/83xx/mpc834x_mds.c | 1 +
 arch/powerpc/platforms/83xx/mpc836x_mds.c | 1 +
 arch/powerpc/platforms/83xx/mpc836x_rdk.c | 1 +
 arch/powerpc/platforms/83xx/mpc837x_mds.c | 1 +
 arch/powerpc/platforms/83xx/mpc837x_rdb.c | 1 +
 13 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/83xx/asp834x.c 
b/arch/powerpc/platforms/83xx/asp834x.c
index 28474876f41b..68061c2a57c1 100644
--- a/arch/powerpc/platforms/83xx/asp834x.c
+++ b/arch/powerpc/platforms/83xx/asp834x.c
@@ -44,6 +44,7 @@ define_machine(asp834x) {
.name   = "ASP8347E",
.probe  = asp834x_probe,
.setup_arch = asp834x_setup_arch,
+   .discover_phbs  = mpc83xx_setup_pci,
.init_IRQ   = mpc83xx_ipic_init_IRQ,
.get_irq= ipic_get_irq,
.restart= mpc83xx_restart,
diff --git a/arch/powerpc/platforms/83xx/km83xx.c 
b/arch/powerpc/platforms/83xx/km83xx.c
index bcdc2c203ec9..108e1e4d2683 100644
--- a/arch/powerpc/platforms/83xx/km83xx.c
+++ b/arch/powerpc/platforms/83xx/km83xx.c
@@ -180,6 +180,7 @@ define_machine(mpc83xx_km) {
.name   = "mpc83xx-km-platform",
.probe  = mpc83xx_km_probe,
.setup_arch = mpc83xx_km_setup_arch,
+   .discover_phbs  = mpc83xx_setup_pci,
.init_IRQ   = mpc83xx_ipic_init_IRQ,
.get_irq= ipic_get_irq,
.restart= mpc83xx_restart,
diff --git a/arch/powerpc/platforms/83xx/misc.c 
b/arch/powerpc/platforms/83xx/misc.c
index a952e91db3ee..3285dabcf923 100644
--- a/arch/powerpc/platforms/83xx/misc.c
+++ b/arch/powerpc/platforms/83xx/misc.c
@@ -132,8 +132,6 @@ void __init mpc83xx_setup_arch(void)
setbat(-1, va, immrbase, immrsize, PAGE_KERNEL_NCG);
update_bats();
}
-
-   mpc83xx_setup_pci();
 }
 
 int machine_check_83xx(struct pt_regs *regs)
diff --git a/arch/powerpc/platforms/83xx/mpc830x_rdb.c 
b/arch/powerpc/platforms/83xx/mpc830x_rdb.c
index 51426e88ec67..956d4389effa 100644
--- a/arch/powerpc/platforms/83xx/mpc830x_rdb.c
+++ b/arch/powerpc/platforms/83xx/mpc830x_rdb.c
@@ -48,6 +48,7 @@ define_machine(mpc830x_rdb) {
.name   = "MPC830x RDB",
.probe  = mpc830x_rdb_probe,
.setup_arch = mpc830x_rdb_setup_arch,
+   .discover_phbs  = mpc83xx_setup_pci,
.init_IRQ   = mpc83xx_ipic_init_IRQ,
.get_irq= ipic_get_irq,
.restart= mpc83xx_restart,
diff --git a/arch/powerpc/platforms/83xx/mpc831x_rdb.c 
b/arch/powerpc/platforms/83xx/mpc831x_rdb.c
index 5ccd57a48492..3b578f080e3b 100644
--- a/arch/powerpc/platforms/83xx/mpc831x_rdb.c
+++ b/arch/powerpc/platforms/83xx/mpc831x_rdb.c
@@ -48,6 +48,7 @@ define_machine(mpc831x_rdb) {
.name   = "MPC831x RDB",
.probe  = mpc831x_rdb_probe,
.setup_arch = mpc831x_rdb_setup_arch,
+   .discover_phbs  = mpc83xx_setup_pci,
.init_IRQ   = mpc83xx_ipic_init_IRQ,
.get_irq= ipic_get_irq,
.restart= mpc83xx_restart,
diff --git a/arch/powerpc/platforms/83xx/mpc832x_mds.c 
b/arch/powerpc/platforms/83xx/mpc832x_mds.c
index 6fa5402ebf20..850d566ef900 100644
--- a/arch/powerpc/platforms/83xx/mpc832x_mds.c
+++ b/arch/powerpc/platforms/83xx/mpc832x_mds.c
@@ -101,6 +101,7 @@ define_machine(mpc832x_mds) {
.name   = "MPC832x MDS",
.probe  = mpc832x_sys_probe,
.setup_arch = mpc832x_sys_setup_arch,
+   .discover_phbs  = mpc83xx_setup_pci,
.init_IRQ   = mpc83xx_ipic_init_IRQ,
.get_irq= ipic_get_irq,
.restart= mpc83xx_restart,
diff --git a/arch/powerpc/platforms/83xx/mpc832x_rdb.c 
b/arch/powerpc/platforms/83xx/mpc832x_rdb.c
index 622c625d5ce4..b6133a237a70 100644
--- a/arch/powerpc/platforms/83xx/mpc832x_rdb.c
+++ b/arch/powerpc/platforms/83xx/mpc832x_rdb.c
@@ -219,6 +219,7 @@ define_machine(mpc832x_rdb) {
.name   = "MPC832x RDB",
.probe  = mpc832x_rdb_probe,
.setup_arch = mpc832x_rdb_setup_arch,
+   .discover_phbs  = mpc83xx_setup_pci,
.init_IRQ   = mpc83xx_ipic_init_IRQ,
.get_irq= ipic_get_irq,
.restart= mpc83xx_restart,
diff --git a/arch/powerpc/platforms/83xx/mpc834x_itx.c 
b/arc

[PATCH 09/18] powerpc/82xx/*: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Signed-off-by: Oliver O'Halloran 
---
compile tested with pq2fads_defconfig
---
 arch/powerpc/platforms/82xx/mpc8272_ads.c | 2 +-
 arch/powerpc/platforms/82xx/pq2fads.c | 3 +--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/82xx/mpc8272_ads.c 
b/arch/powerpc/platforms/82xx/mpc8272_ads.c
index 3fe1a6593280..0b5b9dec16d5 100644
--- a/arch/powerpc/platforms/82xx/mpc8272_ads.c
+++ b/arch/powerpc/platforms/82xx/mpc8272_ads.c
@@ -171,7 +171,6 @@ static void __init mpc8272_ads_setup_arch(void)
iounmap(bcsr);
 
init_ioports();
-   pq2_init_pci();
 
if (ppc_md.progress)
ppc_md.progress("mpc8272_ads_setup_arch(), finish", 0);
@@ -205,6 +204,7 @@ define_machine(mpc8272_ads)
.name = "Freescale MPC8272 ADS",
.probe = mpc8272_ads_probe,
.setup_arch = mpc8272_ads_setup_arch,
+   .discover_phbs = pq2_init_pci,
.init_IRQ = mpc8272_ads_pic_init,
.get_irq = cpm2_get_irq,
.calibrate_decr = generic_calibrate_decr,
diff --git a/arch/powerpc/platforms/82xx/pq2fads.c 
b/arch/powerpc/platforms/82xx/pq2fads.c
index a74082140718..ac9113d524af 100644
--- a/arch/powerpc/platforms/82xx/pq2fads.c
+++ b/arch/powerpc/platforms/82xx/pq2fads.c
@@ -150,8 +150,6 @@ static void __init pq2fads_setup_arch(void)
/* Enable external IRQs */
clrbits32(&cpm2_immr->im_siu_conf.siu_82xx.sc_siumcr, 0x0c00);
 
-   pq2_init_pci();
-
if (ppc_md.progress)
ppc_md.progress("pq2fads_setup_arch(), finish", 0);
 }
@@ -184,6 +182,7 @@ define_machine(pq2fads)
.name = "Freescale PQ2FADS",
.probe = pq2fads_probe,
.setup_arch = pq2fads_setup_arch,
+   .discover_phbs = pq2_init_pci,
.init_IRQ = pq2fads_pic_init,
.get_irq = cpm2_get_irq,
.calibrate_decr = generic_calibrate_decr,
-- 
2.26.2

[PATCH 08/18] powerpc/52xx/mpc5200_simple: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Signed-off-by: Oliver O'Halloran 
---
 arch/powerpc/platforms/52xx/mpc5200_simple.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/52xx/mpc5200_simple.c 
b/arch/powerpc/platforms/52xx/mpc5200_simple.c
index 2d01e9b2e779..b9f5675b0a1d 100644
--- a/arch/powerpc/platforms/52xx/mpc5200_simple.c
+++ b/arch/powerpc/platforms/52xx/mpc5200_simple.c
@@ -40,8 +40,6 @@ static void __init mpc5200_simple_setup_arch(void)
 
/* Some mpc5200 & mpc5200b related configuration */
mpc5200_setup_xlb_arbiter();
-
-   mpc52xx_setup_pci();
 }
 
 /* list of the supported boards */
@@ -73,6 +71,7 @@ define_machine(mpc5200_simple_platform) {
.name   = "mpc5200-simple-platform",
.probe  = mpc5200_simple_probe,
.setup_arch = mpc5200_simple_setup_arch,
+   .discover_phbs  = mpc52xx_setup_pci,
.init   = mpc52xx_declare_of_platform_devices,
.init_IRQ   = mpc52xx_init_irq,
.get_irq= mpc52xx_get_irq,
-- 
2.26.2

[PATCH 07/18] powerpc/52xx/media5200: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Signed-off-by: Oliver O'Halloran 
---
 arch/powerpc/platforms/52xx/media5200.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/52xx/media5200.c 
b/arch/powerpc/platforms/52xx/media5200.c
index 07c5bc4ed0b5..efb8bdecbcc7 100644
--- a/arch/powerpc/platforms/52xx/media5200.c
+++ b/arch/powerpc/platforms/52xx/media5200.c
@@ -202,8 +202,6 @@ static void __init media5200_setup_arch(void)
/* Some mpc5200 & mpc5200b related configuration */
mpc5200_setup_xlb_arbiter();
 
-   mpc52xx_setup_pci();
-
np = of_find_matching_node(NULL, mpc5200_gpio_ids);
gpio = of_iomap(np, 0);
of_node_put(np);
@@ -244,6 +242,7 @@ define_machine(media5200_platform) {
.name   = "media5200-platform",
.probe  = media5200_probe,
.setup_arch = media5200_setup_arch,
+   .discover_phbs  = mpc52xx_setup_pci,
.init   = mpc52xx_declare_of_platform_devices,
.init_IRQ   = media5200_init_irq,
.get_irq= mpc52xx_get_irq,
-- 
2.26.2

[PATCH 06/18] powerpc/52xx/lite5200: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Signed-off-by: Oliver O'Halloran 
---
compile tested with 52xx/lite5200b_defconfig
---
 arch/powerpc/platforms/52xx/lite5200.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/52xx/lite5200.c 
b/arch/powerpc/platforms/52xx/lite5200.c
index 3181aac08225..04cc97397095 100644
--- a/arch/powerpc/platforms/52xx/lite5200.c
+++ b/arch/powerpc/platforms/52xx/lite5200.c
@@ -165,8 +165,6 @@ static void __init lite5200_setup_arch(void)
mpc52xx_suspend.board_resume_finish = lite5200_resume_finish;
lite5200_pm_init();
 #endif
-
-   mpc52xx_setup_pci();
 }
 
 static const char * const board[] __initconst = {
@@ -187,6 +185,7 @@ define_machine(lite5200) {
.name   = "lite5200",
.probe  = lite5200_probe,
.setup_arch = lite5200_setup_arch,
+   .discover_phbs  = mpc52xx_setup_pci,
.init   = mpc52xx_declare_of_platform_devices,
.init_IRQ   = mpc52xx_init_irq,
.get_irq= mpc52xx_get_irq,
-- 
2.26.2

[PATCH 05/18] powerpc/52xx/efika: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Signed-off-by: Oliver O'Halloran 
---
compile tested with mpc5200_defconfig
---
 arch/powerpc/platforms/52xx/efika.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/52xx/efika.c 
b/arch/powerpc/platforms/52xx/efika.c
index 4514a6f7458a..3b7d70d71692 100644
--- a/arch/powerpc/platforms/52xx/efika.c
+++ b/arch/powerpc/platforms/52xx/efika.c
@@ -185,8 +185,6 @@ static void __init efika_setup_arch(void)
/* Map important registers from the internal memory map */
mpc52xx_map_common_devices();
 
-   efika_pcisetup();
-
 #ifdef CONFIG_PM
mpc52xx_suspend.board_suspend_prepare = efika_suspend_prepare;
mpc52xx_pm_init();
@@ -218,6 +216,7 @@ define_machine(efika)
.name   = EFIKA_PLATFORM_NAME,
.probe  = efika_probe,
.setup_arch = efika_setup_arch,
+   .discover_phbs  = efika_pcisetup,
.init   = mpc52xx_declare_of_platform_devices,
.show_cpuinfo   = efika_show_cpuinfo,
.init_IRQ   = mpc52xx_init_irq,
-- 
2.26.2

[PATCH 04/18] powerpc/512x: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Signed-off-by: Oliver O'Halloran 
---
only compile tested
---
 arch/powerpc/platforms/512x/mpc5121_ads.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/512x/mpc5121_ads.c 
b/arch/powerpc/platforms/512x/mpc5121_ads.c
index 6303fbfc4e4f..9d030c2e0004 100644
--- a/arch/powerpc/platforms/512x/mpc5121_ads.c
+++ b/arch/powerpc/platforms/512x/mpc5121_ads.c
@@ -24,21 +24,23 @@
 
 static void __init mpc5121_ads_setup_arch(void)
 {
-#ifdef CONFIG_PCI
-   struct device_node *np;
-#endif
printk(KERN_INFO "MPC5121 ADS board from Freescale Semiconductor\n");
/*
 * cpld regs are needed early
 */
mpc5121_ads_cpld_map();
 
+   mpc512x_setup_arch();
+}
+
+static void __init mpc5121_ads_setup_pci(void)
+{
 #ifdef CONFIG_PCI
+   struct device_node *np;
+
for_each_compatible_node(np, "pci", "fsl,mpc5121-pci")
mpc83xx_add_bridge(np);
 #endif
-
-   mpc512x_setup_arch();
 }
 
 static void __init mpc5121_ads_init_IRQ(void)
@@ -64,6 +66,7 @@ define_machine(mpc5121_ads) {
.name   = "MPC5121 ADS",
.probe  = mpc5121_ads_probe,
.setup_arch = mpc5121_ads_setup_arch,
+   .discover_phbs  = mpc5121_ads_setup_pci,
.init   = mpc512x_init,
.init_IRQ   = mpc5121_ads_init_IRQ,
.get_irq= ipic_get_irq,
-- 
2.26.2

[PATCH 03/18] powerpc/maple: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Signed-off-by: Oliver O'Halloran 
---
 arch/powerpc/platforms/maple/setup.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/maple/setup.c 
b/arch/powerpc/platforms/maple/setup.c
index f7e66a2005b4..4e9ad5bf3efb 100644
--- a/arch/powerpc/platforms/maple/setup.c
+++ b/arch/powerpc/platforms/maple/setup.c
@@ -179,9 +179,6 @@ static void __init maple_setup_arch(void)
 #ifdef CONFIG_SMP
smp_ops = &maple_smp_ops;
 #endif
-   /* Lookup PCI hosts */
-   maple_pci_init();
-
maple_use_rtas_reboot_and_halt_if_present();
 
printk(KERN_DEBUG "Using native/NAP idle loop\n");
@@ -351,6 +348,7 @@ define_machine(maple) {
.name   = "Maple",
.probe  = maple_probe,
.setup_arch = maple_setup_arch,
+   .discover_phbs  = maple_pci_init,
.init_IRQ   = maple_init_IRQ,
.pci_irq_fixup  = maple_pci_irq_fixup,
.pci_get_legacy_ide_irq = maple_pci_get_legacy_ide_irq,
-- 
2.26.2

[PATCH 02/18] powerpc/{powernv,pseries}: Move PHB discovery

2020-11-02 Thread Oliver O'Halloran

Make powernv and pseries use ppc_mc.discover_phbs. These two platforms need
to be done together because they both depends on pci_dn's being created
from the DT. The pci_dn contains a pointer to the relevant pci_controller
so they need to be created after the pci_controller structures are
available, but before  and before PCI devices are scanned. Currently this
ordering is provided by initcalls and the sequence is:

1. PHBs are discovered (setup_arch) (early boot, pre-initcalls)
2. pci_dn are created from the unflattended DT (core initcall)
3. PHBs are scanned pcibios_init() (subsys initcall)

The new ppc_md.discover_phbs() function is also a core_initcall so we can't
guarantee ordering between the creations of pci_controllers and the
creation of pci_dn's which require a pci_controller. We could use the
postcore, or core_sync initcall levels, but it's cleaner to just move the
pci_dn setup into the per-PHB inits which occur inside of .discover_phb()
for these platforms. This brings the boot-time path in line with the PHB
hotplug path that is used for pseries DLPAR operations too.

Signed-off-by: Oliver O'Halloran 
---
 arch/powerpc/kernel/pci_dn.c  | 22 --
 arch/powerpc/platforms/powernv/pci-ioda.c |  3 +++
 arch/powerpc/platforms/powernv/setup.c|  4 +---
 arch/powerpc/platforms/pseries/setup.c|  7 +--
 4 files changed, 9 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index 54e240597fd9..61571ae23953 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -481,28 +481,6 @@ void pci_devs_phb_init_dynamic(struct pci_controller *phb)
pci_traverse_device_nodes(dn, add_pdn, phb);
 }
 
-/** 
- * pci_devs_phb_init - Initialize phbs and pci devs under them.
- * 
- * This routine walks over all phb's (pci-host bridges) on the
- * system, and sets up assorted pci-related structures 
- * (including pci info in the device node structs) for each
- * pci device found underneath.  This routine runs once,
- * early in the boot sequence.
- */
-static int __init pci_devs_phb_init(void)
-{
-   struct pci_controller *phb, *tmp;
-
-   /* This must be done first so the device nodes have valid pci info! */
-   list_for_each_entry_safe(phb, tmp, &hose_list, list_node)
-   pci_devs_phb_init_dynamic(phb);
-
-   return 0;
-}
-
-core_initcall(pci_devs_phb_init);
-
 static void pci_dev_pdn_setup(struct pci_dev *pdev)
 {
struct pci_dn *pdn;
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 2b4ceb5e6ce4..d6815f03fee3 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3176,6 +3176,9 @@ static void __init pnv_pci_init_ioda_phb(struct 
device_node *np,
/* Remove M64 resource if we can't configure it successfully */
if (!phb->init_m64 || phb->init_m64(phb))
hose->mem_resources[1].flags = 0;
+
+   /* create pci_dn's for DT nodes under this PHB */
+   pci_devs_phb_init_dynamic(hose);
 }
 
 void __init pnv_pci_init_ioda2_phb(struct device_node *np)
diff --git a/arch/powerpc/platforms/powernv/setup.c 
b/arch/powerpc/platforms/powernv/setup.c
index 9acaa0f131b9..92f5fa827909 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -162,9 +162,6 @@ static void __init pnv_setup_arch(void)
/* Initialize SMP */
pnv_smp_init();
 
-   /* Setup PCI */
-   pnv_pci_init();
-
/* Setup RTC and NVRAM callbacks */
if (firmware_has_feature(FW_FEATURE_OPAL))
opal_nvram_init();
@@ -524,6 +521,7 @@ define_machine(powernv) {
.init_IRQ   = pnv_init_IRQ,
.show_cpuinfo   = pnv_show_cpuinfo,
.get_proc_freq  = pnv_get_proc_freq,
+   .discover_phbs  = pnv_pci_init,
.progress   = pnv_progress,
.machine_shutdown   = pnv_shutdown,
.power_save = NULL,
diff --git a/arch/powerpc/platforms/pseries/setup.c 
b/arch/powerpc/platforms/pseries/setup.c
index 633c45ec406d..e88b30d4b6cd 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -463,7 +463,7 @@ void pseries_little_endian_exceptions(void)
 }
 #endif
 
-static void __init find_and_init_phbs(void)
+static void __init pSeries_discover_phbs(void)
 {
struct device_node *node;
struct pci_controller *phb;
@@ -481,6 +481,9 @@ static void __init find_and_init_phbs(void)
pci_process_bridge_OF_ranges(phb, node, 0);
isa_bridge_find_early(phb);
phb->controller_ops = pseries_pci_controller_ops;
+
+   /* create pci_dn's for DT nodes under this PHB */
+   pci_devs_phb_init_dynamic(phb);
}
 
of_node_put(root);
@@ -777,7 +780,6 @@ static void __init pSeries_setup_arch(

[PATCH 01/18] powerpc/pci: Add ppc_md.discover_phbs()

2020-11-02 Thread Oliver O'Halloran

On many powerpc platforms the discovery and initalisation of
pci_controllers (PHBs) happens inside of setup_arch(). This is very early
in boot (pre-initcalls) and means that we're initialising the PHB long
before many basic kernel services (slab allocator, debugfs, a real ioremap)
are available.

On PowerNV this causes an additional problem since we map the PHB registers
with ioremap(). As of commit d538aadc2718 ("powerpc/ioremap: warn on early
use of ioremap()") a warning is printed because we're using the "incorrect"
API to setup and MMIO mapping in searly boot. The kernel does provide
early_ioremap(), but that is not intended to create long-lived MMIO
mappings and a seperate warning is printed by generic code if
early_ioremap() mappings are "leaked."

This is all fixable with dumb hacks like using early_ioremap() to setup
the initial mapping then replacing it with a real ioremap later on in
boot, but it does raise the question: Why the hell are we setting up the
PHB's this early in boot?

The old and wise claim it's due to "hysterical rasins." Aside from amused
grapes there doesn't appear to be any real reason to maintain the current
behaviour. Already most of the newer embedded platforms perform PHB
discovery in an arch_initcall and between the end of setup_arch() and the
start of initcalls none of the generic kernel code does anything PCI
related. On powerpc scanning PHBs occurs in a subsys_initcall so it should
be possible to move the PHB discovery to a core, postcore or arch initcall.

This patch adds the ppc_md.discover_phbs hook and a core_initcall stub that
calls it. The core_initcalls are the earliest to be called so this will
any possibly issues with dependency between initcalls. This isn't just an
academic issue either since on pseries and PowerNV EEH init occurs in an
arch_initcall and depends on the pci_controllers being available, similarly
the creation of pci_dns occurs at core_initcall_sync (i.e. between core and
postcore initcalls). These problems need to be addressed seperately.

Cc: Paul Mackerras 
Cc: Christophe Leroy 
Signed-off-by: Oliver O'Halloran 
---
 arch/powerpc/include/asm/machdep.h |  3 +++
 arch/powerpc/kernel/pci-common.c   | 10 ++
 2 files changed, 13 insertions(+)

diff --git a/arch/powerpc/include/asm/machdep.h 
b/arch/powerpc/include/asm/machdep.h
index 475687f24f4a..d319160d790c 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -59,6 +59,9 @@ struct machdep_calls {
int (*pcibios_root_bridge_prepare)(struct pci_host_bridge
*bridge);
 
+   /* finds all the pci_controllers present at boot */
+   void(*discover_phbs)(void);
+
/* To setup PHBs when using automatic OF platform driver for PCI */
int (*pci_setup_phb)(struct pci_controller *host);
 
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index be108616a721..6265e7d1c697 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1625,3 +1625,13 @@ static void fixup_hide_host_resource_fsl(struct pci_dev 
*dev)
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MOTOROLA, PCI_ANY_ID, 
fixup_hide_host_resource_fsl);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_FREESCALE, PCI_ANY_ID, 
fixup_hide_host_resource_fsl);
+
+
+int __init discover_phbs(void)
+{
+   if (ppc_md.discover_phbs)
+   ppc_md.discover_phbs();
+
+   return 0;
+}
+core_initcall(discover_phbs);
-- 
2.26.2

Re: Kernel panic from malloc() on SUSE 15.1?

2020-11-02 Thread Michael Ellerman

Carl Jacobsen  writes:
> I've got a SUSE 15.1 install (on ppc64le) that kernel panics on a very
> simple
> test program, built in a slightly unusual way.
>
> I'm compiling on SUSE 12, using gcc 4.8.3. I'm linking to a static
> copy of libcrypto.a (from openssl-1.1.1g), built without threads.
> I have a 10 line C test program that compiles and runs fine on the
> SUSE 12 system. If I compile the same program on SUSE 15.1 (with
> gcc 7.4.1), it runs fine on SUSE 15.1.
>
> But, if I run the version that I compiled on SUSE 12, on the SUSE 15.1
> system, the call to RAND_status() gets to a malloc() and then panics.
> (And, of course, if I just compile a call to malloc(), that runs fine
> on both systems.) Here's the test program, it's really just a call to
> RAND_status():
>
> #include 
> #include 
>
> int main(int argc, char **argv)
> {
> int has_enough_data = RAND_status();
> printf("The PRNG %s been seeded with enough data\n",
>has_enough_data ? "HAS" : "has NOT");
> return 0;
> }
>
> openssl is configured/built with:
> ./config no-shared no-dso no-threads -fPIC -ggdb3 -debug -static
> make
>
> and the test program is compiled with:
> gcc -ggdb3 -o rand_test rand_test.c libcrypto.a
>
> The kernel on SUSE 12 is: 3.12.28-4-default
> And glibc is: 2.19
>
> The kernel on SUSE 15.1 is: 4.12.14-197.18-default
> And glibc is: 2.26
>
> In a previous iteration it was panicking in pthread_once(), so
> I compiled openssl without pthreads support, and now it panics
> calling malloc().

What's the panic look like?

cheers

[PATCH AUTOSEL 5.4 15/24] scsi: ibmvscsi: Fix potential race after loss of transport

2020-11-02 Thread Sasha Levin

From: Tyrel Datwyler 

[ Upstream commit 665e0224a3d76f36da40bd9012270fa629aa42ed ]

After a loss of transport due to an adapter migration or crash/disconnect
from the host partner there is a tiny window where we can race adjusting
the request_limit of the adapter. The request limit is atomically
increased/decreased to track the number of inflight requests against the
allowed limit of our VIOS partner.

After a transport loss we set the request_limit to zero to reflect this
state.  However, there is a window where the adapter may attempt to queue a
command because the transport loss event hasn't been fully processed yet
and request_limit is still greater than zero.  The hypercall to send the
event will fail and the error path will increment the request_limit as a
result.  If the adapter processes the transport event prior to this
increment the request_limit becomes out of sync with the adapter state and
can result in SCSI commands being submitted on the now reset connection
prior to an SRP Login resulting in a protocol violation.

Fix this race by protecting request_limit with the host lock when changing
the value via atomic_set() to indicate no transport.

Link: https://lore.kernel.org/r/20201025001355.4527-1-tyr...@linux.ibm.com
Signed-off-by: Tyrel Datwyler 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 36 +++-
 1 file changed, 26 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index c5711c659b517..1ab0a61e3fb59 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -806,6 +806,22 @@ static void purge_requests(struct ibmvscsi_host_data 
*hostdata, int error_code)
spin_unlock_irqrestore(hostdata->host->host_lock, flags);
 }
 
+/**
+ * ibmvscsi_set_request_limit - Set the adapter request_limit in response to
+ * an adapter failure, reset, or SRP Login. Done under host lock to prevent
+ * race with SCSI command submission.
+ * @hostdata:  adapter to adjust
+ * @limit: new request limit
+ */
+static void ibmvscsi_set_request_limit(struct ibmvscsi_host_data *hostdata, 
int limit)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(hostdata->host->host_lock, flags);
+   atomic_set(&hostdata->request_limit, limit);
+   spin_unlock_irqrestore(hostdata->host->host_lock, flags);
+}
+
 /**
  * ibmvscsi_reset_host - Reset the connection to the server
  * @hostdata:  struct ibmvscsi_host_data to reset
@@ -813,7 +829,7 @@ static void purge_requests(struct ibmvscsi_host_data 
*hostdata, int error_code)
 static void ibmvscsi_reset_host(struct ibmvscsi_host_data *hostdata)
 {
scsi_block_requests(hostdata->host);
-   atomic_set(&hostdata->request_limit, 0);
+   ibmvscsi_set_request_limit(hostdata, 0);
 
purge_requests(hostdata, DID_ERROR);
hostdata->action = IBMVSCSI_HOST_ACTION_RESET;
@@ -1146,13 +1162,13 @@ static void login_rsp(struct srp_event_struct 
*evt_struct)
dev_info(hostdata->dev, "SRP_LOGIN_REJ reason %u\n",
 evt_struct->xfer_iu->srp.login_rej.reason);
/* Login failed.  */
-   atomic_set(&hostdata->request_limit, -1);
+   ibmvscsi_set_request_limit(hostdata, -1);
return;
default:
dev_err(hostdata->dev, "Invalid login response typecode 
0x%02x!\n",
evt_struct->xfer_iu->srp.login_rsp.opcode);
/* Login failed.  */
-   atomic_set(&hostdata->request_limit, -1);
+   ibmvscsi_set_request_limit(hostdata, -1);
return;
}
 
@@ -1163,7 +1179,7 @@ static void login_rsp(struct srp_event_struct *evt_struct)
 * This value is set rather than added to request_limit because
 * request_limit could have been set to -1 by this client.
 */
-   atomic_set(&hostdata->request_limit,
+   ibmvscsi_set_request_limit(hostdata,
   
be32_to_cpu(evt_struct->xfer_iu->srp.login_rsp.req_lim_delta));
 
/* If we had any pending I/Os, kick them */
@@ -1195,13 +1211,13 @@ static int send_srp_login(struct ibmvscsi_host_data 
*hostdata)
login->req_buf_fmt = cpu_to_be16(SRP_BUF_FORMAT_DIRECT |
 SRP_BUF_FORMAT_INDIRECT);
 
-   spin_lock_irqsave(hostdata->host->host_lock, flags);
/* Start out with a request limit of 0, since this is negotiated in
 * the login request we are just sending and login requests always
 * get sent by the driver regardless of request_limit.
 */
-   atomic_set(&hostdata->request_limit, 0);
+   ibmvscsi_set_request_limit(hostdata, 0);
 
+   spin_lock_irqsave(hostdata->host->host_lock, flags);
rc = ibmvscsi_send_srp_event(evt_struct, hostdata, login_timeout * 2);
spin_unlock_irqrestore(hostdata->host->host_lock, flags);

[PATCH AUTOSEL 5.8 20/29] scsi: ibmvscsi: Fix potential race after loss of transport

2020-11-02 Thread Sasha Levin

From: Tyrel Datwyler 

[ Upstream commit 665e0224a3d76f36da40bd9012270fa629aa42ed ]

After a loss of transport due to an adapter migration or crash/disconnect
from the host partner there is a tiny window where we can race adjusting
the request_limit of the adapter. The request limit is atomically
increased/decreased to track the number of inflight requests against the
allowed limit of our VIOS partner.

After a transport loss we set the request_limit to zero to reflect this
state.  However, there is a window where the adapter may attempt to queue a
command because the transport loss event hasn't been fully processed yet
and request_limit is still greater than zero.  The hypercall to send the
event will fail and the error path will increment the request_limit as a
result.  If the adapter processes the transport event prior to this
increment the request_limit becomes out of sync with the adapter state and
can result in SCSI commands being submitted on the now reset connection
prior to an SRP Login resulting in a protocol violation.

Fix this race by protecting request_limit with the host lock when changing
the value via atomic_set() to indicate no transport.

Link: https://lore.kernel.org/r/20201025001355.4527-1-tyr...@linux.ibm.com
Signed-off-by: Tyrel Datwyler 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 36 +++-
 1 file changed, 26 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 14f687e9b1f44..62faeab47d905 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -806,6 +806,22 @@ static void purge_requests(struct ibmvscsi_host_data 
*hostdata, int error_code)
spin_unlock_irqrestore(hostdata->host->host_lock, flags);
 }
 
+/**
+ * ibmvscsi_set_request_limit - Set the adapter request_limit in response to
+ * an adapter failure, reset, or SRP Login. Done under host lock to prevent
+ * race with SCSI command submission.
+ * @hostdata:  adapter to adjust
+ * @limit: new request limit
+ */
+static void ibmvscsi_set_request_limit(struct ibmvscsi_host_data *hostdata, 
int limit)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(hostdata->host->host_lock, flags);
+   atomic_set(&hostdata->request_limit, limit);
+   spin_unlock_irqrestore(hostdata->host->host_lock, flags);
+}
+
 /**
  * ibmvscsi_reset_host - Reset the connection to the server
  * @hostdata:  struct ibmvscsi_host_data to reset
@@ -813,7 +829,7 @@ static void purge_requests(struct ibmvscsi_host_data 
*hostdata, int error_code)
 static void ibmvscsi_reset_host(struct ibmvscsi_host_data *hostdata)
 {
scsi_block_requests(hostdata->host);
-   atomic_set(&hostdata->request_limit, 0);
+   ibmvscsi_set_request_limit(hostdata, 0);
 
purge_requests(hostdata, DID_ERROR);
hostdata->action = IBMVSCSI_HOST_ACTION_RESET;
@@ -1146,13 +1162,13 @@ static void login_rsp(struct srp_event_struct 
*evt_struct)
dev_info(hostdata->dev, "SRP_LOGIN_REJ reason %u\n",
 evt_struct->xfer_iu->srp.login_rej.reason);
/* Login failed.  */
-   atomic_set(&hostdata->request_limit, -1);
+   ibmvscsi_set_request_limit(hostdata, -1);
return;
default:
dev_err(hostdata->dev, "Invalid login response typecode 
0x%02x!\n",
evt_struct->xfer_iu->srp.login_rsp.opcode);
/* Login failed.  */
-   atomic_set(&hostdata->request_limit, -1);
+   ibmvscsi_set_request_limit(hostdata, -1);
return;
}
 
@@ -1163,7 +1179,7 @@ static void login_rsp(struct srp_event_struct *evt_struct)
 * This value is set rather than added to request_limit because
 * request_limit could have been set to -1 by this client.
 */
-   atomic_set(&hostdata->request_limit,
+   ibmvscsi_set_request_limit(hostdata,
   
be32_to_cpu(evt_struct->xfer_iu->srp.login_rsp.req_lim_delta));
 
/* If we had any pending I/Os, kick them */
@@ -1195,13 +1211,13 @@ static int send_srp_login(struct ibmvscsi_host_data 
*hostdata)
login->req_buf_fmt = cpu_to_be16(SRP_BUF_FORMAT_DIRECT |
 SRP_BUF_FORMAT_INDIRECT);
 
-   spin_lock_irqsave(hostdata->host->host_lock, flags);
/* Start out with a request limit of 0, since this is negotiated in
 * the login request we are just sending and login requests always
 * get sent by the driver regardless of request_limit.
 */
-   atomic_set(&hostdata->request_limit, 0);
+   ibmvscsi_set_request_limit(hostdata, 0);
 
+   spin_lock_irqsave(hostdata->host->host_lock, flags);
rc = ibmvscsi_send_srp_event(evt_struct, hostdata, login_timeout * 2);
spin_unlock_irqrestore(hostdata->host->host_lock, flags);

[PATCH AUTOSEL 5.9 24/35] scsi: ibmvscsi: Fix potential race after loss of transport

2020-11-02 Thread Sasha Levin

From: Tyrel Datwyler 

[ Upstream commit 665e0224a3d76f36da40bd9012270fa629aa42ed ]

After a loss of transport due to an adapter migration or crash/disconnect
from the host partner there is a tiny window where we can race adjusting
the request_limit of the adapter. The request limit is atomically
increased/decreased to track the number of inflight requests against the
allowed limit of our VIOS partner.

After a transport loss we set the request_limit to zero to reflect this
state.  However, there is a window where the adapter may attempt to queue a
command because the transport loss event hasn't been fully processed yet
and request_limit is still greater than zero.  The hypercall to send the
event will fail and the error path will increment the request_limit as a
result.  If the adapter processes the transport event prior to this
increment the request_limit becomes out of sync with the adapter state and
can result in SCSI commands being submitted on the now reset connection
prior to an SRP Login resulting in a protocol violation.

Fix this race by protecting request_limit with the host lock when changing
the value via atomic_set() to indicate no transport.

Link: https://lore.kernel.org/r/20201025001355.4527-1-tyr...@linux.ibm.com
Signed-off-by: Tyrel Datwyler 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 36 +++-
 1 file changed, 26 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index b1f3017b6547a..29fcc44be2d57 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -806,6 +806,22 @@ static void purge_requests(struct ibmvscsi_host_data 
*hostdata, int error_code)
spin_unlock_irqrestore(hostdata->host->host_lock, flags);
 }
 
+/**
+ * ibmvscsi_set_request_limit - Set the adapter request_limit in response to
+ * an adapter failure, reset, or SRP Login. Done under host lock to prevent
+ * race with SCSI command submission.
+ * @hostdata:  adapter to adjust
+ * @limit: new request limit
+ */
+static void ibmvscsi_set_request_limit(struct ibmvscsi_host_data *hostdata, 
int limit)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(hostdata->host->host_lock, flags);
+   atomic_set(&hostdata->request_limit, limit);
+   spin_unlock_irqrestore(hostdata->host->host_lock, flags);
+}
+
 /**
  * ibmvscsi_reset_host - Reset the connection to the server
  * @hostdata:  struct ibmvscsi_host_data to reset
@@ -813,7 +829,7 @@ static void purge_requests(struct ibmvscsi_host_data 
*hostdata, int error_code)
 static void ibmvscsi_reset_host(struct ibmvscsi_host_data *hostdata)
 {
scsi_block_requests(hostdata->host);
-   atomic_set(&hostdata->request_limit, 0);
+   ibmvscsi_set_request_limit(hostdata, 0);
 
purge_requests(hostdata, DID_ERROR);
hostdata->action = IBMVSCSI_HOST_ACTION_RESET;
@@ -1146,13 +1162,13 @@ static void login_rsp(struct srp_event_struct 
*evt_struct)
dev_info(hostdata->dev, "SRP_LOGIN_REJ reason %u\n",
 evt_struct->xfer_iu->srp.login_rej.reason);
/* Login failed.  */
-   atomic_set(&hostdata->request_limit, -1);
+   ibmvscsi_set_request_limit(hostdata, -1);
return;
default:
dev_err(hostdata->dev, "Invalid login response typecode 
0x%02x!\n",
evt_struct->xfer_iu->srp.login_rsp.opcode);
/* Login failed.  */
-   atomic_set(&hostdata->request_limit, -1);
+   ibmvscsi_set_request_limit(hostdata, -1);
return;
}
 
@@ -1163,7 +1179,7 @@ static void login_rsp(struct srp_event_struct *evt_struct)
 * This value is set rather than added to request_limit because
 * request_limit could have been set to -1 by this client.
 */
-   atomic_set(&hostdata->request_limit,
+   ibmvscsi_set_request_limit(hostdata,
   
be32_to_cpu(evt_struct->xfer_iu->srp.login_rsp.req_lim_delta));
 
/* If we had any pending I/Os, kick them */
@@ -1195,13 +1211,13 @@ static int send_srp_login(struct ibmvscsi_host_data 
*hostdata)
login->req_buf_fmt = cpu_to_be16(SRP_BUF_FORMAT_DIRECT |
 SRP_BUF_FORMAT_INDIRECT);
 
-   spin_lock_irqsave(hostdata->host->host_lock, flags);
/* Start out with a request limit of 0, since this is negotiated in
 * the login request we are just sending and login requests always
 * get sent by the driver regardless of request_limit.
 */
-   atomic_set(&hostdata->request_limit, 0);
+   ibmvscsi_set_request_limit(hostdata, 0);
 
+   spin_lock_irqsave(hostdata->host->host_lock, flags);
rc = ibmvscsi_send_srp_event(evt_struct, hostdata, login_timeout * 2);
spin_unlock_irqrestore(hostdata->host->host_lock, flags);

[powerpc:merge] BUILD SUCCESS 09a0972ac14f67d600aa3c80035367a8074e90eb

2020-11-02 Thread kernel test robot

mvme147_defconfig
h8300 edosk2674_defconfig
sh   sh2007_defconfig
ia64 alldefconfig
mips cobalt_defconfig
microblazenommu_defconfig
arm  gemini_defconfig
sparc   sparc32_defconfig
arm   aspeed_g4_defconfig
arm   imx_v4_v5_defconfig
sh  rsk7264_defconfig
arm   versatile_defconfig
shtitan_defconfig
arm rpc_defconfig
c6x  alldefconfig
powerpc  pmac32_defconfig
powerpc ksi8560_defconfig
powerpcicon_defconfig
arm  pxa3xx_defconfig
arm   cns3420vb_defconfig
arm  colibri_pxa270_defconfig
ia64 allmodconfig
ia64defconfig
ia64 allyesconfig
m68k allmodconfig
m68kdefconfig
m68k allyesconfig
nios2   defconfig
arc  allyesconfig
nds32 allnoconfig
c6x  allyesconfig
nds32   defconfig
nios2allyesconfig
cskydefconfig
alpha   defconfig
alphaallyesconfig
xtensa   allyesconfig
h8300allyesconfig
arc defconfig
sh   allmodconfig
parisc  defconfig
s390 allyesconfig
parisc   allyesconfig
s390defconfig
i386 allyesconfig
sparcallyesconfig
sparc   defconfig
i386defconfig
mips allyesconfig
mips allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc   allnoconfig
x86_64   randconfig-a004-20201101
x86_64   randconfig-a003-20201101
x86_64   randconfig-a005-20201101
x86_64   randconfig-a002-20201101
x86_64   randconfig-a006-20201101
x86_64   randconfig-a001-20201101
i386 randconfig-a004-20201102
i386 randconfig-a006-20201102
i386 randconfig-a005-20201102
i386 randconfig-a001-20201102
i386 randconfig-a002-20201102
i386 randconfig-a003-20201102
i386 randconfig-a004-20201101
i386 randconfig-a006-20201101
i386 randconfig-a005-20201101
i386 randconfig-a001-20201101
i386 randconfig-a002-20201101
i386 randconfig-a003-20201101
x86_64   randconfig-a012-20201102
x86_64   randconfig-a015-20201102
x86_64   randconfig-a011-20201102
x86_64   randconfig-a013-20201102
x86_64   randconfig-a014-20201102
x86_64   randconfig-a016-20201102
i386 randconfig-a013-20201102
i386 randconfig-a015-20201102
i386 randconfig-a014-20201102
i386 randconfig-a016-20201102
i386 randconfig-a011-20201102
i386 randconfig-a012-20201102
riscvnommu_k210_defconfig
riscvnommu_virt_defconfig
riscv allnoconfig
riscv   defconfig
riscvallmodconfig
x86_64   rhel
x86_64   allyesconfig
x86_64rhel-7.6-kselftests
x86_64  defconfig
x86_64   rhel-8.3
x86_64  kexec

clang tested configs:
x86_64   randconfig-a004-20201102
x86_64   randconfig-a005-20201102
x86_64   randconfig-a003-20201102
x86_64   randconfig-a002-20201102
x86_64   randconfig-a006-20201102
x86_64   randconfig-a001-20201102
x86_64   randconfig-a012-20201101
x86_64   randconfig-a015-20201101
x86_64   randconfig-a013-20201101
x86_64   randconfig-a011-20201101
x86_64   randconfig-a014-20201101
x86_64   randconfig-a016-20201101

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org

[powerpc:fixes-test] BUILD SUCCESS 99f070b62322a4b8c1252952735806d09eb44b68

2020-11-02 Thread kernel test robot

 haps_hs_smp_defconfig
arm vf610m4_defconfig
arm mv78xx0_defconfig
powerpc  ppc40x_defconfig
sh  sdk7780_defconfig
m68k  multi_defconfig
arm socfpga_defconfig
riscvallyesconfig
arm  badge4_defconfig
arm   sunxi_defconfig
xtensageneric_kc705_defconfig
sh   sh7770_generic_defconfig
cskydefconfig
nds32alldefconfig
sh ecovec24_defconfig
riscv  rv32_defconfig
powerpc  arches_defconfig
shmigor_defconfig
arm  pxa168_defconfig
sh  sh7785lcr_32bit_defconfig
powerpc  ppc44x_defconfig
i386 alldefconfig
powerpc mpc8560_ads_defconfig
arm  iop32x_defconfig
mipsmalta_kvm_guest_defconfig
mipsjmr3927_defconfig
powerpc mpc836x_rdk_defconfig
mipsmalta_qemu_32r6_defconfig
sh microdev_defconfig
powerpc rainier_defconfig
arm  footbridge_defconfig
powerpc  katmai_defconfig
powerpcge_imp3a_defconfig
powerpc mpc8313_rdb_defconfig
powerpcgamecube_defconfig
powerpc  allmodconfig
sh   se7206_defconfig
powerpc tqm8541_defconfig
shecovec24-romimage_defconfig
armmulti_v5_defconfig
powerpc tqm5200_defconfig
powerpc   lite5200b_defconfig
m68kmvme147_defconfig
h8300 edosk2674_defconfig
sh   sh2007_defconfig
ia64 alldefconfig
mips cobalt_defconfig
microblazenommu_defconfig
arm  gemini_defconfig
sparc   sparc32_defconfig
powerpc   allnoconfig
arm   aspeed_g4_defconfig
sh  rsk7264_defconfig
arm   versatile_defconfig
sh sh7710voipgw_defconfig
shtitan_defconfig
arm rpc_defconfig
c6x  alldefconfig
powerpc  pmac32_defconfig
powerpc ksi8560_defconfig
powerpcicon_defconfig
mips   ip27_defconfig
xtensasmp_lx200_defconfig
m68k   sun3_defconfig
ia64 allmodconfig
m68k allmodconfig
m68kdefconfig
m68k allyesconfig
nios2   defconfig
nds32 allnoconfig
c6x  allyesconfig
alphaallyesconfig
xtensa   allyesconfig
h8300allyesconfig
arc defconfig
sh   allmodconfig
parisc  defconfig
s390 allyesconfig
parisc   allyesconfig
s390defconfig
i386 allyesconfig
sparcallyesconfig
sparc   defconfig
i386defconfig
mips allmodconfig
powerpc  allyesconfig
x86_64   randconfig-a004-20201101
x86_64   randconfig-a003-20201101
x86_64   randconfig-a005-20201101
x86_64   randconfig-a002-20201101
x86_64   randconfig-a006-20201101
x86_64   randconfig-a001-20201101
i386 randconfig-a004-20201102
i386 randconfig-a006-20201102
i386 randconfig-a005-20201102
i386 randconfig-a001-20201102
i386 randconfig-a002-20201102
i386 randconfig-a003-20201102
i386 randconfig-a004-20201101
i386 randconfig-a006-20201101
i386 randconfig-a005-20201101
i386 randconfig-a001-20201101
i386 randconfig-a002-20201101
i386 randconfig-a003-20201101
x86_64   randconfig-a012-20201102
x86_64   randconfig-a015-20201102
x86_64   randconfig-a011-20201102
x86_64   randconfig-a013-20201102
x86_64   randconfig-a014-20201102
x86_64

[powerpc:next-test] BUILD SUCCESS 2d83b0f30c1483a556c8aa1f7d891006fffcd5e0

2020-11-02 Thread kernel test robot

  rv32_defconfig
powerpc tqm8541_defconfig
shecovec24-romimage_defconfig
armmulti_v5_defconfig
powerpc tqm5200_defconfig
powerpc   lite5200b_defconfig
m68kmvme147_defconfig
h8300 edosk2674_defconfig
sh   sh2007_defconfig
ia64 alldefconfig
mips cobalt_defconfig
microblazenommu_defconfig
arm  gemini_defconfig
sparc   sparc32_defconfig
powerpc   allnoconfig
arm   aspeed_g4_defconfig
arm   imx_v4_v5_defconfig
sh  rsk7264_defconfig
arm   versatile_defconfig
shtitan_defconfig
arm rpc_defconfig
c6x  alldefconfig
powerpc  pmac32_defconfig
powerpc ksi8560_defconfig
powerpcicon_defconfig
arm  pxa3xx_defconfig
arm   cns3420vb_defconfig
arm  colibri_pxa270_defconfig
ia64 allmodconfig
ia64defconfig
m68k allmodconfig
m68kdefconfig
m68k allyesconfig
nios2   defconfig
arc  allyesconfig
nds32 allnoconfig
c6x  allyesconfig
nds32   defconfig
nios2allyesconfig
cskydefconfig
alphaallyesconfig
xtensa   allyesconfig
h8300allyesconfig
arc defconfig
sh   allmodconfig
parisc  defconfig
s390 allyesconfig
parisc   allyesconfig
s390defconfig
i386 allyesconfig
sparcallyesconfig
sparc   defconfig
i386defconfig
mips allyesconfig
mips allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
x86_64   randconfig-a004-20201101
x86_64   randconfig-a003-20201101
x86_64   randconfig-a005-20201101
x86_64   randconfig-a002-20201101
x86_64   randconfig-a006-20201101
x86_64   randconfig-a001-20201101
i386 randconfig-a004-20201102
i386 randconfig-a006-20201102
i386 randconfig-a005-20201102
i386 randconfig-a001-20201102
i386 randconfig-a002-20201102
i386 randconfig-a003-20201102
i386 randconfig-a004-20201101
i386 randconfig-a006-20201101
i386 randconfig-a005-20201101
i386 randconfig-a001-20201101
i386 randconfig-a002-20201101
i386 randconfig-a003-20201101
x86_64   randconfig-a012-20201102
x86_64   randconfig-a015-20201102
x86_64   randconfig-a011-20201102
x86_64   randconfig-a013-20201102
x86_64   randconfig-a014-20201102
x86_64   randconfig-a016-20201102
i386 randconfig-a013-20201102
i386 randconfig-a015-20201102
i386 randconfig-a014-20201102
i386 randconfig-a016-20201102
i386 randconfig-a011-20201102
i386 randconfig-a012-20201102
riscvnommu_virt_defconfig
riscv allnoconfig
riscv   defconfig
riscvallmodconfig
x86_64   rhel
x86_64   allyesconfig
x86_64rhel-7.6-kselftests
x86_64  defconfig
x86_64   rhel-8.3
x86_64  kexec

clang tested configs:
x86_64   randconfig-a004-20201102
x86_64   randconfig-a005-20201102
x86_64   randconfig-a003-20201102
x86_64   randconfig-a002-20201102
x86_64   randconfig-a006-20201102
x86_64   randconfig-a001-20201102
x86_64   randconfig-a012-20201101
x86_64   randconfig-a015-20201101
x86_64   randconfig-a013-20201101
x86_64   randconfig-a011-20201101
x86_64   randconfig-a014-20201101
x86_64

Re: [PATCH v2 2/2] misc: ocxl: config: Rename function attribute description

2020-11-02 Thread Andrew Donnellan


On 3/11/20 1:20 am, Lee Jones wrote:

Fixes the following W=1 kernel build warning(s):

  drivers/misc/ocxl/config.c:81: warning: Function parameter or member 'dev' 
not described in 'get_function_0'
  drivers/misc/ocxl/config.c:81: warning: Excess function parameter 'device' 
description in 'get_function_0'

Cc: Frederic Barrat 
Cc: Andrew Donnellan 
Cc: Arnd Bergmann 
Cc: Greg Kroah-Hartman 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Lee Jones 


Thanks!

Acked-by: Andrew Donnellan 

--
Andrew Donnellan  OzLabs, ADL Canberra
a...@linux.ibm.com IBM Australia Limited

Re: [PATCH net-next 04/15] net: mlx5: Replace in_irq() usage.

2020-11-02 Thread Saeed Mahameed

On Sat, 2020-10-31 at 09:59 -0700, Jakub Kicinski wrote:
> On Tue, 27 Oct 2020 23:54:43 +0100 Sebastian Andrzej Siewior wrote:
> > mlx5_eq_async_int() uses in_irq() to decide whether eq::lock needs
> > to be
> > acquired and released with spin_[un]lock() or the irq
> > saving/restoring
> > variants.
> > 
> > The usage of in_*() in drivers is phased out and Linus clearly
> > requested
> > that code which changes behaviour depending on context should
> > either be
> > seperated or the context be conveyed in an argument passed by the
> > caller,
> > which usually knows the context.
> > 
> > mlx5_eq_async_int() knows the context via the action argument
> > already so
> > using it for the lock variant decision is a straight forward
> > replacement
> > for in_irq().
> > 
> > Signed-off-by: Sebastian Andrzej Siewior 
> > Cc: Saeed Mahameed 
> > Cc: Leon Romanovsky 
> > Cc: "David S. Miller" 
> > Cc: Jakub Kicinski 
> > Cc: linux-r...@vger.kernel.org
> 
> Saeed, please pick this up into your tree.

Ack

Re: [PATCH v2 net-next 3/3] crypto: caam: Replace in_irq() usage.

2020-11-02 Thread Horia Geantă

On 11/2/2020 1:23 AM, Sebastian Andrzej Siewior wrote:
> The driver uses in_irq() + in_serving_softirq() magic to decide if NAPI
> scheduling is required or packet processing.
> 
> The usage of in_*() in drivers is phased out and Linus clearly requested
> that code which changes behaviour depending on context should either be
> separated or the context be conveyed in an argument passed by the caller,
> which usually knows the context.
> 
> Use the `sched_napi' argument passed by the callback. It is set true if
> called from the interrupt handler and NAPI should be scheduled.
> 
> Signed-off-by: Sebastian Andrzej Siewior 
> Cc: "Horia Geantă" 
> Cc: Aymen Sghaier 
> Cc: Herbert Xu 
> Cc: "David S. Miller" 
> Cc: Madalin Bucur 
> Cc: Jakub Kicinski 
> Cc: Li Yang 
> Cc: linux-cry...@vger.kernel.org
> Cc: net...@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-arm-ker...@lists.infradead.org
Reviewed-by: Horia Geantă 

Thanks,
Horia

Re: [PATCH v2 net-next 1/3] soc/fsl/qbman: Add an argument to signal if NAPI processing is required.

2020-11-02 Thread Horia Geantă

On 11/2/2020 1:23 AM, Sebastian Andrzej Siewior wrote:
> dpaa_eth_napi_schedule() and caam_qi_napi_schedule() schedule NAPI if
> invoked from:
> 
>  - Hard interrupt context
>  - Any context which is not serving soft interrupts
> 
> Any context which is not serving soft interrupts includes hard interrupts
> so the in_irq() check is redundant. caam_qi_napi_schedule() has a comment
> about this:
> 
> /*
>  * In case of threaded ISR, for RT kernels in_irq() does not return
>  * appropriate value, so use in_serving_softirq to distinguish between
>  * softirq and irq contexts.
>  */
>  if (in_irq() || !in_serving_softirq())
> 
> This has nothing to do with RT. Even on a non RT kernel force threaded
> interrupts run obviously in thread context and therefore in_irq() returns
> false when invoked from the handler.
> 
> The extension of the in_irq() check with !in_serving_softirq() was there
> when the drivers were added, but in the out of tree FSL BSP the original
> condition was in_irq() which got extended due to failures on RT.
> 
Looks like the initial FSL BSP commit adding this check is:
edca0b7a448a ("dpaa_eth: Fix Rx-stall issue in threaded ISR")
https://source.codeaurora.org/external/qoriq/qoriq-yocto-sdk/linux/commit/?h=fsl-sdk-v1.2&id=edca0b7a448ac18ef0a9b1238209b7595d511e19

This was done for dpaa_eth and the same logic was reused in caam.
In the process of upstreaming the development history got lost and
the comment in dpaa_eth was removed.

This was back in 2012 on a v3.0.34 kernel.
Not sure if/how things changed in the meantime, i.e. whether in_irq()
behaviour when called from softirq changed on -rt kernels (assuming this was
the problem Priyanka tried solving).

> The usage of in_xxx() in drivers is phased out and Linus clearly requested
> that code which changes behaviour depending on context should either be
> separated or the context be conveyed in an argument passed by the caller,
> which usually knows the context. Right he is, the above construct is
> clearly showing why.
> 
> The following callchains have been analyzed to end up in
> dpaa_eth_napi_schedule():
> 
> qman_p_poll_dqrr()
>   __poll_portal_fast()
> fq->cb.dqrr()
>dpaa_eth_napi_schedule()
> 
> portal_isr()
>   __poll_portal_fast()
> fq->cb.dqrr()
>dpaa_eth_napi_schedule()
> 
> Both need to schedule NAPI.
Only the call from interrupt context.

> The crypto part has another code path leading up to this:
>   kill_fq()
>  empty_retired_fq()
>qman_p_poll_dqrr()
>  __poll_portal_fast()
> fq->cb.dqrr()
>dpaa_eth_napi_schedule()
> 
> kill_fq() is called from task context and ends up scheduling NAPI, but
> that's pointless and an unintended side effect of the !in_serving_softirq()
> check.
> 
Correct.

> The code path:
>   caam_qi_poll() -> qman_p_poll_dqrr()
> 
> is invoked from NAPI and I *assume* from crypto's NAPI device and not
> from qbman's NAPI device. I *guess* it is okay to skip scheduling NAPI
> (because this is what happens now) but could be changed if it is wrong
> due to `budget' handling.
> 
Looks good to me.

> Add an argument to __poll_portal_fast() which is true if NAPI needs to be
> scheduled. This requires propagating the value to the caller including
> `qman_cb_dqrr' typedef which is used by the dpaa and the crypto driver.
> 
> Signed-off-by: Sebastian Andrzej Siewior 
> Cc: "Horia Geantă" 
> Cc: Aymen Sghaier 
> Cc: Herbert XS 
> Cc: "David S. Miller" 
> Cc: Madalin Bucur 
> Cc: Jakub Kicinski 
> Cc: Li Yang 
> Cc: linux-cry...@vger.kernel.org
> Cc: net...@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-arm-ker...@lists.infradead.org
Reviewed-by: Horia Geantă 

Thanks,
Horia

Kernel panic from malloc() on SUSE 15.1?

2020-11-02 Thread Carl Jacobsen

I've got a SUSE 15.1 install (on ppc64le) that kernel panics on a very
simple
test program, built in a slightly unusual way.

I'm compiling on SUSE 12, using gcc 4.8.3. I'm linking to a static
copy of libcrypto.a (from openssl-1.1.1g), built without threads.
I have a 10 line C test program that compiles and runs fine on the
SUSE 12 system. If I compile the same program on SUSE 15.1 (with
gcc 7.4.1), it runs fine on SUSE 15.1.

But, if I run the version that I compiled on SUSE 12, on the SUSE 15.1
system, the call to RAND_status() gets to a malloc() and then panics.
(And, of course, if I just compile a call to malloc(), that runs fine
on both systems.) Here's the test program, it's really just a call to
RAND_status():

#include 
#include 

int main(int argc, char **argv)
{
int has_enough_data = RAND_status();
printf("The PRNG %s been seeded with enough data\n",
   has_enough_data ? "HAS" : "has NOT");
return 0;
}

openssl is configured/built with:
./config no-shared no-dso no-threads -fPIC -ggdb3 -debug -static
make

and the test program is compiled with:
gcc -ggdb3 -o rand_test rand_test.c libcrypto.a

The kernel on SUSE 12 is: 3.12.28-4-default
And glibc is: 2.19

The kernel on SUSE 15.1 is: 4.12.14-197.18-default
And glibc is: 2.26

In a previous iteration it was panicking in pthread_once(), so
I compiled openssl without pthreads support, and now it panics
calling malloc().

If I link to the system-supplied libcrypto.so, it works fine, and
running the same tests on x86_64 works fine, it's only ppc64le
that panics, and only running code from the old system on the
new one.

I'm trying to dig further down into this to come up with a standalone
test case, but I'm wondering if anything here stands out as a known
problem, or if someone can point me in the right direction.

Thanks,
Carl Jacobsen

Re: [PATCH 20/33] docs: ABI: testing: make the files compatible with ReST output

2020-11-02 Thread Gautham R Shenoy

On Wed, Oct 28, 2020 at 03:23:18PM +0100, Mauro Carvalho Chehab wrote:
> From: Mauro Carvalho Chehab 
> 
> Some files over there won't parse well by Sphinx.
> 

[..snip..]



> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu 
> b/Documentation/ABI/testing/sysfs-devices-system-cpu
> index b555df825447..274c337ec6a9 100644
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -151,23 +151,28 @@ Description:
>   The processor idle states which are available for use have the
>   following attributes:
> 
> - name: (RO) Name of the idle state (string).
> +   =
> + name:(RO) Name of the idle state (string).
> 
>   latency: (RO) The latency to exit out of this idle state (in
> - microseconds).
> +   microseconds).
> 
> - power: (RO) The power consumed while in this idle state (in
> - milliwatts).
> + power:   (RO) The power consumed while in this idle state (in
> +   milliwatts).
> 
> - time: (RO) The total time spent in this idle state (in 
> microseconds).
> + time:(RO) The total time spent in this idle state
> +   (in microseconds).
> 
> - usage: (RO) Number of times this state was entered (a count).
> + usage:   (RO) Number of times this state was entered (a count).
> 
> - above: (RO) Number of times this state was entered, but the
> -observed CPU idle duration was too short for it (a 
> count).
> + above:   (RO) Number of times this state was entered, but the
> +   observed CPU idle duration was too short for it
> +   (a count).
> 
> - below: (RO) Number of times this state was entered, but the
> -observed CPU idle duration was too long for it (a count).
> + below:   (RO) Number of times this state was entered, but the
> +   observed CPU idle duration was too long for it
> +   (a count).
> +   =
> 
>  What:/sys/devices/system/cpu/cpuX/cpuidle/stateN/desc
>  Date:February 2008
> @@ -290,6 +295,7 @@ Description:  Processor frequency boosting control
>   This switch controls the boost setting for the whole system.
>   Boosting allows the CPU and the firmware to run at a frequency
>   beyound it's nominal limit.
> +
>   More details can be found in
>   Documentation/admin-guide/pm/cpufreq.rst
> 

The changes to cpuidle states look good to me.


[..snip..]

> @@ -414,30 +434,30 @@ Description:POWERNV CPUFreq driver's frequency 
> throttle stats directory and
>   throttle attributes exported in the 'throttle_stats' directory:
> 
>   - turbo_stat : This file gives the total number of times the max
> - frequency is throttled to lower frequency in turbo (at and above
> - nominal frequency) range of frequencies.
> +   frequency is throttled to lower frequency in turbo (at and 
> above
> +   nominal frequency) range of frequencies.
> 
>   - sub_turbo_stat : This file gives the total number of times the
> - max frequency is throttled to lower frequency in sub-turbo(below
> - nominal frequency) range of frequencies.
> +   max frequency is throttled to lower frequency in 
> sub-turbo(below
> +   nominal frequency) range of frequencies.
> 
>   - unthrottle : This file gives the total number of times the max
> - frequency is unthrottled after being throttled.
> +   frequency is unthrottled after being throttled.
> 
>   - powercap : This file gives the total number of times the max
> - frequency is throttled due to 'Power Capping'.
> +   frequency is throttled due to 'Power Capping'.
> 
>   - overtemp : This file gives the total number of times the max
> - frequency is throttled due to 'CPU Over Temperature'.
> +   frequency is throttled due to 'CPU Over Temperature'.
> 
>   - supply_fault : This file gives the total number of times the
> - max frequency is throttled due to 'Power Supply Failure'.
> +   max frequency is throttled due to 'Power Supply Failure'.
> 
>   - overcurrent : This file gives the total number of times the
> - max frequency is throttled due to 'Overcurrent'.
> +   max frequency is throttled due to 'Overcurrent'.
> 
>   - occ_reset : This file gives the

Re: [PATCH v2 20/39] docs: ABI: testing: make the files compatible with ReST output

2020-11-02 Thread Mauro Carvalho Chehab

Em Mon, 2 Nov 2020 13:46:41 +0100
Greg Kroah-Hartman  escreveu:

> On Mon, Nov 02, 2020 at 12:04:36PM +0100, Fabrice Gasnier wrote:
> > On 10/30/20 11:09 AM, Mauro Carvalho Chehab wrote:  
> > > Em Fri, 30 Oct 2020 10:19:12 +0100
> > > Fabrice Gasnier  escreveu:
> > >   
> > >> Hi Mauro,
> > >>
> > >> [...]
> > >>  
> > >>>  
> > >>> +What:  
> > >>> /sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available
> > >>> +KernelVersion: 4.12
> > >>> +Contact:   benjamin.gaign...@st.com
> > >>> +Description:
> > >>> +   Reading returns the list possible quadrature modes.
> > >>> +
> > >>> +What:  
> > >>> /sys/bus/iio/devices/iio:deviceX/in_count0_quadrature_mode
> > >>> +KernelVersion: 4.12
> > >>> +Contact:   benjamin.gaign...@st.com
> > >>> +Description:
> > >>> +   Configure the device counter quadrature modes:
> > >>> +
> > >>> +   channel_A:
> > >>> +   Encoder A input servers as the count input and 
> > >>> B as
> > >>> +   the UP/DOWN direction control input.
> > >>> +
> > >>> +   channel_B:
> > >>> +   Encoder B input serves as the count input and A 
> > >>> as
> > >>> +   the UP/DOWN direction control input.
> > >>> +
> > >>> +   quadrature:
> > >>> +   Encoder A and B inputs are mixed to get 
> > >>> direction
> > >>> +   and count with a scale of 0.25.
> > >>> +
> > >>  
> > > 
> > > Hi Fabrice,
> > >   
> > >> I just noticed that since Jonathan question in v1.
> > >>
> > >> Above ABI has been moved in the past as discussed in [1]. You can take a
> > >> look at:
> > >> b299d00 IIO: stm32: Remove quadrature related functions from trigger 
> > >> driver
> > >>
> > >> Could you please remove the above chunk ?
> > >>
> > >> With that, for the stm32 part:
> > >> Acked-by: Fabrice Gasnier   
> > > 
> > > 
> > > Hmm... probably those were re-introduced due to a rebase. This
> > > series were originally written about 1,5 years ago.
> > > 
> > > I'll drop those hunks.  
> > 
> > Hi Mauro, Greg,
> > 
> > I just figured out this patch has been applied with above hunk.
> > 
> > This should be dropped: is there a fix on its way already ?
> > (I may have missed it)  
> 
> Can you send a fix for just this hunk?

Hmm...

$ git grep 
/sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available
Documentation/ABI/testing/sysfs-bus-iio-counter-104-quad-8:What:
/sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available
Documentation/ABI/testing/sysfs-bus-iio-lptimer-stm32:What: 
/sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available
Documentation/ABI/testing/sysfs-bus-iio-timer-stm32:What:   
/sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available

Even re-doing the changes from 
changeset b299d00420e2 ("IIO: stm32: Remove quadrature related functions from 
trigger driver")
at Documentation/ABI/testing/sysfs-bus-iio-timer-stm32, there's still
a third duplicate of some of those, as reported by the script:

$ ./scripts/get_abi.pl validate 2>&1|grep quadra
Warning: /sys/bus/iio/devices/iio:deviceX/in_count0_quadrature_mode is 
defined 2 times:  Documentation/ABI/testing/sysfs-bus-iio-timer-stm32:117  
Documentation/ABI/testing/sysfs-bus-iio-lptimer-stm32:14
Warning: 
/sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available is defined 
3 times:  Documentation/ABI/testing/sysfs-bus-iio-counter-104-quad-8:2  
Documentation/ABI/testing/sysfs-bus-iio-timer-stm32:111  
Documentation/ABI/testing/sysfs-bus-iio-lptimer-stm32:8

As in_count_quadrature_mode_available is also defined at:
Documentation/ABI/testing/sysfs-bus-iio-counter-104-quad-8:2

The best here seems to have a patch that will also drop the other
duplication of this, probably moving in_count_quadrature_mode_available
to a generic node probably placing it inside 
Documentation/ABI/testing/sysfs-bus-iio.

Comments?

Thanks,
Mauro

PS.: the IIO subsystem is the one that currently has more duplicated
ABI entries:

$ ./scripts/get_abi.pl validate 2>&1|grep iio
Warning: /sys/bus/iio/devices/iio:deviceX/in_accel_x_calibbias is defined 2 
times:  Documentation/ABI/testing/sysfs-bus-iio-icm42600:0  
Documentation/ABI/testing/sysfs-bus-iio:394
Warning: /sys/bus/iio/devices/iio:deviceX/in_accel_y_calibbias is defined 2 
times:  Documentation/ABI/testing/sysfs-bus-iio-icm42600:1  
Documentation/ABI/testing/sysfs-bus-iio:395
Warning: /sys/bus/iio/devices/iio:deviceX/in_accel_z_calibbias is defined 2 
times:  Documentation/ABI/testing/sysfs-bus-iio-icm42600:2  
Documentation/ABI/testing/sysfs-bus-iio:396
Warning: /sys/bus/iio/devices/iio:deviceX/in_anglvel_x_calibbias is defined 2 
times:  Documentation/ABI/testing/sysfs-bus-iio-icm42600:3  
Documentation/ABI/testing/sysfs-bus-iio:397
Warning: /sys/bus/iio/devices/ii

Re: [PATCH v2 20/39] docs: ABI: testing: make the files compatible with ReST output

2020-11-02 Thread Greg Kroah-Hartman

On Mon, Nov 02, 2020 at 12:04:36PM +0100, Fabrice Gasnier wrote:
> On 10/30/20 11:09 AM, Mauro Carvalho Chehab wrote:
> > Em Fri, 30 Oct 2020 10:19:12 +0100
> > Fabrice Gasnier  escreveu:
> > 
> >> Hi Mauro,
> >>
> >> [...]
> >>
> >>>  
> >>> +What:
> >>> /sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available
> >>> +KernelVersion:   4.12
> >>> +Contact: benjamin.gaign...@st.com
> >>> +Description:
> >>> + Reading returns the list possible quadrature modes.
> >>> +
> >>> +What:
> >>> /sys/bus/iio/devices/iio:deviceX/in_count0_quadrature_mode
> >>> +KernelVersion:   4.12
> >>> +Contact: benjamin.gaign...@st.com
> >>> +Description:
> >>> + Configure the device counter quadrature modes:
> >>> +
> >>> + channel_A:
> >>> + Encoder A input servers as the count input and B as
> >>> + the UP/DOWN direction control input.
> >>> +
> >>> + channel_B:
> >>> + Encoder B input serves as the count input and A as
> >>> + the UP/DOWN direction control input.
> >>> +
> >>> + quadrature:
> >>> + Encoder A and B inputs are mixed to get direction
> >>> + and count with a scale of 0.25.
> >>> +  
> >>
> > 
> > Hi Fabrice,
> > 
> >> I just noticed that since Jonathan question in v1.
> >>
> >> Above ABI has been moved in the past as discussed in [1]. You can take a
> >> look at:
> >> b299d00 IIO: stm32: Remove quadrature related functions from trigger driver
> >>
> >> Could you please remove the above chunk ?
> >>
> >> With that, for the stm32 part:
> >> Acked-by: Fabrice Gasnier 
> > 
> > 
> > Hmm... probably those were re-introduced due to a rebase. This
> > series were originally written about 1,5 years ago.
> > 
> > I'll drop those hunks.
> 
> Hi Mauro, Greg,
> 
> I just figured out this patch has been applied with above hunk.
> 
> This should be dropped: is there a fix on its way already ?
> (I may have missed it)

Can you send a fix for just this hunk?

thanks,

greg k-h

Re: [PATCH v2 20/39] docs: ABI: testing: make the files compatible with ReST output

2020-11-02 Thread Fabrice Gasnier

On 10/30/20 11:09 AM, Mauro Carvalho Chehab wrote:
> Em Fri, 30 Oct 2020 10:19:12 +0100
> Fabrice Gasnier  escreveu:
> 
>> Hi Mauro,
>>
>> [...]
>>
>>>  
>>> +What:  
>>> /sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available
>>> +KernelVersion: 4.12
>>> +Contact:   benjamin.gaign...@st.com
>>> +Description:
>>> +   Reading returns the list possible quadrature modes.
>>> +
>>> +What:  
>>> /sys/bus/iio/devices/iio:deviceX/in_count0_quadrature_mode
>>> +KernelVersion: 4.12
>>> +Contact:   benjamin.gaign...@st.com
>>> +Description:
>>> +   Configure the device counter quadrature modes:
>>> +
>>> +   channel_A:
>>> +   Encoder A input servers as the count input and B as
>>> +   the UP/DOWN direction control input.
>>> +
>>> +   channel_B:
>>> +   Encoder B input serves as the count input and A as
>>> +   the UP/DOWN direction control input.
>>> +
>>> +   quadrature:
>>> +   Encoder A and B inputs are mixed to get direction
>>> +   and count with a scale of 0.25.
>>> +  
>>
> 
> Hi Fabrice,
> 
>> I just noticed that since Jonathan question in v1.
>>
>> Above ABI has been moved in the past as discussed in [1]. You can take a
>> look at:
>> b299d00 IIO: stm32: Remove quadrature related functions from trigger driver
>>
>> Could you please remove the above chunk ?
>>
>> With that, for the stm32 part:
>> Acked-by: Fabrice Gasnier 
> 
> 
> Hmm... probably those were re-introduced due to a rebase. This
> series were originally written about 1,5 years ago.
> 
> I'll drop those hunks.

Hi Mauro, Greg,

I just figured out this patch has been applied with above hunk.

This should be dropped: is there a fix on its way already ?
(I may have missed it)

Please advise,
Fabrice
> 
> Thanks!
> Mauro
>

[PATCH 11/11 v2.2] ftrace: Add recording of functions that caused recursion

2020-11-02 Thread Steven Rostedt

From c532ff6b048dd4a12943b05c7b8ce30666c587c8 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" 
Date: Thu, 29 Oct 2020 15:27:06 -0400
Subject: [PATCH] ftrace: Add recording of functions that caused recursion

This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file
"recursed_functions" all the functions that caused recursion while a
callback to the function tracer was running.

Cc: Jonathan Corbet 
Cc: Guo Ren 
Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Heiko Carstens 
Cc: Vasily Gorbik 
Cc: Christian Borntraeger 
Cc: Thomas Gleixner 
Cc: Borislav Petkov 
Cc: x...@kernel.org
Cc: "H. Peter Anvin" 
Cc: Kees Cook 
Cc: Anton Vorontsov 
Cc: Colin Cross 
Cc: Tony Luck 
Cc: Josh Poimboeuf 
Cc: Jiri Kosina 
Cc: Miroslav Benes 
Cc: Petr Mladek 
Cc: Joe Lawrence 
Cc: Kamalesh Babulal 
Cc: Mauro Carvalho Chehab 
Cc: Sebastian Andrzej Siewior 
Cc: linux-...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-c...@vger.kernel.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: live-patch...@vger.kernel.org
Signed-off-by: Steven Rostedt (VMware) 
---

Changes since v2.1:
  Added EXPORT_SYMBOL_GPL() to ftrace_record_recursion() function

 Documentation/trace/ftrace-uses.rst   |   6 +-
 arch/csky/kernel/probes/ftrace.c  |   2 +-
 arch/parisc/kernel/ftrace.c   |   2 +-
 arch/powerpc/kernel/kprobes-ftrace.c  |   2 +-
 arch/s390/kernel/ftrace.c |   2 +-
 arch/x86/kernel/kprobes/ftrace.c  |   2 +-
 fs/pstore/ftrace.c|   2 +-
 include/linux/trace_recursion.h   |  32 +++-
 kernel/livepatch/patch.c  |   2 +-
 kernel/trace/Kconfig  |  25 +++
 kernel/trace/Makefile |   1 +
 kernel/trace/ftrace.c |   4 +-
 kernel/trace/trace_event_perf.c   |   2 +-
 kernel/trace/trace_functions.c|   2 +-
 kernel/trace/trace_output.c   |   6 +-
 kernel/trace/trace_output.h   |   1 +
 kernel/trace/trace_recursion_record.c | 236 ++
 17 files changed, 309 insertions(+), 20 deletions(-)
 create mode 100644 kernel/trace/trace_recursion_record.c

diff --git a/Documentation/trace/ftrace-uses.rst 
b/Documentation/trace/ftrace-uses.rst
index 86cd14b8e126..5981d5691745 100644
--- a/Documentation/trace/ftrace-uses.rst
+++ b/Documentation/trace/ftrace-uses.rst
@@ -118,7 +118,7 @@ can help in this regard. If you start your code with:
 
int bit;
 
-   bit = ftrace_test_recursion_trylock();
+   bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
 
@@ -130,7 +130,9 @@ The code in between will be safe to use, even if it ends up 
calling a
 function that the callback is tracing. Note, on success,
 ftrace_test_recursion_trylock() will disable preemption, and the
 ftrace_test_recursion_unlock() will enable it again (if it was previously
-enabled).
+enabled). The instruction pointer (ip) and its parent (parent_ip) is passed to
+ftrace_test_recursion_trylock() to record where the recursion happened
+(if CONFIG_FTRACE_RECORD_RECURSION is set).
 
 Alternatively, if the FTRACE_OPS_FL_RECURSION flag is set on the ftrace_ops
 (as explained below), then a helper trampoline will be used to test
diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
index 5eb2604fdf71..f30b179924ef 100644
--- a/arch/csky/kernel/probes/ftrace.c
+++ b/arch/csky/kernel/probes/ftrace.c
@@ -18,7 +18,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe *p;
struct kprobe_ctlblk *kcb;
 
-   bit = ftrace_test_recursion_trylock();
+   bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
 
diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index 4b1fdf15662c..8b0ed7c5a4ab 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -210,7 +210,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe *p = get_kprobe((kprobe_opcode_t *)ip);
int bit;
 
-   bit = ftrace_test_recursion_trylock();
+   bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
 
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c 
b/arch/powerpc/kernel/kprobes-ftrace.c
index 5df8d50c65ae..fdfee39938ea 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -20,7 +20,7 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long 
parent_nip,
struct kprobe_ctlblk *kcb;
int bit;
 
-   bit = ftrace_test_recursion_trylock();
+   bit = ftrace_test_recursion_trylock(nip, parent_nip);
if (bit < 0)
return;
 
diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
index 88466d7fb6b2..a1556333d481 100644
--- a/

[PATCH 11/11 v2.1] ftrace: Add recording of functions that caused recursion

2020-11-02 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file
"recursed_functions" all the functions that caused recursion while a
callback to the function tracer was running.

Cc: Jonathan Corbet 
Cc: Guo Ren 
Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Heiko Carstens 
Cc: Vasily Gorbik 
Cc: Christian Borntraeger 
Cc: Thomas Gleixner 
Cc: Borislav Petkov 
Cc: x...@kernel.org
Cc: "H. Peter Anvin" 
Cc: Kees Cook 
Cc: Anton Vorontsov 
Cc: Colin Cross 
Cc: Tony Luck 
Cc: Josh Poimboeuf 
Cc: Jiri Kosina 
Cc: Miroslav Benes 
Cc: Petr Mladek 
Cc: Joe Lawrence 
Cc: Kamalesh Babulal 
Cc: Mauro Carvalho Chehab 
Cc: Sebastian Andrzej Siewior 
Cc: linux-...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-c...@vger.kernel.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: live-patch...@vger.kernel.org
Signed-off-by: Steven Rostedt (VMware) 
---
 Documentation/trace/ftrace-uses.rst   |   6 +-
 arch/csky/kernel/probes/ftrace.c  |   2 +-
 arch/parisc/kernel/ftrace.c   |   2 +-
 arch/powerpc/kernel/kprobes-ftrace.c  |   2 +-
 arch/s390/kernel/ftrace.c |   2 +-
 arch/x86/kernel/kprobes/ftrace.c  |   2 +-
 fs/pstore/ftrace.c|   2 +-
 include/linux/trace_recursion.h   |  32 +++-
 kernel/livepatch/patch.c  |   2 +-
 kernel/trace/Kconfig  |  25 +++
 kernel/trace/Makefile |   1 +
 kernel/trace/ftrace.c |   4 +-
 kernel/trace/trace_event_perf.c   |   2 +-
 kernel/trace/trace_functions.c|   2 +-
 kernel/trace/trace_output.c   |   6 +-
 kernel/trace/trace_output.h   |   1 +
 kernel/trace/trace_recursion_record.c | 235 ++
 17 files changed, 308 insertions(+), 20 deletions(-)
 create mode 100644 kernel/trace/trace_recursion_record.c

diff --git a/Documentation/trace/ftrace-uses.rst 
b/Documentation/trace/ftrace-uses.rst
index 86cd14b8e126..5981d5691745 100644
--- a/Documentation/trace/ftrace-uses.rst
+++ b/Documentation/trace/ftrace-uses.rst
@@ -118,7 +118,7 @@ can help in this regard. If you start your code with:
 
int bit;
 
-   bit = ftrace_test_recursion_trylock();
+   bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
 
@@ -130,7 +130,9 @@ The code in between will be safe to use, even if it ends up 
calling a
 function that the callback is tracing. Note, on success,
 ftrace_test_recursion_trylock() will disable preemption, and the
 ftrace_test_recursion_unlock() will enable it again (if it was previously
-enabled).
+enabled). The instruction pointer (ip) and its parent (parent_ip) is passed to
+ftrace_test_recursion_trylock() to record where the recursion happened
+(if CONFIG_FTRACE_RECORD_RECURSION is set).
 
 Alternatively, if the FTRACE_OPS_FL_RECURSION flag is set on the ftrace_ops
 (as explained below), then a helper trampoline will be used to test
diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
index 5eb2604fdf71..f30b179924ef 100644
--- a/arch/csky/kernel/probes/ftrace.c
+++ b/arch/csky/kernel/probes/ftrace.c
@@ -18,7 +18,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe *p;
struct kprobe_ctlblk *kcb;
 
-   bit = ftrace_test_recursion_trylock();
+   bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
 
diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index 4b1fdf15662c..8b0ed7c5a4ab 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -210,7 +210,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe *p = get_kprobe((kprobe_opcode_t *)ip);
int bit;
 
-   bit = ftrace_test_recursion_trylock();
+   bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
 
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c 
b/arch/powerpc/kernel/kprobes-ftrace.c
index 5df8d50c65ae..fdfee39938ea 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -20,7 +20,7 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long 
parent_nip,
struct kprobe_ctlblk *kcb;
int bit;
 
-   bit = ftrace_test_recursion_trylock();
+   bit = ftrace_test_recursion_trylock(nip, parent_nip);
if (bit < 0)
return;
 
diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
index 88466d7fb6b2..a1556333d481 100644
--- a/arch/s390/kernel/ftrace.c
+++ b/arch/s390/kernel/ftrace.c
@@ -204,7 +204,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe *p = get_kprobe((kprobe_opcode_t *)ip);
int bit;
 
-   bit = ftrace_test_recursion_trylo

RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"

2020-11-02 Thread David Laight

From: 'Greg KH'
> Sent: 02 November 2020 13:52
> 
> On Mon, Nov 02, 2020 at 09:06:38AM +, David Laight wrote:
> > From: 'Greg KH'
> > > Sent: 23 October 2020 15:47
> > >
> > > On Fri, Oct 23, 2020 at 02:39:24PM +, David Laight wrote:
> > > > From: David Hildenbrand
> > > > > Sent: 23 October 2020 15:33
> > > > ...
> > > > > I just checked against upstream code generated by clang 10 and it
> > > > > properly discards the upper 32bit via a mov w23 w2.
> > > > >
> > > > > So at least clang 10 indeed properly assumes we could have garbage and
> > > > > masks it off.
> > > > >
> > > > > Maybe the issue is somewhere else, unrelated to nr_pages ... or clang 
> > > > > 11
> > > > > behaves differently.
> > > >
> > > > We'll need the disassembly from a failing kernel image.
> > > > It isn't that big to hand annotate.
> > >
> > > I've worked around the merge at the moment in the android tree, but it
> > > is still quite reproducable, and will try to get a .o file to
> > > disassemble on Monday or so...
> >
> > Did this get properly resolved?
> 
> For some reason, 5.10-rc2 fixed all of this up.  I backed out all of the
> patches I had to revert to get 5.10-rc1 to work properly, and then did
> the merge and all is well.
> 
> It must have been something to do with the compat changes in this same
> area that went in after 5.10-rc1, and something got reorganized in the
> files somehow.  I really do not know, and at the moment, don't have the
> time to track it down anymore.  So for now, I'd say it's all good, sorry
> for the noise.

Hopefully it won't appear again.

Saved me spending a day off reading arm64 assembler.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)

Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion

2020-11-02 Thread Steven Rostedt

On Mon, 2 Nov 2020 12:37:21 -0500
Steven Rostedt  wrote:

> The only race that I see that can happen, is the one in the comment I
> showed. And that is after enabling the recursed functions again after
> clearing, one CPU could add a function while another CPU that just added
> that same function could be just exiting this routine, notice that a
> clearing of the array happened, and remove its function (which was the same
> as the one just happened). So we get a "zero" in the array. If this
> happens, it is likely that that function will recurse again and will be
> added later.
> 

Updated version of this function:

-- Steve

void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip)
{
int index = 0;
int i;
unsigned long old;

 again:
/* First check the last one recorded */
if (ip == cached_function)
return;

i = atomic_read(&nr_records);
/* nr_records is -1 when clearing records */
smp_mb__after_atomic();
if (i < 0)
return;

/*
 * If there's two writers and this writer comes in second,
 * the cmpxchg() below to update the ip will fail. Then this
 * writer will try again. It is possible that index will now
 * be greater than nr_records. This is because the writer
 * that succeeded has not updated the nr_records yet.
 * This writer could keep trying again until the other writer
 * updates nr_records. But if the other writer takes an
 * interrupt, and that interrupt locks up that CPU, we do
 * not want this CPU to lock up due to the recursion protection,
 * and have a bug report showing this CPU as the cause of
 * locking up the computer. To not lose this record, this
 * writer will simply use the next position to update the
 * recursed_functions, and it will update the nr_records
 * accordingly.
 */
if (index < i)
index = i;
if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE)
return;

for (i = index - 1; i >= 0; i--) {
if (recursed_functions[i].ip == ip) {
cached_function = ip;
return;
}
}

cached_function = ip;

/*
 * We only want to add a function if it hasn't been added before.
 * Add to the current location before incrementing the count.
 * If it fails to add, then increment the index (save in i)
 * and try again.
 */
old = cmpxchg(&recursed_functions[index].ip, 0, ip);
if (old != 0) {
/* Did something else already added this for us? */
if (old == ip)
return;
/* Try the next location (use i for the next index) */
index++;
goto again;
}

recursed_functions[index].parent_ip = parent_ip;

/*
 * It's still possible that we could race with the clearing
 *CPU0CPU1
 *
 *   ip = func
 *  nr_records = -1;
 *  recursed_functions[0] = 0;
 *   i = -1
 *   if (i < 0)
 *  nr_records = 0;
 *  (new recursion detected)
 *  recursed_functions[0] = func
 *
cmpxchg(recursed_functions[0],
 *func, 0)
 *
 * But the worse that could happen is that we get a zero in
 * the recursed_functions array, and it's likely that "func" will
 * be recorded again.
 */
i = atomic_read(&nr_records);
smp_mb__after_atomic();
if (i < 0)
cmpxchg(&recursed_functions[index].ip, ip, 0);
else if (i <= index)
atomic_cmpxchg(&nr_records, i, index + 1);
}

Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion

2020-11-02 Thread Steven Rostedt

On Mon, 2 Nov 2020 17:41:47 +0100
Petr Mladek  wrote:

> > +   i = atomic_read(&nr_records);
> > +   smp_mb__after_atomic();
> > +   if (i < 0)
> > +   cmpxchg(&recursed_functions[index].ip, ip, 0);
> > +   else if (i <= index)
> > +   atomic_cmpxchg(&nr_records, i, index + 1);  
> 
> This looks weird. It would shift nr_records past the record added
> in this call. It might skip many slots that were zeroed when clearing.
> Also we do not know if our entry was not zeroed as well.

nr_records always holds the next position to write to.

index = nr_records;
recursed_functions[index].ip = ip;
nr_records++;

Before clearing, we have:

nr_records = -1;
smp_mb();
memset(recursed_functions, 0);
smp_wmb();
nr_records = 0;

When we enter this function:

i = nr_records;
smp_mb();
if (i < 0)
return;

Thus, we just stopped all new updates while clearing the records.

But what about if something is currently updating?

i = nr_records;
smp_mb();
if (i < 0)
cmpxchg(recursed_functions, ip, 0);

The above shows that if the current updating process notices that the
clearing happens, it will clear the function it added.

else if (i <= index)
cmpxchg(nr_records, i, index + 1);

This makes sure that nr_records only grows if it is greater or equal to
zero.

The only race that I see that can happen, is the one in the comment I
showed. And that is after enabling the recursed functions again after
clearing, one CPU could add a function while another CPU that just added
that same function could be just exiting this routine, notice that a
clearing of the array happened, and remove its function (which was the same
as the one just happened). So we get a "zero" in the array. If this
happens, it is likely that that function will recurse again and will be
added later.

-- Steve

Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion

2020-11-02 Thread Steven Rostedt

On Mon, 2 Nov 2020 12:09:07 -0500
Steven Rostedt  wrote:

> > > +void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip)
> > > +{
> > > + int index;
> > > + int i = 0;
> > > + unsigned long old;
> > > +
> > > + again:
> > > + /* First check the last one recorded */
> > > + if (ip == cached_function)
> > > + return;
> > > +
> > > + index = atomic_read(&nr_records);
> > > + /* nr_records is -1 when clearing records */
> > > + smp_mb__after_atomic();
> > > + if (index < 0)
> > > + return;
> > > +
> > > + /* See below */
> > > + if (i > index)
> > > + index = i;
> > 
> > This looks like a complicated way to do index++ via "i" variable.
> > I guess that it was needed only in some older variant of the code.
> > See below.  
> 
> Because we reread the index above, and index could be bigger than i (more
> than index + 1).
> 
> >   
> > > + if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE)
> > > + return;
> > > +
> > > + for (i = index - 1; i >= 0; i--) {
> > > + if (recursed_functions[i].ip == ip) {
> > > + cached_function = ip;
> > > + return;
> > > + }
> > > + }
> > > +
> > > + cached_function = ip;
> > > +
> > > + /*
> > > +  * We only want to add a function if it hasn't been added before.
> > > +  * Add to the current location before incrementing the count.
> > > +  * If it fails to add, then increment the index (save in i)
> > > +  * and try again.
> > > +  */
> > > + old = cmpxchg(&recursed_functions[index].ip, 0, ip);
> > > + if (old != 0) {
> > > + /* Did something else already added this for us? */
> > > + if (old == ip)
> > > + return;
> > > + /* Try the next location (use i for the next index) */
> > > + i = index + 1;
> > 
> > What about
> > 
> > index++;
> > 
> > We basically want to run the code again with index + 1 limit.  
> 
> But something else could update nr_records, and we want to use that if
> nr_records is greater than i.
> 
> Now, we could swap the use case, and have
> 
>   int index = 0;
> 
>   [..]
>   i = atomic_read(&nr_records);
>   if (i > index)
>   index = i;
> 
>   [..]
> 
>   index++;
>   goto again;
> 
> 
> > 
> > Maybe, it even does not make sense to check the array again
> > and we should just try to store the value into the next slot.  
> 
> We do this dance to prevent duplicates.
> 
> But you are correct, that this went through a few iterations. And the first
> ones didn't have the cmpxchg on the ip itself, and that could make it so
> that we don't need this index = i dance.

Playing with this more, I remember why I did this song and dance.

If we have two or more writers, and one beats the other in updating the ip
(with a different function). This one will go and try again. The reason to
look at one passed nr_records, is because of the race between the multiple
writers. This one may loop before the other can update nr_records, and it
will fail to apply it again.

You could just say, "hey we'll just keep looping until the other writer
eventually updates nr_records". But this is where my paranoia gets in. What
happens if that other writer takes an interrupt (interrupts are not
disabled), and then deadlocks, or does something bad? This CPU will not get
locked up spinning.

Unlikely scenario, and it would require a bug someplace else. But I don't
want a bug report stating that it found this recursion locking locking up
the CPU and hide the real culprit.

I'll add a comment to explain this in the code. And also swap the i and
index around to make a little more sense.

-- Steve

Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion

2020-11-02 Thread Steven Rostedt

On Mon, 2 Nov 2020 17:41:47 +0100
Petr Mladek  wrote:

> On Fri 2020-10-30 17:31:53, Steven Rostedt wrote:
> > From: "Steven Rostedt (VMware)" 
> > 
> > This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file
> > "recursed_functions" all the functions that caused recursion while a
> > callback to the function tracer was running.
> >   
> 
> > --- /dev/null
> > +++ b/kernel/trace/trace_recursion_record.c
> > @@ -0,0 +1,220 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include "trace_output.h"
> > +
> > +struct recursed_functions {
> > +   unsigned long   ip;
> > +   unsigned long   parent_ip;
> > +};
> > +
> > +static struct recursed_functions 
> > recursed_functions[CONFIG_FTRACE_RECORD_RECURSION_SIZE];  
> 
> The code tries to be lockless safe as much as possible. It would make
> sense to allign the array.

Hmm, is there an arch where the compiler would put an array of structures
with two unsigned long, misaligned?

> 
> 
> > +static atomic_t nr_records;
> > +
> > +/*
> > + * Cache the last found function. Yes, updates to this is racey, but
> > + * so is memory cache ;-)
> > + */
> > +static unsigned long cached_function;
> > +
> > +void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip)
> > +{
> > +   int index;
> > +   int i = 0;
> > +   unsigned long old;
> > +
> > + again:
> > +   /* First check the last one recorded */
> > +   if (ip == cached_function)
> > +   return;
> > +
> > +   index = atomic_read(&nr_records);
> > +   /* nr_records is -1 when clearing records */
> > +   smp_mb__after_atomic();
> > +   if (index < 0)
> > +   return;
> > +
> > +   /* See below */
> > +   if (i > index)
> > +   index = i;  
> 
> This looks like a complicated way to do index++ via "i" variable.
> I guess that it was needed only in some older variant of the code.
> See below.

Because we reread the index above, and index could be bigger than i (more
than index + 1).

> 
> > +   if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE)
> > +   return;
> > +
> > +   for (i = index - 1; i >= 0; i--) {
> > +   if (recursed_functions[i].ip == ip) {
> > +   cached_function = ip;
> > +   return;
> > +   }
> > +   }
> > +
> > +   cached_function = ip;
> > +
> > +   /*
> > +* We only want to add a function if it hasn't been added before.
> > +* Add to the current location before incrementing the count.
> > +* If it fails to add, then increment the index (save in i)
> > +* and try again.
> > +*/
> > +   old = cmpxchg(&recursed_functions[index].ip, 0, ip);
> > +   if (old != 0) {
> > +   /* Did something else already added this for us? */
> > +   if (old == ip)
> > +   return;
> > +   /* Try the next location (use i for the next index) */
> > +   i = index + 1;  
> 
> What about
> 
>   index++;
> 
> We basically want to run the code again with index + 1 limit.

But something else could update nr_records, and we want to use that if
nr_records is greater than i.

Now, we could swap the use case, and have

int index = 0;

[..]
i = atomic_read(&nr_records);
if (i > index)
index = i;

[..]

index++;
goto again;


> 
> Maybe, it even does not make sense to check the array again
> and we should just try to store the value into the next slot.

We do this dance to prevent duplicates.

But you are correct, that this went through a few iterations. And the first
ones didn't have the cmpxchg on the ip itself, and that could make it so
that we don't need this index = i dance.

> 
> > +   goto again;
> > +   }
> > +
> > +   recursed_functions[index].parent_ip = parent_ip;  
> 
> WRITE_ONCE() ?

Does it really matter?

> 
> > +
> > +   /*
> > +* It's still possible that we could race with the clearing
> > +*CPU0CPU1
> > +*
> > +*   ip = func
> > +*  nr_records = -1;
> > +*  recursed_functions[0] = 0;
> > +*   i = -1
> > +*   if (i < 0)
> > +*  nr_records = 0;
> > +*  (new recursion detected)
> > +*  recursed_functions[0] = func
> > +*
> > cmpxchg(recursed_functions[0],
> > +*func, 0)
> > +*
> > +* But the worse that could happen is that we get a zero in
> > +* the recursed_functions array, and it's likely that "func" will
> > +* be recorded again.
> > +*/
> > +   i = atomic_read(&nr_records);
> > +   smp_mb__after_atomic();
> > +   if (i < 0)
> > +   cmpxchg(&recursed_functions[index].ip, ip, 0);
>

Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion

2020-11-02 Thread Petr Mladek

On Fri 2020-10-30 17:31:53, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" 
> 
> This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file
> "recursed_functions" all the functions that caused recursion while a
> callback to the function tracer was running.
> 

> --- /dev/null
> +++ b/kernel/trace/trace_recursion_record.c
> @@ -0,0 +1,220 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "trace_output.h"
> +
> +struct recursed_functions {
> + unsigned long   ip;
> + unsigned long   parent_ip;
> +};
> +
> +static struct recursed_functions 
> recursed_functions[CONFIG_FTRACE_RECORD_RECURSION_SIZE];

The code tries to be lockless safe as much as possible. It would make
sense to allign the array.


> +static atomic_t nr_records;
> +
> +/*
> + * Cache the last found function. Yes, updates to this is racey, but
> + * so is memory cache ;-)
> + */
> +static unsigned long cached_function;
> +
> +void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip)
> +{
> + int index;
> + int i = 0;
> + unsigned long old;
> +
> + again:
> + /* First check the last one recorded */
> + if (ip == cached_function)
> + return;
> +
> + index = atomic_read(&nr_records);
> + /* nr_records is -1 when clearing records */
> + smp_mb__after_atomic();
> + if (index < 0)
> + return;
> +
> + /* See below */
> + if (i > index)
> + index = i;

This looks like a complicated way to do index++ via "i" variable.
I guess that it was needed only in some older variant of the code.
See below.

> + if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE)
> + return;
> +
> + for (i = index - 1; i >= 0; i--) {
> + if (recursed_functions[i].ip == ip) {
> + cached_function = ip;
> + return;
> + }
> + }
> +
> + cached_function = ip;
> +
> + /*
> +  * We only want to add a function if it hasn't been added before.
> +  * Add to the current location before incrementing the count.
> +  * If it fails to add, then increment the index (save in i)
> +  * and try again.
> +  */
> + old = cmpxchg(&recursed_functions[index].ip, 0, ip);
> + if (old != 0) {
> + /* Did something else already added this for us? */
> + if (old == ip)
> + return;
> + /* Try the next location (use i for the next index) */
> + i = index + 1;

What about

index++;

We basically want to run the code again with index + 1 limit.

Maybe, it even does not make sense to check the array again
and we should just try to store the value into the next slot.

> + goto again;
> + }
> +
> + recursed_functions[index].parent_ip = parent_ip;

WRITE_ONCE() ?

> +
> + /*
> +  * It's still possible that we could race with the clearing
> +  *CPU0CPU1
> +  *
> +  *   ip = func
> +  *  nr_records = -1;
> +  *  recursed_functions[0] = 0;
> +  *   i = -1
> +  *   if (i < 0)
> +  *  nr_records = 0;
> +  *  (new recursion detected)
> +  *  recursed_functions[0] = func
> +  *
> cmpxchg(recursed_functions[0],
> +  *func, 0)
> +  *
> +  * But the worse that could happen is that we get a zero in
> +  * the recursed_functions array, and it's likely that "func" will
> +  * be recorded again.
> +  */
> + i = atomic_read(&nr_records);
> + smp_mb__after_atomic();
> + if (i < 0)
> + cmpxchg(&recursed_functions[index].ip, ip, 0);
> + else if (i <= index)
> + atomic_cmpxchg(&nr_records, i, index + 1);

This looks weird. It would shift nr_records past the record added
in this call. It might skip many slots that were zeroed when clearing.
Also we do not know if our entry was not zeroed as well.

I would suggest to do it some other way (not even compile tested):

void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip)
{
int index, old_index;
int i = 0;
unsigned long old_ip;

 again:
/* First check the last one recorded. */
if (ip == READ_ONCE(cached_function))
return;

index = atomic_read(&nr_records);
/* nr_records is -1 when clearing records. */
smp_mb__after_atomic();
if (index < 0)
return;

/* Already cached? */
for (i = index - 1; i >= 0; i--) {
if (recursed_functions[i].ip == ip) {
WRITE_ONCE(cached_function, ip);

[PATCH] ASoC: fsl_xcvr: fix break condition

2020-11-02 Thread Viorel Suman (OSS)

From: Viorel Suman 

The break condition copied by mistake as same
as loop condition in the previous version, but must
be the opposite. So fix it.

Signed-off-by: Viorel Suman 
---
 sound/soc/fsl/fsl_xcvr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/fsl/fsl_xcvr.c b/sound/soc/fsl/fsl_xcvr.c
index c055179e6d11..2a28810d0e29 100644
--- a/sound/soc/fsl/fsl_xcvr.c
+++ b/sound/soc/fsl/fsl_xcvr.c
@@ -247,7 +247,7 @@ static int fsl_xcvr_ai_write(struct fsl_xcvr *xcvr, u8 reg, 
u32 data, bool phy)
regmap_write(xcvr->regmap, FSL_XCVR_PHY_AI_CTRL_TOG, idx);
 
ret = regmap_read_poll_timeout(xcvr->regmap, FSL_XCVR_PHY_AI_CTRL, val,
-  (val & idx) != ((val & tidx) >> 1),
+  (val & idx) == ((val & tidx) >> 1),
   10, 1);
if (ret)
dev_err(dev, "AI timeout: failed to set %s reg 0x%02x=0x%08x\n",
-- 
2.26.2

Re: [PATCH v3 4/4] arch, mm: make kernel_page_present() always available

2020-11-02 Thread Mike Rapoport

On Mon, Nov 02, 2020 at 10:28:14AM +0100, David Hildenbrand wrote:
> On 01.11.20 18:08, Mike Rapoport wrote:
> > From: Mike Rapoport 
> > 
> > For architectures that enable ARCH_HAS_SET_MEMORY having the ability to
> > verify that a page is mapped in the kernel direct map can be useful
> > regardless of hibernation.
> > 
> > Add RISC-V implementation of kernel_page_present(), update its forward
> > declarations and stubs to be a part of set_memory API and remove ugly
> > ifdefery in inlcude/linux/mm.h around current declarations of
> > kernel_page_present().
> > 
> > Signed-off-by: Mike Rapoport 
> > ---
> >   arch/arm64/include/asm/cacheflush.h |  1 +
> >   arch/arm64/mm/pageattr.c|  4 +---
> >   arch/riscv/include/asm/set_memory.h |  1 +
> >   arch/riscv/mm/pageattr.c| 29 +
> >   arch/x86/include/asm/set_memory.h   |  1 +
> >   arch/x86/mm/pat/set_memory.c|  4 +---
> >   include/linux/mm.h  |  7 ---
> >   include/linux/set_memory.h  |  5 +
> >   8 files changed, 39 insertions(+), 13 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/cacheflush.h 
> > b/arch/arm64/include/asm/cacheflush.h
> > index 9384fd8fc13c..45217f21f1fe 100644
> > --- a/arch/arm64/include/asm/cacheflush.h
> > +++ b/arch/arm64/include/asm/cacheflush.h
> > @@ -140,6 +140,7 @@ int set_memory_valid(unsigned long addr, int numpages, 
> > int enable);
> >   int set_direct_map_invalid_noflush(struct page *page);
> >   int set_direct_map_default_noflush(struct page *page);
> > +bool kernel_page_present(struct page *page);
> >   #include 
> > diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
> > index 439325532be1..92eccaf595c8 100644
> > --- a/arch/arm64/mm/pageattr.c
> > +++ b/arch/arm64/mm/pageattr.c
> > @@ -186,8 +186,8 @@ void __kernel_map_pages(struct page *page, int 
> > numpages, int enable)
> > set_memory_valid((unsigned long)page_address(page), numpages, enable);
> >   }
> > +#endif /* CONFIG_DEBUG_PAGEALLOC */
> > -#ifdef CONFIG_HIBERNATION
> >   /*
> >* This function is used to determine if a linear map page has been 
> > marked as
> >* not-valid. Walk the page table and check the PTE_VALID bit. This is 
> > based
> > @@ -234,5 +234,3 @@ bool kernel_page_present(struct page *page)
> > ptep = pte_offset_kernel(pmdp, addr);
> > return pte_valid(READ_ONCE(*ptep));
> >   }
> > -#endif /* CONFIG_HIBERNATION */
> > -#endif /* CONFIG_DEBUG_PAGEALLOC */
> > diff --git a/arch/riscv/include/asm/set_memory.h 
> > b/arch/riscv/include/asm/set_memory.h
> > index 4c5bae7ca01c..d690b08dff2a 100644
> > --- a/arch/riscv/include/asm/set_memory.h
> > +++ b/arch/riscv/include/asm/set_memory.h
> > @@ -24,6 +24,7 @@ static inline int set_memory_nx(unsigned long addr, int 
> > numpages) { return 0; }
> >   int set_direct_map_invalid_noflush(struct page *page);
> >   int set_direct_map_default_noflush(struct page *page);
> > +bool kernel_page_present(struct page *page);
> >   #endif /* __ASSEMBLY__ */
> > diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
> > index 321b09d2e2ea..87ba5a68bbb8 100644
> > --- a/arch/riscv/mm/pageattr.c
> > +++ b/arch/riscv/mm/pageattr.c
> > @@ -198,3 +198,32 @@ void __kernel_map_pages(struct page *page, int 
> > numpages, int enable)
> >  __pgprot(0), __pgprot(_PAGE_PRESENT));
> >   }
> >   #endif
> > +
> > +bool kernel_page_present(struct page *page)
> > +{
> > +   unsigned long addr = (unsigned long)page_address(page);
> > +   pgd_t *pgd;
> > +   pud_t *pud;
> > +   p4d_t *p4d;
> > +   pmd_t *pmd;
> > +   pte_t *pte;
> > +
> > +   pgd = pgd_offset_k(addr);
> > +   if (!pgd_present(*pgd))
> > +   return false;
> > +
> > +   p4d = p4d_offset(pgd, addr);
> > +   if (!p4d_present(*p4d))
> > +   return false;
> > +
> > +   pud = pud_offset(p4d, addr);
> > +   if (!pud_present(*pud))
> > +   return false;
> > +
> > +   pmd = pmd_offset(pud, addr);
> > +   if (!pmd_present(*pmd))
> > +   return false;
> > +
> > +   pte = pte_offset_kernel(pmd, addr);
> > +   return pte_present(*pte);
> > +}
> > diff --git a/arch/x86/include/asm/set_memory.h 
> > b/arch/x86/include/asm/set_memory.h
> > index 5948218f35c5..4352f08bfbb5 100644
> > --- a/arch/x86/include/asm/set_memory.h
> > +++ b/arch/x86/include/asm/set_memory.h
> > @@ -82,6 +82,7 @@ int set_pages_rw(struct page *page, int numpages);
> >   int set_direct_map_invalid_noflush(struct page *page);
> >   int set_direct_map_default_noflush(struct page *page);
> > +bool kernel_page_present(struct page *page);
> >   extern int kernel_set_to_readonly;
> > diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> > index bc9be96b777f..16f878c26667 100644
> > --- a/arch/x86/mm/pat/set_memory.c
> > +++ b/arch/x86/mm/pat/set_memory.c
> > @@ -2226,8 +2226,8 @@ void __kernel_map_pages(struct page *page, int 
> > numpages, int enable)
> > arch_flush_lazy_mmu_mode();
> >   }
>

Re: [PATCH v3 3/4] arch, mm: restore dependency of __kernel_map_pages() of DEBUG_PAGEALLOC

2020-11-02 Thread Mike Rapoport

On Mon, Nov 02, 2020 at 10:23:20AM +0100, David Hildenbrand wrote:
> 
> >   int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long 
> > address,
> >unsigned numpages, unsigned long page_flags)
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 14e397f3752c..ab0ef6bd351d 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -2924,7 +2924,11 @@ static inline bool 
> > debug_pagealloc_enabled_static(void)
> > return static_branch_unlikely(&_debug_pagealloc_enabled);
> >   }
> > -#if defined(CONFIG_DEBUG_PAGEALLOC) || 
> > defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
> > +#ifdef CONFIG_DEBUG_PAGEALLOC
> > +/*
> > + * To support DEBUG_PAGEALLOC architecture must ensure that
> > + * __kernel_map_pages() never fails
> 
> Maybe add here, that this implies mapping everything via PTEs during boot.

This is more of an implementation detail, while assumption that
__kernel_map_pages() does not fail is somewhat a requirement :)

> Acked-by: David Hildenbrand 

Thanks!

> -- 
> Thanks,
> 
> David / dhildenb
> 

-- 
Sincerely yours,
Mike.

Re: [PATCH v2] powerpc/pci: unmap legacy INTx interrupts when a PHB is removed

2020-11-02 Thread Cédric Le Goater

On 10/14/20 4:55 AM, Alexey Kardashevskiy wrote:
> 
> 
> On 23/09/2020 17:06, Cédric Le Goater wrote:
>> On 9/23/20 2:33 AM, Qian Cai wrote:
>>> On Fri, 2020-08-07 at 12:18 +0200, Cédric Le Goater wrote:
 When a passthrough IO adapter is removed from a pseries machine using
 hash MMU and the XIVE interrupt mode, the POWER hypervisor expects the
 guest OS to clear all page table entries related to the adapter. If
 some are still present, the RTAS call which isolates the PCI slot
 returns error 9001 "valid outstanding translations" and the removal of
 the IO adapter fails. This is because when the PHBs are scanned, Linux
 maps automatically the INTx interrupts in the Linux interrupt number
 space but these are never removed.

 To solve this problem, we introduce a PPC platform specific
 pcibios_remove_bus() routine which clears all interrupt mappings when
 the bus is removed. This also clears the associated page table entries
 of the ESB pages when using XIVE.

 For this purpose, we record the logical interrupt numbers of the
 mapped interrupt under the PHB structure and let pcibios_remove_bus()
 do the clean up.

 Since some PCI adapters, like GPUs, use the "interrupt-map" property
 to describe interrupt mappings other than the legacy INTx interrupts,
 we can not restrict the size of the mapping array to PCI_NUM_INTX. The
 number of interrupt mappings is computed from the "interrupt-map"
 property and the mapping array is allocated accordingly.

 Cc: "Oliver O'Halloran" 
 Cc: Alexey Kardashevskiy 
 Signed-off-by: Cédric Le Goater 
>>>
>>> Some syscall fuzzing will trigger this on POWER9 NV where the traces 
>>> pointed to
>>> this patch.
>>>
>>> .config: https://gitlab.com/cailca/linux-mm/-/blob/master/powerpc.config
>>
>> OK. The patch is missing a NULL assignement after kfree() and that
>> might be the issue.
>>
>> I did try PHB removal under PowerNV, so I would like to understand
>> how we managed to remove twice the PCI bus and possibly reproduce.
>> Any chance we could grab what the syscall fuzzer (syzkaller) did ?
> 
> 
> How do you remove PHBs exactly? There is no such thing in the powernv 
> platform, I thought someone added this and you are fixing it but no. PHBs on 
> powernv are created at the boot time and there is no way to remove them, you 
> can only try removing all the bridges.

yes. I noticed that later when proposing the fix for the double 
free.

> So what exactly are you doing?

What you just said above, with the commands : 

  echo 1 >  /sys/devices/pci0031\:00/0031\:00\:00.0/remove
  echo 1 >  /sys/devices/pci0031\:00/pci_bus/0031\:00/rescan

C.

Re: [PATCH v3 2/4] PM: hibernate: make direct map manipulations more explicit

2020-11-02 Thread Mike Rapoport

On Mon, Nov 02, 2020 at 10:19:36AM +0100, David Hildenbrand wrote:
> On 01.11.20 18:08, Mike Rapoport wrote:
> > From: Mike Rapoport 
> > 
> > When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page may be
> > not present in the direct map and has to be explicitly mapped before it
> > could be copied.
> > 
> > Introduce hibernate_map_page() that will explicitly use
> > set_direct_map_{default,invalid}_noflush() for ARCH_HAS_SET_DIRECT_MAP case
> > and debug_pagealloc_map_pages() for DEBUG_PAGEALLOC case.
> > 
> > The remapping of the pages in safe_copy_page() presumes that it only
> > changes protection bits in an existing PTE and so it is safe to ignore
> > return value of set_direct_map_{default,invalid}_noflush().
> > 
> > Still, add a WARN_ON() so that future changes in set_memory APIs will not
> > silently break hibernation.
> > 
> > Signed-off-by: Mike Rapoport 
> > Acked-by: Rafael J. Wysocki 
> > ---
> >   include/linux/mm.h  | 12 
> >   kernel/power/snapshot.c | 30 --
> >   2 files changed, 28 insertions(+), 14 deletions(-)
> > 
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 1fc0609056dc..14e397f3752c 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -2927,16 +2927,6 @@ static inline bool 
> > debug_pagealloc_enabled_static(void)
> >   #if defined(CONFIG_DEBUG_PAGEALLOC) || 
> > defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
> >   extern void __kernel_map_pages(struct page *page, int numpages, int 
> > enable);
> > -/*
> > - * When called in DEBUG_PAGEALLOC context, the call should most likely be
> > - * guarded by debug_pagealloc_enabled() or debug_pagealloc_enabled_static()
> > - */
> > -static inline void
> > -kernel_map_pages(struct page *page, int numpages, int enable)
> > -{
> > -   __kernel_map_pages(page, numpages, enable);
> > -}
> > -
> >   static inline void debug_pagealloc_map_pages(struct page *page,
> >  int numpages, int enable)
> >   {
> > @@ -2948,8 +2938,6 @@ static inline void debug_pagealloc_map_pages(struct 
> > page *page,
> >   extern bool kernel_page_present(struct page *page);
> >   #endif/* CONFIG_HIBERNATION */
> >   #else /* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
> > -static inline void
> > -kernel_map_pages(struct page *page, int numpages, int enable) {}
> >   static inline void debug_pagealloc_map_pages(struct page *page,
> >  int numpages, int enable) {}
> >   #ifdef CONFIG_HIBERNATION
> > diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> > index 46b1804c1ddf..054c8cce4236 100644
> > --- a/kernel/power/snapshot.c
> > +++ b/kernel/power/snapshot.c
> > @@ -76,6 +76,32 @@ static inline void hibernate_restore_protect_page(void 
> > *page_address) {}
> >   static inline void hibernate_restore_unprotect_page(void *page_address) {}
> >   #endif /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
> > +static inline void hibernate_map_page(struct page *page, int enable)
> > +{
> > +   if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
> > +   unsigned long addr = (unsigned long)page_address(page);
> > +   int ret;
> > +
> > +   /*
> > +* This should not fail because remapping a page here means
> > +* that we only update protection bits in an existing PTE.
> > +* It is still worth to have WARN_ON() here if something
> > +* changes and this will no longer be the case.
> > +*/
> > +   if (enable)
> > +   ret = set_direct_map_default_noflush(page);
> > +   else
> > +   ret = set_direct_map_invalid_noflush(page);
> > +
> > +   if (WARN_ON(ret))
> > +   return;
> 
> People seem to prefer pr_warn() now that production kernels have panic on
> warn enabled. It's weird.

Weird indeed as the whole point of WARN to yell without causing a
crash...
I can change to pr_warn though...

> > +
> > +   flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> > +   } else {
> > +   debug_pagealloc_map_pages(page, 1, enable);
> 
> Reviewed-by: David Hildenbrand 

Thanks!

> -- 
> Thanks,
> 
> David / dhildenb
> 

-- 
Sincerely yours,
Mike.

Re: [PATCH v2 2/2] misc: ocxl: config: Rename function attribute description

2020-11-02 Thread Frederic Barrat





Le 02/11/2020 à 15:20, Lee Jones a écrit :

Fixes the following W=1 kernel build warning(s):

  drivers/misc/ocxl/config.c:81: warning: Function parameter or member 'dev' 
not described in 'get_function_0'
  drivers/misc/ocxl/config.c:81: warning: Excess function parameter 'device' 
description in 'get_function_0'

Cc: Frederic Barrat 
Cc: Andrew Donnellan 
Cc: Arnd Bergmann 
Cc: Greg Kroah-Hartman 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Lee Jones 
---



Thanks!
Acked-by: Frederic Barrat 



  drivers/misc/ocxl/config.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
index 4d490b92d951f..a68738f382521 100644
--- a/drivers/misc/ocxl/config.c
+++ b/drivers/misc/ocxl/config.c
@@ -73,7 +73,7 @@ static int find_dvsec_afu_ctrl(struct pci_dev *dev, u8 
afu_idx)
  
  /**

   * get_function_0() - Find a related PCI device (function 0)
- * @device: PCI device to match
+ * @dev: PCI device to match
   *
   * Returns a pointer to the related device, or null if not found
   */

[PATCH v2 2/2] misc: ocxl: config: Rename function attribute description

2020-11-02 Thread Lee Jones

Fixes the following W=1 kernel build warning(s):

 drivers/misc/ocxl/config.c:81: warning: Function parameter or member 'dev' not 
described in 'get_function_0'
 drivers/misc/ocxl/config.c:81: warning: Excess function parameter 'device' 
description in 'get_function_0'

Cc: Frederic Barrat 
Cc: Andrew Donnellan 
Cc: Arnd Bergmann 
Cc: Greg Kroah-Hartman 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Lee Jones 
---
 drivers/misc/ocxl/config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
index 4d490b92d951f..a68738f382521 100644
--- a/drivers/misc/ocxl/config.c
+++ b/drivers/misc/ocxl/config.c
@@ -73,7 +73,7 @@ static int find_dvsec_afu_ctrl(struct pci_dev *dev, u8 
afu_idx)
 
 /**
  * get_function_0() - Find a related PCI device (function 0)
- * @device: PCI device to match
+ * @dev: PCI device to match
  *
  * Returns a pointer to the related device, or null if not found
  */
-- 
2.25.1

Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"

2020-11-02 Thread 'Greg KH'

On Mon, Nov 02, 2020 at 09:06:38AM +, David Laight wrote:
> From: 'Greg KH'
> > Sent: 23 October 2020 15:47
> > 
> > On Fri, Oct 23, 2020 at 02:39:24PM +, David Laight wrote:
> > > From: David Hildenbrand
> > > > Sent: 23 October 2020 15:33
> > > ...
> > > > I just checked against upstream code generated by clang 10 and it
> > > > properly discards the upper 32bit via a mov w23 w2.
> > > >
> > > > So at least clang 10 indeed properly assumes we could have garbage and
> > > > masks it off.
> > > >
> > > > Maybe the issue is somewhere else, unrelated to nr_pages ... or clang 11
> > > > behaves differently.
> > >
> > > We'll need the disassembly from a failing kernel image.
> > > It isn't that big to hand annotate.
> > 
> > I've worked around the merge at the moment in the android tree, but it
> > is still quite reproducable, and will try to get a .o file to
> > disassemble on Monday or so...
> 
> Did this get properly resolved?

For some reason, 5.10-rc2 fixed all of this up.  I backed out all of the
patches I had to revert to get 5.10-rc1 to work properly, and then did
the merge and all is well.

It must have been something to do with the compat changes in this same
area that went in after 5.10-rc1, and something got reorganized in the
files somehow.  I really do not know, and at the moment, don't have the
time to track it down anymore.  So for now, I'd say it's all good, sorry
for the noise.

greg k-h

Re: [PATCH 23/23] mtd: devices: powernv_flash: Add function names to headers and fix 'dev'

2020-11-02 Thread Miquel Raynal

Hi Lee,

Lee Jones  wrote on Mon,  2 Nov 2020 11:54:06
+:

> Fixes the following W=1 kernel build warning(s):
> 
>  drivers/mtd/devices/powernv_flash.c:129: warning: Cannot understand  * @mtd: 
> the device
>  drivers/mtd/devices/powernv_flash.c:145: warning: Cannot understand  * @mtd: 
> the device
>  drivers/mtd/devices/powernv_flash.c:161: warning: Cannot understand  * @mtd: 
> the device
>  drivers/mtd/devices/powernv_flash.c:184: warning: Function parameter or 
> member 'dev' not described in 'powernv_flash_set_driver_info'
> 
> Cc: Miquel Raynal 
> Cc: Richard Weinberger 
> Cc: Vignesh Raghavendra 
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Cyril Bur 
> Cc: linux-...@lists.infradead.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/mtd/devices/powernv_flash.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mtd/devices/powernv_flash.c 
> b/drivers/mtd/devices/powernv_flash.c
> index 0b757d9ba2f6b..32cb0e649096f 100644
> --- a/drivers/mtd/devices/powernv_flash.c
> +++ b/drivers/mtd/devices/powernv_flash.c
> @@ -126,6 +126,8 @@ static int powernv_flash_async_op(struct mtd_info *mtd, 
> enum flash_op op,
>  }
>  
>  /**
> + * powernv_flash_read
> + *

Perhaps we should not add blank lines if the rest of the file does not
already have such spacing (see below).

>   * @mtd: the device
>   * @from: the offset to read from
>   * @len: the number of bytes to read
> @@ -142,6 +144,7 @@ static int powernv_flash_read(struct mtd_info *mtd, 
> loff_t from, size_t len,
>  }
>  
>  /**
> + * powernv_flash_write
>   * @mtd: the device
>   * @to: the offset to write to
>   * @len: the number of bytes to write
> @@ -158,6 +161,7 @@ static int powernv_flash_write(struct mtd_info *mtd, 
> loff_t to, size_t len,
>  }
>  
>  /**
> + * powernv_flash_erase
>   * @mtd: the device
>   * @erase: the erase info
>   * Returns 0 if erase successful or -ERRNO if an error occurred
> @@ -176,7 +180,7 @@ static int powernv_flash_erase(struct mtd_info *mtd, 
> struct erase_info *erase)
>  
>  /**
>   * powernv_flash_set_driver_info - Fill the mtd_info structure and docg3
> - * structure @pdev: The platform device
> + * @dev: The device structure
>   * @mtd: The structure to fill
>   */
>  static int powernv_flash_set_driver_info(struct device *dev,


Thanks,
Miquèl

[PATCH 23/23] mtd: devices: powernv_flash: Add function names to headers and fix 'dev'

2020-11-02 Thread Lee Jones

Fixes the following W=1 kernel build warning(s):

 drivers/mtd/devices/powernv_flash.c:129: warning: Cannot understand  * @mtd: 
the device
 drivers/mtd/devices/powernv_flash.c:145: warning: Cannot understand  * @mtd: 
the device
 drivers/mtd/devices/powernv_flash.c:161: warning: Cannot understand  * @mtd: 
the device
 drivers/mtd/devices/powernv_flash.c:184: warning: Function parameter or member 
'dev' not described in 'powernv_flash_set_driver_info'

Cc: Miquel Raynal 
Cc: Richard Weinberger 
Cc: Vignesh Raghavendra 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Cyril Bur 
Cc: linux-...@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Lee Jones 
---
 drivers/mtd/devices/powernv_flash.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/mtd/devices/powernv_flash.c 
b/drivers/mtd/devices/powernv_flash.c
index 0b757d9ba2f6b..32cb0e649096f 100644
--- a/drivers/mtd/devices/powernv_flash.c
+++ b/drivers/mtd/devices/powernv_flash.c
@@ -126,6 +126,8 @@ static int powernv_flash_async_op(struct mtd_info *mtd, 
enum flash_op op,
 }
 
 /**
+ * powernv_flash_read
+ *
  * @mtd: the device
  * @from: the offset to read from
  * @len: the number of bytes to read
@@ -142,6 +144,7 @@ static int powernv_flash_read(struct mtd_info *mtd, loff_t 
from, size_t len,
 }
 
 /**
+ * powernv_flash_write
  * @mtd: the device
  * @to: the offset to write to
  * @len: the number of bytes to write
@@ -158,6 +161,7 @@ static int powernv_flash_write(struct mtd_info *mtd, loff_t 
to, size_t len,
 }
 
 /**
+ * powernv_flash_erase
  * @mtd: the device
  * @erase: the erase info
  * Returns 0 if erase successful or -ERRNO if an error occurred
@@ -176,7 +180,7 @@ static int powernv_flash_erase(struct mtd_info *mtd, struct 
erase_info *erase)
 
 /**
  * powernv_flash_set_driver_info - Fill the mtd_info structure and docg3
- * structure @pdev: The platform device
+ * @dev: The device structure
  * @mtd: The structure to fill
  */
 static int powernv_flash_set_driver_info(struct device *dev,
-- 
2.25.1

[PATCH 2/2] misc: ocxl: config: Rename function attribute description

2020-11-02 Thread Lee Jones

Fixes the following W=1 kernel build warning(s):

 drivers/misc/ocxl/config.c:81: warning: Function parameter or member 'dev' not 
described in 'get_function_0'
 drivers/misc/ocxl/config.c:81: warning: Excess function parameter 'device' 
description in 'get_function_0'

Cc: Frederic Barrat 
Cc: Andrew Donnellan 
Cc: Arnd Bergmann 
Cc: Greg Kroah-Hartman 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Lee Jones 
---
 drivers/misc/ocxl/config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
index 4d490b92d951f..a68738f382521 100644
--- a/drivers/misc/ocxl/config.c
+++ b/drivers/misc/ocxl/config.c
@@ -73,7 +73,7 @@ static int find_dvsec_afu_ctrl(struct pci_dev *dev, u8 
afu_idx)
 
 /**
  * get_function_0() - Find a related PCI device (function 0)
- * @device: PCI device to match
+ * @dev: PCI device to match
  *
  * Returns a pointer to the related device, or null if not found
  */
-- 
2.25.1

Re: [PATCH v3 4/4] arch, mm: make kernel_page_present() always available

2020-11-02 Thread David Hildenbrand


On 01.11.20 18:08, Mike Rapoport wrote:

From: Mike Rapoport 

For architectures that enable ARCH_HAS_SET_MEMORY having the ability to
verify that a page is mapped in the kernel direct map can be useful
regardless of hibernation.

Add RISC-V implementation of kernel_page_present(), update its forward
declarations and stubs to be a part of set_memory API and remove ugly
ifdefery in inlcude/linux/mm.h around current declarations of
kernel_page_present().

Signed-off-by: Mike Rapoport 
---
  arch/arm64/include/asm/cacheflush.h |  1 +
  arch/arm64/mm/pageattr.c|  4 +---
  arch/riscv/include/asm/set_memory.h |  1 +
  arch/riscv/mm/pageattr.c| 29 +
  arch/x86/include/asm/set_memory.h   |  1 +
  arch/x86/mm/pat/set_memory.c|  4 +---
  include/linux/mm.h  |  7 ---
  include/linux/set_memory.h  |  5 +
  8 files changed, 39 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/include/asm/cacheflush.h 
b/arch/arm64/include/asm/cacheflush.h
index 9384fd8fc13c..45217f21f1fe 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -140,6 +140,7 @@ int set_memory_valid(unsigned long addr, int numpages, int 
enable);
  
  int set_direct_map_invalid_noflush(struct page *page);

  int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
  
  #include 
  
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c

index 439325532be1..92eccaf595c8 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -186,8 +186,8 @@ void __kernel_map_pages(struct page *page, int numpages, 
int enable)
  
  	set_memory_valid((unsigned long)page_address(page), numpages, enable);

  }
+#endif /* CONFIG_DEBUG_PAGEALLOC */
  
-#ifdef CONFIG_HIBERNATION

  /*
   * This function is used to determine if a linear map page has been marked as
   * not-valid. Walk the page table and check the PTE_VALID bit. This is based
@@ -234,5 +234,3 @@ bool kernel_page_present(struct page *page)
ptep = pte_offset_kernel(pmdp, addr);
return pte_valid(READ_ONCE(*ptep));
  }
-#endif /* CONFIG_HIBERNATION */
-#endif /* CONFIG_DEBUG_PAGEALLOC */
diff --git a/arch/riscv/include/asm/set_memory.h 
b/arch/riscv/include/asm/set_memory.h
index 4c5bae7ca01c..d690b08dff2a 100644
--- a/arch/riscv/include/asm/set_memory.h
+++ b/arch/riscv/include/asm/set_memory.h
@@ -24,6 +24,7 @@ static inline int set_memory_nx(unsigned long addr, int 
numpages) { return 0; }
  
  int set_direct_map_invalid_noflush(struct page *page);

  int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
  
  #endif /* __ASSEMBLY__ */
  
diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c

index 321b09d2e2ea..87ba5a68bbb8 100644
--- a/arch/riscv/mm/pageattr.c
+++ b/arch/riscv/mm/pageattr.c
@@ -198,3 +198,32 @@ void __kernel_map_pages(struct page *page, int numpages, 
int enable)
 __pgprot(0), __pgprot(_PAGE_PRESENT));
  }
  #endif
+
+bool kernel_page_present(struct page *page)
+{
+   unsigned long addr = (unsigned long)page_address(page);
+   pgd_t *pgd;
+   pud_t *pud;
+   p4d_t *p4d;
+   pmd_t *pmd;
+   pte_t *pte;
+
+   pgd = pgd_offset_k(addr);
+   if (!pgd_present(*pgd))
+   return false;
+
+   p4d = p4d_offset(pgd, addr);
+   if (!p4d_present(*p4d))
+   return false;
+
+   pud = pud_offset(p4d, addr);
+   if (!pud_present(*pud))
+   return false;
+
+   pmd = pmd_offset(pud, addr);
+   if (!pmd_present(*pmd))
+   return false;
+
+   pte = pte_offset_kernel(pmd, addr);
+   return pte_present(*pte);
+}
diff --git a/arch/x86/include/asm/set_memory.h 
b/arch/x86/include/asm/set_memory.h
index 5948218f35c5..4352f08bfbb5 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -82,6 +82,7 @@ int set_pages_rw(struct page *page, int numpages);
  
  int set_direct_map_invalid_noflush(struct page *page);

  int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
  
  extern int kernel_set_to_readonly;
  
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c

index bc9be96b777f..16f878c26667 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2226,8 +2226,8 @@ void __kernel_map_pages(struct page *page, int numpages, 
int enable)
  
  	arch_flush_lazy_mmu_mode();

  }
+#endif /* CONFIG_DEBUG_PAGEALLOC */
  
-#ifdef CONFIG_HIBERNATION

  bool kernel_page_present(struct page *page)
  {
unsigned int level;
@@ -2239,8 +2239,6 @@ bool kernel_page_present(struct page *page)
pte = lookup_address((unsigned long)page_address(page), &level);
return (pte_val(*pte) & _PAGE_PRESENT);
  }
-#endif /* CONFIG_HIBERNATION */
-#endif /* CONFIG_DEBUG_PAGEALLOC */
  
  int __init

Re: [PATCH v3 3/4] arch, mm: restore dependency of __kernel_map_pages() of DEBUG_PAGEALLOC

2020-11-02 Thread David Hildenbrand





  int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
   unsigned numpages, unsigned long page_flags)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 14e397f3752c..ab0ef6bd351d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2924,7 +2924,11 @@ static inline bool debug_pagealloc_enabled_static(void)
return static_branch_unlikely(&_debug_pagealloc_enabled);
  }
  
-#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)

+#ifdef CONFIG_DEBUG_PAGEALLOC
+/*
+ * To support DEBUG_PAGEALLOC architecture must ensure that
+ * __kernel_map_pages() never fails


Maybe add here, that this implies mapping everything via PTEs during boot.

Acked-by: David Hildenbrand 

--
Thanks,

David / dhildenb

Re: [PATCH v3 2/4] PM: hibernate: make direct map manipulations more explicit

2020-11-02 Thread David Hildenbrand


On 01.11.20 18:08, Mike Rapoport wrote:

From: Mike Rapoport 

When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page may be
not present in the direct map and has to be explicitly mapped before it
could be copied.

Introduce hibernate_map_page() that will explicitly use
set_direct_map_{default,invalid}_noflush() for ARCH_HAS_SET_DIRECT_MAP case
and debug_pagealloc_map_pages() for DEBUG_PAGEALLOC case.

The remapping of the pages in safe_copy_page() presumes that it only
changes protection bits in an existing PTE and so it is safe to ignore
return value of set_direct_map_{default,invalid}_noflush().

Still, add a WARN_ON() so that future changes in set_memory APIs will not
silently break hibernation.

Signed-off-by: Mike Rapoport 
Acked-by: Rafael J. Wysocki 
---
  include/linux/mm.h  | 12 
  kernel/power/snapshot.c | 30 --
  2 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1fc0609056dc..14e397f3752c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2927,16 +2927,6 @@ static inline bool debug_pagealloc_enabled_static(void)
  #if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
  extern void __kernel_map_pages(struct page *page, int numpages, int enable);
  
-/*

- * When called in DEBUG_PAGEALLOC context, the call should most likely be
- * guarded by debug_pagealloc_enabled() or debug_pagealloc_enabled_static()
- */
-static inline void
-kernel_map_pages(struct page *page, int numpages, int enable)
-{
-   __kernel_map_pages(page, numpages, enable);
-}
-
  static inline void debug_pagealloc_map_pages(struct page *page,
 int numpages, int enable)
  {
@@ -2948,8 +2938,6 @@ static inline void debug_pagealloc_map_pages(struct page 
*page,
  extern bool kernel_page_present(struct page *page);
  #endif/* CONFIG_HIBERNATION */
  #else /* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
-static inline void
-kernel_map_pages(struct page *page, int numpages, int enable) {}
  static inline void debug_pagealloc_map_pages(struct page *page,
 int numpages, int enable) {}
  #ifdef CONFIG_HIBERNATION
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index 46b1804c1ddf..054c8cce4236 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -76,6 +76,32 @@ static inline void hibernate_restore_protect_page(void 
*page_address) {}
  static inline void hibernate_restore_unprotect_page(void *page_address) {}
  #endif /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
  
+static inline void hibernate_map_page(struct page *page, int enable)

+{
+   if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
+   unsigned long addr = (unsigned long)page_address(page);
+   int ret;
+
+   /*
+* This should not fail because remapping a page here means
+* that we only update protection bits in an existing PTE.
+* It is still worth to have WARN_ON() here if something
+* changes and this will no longer be the case.
+*/
+   if (enable)
+   ret = set_direct_map_default_noflush(page);
+   else
+   ret = set_direct_map_invalid_noflush(page);
+
+   if (WARN_ON(ret))
+   return;


People seem to prefer pr_warn() now that production kernels have panic 
on warn enabled. It's weird.



+
+   flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+   } else {
+   debug_pagealloc_map_pages(page, 1, enable);


Reviewed-by: David Hildenbrand 

--
Thanks,

David / dhildenb

RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"

2020-11-02 Thread David Laight

From: 'Greg KH'
> Sent: 23 October 2020 15:47
> 
> On Fri, Oct 23, 2020 at 02:39:24PM +, David Laight wrote:
> > From: David Hildenbrand
> > > Sent: 23 October 2020 15:33
> > ...
> > > I just checked against upstream code generated by clang 10 and it
> > > properly discards the upper 32bit via a mov w23 w2.
> > >
> > > So at least clang 10 indeed properly assumes we could have garbage and
> > > masks it off.
> > >
> > > Maybe the issue is somewhere else, unrelated to nr_pages ... or clang 11
> > > behaves differently.
> >
> > We'll need the disassembly from a failing kernel image.
> > It isn't that big to hand annotate.
> 
> I've worked around the merge at the moment in the android tree, but it
> is still quite reproducable, and will try to get a .o file to
> disassemble on Monday or so...

Did this get properly resolved?

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)

65 matches

Mail list logo