Re: [PATCH] powerpc/32s: Setup the early hash table at all time.
Hi Andreas,

On 30/10/2020 at 14:11, Andreas Schwab wrote:
> #
> # Automatically generated file; DO NOT EDIT.
> # Linux/powerpc 5.10.0-rc1 Kernel Configuration
> #

I tried again on QEMU with both pmac32_defconfig and your config, and it boots.

I really can't understand what the problem is, because that patch only activates at all times something that has already been working well when CONFIG_KASAN is set.

Could you check whether, with that patch reverted, you are able to boot a kernel built with CONFIG_KASAN?

Thanks
Christophe
[Bug 209869] Kernel 5.10-rc1 fails to boot on a PowerMac G4 3,6 at an early stage
https://bugzilla.kernel.org/show_bug.cgi?id=209869

Christophe Leroy (christophe.le...@csgroup.eu) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |christophe.le...@csgroup.eu

--- Comment #1 from Christophe Leroy (christophe.le...@csgroup.eu) ---
Could you try reverting commit
https://github.com/linuxppc/linux/commit/69a1593abdbcf03a76367320d929a8ae7a5e3d71 ?

I got another report from someone who has the same problem and bisected it to
that commit.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
[PATCH 2/2] powerpc/eeh: Add a debugfs interface to check if a driver supports recovery
If a PCI device's current driver implements the error handling callbacks,
EEH can use them to recover the device after an error occurs. For devices
without the error handling callbacks we recover them by removing the
device and re-scanning it so the PCI core puts the device back into a
known good state. Currently there's no way for userspace to determine
whether the driver supports recovery, which makes it difficult to write
automated tests for EEH. This patch addresses that by adding a debugfs
interface for querying whether a specific device can be recovered.

Signed-off-by: Oliver O'Halloran
---
 arch/powerpc/kernel/eeh.c | 50 +++
 1 file changed, 50 insertions(+)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index f9182ff57804..cd60bc1c8701 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1868,6 +1868,53 @@ static const struct file_operations eeh_dev_break_fops = {
 	.read   = eeh_debugfs_dev_usage,
 };
 
+static ssize_t eeh_dev_can_recover(struct file *filp,
+				   const char __user *user_buf,
+				   size_t count, loff_t *ppos)
+{
+	struct pci_driver *drv;
+	struct pci_dev *pdev;
+	size_t ret;
+
+	pdev = eeh_debug_lookup_pdev(filp, user_buf, count, ppos);
+	if (IS_ERR(pdev))
+		return PTR_ERR(pdev);
+
+	/*
+	 * In order for error recovery to work the driver needs to implement
+	 * .error_detected(), so it can quiesce IO to the device, and
+	 * .slot_reset() so it can re-initialise the device after a reset.
+	 *
+	 * Ideally they'd implement .resume() too, but some drivers which
+	 * we need to support (notably IPR) don't so I guess we can tolerate
+	 * that.
+	 *
+	 * .mmio_enabled() is mostly there as a work-around for devices which
+	 * take forever to re-init after a hot reset. Implementing that is
+	 * strictly optional.
+	 */
+	drv = pci_dev_driver(pdev);
+	if (drv &&
+	    drv->err_handler &&
+	    drv->err_handler->error_detected &&
+	    drv->err_handler->slot_reset) {
+		ret = count;
+	} else {
+		ret = -EOPNOTSUPP;
+	}
+
+	pci_dev_put(pdev);
+
+	return ret;
+}
+
+static const struct file_operations eeh_dev_can_recover_fops = {
+	.open	= simple_open,
+	.llseek	= no_llseek,
+	.write	= eeh_dev_can_recover,
+	.read	= eeh_debugfs_dev_usage,
+};
+
 #endif
 
 static int __init eeh_init_proc(void)
@@ -1892,6 +1939,9 @@ static int __init eeh_init_proc(void)
 		debugfs_create_file_unsafe("eeh_force_recover", 0600,
 				powerpc_debugfs_root, NULL,
 				&eeh_force_recover_fops);
+		debugfs_create_file_unsafe("eeh_dev_can_recover", 0600,
+				powerpc_debugfs_root, NULL,
+				&eeh_dev_can_recover_fops);
 		eeh_cache_debugfs_init();
 #endif
 	}
-- 
2.26.2
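
For illustration, a test script might query the new file roughly as follows. This is a minimal sketch, not part of the patch: the `can_recover` function name and the `EEH_CAN_RECOVER` override variable are hypothetical, introduced only so the helper can be tried against an ordinary file. The debugfs path and the convention of writing a device name to it are taken from the patch above.

```shell
#!/bin/sh
# Hypothetical wrapper around the eeh_dev_can_recover debugfs file.
# EEH_CAN_RECOVER is an assumed override knob, not something the kernel provides.
EEH_CAN_RECOVER="${EEH_CAN_RECOVER:-/sys/kernel/debug/powerpc/eeh_dev_can_recover}"

# usage: can_recover DDDD:BB:DD.F
# Returns 0 if the write was accepted, 2 if the file is missing or not
# writable (e.g. debugfs is not mounted), and non-zero otherwise.
can_recover() {
	[ -w "$EEH_CAN_RECOVER" ] || return 2

	# The kernel rejects the write (-EOPNOTSUPP in the patch) when the
	# bound driver lacks .error_detected() or .slot_reset().
	echo "$1" 2>/dev/null > "$EEH_CAN_RECOVER"
}
```

Distinguishing "file not available" from "driver cannot recover" lets a caller skip a test in the first case instead of reporting a failure.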
[PATCH 1/2] powerpc/eeh: Rework pci_dev lookup in debugfs attributes
Pull the string -> pci_dev lookup stuff into a helper function. No functional change.

Signed-off-by: Oliver O'Halloran
---
 arch/powerpc/kernel/eeh.c | 71 ---
 1 file changed, 37 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 813713c9120c..f9182ff57804 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1596,6 +1596,35 @@ static int proc_eeh_show(struct seq_file *m, void *v)
 }
 
 #ifdef CONFIG_DEBUG_FS
+
+
+static struct pci_dev *eeh_debug_lookup_pdev(struct file *filp,
+					     const char __user *user_buf,
+					     size_t count, loff_t *ppos)
+{
+	uint32_t domain, bus, dev, fn;
+	struct pci_dev *pdev;
+	char buf[20];
+	int ret;
+
+	memset(buf, 0, sizeof(buf));
+	ret = simple_write_to_buffer(buf, sizeof(buf)-1, ppos, user_buf, count);
+	if (!ret)
+		return ERR_PTR(-EFAULT);
+
+	ret = sscanf(buf, "%x:%x:%x.%x", &domain, &bus, &dev, &fn);
+	if (ret != 4) {
+		pr_err("%s: expected 4 args, got %d\n", __func__, ret);
+		return ERR_PTR(-EINVAL);
+	}
+
+	pdev = pci_get_domain_bus_and_slot(domain, bus, (dev << 3) | fn);
+	if (!pdev)
+		return ERR_PTR(-ENODEV);
+
+	return pdev;
+}
+
 static int eeh_enable_dbgfs_set(void *data, u64 val)
 {
 	if (val)
@@ -1688,26 +1717,13 @@ static ssize_t eeh_dev_check_write(struct file *filp,
 				const char __user *user_buf,
 				size_t count, loff_t *ppos)
 {
-	uint32_t domain, bus, dev, fn;
 	struct pci_dev *pdev;
 	struct eeh_dev *edev;
-	char buf[20];
 	int ret;
 
-	memset(buf, 0, sizeof(buf));
-	ret = simple_write_to_buffer(buf, sizeof(buf)-1, ppos, user_buf, count);
-	if (!ret)
-		return -EFAULT;
-
-	ret = sscanf(buf, "%x:%x:%x.%x", &domain, &bus, &dev, &fn);
-	if (ret != 4) {
-		pr_err("%s: expected 4 args, got %d\n", __func__, ret);
-		return -EINVAL;
-	}
-
-	pdev = pci_get_domain_bus_and_slot(domain, bus, (dev << 3) | fn);
-	if (!pdev)
-		return -ENODEV;
+	pdev = eeh_debug_lookup_pdev(filp, user_buf, count, ppos);
+	if (IS_ERR(pdev))
+		return PTR_ERR(pdev);
 
 	edev = pci_dev_to_eeh_dev(pdev);
 	if (!edev) {
@@ -1717,8 +1733,8 @@ static ssize_t eeh_dev_check_write(struct file *filp,
 	}
 
 	ret = eeh_dev_check_failure(edev);
-	pci_info(pdev, "eeh_dev_check_failure(%04x:%02x:%02x.%01x) = %d\n",
-			domain, bus, dev, fn, ret);
+	pci_info(pdev, "eeh_dev_check_failure(%s) = %d\n",
+			pci_name(pdev), ret);
 
 	pci_dev_put(pdev);
 
@@ -1829,25 +1845,12 @@ static ssize_t eeh_dev_break_write(struct file *filp,
 				const char __user *user_buf,
 				size_t count, loff_t *ppos)
 {
-	uint32_t domain, bus, dev, fn;
 	struct pci_dev *pdev;
-	char buf[20];
 	int ret;
 
-	memset(buf, 0, sizeof(buf));
-	ret = simple_write_to_buffer(buf, sizeof(buf)-1, ppos, user_buf, count);
-	if (!ret)
-		return -EFAULT;
-
-	ret = sscanf(buf, "%x:%x:%x.%x", &domain, &bus, &dev, &fn);
-	if (ret != 4) {
-		pr_err("%s: expected 4 args, got %d\n", __func__, ret);
-		return -EINVAL;
-	}
-
-	pdev = pci_get_domain_bus_and_slot(domain, bus, (dev << 3) | fn);
-	if (!pdev)
-		return -ENODEV;
+	pdev = eeh_debug_lookup_pdev(filp, user_buf, count, ppos);
+	if (IS_ERR(pdev))
+		return PTR_ERR(pdev);
 
 	ret = eeh_debugfs_break_device(pdev);
 	pci_dev_put(pdev);
-- 
2.26.2
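
The helper parses a `domain:bus:dev.fn` string and packs the device and function numbers into a single devfn value via `(dev << 3) | fn`. As a rough userspace illustration of that encoding only (a sketch; `bdf_to_devfn` is a made-up name and performs none of the kernel-side device lookup):

```shell
#!/bin/sh
# Hypothetical helper: print the devfn value for a DDDD:BB:DD.F name,
# using the same (dev << 3) | fn packing as the kernel helper above.
bdf_to_devfn() {
	case "$1" in
	*:*:*.*) ;;            # loosely require the domain:bus:dev.fn shape
	*) return 1 ;;
	esac

	devfn="${1##*:}"       # strip "DDDD:BB:" leaving "DD.F"
	dev="${devfn%.*}"      # device number (hex digits)
	fn="${devfn#*.}"       # function number (hex digits)

	echo $(( (0x$dev << 3) | 0x$fn ))
}
```

For example, `bdf_to_devfn 0000:01:1f.3` prints 251: 0x1f shifted left by three bits is 248, and the function number contributes the low three bits.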
[PATCH 3/3] selftests/powerpc: Add VF recovery tests
The basic EEH test ignores VFs since the way the eeh_dev_break debugfs
interface works means that if multiple VFs are enabled we may cause errors
on all of them. However, we can work around that by only enabling a single
VF at a time. This patch adds some infrastructure for finding SR-IOV capable
devices and enabling / disabling VFs so we can exercise the VF specific EEH
recovery paths. Two new tests are added, one for testing EEH aware devices
and one for EEH un-aware VFs.

Signed-off-by: Oliver O'Halloran
---
 .../selftests/powerpc/eeh/eeh-functions.sh    | 108 ++
 .../selftests/powerpc/eeh/eeh-vf-aware.sh     |  45 
 .../selftests/powerpc/eeh/eeh-vf-unaware.sh   |  35 ++
 3 files changed, 188 insertions(+)
 create mode 100755 tools/testing/selftests/powerpc/eeh/eeh-vf-aware.sh
 create mode 100755 tools/testing/selftests/powerpc/eeh/eeh-vf-unaware.sh

diff --git a/tools/testing/selftests/powerpc/eeh/eeh-functions.sh b/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
index 32e5b7fbf18a..70daa3925dcb 100644
--- a/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
+++ b/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
@@ -135,3 +135,111 @@ eeh_one_dev() {
 	return 0;
 }
 
+eeh_has_driver() {
+	test -e /sys/bus/pci/devices/$1/driver;
+	return $?
+}
+
+eeh_can_recover() {
+	# we'll get an IO error if the device's current driver doesn't support
+	# error recovery
+	echo $1 > '/sys/kernel/debug/powerpc/eeh_dev_can_recover' 2>/dev/null
+
+	return $?
+}
+
+eeh_find_all_pfs() {
+	devices=""
+
+	# SR-IOV on pseries requires hypervisor support, so check for that
+	is_pseries=""
+	if grep -q pSeries /proc/cpuinfo ; then
+		if [ ! -f /proc/device-tree/rtas/ibm,open-sriov-allow-unfreeze ] ||
+		   [ ! -f /proc/device-tree/rtas/ibm,open-sriov-map-pe-number ] ; then
+			return 1;
+		fi
+
+		is_pseries="true"
+	fi
+
+	for dev in `ls -1 /sys/bus/pci/devices/` ; do
+		sysfs="/sys/bus/pci/devices/$dev"
+		if [ ! -e "$sysfs/sriov_numvfs" ] ; then
+			continue
+		fi
+
+		# skip unsupported PFs on pseries
+		if [ -z "$is_pseries" ] &&
+		   [ ! -f "$sysfs/of_node/ibm,is-open-sriov-pf" ] &&
+		   [ ! -f "$sysfs/of_node/ibm,open-sriov-vf-bar-info" ] ; then
+			continue;
+		fi
+
+		# no driver, no vfs
+		if ! eeh_has_driver $dev ; then
+			continue
+		fi
+
+		devices="$devices $dev"
+	done
+
+	if [ -z "$devices" ] ; then
+		return 1;
+	fi
+
+	echo $devices
+	return 0;
+}
+
+# attempts to enable one VF on each PF so we can do VF specific tests.
+# stdout: list of enabled VFs, one per line
+# return code: 0 if vfs are found, 1 otherwise
+eeh_enable_vfs() {
+	pf_list="$(eeh_find_all_pfs)"
+
+	vfs=0
+	for dev in $pf_list ; do
+		pf_sysfs="/sys/bus/pci/devices/$dev"
+
+		# make sure we have a single VF
+		echo 0 > "$pf_sysfs/sriov_numvfs"
+		echo 1 > "$pf_sysfs/sriov_numvfs"
+		if [ "$?" != 0 ] ; then
+			log "Unable to enable VFs on $dev, skipping"
+			continue;
+		fi
+
+		vf="$(basename $(realpath "$pf_sysfs/virtfn0"))"
+		if [ $? != 0 ] ; then
+			log "unable to find enabled vf on $dev"
+			echo 0 > "$pf_sysfs/sriov_numvfs"
+			continue;
+		fi
+
+		if ! eeh_can_break $vf ; then
+			log "skipping "
+
+			echo 0 > "$pf_sysfs/sriov_numvfs"
+			continue;
+		fi
+
+		vfs="$((vfs + 1))"
+		echo $vf
+	done
+
+	test "$vfs" != 0
+	return $?
+}
+
+eeh_disable_vfs() {
+	pf_list="$(eeh_find_all_pfs)"
+	if [ -z "$pf_list" ] ; then
+		return 1;
+	fi
+
+	for dev in $pf_list ; do
+		echo 0 > "/sys/bus/pci/devices/$dev/sriov_numvfs"
+	done
+
+	return 0;
+}
diff --git a/tools/testing/selftests/powerpc/eeh/eeh-vf-aware.sh b/tools/testing/selftests/powerpc/eeh/eeh-vf-aware.sh
new file mode 100755
index ..874c11953bb6
--- /dev/null
+++ b/tools/testing/selftests/powerpc/eeh/eeh-vf-aware.sh
@@ -0,0 +1,45 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0-only
+
+. ./eeh-functions.sh
+
+eeh_test_prep # NB: may exit
+
+vf_list="$(eeh_enable_vfs)";
+if [ $? != 0 ] ; then
+	log "No usable VFs found. Skipping EEH unaware VF test"
+	exit $KSELFTESTS_SKIP;
+fi
+
+log "Enabled VFs: $vf_list"
+
+tested=0
+passed=0
+for vf in $vf_list ; do
+	log "Testing $vf"
+
+	if ! e
[PATCH 2/3] selftests/powerpc: Use stderr for debug messages in eeh-functions
We want to use stdout to return lists of devices, etc so log debug / status
messages to stderr rather than stdout.

Signed-off-by: Oliver O'Halloran
---
 .../selftests/powerpc/eeh/eeh-functions.sh    | 20 +++
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/powerpc/eeh/eeh-functions.sh b/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
index 9b1bcc1fd4ad..32e5b7fbf18a 100644
--- a/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
+++ b/tools/testing/selftests/powerpc/eeh/eeh-functions.sh
@@ -3,6 +3,10 @@
 
 export KSELFTESTS_SKIP=4
 
+log() {
+	echo >/dev/stderr $*
+}
+
 pe_ok() {
 	local dev="$1"
 	local path="/sys/bus/pci/devices/$dev/eeh_pe_state"
@@ -49,7 +53,7 @@ eeh_test_prep() {
 
 	if [ ! -e "/sys/kernel/debug/powerpc/eeh_dev_check" ] && \
 	   [ ! -e "/sys/kernel/debug/powerpc/eeh_dev_break" ] ; then
-		echo "debugfs EEH testing files are missing. Is debugfs mounted?"
+		log "debugfs EEH testing files are missing. Is debugfs mounted?"
 		exit $KSELFTESTS_SKIP;
 	fi
 
@@ -61,7 +65,7 @@ eeh_test_prep() {
 eeh_can_break() {
 	# skip bridges since we can't recover them (yet...)
 	if [ -e "/sys/bus/pci/devices/$dev/pci_bus" ] ; then
-		echo "$dev, Skipped: bridge"
+		log "$dev, Skipped: bridge"
 		return 1;
 	fi
 
@@ -70,7 +74,7 @@ eeh_can_break() {
 	# it the system will generally go down. We should probably fix that
 	# at some point
 	if [ "ahci" = "$(basename $(realpath /sys/bus/pci/devices/$dev/driver))" ] ; then
-		echo "$dev, Skipped: ahci doesn't support recovery"
+		log "$dev, Skipped: ahci doesn't support recovery"
 		return 1;
 	fi
 
@@ -80,7 +84,7 @@ eeh_can_break() {
 	# result in the recovery failing and the device being marked as
 	# failed.
 	if ! pe_ok $dev ; then
-		echo "$dev, Skipped: Bad initial PE state"
+		log "$dev, Skipped: Bad initial PE state"
 		return 1;
 	fi
 
@@ -94,7 +98,7 @@ eeh_one_dev() {
 	# testing so check that the argument is a well-formed sysfs device
 	# name.
 	if ! test -e /sys/bus/pci/devices/$dev/ ; then
-		echo "Error: '$dev' must be a sysfs device name (:BB:DD.F)"
+		log "Error: '$dev' must be a sysfs device name (:BB:DD.F)"
 		return 1;
 	fi
 
@@ -118,16 +122,16 @@ eeh_one_dev() {
 		if pe_ok $dev ; then
 			break;
 		fi
-		echo "$dev, waited $i/${max_wait}"
+		log "$dev, waited $i/${max_wait}"
 		sleep 1
 	done
 
 	if ! pe_ok $dev ; then
-		echo "$dev, Failed to recover!"
+		log "$dev, Failed to recover!"
 		return 1;
 	fi
 
-	echo "$dev, Recovered after $i seconds"
+	log "$dev, Recovered after $i seconds"
 	return 0;
 }
 
-- 
2.26.2
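
The reason for the change shows up once a function's stdout is captured with `$(...)`: anything echoed to stdout becomes part of the returned value. A minimal sketch of the pattern follows, assuming a made-up `list_devices` function and a fixed device name purely for illustration. The `log` body here uses `>&2`, which has the same effect as writing to `/dev/stderr` in the patch.

```shell
#!/bin/sh
# Status messages go to stderr so they never pollute captured output.
log() {
	echo "$@" >&2
}

# Hypothetical function that "returns" a device list on stdout.
list_devices() {
	log "scanning for devices"
	echo "0000:01:00.0"
}

# Only the device name is captured; the log line still reaches the terminal.
devices="$(list_devices)"
```

Had `list_devices` used a plain `echo` for its status message, `$devices` would contain the message text as well as the device name.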
[PATCH 1/3] selftests/powerpc: Hoist helper code out of eeh-basic
Hoist some of the useful test environment checking and prep code into eeh-functions.sh so they can be reused in other tests. Signed-off-by: Oliver O'Halloran --- .../selftests/powerpc/eeh/eeh-basic.sh| 39 ++- .../selftests/powerpc/eeh/eeh-functions.sh| 48 +++ 2 files changed, 51 insertions(+), 36 deletions(-) mode change 100755 => 100644 tools/testing/selftests/powerpc/eeh/eeh-functions.sh diff --git a/tools/testing/selftests/powerpc/eeh/eeh-basic.sh b/tools/testing/selftests/powerpc/eeh/eeh-basic.sh index 0d783e1065c8..16d00555f13e 100755 --- a/tools/testing/selftests/powerpc/eeh/eeh-basic.sh +++ b/tools/testing/selftests/powerpc/eeh/eeh-basic.sh @@ -1,28 +1,13 @@ #!/bin/sh # SPDX-License-Identifier: GPL-2.0-only -KSELFTESTS_SKIP=4 - . ./eeh-functions.sh -if ! eeh_supported ; then - echo "EEH not supported on this system, skipping" - exit $KSELFTESTS_SKIP; -fi - -if [ ! -e "/sys/kernel/debug/powerpc/eeh_dev_check" ] && \ - [ ! -e "/sys/kernel/debug/powerpc/eeh_dev_break" ] ; then - echo "debugfs EEH testing files are missing. Is debugfs mounted?" - exit $KSELFTESTS_SKIP; -fi +eeh_test_prep # NB: may exit pre_lspci=`mktemp` lspci > $pre_lspci -# Bump the max freeze count to something absurd so we don't -# trip over it while breaking things. -echo 5000 > /sys/kernel/debug/powerpc/eeh_max_freezes - # record the devices that we break in here. Assuming everything # goes to plan we should get them back once the recover process # is finished. @@ -30,34 +15,16 @@ devices="" # Build up a list of candidate devices. for dev in `ls -1 /sys/bus/pci/devices/ | grep '\.0$'` ; do - # skip bridges since we can't recover them (yet...) - if [ -e "/sys/bus/pci/devices/$dev/pci_bus" ] ; then - echo "$dev, Skipped: bridge" + if ! eeh_can_break $dev ; then continue; fi - # Skip VFs for now since we don't have a reliable way - # to break them. + # Skip VFs for now since we don't have a reliable way to break them. 
if [ -e "/sys/bus/pci/devices/$dev/physfn" ] ; then echo "$dev, Skipped: virtfn" continue; fi - if [ "ahci" = "$(basename $(realpath /sys/bus/pci/devices/$dev/driver))" ] ; then - echo "$dev, Skipped: ahci doesn't support recovery" - continue - fi - - # Don't inject errosr into an already-frozen PE. This happens with - # PEs that contain multiple PCI devices (e.g. multi-function cards) - # and injecting new errors during the recovery process will probably - # result in the recovery failing and the device being marked as - # failed. - if ! pe_ok $dev ; then - echo "$dev, Skipped: Bad initial PE state" - continue; - fi - echo "$dev, Added" # Add to this list of device to check diff --git a/tools/testing/selftests/powerpc/eeh/eeh-functions.sh b/tools/testing/selftests/powerpc/eeh/eeh-functions.sh old mode 100755 new mode 100644 index 00dc32c0ed75..9b1bcc1fd4ad --- a/tools/testing/selftests/powerpc/eeh/eeh-functions.sh +++ b/tools/testing/selftests/powerpc/eeh/eeh-functions.sh @@ -1,6 +1,8 @@ #!/bin/sh # SPDX-License-Identifier: GPL-2.0-only +export KSELFTESTS_SKIP=4 + pe_ok() { local dev="$1" local path="/sys/bus/pci/devices/$dev/eeh_pe_state" @@ -39,6 +41,52 @@ eeh_supported() { grep -q 'EEH Subsystem is enabled' /proc/powerpc/eeh } +eeh_test_prep() { + if ! eeh_supported ; then + echo "EEH not supported on this system, skipping" + exit $KSELFTESTS_SKIP; + fi + + if [ ! -e "/sys/kernel/debug/powerpc/eeh_dev_check" ] && \ + [ ! -e "/sys/kernel/debug/powerpc/eeh_dev_break" ] ; then + echo "debugfs EEH testing files are missing. Is debugfs mounted?" + exit $KSELFTESTS_SKIP; + fi + + # Bump the max freeze count to something absurd so we don't + # trip over it while breaking things. + echo 5000 > /sys/kernel/debug/powerpc/eeh_max_freezes +} + +eeh_can_break() { + # skip bridges since we can't recover them (yet...) + if [ -e "/sys/bus/pci/devices/$dev/pci_bus" ] ; then + echo "$dev, Skipped: bridge" + return 1; + fi + + # The ahci driver doesn't support error recovery. 
If the ahci device + # happens to be hosting the root filesystem, and then we go and break + # it the system will generally go down. We should probably fix that + # at some point + if [ "ahci" = "$(basename $(realpath /sys/bus/pci/devices/$dev/driver))" ] ; then + echo "$dev, Skipped: ahci doesn't support recovery" + return 1; + fi + + # Don't inject errosr into an already-frozen PE. This happens with + # PEs that contain multiple PCI devices (e.g. multi-functi
Re: [PATCH v2] powerpc/pci: unmap legacy INTx interrupts when a PHB is removed
On Tue, Nov 3, 2020 at 1:39 AM Cédric Le Goater wrote:
>
> On 10/14/20 4:55 AM, Alexey Kardashevskiy wrote:
> >
> > How do you remove PHBs exactly? There is no such thing in the powernv
> > platform, I thought someone added this and you are fixing it but no. PHBs
> > on powernv are created at the boot time and there is no way to remove them,
> > you can only try removing all the bridges.
>
> yes. I noticed that later when proposing the fix for the double
> free.
>
> > So what exactly are you doing?
>
> What you just said above, with the commands :
>
> echo 1 > /sys/devices/pci0031\:00/0031\:00\:00.0/remove
> echo 1 > /sys/devices/pci0031\:00/pci_bus/0031\:00/rescan

Right, so that'll remove the root port device (and Bus 01 beneath it),
but the PHB itself is still there. If it was removed the root bus
would also disappear.
[PATCH 18/18] powerpc/powermac: Move PHB discovery
Signed-off-by: Oliver O'Halloran --- compile tested with pmac32_defconfig and g5_defconfig --- arch/powerpc/platforms/powermac/setup.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/powermac/setup.c b/arch/powerpc/platforms/powermac/setup.c index 2e2cc0c75d87..86aee3f2483f 100644 --- a/arch/powerpc/platforms/powermac/setup.c +++ b/arch/powerpc/platforms/powermac/setup.c @@ -298,9 +298,6 @@ static void __init pmac_setup_arch(void) of_node_put(ic); } - /* Lookup PCI hosts */ - pmac_pci_init(); - #ifdef CONFIG_PPC32 ohare_init(); l2cr_init(); @@ -600,6 +597,7 @@ define_machine(powermac) { .name = "PowerMac", .probe = pmac_probe, .setup_arch = pmac_setup_arch, + .discover_phbs = pmac_pci_init, .show_cpuinfo = pmac_show_cpuinfo, .init_IRQ = pmac_pic_init, .get_irq= NULL, /* changed later */ -- 2.26.2
[PATCH 17/18] powerpc/pasemi: Move PHB discovery
Signed-off-by: Oliver O'Halloran --- compile tested with pasemi_defconfig --- arch/powerpc/platforms/pasemi/setup.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/powerpc/platforms/pasemi/setup.c b/arch/powerpc/platforms/pasemi/setup.c index b612474f8f8e..376797eb7894 100644 --- a/arch/powerpc/platforms/pasemi/setup.c +++ b/arch/powerpc/platforms/pasemi/setup.c @@ -144,8 +144,6 @@ static void __init pas_setup_arch(void) /* Setup SMP callback */ smp_ops = &pas_smp_ops; #endif - /* Lookup PCI hosts */ - pas_pci_init(); /* Remap SDC register for doing reset */ /* XXXOJN This should maybe come out of the device tree */ @@ -446,6 +444,7 @@ define_machine(pasemi) { .name = "PA Semi PWRficient", .probe = pas_probe, .setup_arch = pas_setup_arch, + .discover_phbs = pas_pci_init, .init_IRQ = pas_init_IRQ, .get_irq= mpic_get_irq, .restart= pas_restart, -- 2.26.2
[PATCH 16/18] powerpc/embedded6xx/mve5100: Move PHB discovery
Signed-off-by: Oliver O'Halloran --- compile tested with mvme5100_defconfig --- arch/powerpc/platforms/embedded6xx/mvme5100.c | 13 - arch/powerpc/platforms/embedded6xx/storcenter.c | 8 ++-- 2 files changed, 14 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/platforms/embedded6xx/mvme5100.c b/arch/powerpc/platforms/embedded6xx/mvme5100.c index 1cd488daa0bf..c06a0490d157 100644 --- a/arch/powerpc/platforms/embedded6xx/mvme5100.c +++ b/arch/powerpc/platforms/embedded6xx/mvme5100.c @@ -154,17 +154,19 @@ static const struct of_device_id mvme5100_of_bus_ids[] __initconst = { */ static void __init mvme5100_setup_arch(void) { - struct device_node *np; - if (ppc_md.progress) ppc_md.progress("mvme5100_setup_arch()", 0); - for_each_compatible_node(np, "pci", "hawk-pci") - mvme5100_add_bridge(np); - restart = ioremap(BOARD_MODRST_REG, 4); } +static void __init mvme5100_setup_pci(void) +{ + struct device_node *np; + + for_each_compatible_node(np, "pci", "hawk-pci") + mvme5100_add_bridge(np); +} static void mvme5100_show_cpuinfo(struct seq_file *m) { @@ -205,6 +207,7 @@ define_machine(mvme5100) { .name = "MVME5100", .probe = mvme5100_probe, .setup_arch = mvme5100_setup_arch, + .discover_phbs = mvme5100_setup_pci, .init_IRQ = mvme5100_pic_init, .show_cpuinfo = mvme5100_show_cpuinfo, .get_irq= mpic_get_irq, diff --git a/arch/powerpc/platforms/embedded6xx/storcenter.c b/arch/powerpc/platforms/embedded6xx/storcenter.c index e346ddcef45e..e188b90f7016 100644 --- a/arch/powerpc/platforms/embedded6xx/storcenter.c +++ b/arch/powerpc/platforms/embedded6xx/storcenter.c @@ -65,14 +65,17 @@ static int __init storcenter_add_bridge(struct device_node *dev) } static void __init storcenter_setup_arch(void) +{ + printk(KERN_INFO "IOMEGA StorCenter\n"); +} + +static void __init storcenter_setup_pci(void) { struct device_node *np; /* Lookup PCI host bridges */ for_each_compatible_node(np, "pci", "mpc10x-pci") storcenter_add_bridge(np); - - printk(KERN_INFO "IOMEGA StorCenter\n"); } /* @@ 
-117,6 +120,7 @@ define_machine(storcenter){ .name = "IOMEGA StorCenter", .probe = storcenter_probe, .setup_arch = storcenter_setup_arch, + .discover_phbs = storcenter_setup_pci, .init_IRQ = storcenter_init_IRQ, .get_irq= mpic_get_irq, .restart= storcenter_restart, -- 2.26.2
[PATCH 15/18] powerpc/embedded6xx/mpc7448: Move PHB discovery
Signed-off-by: Oliver O'Halloran --- compile tested with mpc7448_hpc2_defconfig --- arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c | 14 +- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c b/arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c index b95c3380d2b5..5565647dc879 100644 --- a/arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c +++ b/arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c @@ -58,16 +58,14 @@ int mpc7448_hpc2_exclude_device(struct pci_controller *hose, return PCIBIOS_SUCCESSFUL; } -static void __init mpc7448_hpc2_setup_arch(void) +static void __init mpc7448_hpc2_setup_pci(void) { +#ifdef CONFIG_PCI struct device_node *np; if (ppc_md.progress) - ppc_md.progress("mpc7448_hpc2_setup_arch():set_bridge", 0); - - tsi108_csr_vir_base = get_vir_csrbase(); + ppc_md.progress("mpc7448_hpc2_setup_pci():set_bridge", 0); /* setup PCI host bridge */ -#ifdef CONFIG_PCI for_each_compatible_node(np, "pci", "tsi108-pci") tsi108_setup_pci(np, MPC7448HPC2_PCI_CFG_PHYS, 0); @@ -75,6 +73,11 @@ static void __init mpc7448_hpc2_setup_arch(void) if (ppc_md.progress) ppc_md.progress("tsi108: resources set", 0x100); #endif +} + +static void __init mpc7448_hpc2_setup_arch(void) +{ + tsi108_csr_vir_base = get_vir_csrbase(); printk(KERN_INFO "MPC7448HPC2 (TAIGA) Platform\n"); printk(KERN_INFO @@ -181,6 +184,7 @@ define_machine(mpc7448_hpc2){ .name = "MPC7448 HPC2", .probe = mpc7448_hpc2_probe, .setup_arch = mpc7448_hpc2_setup_arch, + .discover_phbs = mpc7448_hpc2_setup_pci, .init_IRQ = mpc7448_hpc2_init_IRQ, .show_cpuinfo = mpc7448_hpc2_show_cpuinfo, .get_irq= mpic_get_irq, -- 2.26.2
[PATCH 14/18] powerpc/embedded6xx/linkstation: Move PHB discovery
Signed-off-by: Oliver O'Halloran --- compile tested with linkstation_defconfig --- arch/powerpc/platforms/embedded6xx/linkstation.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/embedded6xx/linkstation.c b/arch/powerpc/platforms/embedded6xx/linkstation.c index f514d5d28cd4..eb8342e7f84e 100644 --- a/arch/powerpc/platforms/embedded6xx/linkstation.c +++ b/arch/powerpc/platforms/embedded6xx/linkstation.c @@ -63,15 +63,18 @@ static int __init linkstation_add_bridge(struct device_node *dev) } static void __init linkstation_setup_arch(void) +{ + printk(KERN_INFO "BUFFALO Network Attached Storage Series\n"); + printk(KERN_INFO "(C) 2002-2005 BUFFALO INC.\n"); +} + +static void __init linkstation_setup_pci(void) { struct device_node *np; /* Lookup PCI host bridges */ for_each_compatible_node(np, "pci", "mpc10x-pci") linkstation_add_bridge(np); - - printk(KERN_INFO "BUFFALO Network Attached Storage Series\n"); - printk(KERN_INFO "(C) 2002-2005 BUFFALO INC.\n"); } /* @@ -153,6 +156,7 @@ define_machine(linkstation){ .name = "Buffalo Linkstation", .probe = linkstation_probe, .setup_arch = linkstation_setup_arch, + .discover_phbs = linkstation_setup_pci, .init_IRQ = linkstation_init_IRQ, .show_cpuinfo = linkstation_show_cpuinfo, .get_irq= mpic_get_irq, -- 2.26.2
[PATCH 13/18] powerpc/embedded6xx/holly: Move PHB discovery
Signed-off-by: Oliver O'Halloran --- compile tested with holly_defconfig --- arch/powerpc/platforms/embedded6xx/holly.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/embedded6xx/holly.c b/arch/powerpc/platforms/embedded6xx/holly.c index d8f2e2c737bb..53065d564161 100644 --- a/arch/powerpc/platforms/embedded6xx/holly.c +++ b/arch/powerpc/platforms/embedded6xx/holly.c @@ -108,15 +108,13 @@ static void holly_remap_bridge(void) tsi108_write_reg(TSI108_PCI_P2O_BAR2, 0x0); } -static void __init holly_setup_arch(void) +static void __init holly_init_pci(void) { struct device_node *np; if (ppc_md.progress) ppc_md.progress("holly_setup_arch():set_bridge", 0); - tsi108_csr_vir_base = get_vir_csrbase(); - /* setup PCI host bridge */ holly_remap_bridge(); @@ -127,6 +125,11 @@ static void __init holly_setup_arch(void) ppc_md.pci_exclude_device = holly_exclude_device; if (ppc_md.progress) ppc_md.progress("tsi108: resources set", 0x100); +} + +static void __init holly_setup_arch(void) +{ + tsi108_csr_vir_base = get_vir_csrbase(); printk(KERN_INFO "PPC750GX/CL Platform\n"); } @@ -259,6 +262,7 @@ define_machine(holly){ .name = "PPC750 GX/CL TSI", .probe = holly_probe, .setup_arch = holly_setup_arch, + .discover_phbs = holly_init_pci, .init_IRQ = holly_init_IRQ, .show_cpuinfo = holly_show_cpuinfo, .get_irq= mpic_get_irq, -- 2.26.2
[PATCH 12/18] powerpc/chrp: Move PHB discovery
Signed-off-by: Oliver O'Halloran --- compile tested with chrp32_defconfig --- arch/powerpc/platforms/chrp/pci.c | 8 arch/powerpc/platforms/chrp/setup.c | 12 +--- 2 files changed, 9 insertions(+), 11 deletions(-) diff --git a/arch/powerpc/platforms/chrp/pci.c b/arch/powerpc/platforms/chrp/pci.c index b2c2bf35b76c..8c421dc78b28 100644 --- a/arch/powerpc/platforms/chrp/pci.c +++ b/arch/powerpc/platforms/chrp/pci.c @@ -314,6 +314,14 @@ chrp_find_bridges(void) } } of_node_put(root); + + /* +* "Temporary" fixes for PCI devices. +* -- Geert +*/ + hydra_init(); /* Mac I/O */ + + pci_create_OF_bus_map(); } /* SL82C105 IDE Control/Status Register */ diff --git a/arch/powerpc/platforms/chrp/setup.c b/arch/powerpc/platforms/chrp/setup.c index c45435aa5e36..3cfc382841e5 100644 --- a/arch/powerpc/platforms/chrp/setup.c +++ b/arch/powerpc/platforms/chrp/setup.c @@ -334,22 +334,11 @@ static void __init chrp_setup_arch(void) /* On pegasos, enable the L2 cache if not already done by OF */ pegasos_set_l2cr(); - /* Lookup PCI host bridges */ - chrp_find_bridges(); - - /* -* Temporary fixes for PCI devices. -* -- Geert -*/ - hydra_init(); /* Mac I/O */ - /* * Fix the Super I/O configuration */ sio_init(); - pci_create_OF_bus_map(); - /* * Print the banner, then scroll down so boot progress * can be printed. -- Cort @@ -582,6 +571,7 @@ define_machine(chrp) { .name = "CHRP", .probe = chrp_probe, .setup_arch = chrp_setup_arch, + .discover_phbs = chrp_find_bridges, .init = chrp_init2, .show_cpuinfo = chrp_show_cpuinfo, .init_IRQ = chrp_init_IRQ, -- 2.26.2
[PATCH 11/18] powerpc/amigaone: Move PHB discovery
Signed-off-by: Oliver O'Halloran --- compile tested with amigaone_defconfig --- arch/powerpc/platforms/amigaone/setup.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/amigaone/setup.c b/arch/powerpc/platforms/amigaone/setup.c index f5d0bf999759..b25ddf39dd43 100644 --- a/arch/powerpc/platforms/amigaone/setup.c +++ b/arch/powerpc/platforms/amigaone/setup.c @@ -65,6 +65,12 @@ static int __init amigaone_add_bridge(struct device_node *dev) } void __init amigaone_setup_arch(void) +{ + if (ppc_md.progress) + ppc_md.progress("Linux/PPC "UTS_RELEASE"\n", 0); +} + +void __init amigaone_discover_phbs(void) { struct device_node *np; int phb = -ENODEV; @@ -74,9 +80,6 @@ void __init amigaone_setup_arch(void) phb = amigaone_add_bridge(np); BUG_ON(phb != 0); - - if (ppc_md.progress) - ppc_md.progress("Linux/PPC "UTS_RELEASE"\n", 0); } void __init amigaone_init_IRQ(void) @@ -159,6 +162,7 @@ define_machine(amigaone) { .name = "AmigaOne", .probe = amigaone_probe, .setup_arch = amigaone_setup_arch, + .discover_phbs = amigaone_discover_phbs, .show_cpuinfo = amigaone_show_cpuinfo, .init_IRQ = amigaone_init_IRQ, .restart= amigaone_restart, -- 2.26.2
[PATCH 10/18] powerpc/83xx: Move PHB discovery
Signed-off-by: Oliver O'Halloran --- compile tested with mpc83xx_defconfig --- arch/powerpc/platforms/83xx/asp834x.c | 1 + arch/powerpc/platforms/83xx/km83xx.c | 1 + arch/powerpc/platforms/83xx/misc.c| 2 -- arch/powerpc/platforms/83xx/mpc830x_rdb.c | 1 + arch/powerpc/platforms/83xx/mpc831x_rdb.c | 1 + arch/powerpc/platforms/83xx/mpc832x_mds.c | 1 + arch/powerpc/platforms/83xx/mpc832x_rdb.c | 1 + arch/powerpc/platforms/83xx/mpc834x_itx.c | 1 + arch/powerpc/platforms/83xx/mpc834x_mds.c | 1 + arch/powerpc/platforms/83xx/mpc836x_mds.c | 1 + arch/powerpc/platforms/83xx/mpc836x_rdk.c | 1 + arch/powerpc/platforms/83xx/mpc837x_mds.c | 1 + arch/powerpc/platforms/83xx/mpc837x_rdb.c | 1 + 13 files changed, 12 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/platforms/83xx/asp834x.c b/arch/powerpc/platforms/83xx/asp834x.c index 28474876f41b..68061c2a57c1 100644 --- a/arch/powerpc/platforms/83xx/asp834x.c +++ b/arch/powerpc/platforms/83xx/asp834x.c @@ -44,6 +44,7 @@ define_machine(asp834x) { .name = "ASP8347E", .probe = asp834x_probe, .setup_arch = asp834x_setup_arch, + .discover_phbs = mpc83xx_setup_pci, .init_IRQ = mpc83xx_ipic_init_IRQ, .get_irq= ipic_get_irq, .restart= mpc83xx_restart, diff --git a/arch/powerpc/platforms/83xx/km83xx.c b/arch/powerpc/platforms/83xx/km83xx.c index bcdc2c203ec9..108e1e4d2683 100644 --- a/arch/powerpc/platforms/83xx/km83xx.c +++ b/arch/powerpc/platforms/83xx/km83xx.c @@ -180,6 +180,7 @@ define_machine(mpc83xx_km) { .name = "mpc83xx-km-platform", .probe = mpc83xx_km_probe, .setup_arch = mpc83xx_km_setup_arch, + .discover_phbs = mpc83xx_setup_pci, .init_IRQ = mpc83xx_ipic_init_IRQ, .get_irq= ipic_get_irq, .restart= mpc83xx_restart, diff --git a/arch/powerpc/platforms/83xx/misc.c b/arch/powerpc/platforms/83xx/misc.c index a952e91db3ee..3285dabcf923 100644 --- a/arch/powerpc/platforms/83xx/misc.c +++ b/arch/powerpc/platforms/83xx/misc.c @@ -132,8 +132,6 @@ void __init mpc83xx_setup_arch(void) setbat(-1, va, immrbase, immrsize, 
PAGE_KERNEL_NCG); update_bats(); } - - mpc83xx_setup_pci(); } int machine_check_83xx(struct pt_regs *regs) diff --git a/arch/powerpc/platforms/83xx/mpc830x_rdb.c b/arch/powerpc/platforms/83xx/mpc830x_rdb.c index 51426e88ec67..956d4389effa 100644 --- a/arch/powerpc/platforms/83xx/mpc830x_rdb.c +++ b/arch/powerpc/platforms/83xx/mpc830x_rdb.c @@ -48,6 +48,7 @@ define_machine(mpc830x_rdb) { .name = "MPC830x RDB", .probe = mpc830x_rdb_probe, .setup_arch = mpc830x_rdb_setup_arch, + .discover_phbs = mpc83xx_setup_pci, .init_IRQ = mpc83xx_ipic_init_IRQ, .get_irq= ipic_get_irq, .restart= mpc83xx_restart, diff --git a/arch/powerpc/platforms/83xx/mpc831x_rdb.c b/arch/powerpc/platforms/83xx/mpc831x_rdb.c index 5ccd57a48492..3b578f080e3b 100644 --- a/arch/powerpc/platforms/83xx/mpc831x_rdb.c +++ b/arch/powerpc/platforms/83xx/mpc831x_rdb.c @@ -48,6 +48,7 @@ define_machine(mpc831x_rdb) { .name = "MPC831x RDB", .probe = mpc831x_rdb_probe, .setup_arch = mpc831x_rdb_setup_arch, + .discover_phbs = mpc83xx_setup_pci, .init_IRQ = mpc83xx_ipic_init_IRQ, .get_irq= ipic_get_irq, .restart= mpc83xx_restart, diff --git a/arch/powerpc/platforms/83xx/mpc832x_mds.c b/arch/powerpc/platforms/83xx/mpc832x_mds.c index 6fa5402ebf20..850d566ef900 100644 --- a/arch/powerpc/platforms/83xx/mpc832x_mds.c +++ b/arch/powerpc/platforms/83xx/mpc832x_mds.c @@ -101,6 +101,7 @@ define_machine(mpc832x_mds) { .name = "MPC832x MDS", .probe = mpc832x_sys_probe, .setup_arch = mpc832x_sys_setup_arch, + .discover_phbs = mpc83xx_setup_pci, .init_IRQ = mpc83xx_ipic_init_IRQ, .get_irq= ipic_get_irq, .restart= mpc83xx_restart, diff --git a/arch/powerpc/platforms/83xx/mpc832x_rdb.c b/arch/powerpc/platforms/83xx/mpc832x_rdb.c index 622c625d5ce4..b6133a237a70 100644 --- a/arch/powerpc/platforms/83xx/mpc832x_rdb.c +++ b/arch/powerpc/platforms/83xx/mpc832x_rdb.c @@ -219,6 +219,7 @@ define_machine(mpc832x_rdb) { .name = "MPC832x RDB", .probe = mpc832x_rdb_probe, .setup_arch = mpc832x_rdb_setup_arch, + .discover_phbs = 
mpc83xx_setup_pci, .init_IRQ = mpc83xx_ipic_init_IRQ, .get_irq= ipic_get_irq, .restart= mpc83xx_restart, diff --git a/arch/powerpc/platforms/83xx/mpc834x_itx.c b/arc
[PATCH 09/18] powerpc/82xx/*: Move PHB discovery
Signed-off-by: Oliver O'Halloran --- compile tested with pq2fads_defconfig --- arch/powerpc/platforms/82xx/mpc8272_ads.c | 2 +- arch/powerpc/platforms/82xx/pq2fads.c | 3 +-- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/82xx/mpc8272_ads.c b/arch/powerpc/platforms/82xx/mpc8272_ads.c index 3fe1a6593280..0b5b9dec16d5 100644 --- a/arch/powerpc/platforms/82xx/mpc8272_ads.c +++ b/arch/powerpc/platforms/82xx/mpc8272_ads.c @@ -171,7 +171,6 @@ static void __init mpc8272_ads_setup_arch(void) iounmap(bcsr); init_ioports(); - pq2_init_pci(); if (ppc_md.progress) ppc_md.progress("mpc8272_ads_setup_arch(), finish", 0); @@ -205,6 +204,7 @@ define_machine(mpc8272_ads) .name = "Freescale MPC8272 ADS", .probe = mpc8272_ads_probe, .setup_arch = mpc8272_ads_setup_arch, + .discover_phbs = pq2_init_pci, .init_IRQ = mpc8272_ads_pic_init, .get_irq = cpm2_get_irq, .calibrate_decr = generic_calibrate_decr, diff --git a/arch/powerpc/platforms/82xx/pq2fads.c b/arch/powerpc/platforms/82xx/pq2fads.c index a74082140718..ac9113d524af 100644 --- a/arch/powerpc/platforms/82xx/pq2fads.c +++ b/arch/powerpc/platforms/82xx/pq2fads.c @@ -150,8 +150,6 @@ static void __init pq2fads_setup_arch(void) /* Enable external IRQs */ clrbits32(&cpm2_immr->im_siu_conf.siu_82xx.sc_siumcr, 0x0c00); - pq2_init_pci(); - if (ppc_md.progress) ppc_md.progress("pq2fads_setup_arch(), finish", 0); } @@ -184,6 +182,7 @@ define_machine(pq2fads) .name = "Freescale PQ2FADS", .probe = pq2fads_probe, .setup_arch = pq2fads_setup_arch, + .discover_phbs = pq2_init_pci, .init_IRQ = pq2fads_pic_init, .get_irq = cpm2_get_irq, .calibrate_decr = generic_calibrate_decr, -- 2.26.2
[PATCH 08/18] powerpc/52xx/mpc5200_simple: Move PHB discovery
Signed-off-by: Oliver O'Halloran --- arch/powerpc/platforms/52xx/mpc5200_simple.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/powerpc/platforms/52xx/mpc5200_simple.c b/arch/powerpc/platforms/52xx/mpc5200_simple.c index 2d01e9b2e779..b9f5675b0a1d 100644 --- a/arch/powerpc/platforms/52xx/mpc5200_simple.c +++ b/arch/powerpc/platforms/52xx/mpc5200_simple.c @@ -40,8 +40,6 @@ static void __init mpc5200_simple_setup_arch(void) /* Some mpc5200 & mpc5200b related configuration */ mpc5200_setup_xlb_arbiter(); - - mpc52xx_setup_pci(); } /* list of the supported boards */ @@ -73,6 +71,7 @@ define_machine(mpc5200_simple_platform) { .name = "mpc5200-simple-platform", .probe = mpc5200_simple_probe, .setup_arch = mpc5200_simple_setup_arch, + .discover_phbs = mpc52xx_setup_pci, .init = mpc52xx_declare_of_platform_devices, .init_IRQ = mpc52xx_init_irq, .get_irq= mpc52xx_get_irq, -- 2.26.2
[PATCH 07/18] powerpc/52xx/media5200: Move PHB discovery
Signed-off-by: Oliver O'Halloran --- arch/powerpc/platforms/52xx/media5200.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/powerpc/platforms/52xx/media5200.c b/arch/powerpc/platforms/52xx/media5200.c index 07c5bc4ed0b5..efb8bdecbcc7 100644 --- a/arch/powerpc/platforms/52xx/media5200.c +++ b/arch/powerpc/platforms/52xx/media5200.c @@ -202,8 +202,6 @@ static void __init media5200_setup_arch(void) /* Some mpc5200 & mpc5200b related configuration */ mpc5200_setup_xlb_arbiter(); - mpc52xx_setup_pci(); - np = of_find_matching_node(NULL, mpc5200_gpio_ids); gpio = of_iomap(np, 0); of_node_put(np); @@ -244,6 +242,7 @@ define_machine(media5200_platform) { .name = "media5200-platform", .probe = media5200_probe, .setup_arch = media5200_setup_arch, + .discover_phbs = mpc52xx_setup_pci, .init = mpc52xx_declare_of_platform_devices, .init_IRQ = media5200_init_irq, .get_irq= mpc52xx_get_irq, -- 2.26.2
[PATCH 06/18] powerpc/52xx/lite5200: Move PHB discovery
Signed-off-by: Oliver O'Halloran --- compile tested with 52xx/lite5200b_defconfig --- arch/powerpc/platforms/52xx/lite5200.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/powerpc/platforms/52xx/lite5200.c b/arch/powerpc/platforms/52xx/lite5200.c index 3181aac08225..04cc97397095 100644 --- a/arch/powerpc/platforms/52xx/lite5200.c +++ b/arch/powerpc/platforms/52xx/lite5200.c @@ -165,8 +165,6 @@ static void __init lite5200_setup_arch(void) mpc52xx_suspend.board_resume_finish = lite5200_resume_finish; lite5200_pm_init(); #endif - - mpc52xx_setup_pci(); } static const char * const board[] __initconst = { @@ -187,6 +185,7 @@ define_machine(lite5200) { .name = "lite5200", .probe = lite5200_probe, .setup_arch = lite5200_setup_arch, + .discover_phbs = mpc52xx_setup_pci, .init = mpc52xx_declare_of_platform_devices, .init_IRQ = mpc52xx_init_irq, .get_irq= mpc52xx_get_irq, -- 2.26.2
[PATCH 05/18] powerpc/52xx/efika: Move PHB discovery
Signed-off-by: Oliver O'Halloran --- compile tested with mpc5200_defconfig --- arch/powerpc/platforms/52xx/efika.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/powerpc/platforms/52xx/efika.c b/arch/powerpc/platforms/52xx/efika.c index 4514a6f7458a..3b7d70d71692 100644 --- a/arch/powerpc/platforms/52xx/efika.c +++ b/arch/powerpc/platforms/52xx/efika.c @@ -185,8 +185,6 @@ static void __init efika_setup_arch(void) /* Map important registers from the internal memory map */ mpc52xx_map_common_devices(); - efika_pcisetup(); - #ifdef CONFIG_PM mpc52xx_suspend.board_suspend_prepare = efika_suspend_prepare; mpc52xx_pm_init(); @@ -218,6 +216,7 @@ define_machine(efika) .name = EFIKA_PLATFORM_NAME, .probe = efika_probe, .setup_arch = efika_setup_arch, + .discover_phbs = efika_pcisetup, .init = mpc52xx_declare_of_platform_devices, .show_cpuinfo = efika_show_cpuinfo, .init_IRQ = mpc52xx_init_irq, -- 2.26.2
[PATCH 04/18] powerpc/512x: Move PHB discovery
Signed-off-by: Oliver O'Halloran --- only compile tested --- arch/powerpc/platforms/512x/mpc5121_ads.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/platforms/512x/mpc5121_ads.c b/arch/powerpc/platforms/512x/mpc5121_ads.c index 6303fbfc4e4f..9d030c2e0004 100644 --- a/arch/powerpc/platforms/512x/mpc5121_ads.c +++ b/arch/powerpc/platforms/512x/mpc5121_ads.c @@ -24,21 +24,23 @@ static void __init mpc5121_ads_setup_arch(void) { -#ifdef CONFIG_PCI - struct device_node *np; -#endif printk(KERN_INFO "MPC5121 ADS board from Freescale Semiconductor\n"); /* * cpld regs are needed early */ mpc5121_ads_cpld_map(); + mpc512x_setup_arch(); +} + +static void __init mpc5121_ads_setup_pci(void) +{ #ifdef CONFIG_PCI + struct device_node *np; + for_each_compatible_node(np, "pci", "fsl,mpc5121-pci") mpc83xx_add_bridge(np); #endif - - mpc512x_setup_arch(); } static void __init mpc5121_ads_init_IRQ(void) @@ -64,6 +66,7 @@ define_machine(mpc5121_ads) { .name = "MPC5121 ADS", .probe = mpc5121_ads_probe, .setup_arch = mpc5121_ads_setup_arch, + .discover_phbs = mpc5121_ads_setup_pci, .init = mpc512x_init, .init_IRQ = mpc5121_ads_init_IRQ, .get_irq= ipic_get_irq, -- 2.26.2
[PATCH 03/18] powerpc/maple: Move PHB discovery
Signed-off-by: Oliver O'Halloran --- arch/powerpc/platforms/maple/setup.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/maple/setup.c b/arch/powerpc/platforms/maple/setup.c index f7e66a2005b4..4e9ad5bf3efb 100644 --- a/arch/powerpc/platforms/maple/setup.c +++ b/arch/powerpc/platforms/maple/setup.c @@ -179,9 +179,6 @@ static void __init maple_setup_arch(void) #ifdef CONFIG_SMP smp_ops = &maple_smp_ops; #endif - /* Lookup PCI hosts */ - maple_pci_init(); - maple_use_rtas_reboot_and_halt_if_present(); printk(KERN_DEBUG "Using native/NAP idle loop\n"); @@ -351,6 +348,7 @@ define_machine(maple) { .name = "Maple", .probe = maple_probe, .setup_arch = maple_setup_arch, + .discover_phbs = maple_pci_init, .init_IRQ = maple_init_IRQ, .pci_irq_fixup = maple_pci_irq_fixup, .pci_get_legacy_ide_irq = maple_pci_get_legacy_ide_irq, -- 2.26.2
[PATCH 02/18] powerpc/{powernv,pseries}: Move PHB discovery
Make powernv and pseries use ppc_md.discover_phbs. These two platforms need to be done together because they both depend on pci_dn's being created from the DT. The pci_dn contains a pointer to the relevant pci_controller so they need to be created after the pci_controller structures are available, but before PCI devices are scanned. Currently this ordering is provided by initcalls and the sequence is: 1. PHBs are discovered (setup_arch) (early boot, pre-initcalls) 2. pci_dn are created from the unflattened DT (core initcall) 3. PHBs are scanned in pcibios_init() (subsys initcall) The new ppc_md.discover_phbs() function is also a core_initcall so we can't guarantee ordering between the creation of pci_controllers and the creation of pci_dn's which require a pci_controller. We could use the postcore, or core_sync initcall levels, but it's cleaner to just move the pci_dn setup into the per-PHB inits which occur inside of .discover_phbs() for these platforms. This brings the boot-time path in line with the PHB hotplug path that is used for pseries DLPAR operations too. Signed-off-by: Oliver O'Halloran --- arch/powerpc/kernel/pci_dn.c | 22 -- arch/powerpc/platforms/powernv/pci-ioda.c | 3 +++ arch/powerpc/platforms/powernv/setup.c | 4 +--- arch/powerpc/platforms/pseries/setup.c | 7 +-- 4 files changed, 9 insertions(+), 27 deletions(-) diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c index 54e240597fd9..61571ae23953 100644 --- a/arch/powerpc/kernel/pci_dn.c +++ b/arch/powerpc/kernel/pci_dn.c @@ -481,28 +481,6 @@ void pci_devs_phb_init_dynamic(struct pci_controller *phb) pci_traverse_device_nodes(dn, add_pdn, phb); } -/** - * pci_devs_phb_init - Initialize phbs and pci devs under them. - * - * This routine walks over all phb's (pci-host bridges) on the - * system, and sets up assorted pci-related structures - * (including pci info in the device node structs) for each - * pci device found underneath. 
This routine runs once, - * early in the boot sequence. - */ -static int __init pci_devs_phb_init(void) -{ - struct pci_controller *phb, *tmp; - - /* This must be done first so the device nodes have valid pci info! */ - list_for_each_entry_safe(phb, tmp, &hose_list, list_node) - pci_devs_phb_init_dynamic(phb); - - return 0; -} - -core_initcall(pci_devs_phb_init); - static void pci_dev_pdn_setup(struct pci_dev *pdev) { struct pci_dn *pdn; diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c index 2b4ceb5e6ce4..d6815f03fee3 100644 --- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -3176,6 +3176,9 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np, /* Remove M64 resource if we can't configure it successfully */ if (!phb->init_m64 || phb->init_m64(phb)) hose->mem_resources[1].flags = 0; + + /* create pci_dn's for DT nodes under this PHB */ + pci_devs_phb_init_dynamic(hose); } void __init pnv_pci_init_ioda2_phb(struct device_node *np) diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c index 9acaa0f131b9..92f5fa827909 100644 --- a/arch/powerpc/platforms/powernv/setup.c +++ b/arch/powerpc/platforms/powernv/setup.c @@ -162,9 +162,6 @@ static void __init pnv_setup_arch(void) /* Initialize SMP */ pnv_smp_init(); - /* Setup PCI */ - pnv_pci_init(); - /* Setup RTC and NVRAM callbacks */ if (firmware_has_feature(FW_FEATURE_OPAL)) opal_nvram_init(); @@ -524,6 +521,7 @@ define_machine(powernv) { .init_IRQ = pnv_init_IRQ, .show_cpuinfo = pnv_show_cpuinfo, .get_proc_freq = pnv_get_proc_freq, + .discover_phbs = pnv_pci_init, .progress = pnv_progress, .machine_shutdown = pnv_shutdown, .power_save = NULL, diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 633c45ec406d..e88b30d4b6cd 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -463,7 
+463,7 @@ void pseries_little_endian_exceptions(void) } #endif -static void __init find_and_init_phbs(void) +static void __init pSeries_discover_phbs(void) { struct device_node *node; struct pci_controller *phb; @@ -481,6 +481,9 @@ static void __init find_and_init_phbs(void) pci_process_bridge_OF_ranges(phb, node, 0); isa_bridge_find_early(phb); phb->controller_ops = pseries_pci_controller_ops; + + /* create pci_dn's for DT nodes under this PHB */ + pci_devs_phb_init_dynamic(phb); } of_node_put(root); @@ -777,7 +780,6 @@ static void __init pSeries_setup_arch(
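The ordering argument in the 02/18 commit message (a pci_dn needs its pci_controller, and two core initcalls cannot be ordered against each other) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every `model_*` name below is hypothetical and is not kernel API; `model_init_pci_dn()` merely stands in for the role that `pci_devs_phb_init_dynamic()` plays in the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_PHBS 4

/* Hypothetical stand-in for a pci_controller (PHB). */
struct model_phb {
	int id;
	int has_pci_dn;	/* set once the per-device data for this PHB exists */
};

static struct model_phb model_hose_list[MODEL_MAX_PHBS];
static size_t model_nr_phbs;

/* Stands in for pci_devs_phb_init_dynamic(): requires a valid PHB. */
static void model_init_pci_dn(struct model_phb *phb)
{
	assert(phb != NULL);
	phb->has_pci_dn = 1;
}

/*
 * Per-PHB init: register the controller, then create its pci_dn data
 * straight away, so no separate initcall has to run "after" discovery.
 */
static struct model_phb *model_init_one_phb(int id)
{
	struct model_phb *phb;

	if (model_nr_phbs >= MODEL_MAX_PHBS)
		return NULL;

	phb = &model_hose_list[model_nr_phbs++];
	phb->id = id;
	phb->has_pci_dn = 0;
	model_init_pci_dn(phb);

	return phb;
}

/* Returns 1 if every registered PHB already has its pci_dn data. */
static int model_all_phbs_ready(void)
{
	size_t i;

	for (i = 0; i < model_nr_phbs; i++)
		if (!model_hose_list[i].has_pci_dn)
			return 0;

	return 1;
}
```

Because the pci_dn setup happens inside the per-PHB init in this model, `model_all_phbs_ready()` holds after every `model_init_one_phb()` call, regardless of when other initialisation steps run.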
[PATCH 01/18] powerpc/pci: Add ppc_md.discover_phbs()
On many powerpc platforms the discovery and initialisation of pci_controllers (PHBs) happens inside of setup_arch(). This is very early in boot (pre-initcalls) and means that we're initialising the PHB long before many basic kernel services (slab allocator, debugfs, a real ioremap) are available. On PowerNV this causes an additional problem since we map the PHB registers with ioremap(). As of commit d538aadc2718 ("powerpc/ioremap: warn on early use of ioremap()") a warning is printed because we're using the "incorrect" API to set up an MMIO mapping in early boot. The kernel does provide early_ioremap(), but that is not intended to create long-lived MMIO mappings and a separate warning is printed by generic code if early_ioremap() mappings are "leaked." This is all fixable with dumb hacks like using early_ioremap() to set up the initial mapping then replacing it with a real ioremap later on in boot, but it does raise the question: Why the hell are we setting up the PHBs this early in boot? The old and wise claim it's due to "hysterical raisins." Aside from amused grapes there doesn't appear to be any real reason to maintain the current behaviour. Most of the newer embedded platforms already perform PHB discovery in an arch_initcall, and between the end of setup_arch() and the start of initcalls none of the generic kernel code does anything PCI related. On powerpc, scanning PHBs occurs in a subsys_initcall so it should be possible to move the PHB discovery to a core, postcore or arch initcall. This patch adds the ppc_md.discover_phbs hook and a core_initcall stub that calls it. The core_initcalls are the earliest to be called so this will avoid any possible issues with dependencies between initcalls. This isn't just an academic issue either since on pseries and PowerNV EEH init occurs in an arch_initcall and depends on the pci_controllers being available, similarly the creation of pci_dns occurs at core_initcall_sync (i.e. between core and postcore initcalls). 
These problems need to be addressed separately. Cc: Paul Mackerras Cc: Christophe Leroy Signed-off-by: Oliver O'Halloran --- arch/powerpc/include/asm/machdep.h | 3 +++ arch/powerpc/kernel/pci-common.c | 10 ++ 2 files changed, 13 insertions(+) diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h index 475687f24f4a..d319160d790c 100644 --- a/arch/powerpc/include/asm/machdep.h +++ b/arch/powerpc/include/asm/machdep.h @@ -59,6 +59,9 @@ struct machdep_calls { int (*pcibios_root_bridge_prepare)(struct pci_host_bridge *bridge); + /* finds all the pci_controllers present at boot */ + void(*discover_phbs)(void); + /* To setup PHBs when using automatic OF platform driver for PCI */ int (*pci_setup_phb)(struct pci_controller *host); diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index be108616a721..6265e7d1c697 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -1625,3 +1625,13 @@ static void fixup_hide_host_resource_fsl(struct pci_dev *dev) } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MOTOROLA, PCI_ANY_ID, fixup_hide_host_resource_fsl); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_FREESCALE, PCI_ANY_ID, fixup_hide_host_resource_fsl); + + +int __init discover_phbs(void) +{ + if (ppc_md.discover_phbs) + ppc_md.discover_phbs(); + + return 0; +} +core_initcall(discover_phbs); -- 2.26.2
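The optional-hook pattern that 01/18 introduces (a per-platform function pointer that a generic stub calls only when it is set) can be tried outside the kernel with a small model. This is a sketch, not the kernel code: `struct machdep_calls_model`, `model_platform_pci_init()` and `model_discover_phbs()` are hypothetical names invented for illustration, and the counter only exists so that the effect is observable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, user-space stand-in for the kernel's struct machdep_calls. */
struct machdep_calls_model {
	const char *name;
	void (*discover_phbs)(void);	/* optional: may be NULL */
};

/* Counts how many PHBs the model platform has "found". */
static int phbs_found;

/* Hypothetical platform callback, in the role of e.g. maple_pci_init(). */
static void model_platform_pci_init(void)
{
	phbs_found += 2;
}

/*
 * Same shape as the stub added by the patch: call the platform hook
 * if one is provided, and report success either way.
 */
static int model_discover_phbs(const struct machdep_calls_model *md)
{
	assert(md != NULL);

	if (md->discover_phbs)
		md->discover_phbs();

	return 0;
}
```

A platform that leaves the hook unset is simply skipped, which is why the later patches in the series can convert platforms one at a time.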
Re: Kernel panic from malloc() on SUSE 15.1?
Carl Jacobsen writes: > I've got a SUSE 15.1 install (on ppc64le) that kernel panics on a very > simple > test program, built in a slightly unusual way. > > I'm compiling on SUSE 12, using gcc 4.8.3. I'm linking to a static > copy of libcrypto.a (from openssl-1.1.1g), built without threads. > I have a 10 line C test program that compiles and runs fine on the > SUSE 12 system. If I compile the same program on SUSE 15.1 (with > gcc 7.4.1), it runs fine on SUSE 15.1. > > But, if I run the version that I compiled on SUSE 12, on the SUSE 15.1 > system, the call to RAND_status() gets to a malloc() and then panics. > (And, of course, if I just compile a call to malloc(), that runs fine > on both systems.) Here's the test program, it's really just a call to > RAND_status(): > > #include <stdio.h> > #include <openssl/rand.h> > > int main(int argc, char **argv) > { > int has_enough_data = RAND_status(); > printf("The PRNG %s been seeded with enough data\n", > has_enough_data ? "HAS" : "has NOT"); > return 0; > } > > openssl is configured/built with: > ./config no-shared no-dso no-threads -fPIC -ggdb3 -debug -static > make > > and the test program is compiled with: > gcc -ggdb3 -o rand_test rand_test.c libcrypto.a > > The kernel on SUSE 12 is: 3.12.28-4-default > And glibc is: 2.19 > > The kernel on SUSE 15.1 is: 4.12.14-197.18-default > And glibc is: 2.26 > > In a previous iteration it was panicking in pthread_once(), so > I compiled openssl without pthreads support, and now it panics > calling malloc(). What's the panic look like? cheers
[PATCH AUTOSEL 5.4 15/24] scsi: ibmvscsi: Fix potential race after loss of transport
From: Tyrel Datwyler [ Upstream commit 665e0224a3d76f36da40bd9012270fa629aa42ed ] After a loss of transport due to an adapter migration or crash/disconnect from the host partner there is a tiny window where we can race adjusting the request_limit of the adapter. The request limit is atomically increased/decreased to track the number of inflight requests against the allowed limit of our VIOS partner. After a transport loss we set the request_limit to zero to reflect this state. However, there is a window where the adapter may attempt to queue a command because the transport loss event hasn't been fully processed yet and request_limit is still greater than zero. The hypercall to send the event will fail and the error path will increment the request_limit as a result. If the adapter processes the transport event prior to this increment the request_limit becomes out of sync with the adapter state and can result in SCSI commands being submitted on the now reset connection prior to an SRP Login resulting in a protocol violation. Fix this race by protecting request_limit with the host lock when changing the value via atomic_set() to indicate no transport. Link: https://lore.kernel.org/r/20201025001355.4527-1-tyr...@linux.ibm.com Signed-off-by: Tyrel Datwyler Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/ibmvscsi/ibmvscsi.c | 36 +++- 1 file changed, 26 insertions(+), 10 deletions(-) diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c index c5711c659b517..1ab0a61e3fb59 100644 --- a/drivers/scsi/ibmvscsi/ibmvscsi.c +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c @@ -806,6 +806,22 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code) spin_unlock_irqrestore(hostdata->host->host_lock, flags); } +/** + * ibmvscsi_set_request_limit - Set the adapter request_limit in response to + * an adapter failure, reset, or SRP Login. Done under host lock to prevent + * race with SCSI command submission. 
+ * @hostdata: adapter to adjust + * @limit: new request limit + */ +static void ibmvscsi_set_request_limit(struct ibmvscsi_host_data *hostdata, int limit) +{ + unsigned long flags; + + spin_lock_irqsave(hostdata->host->host_lock, flags); + atomic_set(&hostdata->request_limit, limit); + spin_unlock_irqrestore(hostdata->host->host_lock, flags); +} + /** * ibmvscsi_reset_host - Reset the connection to the server * @hostdata: struct ibmvscsi_host_data to reset @@ -813,7 +829,7 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code) static void ibmvscsi_reset_host(struct ibmvscsi_host_data *hostdata) { scsi_block_requests(hostdata->host); - atomic_set(&hostdata->request_limit, 0); + ibmvscsi_set_request_limit(hostdata, 0); purge_requests(hostdata, DID_ERROR); hostdata->action = IBMVSCSI_HOST_ACTION_RESET; @@ -1146,13 +1162,13 @@ static void login_rsp(struct srp_event_struct *evt_struct) dev_info(hostdata->dev, "SRP_LOGIN_REJ reason %u\n", evt_struct->xfer_iu->srp.login_rej.reason); /* Login failed. */ - atomic_set(&hostdata->request_limit, -1); + ibmvscsi_set_request_limit(hostdata, -1); return; default: dev_err(hostdata->dev, "Invalid login response typecode 0x%02x!\n", evt_struct->xfer_iu->srp.login_rsp.opcode); /* Login failed. */ - atomic_set(&hostdata->request_limit, -1); + ibmvscsi_set_request_limit(hostdata, -1); return; } @@ -1163,7 +1179,7 @@ static void login_rsp(struct srp_event_struct *evt_struct) * This value is set rather than added to request_limit because * request_limit could have been set to -1 by this client. 
*/ - atomic_set(&hostdata->request_limit, + ibmvscsi_set_request_limit(hostdata, be32_to_cpu(evt_struct->xfer_iu->srp.login_rsp.req_lim_delta)); /* If we had any pending I/Os, kick them */ @@ -1195,13 +1211,13 @@ static int send_srp_login(struct ibmvscsi_host_data *hostdata) login->req_buf_fmt = cpu_to_be16(SRP_BUF_FORMAT_DIRECT | SRP_BUF_FORMAT_INDIRECT); - spin_lock_irqsave(hostdata->host->host_lock, flags); /* Start out with a request limit of 0, since this is negotiated in * the login request we are just sending and login requests always * get sent by the driver regardless of request_limit. */ - atomic_set(&hostdata->request_limit, 0); + ibmvscsi_set_request_limit(hostdata, 0); + spin_lock_irqsave(hostdata->host->host_lock, flags); rc = ibmvscsi_send_srp_event(evt_struct, hostdata, login_timeout * 2); spin_unlock_irqrestore(hostdata->host->host_lock, flags);
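The race described in the commit message above can be modelled in user space to show why the reset of request_limit is done under the same lock as command submission. This is a simplified sketch with assumptions: all `model_*` names are hypothetical, a C11 `atomic_flag` spin loop stands in for the SCSI host lock, and the submission path is reduced to "take a credit, try to send, give the credit back on failure", which is only an approximation of the driver's logic.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for struct ibmvscsi_host_data. */
struct model_host {
	atomic_flag host_lock;		/* models the SCSI host lock */
	atomic_int request_limit;	/* remaining request credits */
};

static void model_host_init(struct model_host *h, int limit)
{
	atomic_flag_clear(&h->host_lock);
	atomic_init(&h->request_limit, limit);
}

static void model_lock(struct model_host *h)
{
	while (atomic_flag_test_and_set(&h->host_lock))
		;	/* spin until the lock is free */
}

static void model_unlock(struct model_host *h)
{
	atomic_flag_clear(&h->host_lock);
}

/* Counterpart of ibmvscsi_set_request_limit(): set the limit under the lock. */
static void model_set_request_limit(struct model_host *h, int limit)
{
	model_lock(h);
	atomic_store(&h->request_limit, limit);
	model_unlock(h);
}

static int model_request_limit(struct model_host *h)
{
	return atomic_load(&h->request_limit);
}

/*
 * Simplified submission path, run entirely under the lock: take one
 * credit, and give it back if there was none or if the send fails.
 * Returns 0 if the command was queued, -1 otherwise.
 */
static int model_queue_command(struct model_host *h, int send_fails)
{
	int rc = 0;

	model_lock(h);
	if (atomic_fetch_sub(&h->request_limit, 1) <= 0 || send_fails) {
		atomic_fetch_add(&h->request_limit, 1);
		rc = -1;
	}
	model_unlock(h);

	return rc;
}
```

Since the take-credit, send, give-back sequence and the reset both hold the lock in this model, a give-back can never land after the limit was set to zero for a lost transport, which is the inconsistency the patch is meant to prevent.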
[PATCH AUTOSEL 5.8 20/29] scsi: ibmvscsi: Fix potential race after loss of transport
From: Tyrel Datwyler [ Upstream commit 665e0224a3d76f36da40bd9012270fa629aa42ed ] After a loss of transport due to an adapter migration or crash/disconnect from the host partner there is a tiny window where we can race adjusting the request_limit of the adapter. The request limit is atomically increased/decreased to track the number of inflight requests against the allowed limit of our VIOS partner. After a transport loss we set the request_limit to zero to reflect this state. However, there is a window where the adapter may attempt to queue a command because the transport loss event hasn't been fully processed yet and request_limit is still greater than zero. The hypercall to send the event will fail and the error path will increment the request_limit as a result. If the adapter processes the transport event prior to this increment the request_limit becomes out of sync with the adapter state and can result in SCSI commands being submitted on the now reset connection prior to an SRP Login resulting in a protocol violation. Fix this race by protecting request_limit with the host lock when changing the value via atomic_set() to indicate no transport. Link: https://lore.kernel.org/r/20201025001355.4527-1-tyr...@linux.ibm.com Signed-off-by: Tyrel Datwyler Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/ibmvscsi/ibmvscsi.c | 36 +++- 1 file changed, 26 insertions(+), 10 deletions(-) diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c index 14f687e9b1f44..62faeab47d905 100644 --- a/drivers/scsi/ibmvscsi/ibmvscsi.c +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c @@ -806,6 +806,22 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code) spin_unlock_irqrestore(hostdata->host->host_lock, flags); } +/** + * ibmvscsi_set_request_limit - Set the adapter request_limit in response to + * an adapter failure, reset, or SRP Login. Done under host lock to prevent + * race with SCSI command submission. 
+ * @hostdata: adapter to adjust + * @limit: new request limit + */ +static void ibmvscsi_set_request_limit(struct ibmvscsi_host_data *hostdata, int limit) +{ + unsigned long flags; + + spin_lock_irqsave(hostdata->host->host_lock, flags); + atomic_set(&hostdata->request_limit, limit); + spin_unlock_irqrestore(hostdata->host->host_lock, flags); +} + /** * ibmvscsi_reset_host - Reset the connection to the server * @hostdata: struct ibmvscsi_host_data to reset @@ -813,7 +829,7 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code) static void ibmvscsi_reset_host(struct ibmvscsi_host_data *hostdata) { scsi_block_requests(hostdata->host); - atomic_set(&hostdata->request_limit, 0); + ibmvscsi_set_request_limit(hostdata, 0); purge_requests(hostdata, DID_ERROR); hostdata->action = IBMVSCSI_HOST_ACTION_RESET; @@ -1146,13 +1162,13 @@ static void login_rsp(struct srp_event_struct *evt_struct) dev_info(hostdata->dev, "SRP_LOGIN_REJ reason %u\n", evt_struct->xfer_iu->srp.login_rej.reason); /* Login failed. */ - atomic_set(&hostdata->request_limit, -1); + ibmvscsi_set_request_limit(hostdata, -1); return; default: dev_err(hostdata->dev, "Invalid login response typecode 0x%02x!\n", evt_struct->xfer_iu->srp.login_rsp.opcode); /* Login failed. */ - atomic_set(&hostdata->request_limit, -1); + ibmvscsi_set_request_limit(hostdata, -1); return; } @@ -1163,7 +1179,7 @@ static void login_rsp(struct srp_event_struct *evt_struct) * This value is set rather than added to request_limit because * request_limit could have been set to -1 by this client. 
*/ - atomic_set(&hostdata->request_limit, + ibmvscsi_set_request_limit(hostdata, be32_to_cpu(evt_struct->xfer_iu->srp.login_rsp.req_lim_delta)); /* If we had any pending I/Os, kick them */ @@ -1195,13 +1211,13 @@ static int send_srp_login(struct ibmvscsi_host_data *hostdata) login->req_buf_fmt = cpu_to_be16(SRP_BUF_FORMAT_DIRECT | SRP_BUF_FORMAT_INDIRECT); - spin_lock_irqsave(hostdata->host->host_lock, flags); /* Start out with a request limit of 0, since this is negotiated in * the login request we are just sending and login requests always * get sent by the driver regardless of request_limit. */ - atomic_set(&hostdata->request_limit, 0); + ibmvscsi_set_request_limit(hostdata, 0); + spin_lock_irqsave(hostdata->host->host_lock, flags); rc = ibmvscsi_send_srp_event(evt_struct, hostdata, login_timeout * 2); spin_unlock_irqrestore(hostdata->host->host_lock, flags);
[PATCH AUTOSEL 5.9 24/35] scsi: ibmvscsi: Fix potential race after loss of transport
From: Tyrel Datwyler [ Upstream commit 665e0224a3d76f36da40bd9012270fa629aa42ed ] After a loss of transport due to an adapter migration or crash/disconnect from the host partner there is a tiny window where we can race adjusting the request_limit of the adapter. The request limit is atomically increased/decreased to track the number of inflight requests against the allowed limit of our VIOS partner. After a transport loss we set the request_limit to zero to reflect this state. However, there is a window where the adapter may attempt to queue a command because the transport loss event hasn't been fully processed yet and request_limit is still greater than zero. The hypercall to send the event will fail and the error path will increment the request_limit as a result. If the adapter processes the transport event prior to this increment the request_limit becomes out of sync with the adapter state and can result in SCSI commands being submitted on the now reset connection prior to an SRP Login resulting in a protocol violation. Fix this race by protecting request_limit with the host lock when changing the value via atomic_set() to indicate no transport. Link: https://lore.kernel.org/r/20201025001355.4527-1-tyr...@linux.ibm.com Signed-off-by: Tyrel Datwyler Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/ibmvscsi/ibmvscsi.c | 36 +++- 1 file changed, 26 insertions(+), 10 deletions(-) diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c index b1f3017b6547a..29fcc44be2d57 100644 --- a/drivers/scsi/ibmvscsi/ibmvscsi.c +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c @@ -806,6 +806,22 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code) spin_unlock_irqrestore(hostdata->host->host_lock, flags); } +/** + * ibmvscsi_set_request_limit - Set the adapter request_limit in response to + * an adapter failure, reset, or SRP Login. Done under host lock to prevent + * race with SCSI command submission. 
+ * @hostdata: adapter to adjust + * @limit: new request limit + */ +static void ibmvscsi_set_request_limit(struct ibmvscsi_host_data *hostdata, int limit) +{ + unsigned long flags; + + spin_lock_irqsave(hostdata->host->host_lock, flags); + atomic_set(&hostdata->request_limit, limit); + spin_unlock_irqrestore(hostdata->host->host_lock, flags); +} + /** * ibmvscsi_reset_host - Reset the connection to the server * @hostdata: struct ibmvscsi_host_data to reset @@ -813,7 +829,7 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code) static void ibmvscsi_reset_host(struct ibmvscsi_host_data *hostdata) { scsi_block_requests(hostdata->host); - atomic_set(&hostdata->request_limit, 0); + ibmvscsi_set_request_limit(hostdata, 0); purge_requests(hostdata, DID_ERROR); hostdata->action = IBMVSCSI_HOST_ACTION_RESET; @@ -1146,13 +1162,13 @@ static void login_rsp(struct srp_event_struct *evt_struct) dev_info(hostdata->dev, "SRP_LOGIN_REJ reason %u\n", evt_struct->xfer_iu->srp.login_rej.reason); /* Login failed. */ - atomic_set(&hostdata->request_limit, -1); + ibmvscsi_set_request_limit(hostdata, -1); return; default: dev_err(hostdata->dev, "Invalid login response typecode 0x%02x!\n", evt_struct->xfer_iu->srp.login_rsp.opcode); /* Login failed. */ - atomic_set(&hostdata->request_limit, -1); + ibmvscsi_set_request_limit(hostdata, -1); return; } @@ -1163,7 +1179,7 @@ static void login_rsp(struct srp_event_struct *evt_struct) * This value is set rather than added to request_limit because * request_limit could have been set to -1 by this client. 
*/ - atomic_set(&hostdata->request_limit, + ibmvscsi_set_request_limit(hostdata, be32_to_cpu(evt_struct->xfer_iu->srp.login_rsp.req_lim_delta)); /* If we had any pending I/Os, kick them */ @@ -1195,13 +1211,13 @@ static int send_srp_login(struct ibmvscsi_host_data *hostdata) login->req_buf_fmt = cpu_to_be16(SRP_BUF_FORMAT_DIRECT | SRP_BUF_FORMAT_INDIRECT); - spin_lock_irqsave(hostdata->host->host_lock, flags); /* Start out with a request limit of 0, since this is negotiated in * the login request we are just sending and login requests always * get sent by the driver regardless of request_limit. */ - atomic_set(&hostdata->request_limit, 0); + ibmvscsi_set_request_limit(hostdata, 0); + spin_lock_irqsave(hostdata->host->host_lock, flags); rc = ibmvscsi_send_srp_event(evt_struct, hostdata, login_timeout * 2); spin_unlock_irqrestore(hostdata->host->host_lock, flags);
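The locking rule this patch introduces can be sketched in userspace C. This is only an illustration under stated assumptions: `struct host_sketch`, `set_request_limit()` and `try_queue()` are hypothetical stand-ins, a pthread mutex plays the role of the SCSI host lock, and the submission path is simplified compared with the real driver. The point it shows is that the "no transport" value is written while holding the same lock the submission path holds, so a failed submission cannot re-increment the limit after the reset has been recorded.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct ibmvscsi_host_data. */
struct host_sketch {
	pthread_mutex_t host_lock;  /* plays the role of host->host_lock */
	atomic_int request_limit;   /* > 0 means a command may be queued */
};

/* Same shape as ibmvscsi_set_request_limit(): write under the lock. */
static void set_request_limit(struct host_sketch *h, int limit)
{
	pthread_mutex_lock(&h->host_lock);
	atomic_store(&h->request_limit, limit);
	pthread_mutex_unlock(&h->host_lock);
}

/*
 * Simplified submission path (assumption: it runs under the host lock).
 * Takes one slot, or undoes the decrement and fails if none is left.
 */
static int try_queue(struct host_sketch *h)
{
	int rc = 0;

	pthread_mutex_lock(&h->host_lock);
	if (atomic_fetch_sub(&h->request_limit, 1) <= 0) {
		/* Error path: give the slot back, nothing was sent. */
		atomic_fetch_add(&h->request_limit, 1);
		rc = -1;
	}
	assert(atomic_load(&h->request_limit) >= -1);
	pthread_mutex_unlock(&h->host_lock);
	return rc;
}
```

Because the helper takes the lock itself, it must not be called with the lock already held. That is why the last hunk of the patch moves spin_lock_irqsave() in send_srp_login() to after the call to ibmvscsi_set_request_limit().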
[powerpc:merge] BUILD SUCCESS 09a0972ac14f67d600aa3c80035367a8074e90eb
mvme147_defconfig h8300 edosk2674_defconfig sh sh2007_defconfig ia64 alldefconfig mips cobalt_defconfig microblazenommu_defconfig arm gemini_defconfig sparc sparc32_defconfig arm aspeed_g4_defconfig arm imx_v4_v5_defconfig sh rsk7264_defconfig arm versatile_defconfig shtitan_defconfig arm rpc_defconfig c6x alldefconfig powerpc pmac32_defconfig powerpc ksi8560_defconfig powerpcicon_defconfig arm pxa3xx_defconfig arm cns3420vb_defconfig arm colibri_pxa270_defconfig ia64 allmodconfig ia64defconfig ia64 allyesconfig m68k allmodconfig m68kdefconfig m68k allyesconfig nios2 defconfig arc allyesconfig nds32 allnoconfig c6x allyesconfig nds32 defconfig nios2allyesconfig cskydefconfig alpha defconfig alphaallyesconfig xtensa allyesconfig h8300allyesconfig arc defconfig sh allmodconfig parisc defconfig s390 allyesconfig parisc allyesconfig s390defconfig i386 allyesconfig sparcallyesconfig sparc defconfig i386defconfig mips allyesconfig mips allmodconfig powerpc allyesconfig powerpc allmodconfig powerpc allnoconfig x86_64 randconfig-a004-20201101 x86_64 randconfig-a003-20201101 x86_64 randconfig-a005-20201101 x86_64 randconfig-a002-20201101 x86_64 randconfig-a006-20201101 x86_64 randconfig-a001-20201101 i386 randconfig-a004-20201102 i386 randconfig-a006-20201102 i386 randconfig-a005-20201102 i386 randconfig-a001-20201102 i386 randconfig-a002-20201102 i386 randconfig-a003-20201102 i386 randconfig-a004-20201101 i386 randconfig-a006-20201101 i386 randconfig-a005-20201101 i386 randconfig-a001-20201101 i386 randconfig-a002-20201101 i386 randconfig-a003-20201101 x86_64 randconfig-a012-20201102 x86_64 randconfig-a015-20201102 x86_64 randconfig-a011-20201102 x86_64 randconfig-a013-20201102 x86_64 randconfig-a014-20201102 x86_64 randconfig-a016-20201102 i386 randconfig-a013-20201102 i386 randconfig-a015-20201102 i386 randconfig-a014-20201102 i386 randconfig-a016-20201102 i386 randconfig-a011-20201102 i386 randconfig-a012-20201102 riscvnommu_k210_defconfig riscvnommu_virt_defconfig 
riscv allnoconfig riscv defconfig riscv allmodconfig x86_64 rhel x86_64 allyesconfig x86_64 rhel-7.6-kselftests x86_64 defconfig x86_64 rhel-8.3 x86_64 kexec

clang tested configs:
x86_64 randconfig-a004-20201102 x86_64 randconfig-a005-20201102 x86_64 randconfig-a003-20201102 x86_64 randconfig-a002-20201102 x86_64 randconfig-a006-20201102 x86_64 randconfig-a001-20201102 x86_64 randconfig-a012-20201101 x86_64 randconfig-a015-20201101 x86_64 randconfig-a013-20201101 x86_64 randconfig-a011-20201101 x86_64 randconfig-a014-20201101 x86_64 randconfig-a016-20201101

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
[powerpc:fixes-test] BUILD SUCCESS 99f070b62322a4b8c1252952735806d09eb44b68
haps_hs_smp_defconfig arm vf610m4_defconfig arm mv78xx0_defconfig powerpc ppc40x_defconfig sh sdk7780_defconfig m68k multi_defconfig arm socfpga_defconfig riscvallyesconfig arm badge4_defconfig arm sunxi_defconfig xtensageneric_kc705_defconfig sh sh7770_generic_defconfig cskydefconfig nds32alldefconfig sh ecovec24_defconfig riscv rv32_defconfig powerpc arches_defconfig shmigor_defconfig arm pxa168_defconfig sh sh7785lcr_32bit_defconfig powerpc ppc44x_defconfig i386 alldefconfig powerpc mpc8560_ads_defconfig arm iop32x_defconfig mipsmalta_kvm_guest_defconfig mipsjmr3927_defconfig powerpc mpc836x_rdk_defconfig mipsmalta_qemu_32r6_defconfig sh microdev_defconfig powerpc rainier_defconfig arm footbridge_defconfig powerpc katmai_defconfig powerpcge_imp3a_defconfig powerpc mpc8313_rdb_defconfig powerpcgamecube_defconfig powerpc allmodconfig sh se7206_defconfig powerpc tqm8541_defconfig shecovec24-romimage_defconfig armmulti_v5_defconfig powerpc tqm5200_defconfig powerpc lite5200b_defconfig m68kmvme147_defconfig h8300 edosk2674_defconfig sh sh2007_defconfig ia64 alldefconfig mips cobalt_defconfig microblazenommu_defconfig arm gemini_defconfig sparc sparc32_defconfig powerpc allnoconfig arm aspeed_g4_defconfig sh rsk7264_defconfig arm versatile_defconfig sh sh7710voipgw_defconfig shtitan_defconfig arm rpc_defconfig c6x alldefconfig powerpc pmac32_defconfig powerpc ksi8560_defconfig powerpcicon_defconfig mips ip27_defconfig xtensasmp_lx200_defconfig m68k sun3_defconfig ia64 allmodconfig m68k allmodconfig m68kdefconfig m68k allyesconfig nios2 defconfig nds32 allnoconfig c6x allyesconfig alphaallyesconfig xtensa allyesconfig h8300allyesconfig arc defconfig sh allmodconfig parisc defconfig s390 allyesconfig parisc allyesconfig s390defconfig i386 allyesconfig sparcallyesconfig sparc defconfig i386defconfig mips allmodconfig powerpc allyesconfig x86_64 randconfig-a004-20201101 x86_64 randconfig-a003-20201101 x86_64 randconfig-a005-20201101 x86_64 randconfig-a002-20201101 x86_64 
randconfig-a006-20201101 x86_64 randconfig-a001-20201101 i386 randconfig-a004-20201102 i386 randconfig-a006-20201102 i386 randconfig-a005-20201102 i386 randconfig-a001-20201102 i386 randconfig-a002-20201102 i386 randconfig-a003-20201102 i386 randconfig-a004-20201101 i386 randconfig-a006-20201101 i386 randconfig-a005-20201101 i386 randconfig-a001-20201101 i386 randconfig-a002-20201101 i386 randconfig-a003-20201101 x86_64 randconfig-a012-20201102 x86_64 randconfig-a015-20201102 x86_64 randconfig-a011-20201102 x86_64 randconfig-a013-20201102 x86_64 randconfig-a014-20201102 x86_64
[powerpc:next-test] BUILD SUCCESS 2d83b0f30c1483a556c8aa1f7d891006fffcd5e0
rv32_defconfig powerpc tqm8541_defconfig shecovec24-romimage_defconfig armmulti_v5_defconfig powerpc tqm5200_defconfig powerpc lite5200b_defconfig m68kmvme147_defconfig h8300 edosk2674_defconfig sh sh2007_defconfig ia64 alldefconfig mips cobalt_defconfig microblazenommu_defconfig arm gemini_defconfig sparc sparc32_defconfig powerpc allnoconfig arm aspeed_g4_defconfig arm imx_v4_v5_defconfig sh rsk7264_defconfig arm versatile_defconfig shtitan_defconfig arm rpc_defconfig c6x alldefconfig powerpc pmac32_defconfig powerpc ksi8560_defconfig powerpcicon_defconfig arm pxa3xx_defconfig arm cns3420vb_defconfig arm colibri_pxa270_defconfig ia64 allmodconfig ia64defconfig m68k allmodconfig m68kdefconfig m68k allyesconfig nios2 defconfig arc allyesconfig nds32 allnoconfig c6x allyesconfig nds32 defconfig nios2allyesconfig cskydefconfig alphaallyesconfig xtensa allyesconfig h8300allyesconfig arc defconfig sh allmodconfig parisc defconfig s390 allyesconfig parisc allyesconfig s390defconfig i386 allyesconfig sparcallyesconfig sparc defconfig i386defconfig mips allyesconfig mips allmodconfig powerpc allyesconfig powerpc allmodconfig x86_64 randconfig-a004-20201101 x86_64 randconfig-a003-20201101 x86_64 randconfig-a005-20201101 x86_64 randconfig-a002-20201101 x86_64 randconfig-a006-20201101 x86_64 randconfig-a001-20201101 i386 randconfig-a004-20201102 i386 randconfig-a006-20201102 i386 randconfig-a005-20201102 i386 randconfig-a001-20201102 i386 randconfig-a002-20201102 i386 randconfig-a003-20201102 i386 randconfig-a004-20201101 i386 randconfig-a006-20201101 i386 randconfig-a005-20201101 i386 randconfig-a001-20201101 i386 randconfig-a002-20201101 i386 randconfig-a003-20201101 x86_64 randconfig-a012-20201102 x86_64 randconfig-a015-20201102 x86_64 randconfig-a011-20201102 x86_64 randconfig-a013-20201102 x86_64 randconfig-a014-20201102 x86_64 randconfig-a016-20201102 i386 randconfig-a013-20201102 i386 randconfig-a015-20201102 i386 randconfig-a014-20201102 i386 randconfig-a016-20201102 
i386 randconfig-a011-20201102 i386 randconfig-a012-20201102 riscv nommu_virt_defconfig riscv allnoconfig riscv defconfig riscv allmodconfig x86_64 rhel x86_64 allyesconfig x86_64 rhel-7.6-kselftests x86_64 defconfig x86_64 rhel-8.3 x86_64 kexec

clang tested configs:
x86_64 randconfig-a004-20201102 x86_64 randconfig-a005-20201102 x86_64 randconfig-a003-20201102 x86_64 randconfig-a002-20201102 x86_64 randconfig-a006-20201102 x86_64 randconfig-a001-20201102 x86_64 randconfig-a012-20201101 x86_64 randconfig-a015-20201101 x86_64 randconfig-a013-20201101 x86_64 randconfig-a011-20201101 x86_64 randconfig-a014-20201101 x86_64
Re: [PATCH v2 2/2] misc: ocxl: config: Rename function attribute description
On 3/11/20 1:20 am, Lee Jones wrote:
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/misc/ocxl/config.c:81: warning: Function parameter or member 'dev' not described in 'get_function_0'
>  drivers/misc/ocxl/config.c:81: warning: Excess function parameter 'device' description in 'get_function_0'
>
> Cc: Frederic Barrat
> Cc: Andrew Donnellan
> Cc: Arnd Bergmann
> Cc: Greg Kroah-Hartman
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Lee Jones

Thanks!

Acked-by: Andrew Donnellan

--
Andrew Donnellan              OzLabs, ADL Canberra
a...@linux.ibm.com             IBM Australia Limited
Re: [PATCH net-next 04/15] net: mlx5: Replace in_irq() usage.
On Sat, 2020-10-31 at 09:59 -0700, Jakub Kicinski wrote:
> On Tue, 27 Oct 2020 23:54:43 +0100 Sebastian Andrzej Siewior wrote:
> > mlx5_eq_async_int() uses in_irq() to decide whether eq::lock needs to be
> > acquired and released with spin_[un]lock() or the irq saving/restoring
> > variants.
> >
> > The usage of in_*() in drivers is phased out and Linus clearly requested
> > that code which changes behaviour depending on context should either be
> > separated or the context be conveyed in an argument passed by the caller,
> > which usually knows the context.
> >
> > mlx5_eq_async_int() knows the context via the action argument already so
> > using it for the lock variant decision is a straightforward replacement
> > for in_irq().
> >
> > Signed-off-by: Sebastian Andrzej Siewior
> > Cc: Saeed Mahameed
> > Cc: Leon Romanovsky
> > Cc: "David S. Miller"
> > Cc: Jakub Kicinski
> > Cc: linux-r...@vger.kernel.org
>
> Saeed, please pick this up into your tree.

Ack
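The idea in the quoted commit message, choosing the lock variant from an argument supplied by the caller instead of probing the context with in_irq(), can be sketched as follows. This is a userspace illustration only: `struct eq_sketch`, `lock_plain()`, `lock_irqsave()` and `eq_poll()` are hypothetical names, and the "locks" merely count which flavour was taken so the behaviour can be checked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an event queue with one lock. */
struct eq_sketch {
	int locked;         /* 1 while the lock is held */
	int plain_locks;    /* times the spin_lock()-like variant was used */
	int irqsave_locks;  /* times the spin_lock_irqsave()-like variant was used */
};

static void lock_plain(struct eq_sketch *eq)
{
	assert(!eq->locked);
	eq->locked = 1;
	eq->plain_locks++;
}

static void lock_irqsave(struct eq_sketch *eq)
{
	assert(!eq->locked);
	eq->locked = 1;
	eq->irqsave_locks++;
}

static void unlock(struct eq_sketch *eq)
{
	assert(eq->locked);
	eq->locked = 0;
}

/*
 * The caller states its context. In hard interrupt context interrupts
 * are already off, so the plain variant is enough; any other caller
 * needs the irq-saving variant.
 */
static void eq_poll(struct eq_sketch *eq, bool caller_in_hardirq)
{
	if (caller_in_hardirq)
		lock_plain(eq);
	else
		lock_irqsave(eq);

	/* ... event processing would go here ... */

	unlock(eq);
}
```

The function no longer changes behaviour based on something it has to guess; the decision is visible at every call site.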
Re: [PATCH v2 net-next 3/3] crypto: caam: Replace in_irq() usage.
On 11/2/2020 1:23 AM, Sebastian Andrzej Siewior wrote:
> The driver uses in_irq() + in_serving_softirq() magic to decide if NAPI
> scheduling is required or packet processing.
>
> The usage of in_*() in drivers is phased out and Linus clearly requested
> that code which changes behaviour depending on context should either be
> separated or the context be conveyed in an argument passed by the caller,
> which usually knows the context.
>
> Use the `sched_napi' argument passed by the callback. It is set true if
> called from the interrupt handler and NAPI should be scheduled.
>
> Signed-off-by: Sebastian Andrzej Siewior
> Cc: "Horia Geantă"
> Cc: Aymen Sghaier
> Cc: Herbert Xu
> Cc: "David S. Miller"
> Cc: Madalin Bucur
> Cc: Jakub Kicinski
> Cc: Li Yang
> Cc: linux-cry...@vger.kernel.org
> Cc: net...@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-arm-ker...@lists.infradead.org

Reviewed-by: Horia Geantă

Thanks,
Horia
Re: [PATCH v2 net-next 1/3] soc/fsl/qbman: Add an argument to signal if NAPI processing is required.
On 11/2/2020 1:23 AM, Sebastian Andrzej Siewior wrote: > dpaa_eth_napi_schedule() and caam_qi_napi_schedule() schedule NAPI if > invoked from: > > - Hard interrupt context > - Any context which is not serving soft interrupts > > Any context which is not serving soft interrupts includes hard interrupts > so the in_irq() check is redundant. caam_qi_napi_schedule() has a comment > about this: > > /* > * In case of threaded ISR, for RT kernels in_irq() does not return > * appropriate value, so use in_serving_softirq to distinguish between > * softirq and irq contexts. > */ > if (in_irq() || !in_serving_softirq()) > > This has nothing to do with RT. Even on a non RT kernel force threaded > interrupts run obviously in thread context and therefore in_irq() returns > false when invoked from the handler. > > The extension of the in_irq() check with !in_serving_softirq() was there > when the drivers were added, but in the out of tree FSL BSP the original > condition was in_irq() which got extended due to failures on RT. > Looks like the initial FSL BSP commit adding this check is: edca0b7a448a ("dpaa_eth: Fix Rx-stall issue in threaded ISR") https://source.codeaurora.org/external/qoriq/qoriq-yocto-sdk/linux/commit/?h=fsl-sdk-v1.2&id=edca0b7a448ac18ef0a9b1238209b7595d511e19 This was done for dpaa_eth and the same logic was reused in caam. In the process of upstreaming the development history got lost and the comment in dpaa_eth was removed. This was back in 2012 on a v3.0.34 kernel. Not sure if/how things changed in the meantime, i.e. whether in_irq() behaviour when called from softirq changed on -rt kernels (assuming this was the problem Priyanka tried solving). > The usage of in_xxx() in drivers is phased out and Linus clearly requested > that code which changes behaviour depending on context should either be > separated or the context be conveyed in an argument passed by the caller, > which usually knows the context. 
Right he is, the above construct is > clearly showing why. > > The following callchains have been analyzed to end up in > dpaa_eth_napi_schedule(): > > qman_p_poll_dqrr() > __poll_portal_fast() > fq->cb.dqrr() >dpaa_eth_napi_schedule() > > portal_isr() > __poll_portal_fast() > fq->cb.dqrr() >dpaa_eth_napi_schedule() > > Both need to schedule NAPI. Only the call from interrupt context. > The crypto part has another code path leading up to this: > kill_fq() > empty_retired_fq() >qman_p_poll_dqrr() > __poll_portal_fast() > fq->cb.dqrr() >dpaa_eth_napi_schedule() > > kill_fq() is called from task context and ends up scheduling NAPI, but > that's pointless and an unintended side effect of the !in_serving_softirq() > check. > Correct. > The code path: > caam_qi_poll() -> qman_p_poll_dqrr() > > is invoked from NAPI and I *assume* from crypto's NAPI device and not > from qbman's NAPI device. I *guess* it is okay to skip scheduling NAPI > (because this is what happens now) but could be changed if it is wrong > due to `budget' handling. > Looks good to me. > Add an argument to __poll_portal_fast() which is true if NAPI needs to be > scheduled. This requires propagating the value to the caller including > `qman_cb_dqrr' typedef which is used by the dpaa and the crypto driver. > > Signed-off-by: Sebastian Andrzej Siewior > Cc: "Horia Geantă" > Cc: Aymen Sghaier > Cc: Herbert XS > Cc: "David S. Miller" > Cc: Madalin Bucur > Cc: Jakub Kicinski > Cc: Li Yang > Cc: linux-cry...@vger.kernel.org > Cc: net...@vger.kernel.org > Cc: linuxppc-dev@lists.ozlabs.org > Cc: linux-arm-ker...@lists.infradead.org Reviewed-by: Horia Geantă Thanks, Horia
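The change described above, adding an argument to the poll function and to the callback type so the callee is told whether NAPI must be scheduled, can be sketched in plain C. All names here (`struct portal_sketch`, `dqrr_cb`, `eth_dqrr()`, `poll_portal_fast()`, `portal_isr()`, `poll_dqrr()`) are simplified, hypothetical versions of the functions named in the call chains; the real signatures carry more parameters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical portal state: just counts what the callback decided. */
struct portal_sketch {
	int napi_scheduled;  /* callback asked for NAPI to be scheduled */
	int handled_inline;  /* callback processed the frame directly */
};

/* Callback type with the extra argument, like the extended qman_cb_dqrr. */
typedef void (*dqrr_cb)(struct portal_sketch *p, bool sched_napi);

/* Driver callback: no in_irq() guessing, it acts on what it is told. */
static void eth_dqrr(struct portal_sketch *p, bool sched_napi)
{
	if (sched_napi)
		p->napi_scheduled++;
	else
		p->handled_inline++;
}

/* Common helper: forwards the caller's knowledge to the callback. */
static void poll_portal_fast(struct portal_sketch *p, dqrr_cb cb,
			     bool sched_napi)
{
	assert(cb);
	cb(p, sched_napi);
}

/* Interrupt path: NAPI has to be scheduled. */
static void portal_isr(struct portal_sketch *p, dqrr_cb cb)
{
	poll_portal_fast(p, cb, true);
}

/* Polling path (already in NAPI or task context): no scheduling. */
static void poll_dqrr(struct portal_sketch *p, dqrr_cb cb)
{
	poll_portal_fast(p, cb, false);
}
```

With this shape, a task-context caller such as the kill_fq() chain goes through the polling path and no longer schedules NAPI as a side effect.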
Kernel panic from malloc() on SUSE 15.1?
I've got a SUSE 15.1 install (on ppc64le) that kernel panics on a very simple test program, built in a slightly unusual way.

I'm compiling on SUSE 12, using gcc 4.8.3. I'm linking to a static copy of libcrypto.a (from openssl-1.1.1g), built without threads. I have a 10 line C test program that compiles and runs fine on the SUSE 12 system. If I compile the same program on SUSE 15.1 (with gcc 7.4.1), it runs fine on SUSE 15.1. But if I run the version that I compiled on SUSE 12 on the SUSE 15.1 system, the call to RAND_status() gets to a malloc() and then panics. (And, of course, if I just compile a call to malloc(), that runs fine on both systems.)

Here's the test program, it's really just a call to RAND_status():

    #include <stdio.h>
    #include <openssl/rand.h>

    int main(int argc, char **argv)
    {
        int has_enough_data = RAND_status();
        printf("The PRNG %s been seeded with enough data\n",
               has_enough_data ? "HAS" : "has NOT");
        return 0;
    }

openssl is configured/built with:

    ./config no-shared no-dso no-threads -fPIC -ggdb3 -debug -static
    make

and the test program is compiled with:

    gcc -ggdb3 -o rand_test rand_test.c libcrypto.a

The kernel on SUSE 12 is: 3.12.28-4-default
And glibc is: 2.19

The kernel on SUSE 15.1 is: 4.12.14-197.18-default
And glibc is: 2.26

In a previous iteration it was panicking in pthread_once(), so I compiled openssl without pthreads support, and now it panics calling malloc().

If I link to the system-supplied libcrypto.so, it works fine, and running the same tests on x86_64 works fine. It's only ppc64le that panics, and only when running code from the old system on the new one.

I'm trying to dig further down into this to come up with a standalone test case, but I'm wondering if anything here stands out as a known problem, or if someone can point me in the right direction.

Thanks,
Carl Jacobsen
Re: [PATCH 20/33] docs: ABI: testing: make the files compatible with ReST output
On Wed, Oct 28, 2020 at 03:23:18PM +0100, Mauro Carvalho Chehab wrote: > From: Mauro Carvalho Chehab > > Some files over there won't parse well by Sphinx. > [..snip..] > diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu > b/Documentation/ABI/testing/sysfs-devices-system-cpu > index b555df825447..274c337ec6a9 100644 > --- a/Documentation/ABI/testing/sysfs-devices-system-cpu > +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu > @@ -151,23 +151,28 @@ Description: > The processor idle states which are available for use have the > following attributes: > > - name: (RO) Name of the idle state (string). > + = > + name:(RO) Name of the idle state (string). > > latency: (RO) The latency to exit out of this idle state (in > - microseconds). > + microseconds). > > - power: (RO) The power consumed while in this idle state (in > - milliwatts). > + power: (RO) The power consumed while in this idle state (in > + milliwatts). > > - time: (RO) The total time spent in this idle state (in > microseconds). > + time:(RO) The total time spent in this idle state > + (in microseconds). > > - usage: (RO) Number of times this state was entered (a count). > + usage: (RO) Number of times this state was entered (a count). > > - above: (RO) Number of times this state was entered, but the > -observed CPU idle duration was too short for it (a > count). > + above: (RO) Number of times this state was entered, but the > + observed CPU idle duration was too short for it > + (a count). > > - below: (RO) Number of times this state was entered, but the > -observed CPU idle duration was too long for it (a count). > + below: (RO) Number of times this state was entered, but the > + observed CPU idle duration was too long for it > + (a count). > + = > > What:/sys/devices/system/cpu/cpuX/cpuidle/stateN/desc > Date:February 2008 > @@ -290,6 +295,7 @@ Description: Processor frequency boosting control > This switch controls the boost setting for the whole system. 
> Boosting allows the CPU and the firmware to run at a frequency > beyound it's nominal limit. > + > More details can be found in > Documentation/admin-guide/pm/cpufreq.rst > The changes to cpuidle states look good to me. [..snip..] > @@ -414,30 +434,30 @@ Description:POWERNV CPUFreq driver's frequency > throttle stats directory and > throttle attributes exported in the 'throttle_stats' directory: > > - turbo_stat : This file gives the total number of times the max > - frequency is throttled to lower frequency in turbo (at and above > - nominal frequency) range of frequencies. > + frequency is throttled to lower frequency in turbo (at and > above > + nominal frequency) range of frequencies. > > - sub_turbo_stat : This file gives the total number of times the > - max frequency is throttled to lower frequency in sub-turbo(below > - nominal frequency) range of frequencies. > + max frequency is throttled to lower frequency in > sub-turbo(below > + nominal frequency) range of frequencies. > > - unthrottle : This file gives the total number of times the max > - frequency is unthrottled after being throttled. > + frequency is unthrottled after being throttled. > > - powercap : This file gives the total number of times the max > - frequency is throttled due to 'Power Capping'. > + frequency is throttled due to 'Power Capping'. > > - overtemp : This file gives the total number of times the max > - frequency is throttled due to 'CPU Over Temperature'. > + frequency is throttled due to 'CPU Over Temperature'. > > - supply_fault : This file gives the total number of times the > - max frequency is throttled due to 'Power Supply Failure'. > + max frequency is throttled due to 'Power Supply Failure'. > > - overcurrent : This file gives the total number of times the > - max frequency is throttled due to 'Overcurrent'. > + max frequency is throttled due to 'Overcurrent'. > > - occ_reset : This file gives the
Re: [PATCH v2 20/39] docs: ABI: testing: make the files compatible with ReST output
Em Mon, 2 Nov 2020 13:46:41 +0100 Greg Kroah-Hartman escreveu: > On Mon, Nov 02, 2020 at 12:04:36PM +0100, Fabrice Gasnier wrote: > > On 10/30/20 11:09 AM, Mauro Carvalho Chehab wrote: > > > Em Fri, 30 Oct 2020 10:19:12 +0100 > > > Fabrice Gasnier escreveu: > > > > > >> Hi Mauro, > > >> > > >> [...] > > >> > > >>> > > >>> +What: > > >>> /sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available > > >>> +KernelVersion: 4.12 > > >>> +Contact: benjamin.gaign...@st.com > > >>> +Description: > > >>> + Reading returns the list possible quadrature modes. > > >>> + > > >>> +What: > > >>> /sys/bus/iio/devices/iio:deviceX/in_count0_quadrature_mode > > >>> +KernelVersion: 4.12 > > >>> +Contact: benjamin.gaign...@st.com > > >>> +Description: > > >>> + Configure the device counter quadrature modes: > > >>> + > > >>> + channel_A: > > >>> + Encoder A input servers as the count input and > > >>> B as > > >>> + the UP/DOWN direction control input. > > >>> + > > >>> + channel_B: > > >>> + Encoder B input serves as the count input and A > > >>> as > > >>> + the UP/DOWN direction control input. > > >>> + > > >>> + quadrature: > > >>> + Encoder A and B inputs are mixed to get > > >>> direction > > >>> + and count with a scale of 0.25. > > >>> + > > >> > > > > > > Hi Fabrice, > > > > > >> I just noticed that since Jonathan question in v1. > > >> > > >> Above ABI has been moved in the past as discussed in [1]. You can take a > > >> look at: > > >> b299d00 IIO: stm32: Remove quadrature related functions from trigger > > >> driver > > >> > > >> Could you please remove the above chunk ? > > >> > > >> With that, for the stm32 part: > > >> Acked-by: Fabrice Gasnier > > > > > > > > > Hmm... probably those were re-introduced due to a rebase. This > > > series were originally written about 1,5 years ago. > > > > > > I'll drop those hunks. > > > > Hi Mauro, Greg, > > > > I just figured out this patch has been applied with above hunk. 
> > > > This should be dropped: is there a fix on its way already ? > > (I may have missed it) > > Can you send a fix for just this hunk? Hmm... $ git grep /sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available Documentation/ABI/testing/sysfs-bus-iio-counter-104-quad-8:What: /sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available Documentation/ABI/testing/sysfs-bus-iio-lptimer-stm32:What: /sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available Documentation/ABI/testing/sysfs-bus-iio-timer-stm32:What: /sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available Even re-doing the changes from changeset b299d00420e2 ("IIO: stm32: Remove quadrature related functions from trigger driver") at Documentation/ABI/testing/sysfs-bus-iio-timer-stm32, there's still a third duplicate of some of those, as reported by the script: $ ./scripts/get_abi.pl validate 2>&1|grep quadra Warning: /sys/bus/iio/devices/iio:deviceX/in_count0_quadrature_mode is defined 2 times: Documentation/ABI/testing/sysfs-bus-iio-timer-stm32:117 Documentation/ABI/testing/sysfs-bus-iio-lptimer-stm32:14 Warning: /sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available is defined 3 times: Documentation/ABI/testing/sysfs-bus-iio-counter-104-quad-8:2 Documentation/ABI/testing/sysfs-bus-iio-timer-stm32:111 Documentation/ABI/testing/sysfs-bus-iio-lptimer-stm32:8 As in_count_quadrature_mode_available is also defined at: Documentation/ABI/testing/sysfs-bus-iio-counter-104-quad-8:2 The best here seems to have a patch that will also drop the other duplication of this, probably moving in_count_quadrature_mode_available to a generic node probably placing it inside Documentation/ABI/testing/sysfs-bus-iio. Comments? 
Thanks, Mauro PS.: the IIO subsystem is the one that currently has more duplicated ABI entries: $ ./scripts/get_abi.pl validate 2>&1|grep iio Warning: /sys/bus/iio/devices/iio:deviceX/in_accel_x_calibbias is defined 2 times: Documentation/ABI/testing/sysfs-bus-iio-icm42600:0 Documentation/ABI/testing/sysfs-bus-iio:394 Warning: /sys/bus/iio/devices/iio:deviceX/in_accel_y_calibbias is defined 2 times: Documentation/ABI/testing/sysfs-bus-iio-icm42600:1 Documentation/ABI/testing/sysfs-bus-iio:395 Warning: /sys/bus/iio/devices/iio:deviceX/in_accel_z_calibbias is defined 2 times: Documentation/ABI/testing/sysfs-bus-iio-icm42600:2 Documentation/ABI/testing/sysfs-bus-iio:396 Warning: /sys/bus/iio/devices/iio:deviceX/in_anglvel_x_calibbias is defined 2 times: Documentation/ABI/testing/sysfs-bus-iio-icm42600:3 Documentation/ABI/testing/sysfs-bus-iio:397 Warning: /sys/bus/iio/devices/ii
Re: [PATCH v2 20/39] docs: ABI: testing: make the files compatible with ReST output
On Mon, Nov 02, 2020 at 12:04:36PM +0100, Fabrice Gasnier wrote:
> On 10/30/20 11:09 AM, Mauro Carvalho Chehab wrote:
> > On Fri, 30 Oct 2020 10:19:12 +0100
> > Fabrice Gasnier wrote:
> >
> >> Hi Mauro,
> >>
> >> [...]
> >>
> >>>
> >>> +What:          /sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available
> >>> +KernelVersion: 4.12
> >>> +Contact:       benjamin.gaign...@st.com
> >>> +Description:
> >>> +               Reading returns the list possible quadrature modes.
> >>> +
> >>> +What:          /sys/bus/iio/devices/iio:deviceX/in_count0_quadrature_mode
> >>> +KernelVersion: 4.12
> >>> +Contact:       benjamin.gaign...@st.com
> >>> +Description:
> >>> +               Configure the device counter quadrature modes:
> >>> +
> >>> +               channel_A:
> >>> +                       Encoder A input servers as the count input and B as
> >>> +                       the UP/DOWN direction control input.
> >>> +
> >>> +               channel_B:
> >>> +                       Encoder B input serves as the count input and A as
> >>> +                       the UP/DOWN direction control input.
> >>> +
> >>> +               quadrature:
> >>> +                       Encoder A and B inputs are mixed to get direction
> >>> +                       and count with a scale of 0.25.
> >>> +
> >>
> >
> > Hi Fabrice,
> >
> >> I just noticed that since Jonathan question in v1.
> >>
> >> Above ABI has been moved in the past as discussed in [1]. You can take a
> >> look at:
> >> b299d00 IIO: stm32: Remove quadrature related functions from trigger driver
> >>
> >> Could you please remove the above chunk ?
> >>
> >> With that, for the stm32 part:
> >> Acked-by: Fabrice Gasnier
> >
> >
> > Hmm... probably those were re-introduced due to a rebase. This
> > series were originally written about 1,5 years ago.
> >
> > I'll drop those hunks.
>
> Hi Mauro, Greg,
>
> I just figured out this patch has been applied with above hunk.
>
> This should be dropped: is there a fix on its way already ?
> (I may have missed it)

Can you send a fix for just this hunk?

thanks,

greg k-h
Re: [PATCH v2 20/39] docs: ABI: testing: make the files compatible with ReST output
On 10/30/20 11:09 AM, Mauro Carvalho Chehab wrote:
> On Fri, 30 Oct 2020 10:19:12 +0100
> Fabrice Gasnier wrote:
>
>> Hi Mauro,
>>
>> [...]
>>
>>>
>>> +What:          /sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available
>>> +KernelVersion: 4.12
>>> +Contact:       benjamin.gaign...@st.com
>>> +Description:
>>> +               Reading returns the list possible quadrature modes.
>>> +
>>> +What:          /sys/bus/iio/devices/iio:deviceX/in_count0_quadrature_mode
>>> +KernelVersion: 4.12
>>> +Contact:       benjamin.gaign...@st.com
>>> +Description:
>>> +               Configure the device counter quadrature modes:
>>> +
>>> +               channel_A:
>>> +                       Encoder A input servers as the count input and B as
>>> +                       the UP/DOWN direction control input.
>>> +
>>> +               channel_B:
>>> +                       Encoder B input serves as the count input and A as
>>> +                       the UP/DOWN direction control input.
>>> +
>>> +               quadrature:
>>> +                       Encoder A and B inputs are mixed to get direction
>>> +                       and count with a scale of 0.25.
>>> +
>>
>
> Hi Fabrice,
>
>> I just noticed that since Jonathan question in v1.
>>
>> Above ABI has been moved in the past as discussed in [1]. You can take a
>> look at:
>> b299d00 IIO: stm32: Remove quadrature related functions from trigger driver
>>
>> Could you please remove the above chunk ?
>>
>> With that, for the stm32 part:
>> Acked-by: Fabrice Gasnier
>
>
> Hmm... probably those were re-introduced due to a rebase. This
> series were originally written about 1,5 years ago.
>
> I'll drop those hunks.

Hi Mauro, Greg,

I just figured out this patch has been applied with above hunk.

This should be dropped: is there a fix on its way already ?
(I may have missed it)

Please advise,
Fabrice

>
> Thanks!
> Mauro
>
[PATCH 11/11 v2.2] ftrace: Add recording of functions that caused recursion
From c532ff6b048dd4a12943b05c7b8ce30666c587c8 Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (VMware)" Date: Thu, 29 Oct 2020 15:27:06 -0400 Subject: [PATCH] ftrace: Add recording of functions that caused recursion This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file "recursed_functions" all the functions that caused recursion while a callback to the function tracer was running. Cc: Jonathan Corbet Cc: Guo Ren Cc: "James E.J. Bottomley" Cc: Helge Deller Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Heiko Carstens Cc: Vasily Gorbik Cc: Christian Borntraeger Cc: Thomas Gleixner Cc: Borislav Petkov Cc: x...@kernel.org Cc: "H. Peter Anvin" Cc: Kees Cook Cc: Anton Vorontsov Cc: Colin Cross Cc: Tony Luck Cc: Josh Poimboeuf Cc: Jiri Kosina Cc: Miroslav Benes Cc: Petr Mladek Cc: Joe Lawrence Cc: Kamalesh Babulal Cc: Mauro Carvalho Chehab Cc: Sebastian Andrzej Siewior Cc: linux-...@vger.kernel.org Cc: linux-ker...@vger.kernel.org Cc: linux-c...@vger.kernel.org Cc: linux-par...@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s...@vger.kernel.org Cc: live-patch...@vger.kernel.org Signed-off-by: Steven Rostedt (VMware) --- Changes since v2.1: Added EXPORT_SYMBOL_GPL() to ftrace_record_recursion() function Documentation/trace/ftrace-uses.rst | 6 +- arch/csky/kernel/probes/ftrace.c | 2 +- arch/parisc/kernel/ftrace.c | 2 +- arch/powerpc/kernel/kprobes-ftrace.c | 2 +- arch/s390/kernel/ftrace.c | 2 +- arch/x86/kernel/kprobes/ftrace.c | 2 +- fs/pstore/ftrace.c| 2 +- include/linux/trace_recursion.h | 32 +++- kernel/livepatch/patch.c | 2 +- kernel/trace/Kconfig | 25 +++ kernel/trace/Makefile | 1 + kernel/trace/ftrace.c | 4 +- kernel/trace/trace_event_perf.c | 2 +- kernel/trace/trace_functions.c| 2 +- kernel/trace/trace_output.c | 6 +- kernel/trace/trace_output.h | 1 + kernel/trace/trace_recursion_record.c | 236 ++ 17 files changed, 309 insertions(+), 20 deletions(-) create mode 100644 kernel/trace/trace_recursion_record.c diff --git 
a/Documentation/trace/ftrace-uses.rst b/Documentation/trace/ftrace-uses.rst index 86cd14b8e126..5981d5691745 100644 --- a/Documentation/trace/ftrace-uses.rst +++ b/Documentation/trace/ftrace-uses.rst @@ -118,7 +118,7 @@ can help in this regard. If you start your code with: int bit; - bit = ftrace_test_recursion_trylock(); + bit = ftrace_test_recursion_trylock(ip, parent_ip); if (bit < 0) return; @@ -130,7 +130,9 @@ The code in between will be safe to use, even if it ends up calling a function that the callback is tracing. Note, on success, ftrace_test_recursion_trylock() will disable preemption, and the ftrace_test_recursion_unlock() will enable it again (if it was previously -enabled). +enabled). The instruction pointer (ip) and its parent (parent_ip) is passed to +ftrace_test_recursion_trylock() to record where the recursion happened +(if CONFIG_FTRACE_RECORD_RECURSION is set). Alternatively, if the FTRACE_OPS_FL_RECURSION flag is set on the ftrace_ops (as explained below), then a helper trampoline will be used to test diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c index 5eb2604fdf71..f30b179924ef 100644 --- a/arch/csky/kernel/probes/ftrace.c +++ b/arch/csky/kernel/probes/ftrace.c @@ -18,7 +18,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip, struct kprobe *p; struct kprobe_ctlblk *kcb; - bit = ftrace_test_recursion_trylock(); + bit = ftrace_test_recursion_trylock(ip, parent_ip); if (bit < 0) return; diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c index 4b1fdf15662c..8b0ed7c5a4ab 100644 --- a/arch/parisc/kernel/ftrace.c +++ b/arch/parisc/kernel/ftrace.c @@ -210,7 +210,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip, struct kprobe *p = get_kprobe((kprobe_opcode_t *)ip); int bit; - bit = ftrace_test_recursion_trylock(); + bit = ftrace_test_recursion_trylock(ip, parent_ip); if (bit < 0) return; diff --git a/arch/powerpc/kernel/kprobes-ftrace.c 
b/arch/powerpc/kernel/kprobes-ftrace.c index 5df8d50c65ae..fdfee39938ea 100644 --- a/arch/powerpc/kernel/kprobes-ftrace.c +++ b/arch/powerpc/kernel/kprobes-ftrace.c @@ -20,7 +20,7 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip, struct kprobe_ctlblk *kcb; int bit; - bit = ftrace_test_recursion_trylock(); + bit = ftrace_test_recursion_trylock(nip, parent_nip); if (bit < 0) return; diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c index 88466d7fb6b2..a1556333d481 100644 --- a/
[PATCH 11/11 v2.1] ftrace: Add recording of functions that caused recursion
From: "Steven Rostedt (VMware)" This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file "recursed_functions" all the functions that caused recursion while a callback to the function tracer was running. Cc: Jonathan Corbet Cc: Guo Ren Cc: "James E.J. Bottomley" Cc: Helge Deller Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Heiko Carstens Cc: Vasily Gorbik Cc: Christian Borntraeger Cc: Thomas Gleixner Cc: Borislav Petkov Cc: x...@kernel.org Cc: "H. Peter Anvin" Cc: Kees Cook Cc: Anton Vorontsov Cc: Colin Cross Cc: Tony Luck Cc: Josh Poimboeuf Cc: Jiri Kosina Cc: Miroslav Benes Cc: Petr Mladek Cc: Joe Lawrence Cc: Kamalesh Babulal Cc: Mauro Carvalho Chehab Cc: Sebastian Andrzej Siewior Cc: linux-...@vger.kernel.org Cc: linux-ker...@vger.kernel.org Cc: linux-c...@vger.kernel.org Cc: linux-par...@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s...@vger.kernel.org Cc: live-patch...@vger.kernel.org Signed-off-by: Steven Rostedt (VMware) --- Documentation/trace/ftrace-uses.rst | 6 +- arch/csky/kernel/probes/ftrace.c | 2 +- arch/parisc/kernel/ftrace.c | 2 +- arch/powerpc/kernel/kprobes-ftrace.c | 2 +- arch/s390/kernel/ftrace.c | 2 +- arch/x86/kernel/kprobes/ftrace.c | 2 +- fs/pstore/ftrace.c| 2 +- include/linux/trace_recursion.h | 32 +++- kernel/livepatch/patch.c | 2 +- kernel/trace/Kconfig | 25 +++ kernel/trace/Makefile | 1 + kernel/trace/ftrace.c | 4 +- kernel/trace/trace_event_perf.c | 2 +- kernel/trace/trace_functions.c| 2 +- kernel/trace/trace_output.c | 6 +- kernel/trace/trace_output.h | 1 + kernel/trace/trace_recursion_record.c | 235 ++ 17 files changed, 308 insertions(+), 20 deletions(-) create mode 100644 kernel/trace/trace_recursion_record.c diff --git a/Documentation/trace/ftrace-uses.rst b/Documentation/trace/ftrace-uses.rst index 86cd14b8e126..5981d5691745 100644 --- a/Documentation/trace/ftrace-uses.rst +++ b/Documentation/trace/ftrace-uses.rst @@ -118,7 +118,7 @@ can help in this regard. 
If you start your code with: int bit; - bit = ftrace_test_recursion_trylock(); + bit = ftrace_test_recursion_trylock(ip, parent_ip); if (bit < 0) return; @@ -130,7 +130,9 @@ The code in between will be safe to use, even if it ends up calling a function that the callback is tracing. Note, on success, ftrace_test_recursion_trylock() will disable preemption, and the ftrace_test_recursion_unlock() will enable it again (if it was previously -enabled). +enabled). The instruction pointer (ip) and its parent (parent_ip) is passed to +ftrace_test_recursion_trylock() to record where the recursion happened +(if CONFIG_FTRACE_RECORD_RECURSION is set). Alternatively, if the FTRACE_OPS_FL_RECURSION flag is set on the ftrace_ops (as explained below), then a helper trampoline will be used to test diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c index 5eb2604fdf71..f30b179924ef 100644 --- a/arch/csky/kernel/probes/ftrace.c +++ b/arch/csky/kernel/probes/ftrace.c @@ -18,7 +18,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip, struct kprobe *p; struct kprobe_ctlblk *kcb; - bit = ftrace_test_recursion_trylock(); + bit = ftrace_test_recursion_trylock(ip, parent_ip); if (bit < 0) return; diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c index 4b1fdf15662c..8b0ed7c5a4ab 100644 --- a/arch/parisc/kernel/ftrace.c +++ b/arch/parisc/kernel/ftrace.c @@ -210,7 +210,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip, struct kprobe *p = get_kprobe((kprobe_opcode_t *)ip); int bit; - bit = ftrace_test_recursion_trylock(); + bit = ftrace_test_recursion_trylock(ip, parent_ip); if (bit < 0) return; diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c index 5df8d50c65ae..fdfee39938ea 100644 --- a/arch/powerpc/kernel/kprobes-ftrace.c +++ b/arch/powerpc/kernel/kprobes-ftrace.c @@ -20,7 +20,7 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip, struct 
kprobe_ctlblk *kcb; int bit; - bit = ftrace_test_recursion_trylock(); + bit = ftrace_test_recursion_trylock(nip, parent_nip); if (bit < 0) return; diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c index 88466d7fb6b2..a1556333d481 100644 --- a/arch/s390/kernel/ftrace.c +++ b/arch/s390/kernel/ftrace.c @@ -204,7 +204,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip, struct kprobe *p = get_kprobe((kprobe_opcode_t *)ip); int bit; - bit = ftrace_test_recursion_trylo
RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: 'Greg KH' > Sent: 02 November 2020 13:52 > > On Mon, Nov 02, 2020 at 09:06:38AM +, David Laight wrote: > > From: 'Greg KH' > > > Sent: 23 October 2020 15:47 > > > > > > On Fri, Oct 23, 2020 at 02:39:24PM +, David Laight wrote: > > > > From: David Hildenbrand > > > > > Sent: 23 October 2020 15:33 > > > > ... > > > > > I just checked against upstream code generated by clang 10 and it > > > > > properly discards the upper 32bit via a mov w23 w2. > > > > > > > > > > So at least clang 10 indeed properly assumes we could have garbage and > > > > > masks it off. > > > > > > > > > > Maybe the issue is somewhere else, unrelated to nr_pages ... or clang > > > > > 11 > > > > > behaves differently. > > > > > > > > We'll need the disassembly from a failing kernel image. > > > > It isn't that big to hand annotate. > > > > > > I've worked around the merge at the moment in the android tree, but it > > > is still quite reproducable, and will try to get a .o file to > > > disassemble on Monday or so... > > > > Did this get properly resolved? > > For some reason, 5.10-rc2 fixed all of this up. I backed out all of the > patches I had to revert to get 5.10-rc1 to work properly, and then did > the merge and all is well. > > It must have been something to do with the compat changes in this same > area that went in after 5.10-rc1, and something got reorganized in the > files somehow. I really do not know, and at the moment, don't have the > time to track it down anymore. So for now, I'd say it's all good, sorry > for the noise. Hopefully it won't appear again. Saved me spending a day off reading arm64 assembler. David
Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion
On Mon, 2 Nov 2020 12:37:21 -0500 Steven Rostedt wrote:

> The only race that I see that can happen, is the one in the comment I
> showed. And that is after enabling the recursed functions again after
> clearing, one CPU could add a function while another CPU that just added
> that same function could be just exiting this routine, notice that a
> clearing of the array happened, and remove its function (which was the same
> as the one just happened). So we get a "zero" in the array. If this
> happens, it is likely that that function will recurse again and will be
> added later.
>

Updated version of this function:

-- Steve

void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip)
{
        int index = 0;
        int i;
        unsigned long old;

 again:
        /* First check the last one recorded */
        if (ip == cached_function)
                return;

        i = atomic_read(&nr_records);
        /* nr_records is -1 when clearing records */
        smp_mb__after_atomic();
        if (i < 0)
                return;

        /*
         * If there's two writers and this writer comes in second,
         * the cmpxchg() below to update the ip will fail. Then this
         * writer will try again. It is possible that index will now
         * be greater than nr_records. This is because the writer
         * that succeeded has not updated the nr_records yet.
         * This writer could keep trying again until the other writer
         * updates nr_records. But if the other writer takes an
         * interrupt, and that interrupt locks up that CPU, we do
         * not want this CPU to lock up due to the recursion protection,
         * and have a bug report showing this CPU as the cause of
         * locking up the computer. To not lose this record, this
         * writer will simply use the next position to update the
         * recursed_functions, and it will update the nr_records
         * accordingly.
         */
        if (index < i)
                index = i;
        if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE)
                return;

        for (i = index - 1; i >= 0; i--) {
                if (recursed_functions[i].ip == ip) {
                        cached_function = ip;
                        return;
                }
        }

        cached_function = ip;

        /*
         * We only want to add a function if it hasn't been added before.
         * Add to the current location before incrementing the count.
         * If it fails to add, then increment the index (save in i)
         * and try again.
         */
        old = cmpxchg(&recursed_functions[index].ip, 0, ip);
        if (old != 0) {
                /* Did something else already added this for us? */
                if (old == ip)
                        return;
                /* Try the next location (use i for the next index) */
                index++;
                goto again;
        }

        recursed_functions[index].parent_ip = parent_ip;

        /*
         * It's still possible that we could race with the clearing
         *    CPU0                                    CPU1
         *
         *                                       ip = func
         *  nr_records = -1;
         *  recursed_functions[0] = 0;
         *                                       i = -1
         *                                       if (i < 0)
         *  nr_records = 0;
         *  (new recursion detected)
         *      recursed_functions[0] = func
         *                                            cmpxchg(recursed_functions[0],
         *                                                    func, 0)
         *
         * But the worse that could happen is that we get a zero in
         * the recursed_functions array, and it's likely that "func" will
         * be recorded again.
         */
        i = atomic_read(&nr_records);
        smp_mb__after_atomic();
        if (i < 0)
                cmpxchg(&recursed_functions[index].ip, ip, 0);
        else if (i <= index)
                atomic_cmpxchg(&nr_records, i, index + 1);
}
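For anyone who wants to poke at the slot-claiming logic outside the kernel, here is a minimal userspace sketch of the function above. It assumes C11 <stdatomic.h> in place of the kernel's atomic_read() and cmpxchg(). The names records, RECORD_SIZE, cached_ip and record_recursion() are stand-ins invented for this sketch, not kernel symbols, and the default sequentially consistent atomics are stronger than the smp_mb__after_atomic() barriers in the original.

```c
#include <assert.h>
#include <stdatomic.h>

#define RECORD_SIZE 8   /* stand-in for CONFIG_FTRACE_RECORD_RECURSION_SIZE */

struct record {
        _Atomic unsigned long ip;
        unsigned long parent_ip;
};

static struct record records[RECORD_SIZE];
static atomic_int nr_records;           /* next free slot, -1 while clearing */
static _Atomic unsigned long cached_ip; /* last ip seen, checked first */

void record_recursion(unsigned long ip, unsigned long parent_ip)
{
        int index = 0;
        int i;
        unsigned long expected;

again:
        /* Fast path: same function as the last one recorded. */
        if (ip == atomic_load(&cached_ip))
                return;

        i = atomic_load(&nr_records);
        if (i < 0)              /* a clear is in progress */
                return;

        /* Never go below the published count or below a slot already tried. */
        if (index < i)
                index = i;
        if (index >= RECORD_SIZE)
                return;

        /* Skip functions that are already recorded. */
        for (i = index - 1; i >= 0; i--) {
                if (atomic_load(&records[i].ip) == ip) {
                        atomic_store(&cached_ip, ip);
                        return;
                }
        }
        atomic_store(&cached_ip, ip);

        /* Claim the slot only if it is still empty. */
        expected = 0UL;
        if (!atomic_compare_exchange_strong(&records[index].ip, &expected, ip)) {
                if (expected == ip)     /* someone recorded it for us */
                        return;
                index++;                /* slot taken by another ip, move on */
                goto again;
        }
        records[index].parent_ip = parent_ip;

        i = atomic_load(&nr_records);
        if (i < 0) {
                /* A clear raced with us: undo our own entry. */
                expected = ip;
                atomic_compare_exchange_strong(&records[index].ip, &expected, 0UL);
        } else if (i <= index) {
                /* Publish the new count unless another writer moved it. */
                atomic_compare_exchange_strong(&nr_records, &i, index + 1);
        }
}
```

Called from a single thread with 0x1000, 0x1000, 0x2000 and 0x1000 in that order, the sketch leaves two entries: the first repeat is caught by the cache and the last call is caught by the duplicate scan.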
Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion
On Mon, 2 Nov 2020 17:41:47 +0100 Petr Mladek wrote:

> > +     i = atomic_read(&nr_records);
> > +     smp_mb__after_atomic();
> > +     if (i < 0)
> > +             cmpxchg(&recursed_functions[index].ip, ip, 0);
> > +     else if (i <= index)
> > +             atomic_cmpxchg(&nr_records, i, index + 1);
>
> This looks weird. It would shift nr_records past the record added
> in this call. It might skip many slots that were zeroed when clearing.
> Also we do not know if our entry was not zeroed as well.

nr_records always holds the next position to write to.

        index = nr_records;
        recursed_functions[index].ip = ip;
        nr_records++;

Before clearing, we have:

        nr_records = -1;
        smp_mb();
        memset(recursed_functions, 0);
        smp_wmb();
        nr_records = 0;

When we enter this function:

        i = nr_records;
        smp_mb();
        if (i < 0)
                return;

Thus, we just stopped all new updates while clearing the records.

But what about if something is currently updating?

        i = nr_records;
        smp_mb();
        if (i < 0)
                cmpxchg(recursed_functions, ip, 0);

The above shows that if the current updating process notices that the
clearing happens, it will clear the function it added.

        else if (i <= index)
                cmpxchg(nr_records, i, index + 1);

This makes sure that nr_records only grows if it is greater or equal to
zero.

The only race that I see that can happen, is the one in the comment I
showed. And that is after enabling the recursed functions again after
clearing, one CPU could add a function while another CPU that just added
that same function could be just exiting this routine, notice that a
clearing of the array happened, and remove its function (which was the same
as the one just happened). So we get a "zero" in the array. If this
happens, it is likely that that function will recurse again and will be
added later.

-- Steve
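The clearing side described above can be sketched in userspace as follows. This is only an illustration under stated assumptions: C11 fences stand in for smp_mb() and smp_wmb(), and records, RECORD_SIZE, writers_allowed() and clear_records() are hypothetical names, not the ones in the patch. Wiping a plain array while writers may still be touching it is formally a data race in C11, so the sketch shows the ordering idea and is not a drop-in implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

#define RECORD_SIZE 8   /* stand-in for CONFIG_FTRACE_RECORD_RECURSION_SIZE */

struct record {
        unsigned long ip;
        unsigned long parent_ip;
};

static struct record records[RECORD_SIZE];
static atomic_int nr_records;   /* next free slot, or -1 while clearing */

/* Writer-side gate: nonzero when a new record may be added. */
int writers_allowed(void)
{
        return atomic_load(&nr_records) >= 0;
}

/* Clearing side: close the gate, wipe the array, reopen the gate. */
void clear_records(void)
{
        atomic_store_explicit(&nr_records, -1, memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst);      /* role of smp_mb() */
        memset(records, 0, sizeof(records));
        atomic_thread_fence(memory_order_release);      /* role of smp_wmb() */
        atomic_store_explicit(&nr_records, 0, memory_order_relaxed);
}
```

A writer that sampled nr_records before the gate closed can still be in flight, which is why the writer re-reads the counter after storing its entry and undoes the store when it sees -1.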
Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion
On Mon, 2 Nov 2020 12:09:07 -0500 Steven Rostedt wrote: > > > +void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip) > > > +{ > > > + int index; > > > + int i = 0; > > > + unsigned long old; > > > + > > > + again: > > > + /* First check the last one recorded */ > > > + if (ip == cached_function) > > > + return; > > > + > > > + index = atomic_read(&nr_records); > > > + /* nr_records is -1 when clearing records */ > > > + smp_mb__after_atomic(); > > > + if (index < 0) > > > + return; > > > + > > > + /* See below */ > > > + if (i > index) > > > + index = i; > > > > This looks like a complicated way to do index++ via "i" variable. > > I guess that it was needed only in some older variant of the code. > > See below. > > Because we reread the index above, and index could be bigger than i (more > than index + 1). > > > > > > + if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE) > > > + return; > > > + > > > + for (i = index - 1; i >= 0; i--) { > > > + if (recursed_functions[i].ip == ip) { > > > + cached_function = ip; > > > + return; > > > + } > > > + } > > > + > > > + cached_function = ip; > > > + > > > + /* > > > + * We only want to add a function if it hasn't been added before. > > > + * Add to the current location before incrementing the count. > > > + * If it fails to add, then increment the index (save in i) > > > + * and try again. > > > + */ > > > + old = cmpxchg(&recursed_functions[index].ip, 0, ip); > > > + if (old != 0) { > > > + /* Did something else already added this for us? */ > > > + if (old == ip) > > > + return; > > > + /* Try the next location (use i for the next index) */ > > > + i = index + 1; > > > > What about > > > > index++; > > > > We basically want to run the code again with index + 1 limit. > > But something else could update nr_records, and we want to use that if > nr_records is greater than i. > > Now, we could swap the use case, and have > > int index = 0; > > [..] 
> i = atomic_read(&nr_records); > if (i > index) > index = i; > > [..] > > index++; > goto again; > > > > > > Maybe, it even does not make sense to check the array again > > and we should just try to store the value into the next slot. > > We do this dance to prevent duplicates. > > But you are correct, that this went through a few iterations. And the first > ones didn't have the cmpxchg on the ip itself, and that could make it so > that we don't need this index = i dance. Playing with this more, I remember why I did this song and dance. If we have two or more writers, and one beats the other in updating the ip (with a different function). This one will go and try again. The reason to look at one passed nr_records, is because of the race between the multiple writers. This one may loop before the other can update nr_records, and it will fail to apply it again. You could just say, "hey we'll just keep looping until the other writer eventually updates nr_records". But this is where my paranoia gets in. What happens if that other writer takes an interrupt (interrupts are not disabled), and then deadlocks, or does something bad? This CPU will not get locked up spinning. Unlikely scenario, and it would require a bug someplace else. But I don't want a bug report stating that it found this recursion locking locking up the CPU and hide the real culprit. I'll add a comment to explain this in the code. And also swap the i and index around to make a little more sense. -- Steve
Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion
On Mon, 2 Nov 2020 17:41:47 +0100 Petr Mladek wrote: > On Fri 2020-10-30 17:31:53, Steven Rostedt wrote: > > From: "Steven Rostedt (VMware)" > > > > This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file > > "recursed_functions" all the functions that caused recursion while a > > callback to the function tracer was running. > > > > > --- /dev/null > > +++ b/kernel/trace/trace_recursion_record.c > > @@ -0,0 +1,220 @@ > > +// SPDX-License-Identifier: GPL-2.0 > > + > > +#include > > +#include > > +#include > > +#include > > +#include > > + > > +#include "trace_output.h" > > + > > +struct recursed_functions { > > + unsigned long ip; > > + unsigned long parent_ip; > > +}; > > + > > +static struct recursed_functions > > recursed_functions[CONFIG_FTRACE_RECORD_RECURSION_SIZE]; > > The code tries to be lockless safe as much as possible. It would make > sense to allign the array. Hmm, is there an arch where the compiler would put an array of structures with two unsigned long, misaligned? > > > > +static atomic_t nr_records; > > + > > +/* > > + * Cache the last found function. Yes, updates to this is racey, but > > + * so is memory cache ;-) > > + */ > > +static unsigned long cached_function; > > + > > +void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip) > > +{ > > + int index; > > + int i = 0; > > + unsigned long old; > > + > > + again: > > + /* First check the last one recorded */ > > + if (ip == cached_function) > > + return; > > + > > + index = atomic_read(&nr_records); > > + /* nr_records is -1 when clearing records */ > > + smp_mb__after_atomic(); > > + if (index < 0) > > + return; > > + > > + /* See below */ > > + if (i > index) > > + index = i; > > This looks like a complicated way to do index++ via "i" variable. > I guess that it was needed only in some older variant of the code. > See below. Because we reread the index above, and index could be bigger than i (more than index + 1). 
> > > + if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE) > > + return; > > + > > + for (i = index - 1; i >= 0; i--) { > > + if (recursed_functions[i].ip == ip) { > > + cached_function = ip; > > + return; > > + } > > + } > > + > > + cached_function = ip; > > + > > + /* > > +* We only want to add a function if it hasn't been added before. > > +* Add to the current location before incrementing the count. > > +* If it fails to add, then increment the index (save in i) > > +* and try again. > > +*/ > > + old = cmpxchg(&recursed_functions[index].ip, 0, ip); > > + if (old != 0) { > > + /* Did something else already added this for us? */ > > + if (old == ip) > > + return; > > + /* Try the next location (use i for the next index) */ > > + i = index + 1; > > What about > > index++; > > We basically want to run the code again with index + 1 limit. But something else could update nr_records, and we want to use that if nr_records is greater than i. Now, we could swap the use case, and have int index = 0; [..] i = atomic_read(&nr_records); if (i > index) index = i; [..] index++; goto again; > > Maybe, it even does not make sense to check the array again > and we should just try to store the value into the next slot. We do this dance to prevent duplicates. But you are correct, that this went through a few iterations. And the first ones didn't have the cmpxchg on the ip itself, and that could make it so that we don't need this index = i dance. > > > + goto again; > > + } > > + > > + recursed_functions[index].parent_ip = parent_ip; > > WRITE_ONCE() ? Does it really matter? 
> > > + > > + /* > > +* It's still possible that we could race with the clearing > > +*CPU0CPU1 > > +* > > +* ip = func > > +* nr_records = -1; > > +* recursed_functions[0] = 0; > > +* i = -1 > > +* if (i < 0) > > +* nr_records = 0; > > +* (new recursion detected) > > +* recursed_functions[0] = func > > +* > > cmpxchg(recursed_functions[0], > > +*func, 0) > > +* > > +* But the worse that could happen is that we get a zero in > > +* the recursed_functions array, and it's likely that "func" will > > +* be recorded again. > > +*/ > > + i = atomic_read(&nr_records); > > + smp_mb__after_atomic(); > > + if (i < 0) > > + cmpxchg(&recursed_functions[index].ip, ip, 0); >
Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion
On Fri 2020-10-30 17:31:53, Steven Rostedt wrote: > From: "Steven Rostedt (VMware)" > > This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file > "recursed_functions" all the functions that caused recursion while a > callback to the function tracer was running. > > --- /dev/null > +++ b/kernel/trace/trace_recursion_record.c > @@ -0,0 +1,220 @@ > +// SPDX-License-Identifier: GPL-2.0 > + > +#include > +#include > +#include > +#include > +#include > + > +#include "trace_output.h" > + > +struct recursed_functions { > + unsigned long ip; > + unsigned long parent_ip; > +}; > + > +static struct recursed_functions > recursed_functions[CONFIG_FTRACE_RECORD_RECURSION_SIZE]; The code tries to be lockless safe as much as possible. It would make sense to allign the array. > +static atomic_t nr_records; > + > +/* > + * Cache the last found function. Yes, updates to this is racey, but > + * so is memory cache ;-) > + */ > +static unsigned long cached_function; > + > +void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip) > +{ > + int index; > + int i = 0; > + unsigned long old; > + > + again: > + /* First check the last one recorded */ > + if (ip == cached_function) > + return; > + > + index = atomic_read(&nr_records); > + /* nr_records is -1 when clearing records */ > + smp_mb__after_atomic(); > + if (index < 0) > + return; > + > + /* See below */ > + if (i > index) > + index = i; This looks like a complicated way to do index++ via "i" variable. I guess that it was needed only in some older variant of the code. See below. > + if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE) > + return; > + > + for (i = index - 1; i >= 0; i--) { > + if (recursed_functions[i].ip == ip) { > + cached_function = ip; > + return; > + } > + } > + > + cached_function = ip; > + > + /* > + * We only want to add a function if it hasn't been added before. > + * Add to the current location before incrementing the count. 
> + * If it fails to add, then increment the index (save in i) > + * and try again. > + */ > + old = cmpxchg(&recursed_functions[index].ip, 0, ip); > + if (old != 0) { > + /* Did something else already added this for us? */ > + if (old == ip) > + return; > + /* Try the next location (use i for the next index) */ > + i = index + 1; What about index++; We basically want to run the code again with index + 1 limit. Maybe, it even does not make sense to check the array again and we should just try to store the value into the next slot. > + goto again; > + } > + > + recursed_functions[index].parent_ip = parent_ip; WRITE_ONCE() ? > + > + /* > + * It's still possible that we could race with the clearing > + *CPU0CPU1 > + * > + * ip = func > + * nr_records = -1; > + * recursed_functions[0] = 0; > + * i = -1 > + * if (i < 0) > + * nr_records = 0; > + * (new recursion detected) > + * recursed_functions[0] = func > + * > cmpxchg(recursed_functions[0], > + *func, 0) > + * > + * But the worse that could happen is that we get a zero in > + * the recursed_functions array, and it's likely that "func" will > + * be recorded again. > + */ > + i = atomic_read(&nr_records); > + smp_mb__after_atomic(); > + if (i < 0) > + cmpxchg(&recursed_functions[index].ip, ip, 0); > + else if (i <= index) > + atomic_cmpxchg(&nr_records, i, index + 1); This looks weird. It would shift nr_records past the record added in this call. It might skip many slots that were zeroed when clearing. Also we do not know if our entry was not zeroed as well. I would suggest to do it some other way (not even compile tested): void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip) { int index, old_index; int i = 0; unsigned long old_ip; again: /* First check the last one recorded. */ if (ip == READ_ONCE(cached_function)) return; index = atomic_read(&nr_records); /* nr_records is -1 when clearing records. */ smp_mb__after_atomic(); if (index < 0) return; /* Already cached? 
*/ for (i = index - 1; i >= 0; i--) { if (recursed_functions[i].ip == ip) { WRITE_ONCE(cached_function, ip);
[PATCH] ASoC: fsl_xcvr: fix break condition
From: Viorel Suman

The break condition was mistakenly copied from the loop condition in the
previous version, but it must be the opposite. Fix it.

Signed-off-by: Viorel Suman
---
 sound/soc/fsl/fsl_xcvr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/fsl/fsl_xcvr.c b/sound/soc/fsl/fsl_xcvr.c
index c055179e6d11..2a28810d0e29 100644
--- a/sound/soc/fsl/fsl_xcvr.c
+++ b/sound/soc/fsl/fsl_xcvr.c
@@ -247,7 +247,7 @@ static int fsl_xcvr_ai_write(struct fsl_xcvr *xcvr, u8 reg, u32 data, bool phy)
        regmap_write(xcvr->regmap, FSL_XCVR_PHY_AI_CTRL_TOG, idx);
        ret = regmap_read_poll_timeout(xcvr->regmap, FSL_XCVR_PHY_AI_CTRL, val,
-                                      (val & idx) != ((val & tidx) >> 1),
+                                      (val & idx) == ((val & tidx) >> 1),
                                       10, 1);
        if (ret)
                dev_err(dev, "AI timeout: failed to set %s reg 0x%02x=0x%08x\n",
--
2.26.2
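The condition argument of regmap_read_poll_timeout() is the condition that ends the poll, not the one that keeps it going, which is why the comparison has to be inverted. A small userspace sketch of the same idea follows, assuming a fake register and made-up bit positions. REQ_BIT, DONE_BIT, fake_reg, poll_until_done() and start_and_wait() are hypothetical names for illustration; they are not the driver's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define REQ_BIT  (1u << 0)      /* hypothetical request toggle (idx in the patch) */
#define DONE_BIT (1u << 1)      /* hypothetical done toggle (tidx), one bit above */

static uint32_t fake_reg;       /* simulated control register */
static int reads_until_done;    /* simulated hardware latency, in reads */

/* Simulated register read: the "hardware" mirrors REQ into DONE after a delay. */
static uint32_t read_reg(void)
{
        if (reads_until_done > 0 && --reads_until_done == 0)
                fake_reg = (fake_reg & ~DONE_BIT) | ((fake_reg & REQ_BIT) << 1);
        return fake_reg;
}

/* Exit condition: the done toggle has caught up with the request toggle. */
static bool transfer_done(uint32_t val)
{
        return (val & REQ_BIT) == ((val & DONE_BIT) >> 1);
}

/* Poll until the exit condition holds; 0 on success, -1 on timeout. */
int poll_until_done(int max_reads, uint32_t *val)
{
        for (int i = 0; i < max_reads; i++) {
                *val = read_reg();
                if (transfer_done(*val))
                        return 0;
        }
        return -1;
}

/* Flip the request toggle, then wait for the hardware to acknowledge it. */
int start_and_wait(int hw_latency_reads, int max_reads)
{
        uint32_t val;

        fake_reg ^= REQ_BIT;
        reads_until_done = hw_latency_reads;
        return poll_until_done(max_reads, &val);
}
```

With the comparison written as "!=" the loop in this sketch would return success on the very first read, before the simulated hardware had acknowledged anything.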
Re: [PATCH v3 4/4] arch, mm: make kernel_page_present() always available
On Mon, Nov 02, 2020 at 10:28:14AM +0100, David Hildenbrand wrote: > On 01.11.20 18:08, Mike Rapoport wrote: > > From: Mike Rapoport > > > > For architectures that enable ARCH_HAS_SET_MEMORY having the ability to > > verify that a page is mapped in the kernel direct map can be useful > > regardless of hibernation. > > > > Add RISC-V implementation of kernel_page_present(), update its forward > > declarations and stubs to be a part of set_memory API and remove ugly > > ifdefery in inlcude/linux/mm.h around current declarations of > > kernel_page_present(). > > > > Signed-off-by: Mike Rapoport > > --- > > arch/arm64/include/asm/cacheflush.h | 1 + > > arch/arm64/mm/pageattr.c| 4 +--- > > arch/riscv/include/asm/set_memory.h | 1 + > > arch/riscv/mm/pageattr.c| 29 + > > arch/x86/include/asm/set_memory.h | 1 + > > arch/x86/mm/pat/set_memory.c| 4 +--- > > include/linux/mm.h | 7 --- > > include/linux/set_memory.h | 5 + > > 8 files changed, 39 insertions(+), 13 deletions(-) > > > > diff --git a/arch/arm64/include/asm/cacheflush.h > > b/arch/arm64/include/asm/cacheflush.h > > index 9384fd8fc13c..45217f21f1fe 100644 > > --- a/arch/arm64/include/asm/cacheflush.h > > +++ b/arch/arm64/include/asm/cacheflush.h > > @@ -140,6 +140,7 @@ int set_memory_valid(unsigned long addr, int numpages, > > int enable); > > int set_direct_map_invalid_noflush(struct page *page); > > int set_direct_map_default_noflush(struct page *page); > > +bool kernel_page_present(struct page *page); > > #include > > diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c > > index 439325532be1..92eccaf595c8 100644 > > --- a/arch/arm64/mm/pageattr.c > > +++ b/arch/arm64/mm/pageattr.c > > @@ -186,8 +186,8 @@ void __kernel_map_pages(struct page *page, int > > numpages, int enable) > > set_memory_valid((unsigned long)page_address(page), numpages, enable); > > } > > +#endif /* CONFIG_DEBUG_PAGEALLOC */ > > -#ifdef CONFIG_HIBERNATION > > /* > >* This function is used to determine if a linear map page has 
been > > marked as > >* not-valid. Walk the page table and check the PTE_VALID bit. This is > > based > > @@ -234,5 +234,3 @@ bool kernel_page_present(struct page *page) > > ptep = pte_offset_kernel(pmdp, addr); > > return pte_valid(READ_ONCE(*ptep)); > > } > > -#endif /* CONFIG_HIBERNATION */ > > -#endif /* CONFIG_DEBUG_PAGEALLOC */ > > diff --git a/arch/riscv/include/asm/set_memory.h > > b/arch/riscv/include/asm/set_memory.h > > index 4c5bae7ca01c..d690b08dff2a 100644 > > --- a/arch/riscv/include/asm/set_memory.h > > +++ b/arch/riscv/include/asm/set_memory.h > > @@ -24,6 +24,7 @@ static inline int set_memory_nx(unsigned long addr, int > > numpages) { return 0; } > > int set_direct_map_invalid_noflush(struct page *page); > > int set_direct_map_default_noflush(struct page *page); > > +bool kernel_page_present(struct page *page); > > #endif /* __ASSEMBLY__ */ > > diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c > > index 321b09d2e2ea..87ba5a68bbb8 100644 > > --- a/arch/riscv/mm/pageattr.c > > +++ b/arch/riscv/mm/pageattr.c > > @@ -198,3 +198,32 @@ void __kernel_map_pages(struct page *page, int > > numpages, int enable) > > __pgprot(0), __pgprot(_PAGE_PRESENT)); > > } > > #endif > > + > > +bool kernel_page_present(struct page *page) > > +{ > > + unsigned long addr = (unsigned long)page_address(page); > > + pgd_t *pgd; > > + pud_t *pud; > > + p4d_t *p4d; > > + pmd_t *pmd; > > + pte_t *pte; > > + > > + pgd = pgd_offset_k(addr); > > + if (!pgd_present(*pgd)) > > + return false; > > + > > + p4d = p4d_offset(pgd, addr); > > + if (!p4d_present(*p4d)) > > + return false; > > + > > + pud = pud_offset(p4d, addr); > > + if (!pud_present(*pud)) > > + return false; > > + > > + pmd = pmd_offset(pud, addr); > > + if (!pmd_present(*pmd)) > > + return false; > > + > > + pte = pte_offset_kernel(pmd, addr); > > + return pte_present(*pte); > > +} > > diff --git a/arch/x86/include/asm/set_memory.h > > b/arch/x86/include/asm/set_memory.h > > index 
5948218f35c5..4352f08bfbb5 100644 > > --- a/arch/x86/include/asm/set_memory.h > > +++ b/arch/x86/include/asm/set_memory.h > > @@ -82,6 +82,7 @@ int set_pages_rw(struct page *page, int numpages); > > int set_direct_map_invalid_noflush(struct page *page); > > int set_direct_map_default_noflush(struct page *page); > > +bool kernel_page_present(struct page *page); > > extern int kernel_set_to_readonly; > > diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c > > index bc9be96b777f..16f878c26667 100644 > > --- a/arch/x86/mm/pat/set_memory.c > > +++ b/arch/x86/mm/pat/set_memory.c > > @@ -2226,8 +2226,8 @@ void __kernel_map_pages(struct page *page, int > > numpages, int enable) > > arch_flush_lazy_mmu_mode(); > > } >
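The RISC-V kernel_page_present() in the quoted patch checks each page table level in turn and returns false at the first level that is not present. A toy userspace model of that early-exit walk is sketched below. It assumes a made-up two-level table instead of the real pgd/p4d/pud/pmd/pte hierarchy, and top_table, leaf_table, ENTRIES, PRESENT and page_present() are hypothetical names for this sketch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES 4       /* entries per level in this toy model */
#define PRESENT 0x1u    /* toy "present" bit in a leaf entry */

struct leaf_table {
        uint32_t pte[ENTRIES];
};

struct top_table {
        struct leaf_table *dir[ENTRIES];        /* NULL means "not present" */
};

/* Walk top level, then leaf level, bailing out at the first missing entry. */
bool page_present(const struct top_table *top, unsigned int page)
{
        unsigned int hi = (page / ENTRIES) % ENTRIES;
        unsigned int lo = page % ENTRIES;
        const struct leaf_table *leaf = top->dir[hi];

        if (leaf == NULL)
                return false;

        return (leaf->pte[lo] & PRESENT) != 0;
}
```

The real walk has more levels, but each one follows the same shape: look up the entry, return false if it is absent, otherwise descend.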
Re: [PATCH v3 3/4] arch, mm: restore dependency of __kernel_map_pages() of DEBUG_PAGEALLOC
On Mon, Nov 02, 2020 at 10:23:20AM +0100, David Hildenbrand wrote: > > > int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long > > address, > >unsigned numpages, unsigned long page_flags) > > diff --git a/include/linux/mm.h b/include/linux/mm.h > > index 14e397f3752c..ab0ef6bd351d 100644 > > --- a/include/linux/mm.h > > +++ b/include/linux/mm.h > > @@ -2924,7 +2924,11 @@ static inline bool > > debug_pagealloc_enabled_static(void) > > return static_branch_unlikely(&_debug_pagealloc_enabled); > > } > > -#if defined(CONFIG_DEBUG_PAGEALLOC) || > > defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP) > > +#ifdef CONFIG_DEBUG_PAGEALLOC > > +/* > > + * To support DEBUG_PAGEALLOC architecture must ensure that > > + * __kernel_map_pages() never fails > > Maybe add here, that this implies mapping everything via PTEs during boot. This is more of an implementation detail, while assumption that __kernel_map_pages() does not fail is somewhat a requirement :) > Acked-by: David Hildenbrand Thanks! > -- > Thanks, > > David / dhildenb > -- Sincerely yours, Mike.
Re: [PATCH v2] powerpc/pci: unmap legacy INTx interrupts when a PHB is removed
On 10/14/20 4:55 AM, Alexey Kardashevskiy wrote: > > > On 23/09/2020 17:06, Cédric Le Goater wrote: >> On 9/23/20 2:33 AM, Qian Cai wrote: >>> On Fri, 2020-08-07 at 12:18 +0200, Cédric Le Goater wrote: When a passthrough IO adapter is removed from a pseries machine using hash MMU and the XIVE interrupt mode, the POWER hypervisor expects the guest OS to clear all page table entries related to the adapter. If some are still present, the RTAS call which isolates the PCI slot returns error 9001 "valid outstanding translations" and the removal of the IO adapter fails. This is because when the PHBs are scanned, Linux maps automatically the INTx interrupts in the Linux interrupt number space but these are never removed. To solve this problem, we introduce a PPC platform specific pcibios_remove_bus() routine which clears all interrupt mappings when the bus is removed. This also clears the associated page table entries of the ESB pages when using XIVE. For this purpose, we record the logical interrupt numbers of the mapped interrupt under the PHB structure and let pcibios_remove_bus() do the clean up. Since some PCI adapters, like GPUs, use the "interrupt-map" property to describe interrupt mappings other than the legacy INTx interrupts, we can not restrict the size of the mapping array to PCI_NUM_INTX. The number of interrupt mappings is computed from the "interrupt-map" property and the mapping array is allocated accordingly. Cc: "Oliver O'Halloran" Cc: Alexey Kardashevskiy Signed-off-by: Cédric Le Goater >>> >>> Some syscall fuzzing will trigger this on POWER9 NV where the traces >>> pointed to >>> this patch. >>> >>> .config: https://gitlab.com/cailca/linux-mm/-/blob/master/powerpc.config >> >> OK. The patch is missing a NULL assignement after kfree() and that >> might be the issue. >> >> I did try PHB removal under PowerNV, so I would like to understand >> how we managed to remove twice the PCI bus and possibly reproduce. 
>> Any chance we could grab what the syscall fuzzer (syzkaller) did ?
>
>
> How do you remove PHBs exactly? There is no such thing in the powernv
> platform, I thought someone added this and you are fixing it but no. PHBs on
> powernv are created at the boot time and there is no way to remove them, you
> can only try removing all the bridges.

Yes. I noticed that later when proposing the fix for the double free.

> So what exactly are you doing?

What you just said above, with the commands:

  echo 1 > /sys/devices/pci0031\:00/0031\:00\:00.0/remove
  echo 1 > /sys/devices/pci0031\:00/pci_bus/0031\:00/rescan

C.
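The double free discussed in this thread can be illustrated with a small userspace sketch. This is not the kernel patch itself: `struct phb_sketch`, `phb_map_irqs()` and `phb_unmap_irqs()` are hypothetical names, and `calloc()`/`free()` stand in for the kernel allocators. It only shows why clearing the pointer after freeing makes a second removal of the same bus harmless.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-PHB bookkeeping described in the patch. */
struct phb_sketch {
	unsigned int *irq_map;	/* logical interrupt numbers recorded at scan time */
	size_t irq_count;	/* number of entries, derived from "interrupt-map" */
};

/* Allocate the mapping array; returns 0 on success, -1 on allocation failure. */
static int phb_map_irqs(struct phb_sketch *phb, size_t count)
{
	assert(phb != NULL);
	phb->irq_map = calloc(count, sizeof(*phb->irq_map));
	if (!phb->irq_map)
		return -1;
	phb->irq_count = count;
	return 0;
}

/* Release the mappings. Safe to call more than once for the same PHB. */
static void phb_unmap_irqs(struct phb_sketch *phb)
{
	assert(phb != NULL);
	free(phb->irq_map);	/* free(NULL) is a no-op, as is kfree(NULL) */
	/* Without these two lines a second call would free the array again. */
	phb->irq_map = NULL;
	phb->irq_count = 0;
}
```

With the pointer reset, a remove followed by another remove (for example after a rescan that did not repopulate the array) only ever frees the array once.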
Re: [PATCH v3 2/4] PM: hibernate: make direct map manipulations more explicit
On Mon, Nov 02, 2020 at 10:19:36AM +0100, David Hildenbrand wrote:
> On 01.11.20 18:08, Mike Rapoport wrote:
> > From: Mike Rapoport
> >
> > When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page may be
> > not present in the direct map and has to be explicitly mapped before it
> > could be copied.
> >
> > Introduce hibernate_map_page() that will explicitly use
> > set_direct_map_{default,invalid}_noflush() for ARCH_HAS_SET_DIRECT_MAP case
> > and debug_pagealloc_map_pages() for DEBUG_PAGEALLOC case.
> >
> > The remapping of the pages in safe_copy_page() presumes that it only
> > changes protection bits in an existing PTE and so it is safe to ignore
> > return value of set_direct_map_{default,invalid}_noflush().
> >
> > Still, add a WARN_ON() so that future changes in set_memory APIs will not
> > silently break hibernation.
> >
> > Signed-off-by: Mike Rapoport
> > Acked-by: Rafael J. Wysocki
> > ---
> >  include/linux/mm.h | 12
> >  kernel/power/snapshot.c | 30 --
> >  2 files changed, 28 insertions(+), 14 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 1fc0609056dc..14e397f3752c 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -2927,16 +2927,6 @@ static inline bool debug_pagealloc_enabled_static(void)
> >  #if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
> >  extern void __kernel_map_pages(struct page *page, int numpages, int enable);
> >
> > -/*
> > - * When called in DEBUG_PAGEALLOC context, the call should most likely be
> > - * guarded by debug_pagealloc_enabled() or debug_pagealloc_enabled_static()
> > - */
> > -static inline void
> > -kernel_map_pages(struct page *page, int numpages, int enable)
> > -{
> > -	__kernel_map_pages(page, numpages, enable);
> > -}
> > -
> >  static inline void debug_pagealloc_map_pages(struct page *page,
> >  					     int numpages, int enable)
> >  {
> > @@ -2948,8 +2938,6 @@ static inline void debug_pagealloc_map_pages(struct page *page,
> >  extern bool kernel_page_present(struct page *page);
> >  #endif	/* CONFIG_HIBERNATION */
> >  #else	/* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
> > -static inline void
> > -kernel_map_pages(struct page *page, int numpages, int enable) {}
> >  static inline void debug_pagealloc_map_pages(struct page *page,
> >  					     int numpages, int enable) {}
> >  #ifdef CONFIG_HIBERNATION
> > diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> > index 46b1804c1ddf..054c8cce4236 100644
> > --- a/kernel/power/snapshot.c
> > +++ b/kernel/power/snapshot.c
> > @@ -76,6 +76,32 @@ static inline void hibernate_restore_protect_page(void *page_address) {}
> >  static inline void hibernate_restore_unprotect_page(void *page_address) {}
> >  #endif /* CONFIG_STRICT_KERNEL_RWX && CONFIG_ARCH_HAS_SET_MEMORY */
> >
> > +static inline void hibernate_map_page(struct page *page, int enable)
> > +{
> > +	if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
> > +		unsigned long addr = (unsigned long)page_address(page);
> > +		int ret;
> > +
> > +		/*
> > +		 * This should not fail because remapping a page here means
> > +		 * that we only update protection bits in an existing PTE.
> > +		 * It is still worth to have WARN_ON() here if something
> > +		 * changes and this will no longer be the case.
> > +		 */
> > +		if (enable)
> > +			ret = set_direct_map_default_noflush(page);
> > +		else
> > +			ret = set_direct_map_invalid_noflush(page);
> > +
> > +		if (WARN_ON(ret))
> > +			return;
>
> People seem to prefer pr_warn() now that production kernels have panic on
> warn enabled. It's weird.

Weird indeed, as the whole point of WARN is to yell without causing a
crash... I can change to pr_warn though...

> > +
> > +		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> > +	} else {
> > +		debug_pagealloc_map_pages(page, 1, enable);
>
> Reviewed-by: David Hildenbrand

Thanks!

> --
> Thanks,
>
> David / dhildenb
>

--
Sincerely yours,
Mike.
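The WARN_ON() versus pr_warn() point can be sketched in userspace. This is only an illustration of the control flow under review, not kernel code: `remap_stub()`, `map_page_checked()` and `flush_count` are hypothetical, `fprintf()` stands in for pr_warn(), and the error value is chosen arbitrarily. Unlike the void function in the patch, the sketch returns the error so the behaviour can be checked.

```c
#include <assert.h>
#include <stdio.h>

static int flush_count;	/* counts simulated TLB flushes */

/* Hypothetical stand-in for set_direct_map_{default,invalid}_noflush(). */
static int remap_stub(int enable, int fail)
{
	(void)enable;
	return fail ? -12 : 0;	/* -12 mimics -ENOMEM */
}

/*
 * Report a failed remap and skip the flush instead of silently ignoring
 * the return value. Returns 0 on success or the remap error.
 */
static int map_page_checked(int enable, int fail)
{
	int ret;

	assert(enable == 0 || enable == 1);
	ret = remap_stub(enable, fail);
	if (ret) {
		/* pr_warn()-style report: it logs but is not a WARN() splat */
		fprintf(stderr, "remap failed: %d\n", ret);
		return ret;
	}
	flush_count++;	/* stands in for flush_tlb_kernel_range() */
	return 0;
}
```

Either reporting style keeps the failure visible; the difference raised in the review is that a WARN() can escalate to a panic when panic_on_warn is set, while a plain log message cannot.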
Re: [PATCH v2 2/2] misc: ocxl: config: Rename function attribute description
On 02/11/2020 at 15:20, Lee Jones wrote:
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/misc/ocxl/config.c:81: warning: Function parameter or member 'dev' not described in 'get_function_0'
>  drivers/misc/ocxl/config.c:81: warning: Excess function parameter 'device' description in 'get_function_0'
>
> Cc: Frederic Barrat
> Cc: Andrew Donnellan
> Cc: Arnd Bergmann
> Cc: Greg Kroah-Hartman
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Lee Jones
> ---

Thanks!
Acked-by: Frederic Barrat

>  drivers/misc/ocxl/config.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
> index 4d490b92d951f..a68738f382521 100644
> --- a/drivers/misc/ocxl/config.c
> +++ b/drivers/misc/ocxl/config.c
> @@ -73,7 +73,7 @@ static int find_dvsec_afu_ctrl(struct pci_dev *dev, u8 afu_idx)
>
>  /**
>   * get_function_0() - Find a related PCI device (function 0)
> - * @device: PCI device to match
> + * @dev: PCI device to match
>   *
>   * Returns a pointer to the related device, or null if not found
>   */
[PATCH v2 2/2] misc: ocxl: config: Rename function attribute description
Fixes the following W=1 kernel build warning(s):

 drivers/misc/ocxl/config.c:81: warning: Function parameter or member 'dev' not described in 'get_function_0'
 drivers/misc/ocxl/config.c:81: warning: Excess function parameter 'device' description in 'get_function_0'

Cc: Frederic Barrat
Cc: Andrew Donnellan
Cc: Arnd Bergmann
Cc: Greg Kroah-Hartman
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Lee Jones
---
 drivers/misc/ocxl/config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
index 4d490b92d951f..a68738f382521 100644
--- a/drivers/misc/ocxl/config.c
+++ b/drivers/misc/ocxl/config.c
@@ -73,7 +73,7 @@ static int find_dvsec_afu_ctrl(struct pci_dev *dev, u8 afu_idx)
 
 /**
  * get_function_0() - Find a related PCI device (function 0)
- * @device: PCI device to match
+ * @dev: PCI device to match
  *
  * Returns a pointer to the related device, or null if not found
  */
--
2.25.1
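For readers unfamiliar with the warning, here is a minimal sketch of a kernel-doc comment whose `@name` entries match the parameter list. The function and `struct dev_sketch` are hypothetical and far simpler than the real get_function_0(); only the comment layout follows the kernel-doc convention.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified device record; not the kernel's struct pci_dev. */
struct dev_sketch {
	unsigned int slot;
	unsigned int func;
};

/**
 * find_function_0() - Find the function 0 device sharing a slot
 * @devs: array of devices to search
 * @count: number of entries in @devs
 * @dev: device to match
 *
 * Each @name above matches a parameter name below. A mismatch such as
 * documenting @device for a parameter called dev produces the pair of
 * warnings quoted in the patch description.
 *
 * Return: pointer to the related device, or NULL if not found
 */
static const struct dev_sketch *find_function_0(const struct dev_sketch *devs,
						size_t count,
						const struct dev_sketch *dev)
{
	size_t i;

	assert(dev != NULL);
	for (i = 0; i < count; i++) {
		if (devs[i].slot == dev->slot && devs[i].func == 0)
			return &devs[i];
	}
	return NULL;
}
```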
Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
On Mon, Nov 02, 2020 at 09:06:38AM +, David Laight wrote:
> From: 'Greg KH'
> > Sent: 23 October 2020 15:47
> >
> > On Fri, Oct 23, 2020 at 02:39:24PM +, David Laight wrote:
> > > From: David Hildenbrand
> > > > Sent: 23 October 2020 15:33
> > > ...
> > > > I just checked against upstream code generated by clang 10 and it
> > > > properly discards the upper 32bit via a mov w23 w2.
> > > >
> > > > So at least clang 10 indeed properly assumes we could have garbage and
> > > > masks it off.
> > > >
> > > > Maybe the issue is somewhere else, unrelated to nr_pages ... or clang 11
> > > > behaves differently.
> > >
> > > We'll need the disassembly from a failing kernel image.
> > > It isn't that big to hand annotate.
> >
> > I've worked around the merge at the moment in the android tree, but it
> > is still quite reproducible, and will try to get a .o file to
> > disassemble on Monday or so...
>
> Did this get properly resolved?

For some reason, 5.10-rc2 fixed all of this up. I backed out all of the
patches I had to revert to get 5.10-rc1 to work properly, and then did
the merge and all is well.

It must have been something to do with the compat changes in this same
area that went in after 5.10-rc1, and something got reorganized in the
files somehow. I really do not know, and at the moment, don't have the
time to track it down anymore. So for now, I'd say it's all good, sorry
for the noise.

greg k-h
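The "garbage in the upper 32 bits" concern quoted in this thread can be modelled in plain C. This is a sketch of the general hazard only; the thread never established that it was the actual cause of the failure, and `reg_with_garbage()`, `discard_upper()` and `is_sane_count()` are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model a 64-bit register holding a 32-bit argument whose upper half was
 * never cleared by the caller.
 */
static uint64_t reg_with_garbage(uint32_t value, uint32_t garbage)
{
	return ((uint64_t)garbage << 32) | value;
}

/* What a 32-bit register move achieves: keep the low half, zero the rest. */
static uint64_t discard_upper(uint64_t reg)
{
	return (uint32_t)reg;
}

/* A bounds check that is only meaningful on the truncated value. */
static int is_sane_count(uint64_t count, uint64_t limit)
{
	assert(limit <= UINT32_MAX);
	return count <= limit;
}
```

If the callee used the full 64-bit register as a count, a value such as 8 with stale upper bits would look enormous; truncating first, as the quoted `mov w23 w2` does, restores the intended value.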
Re: [PATCH 23/23] mtd: devices: powernv_flash: Add function names to headers and fix 'dev'
Hi Lee,

Lee Jones wrote on Mon, 2 Nov 2020 11:54:06 +:

> Fixes the following W=1 kernel build warning(s):
>
>  drivers/mtd/devices/powernv_flash.c:129: warning: Cannot understand * @mtd: the device
>  drivers/mtd/devices/powernv_flash.c:145: warning: Cannot understand * @mtd: the device
>  drivers/mtd/devices/powernv_flash.c:161: warning: Cannot understand * @mtd: the device
>  drivers/mtd/devices/powernv_flash.c:184: warning: Function parameter or member 'dev' not described in 'powernv_flash_set_driver_info'
>
> Cc: Miquel Raynal
> Cc: Richard Weinberger
> Cc: Vignesh Raghavendra
> Cc: Michael Ellerman
> Cc: Benjamin Herrenschmidt
> Cc: Paul Mackerras
> Cc: Cyril Bur
> Cc: linux-...@lists.infradead.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Lee Jones
> ---
>  drivers/mtd/devices/powernv_flash.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mtd/devices/powernv_flash.c b/drivers/mtd/devices/powernv_flash.c
> index 0b757d9ba2f6b..32cb0e649096f 100644
> --- a/drivers/mtd/devices/powernv_flash.c
> +++ b/drivers/mtd/devices/powernv_flash.c
> @@ -126,6 +126,8 @@ static int powernv_flash_async_op(struct mtd_info *mtd, enum flash_op op,
>  }
>
>  /**
> + * powernv_flash_read
> + *

Perhaps we should not add blank lines if the rest of the file does not
already have such spacing (see below).

>   * @mtd: the device
>   * @from: the offset to read from
>   * @len: the number of bytes to read
> @@ -142,6 +144,7 @@ static int powernv_flash_read(struct mtd_info *mtd, loff_t from, size_t len,
>  }
>
>  /**
> + * powernv_flash_write
>   * @mtd: the device
>   * @to: the offset to write to
>   * @len: the number of bytes to write
> @@ -158,6 +161,7 @@ static int powernv_flash_write(struct mtd_info *mtd, loff_t to, size_t len,
>  }
>
>  /**
> + * powernv_flash_erase
>   * @mtd: the device
>   * @erase: the erase info
>   * Returns 0 if erase successful or -ERRNO if an error occurred
> @@ -176,7 +180,7 @@ static int powernv_flash_erase(struct mtd_info *mtd, struct erase_info *erase)
>
>  /**
>   * powernv_flash_set_driver_info - Fill the mtd_info structure and docg3
> - * structure @pdev: The platform device
> + * @dev: The device structure
>   * @mtd: The structure to fill
>   */
>  static int powernv_flash_set_driver_info(struct device *dev,

Thanks,
Miquèl
[PATCH 23/23] mtd: devices: powernv_flash: Add function names to headers and fix 'dev'
Fixes the following W=1 kernel build warning(s):

 drivers/mtd/devices/powernv_flash.c:129: warning: Cannot understand * @mtd: the device
 drivers/mtd/devices/powernv_flash.c:145: warning: Cannot understand * @mtd: the device
 drivers/mtd/devices/powernv_flash.c:161: warning: Cannot understand * @mtd: the device
 drivers/mtd/devices/powernv_flash.c:184: warning: Function parameter or member 'dev' not described in 'powernv_flash_set_driver_info'

Cc: Miquel Raynal
Cc: Richard Weinberger
Cc: Vignesh Raghavendra
Cc: Michael Ellerman
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Cyril Bur
Cc: linux-...@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Lee Jones
---
 drivers/mtd/devices/powernv_flash.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/mtd/devices/powernv_flash.c b/drivers/mtd/devices/powernv_flash.c
index 0b757d9ba2f6b..32cb0e649096f 100644
--- a/drivers/mtd/devices/powernv_flash.c
+++ b/drivers/mtd/devices/powernv_flash.c
@@ -126,6 +126,8 @@ static int powernv_flash_async_op(struct mtd_info *mtd, enum flash_op op,
 }
 
 /**
+ * powernv_flash_read
+ *
  * @mtd: the device
  * @from: the offset to read from
  * @len: the number of bytes to read
@@ -142,6 +144,7 @@ static int powernv_flash_read(struct mtd_info *mtd, loff_t from, size_t len,
 }
 
 /**
+ * powernv_flash_write
  * @mtd: the device
  * @to: the offset to write to
  * @len: the number of bytes to write
@@ -158,6 +161,7 @@ static int powernv_flash_write(struct mtd_info *mtd, loff_t to, size_t len,
 }
 
 /**
+ * powernv_flash_erase
  * @mtd: the device
  * @erase: the erase info
  * Returns 0 if erase successful or -ERRNO if an error occurred
@@ -176,7 +180,7 @@ static int powernv_flash_erase(struct mtd_info *mtd, struct erase_info *erase)
 
 /**
  * powernv_flash_set_driver_info - Fill the mtd_info structure and docg3
- * structure @pdev: The platform device
+ * @dev: The device structure
  * @mtd: The structure to fill
  */
 static int powernv_flash_set_driver_info(struct device *dev,
--
2.25.1
[PATCH 2/2] misc: ocxl: config: Rename function attribute description
Fixes the following W=1 kernel build warning(s):

 drivers/misc/ocxl/config.c:81: warning: Function parameter or member 'dev' not described in 'get_function_0'
 drivers/misc/ocxl/config.c:81: warning: Excess function parameter 'device' description in 'get_function_0'

Cc: Frederic Barrat
Cc: Andrew Donnellan
Cc: Arnd Bergmann
Cc: Greg Kroah-Hartman
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Lee Jones
---
 drivers/misc/ocxl/config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
index 4d490b92d951f..a68738f382521 100644
--- a/drivers/misc/ocxl/config.c
+++ b/drivers/misc/ocxl/config.c
@@ -73,7 +73,7 @@ static int find_dvsec_afu_ctrl(struct pci_dev *dev, u8 afu_idx)
 
 /**
  * get_function_0() - Find a related PCI device (function 0)
- * @device: PCI device to match
+ * @dev: PCI device to match
  *
  * Returns a pointer to the related device, or null if not found
  */
--
2.25.1
Re: [PATCH v3 4/4] arch, mm: make kernel_page_present() always available
On 01.11.20 18:08, Mike Rapoport wrote: From: Mike Rapoport For architectures that enable ARCH_HAS_SET_MEMORY having the ability to verify that a page is mapped in the kernel direct map can be useful regardless of hibernation. Add RISC-V implementation of kernel_page_present(), update its forward declarations and stubs to be a part of set_memory API and remove ugly ifdefery in inlcude/linux/mm.h around current declarations of kernel_page_present(). Signed-off-by: Mike Rapoport --- arch/arm64/include/asm/cacheflush.h | 1 + arch/arm64/mm/pageattr.c| 4 +--- arch/riscv/include/asm/set_memory.h | 1 + arch/riscv/mm/pageattr.c| 29 + arch/x86/include/asm/set_memory.h | 1 + arch/x86/mm/pat/set_memory.c| 4 +--- include/linux/mm.h | 7 --- include/linux/set_memory.h | 5 + 8 files changed, 39 insertions(+), 13 deletions(-) diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h index 9384fd8fc13c..45217f21f1fe 100644 --- a/arch/arm64/include/asm/cacheflush.h +++ b/arch/arm64/include/asm/cacheflush.h @@ -140,6 +140,7 @@ int set_memory_valid(unsigned long addr, int numpages, int enable); int set_direct_map_invalid_noflush(struct page *page); int set_direct_map_default_noflush(struct page *page); +bool kernel_page_present(struct page *page); #include diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c index 439325532be1..92eccaf595c8 100644 --- a/arch/arm64/mm/pageattr.c +++ b/arch/arm64/mm/pageattr.c @@ -186,8 +186,8 @@ void __kernel_map_pages(struct page *page, int numpages, int enable) set_memory_valid((unsigned long)page_address(page), numpages, enable); } +#endif /* CONFIG_DEBUG_PAGEALLOC */ -#ifdef CONFIG_HIBERNATION /* * This function is used to determine if a linear map page has been marked as * not-valid. Walk the page table and check the PTE_VALID bit. 
This is based @@ -234,5 +234,3 @@ bool kernel_page_present(struct page *page) ptep = pte_offset_kernel(pmdp, addr); return pte_valid(READ_ONCE(*ptep)); } -#endif /* CONFIG_HIBERNATION */ -#endif /* CONFIG_DEBUG_PAGEALLOC */ diff --git a/arch/riscv/include/asm/set_memory.h b/arch/riscv/include/asm/set_memory.h index 4c5bae7ca01c..d690b08dff2a 100644 --- a/arch/riscv/include/asm/set_memory.h +++ b/arch/riscv/include/asm/set_memory.h @@ -24,6 +24,7 @@ static inline int set_memory_nx(unsigned long addr, int numpages) { return 0; } int set_direct_map_invalid_noflush(struct page *page); int set_direct_map_default_noflush(struct page *page); +bool kernel_page_present(struct page *page); #endif /* __ASSEMBLY__ */ diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c index 321b09d2e2ea..87ba5a68bbb8 100644 --- a/arch/riscv/mm/pageattr.c +++ b/arch/riscv/mm/pageattr.c @@ -198,3 +198,32 @@ void __kernel_map_pages(struct page *page, int numpages, int enable) __pgprot(0), __pgprot(_PAGE_PRESENT)); } #endif + +bool kernel_page_present(struct page *page) +{ + unsigned long addr = (unsigned long)page_address(page); + pgd_t *pgd; + pud_t *pud; + p4d_t *p4d; + pmd_t *pmd; + pte_t *pte; + + pgd = pgd_offset_k(addr); + if (!pgd_present(*pgd)) + return false; + + p4d = p4d_offset(pgd, addr); + if (!p4d_present(*p4d)) + return false; + + pud = pud_offset(p4d, addr); + if (!pud_present(*pud)) + return false; + + pmd = pmd_offset(pud, addr); + if (!pmd_present(*pmd)) + return false; + + pte = pte_offset_kernel(pmd, addr); + return pte_present(*pte); +} diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h index 5948218f35c5..4352f08bfbb5 100644 --- a/arch/x86/include/asm/set_memory.h +++ b/arch/x86/include/asm/set_memory.h @@ -82,6 +82,7 @@ int set_pages_rw(struct page *page, int numpages); int set_direct_map_invalid_noflush(struct page *page); int set_direct_map_default_noflush(struct page *page); +bool kernel_page_present(struct page *page); 
extern int kernel_set_to_readonly; diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c index bc9be96b777f..16f878c26667 100644 --- a/arch/x86/mm/pat/set_memory.c +++ b/arch/x86/mm/pat/set_memory.c @@ -2226,8 +2226,8 @@ void __kernel_map_pages(struct page *page, int numpages, int enable) arch_flush_lazy_mmu_mode(); } +#endif /* CONFIG_DEBUG_PAGEALLOC */ -#ifdef CONFIG_HIBERNATION bool kernel_page_present(struct page *page) { unsigned int level; @@ -2239,8 +2239,6 @@ bool kernel_page_present(struct page *page) pte = lookup_address((unsigned long)page_address(page), &level); return (pte_val(*pte) & _PAGE_PRESENT); } -#endif /* CONFIG_HIBERNATION */ -#endif /* CONFIG_DEBUG_PAGEALLOC */ int __init
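The RISC-V kernel_page_present() quoted above walks the page table top down and gives up at the first level that is not present. A toy userspace model of that walk is sketched below. The three-level layout, the `SK_*` constants, `struct sk_table` and `sk_page_present()` are all assumptions made for illustration and do not correspond to any real architecture's page-table format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SK_BITS		4			/* index bits per level (toy value) */
#define SK_ENTRIES	(1u << SK_BITS)
#define SK_LEVELS	3
#define SK_PRESENT	0x1u

/* A toy table: each entry has flags and, above the leaf, a next-level table. */
struct sk_table {
	struct {
		unsigned int flags;
		struct sk_table *next;
	} e[SK_ENTRIES];
};

/*
 * Walk from the root and stop at the first level whose entry is not
 * present. Assumes every present non-leaf entry has a valid next pointer.
 */
static bool sk_page_present(const struct sk_table *root, uint32_t vpn)
{
	const struct sk_table *t = root;
	int level;

	assert(root != NULL);
	for (level = SK_LEVELS - 1; level >= 0; level--) {
		unsigned int idx = (vpn >> (level * SK_BITS)) & (SK_ENTRIES - 1);

		if (!(t->e[idx].flags & SK_PRESENT))
			return false;
		if (level > 0)
			t = t->e[idx].next;
	}
	return true;
}
```

The early returns mirror the pgd/p4d/pud/pmd checks in the patch: a missing entry at any level means the page is not mapped, so the lower levels are never dereferenced.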
Re: [PATCH v3 3/4] arch, mm: restore dependency of __kernel_map_pages() of DEBUG_PAGEALLOC
>  int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
>  				   unsigned numpages, unsigned long page_flags)
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 14e397f3752c..ab0ef6bd351d 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2924,7 +2924,11 @@ static inline bool debug_pagealloc_enabled_static(void)
>  	return static_branch_unlikely(&_debug_pagealloc_enabled);
>  }
>
> -#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
> +#ifdef CONFIG_DEBUG_PAGEALLOC
> +/*
> + * To support DEBUG_PAGEALLOC architecture must ensure that
> + * __kernel_map_pages() never fails

Maybe add here, that this implies mapping everything via PTEs during boot.

Acked-by: David Hildenbrand

--
Thanks,

David / dhildenb
Re: [PATCH v3 2/4] PM: hibernate: make direct map manipulations more explicit
On 01.11.20 18:08, Mike Rapoport wrote:
> From: Mike Rapoport
>
> When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page may be
> not present in the direct map and has to be explicitly mapped before it
> could be copied.
>
> Introduce hibernate_map_page() that will explicitly use
> set_direct_map_{default,invalid}_noflush() for ARCH_HAS_SET_DIRECT_MAP case
> and debug_pagealloc_map_pages() for DEBUG_PAGEALLOC case.
>
> The remapping of the pages in safe_copy_page() presumes that it only
> changes protection bits in an existing PTE and so it is safe to ignore
> return value of set_direct_map_{default,invalid}_noflush().
>
> Still, add a WARN_ON() so that future changes in set_memory APIs will not
> silently break hibernation.
>
> Signed-off-by: Mike Rapoport
> Acked-by: Rafael J. Wysocki
> ---
>  include/linux/mm.h | 12
>  kernel/power/snapshot.c | 30 --
>  2 files changed, 28 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 1fc0609056dc..14e397f3752c 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2927,16 +2927,6 @@ static inline bool debug_pagealloc_enabled_static(void)
>  #if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
>  extern void __kernel_map_pages(struct page *page, int numpages, int enable);
>
> -/*
> - * When called in DEBUG_PAGEALLOC context, the call should most likely be
> - * guarded by debug_pagealloc_enabled() or debug_pagealloc_enabled_static()
> - */
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable)
> -{
> -	__kernel_map_pages(page, numpages, enable);
> -}
> -
>  static inline void debug_pagealloc_map_pages(struct page *page,
>  					     int numpages, int enable)
>  {
> @@ -2948,8 +2938,6 @@ static inline void debug_pagealloc_map_pages(struct page *page,
>  extern bool kernel_page_present(struct page *page);
>  #endif	/* CONFIG_HIBERNATION */
>  #else	/* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable) {}
>  static inline void debug_pagealloc_map_pages(struct page *page,
>  					     int numpages, int enable) {}
>  #ifdef CONFIG_HIBERNATION
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index 46b1804c1ddf..054c8cce4236 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c
> @@ -76,6 +76,32 @@ static inline void hibernate_restore_protect_page(void *page_address) {}
>  static inline void hibernate_restore_unprotect_page(void *page_address) {}
>  #endif /* CONFIG_STRICT_KERNEL_RWX && CONFIG_ARCH_HAS_SET_MEMORY */
>
> +static inline void hibernate_map_page(struct page *page, int enable)
> +{
> +	if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
> +		unsigned long addr = (unsigned long)page_address(page);
> +		int ret;
> +
> +		/*
> +		 * This should not fail because remapping a page here means
> +		 * that we only update protection bits in an existing PTE.
> +		 * It is still worth to have WARN_ON() here if something
> +		 * changes and this will no longer be the case.
> +		 */
> +		if (enable)
> +			ret = set_direct_map_default_noflush(page);
> +		else
> +			ret = set_direct_map_invalid_noflush(page);
> +
> +		if (WARN_ON(ret))
> +			return;

People seem to prefer pr_warn() now that production kernels have panic on
warn enabled. It's weird.

> +
> +		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +	} else {
> +		debug_pagealloc_map_pages(page, 1, enable);

Reviewed-by: David Hildenbrand

--
Thanks,

David / dhildenb
RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: 'Greg KH'
> Sent: 23 October 2020 15:47
>
> On Fri, Oct 23, 2020 at 02:39:24PM +, David Laight wrote:
> > From: David Hildenbrand
> > > Sent: 23 October 2020 15:33
> > ...
> > > I just checked against upstream code generated by clang 10 and it
> > > properly discards the upper 32bit via a mov w23 w2.
> > >
> > > So at least clang 10 indeed properly assumes we could have garbage and
> > > masks it off.
> > >
> > > Maybe the issue is somewhere else, unrelated to nr_pages ... or clang 11
> > > behaves differently.
> >
> > We'll need the disassembly from a failing kernel image.
> > It isn't that big to hand annotate.
>
> I've worked around the merge at the moment in the android tree, but it
> is still quite reproducible, and will try to get a .o file to
> disassemble on Monday or so...

Did this get properly resolved?

	David