Re: [PATCH 1/1] vfio-pci/nvlink2: Allow fallback to ibm,mmio-atsd[0]

2020-02-06 Thread Sam Bobroff
On Thu, Feb 06, 2020 at 03:23:03PM +1100, Alexey Kardashevskiy wrote: > > > On 06/02/2020 14:17, Sam Bobroff wrote: > > Older versions of skiboot only provide a single value in the device > > tree property "ibm,mmio-atsd", even when multiple Address Translation

Re: [PATCH 1/1] vfio-pci/nvlink2: Allow fallback to ibm,mmio-atsd[0]

2020-02-06 Thread Sam Bobroff
On Fri, Feb 07, 2020 at 01:39:14PM +1100, Sam Bobroff wrote: > On Thu, Feb 06, 2020 at 03:23:03PM +1100, Alexey Kardashevskiy wrote: > > > > > > On 06/02/2020 14:17, Sam Bobroff wrote: > > > Older versions of skiboot only provide a single value in the device

[PATCH 1/1] powerpc/eeh: fix deadlock handling dead PHB

2020-02-06 Thread Sam Bobroff
, incorrectly, processed more than once. Untangling this section can move the pe processing out of the loop and also outside the locked section, correcting both problems. Signed-off-by: Sam Bobroff --- I have only compile tested this fix, Frederic Barrat (who discovered it) has offered to test it (thanks

Re: [PATCH 1/3] powerpc/sriov: Remove VF eeh_dev state when disabling SR-IOV

2019-08-21 Thread Sam Bobroff
allbacks so > the EEH fallback path (which removes and re-probes PCI devices) > would be used. I gave this a quick test with some added instrumentation, and I can see that the new code is used during VF removal and it doesn't cause any new problems. I agree that even if it's difficult

Re: [PATCH 3/3] powerpc/pcidn: Warn when sriov pci_dn management is used incorrectly

2019-08-21 Thread Sam Bobroff
d remove the dead > code that checks if the device is a VF. > > Signed-off-by: Oliver O'Halloran Looks good, but you might want to consider using WARN_ON_ONCE() just in case it gets hit a lot. Reviewed-by: Sam Bobroff > --- > arch/powerpc/kernel/pci_dn.c | 17 +++-

Re: [PATCH 2/3] powerpc/pcidn: Make VF pci_dn management CONFIG_PCI_IOV specific

2019-08-21 Thread Sam Bobroff
hen CONFIG_PCI_IOV > is selected, and rename them to reflect their actual usage rather than > having them masquerade as generic code. > > Signed-off-by: Oliver O'Halloran Nice cleanup, Reviewed-by: Sam Bobroff > --- > arch/powerpc/include/asm/pci-bridge.h | 7

[PATCH] powerpc/eeh: Fixup EEH for pSeries hotplug

2019-08-21 Thread Sam Bobroff
Signed-off-by: Sam Bobroff --- Let's move the test into eeh_add_device_tree_late(). Thanks, Sam. arch/powerpc/kernel/eeh.c | 2 ++ arch/powerpc/kernel/of_platform.c | 3 +-- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/k

Re: [PATCH 01/14] powerpc/eeh: Clean up EEH PEs after recovery finishes

2019-09-16 Thread Sam Bobroff
change is where EEH_PE_RECOVERING affects eeh_pe_reset_and_recover() (used when a PE is passed back from a guest to the host), but the test case doesn't seem to be any worse. Reviewed-by: Sam Bobroff > --- > Sam Bobroff is working on implementing proper refcounting for EEH PEs, >

Re: [PATCH 02/14] powerpc/eeh: Fix race when freeing PDNs

2019-09-16 Thread Sam Bobroff
l, meaning the pci_dev is already gone, the release handler is already called, and the PDN can be removed there, or b) returns non-null and atomically increases the refcount and the release handler won't be called until after we've set the DEAD flag and released our reference. Looks g

Re: [PATCH 03/14] powerpc/eeh: Make permanently failed devices non-actionable

2019-09-16 Thread Sam Bobroff
considered un-actionable. > > Signed-off-by: Oliver O'Halloran Other than the typo, looks good (I think it should always have been like this): Reviewed-by: Sam Bobroff > --- > arch/powerpc/kernel/eeh_driver.c | 12 ++-- > 1 file changed, 10 insertions(+), 2 deleti

Re: [PATCH 04/14] powerpc/eeh: Check slot presence state in eeh_handle_normal_event()

2019-09-16 Thread Sam Bobroff
On Tue, Sep 03, 2019 at 08:15:55PM +1000, Oliver O'Halloran wrote: > When a device is surprise removed while undergoing IO we will probably > get an EEH PE freeze due to MMIO timeouts and other errors. When a freeze > is detected we send a recovery event to the EEH worker thread which will > notify

Re: [PATCH 05/14] powerpc/eeh: Defer printing stack trace

2019-09-16 Thread Sam Bobroff
On Tue, Sep 03, 2019 at 08:15:56PM +1000, Oliver O'Halloran wrote: > Currently we print a stack trace in the event handler to help with > debugging EEH issues. In the case of surprise hot-unplug this is unneeded, > so we want to prevent printing the stack trace unless we know it's due to > an actual

Re: [PATCH 06/14] powerpc/eeh: Remove stale CAPI comment

2019-09-16 Thread Sam Bobroff
On Tue, Sep 03, 2019 at 08:15:57PM +1000, Oliver O'Halloran wrote: > Support for switching CAPI cards into and out of CAPI mode was removed a > while ago. Drop the comment since it's no longer relevant. > > Cc: Andrew Donnellan > Signed-off-by: Oliver O'Halloran

Re: [PATCH 07/14] powernv/eeh: Use generic code to handle hot resets

2019-09-16 Thread Sam Bobroff
On Tue, Sep 03, 2019 at 08:15:58PM +1000, Oliver O'Halloran wrote: > When we reset PCI devices managed by a hotplug driver the reset may > generate spurious hotplug events that cause the PCI device we're resetting > to be torn down accidentally. This is a problem for EEH (when the driver is > EEH awa

Re: [PATCH 11/14] powerpc/eeh: Set attention indicator while recovering

2019-09-16 Thread Sam Bobroff
> the device is present and only clear it if the device is fully recovered. > > Signed-off-by: Oliver O'Halloran Looks good, although I think it would be clearer if you could separate checking the slot from raising the alert. Reviewed-by: Sam Bobroff >

Re: [PATCH 12/14] powerpc/eeh: Add debugfs interface to run an EEH check

2019-09-16 Thread Sam Bobroff
ace. > > Signed-off-by: Oliver O'Halloran Looks good, and I tested it with the next patch and it seems to work. But I think you should make it clear that this does not work with the hardware "EEH error injection" facility accessible via debugfs in err_injct (that doesn't

Re: [PATCH 13/14] powerpc/eeh: Add a eeh_dev_break debugfs interface

2019-09-16 Thread Sam Bobroff
ks good to me. Tested with the previous patch. Tested-by: Sam Bobroff Reviewed-by: Sam Bobroff > --- > arch/powerpc/kernel/eeh.c | 139 +- > 1 file changed, 138 insertions(+), 1 deletion(-) > > diff --git a/arch/powerpc/kernel/eeh.c b/arch/p

Re: [PATCH 05/14] powerpc/eeh: Defer printing stack trace

2019-09-16 Thread Sam Bobroff
On Tue, Sep 17, 2019 at 11:45:14AM +1000, Oliver O'Halloran wrote: > On Tue, Sep 17, 2019 at 11:04 AM Sam Bobroff wrote: > > > > On Tue, Sep 03, 2019 at 08:15:56PM +1000, Oliver O'Halloran wrote: > > > Currently we print a stack trace in the event handler to he

Re: [PATCH v5 05/12] powerpc/eeh: EEH for pSeries hot plug

2019-09-22 Thread Sam Bobroff
On Thu, Sep 19, 2019 at 03:28:40PM -0500, Nathan Lynch wrote: > Hello Sam, > > Sam Bobroff writes: > > On PowerNV and pSeries, devices currently acquire EEH support from > > several different places: Boot-time devices from eeh_probe_devices() > > and eeh_addr_cach

[PATCH RFC 01/15] powerpc/eeh: Introduce refcounting for struct eeh_pe

2019-10-01 Thread Sam Bobroff
provides no additional synchronization of the other EEH state, it seems to be an effective way of providing the necessary safety with a very low risk of introducing deadlocks. Signed-off-by: Sam Bobroff --- arch/powerpc/include/asm/eeh.h | 7 arch/powerpc/kernel/eeh_pe.c | 70
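
For readers unfamiliar with the pattern, the refcounting idea named in the subject line can be sketched in plain userspace C. This is only an illustration under stated assumptions: the struct, its fields and the sketch_* helper names are hypothetical, C11 atomics stand in for whatever primitive the patch actually uses, and nothing here is taken from the patch itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted PE object (not the kernel struct). */
struct sketch_pe {
	atomic_int refs;	/* number of outstanding references */
	int addr;		/* placeholder payload */
};

/* Allocate an object that starts with one reference, owned by the caller. */
static struct sketch_pe *sketch_pe_alloc(int addr)
{
	struct sketch_pe *pe = malloc(sizeof(*pe));

	if (!pe)
		return NULL;
	atomic_init(&pe->refs, 1);
	pe->addr = addr;
	return pe;
}

/* Take an additional reference; the caller must already hold one. */
static void sketch_pe_get(struct sketch_pe *pe)
{
	atomic_fetch_add(&pe->refs, 1);
}

/*
 * Drop a reference. The object is freed when the last reference goes away.
 * Returns true if this call freed the object.
 */
static bool sketch_pe_put(struct sketch_pe *pe)
{
	/* atomic_fetch_sub() returns the value held before the decrement. */
	if (atomic_fetch_sub(&pe->refs, 1) != 1)
		return false;
	free(pe);
	return true;
}
```

As the snippet says, a reference only keeps the object alive; it does not serialize access to any other state, which still needs its own locking.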

[PATCH RFC 04/15] powerpc/eeh: Sync eeh_pe_next(), eeh_pe_find() and early-out traversals

2019-10-01 Thread Sam Bobroff
set to NULL on removal (see eeh_rmv_from_parent_pe()) (PHB type PEs never have their parent set, but aren't a problem: they can't be removed). If this does occur, the traversal is terminated. This may leave the traversal incomplete, but that is preferable to crashing. Signed-off

[PATCH RFC 10/15] powerpc/eeh: Sync eeh_phb_check_failure()

2019-10-01 Thread Sam Bobroff
Synchronize access to eeh_pe. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c index 7eb6ca1ab72b..eb37cb384ff4 100644 --- a/arch/powerpc/kernel/eeh.c +++ b

[PATCH RFC 09/15] powerpc/eeh: Sync eeh_handle_special_event(), pnv_eeh_get_pe(), pnv_eeh_next_error()

2019-10-01 Thread Sam Bobroff
Synchronize access to eeh_pe. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c | 15 +--- arch/powerpc/platforms/powernv/eeh-powernv.c | 38 2 files changed, 43 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/kernel/eeh_driver.c b

[PATCH RFC 08/15] powerpc/eeh: Sync eeh_handle_normal_event()

2019-10-01 Thread Sam Bobroff
Synchronize access to eeh_pe. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c | 4 1 file changed, 4 insertions(+) diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c index b3245d0cfb22..c9d73070793e 100644 --- a/arch/powerpc/kernel

[PATCH RFC 11/15] powerpc/eeh: Sync eeh_dev_check_failure()

2019-10-01 Thread Sam Bobroff
Synchronize access to eeh_pe. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh.c | 26 -- 1 file changed, 20 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c index eb37cb384ff4..171be70b34d8 100644 --- a/arch/powerpc

[PATCH RFC 07/15] powerpc/eeh: Sync eeh_add_to_parent_pe() and eeh_rmv_from_parent_pe()

2019-10-01 Thread Sam Bobroff
Note that even though there is currently only one place where a PE can be removed from the parent/child tree (eeh_rmv_from_parent_pe()), it is still protected against concurrent removal in case that changes in the future. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_pe.c | 26

[PATCH RFC 12/15] powerpc/eeh: Sync eeh_pe_get_state()

2019-10-01 Thread Sam Bobroff
Synchronize access to eeh_pe. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c index 171be70b34d8..cba16ca0694a 100644 --- a/arch/powerpc/kernel/eeh.c +++ b

[PATCH RFC 06/15] powerpc/eeh: Sync eeh_phb_pe_get()

2019-10-01 Thread Sam Bobroff
Synchronize access to eeh_pe. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_pe.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c index 0486d3c6ff20..e89a30de2e7e 100644 --- a/arch/powerpc/kernel

[PATCH RFC 03/15] powerpc/eeh: Track orphaned struct eeh_pe

2019-10-01 Thread Sam Bobroff
ing, so any PEs that stay longer will be the result of bugs. The list can be examined by reading from the "eeh_pe_debug" file in debugfs. Signed-off-by: Sam Bobroff --- arch/powerpc/include/asm/eeh.h | 4 +++ arch/powerpc/kernel/eeh.c | 21 ++ arch/power

[PATCH RFC 02/15] powerpc/eeh: Rename eeh_pe_get() to eeh_pe_find()

2019-10-01 Thread Sam Bobroff
There are now functions eeh_get_pe() and eeh_pe_get() which seems likely to cause confusion. Keep eeh_get_pe() because "get" is commonly used to refer to acquiring a reference (which it does), and rename eeh_pe_get() to eeh_pe_find() because it performs a search. Signed-off-by: S

[PATCH RFC 13/15] powerpc/eeh: Sync pnv_eeh_ei_write()

2019-10-01 Thread Sam Bobroff
Synchronize access to eeh_pe. Signed-off-by: Sam Bobroff --- arch/powerpc/platforms/powernv/eeh-powernv.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c index c56a796dd894..12367ed2083b 100644

[PATCH RFC 00/15] powerpc/eeh: Synchronize access to struct eeh_pe

2019-10-01 Thread Sam Bobroff
of PEs that have been removed from the PHB tree, but not yet freed and makes that list available in debugfs. Any PEs that remain orphans for very long are going to be the result of bugs. It's extra risk because it itself could contain bugs, but it could also be useful during debugging. Cheers,

[PATCH RFC 15/15] powerpc/eeh: Sync pcibios_set_pcie_reset_state()

2019-10-01 Thread Sam Bobroff
Synchronize access to eeh_pe. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c index 26d9367c41a1..c61bfaf4ca26 100644 --- a/arch/powerpc/kernel/eeh.c +++ b/arch/powerpc/kernel

[PATCH RFC 05/15] powerpc/eeh: Sync eeh_pe_get_parent()

2019-10-01 Thread Sam Bobroff
Synchronize access to eeh_pe. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_pe.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c index b89ed46f14e6..0486d3c6ff20 100644 --- a/arch/powerpc

[PATCH RFC 14/15] powerpc/eeh: Sync eeh_force_recover_write()

2019-10-01 Thread Sam Bobroff
Synchronize access to eeh_pe. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c index cba16ca0694a..26d9367c41a1 100644 --- a/arch/powerpc/kernel/eeh.c +++ b/arch

Re: [EXTERNAL] [RFC PATCH] powernv/eeh: Fix oops when probing cxl devices

2019-10-14 Thread Sam Bobroff
On Fri, Sep 27, 2019 at 02:45:10PM +0200, Frederic Barrat wrote: > Recent cleanup in the way EEH support is added to a device causes a > kernel oops when the cxl driver probes a device and creates virtual > devices discovered on the FPGA: > > BUG: Kernel NULL pointer dereference at 0x00a0

Re: [PATCH] powerpc/eeh: Only dump stack once if an MMIO loop is detected

2019-10-15 Thread Sam Bobroff
spinning in a loop. This > results in a lot of spurious stack traces in the kernel log. > > Fix this by limiting it to printing one stack trace for each PE freeze. If > the driver is truly stuck the kernel's hung task detector is better suited > to reporting the problem anyway.
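
The "one stack trace for each PE freeze" rule quoted above amounts to a per-PE flag that is re-armed on every new freeze. A minimal sketch of that idea follows. The struct and the sketch_* functions are hypothetical names chosen for illustration, and the actual patch may track this differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-PE state; not the kernel's struct eeh_pe. */
struct sketch_pe {
	bool frozen;		/* PE is currently frozen */
	bool stack_dumped;	/* a trace was already printed for this freeze */
};

/* Record a new freeze and re-arm the one-shot stack dump. */
static void sketch_pe_freeze(struct sketch_pe *pe)
{
	pe->frozen = true;
	pe->stack_dumped = false;
}

/*
 * Returns true if the caller should print a stack trace now.
 * Only the first call after each freeze returns true.
 */
static bool sketch_should_dump_stack(struct sketch_pe *pe)
{
	if (!pe->frozen || pe->stack_dumped)
		return false;
	pe->stack_dumped = true;
	return true;
}
```

A driver spinning on a frozen PE would then trigger one trace per freeze instead of one per failed access.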

Re: [PATCH] powernv/eeh: Fix oops when probing cxl devices

2019-10-16 Thread Sam Bobroff
didn't > touch the pseries path. At least on pseries, if there's another > unexpected case where the pdn is NULL, we should catch it more easily > with the oops message. OK. I agree that it's not worth doing more. Reviewed-by: Sam Bobroff > arch/powerpc/platforms/p

[PATCH 1/1] powerpc/eeh: differentiate duplicate detection message

2019-10-16 Thread Sam Bobroff
r EEH: eeh_dev_check_failure: Frozen PHB#0-PE#0 detected EEH: Recovering PHB#0-PE#0 Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c index d9279d0

Re: powerpc/pci: [PATCH 1/1]: PCIE PHB reset

2020-05-13 Thread Sam Bobroff
On Thu, May 07, 2020 at 08:10:37AM -0500, wenxi...@linux.vnet.ibm.com wrote: > From: Wen Xiong > > Several device drivers hit EEH (Extended Error Handling) when triggering > kdump on Pseries PowerVM. This patch implemented a reset of the PHBs > in pci general code. PHB reset stops all PCI transacti

Re: powerpc/pci: [PATCH 1/1 V3] PCIE PHB reset

2020-05-28 Thread Sam Bobroff
dump. PHB reset stops all PCI > transactions from normal kernel. We have tested the patch in several > environments: > - direct slot adapters > - adapters under the switch > - a VF adapter in PowerVM > - a VF adapter/adapter in KVM guest. > > Signed-off-by: Wen Xiong Loo

[PATCH RFC 1/1] powerpc/eeh: Provide a unique ID for each EEH recovery

2020-06-23 Thread Sam Bobroff
Give a unique ID to each recovery event, to ease log parsing and prepare for parallel recovery. Also add some new messages with a very simple format that may be useful to log-parsers. Signed-off-by: Sam Bobroff --- This patch should be applied on top of my recent(ish) set: "powerp
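
One simple way to hand out a unique ID per recovery event, even once recoveries run in parallel, is an atomically incremented counter. The sketch below shows that idea in userspace C. The counter, the function name and the choice to start at 1 are assumptions made for illustration, and the patch itself may generate its IDs some other way.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical global counter; IDs start at 1 in this sketch. */
static atomic_uint sketch_next_recovery_id = 1;

/*
 * Return a fresh ID for a new recovery event. atomic_fetch_add() returns
 * the value held before the increment, so concurrent callers each get a
 * distinct ID.
 */
static unsigned int sketch_new_recovery_id(void)
{
	return atomic_fetch_add(&sketch_next_recovery_id, 1);
}
```

Each log message emitted during a recovery could then carry that ID, so a parser can group lines by event even when several recoveries interleave.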

[PATCH RFC 1/1] powerpc/eeh: Asynchronous recovery

2020-06-23 Thread Sam Bobroff
alled by traversing the tree of affected PEs from the top, stopping to call handlers (in parallel) when a PE with devices is discovered. When the calls for that PE are complete, traversal continues at each child PE. Signed-off-by: Sam Bobroff --- This patch should be applied on top of both: "p

[PATCH RFC 1/1] powerpc/eeh: PE info tree via debugfs and syslog

2020-06-23 Thread Sam Bobroff
/eeh_pe_tree Signed-off-by: Sam Bobroff --- Here's some debug code I've been using for a long time while working on EEH. I haven't posted it before because it wasn't possible to make the code safe enough (to avoid either NULL or LIST_POISON), but with the recent safety w

[PATCH 1/1] MAINTAINERS: Remove self

2020-06-29 Thread Sam Bobroff
I'm sorry to say I can no longer maintain this position. Signed-off-by: Sam Bobroff --- MAINTAINERS | 1 - 1 file changed, 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 496fd4eafb68..7e954e4a29e1 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -13187,7 +13187,6 @@ F: tool

Re: [PATCH 3/8] powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag

2019-04-29 Thread Sam Bobroff
On Thu, Apr 18, 2019 at 07:51:40PM +1000, Oliver O'Halloran wrote: > On Wed, 2019-03-20 at 13:58 +1100, Sam Bobroff wrote: > > The PHB flag, PNV_PHB_FLAG_EEH, is set (on PowerNV) individually on > > each PHB once the EEH subsystem is ready. It is the only use of the > >

Re: [PATCH 5/8] powerpc/eeh: Add eeh_show_enabled()

2019-04-29 Thread Sam Bobroff
On Thu, Apr 18, 2019 at 08:01:36PM +1000, Oliver O'Halloran wrote: > On Wed, 2019-03-20 at 13:58 +1100, Sam Bobroff wrote: > > Move the EEH enabled message into its own function so that future > > work can call it from multiple places. > > > > Signed-o

Re: [PATCH 6/8] powerpc/eeh: Initialize EEH address cache earlier

2019-04-29 Thread Sam Bobroff
On Thu, Apr 18, 2019 at 08:13:53PM +1000, Oliver O'Halloran wrote: > On Wed, 2019-03-20 at 13:58 +1100, Sam Bobroff wrote: > > The EEH address cache is currently initialized and populated by a > > single function: eeh_addr_cache_build(). While the initial population > >

Re: [PATCH] vfio-pci/nvlink2: Fix potential VMA leak

2019-05-06 Thread Sam Bobroff
On Mon, May 06, 2019 at 03:58:45PM -0600, Alex Williamson wrote: > On Fri, 19 Apr 2019 17:37:17 +0200 > Greg Kurz wrote: > > > If vfio_pci_register_dev_region() fails then we should rollback > > previous changes, ie. unmap the ATSD registers. > > > > Signed-off-by: Greg Kurz > > --- > > Applie

[PATCH v2 1/6] powerpc/64: Adjust order in pcibios_init()

2019-05-06 Thread Sam Bobroff
already the case) and at boot time, to support future work. Signed-off-by: Sam Bobroff Reviewed-by: Alexey Kardashevskiy --- arch/powerpc/kernel/pci-common.c | 4 arch/powerpc/kernel/pci_32.c | 4 arch/powerpc/kernel/pci_64.c | 12 +--- 3 files changed, 13 insertions

[PATCH v2 4/6] powerpc/eeh: Initialize EEH address cache earlier

2019-05-06 Thread Sam Bobroff
step into a separate function and call it from a core_initcall (rather than a subsys initcall). This will allow future work to make use of the cache during boot time PCI scanning. Signed-off-by: Sam Bobroff Reviewed-by: Alexey Kardashevskiy --- arch/powerpc/include/asm/eeh.h | 3 +++ arch

[PATCH v2 0/6]

2019-05-06 Thread Sam Bobroff
che earlier Patch 7/8: powerpc/eeh: EEH for pSeries hot plug Patch 8/8: powerpc/eeh: Remove eeh_probe_devices() and eeh_addr_cache_build() Sam Bobroff (6): powerpc/64: Adjust order in pcibios_init() powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag powerpc/eeh: Improve debug messages around

[PATCH v2 2/6] powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag

2019-05-06 Thread Sam Bobroff
lers to be incorrectly ignored). To remedy this, clear the flag at the beginning of recovery processing. The flag is still cleared at the end of recovery processing, although it is no longer really necessary. Also clear the flag during eeh_handle_special_event(), for the same reasons. Signed-off-by: S

[PATCH v2 3/6] powerpc/eeh: Improve debug messages around device addition

2019-05-06 Thread Sam Bobroff
Also remove useless comment. Signed-off-by: Sam Bobroff Reviewed-by: Alexey Kardashevskiy --- arch/powerpc/kernel/eeh.c| 2 +- arch/powerpc/platforms/powernv/eeh-powernv.c | 14 arch/powerpc/platforms/pseries/eeh_pseries.c | 23 +++- 3 files

[PATCH v2 6/6] powerpc/eeh: Refactor around eeh_probe_devices()

2019-05-06 Thread Sam Bobroff
Note that previously on pSeries, useless EEH sysfs files were created for some devices that did not have EEH support and this change prevents them from being created. Signed-off-by: Sam Bobroff --- v2 - As it's so small, merged the enablement message patch into this one (where

[PATCH v2 5/6] powerpc/eeh: EEH for pSeries hot plug

2019-05-06 Thread Sam Bobroff
was not previously possible (it was already possible on pSeries). Signed-off-by: Sam Bobroff --- v2 - Dropped changes to the PowerNV PHB EEH flag, instead refactor just enough to use the existing flag from multiple places. - Merge the little remaining work from the above change into the

Re: [EXTERNAL] Re: [PATCH v3 1/3] PCI: Introduce pcibios_ignore_alignment_request

2019-05-29 Thread Sam Bobroff
p the FW assignments otherwise things > break. QEMU however doesn't do any BAR assignments and relies on that > being handled by the guest. At boot time this is done by SLOF, but > Linux only keeps SLOF around until it's extracted the device-tree. > Once that's done SLOF

Re: [PATCH kernel] powerpc/pci/of: Fix OF flags parsing for 64bit BARs

2019-06-04 Thread Sam Bobroff
_alignment= with /chosen/linux,pci-probe-only is broken > anyway. Looks good to me. I gave it a quick test for regressions, with a host and QEMU guest (with some passed-through devices) both using the patch and it seemed fine. Reviewed-by: Sam Bobroff > --- > arch/powerpc/kernel/pci_of_sca

Re: [PATCH 01/12] powerpc: Disable HFSCR:TM if TM not supported

2017-03-27 Thread Sam Bobroff
On Mon, Mar 20, 2017 at 05:49:03PM +1100, Benjamin Herrenschmidt wrote: > Otherwise KVM guests might mess with it even when told not > to causing bad thing interrupts in the host > > Signed-off-by: Benjamin Herrenschmidt I've tested this on a P8, with a kernel and QEMU close to their respective

[PATCH 1/1] KVM: PPC: Book3S: Fix server always zero from kvmppc_xive_get_xive()

2017-09-25 Thread Sam Bobroff
xes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller") Cc: sta...@vger.kernel.org Signed-off-by: Sam Bobroff --- The other obvious way to patch this would be to set state->guest_server in kvmppc_xive_set_xive() and that does also work because ac

[PATCH 1/1] powerpc/pseries: Enable RAS hotplug events late

2018-02-11 Thread Sam Bobroff
" being uninitialized, due to init_ras_IRQ() executing before hugetlb_init(). To correct this, extract the part of init_ras_IRQ() that enables hotplug event processing and place it in the machine_late_initcall phase, which is guaranteed to be after hugetlb_init() is called. Signed-off-by: S

[PATCH RFC 1/1] powerpc/pseries: fix EEH recovery of IOV devices

2018-02-21 Thread Sam Bobroff
), because the "ibm,open-sriov-vf-bar-info" property is missing, causing it to bail out early. Correct this by zeroing the IOV resources in the bailout path, so that they are not seen by pci_enable_resources(). Signed-off-by: Sam Bobroff --- Hi, This is a fix to allow EEH recovery to succ

[PATCH 0/9] EEH refactoring 1

2018-03-05 Thread Sam Bobroff
Hello everyone, Here is a set of some small, mostly idempotent, changes to improve maintainability in some of the EEH code, primarily in eeh_driver.c. I've kept them all small to aid review but perhaps they should be squashed down before being applied. Cheers, Sam. Sam Bobroff (9): po

[PATCH 1/9] powerpc/eeh: Remove eeh_handle_event()

2018-03-05 Thread Sam Bobroff
g but obscure the flow of control. So, remove it. Signed-off-by: Sam Bobroff --- arch/powerpc/include/asm/eeh_event.h | 3 ++- arch/powerpc/kernel/eeh_driver.c | 42 +--- arch/powerpc/kernel/eeh_event.c | 4 ++-- 3 files changed, 19 insertions(+), 30 dele

[PATCH 2/9] powerpc/eeh: Manage EEH_PE_RECOVERING inside eeh_handle_normal_event()

2018-03-05 Thread Sam Bobroff
ed on an invalid PE, which is now avoided. Signed-off-by: Sam Bobroff --- arch/powerpc/include/asm/eeh_event.h | 2 +- arch/powerpc/kernel/eeh_driver.c | 29 +++-- arch/powerpc/kernel/eeh_event.c | 2 -- 3 files changed, 12 insertions(+), 21 deletions(-) diff

[PATCH 3/9] powerpc/eeh: Fix misleading comment in __eeh_addr_cache_get_device()

2018-03-05 Thread Sam Bobroff
Commit "0ba17b05 powerpc/eeh: Remove reference to PCI device" removed a call to pci_dev_get() from __eeh_addr_cache_get_device() but did not update the comment to match. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_cache.c | 3 +-- 1 file changed, 1 insertion(+), 2

[PATCH 4/9] powerpc/eeh: Remove misleading test in eeh_handle_normal_event()

2018-03-05 Thread Sam Bobroff
Remove a test that checks if "frozen_bus" is NULL, because it cannot have changed since it was tested at the start of the function and so must be true here. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c | 24 +++- 1 file changed, 11 inserti

[PATCH 5/9] powerpc/eeh: Rename frozen_bus to bus in eeh_handle_normal_event()

2018-03-05 Thread Sam Bobroff
The name "frozen_bus" is misleading: it's not necessarily frozen, it's just the PE's PCI bus. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kernel/eeh_drive

[PATCH 6/9] powerpc/eeh: Clarify arguments to eeh_reset_device()

2018-03-05 Thread Sam Bobroff
r' and replace uses of 'frozen_bus' with 'bus'. Also update the function's comment. This should not change behaviour. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c | 22 -- 1 file changed, 12 insertions(+), 10 deletions(-) diff --g

[PATCH 7/9] powerpc/eeh: Remove always-true tests in eeh_reset_device()

2018-03-05 Thread Sam Bobroff
eeh_reset_device() tests the value of 'bus' more than once but the only caller, eeh_handle_normal_event(), does this test itself and will never pass NULL. So, remove the dead tests. This should not change behaviour. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c |

[PATCH 8/9] powerpc/eeh: Factor out common code eeh_reset_device()

2018-03-05 Thread Sam Bobroff
The caller will always pass NULL for 'rmv_data' when 'eeh_aware_driver' is true, so the first two calls to eeh_pe_dev_traverse() can be combined without changing behaviour as can the two arms of the final 'if' block. This should not change behaviour. Signed-off-by:

[PATCH 9/9] powerpc/eeh: Add eeh_state_active() helper

2018-03-05 Thread Sam Bobroff
Checking for a "fully active" device state requires testing two flag bits, which is open coded in several places, so add a function to do it. Signed-off-by: Sam Bobroff --- arch/powerpc/include/asm/eeh.h | 6 ++ arch/powerpc/kernel/eeh.c
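
The helper described here boils down to a mask-and-compare on two flag bits. A minimal sketch follows. The flag names and bit positions are placeholders invented for illustration (they are not copied from asm/eeh.h), and the function is renamed to make clear it is not the kernel's eeh_state_active().

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder flag bits for illustration; not the kernel's definitions. */
#define SKETCH_STATE_MMIO_ACTIVE	(1 << 0)
#define SKETCH_STATE_DMA_ACTIVE		(1 << 1)

/* True only when both flag bits are set in 'state'. */
static inline bool sketch_state_active(int state)
{
	const int both = SKETCH_STATE_MMIO_ACTIVE | SKETCH_STATE_DMA_ACTIVE;

	return (state & both) == both;
}
```

Comparing against the full mask, rather than testing for a non-zero result, is what distinguishes "both bits set" from "either bit set", and is the detail that is easy to get wrong when the test is open coded.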

Re: [PATCH 9/9] powerpc/eeh: Add eeh_state_active() helper

2018-03-06 Thread Sam Bobroff
On Tue, Mar 06, 2018 at 04:49:48PM +1100, Russell Currey wrote: > On Tue, 2018-03-06 at 11:00 +1100, Sam Bobroff wrote: > > Checking for a "fully active" device state requires testing two flag > > bits, which is open coded in several places, so add a function to do >

[PATCH v2 0/9] EEH refactoring 1

2018-03-18 Thread Sam Bobroff
Rename frozen_bus to bus in eeh_handle_normal_event() Patch 6/9: powerpc/eeh: Clarify arguments to eeh_reset_device() Patch 7/9: powerpc/eeh: Remove always-true tests in eeh_reset_device() Patch 8/9: powerpc/eeh: Factor out common code eeh_reset_device() Patch 9/9: powerpc/eeh: Add eeh_state_active() helper Sa

[PATCH v2 1/9] powerpc/eeh: Remove eeh_handle_event()

2018-03-18 Thread Sam Bobroff
g but obscure the flow of control. So, remove it. Signed-off-by: Sam Bobroff --- arch/powerpc/include/asm/eeh_event.h | 3 ++- arch/powerpc/kernel/eeh_driver.c | 42 +--- arch/powerpc/kernel/eeh_event.c | 4 ++-- 3 files changed, 19 insertions(+), 30 dele

[PATCH v2 2/9] powerpc/eeh: Manage EEH_PE_RECOVERING inside eeh_handle_normal_event()

2018-03-18 Thread Sam Bobroff
ed on an invalid PE, which is now avoided. Signed-off-by: Sam Bobroff --- arch/powerpc/include/asm/eeh_event.h | 2 +- arch/powerpc/kernel/eeh_driver.c | 29 +++-- arch/powerpc/kernel/eeh_event.c | 2 -- 3 files changed, 12 insertions(+), 21 deletions(-) diff

[PATCH v2 3/9] powerpc/eeh: Fix misleading comment in __eeh_addr_cache_get_device()

2018-03-18 Thread Sam Bobroff
Commit "0ba17b05 powerpc/eeh: Remove reference to PCI device" removed a call to pci_dev_get() from __eeh_addr_cache_get_device() but did not update the comment to match. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_cache.c | 3 +-- 1 file changed, 1 insertion(+), 2

[PATCH v2 4/9] powerpc/eeh: Remove misleading test in eeh_handle_normal_event()

2018-03-18 Thread Sam Bobroff
Remove a test that checks if "frozen_bus" is NULL, because it cannot have changed since it was tested at the start of the function and so must be true here. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c | 24 +++- 1 file changed, 11 inserti

[PATCH v2 5/9] powerpc/eeh: Rename frozen_bus to bus in eeh_handle_normal_event()

2018-03-18 Thread Sam Bobroff
The name "frozen_bus" is misleading: it's not necessarily frozen, it's just the PE's PCI bus. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kernel/eeh_drive

[PATCH v2 6/9] powerpc/eeh: Clarify arguments to eeh_reset_device()

2018-03-18 Thread Sam Bobroff
e' and replace uses of 'frozen_bus' with 'bus'. Also update the function's comment. This should not change behaviour. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c | 20 +++- 1 file changed, 11 insertions(+), 9 deletions(-) diff --g

[PATCH v2 7/9] powerpc/eeh: Remove always-true tests in eeh_reset_device()

2018-03-18 Thread Sam Bobroff
eeh_reset_device() tests the value of 'bus' more than once but the only caller, eeh_handle_normal_event(), does this test itself and will never pass NULL. So, remove the dead tests. This should not change behaviour. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c |

[PATCH v2 8/9] powerpc/eeh: Factor out common code eeh_reset_device()

2018-03-18 Thread Sam Bobroff
The caller will always pass NULL for 'rmv_data' when 'eeh_aware_driver' is true, so the first two calls to eeh_pe_dev_traverse() can be combined without changing behaviour as can the two arms of the final 'if' block. This should not change behaviour. Signed-off-

[PATCH v2 9/9] powerpc/eeh: Add eeh_state_active() helper

2018-03-18 Thread Sam Bobroff
Checking for a "fully active" device state requires testing two flag bits, which is open coded in several places, so add a function to do it. Signed-off-by: Sam Bobroff --- arch/powerpc/include/asm/eeh.h | 6 ++ arch/powerpc/kernel/eeh.c

[PATCH v3 8/9] powerpc/eeh: Factor out common code eeh_reset_device()

2018-03-20 Thread Sam Bobroff
The caller will always pass NULL for 'rmv_data' when 'eeh_aware_driver' is true, so the first two calls to eeh_pe_dev_traverse() can be combined without changing behaviour as can the two arms of the final 'if' block. This should not change behaviour. Signed-off-

[PATCH RFC 1/1] KVM: PPC: Book3S HV: pack VCORE IDs to access full VCPU ID space

2018-04-12 Thread Sam Bobroff
rs, without access to the VCPU structure. Signed-off-by: Sam Bobroff --- Hello everyone, I've tested this on P8 and P9, in lots of combinations of host and guest threading modes and it has been fine but it does feel like a "tricky" approach, so I still feel somewhat wary about it.

Re: [PATCH RFC 1/1] KVM: PPC: Book3S HV: pack VCORE IDs to access full VCPU ID space

2018-04-23 Thread Sam Bobroff
On Mon, Apr 23, 2018 at 11:06:35AM +0200, Cédric Le Goater wrote: > On 04/16/2018 06:09 AM, David Gibson wrote: > > On Thu, Apr 12, 2018 at 05:02:06PM +1000, Sam Bobroff wrote: > >> It is not currently possible to create the full number of possible > >> VCPUs (KVM_MAX_V

Re: [PATCH RFC 1/1] KVM: PPC: Book3S HV: pack VCORE IDs to access full VCPU ID space

2018-04-30 Thread Sam Bobroff
On Tue, Apr 24, 2018 at 01:48:25PM +1000, David Gibson wrote: > On Tue, Apr 24, 2018 at 01:19:15PM +1000, Sam Bobroff wrote: > > On Mon, Apr 23, 2018 at 11:06:35AM +0200, Cédric Le Goater wrote: > > > On 04/16/2018 06:09 AM, David Gibson wrote: > > > > On Thu, Apr 12,

[PATCH v2 RFC 1/1] KVM: PPC: Book3S HV: pack VCORE IDs to access full VCPU ID space

2018-05-01 Thread Sam Bobroff
From: Sam Bobroff It is not currently possible to create the full number of possible VCPUs (KVM_MAX_VCPUS) on Power9 with KVM-HV when the guest uses fewer threads per core than its core stride (or "VSMT mode"). This is because the VCORE ID and XIVE offsets to grow beyond KVM

[PATCH 00/13] EEH refactoring 2

2018-05-01 Thread Sam Bobroff
ddition of some useful messaging which should make future maintenance easier (as an example, a recent fix in this area "powerpc/eeh: Fix race with driver un/bind" would have required adding two lines rather than 42+/26-). Cheers, Sam. Sam Bobroff (13): powerpc/eeh: Add eeh_max_freezes to

[PATCH 01/13] powerpc/eeh: Add eeh_max_freezes to initial EEH log line

2018-05-01 Thread Sam Bobroff
. Also remove the embedded newline from the existing message to make it easier to grep for. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c

[PATCH 02/13] powerpc/eeh: Add final message for successful recovery

2018-05-01 Thread Sam Bobroff
Add a single log line at the end of successful EEH recovery, so that it's clear that event processing has finished. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/k

[PATCH 03/13] powerpc/eeh: Fix use-after-release of EEH driver

2018-05-01 Thread Sam Bobroff
at all if it wasn't needed. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c | 28 1 file changed, 16 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c index 07e0a42035ce..54333f6c9d

[PATCH 04/13] powerpc/eeh: Remove unused eeh_pcid_name()

2018-05-01 Thread Sam Bobroff
Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c | 14 -- 1 file changed, 14 deletions(-) diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c index 54333f6c9d67..ca9a73fe9cc5 100644 --- a/arch/powerpc/kernel/eeh_driver.c +++ b/arch/powerpc

[PATCH 05/13] powerpc/eeh: Strengthen types of eeh traversal functions

2018-05-01 Thread Sam Bobroff
The traversal functions eeh_pe_traverse() and eeh_pe_dev_traverse() both provide their first argument as void * but every single user casts it to the expected type. Change the type of the first parameter from void * to the appropriate type, and clean up all uses. Signed-off-by: Sam Bobroff
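As a rough illustration of the type change described above, the sketch below uses a typed pointer as the callback's first parameter so that the callback body needs no cast. Everything here (struct sketch_node, sketch_traverse_fn, sketch_traverse, sketch_match_addr) is hypothetical and uses a simple linked list rather than the kernel's PE structures:

```c
#include <stddef.h>

/* Hypothetical node type standing in for the traversed object. */
struct sketch_node {
	int addr;
	struct sketch_node *next;
};

/* The first parameter is typed, so callbacks do not cast from void *. */
typedef void *(*sketch_traverse_fn)(struct sketch_node *node, void *flag);

/* Call fn on each node; stop early and return the first non-NULL result. */
static void *sketch_traverse(struct sketch_node *head,
			     sketch_traverse_fn fn, void *flag)
{
	struct sketch_node *n;

	for (n = head; n; n = n->next) {
		void *ret = fn(n, flag);

		if (ret)
			return ret;
	}
	return NULL;
}

/* Example callback: return the node whose addr matches *flag. */
static void *sketch_match_addr(struct sketch_node *node, void *flag)
{
	int *wanted = flag;

	return node->addr == *wanted ? node : NULL;
}
```

With the typed parameter, passing a callback written for a different object type becomes a compile-time diagnostic instead of a silent runtime bug.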

[PATCH 06/13] powerpc/eeh: Add message when PE processing at parent

2018-05-01 Thread Sam Bobroff
To aid debugging, add a message to show when EEH processing for a PE will be done at the device's parent, rather than directly at the device. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/k

[PATCH 07/13] powerpc/eeh: Clean up pci_ers_result handling

2018-05-01 Thread Sam Bobroff
ependent. Address this by assigning a priority to each result value, and always merging to the highest priority. This renders the intent clear, and provides a stable value for all orderings. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c | 36 ++--
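The priority-based merge described above might look something like the following sketch. The enum, its values, and the ranking are all invented for illustration (sketch_result, sketch_priority, sketch_merge); they are not the kernel's pci_ers_result values or the ordering chosen by the patch:

```c
/* Hypothetical stand-in for a driver result code; not the kernel's enum. */
enum sketch_result {
	SKETCH_NONE,
	SKETCH_RECOVERED,
	SKETCH_DISCONNECT,
	SKETCH_NEED_RESET,
};

/* Illustrative ranking only: a larger number wins when merging. */
static int sketch_priority(enum sketch_result r)
{
	switch (r) {
	case SKETCH_NONE:       return 1;
	case SKETCH_RECOVERED:  return 2;
	case SKETCH_DISCONNECT: return 3;
	case SKETCH_NEED_RESET: return 4;
	}
	return 0; /* unknown values never override a known one */
}

/* Keep whichever of the two results has the higher priority. */
static enum sketch_result sketch_merge(enum sketch_result cur,
				       enum sketch_result next)
{
	if (sketch_priority(next) > sketch_priority(cur))
		return next;
	return cur;
}
```

Because the merge always keeps the higher-priority value, folding a set of results gives the same answer regardless of the order in which the devices report.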

[PATCH 08/13] powerpc/eeh: Introduce eeh_for_each_pe()

2018-05-01 Thread Sam Bobroff
Add a for_each-style macro for iterating through PEs without the boilerplate required by a traversal function. eeh_pe_next() is now exported, as it is now used directly in place. Signed-off-by: Sam Bobroff --- arch/powerpc/include/asm/eeh.h | 4 arch/powerpc/kernel/eeh_pe.c | 7
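For readers unfamiliar with the pattern, here is a small self-contained sketch of a for_each-style macro built on a "next" function. The tree layout and every name (struct sketch_pe, sketch_pe_next, sketch_for_each_pe, sketch_count_pes) are assumptions made for this example, not the definitions from the patch:

```c
#include <stddef.h>

/* Hypothetical tree node; the real PE structure is not reproduced here. */
struct sketch_pe {
	int addr;
	struct sketch_pe *parent;
	struct sketch_pe *first_child;
	struct sketch_pe *next_sibling;
};

/* Pre-order successor of pe within the subtree rooted at root, or NULL. */
static struct sketch_pe *sketch_pe_next(struct sketch_pe *pe,
					struct sketch_pe *root)
{
	if (pe->first_child)
		return pe->first_child;

	/* Climb until a sibling is found, never leaving the subtree. */
	while (pe != root) {
		if (pe->next_sibling)
			return pe->next_sibling;
		pe = pe->parent;
	}
	return NULL;
}

/* Visit root and every descendant without a callback function. */
#define sketch_for_each_pe(root, pe) \
	for ((pe) = (root); (pe); (pe) = sketch_pe_next((pe), (root)))

/* Usage example: count the nodes in a subtree. */
static int sketch_count_pes(struct sketch_pe *root)
{
	struct sketch_pe *pe;
	int n = 0;

	sketch_for_each_pe(root, pe)
		n++;
	return n;
}
```

Compared with a traversal function that takes a callback, the loop body can use local variables directly and exit with an ordinary break or return.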

[PATCH 09/13] powerpc/eeh: Introduce eeh_edev_actionable()

2018-05-01 Thread Sam Bobroff
The same test is done in every EEH report function, so factor it out. Since eeh_dev_removed() needs to be moved higher up in the file, simplify it a little while we're at it. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c | 30 -- 1 file change

[PATCH 10/13] powerpc/eeh: Introduce eeh_set_channel_state()

2018-05-01 Thread Sam Bobroff
To ease future refactoring, extract setting of the channel state from the report functions out into their own functions. This increases the amount of code that is identical across all of the report functions. Signed-off-by: Sam Bobroff --- arch/powerpc/kernel/eeh_driver.c | 19

[PATCH 11/13] powerpc/eeh: Introduce eeh_set_irq_state()

2018-05-01 Thread Sam Bobroff
To ease future refactoring, extract calls to eeh_enable_irq() and eeh_disable_irq() from the various report functions. This makes the report functions' initial sequences more similar, as well as making the IRQ changes visible when reading eeh_handle_normal_event(). Signed-off-by: Sam Bobroff
