Re: [PATCH 3/6] powerpc/64s: Use htab_convert_pte_flags() in hash__mark_rodata_ro()
Hi Michael,

> In hash__mark_rodata_ro() we pass the raw PP_RXXX value to
> hash__change_memory_range(). That has the effect of setting the key to
> zero, because PP_RXXX contains no key value.
>
> Fix it by using htab_convert_pte_flags(), which knows how to convert a
> pgprot into a pp value, including the key.

So far as I can tell by chasing the definitions around, this appears to
do what it claims to do. So, for what it's worth:

Reviewed-by: Daniel Axtens

Kind regards,
Daniel

> Fixes: d94b827e89dc ("powerpc/book3s64/kuap: Use Key 3 for kernel mapping with hash translation")
> Signed-off-by: Michael Ellerman
> ---
>  arch/powerpc/mm/book3s64/hash_pgtable.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
> index 567e0c6b3978..03819c259f0a 100644
> --- a/arch/powerpc/mm/book3s64/hash_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
> @@ -428,12 +428,14 @@ static bool hash__change_memory_range(unsigned long start, unsigned long end,
>
>  void hash__mark_rodata_ro(void)
>  {
> -	unsigned long start, end;
> +	unsigned long start, end, pp;
>
>  	start = (unsigned long)_stext;
>  	end = (unsigned long)__init_begin;
>
> -	WARN_ON(!hash__change_memory_range(start, end, PP_RXXX));
> +	pp = htab_convert_pte_flags(pgprot_val(PAGE_KERNEL_ROX), HPTE_USE_KERNEL_KEY);
> +
> +	WARN_ON(!hash__change_memory_range(start, end, pp));
>  }
>
>  void hash__mark_initmem_nx(void)
> --
> 2.25.1
Re: [PATCH 2/6] powerpc/pseries: Add key to flags in pSeries_lpar_hpte_updateboltedpp()
Michael Ellerman writes:

> The flags argument to plpar_pte_protect() (aka. H_PROTECT), includes
> the key in bits 9-13, but currently we always set those bits to zero.
>
> In the past that hasn't been a problem because we always used key 0
> for the kernel, and updateboltedpp() is only used for kernel mappings.
>
> However since commit d94b827e89dc ("powerpc/book3s64/kuap: Use Key 3
> for kernel mapping with hash translation") we are now inadvertently
> changing the key (to zero) when we call plpar_pte_protect().
>
> That hasn't broken anything because updateboltedpp() is only used for
> STRICT_KERNEL_RWX, which is currently disabled on 64s due to other
> bugs.
>
> But we want to fix that, so first we need to pass the key correctly to
> plpar_pte_protect(). In the `newpp` value the low 3 bits of the key
> are already in the correct spot, but the high 2 bits of the key need
> to be shifted down.
>
> Fixes: d94b827e89dc ("powerpc/book3s64/kuap: Use Key 3 for kernel mapping with hash translation")
> Signed-off-by: Michael Ellerman
> ---
>  arch/powerpc/platforms/pseries/lpar.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
> index 764170fdb0f7..8bbbddff7226 100644
> --- a/arch/powerpc/platforms/pseries/lpar.c
> +++ b/arch/powerpc/platforms/pseries/lpar.c
> @@ -976,11 +976,13 @@ static void pSeries_lpar_hpte_updateboltedpp(unsigned long newpp,
>  	slot = pSeries_lpar_hpte_find(vpn, psize, ssize);
>  	BUG_ON(slot == -1);
>
> -	flags = newpp & 7;
> +	flags = newpp & (HPTE_R_PP | HPTE_R_N);
>  	if (mmu_has_feature(MMU_FTR_KERNEL_RO))
>  		/* Move pp0 into bit 8 (IBM 55) */
>  		flags |= (newpp & HPTE_R_PP0) >> 55;
>
> +	flags |= ((newpp & HPTE_R_KEY_HI) >> 48) | (newpp & HPTE_R_KEY_LO);
> +

I'm really confused about how these bits are getting packed into the
flags parameter. It seems to match how they are unpacked in
kvmppc_h_pr_protect, but I cannot figure out why they are packed in
that order, and the LoPAR doesn't seem especially illuminating on this
topic - although I may have missed the relevant section.

Kind regards,
Daniel

> 	lpar_rc = plpar_pte_protect(flags, slot, 0);
>
> 	BUG_ON(lpar_rc != H_SUCCESS);
> --
> 2.25.1
[PATCH kernel 2/2] powerpc/iommu: Do not immediately panic when failed IOMMU table allocation
Most platforms allocate IOMMU table structures (specifically it_map) at
boot time, and when this fails it is a valid reason for panic().

However the powernv platform allocates it_map after a device is
returned to the host OS after being passed through, and this happens
long after the host OS booted. It is quite possible to trigger the
it_map allocation panic() and kill the host even though it is not
necessary: the host OS can still use the DMA bypass mode (which
requires a tiny fraction of it_map's memory), and even if that fails,
the host OS is runnable as it was without the device for which
allocating it_map causes the panic.

Instead of immediately crashing in a powernv/ioda2 system, this prints
an error and continues. All other platforms still call panic().

Signed-off-by: Alexey Kardashevskiy
---
 arch/powerpc/kernel/iommu.c               |  6 ++--
 arch/powerpc/platforms/cell/iommu.c       |  3 ++-
 arch/powerpc/platforms/pasemi/iommu.c     |  4 +++-
 arch/powerpc/platforms/powernv/pci-ioda.c | 15 ++++++-----
 arch/powerpc/platforms/pseries/iommu.c    | 10 +++++---
 arch/powerpc/sysdev/dart_iommu.c          |  3 ++-
 6 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 8eb6eb0afa97..c1a5c366a664 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -728,8 +728,10 @@ struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid,
 	sz = BITS_TO_LONGS(tbl->it_size) * sizeof(unsigned long);
 
 	tbl->it_map = vzalloc_node(sz, nid);
-	if (!tbl->it_map)
-		panic("iommu_init_table: Can't allocate %ld bytes\n", sz);
+	if (!tbl->it_map) {
+		pr_err("%s: Can't allocate %ld bytes\n", __func__, sz);
+		return NULL;
+	}
 
 	iommu_table_reserve_pages(tbl, res_start, res_end);
 
diff --git a/arch/powerpc/platforms/cell/iommu.c b/arch/powerpc/platforms/cell/iommu.c
index 2124831cf57c..fa08699aedeb 100644
--- a/arch/powerpc/platforms/cell/iommu.c
+++ b/arch/powerpc/platforms/cell/iommu.c
@@ -486,7 +486,8 @@ cell_iommu_setup_window(struct cbe_iommu *iommu, struct device_node *np,
 	window->table.it_size = size >> window->table.it_page_shift;
 	window->table.it_ops = &cell_iommu_ops;
 
-	iommu_init_table(&window->table, iommu->nid, 0, 0);
+	if (!iommu_init_table(&window->table, iommu->nid, 0, 0))
+		panic("Failed to initialize iommu table");
 
 	pr_debug("\tioid      %d\n", window->ioid);
 	pr_debug("\tblocksize %ld\n", window->table.it_blocksize);
diff --git a/arch/powerpc/platforms/pasemi/iommu.c b/arch/powerpc/platforms/pasemi/iommu.c
index b500a6e47e6b..5be7242fbd86 100644
--- a/arch/powerpc/platforms/pasemi/iommu.c
+++ b/arch/powerpc/platforms/pasemi/iommu.c
@@ -146,7 +146,9 @@ static void iommu_table_iobmap_setup(void)
 	 */
 	iommu_table_iobmap.it_blocksize = 4;
 	iommu_table_iobmap.it_ops = &iommu_table_iobmap_ops;
-	iommu_init_table(&iommu_table_iobmap, 0, 0, 0);
+	if (!iommu_init_table(&iommu_table_iobmap, 0, 0, 0))
+		panic("Failed to initialize iommu table");
+
 	pr_debug(" <- %s\n", __func__);
 }
 
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index f0f901683a2f..66c3c3337334 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1762,7 +1762,8 @@ static void pnv_pci_ioda1_setup_dma_pe(struct pnv_phb *phb,
 	tbl->it_ops = &pnv_ioda1_iommu_ops;
 	pe->table_group.tce32_start = tbl->it_offset << tbl->it_page_shift;
 	pe->table_group.tce32_size = tbl->it_size << tbl->it_page_shift;
-	iommu_init_table(tbl, phb->hose->node, 0, 0);
+	if (!iommu_init_table(tbl, phb->hose->node, 0, 0))
+		panic("Failed to initialize iommu table");
 
 	pe->dma_setup_done = true;
 	return;
@@ -1930,16 +1931,16 @@ static long pnv_pci_ioda2_setup_default_config(struct pnv_ioda_pe *pe)
 		res_start = pe->phb->ioda.m32_pci_base >> tbl->it_page_shift;
 		res_end = min(window_size, SZ_4G) >> tbl->it_page_shift;
 	}
-	iommu_init_table(tbl, pe->phb->hose->node, res_start, res_end);
 
-	rc = pnv_pci_ioda2_set_window(&pe->table_group, 0, tbl);
+	if (iommu_init_table(tbl, pe->phb->hose->node, res_start, res_end))
+		rc = pnv_pci_ioda2_set_window(&pe->table_group, 0, tbl);
+	else
+		rc = -ENOMEM;
 	if (rc) {
-		pe_err(pe, "Failed to configure 32-bit TCE table, err %ld\n",
-				rc);
+		pe_err(pe, "Failed to configure 32-bit TCE table, err %ld\n", rc);
 		iommu_tce_table_put(tbl);
-		return rc;
+		tbl = NULL; /* This clears iommu_table_base below */
 	}
-
 	if (!pnv_iommu_bypass_disabled)
 		pnv_pc
[PATCH kernel 0/2] powerpc/iommu: Stop crashing the host when VM is terminated
Killing a VM on a host under memory pressure can kill the host, which
is annoying. 1/2 reduces the chances, 2/2 eliminates panic() on ioda2.

This is based on sha1 f40ddce88593 Linus Torvalds "Linux 5.11".

Please comment. Thanks.

Alexey Kardashevskiy (2):
  powerpc/iommu: Allocate it_map by vmalloc
  powerpc/iommu: Do not immediately panic when failed IOMMU table
    allocation

 arch/powerpc/kernel/iommu.c               | 19 ++++++++-----
 arch/powerpc/platforms/cell/iommu.c       |  3 ++-
 arch/powerpc/platforms/pasemi/iommu.c     |  4 +++-
 arch/powerpc/platforms/powernv/pci-ioda.c | 15 ++++++-----
 arch/powerpc/platforms/pseries/iommu.c    | 10 +++++---
 arch/powerpc/sysdev/dart_iommu.c          |  3 ++-
 6 files changed, 28 insertions(+), 26 deletions(-)

-- 
2.17.1
[PATCH kernel 1/2] powerpc/iommu: Allocate it_map by vmalloc
The IOMMU table uses the it_map bitmap to keep track of allocated DMA
pages. This has always been a contiguous array allocated at either boot
time or when a passed through device is returned to the host OS. The
it_map memory is allocated by alloc_pages() which allocates contiguous
physical memory. Such an allocation method occasionally creates a
problem when there is no big chunk of memory available (no free memory
or too fragmented). On powernv/ioda2 the default DMA window requires
16MB for it_map.

This replaces alloc_pages_node() with vzalloc_node() which allocates a
contiguous block in virtual memory. This should reduce the chances of
failure but should not cause other behavioral changes as it_map is only
used by the kernel's DMA hooks/api when MMU is on.

Signed-off-by: Alexey Kardashevskiy
---
 arch/powerpc/kernel/iommu.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index c00214a4355c..8eb6eb0afa97 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -719,7 +719,6 @@ struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid,
 {
 	unsigned long sz;
 	static int welcomed = 0;
-	struct page *page;
 	unsigned int i;
 	struct iommu_pool *p;
 
@@ -728,11 +727,9 @@ struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid,
 	/* number of bytes needed for the bitmap */
 	sz = BITS_TO_LONGS(tbl->it_size) * sizeof(unsigned long);
 
-	page = alloc_pages_node(nid, GFP_KERNEL, get_order(sz));
-	if (!page)
+	tbl->it_map = vzalloc_node(sz, nid);
+	if (!tbl->it_map)
 		panic("iommu_init_table: Can't allocate %ld bytes\n", sz);
-	tbl->it_map = page_address(page);
-	memset(tbl->it_map, 0, sz);
 
 	iommu_table_reserve_pages(tbl, res_start, res_end);
 
@@ -774,8 +771,6 @@ struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid,
 
 static void iommu_table_free(struct kref *kref)
 {
-	unsigned long bitmap_sz;
-	unsigned int order;
 	struct iommu_table *tbl;
 
 	tbl = container_of(kref, struct iommu_table, it_kref);
@@ -796,12 +791,8 @@ static void iommu_table_free(struct kref *kref)
 	if (!bitmap_empty(tbl->it_map, tbl->it_size))
 		pr_warn("%s: Unexpected TCEs\n", __func__);
 
-	/* calculate bitmap size in bytes */
-	bitmap_sz = BITS_TO_LONGS(tbl->it_size) * sizeof(unsigned long);
-
 	/* free bitmap */
-	order = get_order(bitmap_sz);
-	free_pages((unsigned long) tbl->it_map, order);
+	vfree(tbl->it_map);
 
 	/* free table */
 	kfree(tbl);
-- 
2.17.1
[PATCH kernel] powerpc/iommu: Annotate nested lock for lockdep
The IOMMU table is divided into pools for concurrent mappings and each
pool has a separate spinlock. When taking the ownership of an IOMMU
group to pass through a device to a VM, we lock these spinlocks which
triggers a false negative warning in lockdep (below).

This fixes it by annotating the large pool's spinlock as a nest lock.

===
WARNING: possible recursive locking detected
5.11.0-le_syzkaller_a+fstn1 #100 Not tainted
qemu-system-ppc/4129 is trying to acquire lock:
c000119bddb0 (&(p->lock)/1){}-{2:2}, at: iommu_take_ownership+0xac/0x1e0

but task is already holding lock:
c000119bdd30 (&(p->lock)/1){}-{2:2}, at: iommu_take_ownership+0xac/0x1e0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(p->lock)/1);
  lock(&(p->lock)/1);
===

Signed-off-by: Alexey Kardashevskiy
---
 arch/powerpc/kernel/iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 557a09dd5b2f..2ee642a6731a 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -1089,7 +1089,7 @@ int iommu_take_ownership(struct iommu_table *tbl)
 	spin_lock_irqsave(&tbl->large_pool.lock, flags);
 	for (i = 0; i < tbl->nr_pools; i++)
-		spin_lock(&tbl->pools[i].lock);
+		spin_lock_nest_lock(&tbl->pools[i].lock, &tbl->large_pool.lock);
 
 	iommu_table_release_pages(tbl);
-- 
2.17.1
Re: [PATCH kernel] powerpc/perf: Stop crashing with generic_compat_pmu
On 03/12/2020 16:27, Madhavan Srinivasan wrote:
> On 12/2/20 8:31 AM, Alexey Kardashevskiy wrote:
>> Hi Maddy,
>>
>> I just noticed that I still have "powerpc/perf: Add checks for
>> reserved values" in my pile (pushed here
>> https://github.com/aik/linux/commit/61e1bc3f2e19d450e2e2d39174d422160b21957b ),
>> do we still need it? The lockups I saw were fixed by
>> https://github.com/aik/linux/commit/17899eaf88d689 but it is hardly
>> a replacement.
>
> Thanks, sorry missed this. Will look at this again. Since we will need
> generation specific checks for the reserve field.

So any luck with this?

Cheers,
Maddy

>> On 04/06/2020 02:34, Madhavan Srinivasan wrote:
>>> On 6/2/20 8:26 AM, Alexey Kardashevskiy wrote:
>>>> The bhrb_filter_map ("The Branch History Rolling Buffer") callback
>>>> is only defined in raw CPUs' power_pmu structs. The "architected"
>>>> CPUs use generic_compat_pmu which does not have this callback, and
>>>> crashes occur.
>>>>
>>>> This adds a NULL pointer check for bhrb_filter_map() which behaves
>>>> as if the callback returned an error. This does not add the same
>>>> check for config_bhrb() as the only caller checks for
>>>> cpuhw->bhrb_users which remains zero if bhrb_filter_map==0.
>>>
>>> Changes looks fine.
>>> Reviewed-by: Madhavan Srinivasan
>>>
>>> The commit be80e758d0c2e ('powerpc/perf: Add generic compat mode pmu
>>> driver') which introduced generic_compat_pmu was merged in v5.2.
>>> So we need to CC stable starting from 5.2 :( .
>>
>> My bad, sorry.
>>
>>> Maddy
>>>
>>>> Signed-off-by: Alexey Kardashevskiy
>>>> ---
>>>>  arch/powerpc/perf/core-book3s.c | 19 ++++++++++++++-----
>>>>  1 file changed, 14 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
>>>> index 3dcfecf858f3..36870569bf9c 100644
>>>> --- a/arch/powerpc/perf/core-book3s.c
>>>> +++ b/arch/powerpc/perf/core-book3s.c
>>>> @@ -1515,9 +1515,16 @@ static int power_pmu_add(struct perf_event *event, int ef_flags)
>>>>  	ret = 0;
>>>>   out:
>>>>  	if (has_branch_stack(event)) {
>>>> -		power_pmu_bhrb_enable(event);
>>>> -		cpuhw->bhrb_filter = ppmu->bhrb_filter_map(
>>>> -					event->attr.branch_sample_type);
>>>> +		u64 bhrb_filter = -1;
>>>> +
>>>> +		if (ppmu->bhrb_filter_map)
>>>> +			bhrb_filter = ppmu->bhrb_filter_map(
>>>> +					event->attr.branch_sample_type);
>>>> +
>>>> +		if (bhrb_filter != -1) {
>>>> +			cpuhw->bhrb_filter = bhrb_filter;
>>>> +			power_pmu_bhrb_enable(event); /* Does bhrb_users++ */
>>>> +		}
>>>>  	}
>>>>
>>>>  	perf_pmu_enable(event->pmu);
>>>> @@ -1839,7 +1846,6 @@ static int power_pmu_event_init(struct perf_event *event)
>>>>  	int n;
>>>>  	int err;
>>>>  	struct cpu_hw_events *cpuhw;
>>>> -	u64 bhrb_filter;
>>>>
>>>>  	if (!ppmu)
>>>>  		return -ENOENT;
>>>> @@ -1945,7 +1951,10 @@ static int power_pmu_event_init(struct perf_event *event)
>>>>  	err = power_check_constraints(cpuhw, events, cflags, n + 1);
>>>>
>>>>  	if (has_branch_stack(event)) {
>>>> -		bhrb_filter = ppmu->bhrb_filter_map(
>>>> +		u64 bhrb_filter = -1;
>>>> +
>>>> +		if (ppmu->bhrb_filter_map)
>>>> +			bhrb_filter = ppmu->bhrb_filter_map(
>>>>  				event->attr.branch_sample_type);
>>>>
>>>>  		if (bhrb_filter == -1) {

--
Alexey
Re: [PATCH v18 03/11] of: Add a common kexec FDT setup function
Lakshmi Ramasubramanian writes:

> From: Rob Herring
>
> Both arm64 and powerpc do essentially the same FDT /chosen setup for
> kexec. The differences are either omissions that arm64 should have
> or additional properties that will be ignored. The setup code can be
> combined and shared by both powerpc and arm64.
>
> The differences relative to the arm64 version:
> - If /chosen doesn't exist, it will be created (should never happen).
> - Any old dtb and initrd reserved memory will be released.
> - The new initrd and elfcorehdr are marked reserved.
> - "linux,booted-from-kexec" is set.
>
> The differences relative to the powerpc version:
> - "kaslr-seed" and "rng-seed" may be set.
> - "linux,elfcorehdr" is set.
> - Any existing "linux,usable-memory-range" is removed.
>
> Combine the code for setting up the /chosen node in the FDT and
> updating the memory reservation for kexec, for powerpc and arm64, in
> of_kexec_alloc_and_setup_fdt() and move it to "drivers/of/kexec.c".
>
> Signed-off-by: Rob Herring
> Signed-off-by: Lakshmi Ramasubramanian
> ---
>  drivers/of/Makefile |   6 +
>  drivers/of/kexec.c  | 265 ++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/of.h  |   5 +
>  3 files changed, 276 insertions(+)
>  create mode 100644 drivers/of/kexec.c

Reviewed-by: Thiago Jung Bauermann

-- 
Thiago Jung Bauermann
IBM Linux Technology Center
Re: [PATCH v18 06/11] powerpc: Move ima buffer fields to struct kimage
Lakshmi Ramasubramanian writes:

> The fields ima_buffer_addr and ima_buffer_size in "struct kimage_arch"
> for powerpc are used to carry forward the IMA measurement list across
> the kexec system call. These fields are not architecture specific, but
> are currently limited to powerpc.
>
> arch_ima_add_kexec_buffer() defined in "arch/powerpc/kexec/ima.c"
> sets ima_buffer_addr and ima_buffer_size for the kexec system call.
> This function does not have architecture specific code, but is
> currently limited to powerpc.
>
> Move ima_buffer_addr and ima_buffer_size to "struct kimage".
> Set ima_buffer_addr and ima_buffer_size in ima_add_kexec_buffer()
> in security/integrity/ima/ima_kexec.c.
>
> Co-developed-by: Prakhar Srivastava
> Signed-off-by: Prakhar Srivastava
> Signed-off-by: Lakshmi Ramasubramanian
> Suggested-by: Will Deacon
> ---
>  arch/powerpc/include/asm/ima.h     |  3 ---
>  arch/powerpc/include/asm/kexec.h   |  5 -----
>  arch/powerpc/kexec/ima.c           | 29 ++---------------------------
>  include/linux/kexec.h              |  3 +++
>  security/integrity/ima/ima_kexec.c |  8 ++++++--
>  5 files changed, 11 insertions(+), 37 deletions(-)

Reviewed-by: Thiago Jung Bauermann

-- 
Thiago Jung Bauermann
IBM Linux Technology Center
Re: [PATCH v18 05/11] powerpc: Use common of_kexec_alloc_and_setup_fdt()
Lakshmi Ramasubramanian writes:

> From: Rob Herring
>
> The code for setting up the /chosen node in the device tree
> and updating the memory reservation for the next kernel has been
> moved to of_kexec_alloc_and_setup_fdt() defined in "drivers/of/kexec.c".
>
> Use the common of_kexec_alloc_and_setup_fdt() to setup the device tree
> and update the memory reservation for kexec for powerpc.
>
> Signed-off-by: Rob Herring
> Signed-off-by: Lakshmi Ramasubramanian
> ---
>  arch/powerpc/include/asm/kexec.h  |   1 +
>  arch/powerpc/kexec/elf_64.c       |  30 ++++---
>  arch/powerpc/kexec/file_load.c    | 132 +-----------------------------
>  arch/powerpc/kexec/file_load_64.c |   3 +
>  4 files changed, 26 insertions(+), 140 deletions(-)

Reviewed-by: Thiago Jung Bauermann

-- 
Thiago Jung Bauermann
IBM Linux Technology Center
Re: [PATCH v18 04/11] arm64: Use common of_kexec_alloc_and_setup_fdt()
Lakshmi Ramasubramanian writes:

> From: Rob Herring
>
> The code for setting up the /chosen node in the device tree
> and updating the memory reservation for the next kernel has been
> moved to of_kexec_alloc_and_setup_fdt() defined in "drivers/of/kexec.c".
>
> Use the common of_kexec_alloc_and_setup_fdt() to setup the device tree
> and update the memory reservation for kexec for arm64.
>
> Signed-off-by: Rob Herring
> Signed-off-by: Lakshmi Ramasubramanian
> ---
>  arch/arm64/kernel/machine_kexec_file.c | 180 ++---------------------
>  1 file changed, 8 insertions(+), 172 deletions(-)

Reviewed-by: Thiago Jung Bauermann

-- 
Thiago Jung Bauermann
IBM Linux Technology Center
Re: [PATCH v18 01/11] powerpc: Rename kexec elfcorehdr_addr to elf_load_addr
Lakshmi Ramasubramanian writes:

> From: Rob Herring
>
> The architecture specific field, elfcorehdr_addr in struct kimage_arch,
> that holds the address of the buffer in memory for the ELF core header
> for powerpc has a different name than the one used for x86_64. This
> makes it hard to have common code for setting up the device tree for
> the kexec system call.
>
> Rename elfcorehdr_addr to elf_load_addr to align with the x86_64 name
> so common code can use it.
>
> Signed-off-by: Rob Herring
> Reviewed-by: Lakshmi Ramasubramanian
> ---
>  arch/powerpc/include/asm/kexec.h  | 2 +-
>  arch/powerpc/kexec/file_load.c    | 4 ++--
>  arch/powerpc/kexec/file_load_64.c | 4 ++--
>  3 files changed, 5 insertions(+), 5 deletions(-)

Reviewed-by: Thiago Jung Bauermann

-- 
Thiago Jung Bauermann
IBM Linux Technology Center
Re: [PATCH v18 02/11] arm64: Rename kexec elf_headers_mem to elf_load_addr
Lakshmi Ramasubramanian writes:

> The architecture specific field, elf_headers_mem in struct kimage_arch,
> that holds the address of the buffer in memory for the ELF core header
> for arm64 has a different name than the one used for powerpc. This
> makes it hard to have common code for setting up the device tree for
> the kexec system call.
>
> Rename elf_headers_mem to elf_load_addr to align with the powerpc name
> so common code can use it.
>
> Signed-off-by: Lakshmi Ramasubramanian
> Suggested-by: Thiago Jung Bauermann
> ---
>  arch/arm64/include/asm/kexec.h         | 2 +-
>  arch/arm64/kernel/machine_kexec_file.c | 6 +++---
>  2 files changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Thiago Jung Bauermann

-- 
Thiago Jung Bauermann
IBM Linux Technology Center
Re: [PATCH 1/4] add generic builtin command line
On Thu, 2019-03-21 at 15:15 -0700, Andrew Morton wrote:
> On Thu, 21 Mar 2019 08:13:08 -0700 Daniel Walker wrote:
>> On Wed, Mar 20, 2019 at 08:14:33PM -0700, Andrew Morton wrote:
>>> The patches (or some version of them) are already in linux-next,
>>> which messes me up. I'll disable them for now.
>>
>> Those are from my tree, but I removed them when you picked up the
>> series. The next linux-next should not have them.
>
> Yup, thanks, all looks good now.

This patchset is currently neither in mainline nor in -next. May I ask
what happened to it?

Thanks.
Re: [PATCH 1/4] ibmvfc: simplify handling of sub-CRQ initialization
Reviewed-by: Brian King

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center
Re: [PATCH for 5.10] powerpc/32: Preserve cr1 in exception prolog stack check to fix build error
On Fri, Feb 12, 2021 at 08:57:14AM +, Christophe Leroy wrote:
> This is a backport of 3642eb21256a ("powerpc/32: Preserve cr1 in
> exception prolog stack check to fix build error") for kernel 5.10.
>
> It fixes the build failure on v5.10 reported by kernel test robot
> and by David Michael.
>
> This fix is not in the Linux tree yet, it is in the next branch in the
> powerpc tree.

Then there's nothing I can do about it until that happens :(
Re: [PATCH v4 1/3] powerpc/book3s64/radix/tlb: tlbie primitives for process-scoped invalidations from guests
Hi Bharata,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on kvm/linux-next]
[also build test ERROR on v5.11]
[cannot apply to powerpc/next next-20210212]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Bharata-B-Rao/Support-for-H_RPT_INVALIDATE-in-PowerPC-KVM/20210215-143815
base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next
config: powerpc64-randconfig-r005-20210215 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c9439ca36342fb6013187d0a69aef92736951476)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc64 cross compiling tool for clang build
        # apt-get install binutils-powerpc64-linux-gnu
        # https://github.com/0day-ci/linux/commit/2a2c1320dc2bc67ec962721c39e7639cc1abfa9d
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Bharata-B-Rao/Support-for-H_RPT_INVALIDATE-in-PowerPC-KVM/20210215-143815
        git checkout 2a2c1320dc2bc67ec962721c39e7639cc1abfa9d
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot

All errors (new ones prefixed by >>):

>> arch/powerpc/mm/book3s64/radix_tlb.c:399:20: error: unused function '_tlbie_pid_lpid' [-Werror,-Wunused-function]
   static inline void _tlbie_pid_lpid(unsigned long pid, unsigned long lpid,
                      ^
>> arch/powerpc/mm/book3s64/radix_tlb.c:643:20: error: unused function '_tlbie_va_range_lpid' [-Werror,-Wunused-function]
   static inline void _tlbie_va_range_lpid(unsigned long start, unsigned long end,
                      ^
   2 errors generated.

vim +/_tlbie_pid_lpid +399 arch/powerpc/mm/book3s64/radix_tlb.c

   398	
 > 399	static inline void _tlbie_pid_lpid(unsigned long pid, unsigned long lpid,
   400					   unsigned long ric)
   401	{
   402		asm volatile("ptesync" : : : "memory");
   403	
   404		/*
   405		 * Workaround the fact that the "ric" argument to __tlbie_pid
   406		 * must be a compile-time contraint to match the "i" constraint
   407		 * in the asm statement.
   408		 */
   409		switch (ric) {
   410		case RIC_FLUSH_TLB:
   411			__tlbie_pid_lpid(pid, lpid, RIC_FLUSH_TLB);
   412			fixup_tlbie_pid_lpid(pid, lpid);
   413			break;
   414		case RIC_FLUSH_PWC:
   415			__tlbie_pid_lpid(pid, lpid, RIC_FLUSH_PWC);
   416			break;
   417		case RIC_FLUSH_ALL:
   418		default:
   419			__tlbie_pid_lpid(pid, lpid, RIC_FLUSH_ALL);
   420			fixup_tlbie_pid_lpid(pid, lpid);
   421		}
   422		asm volatile("eieio; tlbsync; ptesync" : : : "memory");
   423	}
   424	struct tlbiel_pid {
   425		unsigned long pid;
   426		unsigned long ric;
   427	};
   428	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
Re: [PATCH 00/27] arch: syscalls: unifiy all syscalltbl.sh into scripts/syscalltbl.sh
On Thu, Jan 28, 2021 at 9:51 AM Masahiro Yamada wrote:
>
> As of v5.11-rc1, 12 architectures duplicate similar shell scripts:
>
> $ find arch -name syscalltbl.sh | sort
> arch/alpha/kernel/syscalls/syscalltbl.sh
> arch/arm/tools/syscalltbl.sh
> arch/ia64/kernel/syscalls/syscalltbl.sh
> arch/m68k/kernel/syscalls/syscalltbl.sh
> arch/microblaze/kernel/syscalls/syscalltbl.sh
> arch/mips/kernel/syscalls/syscalltbl.sh
> arch/parisc/kernel/syscalls/syscalltbl.sh
> arch/powerpc/kernel/syscalls/syscalltbl.sh
> arch/sh/kernel/syscalls/syscalltbl.sh
> arch/sparc/kernel/syscalls/syscalltbl.sh
> arch/x86/entry/syscalls/syscalltbl.sh
> arch/xtensa/kernel/syscalls/syscalltbl.sh
>
> This patch set unifies all of them into a single file,
> scripts/syscalltbl.sh.
>
> The code-diff is attractive:
>
>  51 files changed, 254 insertions(+), 674 deletions(-)
>  delete mode 100644 arch/alpha/kernel/syscalls/syscalltbl.sh
>  delete mode 100644 arch/arm/tools/syscalltbl.sh
>  delete mode 100644 arch/ia64/kernel/syscalls/syscalltbl.sh
>  delete mode 100644 arch/m68k/kernel/syscalls/syscalltbl.sh
>  delete mode 100644 arch/microblaze/kernel/syscalls/syscalltbl.sh
>  delete mode 100644 arch/mips/kernel/syscalls/syscalltbl.sh
>  delete mode 100644 arch/parisc/kernel/syscalls/syscalltbl.sh
>  delete mode 100644 arch/powerpc/kernel/syscalls/syscalltbl.sh
>  delete mode 100644 arch/sh/kernel/syscalls/syscalltbl.sh
>  delete mode 100644 arch/sparc/kernel/syscalls/syscalltbl.sh
>  delete mode 100644 arch/x86/entry/syscalls/syscalltbl.sh
>  delete mode 100644 arch/xtensa/kernel/syscalls/syscalltbl.sh
>  create mode 100644 scripts/syscalltbl.sh
>
> Also, this includes Makefile fixes, and some x86 fixes and cleanups.
>
> My question is, how to merge this series.
>
> I am touching all architectures, but the first patch is a prerequisite
> of the rest of this series.
>
> One possibility is to ask the x86 maintainers to pick up the first 5
> patches for v5.12-rc1, and then send the rest for v5.13-rc1,
> splitting per-arch.
>
> I want the x86 maintainers to check the first 5 patches because
> I cleaned up the x32 code.

Never mind. Sending too big a patch set tends to fail. I will apply the
generic script parts to my tree, then split the rest per arch in the
next development cycle (aiming for v5.13-rc1).

> I know x32 was considered for deprecation, but my motivation is to
> clean up scripts across the tree without changing the functionality.
>
> Masahiro Yamada (27):
>   scripts: add generic syscalltbl.sh
>   x86/syscalls: fix -Wmissing-prototypes warnings from COND_SYSCALL()
>   x86/build: add missing FORCE and fix 'targets' to make if_changed work
>   x86/entry/x32: rename __x32_compat_sys_* to __x64_compat_sys_*
>   x86/syscalls: switch to generic syscalltbl.sh
>   ARM: syscalls: switch to generic syscalltbl.sh
>   alpha: add missing FORCE and fix 'targets' to make if_changed work
>   alpha: syscalls: switch to generic syscalltbl.sh
>   ia64: add missing FORCE and fix 'targets' to make if_changed work
>   ia64: syscalls: switch to generic syscalltbl.sh
>   m68k: add missing FORCE and fix 'targets' to make if_changed work
>   m68k: syscalls: switch to generic syscalltbl.sh
>   microblaze: add missing FORCE and fix 'targets' to make if_changed work
>   microblaze: syscalls: switch to generic syscalltbl.sh
>   mips: add missing FORCE and fix 'targets' to make if_changed work
>   mips: syscalls: switch to generic syscalltbl.sh
>   parisc: add missing FORCE and fix 'targets' to make if_changed work
>   parisc: syscalls: switch to generic syscalltbl.sh
>   sh: add missing FORCE and fix 'targets' to make if_changed work
>   sh: syscalls: switch to generic syscalltbl.sh
>   sparc: remove wrong comment from arch/sparc/include/asm/Kbuild
>   sparc: add missing FORCE and fix 'targets' to make if_changed work
>   sparc: syscalls: switch to generic syscalltbl.sh
>   powerpc: add missing FORCE and fix 'targets' to make if_changed work
>   powerpc: syscalls: switch to generic syscalltbl.sh
>   xtensa: add missing FORCE and fix 'targets' to make if_changed work
>   xtensa: syscalls: switch to generic syscalltbl.sh
>
>  arch/alpha/kernel/syscalls/Makefile           | 18 +++----
>  arch/alpha/kernel/syscalls/syscalltbl.sh      | 32 ---------
>  arch/alpha/kernel/systbls.S                   |  3 +-
>  arch/arm/kernel/entry-common.S                |  8 +--
>  arch/arm/tools/Makefile                       |  9 ++--
>  arch/arm/tools/syscalltbl.sh                  | 22 --------
>  arch/ia64/kernel/entry.S                      |  3 +-
>  arch/ia64/kernel/syscalls/Makefile            | 19 +++----
>  arch/ia64/kernel/syscalls/syscalltbl.sh       | 32 ---------
>  arch/m68k/kernel/syscalls/Makefile            | 18 +++----
>  arch/m68k/kernel/syscalls/syscalltbl.sh       | 32 ---------
>  arch/m68k/kernel/syscalltable.S               |  3 +-
>  arch/microblaze/kernel/syscall_table.S        |  3 +-
>  arch/microblaze/kern
[PATCH v2] powerpc/pseries: Don't enforce MSI affinity with kdump
Depending on the number of online CPUs in the original kernel, it is
likely for CPU #0 to be offline in a kdump kernel. The associated IRQs
in the affinity mappings provided by irq_create_affinity_masks() are
thus not started by irq_startup(), as per-design with managed IRQs.

This can be a problem with multi-queue block devices driven by blk-mq:
such a non-started IRQ is very likely paired with the single queue
enforced by blk-mq during kdump (see blk_mq_alloc_tag_set()). This
causes the device to remain silent and likely hangs the guest at some
point.

This is a regression caused by commit 9ea69a55b3b9 ("powerpc/pseries:
Pass MSI affinity to irq_create_mapping()"). Note that this only
happens with the XIVE interrupt controller because XICS has a
workaround to bypass affinity, which is activated during kdump with the
"noirqdistrib" kernel parameter.

The issue comes from a combination of factors:
- discrepancy between the number of queues detected by the multi-queue
  block driver, that was used to create the MSI vectors, and the single
  queue mode enforced later on by blk-mq because of kdump (i.e. keeping
  all queues fixes the issue)
- CPU #0 offline (i.e. kdump always succeeds with CPU #0)

Given that I couldn't reproduce on x86, which seems to always have CPU
#0 online even during kdump, I'm not sure where this should be fixed.
Hence going for another approach: fine-grained affinity is for
performance and we don't really care about that during kdump. Simply
revert to the previous working behavior of ignoring affinity masks in
this case only.

Fixes: 9ea69a55b3b9 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()")
Cc: lviv...@redhat.com
Cc: sta...@vger.kernel.org
Reviewed-by: Laurent Vivier
Reviewed-by: Cédric Le Goater
Signed-off-by: Greg Kurz
---
v2: - added missing #include

 arch/powerpc/platforms/pseries/msi.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
index b3ac2455faad..637300330507 100644
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@ -4,6 +4,7 @@
  * Copyright 2006-2007 Michael Ellerman, IBM Corp.
  */
 
+#include
 #include
 #include
 #include
@@ -458,8 +459,28 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
 			return hwirq;
 		}
 
-		virq = irq_create_mapping_affinity(NULL, hwirq,
-						   entry->affinity);
+		/*
+		 * Depending on the number of online CPUs in the original
+		 * kernel, it is likely for CPU #0 to be offline in a kdump
+		 * kernel. The associated IRQs in the affinity mappings
+		 * provided by irq_create_affinity_masks() are thus not
+		 * started by irq_startup(), as per-design for managed IRQs.
+		 * This can be a problem with multi-queue block devices driven
+		 * by blk-mq : such a non-started IRQ is very likely paired
+		 * with the single queue enforced by blk-mq during kdump (see
+		 * blk_mq_alloc_tag_set()). This causes the device to remain
+		 * silent and likely hangs the guest at some point.
+		 *
+		 * We don't really care for fine-grained affinity when doing
+		 * kdump actually : simply ignore the pre-computed affinity
+		 * masks in this case and let the default mask with all CPUs
+		 * be used when creating the IRQ mappings.
+		 */
+		if (is_kdump_kernel())
+			virq = irq_create_mapping(NULL, hwirq);
+		else
+			virq = irq_create_mapping_affinity(NULL, hwirq,
+							   entry->affinity);
 		if (!virq) {
 			pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
-- 
2.26.2