Re: [PATCH v27 1/9] memblock: add memblock_cap_memory_range()
Will, On Wed, Nov 16, 2016 at 04:30:15PM +, Will Deacon wrote: > Hi Akashi, > > On Mon, Nov 14, 2016 at 02:55:16PM +0900, AKASHI Takahiro wrote: > > On Fri, Nov 11, 2016 at 11:19:04AM +0800, Dennis Chen wrote: > > > On Fri, Nov 11, 2016 at 11:50:50AM +0900, AKASHI Takahiro wrote: > > > > On Thu, Nov 10, 2016 at 05:27:20PM +, Will Deacon wrote: > > > > > On Wed, Nov 02, 2016 at 01:51:53PM +0900, AKASHI Takahiro wrote: > > > > > > +void __init memblock_cap_memory_range(phys_addr_t base, > > > > > > phys_addr_t size) > > > > > > +{ > > > > > > + int start_rgn, end_rgn; > > > > > > + int i, ret; > > > > > > + > > > > > > + if (!size) > > > > > > + return; > > > > > > + > > > > > > + ret = memblock_isolate_range(&memblock.memory, base, size, > > > > > > + &start_rgn, &end_rgn); > > > > > > + if (ret) > > > > > > + return; > > > > > > + > > > > > > + /* remove all the MAP regions */ > > > > > > + for (i = memblock.memory.cnt - 1; i >= end_rgn; i--) > > > > > > + if (!memblock_is_nomap(&memblock.memory.regions[i])) > > > > > > + memblock_remove_region(&memblock.memory, i); > > > > > > + > > > > > > + for (i = start_rgn - 1; i >= 0; i--) > > > > > > + if (!memblock_is_nomap(&memblock.memory.regions[i])) > > > > > > + memblock_remove_region(&memblock.memory, i); > > > > > > + > > > > > > + /* truncate the reserved regions */ > > > > > > + memblock_remove_range(&memblock.reserved, 0, base); > > > > > > + memblock_remove_range(&memblock.reserved, > > > > > > + base + size, (phys_addr_t)ULLONG_MAX); > > > > > > +} > > > > > > > > > > This duplicates a bunch of the logic in > > > > > memblock_mem_limit_remove_map. Can > > > > > you not implement that in terms of your new, more general, function? > > > > > e.g. > > > > > by passing base == 0, and size == limit? > > > > > > > > Obviously it's possible. > > > > I actually talked to Dennis before about merging them, > > > > but he was against my idea. > > > > > > > Oops! I thought we have reached agreement in the > > > thread:http://lists.infradead.org/pipermail/linux-arm-kernel/2016-July/442817.html > > > So feel free to do that as Will'll do > > > > OK, but I found that the two functions have a bit different semantics > > in clipping memory range, in particular, when the range [base,base+size) > > goes across several regions with a gap. > > (This does not happen in my arm64 kdump, though.) > > That is, 'limit' in memblock_mem_limit_remove_map() means total size of > > available memory, while 'size' in memblock_cap_memory_range() indicates > > the size of _continuous_ memory range. > > I thought limit was just a physical address, and then No, it's not. > memblock_mem_limit_remove_map operated on the end of the nearest memblock? No, but "max_addr" returned by __find_max_addr() is a physical address and the end address of memory of "limit" size in total. > You could leave the __find_max_addr call in memblock_mem_limit_remove_map, > given that I don't think you need/want it for memblock_cap_memory_range. > > > So I added an extra argument, exact, to a common function to specify > > distinct behaviors. Confusing? Please see the patch below. > > Oh yikes, this certainly wasn't what I had in mind! My observation was > just that memblock_mem_limit_remove_map(limit) does: > > > 1. memblock_isolate_range(limit - limit+ULLONG_MAX) > 2. memblock_remove_region(all non-nomap regions in the isolated region) > 3. truncate reserved regions to limit > > and your memblock_cap_memory_range(base, size) does: > > 1. memblock_isolate_range(base - base+size) > 2, memblock_remove_region(all non-nomap regions above and below the > isolated region) > 3. truncate reserved regions around the isolated region > > so, assuming we can invert the isolation in one of the cases, then they > could share the same underlying implementation. Please see my simplified patch below which would explain what I meant. (Note that the size is calculated by 'max_addr - 0'.) > I'm probably just missing something here, because the patch you've ended > up with is far more involved than I anticipated... I hope that it will meet almost your anticipation. Thanks, -Takahiro AKASHI > > Will ===8<=== diff --git a/mm/memblock.c b/mm/memblock.c index 7608bc3..fea1688 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -1514,11 +1514,37 @@ void __init memblock_enforce_memory_limit(phys_addr_t limit) (phys_addr_t)ULLONG_MAX); } +void __init memblock_cap_memory_range(phys_addr_t base, phys_addr_t size) +{ + int start_rgn, end_rgn; + int i, ret; + + if (!size) + return; + + ret = memblock_isolate_range(&memblock.memory, base, size, + &start_rgn, &end_rgn); + if (ret) + return; + + /* remov
Re: [PATCH Makedumpfile 00/10] arm64 cleanup and kaslr support
On Wednesday 16 November 2016 02:10 PM, Atsushi Kumagai wrote: As you wish. I can send v2 (there is no difference other than conflict >resolution for rebasing on top of current devel, conflict is minor), >or you can pick them from >https://github.com/pratyushanand/makedumpfile.git : arm64_devel. Sure, I'll pick them (from 60352bd to 3b5faac, right?) up from your devel branch, thanks. Yes, Correct. Thanks ~Pratyush ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH] iommu/vt-d: Flush old iotlb for kdump when the device gets context mapped
On 2016/11/16 at 22:58, Myron Stowe wrote: > On Wed, Nov 16, 2016 at 2:13 AM, Xunlei Pang wrote: >> Ccing David >> On 2016/11/16 at 17:02, Xunlei Pang wrote: >>> We met the DMAR fault both on hpsa P420i and P421 SmartArray controllers >>> under kdump, it can be steadily reproduced on several different machines, >>> the dmesg log is like: >>> HP HPSA Driver (v 3.4.16-0) >>> hpsa :02:00.0: using doorbell to reset controller >>> hpsa :02:00.0: board ready after hard reset. >>> hpsa :02:00.0: Waiting for controller to respond to no-op >>> DMAR: Setting identity map for device :02:00.0 [0xe8000 - 0xe8fff] >>> DMAR: Setting identity map for device :02:00.0 [0xf4000 - 0xf4fff] >>> DMAR: Setting identity map for device :02:00.0 [0xbdf6e000 - 0xbdf6efff] >>> DMAR: Setting identity map for device :02:00.0 [0xbdf6f000 - 0xbdf7efff] >>> DMAR: Setting identity map for device :02:00.0 [0xbdf7f000 - 0xbdf82fff] >>> DMAR: Setting identity map for device :02:00.0 [0xbdf83000 - 0xbdf84fff] >>> DMAR: DRHD: handling fault status reg 2 >>> DMAR: [DMA Read] Request device [02:00.0] fault addr f000 [fault reason >>> 06] PTE Read access is not set >>> hpsa :02:00.0: controller message 03:00 timed out >>> hpsa :02:00.0: no-op failed; re-trying >>> >>> After some debugging, we found that the corresponding pte entry value >>> is correct, and the value of the iommu caching mode is 0, the fault is >>> probably due to the old iotlb cache of the in-flight DMA. >>> >>> Thus need to flush the old iotlb after context mapping is setup for the >>> device, where the device is supposed to finish reset at its driver probe >>> stage and no in-flight DMA exists hereafter. >>> >>> With this patch, all our problematic machines can survive the kdump tests. >>> >>> CC: Myron Stowe >>> CC: Don Brace >>> CC: Baoquan He >>> CC: Dave Young >>> Tested-by: Joseph Szczypek >>> Signed-off-by: Xunlei Pang >>> --- >>> drivers/iommu/intel-iommu.c | 11 +-- >>> 1 file changed, 9 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c >>> index 3965e73..eb79288 100644 >>> --- a/drivers/iommu/intel-iommu.c >>> +++ b/drivers/iommu/intel-iommu.c >>> @@ -2067,9 +2067,16 @@ static int domain_context_mapping_one(struct >>> dmar_domain *domain, >>>* It's a non-present to present mapping. If hardware doesn't cache >>>* non-present entry we only need to flush the write-buffer. If the >>>* _does_ cache non-present entries, then it does so in the special > If this does get accepted then we should fix the above grammar also - > "If the _does_ cache ..." -> "If the hardware _does_ cache ..." Yes, but this reminds me of something. As per the comment, the code here only needs to flush context caches for the special domain 0 which is used to tag the non-present/erroneous caches, seems we should flush the old domain id of present entries for kdump according to the analysis, other than the new-allocated domain id. Let me ponder more on this. Regards, Xunlei > >>> - * domain #0, which we have to flush: >>> + * domain #0, which we have to flush. >>> + * >>> + * For kdump cases, present entries may be cached due to the in-flight >>> + * DMA and copied old pgtable, but there is no unmapping behaviour for >>> + * them, so we need an explicit iotlb flush for the newly-mapped >>> device. >>> + * For kdump, at this point, the device is supposed to finish reset at >>> + * the driver probe stage, no in-flight DMA will exist, thus we do not >>> + * need to worry about that anymore hereafter. >>>*/ >>> - if (cap_caching_mode(iommu->cap)) { >>> + if (is_kdump_kernel() || cap_caching_mode(iommu->cap)) { >>> iommu->flush.flush_context(iommu, 0, >>> (((u16)bus) << 8) | devfn, >>> DMA_CCMD_MASK_NOBIT, >> ___ >> iommu mailing list >> io...@lists.linux-foundation.org >> https://lists.linuxfoundation.org/mailman/listinfo/iommu ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
RE: [PATCH Makedumpfile 00/10] arm64 cleanup and kaslr support
>>>On Tue, Nov 15, 2016 at 12:04 PM, Pratyush Anand wrote: Hi Atsushi, There would be a conflict because of following patch while applying these patches. Other than that I also see a an issue with --config option. So I will fix that as well and repost the series soon. >>> >>>Sorry, for the noise.That was a bad compilation of the code. >>> >>>Anyway, I resolved the conflict because of below patch in upstream >>>devel branch and r5pushed the patches in my git tree: >>>https://github.com/pratyushanand/makedumpfile.git : arm64_devel >>> >>>There is no changes other than conflict resolution w.r.t. this patch series. >>> >>>~Pratyush >>> 0068010b9b83 [PATCH v2 2/2] Clean up unused KERNEL_IMAGE_SIZE >> >> I'm sorry ! I misunderstood that I already pushed this series >> in the devel branch. Should I wait for v2 series where the >> conflict is resolved ? > >As you wish. I can send v2 (there is no difference other than conflict >resolution for rebasing on top of current devel, conflict is minor), >or you can pick them from >https://github.com/pratyushanand/makedumpfile.git : arm64_devel. Sure, I'll pick them (from 60352bd to 3b5faac, right?) up from your devel branch, thanks. Regards, Atsushi Kumagai ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v27 1/9] memblock: add memblock_cap_memory_range()
Hi Akashi, On Mon, Nov 14, 2016 at 02:55:16PM +0900, AKASHI Takahiro wrote: > On Fri, Nov 11, 2016 at 11:19:04AM +0800, Dennis Chen wrote: > > On Fri, Nov 11, 2016 at 11:50:50AM +0900, AKASHI Takahiro wrote: > > > On Thu, Nov 10, 2016 at 05:27:20PM +, Will Deacon wrote: > > > > On Wed, Nov 02, 2016 at 01:51:53PM +0900, AKASHI Takahiro wrote: > > > > > +void __init memblock_cap_memory_range(phys_addr_t base, phys_addr_t > > > > > size) > > > > > +{ > > > > > + int start_rgn, end_rgn; > > > > > + int i, ret; > > > > > + > > > > > + if (!size) > > > > > + return; > > > > > + > > > > > + ret = memblock_isolate_range(&memblock.memory, base, size, > > > > > + &start_rgn, &end_rgn); > > > > > + if (ret) > > > > > + return; > > > > > + > > > > > + /* remove all the MAP regions */ > > > > > + for (i = memblock.memory.cnt - 1; i >= end_rgn; i--) > > > > > + if (!memblock_is_nomap(&memblock.memory.regions[i])) > > > > > + memblock_remove_region(&memblock.memory, i); > > > > > + > > > > > + for (i = start_rgn - 1; i >= 0; i--) > > > > > + if (!memblock_is_nomap(&memblock.memory.regions[i])) > > > > > + memblock_remove_region(&memblock.memory, i); > > > > > + > > > > > + /* truncate the reserved regions */ > > > > > + memblock_remove_range(&memblock.reserved, 0, base); > > > > > + memblock_remove_range(&memblock.reserved, > > > > > + base + size, (phys_addr_t)ULLONG_MAX); > > > > > +} > > > > > > > > This duplicates a bunch of the logic in memblock_mem_limit_remove_map. > > > > Can > > > > you not implement that in terms of your new, more general, function? > > > > e.g. > > > > by passing base == 0, and size == limit? > > > > > > Obviously it's possible. > > > I actually talked to Dennis before about merging them, > > > but he was against my idea. > > > > > Oops! I thought we have reached agreement in the > > thread:http://lists.infradead.org/pipermail/linux-arm-kernel/2016-July/442817.html > > So feel free to do that as Will'll do > > OK, but I found that the two functions have a bit different semantics > in clipping memory range, in particular, when the range [base,base+size) > goes across several regions with a gap. > (This does not happen in my arm64 kdump, though.) > That is, 'limit' in memblock_mem_limit_remove_map() means total size of > available memory, while 'size' in memblock_cap_memory_range() indicates > the size of _continuous_ memory range. I thought limit was just a physical address, and then memblock_mem_limit_remove_map operated on the end of the nearest memblock? You could leave the __find_max_addr call in memblock_mem_limit_remove_map, given that I don't think you need/want it for memblock_cap_memory_range. > So I added an extra argument, exact, to a common function to specify > distinct behaviors. Confusing? Please see the patch below. Oh yikes, this certainly wasn't what I had in mind! My observation was just that memblock_mem_limit_remove_map(limit) does: 1. memblock_isolate_range(limit - limit+ULLONG_MAX) 2. memblock_remove_region(all non-nomap regions in the isolated region) 3. truncate reserved regions to limit and your memblock_cap_memory_range(base, size) does: 1. memblock_isolate_range(base - base+size) 2, memblock_remove_region(all non-nomap regions above and below the isolated region) 3. truncate reserved regions around the isolated region so, assuming we can invert the isolation in one of the cases, then they could share the same underlying implementation. I'm probably just missing something here, because the patch you've ended up with is far more involved than I anticipated... Will ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH] iommu/vt-d: Flush old iotlb for kdump when the device gets context mapped
On Wed, Nov 16, 2016 at 2:13 AM, Xunlei Pang wrote: > Ccing David > On 2016/11/16 at 17:02, Xunlei Pang wrote: >> We met the DMAR fault both on hpsa P420i and P421 SmartArray controllers >> under kdump, it can be steadily reproduced on several different machines, >> the dmesg log is like: >> HP HPSA Driver (v 3.4.16-0) >> hpsa :02:00.0: using doorbell to reset controller >> hpsa :02:00.0: board ready after hard reset. >> hpsa :02:00.0: Waiting for controller to respond to no-op >> DMAR: Setting identity map for device :02:00.0 [0xe8000 - 0xe8fff] >> DMAR: Setting identity map for device :02:00.0 [0xf4000 - 0xf4fff] >> DMAR: Setting identity map for device :02:00.0 [0xbdf6e000 - 0xbdf6efff] >> DMAR: Setting identity map for device :02:00.0 [0xbdf6f000 - 0xbdf7efff] >> DMAR: Setting identity map for device :02:00.0 [0xbdf7f000 - 0xbdf82fff] >> DMAR: Setting identity map for device :02:00.0 [0xbdf83000 - 0xbdf84fff] >> DMAR: DRHD: handling fault status reg 2 >> DMAR: [DMA Read] Request device [02:00.0] fault addr f000 [fault reason >> 06] PTE Read access is not set >> hpsa :02:00.0: controller message 03:00 timed out >> hpsa :02:00.0: no-op failed; re-trying >> >> After some debugging, we found that the corresponding pte entry value >> is correct, and the value of the iommu caching mode is 0, the fault is >> probably due to the old iotlb cache of the in-flight DMA. >> >> Thus need to flush the old iotlb after context mapping is setup for the >> device, where the device is supposed to finish reset at its driver probe >> stage and no in-flight DMA exists hereafter. >> >> With this patch, all our problematic machines can survive the kdump tests. >> >> CC: Myron Stowe >> CC: Don Brace >> CC: Baoquan He >> CC: Dave Young >> Tested-by: Joseph Szczypek >> Signed-off-by: Xunlei Pang >> --- >> drivers/iommu/intel-iommu.c | 11 +-- >> 1 file changed, 9 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c >> index 3965e73..eb79288 100644 >> --- a/drivers/iommu/intel-iommu.c >> +++ b/drivers/iommu/intel-iommu.c >> @@ -2067,9 +2067,16 @@ static int domain_context_mapping_one(struct >> dmar_domain *domain, >>* It's a non-present to present mapping. If hardware doesn't cache >>* non-present entry we only need to flush the write-buffer. If the >>* _does_ cache non-present entries, then it does so in the special If this does get accepted then we should fix the above grammar also - "If the _does_ cache ..." -> "If the hardware _does_ cache ..." >> - * domain #0, which we have to flush: >> + * domain #0, which we have to flush. >> + * >> + * For kdump cases, present entries may be cached due to the in-flight >> + * DMA and copied old pgtable, but there is no unmapping behaviour for >> + * them, so we need an explicit iotlb flush for the newly-mapped >> device. >> + * For kdump, at this point, the device is supposed to finish reset at >> + * the driver probe stage, no in-flight DMA will exist, thus we do not >> + * need to worry about that anymore hereafter. >>*/ >> - if (cap_caching_mode(iommu->cap)) { >> + if (is_kdump_kernel() || cap_caching_mode(iommu->cap)) { >> iommu->flush.flush_context(iommu, 0, >> (((u16)bus) << 8) | devfn, >> DMA_CCMD_MASK_NOBIT, > > ___ > iommu mailing list > io...@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v1 2/2] kexec: Support -S paramter to return whether the kexec payload is loaded.
On Mon, Nov 14, 2016 at 05:12:53PM -0500, Konrad Rzeszutek Wilk wrote: > Instead of the scripts having to poke at various fields we can > provide that functionality via the -S parameter. > > Returns 0 if the payload is loaded. Can be used in combination > with -l or -p to get the state of the proper kexec image. > > Signed-off-by: Konrad Rzeszutek Wilk > --- > v0: First version (internal product). > v1: Posted on kexec mailing list. Changed -s to -S > > CC: kexec@lists.infradead.org > CC: xen-de...@lists.xenproject.org > CC: Daniel Kiper > --- > kexec/kexec-xen.c | 20 ++ > kexec/kexec.8 | 5 + > kexec/kexec.c | 61 > ++- > kexec/kexec.h | 5 - > 4 files changed, 85 insertions(+), 6 deletions(-) > > diff --git a/kexec/kexec-xen.c b/kexec/kexec-xen.c > index 24a4191..5a1e9d2 100644 > --- a/kexec/kexec-xen.c > +++ b/kexec/kexec-xen.c > @@ -105,6 +105,26 @@ int xen_kexec_unload(uint64_t kexec_flags) > return ret; > } > > +int xen_kexec_status(uint64_t kexec_flags) > +{ > + xc_interface *xch; > + uint8_t type; > + int ret; > + > + xch = xc_interface_open(NULL, NULL, 0); > + if (!xch) > + return -1; > + > + type = (kexec_flags & KEXEC_ON_CRASH) ? KEXEC_TYPE_CRASH > + : KEXEC_TYPE_DEFAULT; > + > + ret = xc_kexec_status(xch, type); > + > + xc_interface_close(xch); > + > + return ret; > +} > + > void xen_kexec_exec(void) > { > xc_interface *xch; > diff --git a/kexec/kexec.8 b/kexec/kexec.8 > index 4d0c1d1..02f4ccf 100644 > --- a/kexec/kexec.8 > +++ b/kexec/kexec.8 > @@ -107,6 +107,11 @@ command: > .B \-d\ (\-\-debug) > Enable debugging messages. > .TP > +.B \-D\ (\-\-status) > +Return 0 if the type (by default crash) is loaded. Can be used in conjuction > +with -l or -p to toggle the type. Note it will Is there a mnemonic to map -D to --status? If not perhaps it would be best to drop the short option. > +.BR not\ load\ the\ kernel. > +.TP > .B \-e\ (\-\-exec) > Run the currently loaded kernel. Note that it will reboot into the loaded > kernel without calling shutdown(8). > .TP > diff --git a/kexec/kexec.c b/kexec/kexec.c > index 500e5a9..bc72688 100644 > --- a/kexec/kexec.c > +++ b/kexec/kexec.c > @@ -855,6 +855,32 @@ static int k_unload (unsigned long kexec_flags) > return result; > } > > +static int kexec_loaded(void); > +static int __kexec_loaded(const char *f); I wonder if we could shuffle the location of functions in this file to avoid the need to forward declarations all together. > + > +static int k_status(unsigned long kexec_flags) > +{ > + int result; > + long native_arch; > + > + /* set the arch */ > + native_arch = physical_arch(); > + if (native_arch < 0) { > + return -1; > + } > + kexec_flags |= native_arch; > + > + if (xen_present()) > + result = xen_kexec_status(kexec_flags); > + else { > + if (kexec_flags & KEXEC_ON_CRASH) > + result = > __kexec_loaded("/sys/kernel/kexec_crash_loaded"); > + else > + result = kexec_loaded(); > + } > + return result; > +} > + > /* > * Start a reboot. > */ > @@ -890,8 +916,6 @@ static int my_exec(void) > return -1; > } > > -static int kexec_loaded(void); > - > static int load_jump_back_helper_image(unsigned long kexec_flags, void > *entry) > { > int result; > @@ -970,6 +994,7 @@ void usage(void) > " to original kernel.\n" > " -s, --kexec-file-syscall Use file based syscall for kexec > operation\n" > " -d, --debug Enable debugging to help spot a > failure.\n" > +" -S, --status Return 0 if the type (by default crash) > is loaded.\n" > "\n" > "Supported kernel file types and options: \n"); > for (i = 0; i < file_types; i++) { > @@ -981,7 +1006,7 @@ void usage(void) > printf("\n"); > } > > -static int kexec_loaded(void) > +static int __kexec_loaded(const char *file) > { > long ret = -1; > FILE *fp; > @@ -992,7 +1017,7 @@ static int kexec_loaded(void) > if (xen_present()) > return 1; > > - fp = fopen("/sys/kernel/kexec_loaded", "r"); > + fp = fopen(file, "r"); > if (fp == NULL) > return -1; > > @@ -1015,6 +1040,12 @@ static int kexec_loaded(void) > return (int)ret; > } > > +static int kexec_loaded(void) > +{ > + return __kexec_loaded("/sys/kernel/kexec_loaded"); > +} > + > + > /* > * Remove parameter from a kernel command line. Helper function by > get_command_line(). > */ > @@ -1204,6 +1235,7 @@ int main(int argc, char *argv[]) > int do_unload = 0; > int do_reuse_initrd = 0; > + int do_status = 0; > void *entry = 0; > char *type = 0; > char *endptr; > @@ -1345,6 +1377,9 @@ int m
Re: [PATCH] iommu/vt-d: Flush old iotlb for kdump when the device gets context mapped
Ccing David On 2016/11/16 at 17:02, Xunlei Pang wrote: > We met the DMAR fault both on hpsa P420i and P421 SmartArray controllers > under kdump, it can be steadily reproduced on several different machines, > the dmesg log is like: > HP HPSA Driver (v 3.4.16-0) > hpsa :02:00.0: using doorbell to reset controller > hpsa :02:00.0: board ready after hard reset. > hpsa :02:00.0: Waiting for controller to respond to no-op > DMAR: Setting identity map for device :02:00.0 [0xe8000 - 0xe8fff] > DMAR: Setting identity map for device :02:00.0 [0xf4000 - 0xf4fff] > DMAR: Setting identity map for device :02:00.0 [0xbdf6e000 - 0xbdf6efff] > DMAR: Setting identity map for device :02:00.0 [0xbdf6f000 - 0xbdf7efff] > DMAR: Setting identity map for device :02:00.0 [0xbdf7f000 - 0xbdf82fff] > DMAR: Setting identity map for device :02:00.0 [0xbdf83000 - 0xbdf84fff] > DMAR: DRHD: handling fault status reg 2 > DMAR: [DMA Read] Request device [02:00.0] fault addr f000 [fault reason > 06] PTE Read access is not set > hpsa :02:00.0: controller message 03:00 timed out > hpsa :02:00.0: no-op failed; re-trying > > After some debugging, we found that the corresponding pte entry value > is correct, and the value of the iommu caching mode is 0, the fault is > probably due to the old iotlb cache of the in-flight DMA. > > Thus need to flush the old iotlb after context mapping is setup for the > device, where the device is supposed to finish reset at its driver probe > stage and no in-flight DMA exists hereafter. > > With this patch, all our problematic machines can survive the kdump tests. > > CC: Myron Stowe > CC: Don Brace > CC: Baoquan He > CC: Dave Young > Tested-by: Joseph Szczypek > Signed-off-by: Xunlei Pang > --- > drivers/iommu/intel-iommu.c | 11 +-- > 1 file changed, 9 insertions(+), 2 deletions(-) > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > index 3965e73..eb79288 100644 > --- a/drivers/iommu/intel-iommu.c > +++ b/drivers/iommu/intel-iommu.c > @@ -2067,9 +2067,16 @@ static int domain_context_mapping_one(struct > dmar_domain *domain, >* It's a non-present to present mapping. If hardware doesn't cache >* non-present entry we only need to flush the write-buffer. If the >* _does_ cache non-present entries, then it does so in the special > - * domain #0, which we have to flush: > + * domain #0, which we have to flush. > + * > + * For kdump cases, present entries may be cached due to the in-flight > + * DMA and copied old pgtable, but there is no unmapping behaviour for > + * them, so we need an explicit iotlb flush for the newly-mapped device. > + * For kdump, at this point, the device is supposed to finish reset at > + * the driver probe stage, no in-flight DMA will exist, thus we do not > + * need to worry about that anymore hereafter. >*/ > - if (cap_caching_mode(iommu->cap)) { > + if (is_kdump_kernel() || cap_caching_mode(iommu->cap)) { > iommu->flush.flush_context(iommu, 0, > (((u16)bus) << 8) | devfn, > DMA_CCMD_MASK_NOBIT, ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH] iommu/vt-d: Flush old iotlb for kdump when the device gets context mapped
We met the DMAR fault both on hpsa P420i and P421 SmartArray controllers under kdump, it can be steadily reproduced on several different machines, the dmesg log is like: HP HPSA Driver (v 3.4.16-0) hpsa :02:00.0: using doorbell to reset controller hpsa :02:00.0: board ready after hard reset. hpsa :02:00.0: Waiting for controller to respond to no-op DMAR: Setting identity map for device :02:00.0 [0xe8000 - 0xe8fff] DMAR: Setting identity map for device :02:00.0 [0xf4000 - 0xf4fff] DMAR: Setting identity map for device :02:00.0 [0xbdf6e000 - 0xbdf6efff] DMAR: Setting identity map for device :02:00.0 [0xbdf6f000 - 0xbdf7efff] DMAR: Setting identity map for device :02:00.0 [0xbdf7f000 - 0xbdf82fff] DMAR: Setting identity map for device :02:00.0 [0xbdf83000 - 0xbdf84fff] DMAR: DRHD: handling fault status reg 2 DMAR: [DMA Read] Request device [02:00.0] fault addr f000 [fault reason 06] PTE Read access is not set hpsa :02:00.0: controller message 03:00 timed out hpsa :02:00.0: no-op failed; re-trying After some debugging, we found that the corresponding pte entry value is correct, and the value of the iommu caching mode is 0, the fault is probably due to the old iotlb cache of the in-flight DMA. Thus need to flush the old iotlb after context mapping is setup for the device, where the device is supposed to finish reset at its driver probe stage and no in-flight DMA exists hereafter. With this patch, all our problematic machines can survive the kdump tests. CC: Myron Stowe CC: Don Brace CC: Baoquan He CC: Dave Young Tested-by: Joseph Szczypek Signed-off-by: Xunlei Pang --- drivers/iommu/intel-iommu.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 3965e73..eb79288 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -2067,9 +2067,16 @@ static int domain_context_mapping_one(struct dmar_domain *domain, * It's a non-present to present mapping. If hardware doesn't cache * non-present entry we only need to flush the write-buffer. If the * _does_ cache non-present entries, then it does so in the special -* domain #0, which we have to flush: +* domain #0, which we have to flush. +* +* For kdump cases, present entries may be cached due to the in-flight +* DMA and copied old pgtable, but there is no unmapping behaviour for +* them, so we need an explicit iotlb flush for the newly-mapped device. +* For kdump, at this point, the device is supposed to finish reset at +* the driver probe stage, no in-flight DMA will exist, thus we do not +* need to worry about that anymore hereafter. */ - if (cap_caching_mode(iommu->cap)) { + if (is_kdump_kernel() || cap_caching_mode(iommu->cap)) { iommu->flush.flush_context(iommu, 0, (((u16)bus) << 8) | devfn, DMA_CCMD_MASK_NOBIT, -- 1.8.3.1 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH Makedumpfile 00/10] arm64 cleanup and kaslr support
On Wed, Nov 16, 2016 at 11:14 AM, Atsushi Kumagai wrote: > Hello Pratyush, > >>On Tue, Nov 15, 2016 at 12:04 PM, Pratyush Anand wrote: >>> Hi Atsushi, >>> >>> There would be a conflict because of following patch while applying >>> these patches. Other than that I also see a an issue with --config >>> option. So I will fix that as well and repost the series soon. >> >>Sorry, for the noise.That was a bad compilation of the code. >> >>Anyway, I resolved the conflict because of below patch in upstream >>devel branch and r5pushed the patches in my git tree: >>https://github.com/pratyushanand/makedumpfile.git : arm64_devel >> >>There is no changes other than conflict resolution w.r.t. this patch series. >> >>~Pratyush >> >>> >>> 0068010b9b83 [PATCH v2 2/2] Clean up unused KERNEL_IMAGE_SIZE > > I'm sorry ! I misunderstood that I already pushed this series > in the devel branch. Should I wait for v2 series where the > conflict is resolved ? As you wish. I can send v2 (there is no difference other than conflict resolution for rebasing on top of current devel, conflict is minor), or you can pick them from https://github.com/pratyushanand/makedumpfile.git : arm64_devel. I am OK with whatever is convenient for you. Thanks Pratyush ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
RE: [PATCH Makedumpfile 00/10] arm64 cleanup and kaslr support
Hello Pratyush, >On Tue, Nov 15, 2016 at 12:04 PM, Pratyush Anand wrote: >> Hi Atsushi, >> >> There would be a conflict because of following patch while applying >> these patches. Other than that I also see a an issue with --config >> option. So I will fix that as well and repost the series soon. > >Sorry, for the noise.That was a bad compilation of the code. > >Anyway, I resolved the conflict because of below patch in upstream >devel branch and r5pushed the patches in my git tree: >https://github.com/pratyushanand/makedumpfile.git : arm64_devel > >There is no changes other than conflict resolution w.r.t. this patch series. > >~Pratyush > >> >> 0068010b9b83 [PATCH v2 2/2] Clean up unused KERNEL_IMAGE_SIZE I'm sorry ! I misunderstood that I already pushed this series in the devel branch. Should I wait for v2 series where the conflict is resolved ? Thanks, Atsushi Kumagai ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec