Re: [PATCH V14 01/10] acpi: apei: read ack upon ghes record consumption
On Tue, Mar 28, 2017 at 01:30:31PM -0600, Tyler Baicar wrote: > A RAS (Reliability, Availability, Serviceability) controller > may be a separate processor running in parallel with OS > execution, and may generate error records for consumption by > the OS. If the RAS controller produces multiple error records, > then they may be overwritten before the OS has consumed them. > > The Generic Hardware Error Source (GHES) v2 structure > introduces the capability for the OS to acknowledge the > consumption of the error record generated by the RAS > controller. A RAS controller supporting GHESv2 shall wait for > the acknowledgment before writing a new error record, thus > eliminating the race condition. > > Add support for parsing of GHESv2 sub-tables as well. > > Signed-off-by: Tyler Baicar> CC: Jonathan (Zhixiong) Zhang > Reviewed-by: James Morse > --- > drivers/acpi/apei/ghes.c | 49 > +--- > drivers/acpi/apei/hest.c | 7 +-- > include/acpi/ghes.h | 5 - > 3 files changed, 55 insertions(+), 6 deletions(-) ... > @@ -249,10 +254,18 @@ static struct ghes *ghes_new(struct acpi_hest_generic > *generic) > ghes = kzalloc(sizeof(*ghes), GFP_KERNEL); > if (!ghes) > return ERR_PTR(-ENOMEM); > + > ghes->generic = generic; > + if (IS_HEST_TYPE_GENERIC_V2(ghes)) { > + rc = apei_map_generic_address( > + >generic_v2->read_ack_register); Yeah, that linebreak just to keep the 80-cols rule makes the code ugly and hard to read. Please put that mapping and unmapping in wrappers called map_gen_v2(ghes) and unmap_gen_v2(ghes) or so, so that you can call them wherever needed. Thus should make the flow a bit more understandable what's going on and you won't have to repeat the unmapping lines in ghes_fini(). > @@ -649,6 +669,23 @@ static void ghes_estatus_cache_add( > rcu_read_unlock(); > } > > +static int ghes_ack_error(struct acpi_hest_generic_v2 *generic_v2) > +{ > + int rc; > + u64 val = 0; > + > + rc = apei_read(, _v2->read_ack_register); > + if (rc) > + return rc; > + val &= generic_v2->read_ack_preserve << > + generic_v2->read_ack_register.bit_offset; > + val |= generic_v2->read_ack_write << > + generic_v2->read_ack_register.bit_offset; Yeah, let them stick out, it more readable this way. Line spacing is helpful too: ... rc = apei_read(, _v2->read_ack_register); if (rc) return rc; val &= generic_v2->read_ack_preserve << generic_v2->read_ack_register.bit_offset; val |= generic_v2->read_ack_write<< generic_v2->read_ack_register.bit_offset; return apei_write(val, _v2->read_ack_register); } > + rc = apei_write(val, _v2->read_ack_register); > + > + return rc; > +} > + > static int ghes_proc(struct ghes *ghes) > { > int rc; -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH V14 03/10] efi: parse ARM processor error
On Tue, Mar 28, 2017 at 01:30:33PM -0600, Tyler Baicar wrote: > Add support for ARM Common Platform Error Record (CPER). > UEFI 2.6 specification adds support for ARM specific > processor error information to be reported as part of the > CPER records. This provides more detail on for processor error logs. > > Signed-off-by: Tyler Baicar> CC: Jonathan (Zhixiong) Zhang > Reviewed-by: James Morse > Reviewed-by: Ard Biesheuvel > --- > drivers/firmware/efi/cper.c | 133 > > include/linux/cper.h| 54 ++ > 2 files changed, 187 insertions(+) > > diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c > index 8fa4e23..56aa516 100644 > --- a/drivers/firmware/efi/cper.c > +++ b/drivers/firmware/efi/cper.c > @@ -110,12 +110,15 @@ void cper_print_bits(const char *pfx, unsigned int bits, > static const char * const proc_type_strs[] = { > "IA32/X64", > "IA64", > + "ARM", > }; > > static const char * const proc_isa_strs[] = { > "IA32", > "IA64", > "X64", > + "ARM A32/T32", > + "ARM A64", > }; > > static const char * const proc_error_type_strs[] = { > @@ -139,6 +142,18 @@ void cper_print_bits(const char *pfx, unsigned int bits, > "corrected", > }; > > +static const char * const arm_reg_ctx_strs[] = { > + "AArch32 general purpose registers", > + "AArch32 EL1 context registers", > + "AArch32 EL2 context registers", > + "AArch32 secure context registers", > + "AArch64 general purpose registers", > + "AArch64 EL1 context registers", > + "AArch64 EL2 context registers", > + "AArch64 EL3 context registers", > + "Misc. system register structure", > +}; That... > + > static void cper_print_proc_generic(const char *pfx, > const struct cper_sec_proc_generic *proc) > { > @@ -184,6 +199,114 @@ static void cper_print_proc_generic(const char *pfx, > printk("%s""IP: 0x%016llx\n", pfx, proc->ip); > } > > +static void cper_print_proc_arm(const char *pfx, > + const struct cper_sec_proc_arm *proc) ... and that function should go into: #if defined(CONFIG_ARM64) || defined(CONFIG_ARM) Just put them close together so that you don't have too much ifdeffery. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: kvm/arm64: use-after-free in kvm_unmap_hva_handler/unmap_stage2_pmds
On Wed, Apr 12, 2017 at 8:43 PM, Marc Zyngierwrote: > On 12/04/17 17:19, Andrey Konovalov wrote: > > Hi Andrey, > >> Apparently this wasn't fixed, I've got this report again on >> linux-next-c4e7b35a3 (Apr 11), which includes 8b3405e34 "kvm: >> arm/arm64: Fix locking for kvm_free_stage2_pgd". > > This looks like a different bug. Oh, OK. > >> >> I now have a way to reproduce it, so I can test proposed patches. I >> don't have a simple C reproducer though. >> >> The bug happens when the following syzkaller program is executed: >> >> mmap(&(0x7f00/0xc000)=nil, (0xc000), 0x3, 0x32, 0x, >> 0x0) >> unshare(0x400) >> perf_event_open(&(0x7f02f000-0x78)={0x1, 0x78, 0x0, 0x0, 0x0, 0x0, >> 0x0, 0x6, 0x0, 0x0, 0xd34, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, >> 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, 0x0, 0x, >> 0x, 0x0) >> r0 = openat$kvm(0xff9c, >> &(0x7f00c000-0x9)="2f6465762f6b766d00", 0x0, 0x0) >> ioctl$TIOCSBRK(0x, 0x5427) >> r1 = ioctl$KVM_CREATE_VM(r0, 0xae01, 0x0) >> syz_kvm_setup_cpu$arm64(r1, 0x, >> &(0x7fdc6000/0x18000)=nil, &(0x7f00c000)=[{0x0, >> &(0x7f00c000)="5ba3c16f533efbed09f8221253c73763327fadce2371813b45dd7f7982f84a873e4ae89a6c2bd1af83a6024c36a1ff518318", >> 0x32}], 0x1, 0x0, &(0x7f00d000-0x10)=[@featur2={0x1, 0x3}], 0x1) > > Is that the only thing the program does? Or is there anything running in > parallel? These calls are executed repeatedly and in random order. That's all. Except that I'm running the reproducer on a real arm board, so there's probably a bunch of stuff going on besides these calls. > >> == >> BUG: KASAN: use-after-free in arch_spin_is_locked >> include/linux/compiler.h:254 [inline] >> BUG: KASAN: use-after-free in unmap_stage2_range+0x990/0x9a8 >> arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:295 >> Read of size 8 at addr 84476730 by task syz-executor/13106 >> >> CPU: 1 PID: 13106 Comm: syz-executor Not tainted >> 4.11.0-rc6-next-20170411-xc2-11025-gc4e7b35a33d4-dirty #5 >> Hardware name: Hardkernel ODROID-C2 (DT) >> Call trace: >> [] dump_backtrace+0x0/0x440 arch/arm64/kernel/traps.c:505 >> [] show_stack+0x20/0x30 arch/arm64/kernel/traps.c:228 >> [] __dump_stack lib/dump_stack.c:16 [inline] >> [] dump_stack+0x110/0x168 lib/dump_stack.c:52 >> [] print_address_description+0x60/0x248 >> mm/kasan/report.c:252 >> [] kasan_report_error mm/kasan/report.c:351 [inline] >> [] kasan_report+0x218/0x300 mm/kasan/report.c:408 >> [] __asan_report_load8_noabort+0x18/0x20 >> mm/kasan/report.c:429 >> [] arch_spin_is_locked include/linux/compiler.h:254 >> [inline] > > This is the assert on the spinlock, and the memory is gone. > >> [] unmap_stage2_range+0x990/0x9a8 >> arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:295 >> [] kvm_free_stage2_pgd.part.16+0x30/0x98 >> arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:842 >> [] kvm_free_stage2_pgd >> arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:838 [inline] > > But we've taken than lock here. There's only a handful of instructions > in between, and the memory can only go away if there is something > messing with us in parallel. > >> [] kvm_arch_flush_shadow_all+0x40/0x58 >> arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:1895 >> [] kvm_mmu_notifier_release+0x154/0x1d0 >> arch/arm64/kvm/../../../virt/kvm/kvm_main.c:472 >> [] __mmu_notifier_release+0x1c0/0x3e0 mm/mmu_notifier.c:75 >> [] mmu_notifier_release >> include/linux/mmu_notifier.h:235 [inline] >> [] exit_mmap+0x21c/0x288 mm/mmap.c:2941 >> [] __mmput kernel/fork.c:888 [inline] >> [] mmput+0xdc/0x2e0 kernel/fork.c:910 >> [] exit_mm kernel/exit.c:557 [inline] >> [] do_exit+0x648/0x2020 kernel/exit.c:865 >> [] do_group_exit+0xdc/0x260 kernel/exit.c:982 >> [] get_signal+0x358/0xf58 kernel/signal.c:2318 >> [] do_signal+0x170/0xc10 arch/arm64/kernel/signal.c:370 >> [] do_notify_resume+0xe4/0x120 >> arch/arm64/kernel/signal.c:421 >> [] work_pending+0x8/0x14 > > So we're being serviced with a signal. Do you know if this signal is > generated by your syzkaller program? We could be racing between do_exit > triggered by a fatal signal (this trace) and the closing of the two file > descriptors (vcpu and vm). I'm not sure. > > Paolo: does this look possible to you? I can't see what locking we have > that could prevent this race. > > Thanks, > > M. > -- > Jazz is not dead. It just smells funny... ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: kvm/arm64: use-after-free in kvm_unmap_hva_handler/unmap_stage2_pmds
On 12/04/17 17:19, Andrey Konovalov wrote: Hi Andrey, > Apparently this wasn't fixed, I've got this report again on > linux-next-c4e7b35a3 (Apr 11), which includes 8b3405e34 "kvm: > arm/arm64: Fix locking for kvm_free_stage2_pgd". This looks like a different bug. > > I now have a way to reproduce it, so I can test proposed patches. I > don't have a simple C reproducer though. > > The bug happens when the following syzkaller program is executed: > > mmap(&(0x7f00/0xc000)=nil, (0xc000), 0x3, 0x32, 0x, > 0x0) > unshare(0x400) > perf_event_open(&(0x7f02f000-0x78)={0x1, 0x78, 0x0, 0x0, 0x0, 0x0, > 0x0, 0x6, 0x0, 0x0, 0xd34, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, 0x0, 0x, > 0x, 0x0) > r0 = openat$kvm(0xff9c, > &(0x7f00c000-0x9)="2f6465762f6b766d00", 0x0, 0x0) > ioctl$TIOCSBRK(0x, 0x5427) > r1 = ioctl$KVM_CREATE_VM(r0, 0xae01, 0x0) > syz_kvm_setup_cpu$arm64(r1, 0x, > &(0x7fdc6000/0x18000)=nil, &(0x7f00c000)=[{0x0, > &(0x7f00c000)="5ba3c16f533efbed09f8221253c73763327fadce2371813b45dd7f7982f84a873e4ae89a6c2bd1af83a6024c36a1ff518318", > 0x32}], 0x1, 0x0, &(0x7f00d000-0x10)=[@featur2={0x1, 0x3}], 0x1) Is that the only thing the program does? Or is there anything running in parallel? > == > BUG: KASAN: use-after-free in arch_spin_is_locked > include/linux/compiler.h:254 [inline] > BUG: KASAN: use-after-free in unmap_stage2_range+0x990/0x9a8 > arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:295 > Read of size 8 at addr 84476730 by task syz-executor/13106 > > CPU: 1 PID: 13106 Comm: syz-executor Not tainted > 4.11.0-rc6-next-20170411-xc2-11025-gc4e7b35a33d4-dirty #5 > Hardware name: Hardkernel ODROID-C2 (DT) > Call trace: > [] dump_backtrace+0x0/0x440 arch/arm64/kernel/traps.c:505 > [] show_stack+0x20/0x30 arch/arm64/kernel/traps.c:228 > [] __dump_stack lib/dump_stack.c:16 [inline] > [] dump_stack+0x110/0x168 lib/dump_stack.c:52 > [] print_address_description+0x60/0x248 > mm/kasan/report.c:252 > [] kasan_report_error mm/kasan/report.c:351 [inline] > [] kasan_report+0x218/0x300 mm/kasan/report.c:408 > [] __asan_report_load8_noabort+0x18/0x20 > mm/kasan/report.c:429 > [] arch_spin_is_locked include/linux/compiler.h:254 [inline] This is the assert on the spinlock, and the memory is gone. > [] unmap_stage2_range+0x990/0x9a8 > arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:295 > [] kvm_free_stage2_pgd.part.16+0x30/0x98 > arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:842 > [] kvm_free_stage2_pgd > arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:838 [inline] But we've taken than lock here. There's only a handful of instructions in between, and the memory can only go away if there is something messing with us in parallel. > [] kvm_arch_flush_shadow_all+0x40/0x58 > arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:1895 > [] kvm_mmu_notifier_release+0x154/0x1d0 > arch/arm64/kvm/../../../virt/kvm/kvm_main.c:472 > [] __mmu_notifier_release+0x1c0/0x3e0 mm/mmu_notifier.c:75 > [] mmu_notifier_release > include/linux/mmu_notifier.h:235 [inline] > [] exit_mmap+0x21c/0x288 mm/mmap.c:2941 > [] __mmput kernel/fork.c:888 [inline] > [] mmput+0xdc/0x2e0 kernel/fork.c:910 > [] exit_mm kernel/exit.c:557 [inline] > [] do_exit+0x648/0x2020 kernel/exit.c:865 > [] do_group_exit+0xdc/0x260 kernel/exit.c:982 > [] get_signal+0x358/0xf58 kernel/signal.c:2318 > [] do_signal+0x170/0xc10 arch/arm64/kernel/signal.c:370 > [] do_notify_resume+0xe4/0x120 > arch/arm64/kernel/signal.c:421 > [] work_pending+0x8/0x14 So we're being serviced with a signal. Do you know if this signal is generated by your syzkaller program? We could be racing between do_exit triggered by a fatal signal (this trace) and the closing of the two file descriptors (vcpu and vm). Paolo: does this look possible to you? I can't see what locking we have that could prevent this race. Thanks, M. -- Jazz is not dead. It just smells funny... ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH V14 02/10] ras: acpi/apei: cper: generic error data entry v3 per ACPI 6.1
On Wed, 2017-04-12 at 15:34 +0200, Borislav Petkov wrote: > On Tue, Mar 28, 2017 at 01:30:32PM -0600, Tyler Baicar wrote: > > Currently when a RAS error is reported it is not timestamped. [] > > diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h [] > > +#define acpi_hest_generic_data_error_length(gdata) \ > > + (((struct acpi_hest_generic_data *)(gdata))->error_data_length) > > +#define acpi_hest_generic_data_size(gdata) \ > > + ((acpi_hest_generic_data_version(gdata) >= 3) ? \ > > + sizeof(struct acpi_hest_generic_data_v300) :\ > > + sizeof(struct acpi_hest_generic_data)) > > +#define acpi_hest_generic_data_record_size(gdata) \ > > + (acpi_hest_generic_data_size(gdata) + \ > > + acpi_hest_generic_data_error_length(gdata)) > > +#define acpi_hest_generic_data_next(gdata) \ > > + ((void *)(gdata) + acpi_hest_generic_data_record_size(gdata)) > > This is one unreadable pile of too long names with a clearly redundant > and too long prefix. Please shorten it all. Naming is generally author's choice and internal consistency has value too. acpi_hest_generic is already used throughout this codebase in multiple files and paths. > > @@ -73,3 +85,13 @@ static inline void ghes_edac_unregister(struct ghes > > *ghes) > > { > > } > > #endif > > + > > +#define acpi_hest_generic_data_version(gdata) \ > > + (gdata->revision >> 8) > > + > > +static inline void *acpi_hest_generic_data_payload(struct > > acpi_hest_generic_data *gdata) > > Lemme try to shorten it: > > static inline void *acpi_hest_get_payload(struct acpi_hest_gdata *d) > { > if (hest_gdata_ver(d) >= 3) > return (void *)(((struct acpi_hest_gdata_v3 *)d) + 1); > else > return d + 1; > } > > Now this is much more readable IMO. You can actually see what's going > on. And you still know what the struct names are. trivial: unnecessary cast to void * ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: kvm/arm64: use-after-free in kvm_unmap_hva_handler/unmap_stage2_pmds
On Tue, Mar 14, 2017 at 5:57 PM, Paolo Bonziniwrote: > > > On 14/03/2017 12:07, Suzuki K Poulose wrote: >> On 10/03/17 13:34, Andrey Konovalov wrote: >>> Hi, >>> >>> I've got the following error report while fuzzing the kernel with >>> syzkaller. >>> >>> On linux-next commit 56b8bad5e066c23e8fa273ef5fba50bd3da2ace8 (Mar 8). >>> >>> Unfortunately I can't reproduce it. >>> >>> == >>> BUG: KASAN: use-after-free in put_page include/linux/compiler.h:243 >>> [inline] >>> BUG: KASAN: use-after-free in unmap_stage2_pmds >>> arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:240 [inline] >>> BUG: KASAN: use-after-free in unmap_stage2_puds >>> arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:269 [inline] >>> BUG: KASAN: use-after-free in unmap_stage2_range+0x884/0x938 >>> arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:299 >>> Read of size 8 at addr 80004585c000 by task syz-executor/5176 >> >> >>> [] put_page include/linux/compiler.h:243 [inline] >>> [] unmap_stage2_pmds >>> arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:240 [inline] >>> [] unmap_stage2_puds >>> arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:269 [inline] >>> [] unmap_stage2_range+0x884/0x938 >>> arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:299 >>> [] kvm_unmap_hva_handler+0x28/0x38 >>> arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:1556 >>> [] handle_hva_to_gpa+0x140/0x250 >>> arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:1547 >>> [] kvm_unmap_hva_range+0x60/0x80 >>> arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:1579 >>> [] >>> kvm_mmu_notifier_invalidate_range_start+0x194/0x278 >>> arch/arm64/kvm/../../../virt/kvm/kvm_main.c:357 >>> [] __mmu_notifier_invalidate_range_start+0x1d0/0x2a0 >>> mm/mmu_notifier.c:199 >>> [] mmu_notifier_invalidate_range_start >>> include/linux/mmu_notifier.h:282 [inline] >>> [] unmap_vmas+0x12c/0x198 mm/memory.c:1372 >>> [] unmap_region+0x128/0x230 mm/mmap.c:2460 >>> [] update_hiwater_vm include/linux/mm.h:1483 [inline] >>> [] remove_vma_list mm/mmap.c:2432 [inline] >>> [] do_munmap+0x598/0x9b0 mm/mmap.c:2662 >>> [] find_vma_links mm/mmap.c:495 [inline] >>> [] mmap_region+0x138/0xc78 mm/mmap.c:1610 >>> [] is_file_hugepages include/linux/hugetlb.h:269 >>> [inline] >>> [] do_mmap+0x3cc/0x848 mm/mmap.c:1446 >>> [] do_mmap_pgoff include/linux/mm.h:2039 [inline] >>> [] vm_mmap_pgoff+0xec/0x120 mm/util.c:305 >>> [] SYSC_mmap_pgoff mm/mmap.c:1475 [inline] >>> [] SyS_mmap_pgoff+0x220/0x420 mm/mmap.c:1458 >>> [] sys_mmap+0x58/0x80 arch/arm64/kernel/sys.c:37 >>> [] el0_svc_naked+0x24/0x28 >>> >> >> We hold kvm->mmu_lock, while manipulating the stage2 ranges. However, I >> find that >> we don't take the lock, when we do it f rom kvm_free_stage2_pgd(), which >> could >> potentially have caused a problem with an munmap of a memslot ? >> >> Lightly tested... >> >> >> commit fa75684dbf0fe845cf8403517d6e0c2c3344a544 >> Author: Suzuki K Poulose >> Date: Tue Mar 14 10:26:54 2017 + >> >> kvm: arm: Fix locking for kvm_free_stage2_pgd >> In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while >> calling >> unmap_stage2_range() on the entire memory range for the guest. This >> could >> cause problems with other callers (e.g, munmap on a memslot) trying to >> unmap a range. >> Cc: Marc Zyngier >> Cc: Christoffer Dall >> Signed-off-by: Suzuki K Poulose >> >> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c >> index a07ce3e..7f97063 100644 >> --- a/arch/arm/kvm/mmu.c >> +++ b/arch/arm/kvm/mmu.c >> @@ -831,7 +831,10 @@ void kvm_free_stage2_pgd(struct kvm *kvm) >> if (kvm->arch.pgd == NULL) >> return; >> >> + spin_lock(>mmu_lock); >> unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE); >> + spin_unlock(>mmu_lock); >> + >> /* Free the HW pgd, one page at a time */ >> free_pages_exact(kvm->arch.pgd, S2_PGD_SIZE); >> kvm->arch.pgd = NULL; > > Reviewed-by: Paolo Bonzini >> >> >> >>> The buggy address belongs to the page: >>> page:7e0001161700 count:0 mapcount:0 mapping: (null) >>> index:0x0 >>> flags: 0xfffc000() >>> raw: 0fffc000 >>> raw: 7e00018c9120 7eea8b60 >>> page dumped because: kasan: bad access detected >>> >>> Memory state around the buggy address: >>> 80004585bf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff >>> 80004585bf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 80004585c000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff >>>^ >>> 80004585c080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff >>> 80004585c100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff >>> == >>> >> Hi, Apparently this wasn't
Re: [PATCH v3 8/8] arm64: exception: check shared writable page in SEI handler
Hi James, On 2017/4/7 23:56, James Morse wrote: > Hi Xie XiuQi, > > On 30/03/17 11:31, Xie XiuQi wrote: >> From: Wang Xiongfeng>> >> Since SEI is asynchronous, the error data has been consumed. So we must >> suppose that all the memory data current process can write are >> contaminated. If the process doesn't have shared writable pages, the >> process will be killed, and the system will continue running normally. >> Otherwise, the system must be terminated, because the error has been >> propagated to other processes running on other cores, and recursively >> the error may be propagated to several another processes. > > This is pretty complicated. We can't guarantee that another CPU hasn't > modified > the page tables while we do this, (so its racy). We can't guarantee that the > corrupt data hasn't been sent over the network or written to disk in the mean > time (so its not enough). > > The scenario you have is a write of corrupt data to memory where another CPU > reading it doesn't know the value is corrupt. > > The hardware gives us quite a lot of help containing errors. The RAS > specification (DDI 0587A) describes your scenario as error propagation in > '2.1.2 > Architectural error propagation', and then classifies it in '2.1.3 > Architecturally infected, containable and uncontainable' as uncontained > because > the value is no longer in the general-purpose registers. For uncontained > errors > we should panic(). > > We shouldn't need to try to track errors after we get a notification as the > hardware has done this for us. > Thanks for your comments. I think what you said is reasonable. We will remove this patch and use AET fields of ESR_ELx to determine whether we should kill current process or just panic. > > Firmware-first does complicate this if events like this are not delivered > using > a synchronous external abort, as Linux may have PSTATE.A masked preventing > SError Interrupts from being taken. It looks like PSTATE.A is masked much more > often than is necessary. I will look into cleaning this up. > > > Thanks, > > James > > . > Thanks, Wang Xiongfeng ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm