RE: [PATCH] x86/mce: Clean up TP_printk() output line of the mce_record tracepoint

2024-03-29 Thread Luck, Tony
>> - While at it, don't forget to:
>>
>>	s/ADDR/MISC/SYND
>>	 /addr/misc/synd
>
> These are actually acronyms for Address, Miscellaneous and Syndrome registers.

They look like abbreviations, not acronyms, to me.

-Tony

RE: [PATCH v4 2/2] tracing: Include Microcode Revision in mce_record tracepoint

2024-03-27 Thread Luck, Tony
> Export microcode version through the tracepoint to prevent ambiguity over
> the active version on the system when the MCE was received.
>
> Signed-off-by: Avadhut Naik
> Reviewed-by: Sohil Mehta
> Reviewed-by: Steven Rostedt (Google)

Reviewed-by: Tony Luck

RE: [PATCH v4 1/2] tracing: Include PPIN in mce_record tracepoint

2024-03-27 Thread Luck, Tony
> Export PPIN through the tracepoint as it provides a unique identifier for
> the system (or socket in case of multi-socket systems) on which the MCE
> has been received.
>
> Signed-off-by: Avadhut Naik
> Reviewed-by: Sohil Mehta
> Reviewed-by: Steven Rostedt (Google)

Reviewed-by: Tony Luck

RE: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Luck, Tony
> But no, that's not the right question to ask. > > It is rather: which bits of information are very relevant to an error > record and which are transient enough so that they cannot be gathered > from a system by other means or only gathered in a difficult way, and > should be part of that record.

RE: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Luck, Tony
> > Is it so very different to add this to a trace record so that rasdaemon > > can have feature parity with mcelog(8)? > > I knew you were gonna say that. When someone decides that it is > a splendid idea to add more stuff to struct mce then said someone would > want it in the tracepoint too. > >

RE: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Luck, Tony
> > You've spent enough time with Ashok and Thomas tweaking the Linux > > microcode driver to know that going back to the machine the next day > > to ask about microcode version has a bunch of ways to get a wrong > > answer. > > Huh, what does that have to do with this? If deployment of a

RE: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Luck, Tony
> > 8 bytes for PPIN, 4 more for microcode.
>
> I know, nothing leads to bloat like 0.01% here, 0.001% there...

12 extra bytes divided by (say) 64GB (a very small server these days, my laptop has that much) = 0.0001746%. We will need 57000 changes like this one before we get to 0.001%.

RE: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-25 Thread Luck, Tony
> > The first patch adds PPIN (Protected Processor Inventory Number) field to
> > the tracepoint.
> >
> > The second patch adds the microcode field (Microcode Revision) to the
> > tracepoint.
>
> This is a lot of static information to add to *every* MCE.

8 bytes for PPIN, 4 more for microcode.

RE: [PATCH] printk: add cpu id information to printk() output

2023-09-15 Thread Luck, Tony
> +	return in_task() ? task_pid_nr(current) | (smp_processor_id() << CPU_ID_SHIFT) :

There are contexts and CONFIG options around pre-emption where smp_processor_id() will throw a warning. Use raw_smp_processor_id().

-Tony

RE: [PATCH] EDAC/mc_sysfs: refactor deprecated strncpy

2023-09-13 Thread Luck, Tony
> `strncpy` is deprecated for use on NUL-terminated destination strings [1]. > > We should prefer more robust and less ambiguous string interfaces. > > A suitable replacement is `strscpy_pad` [2] due to the fact that it guarantees > NUL-termination on the destination buffer whilst maintaining the

RE: [PATCH v7] x86/mce: retrieve poison range from hardware

2022-08-23 Thread Luck, Tony
> What I'm missing from this text here is, what *is* the mce->misc LSB > field in human speak? What does that field denote? The SDM says: Recoverable Address LSB (bits 5:0): The lowest valid recoverable address bit. Indicates the position of the least significant bit (LSB) of the recoverable

RE: [PATCH v5] x86/mce: retrieve poison range from hardware

2022-08-01 Thread Luck, Tony
>	struct mce m;
> +	int lsb = PAGE_SHIFT;

Some maintainers like to order local declaration lines from longest to shortest.

> +	/*
> +	 * Even if the ->validation_bits are set for address mask,
> +	 * to be extra safe, check and reject an error radius '0',
> +	 * and

Re: [PATCH v3] x86/mce: retrieve poison range from hardware

2022-07-18 Thread Luck, Tony
On Mon, Jul 18, 2022 at 09:11:33PM +, Jane Chu wrote: > On 7/18/2022 12:22 PM, Luck, Tony wrote: > >> It appears the kernel is trusting that ->physical_addr_mask is non-zero > >> in other paths. So this is at least equally broken in the presence of a > >> broke

RE: [PATCH v3] x86/mce: retrieve poison range from hardware

2022-07-18 Thread Luck, Tony
> It appears the kernel is trusting that ->physical_addr_mask is non-zero > in other paths. So this is at least equally broken in the presence of a > broken BIOS. The impact is potentially larger though with this change, > so it might be a good follow-on patch to make sure that >

RE: [PATCH v3] x86/mce: retrieve poison range from hardware

2022-07-18 Thread Luck, Tony
> +	m.misc = (MCI_MISC_ADDR_PHYS << 6) | __ffs64(mem_err->physical_addr_mask);

Do we want to unconditionally trust the sanity of the BIOS provided physical_addr_mask? There's a warning comment on the kernel __ffs64() function:

 * The result is not defined if no bits are set, so check

Re: Phantom PMEM poison issue

2022-01-21 Thread Luck, Tony
On Sat, Jan 22, 2022 at 12:40:18AM +, Jane Chu wrote: > On 1/21/2022 4:31 PM, Jane Chu wrote: > > On baremetal Intel platform with DCPMEM installed and configured to > > provision daxfs, say a poison was consumed by a load from a user thread, > > and then daxfs takes action and clears the

RE: [PATCH v1 3/3] mm,hwpoison: add kill_accessing_process() to find error virtual address

2021-04-20 Thread Luck, Tony
> I meant in this case (racing to access the same poisoned pages), the > page mapping should have been removed by and the hwpoison swap pte > installed by the winner thread? My "mutex" patch that Horiguchi-san has included his summary series should make this happen in most cases. Only problem is

Re: [PATCH v1 3/3] mm,hwpoison: add kill_accessing_process() to find error virtual address

2021-04-20 Thread Luck, Tony
On Mon, Apr 19, 2021 at 07:03:01PM -0700, Jue Wang wrote: > On Tue, 13 Apr 2021 07:43:20 +0900, Naoya Horiguchi wrote: > > > This patch suggests to do page table walk to find the error virtual > > address. If we find multiple virtual addresses in walking, we now can't > > determine which one is

Re: [PATCH v1 3/3] mm,hwpoison: add kill_accessing_process() to find error virtual address

2021-04-20 Thread Luck, Tony
On Mon, Apr 19, 2021 at 06:49:15PM -0700, Jue Wang wrote: > On Tue, 13 Apr 2021 07:43:20 +0900, Naoya Horiguchi wrote: > ... > > + * This function is intended to handle "Action Required" MCEs on already > > + * hardware poisoned pages. They could happen, for example, when > > + * memory_failure()

RE: [PATCH 4/4] x86/mce: Avoid infinite loop for copy from user recovery

2021-04-19 Thread Luck, Tony
>> But there are places in the kernel where the code assumes that this >> EFAULT return was simply because of a page fault. The code takes some >> action to fix that, and then retries the access. This results in a second >> machine check. > > What about return EHWPOISON instead of EFAULT and

RE: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-04-13 Thread Luck, Tony
> So what I'm missing with all this fun is, yeah, sure, we have this > facility out there but who's using it? Is anyone even using it at all? Even if no applications ever do anything with it, it is still useful to avoid crashing the whole system and just terminate one application/guest. > If so,

Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-04-08 Thread Luck, Tony
On Thu, Apr 08, 2021 at 10:49:58AM +0200, Borislav Petkov wrote:
> On Wed, Apr 07, 2021 at 02:43:10PM -0700, Luck, Tony wrote:
> > On Wed, Apr 07, 2021 at 11:18:16PM +0200, Borislav Petkov wrote:
> > > On Thu, Mar 25, 2021 at 05:02:34PM -0700, Tony Luck wrote:

RE: [PATCH 4/4] x86/mce: Avoid infinite loop for copy from user recovery

2021-04-08 Thread Luck, Tony
> What I'm still unclear on, does this new version address that > "mysterious" hang or panic which the validation team triggered or you > haven't checked yet? No :-( They are triggering some case where multiple threads in a process hit the same poison, and somehow memory_failure() fails to

RE: [RFC 0/4] Fix machine check recovery for copy_from_user

2021-04-08 Thread Luck, Tony
> I have one scenario, may you take into account: > > If one copyin case occurs, write() returned by your patch, the user process > may > check the return values, for errors, it may exit the process, then the error > page > will be freed, and then the page maybe alloced to other process or to

Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-04-07 Thread Luck, Tony
On Wed, Apr 07, 2021 at 11:18:16PM +0200, Borislav Petkov wrote: > On Thu, Mar 25, 2021 at 05:02:34PM -0700, Tony Luck wrote: > > Andy Lutomirski pointed out that sending SIGBUS to tasks that > > hit poison in the kernel copying syscall parameters from user > > address space is not the right

RE: [PATCH v3] mm,hwpoison: return -EHWPOISON when page already poisoned

2021-04-02 Thread Luck, Tony
>> Combined with my "mutex" patch (to get rid of races where 2nd process returns >> early, but first process is still looking for mappings to unmap and tasks >> to signal) this patch moves forward a bit. But I think it needs an >> additional change here in kill_me_maybe() to just "return" if there

Re: [PATCH v3] mm,hwpoison: return -EHWPOISON when page already poisoned

2021-04-01 Thread Luck, Tony
On Wed, Mar 31, 2021 at 07:25:40PM +0800, Aili Yao wrote: > When the page is already poisoned, another memory_failure() call in the > same page now return 0, meaning OK. For nested memory mce handling, this > behavior may lead to one mce looping, Example: > > 1.When LCME is enabled, and there are

RE: [PATCH] x86/mce/dev-mcelog: Fix potential memory access error

2021-03-29 Thread Luck, Tony
-	set_bit(MCE_OVERFLOW, (unsigned long *)&mcelog->flags);
+	mcelog->flags |= BIT(MCE_OVERFLOW);

set_bit() is an atomic operation because it might race with the code to get and clear this bit:

	do {
		flags = mcelog->flags;

RE: [PATCH] x86/mce: Add Skylake quirk for patrol scrub reported errors

2021-03-24 Thread Luck, Tony
> Yeah, into > > fd258dc4442c ("x86/mce: Add Skylake quirk for patrol scrub reported errors") Thanks ... my memory is failing, and I forgot that the patch had been improved and moved from core.c (where I looked for it) into severity.c -Tony

RE: [PATCH v5 2/3] x86/bus_lock: Handle #DB for bus lock

2021-03-19 Thread Luck, Tony
> What is the justifucation for making this rate limit per UID and not > per task, per process or systemwide? The concern is that a malicious user is running a workload that loops obtaining the buslock. This brings the whole system to its knees. Limiting per task doesn't help. The user can

Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-03-16 Thread Luck, Tony
On Fri, Mar 12, 2021 at 11:48:31PM +, Luck, Tony wrote:
> Thanks to the decode of the instruction we do have the virtual address. So we just need
> a safe walk of pgd->p4d->pud->pmd->pte (truncated if we hit a huge page) with a write
> of a "not-present"

RE: EDAC list as Trojan Horse distribution ??

2021-03-16 Thread Luck, Tony
>> Nothing new - just the next spammer attempt. > But this was a new class of Spam. So far i got only mass mailing... This was > personalized based on my previous e-Mail (did not include this part in my > mail) Somewhat new - combining trawling of public mailing lists for addresses with a

RE: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-03-12 Thread Luck, Tony
>> will memory_failure() find it and unmap it? if succeed, then the current >> will be >> signaled with correct vaddr and shift? > > That's a very good question. I didn't see a SIGBUS when I first wrote this > code, > hence all the p->mce_vaddr. But now I'm > a) not sure why there wasn't a

RE: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-03-12 Thread Luck, Tony
> Sorry to interrupt as I am really confused here: > If it's a copyin case, has the page been mapped for the current process? Yes. The kernel has the current process mapped to the low address range and code like get_user(), put_user() "simply" reaches down to those addresses (maybe not so simply,

RE: [PATCH V2 20/25] perf/x86/intel: Add Alder Lake Hybrid support

2021-03-11 Thread Luck, Tony
>> I think the "sapphire_rapids" is the code name for the server platform.
>
> If that's really the case, then:
>
> #define INTEL_FAM6_SAPPHIRERAPIDS_X	0x8F
>
> is wrong, those things should be uarch name, not platform name. Tony?

0x8F is the model number of the CPU that is named Sapphire

RE: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-03-11 Thread Luck, Tony
> I guess that p->mce_vaddr stores the virtual address of the error here. > If so, sending SIGBUS with the address looks enough as we do now, so why > do you walk page table to find the error virtual address? p->mce_vaddr only has the virtual address for the COPYIN case. In that code path we

RE: [PATCH v3] x86/fault: Send a SIGBUS to user process always for hwpoison page access.

2021-03-11 Thread Luck, Tony
> I think we need to at least fix the existing bug before we add more > signals. AFAICS the MCE_IN_KERNEL_COPYIN code is busted for kernel > threads. Can a kernel thread do get_user() or copy_from_user()? Do we have kernel threads that have an associated user address space to copy from? -Tony

Re: [PATCH v2] mm,hwpoison: return -EBUSY when page already poisoned

2021-03-09 Thread Luck, Tony
On Tue, Mar 09, 2021 at 08:28:24AM +, HORIGUCHI NAOYA(堀口 直也) wrote: > On Tue, Mar 09, 2021 at 02:35:34PM +0800, Aili Yao wrote: > > When the page is already poisoned, another memory_failure() call in the > > same page now return 0, meaning OK. For nested memory mce handling, this > > behavior

[PATCH] mm/memory-failure: Use a mutex to avoid memory_failure() races

2021-03-08 Thread Luck, Tony
There can be races when multiple CPUs consume poison from the same page. The first into memory_failure() atomically sets the HWPoison page flag and begins hunting for tasks that map this page. Eventually it invalidates those mappings and may send a SIGBUS to the affected tasks. But while all that

RE: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-03-08 Thread Luck, Tony
>> So it should be safe to grab and hold a mutex. See patch below.
>
> The mutex approach looks simpler and safer, so I'm fine with it.

Thanks. Is that an "Acked-by:"?

>> /**
>>  * memory_failure - Handle memory failure of a page.
>>  * @pfn: Page Number of the corrupted page
>> @@ -1424,12

RE: [PATCH v3] x86/fault: Send a SIGBUS to user process always for hwpoison page access.

2021-03-08 Thread Luck, Tony
> Can you point me at that SIGBUS code in a current kernel?

It is in kill_me_maybe(). mce_vaddr is setup when we disassemble whatever get_user() or copy from user variant was in use in the kernel when the poison memory was consumed.

	if (p->mce_vaddr != (void __user *)-1l) {

Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-03-05 Thread Luck, Tony
This whole page table walking patch is trying to work around the races caused by multiple calls to memory_failure() for the same page. Maybe better to just avoid the races. The comment right above memory_failure says: * Must run in process context (e.g. a work queue) with interrupts * enabled

RE: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-03-05 Thread Luck, Tony
> From the walk, it seems we have got the virtual address, can we just send a > SIGBUS with it? If the walk wins the race and the pte for the poisoned page is still valid, then yes. But we could have: CPU1CPU2 memory_failure sets poison bit for struct page rmap

Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-03-04 Thread Luck, Tony
On Thu, Mar 04, 2021 at 02:45:24PM +0800, Aili Yao wrote:
> > > if your methods works, should it be like this?
> > >
> > > 1582	pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
> > > 1583	if (PageHuge(page)) {
> > > 1584

RE: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-03-03 Thread Luck, Tony
> For error address with sigbus, i think this is not an issue resulted by the > patch i post, before my patch, the issue is already there. > I don't find a realizable way to get the correct address for same reason --- > we don't know whether the page mapping is there or not when > we got to

Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-03-03 Thread Luck, Tony
On Fri, Feb 26, 2021 at 10:59:15AM +0800, Aili Yao wrote: > Hi naoya, tony: > > > > > > Idea for what we should do next ... Now that x86 is calling > > > memory_failure() > > > from user context ... maybe parallel calls for the same page should > > > be blocked until the first caller completes

RE: [PATCH v3] x86/fault: Send a SIGBUS to user process always for hwpoison page access.

2021-03-01 Thread Luck, Tony
> Some programs may use read(2), write(2), etc as ways to check if > memory is valid without getting a signal. They might not want > signals, which means that this feature might need to be configurable. That sounds like an appalling hack. If users need such a mechanism we should create some

RE: [PATCH v3] x86/fault: Send a SIGBUS to user process always for hwpoison page access.

2021-03-01 Thread Luck, Tony
> Programs that get a signal might expect that the RIP that the signal > frame points to is the instruction that caused the signal and that the > instruction faulted without side effects. For SIGSEGV, I would be > especially nervous about this. Maybe SIGBUS is safer. For SIGSEGV, > it's

Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-02-26 Thread Luck, Tony
On Fri, Feb 26, 2021 at 10:52:50AM +0800, Aili Yao wrote: > Hi naoya,Oscar,david: > > > > > We could use some negative value (error code) to report the reported case, > > > then as you mentioned above, some callers need change to handle the > > > new case, and the same is true if you use some

Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-02-25 Thread Luck, Tony
On Thu, Feb 25, 2021 at 12:38:06PM +, HORIGUCHI NAOYA(堀口 直也) wrote: > Thank you for shedding light on this, this race looks worrisome to me. > We call try_to_unmap() inside memory_failure(), where we find affected > ptes by page_vma_mapped_walk() and convert into hwpoison entires in >

Re: [PATCH v3] x86/fault: Send a SIGBUS to user process always for hwpoison page access.

2021-02-23 Thread Luck, Tony
On Tue, Feb 23, 2021 at 07:33:46AM -0800, Andy Lutomirski wrote: > > > On Feb 23, 2021, at 4:44 AM, Aili Yao wrote: > > > > On Fri, 5 Feb 2021 17:01:35 +0800 > > Aili Yao wrote: > > > >> When one page is already hwpoisoned by MCE AO action, processes may not > >> be killed, processes mapping

RE: [PATCH v2] x86/mce: fix wrong no-return-ip logic in do_machine_check()

2021-02-23 Thread Luck, Tony
> What I think is qemu has not an easy to get the MCE signature from host or > currently no methods for this > So qemu treat all AR will be No RIPV, Do more is better than do less. RIPV would be important in the guest in the case where the guest can fix the problem that caused the machine check

checkpatch warnings for references to earlier commits

2021-02-22 Thread Luck, Tony
Would it be possible to teach checkpatch not to warn about canonical references to earlier commits? E.g. WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line) #7: commit e80634a75aba ("EDAC, skx: Retrieve and print retry_rd_err_log registers") Thanks -Tony

RE: [PATCH] x86, sched: Allow NUMA nodes to share an LLC on Intel platforms

2021-02-09 Thread Luck, Tony
> +#define X86_BUG_NUMA_SHARES_LLC	X86_BUG(25) /* CPU may enumerate an LLC shared by multiple NUMA nodes */

During internal review I wondered why this is a "BUG" rather than a "FEATURE" bit. Apparently, the suggestion for "BUG" came from earlier community discussions.

RE: Pstore : Query on using ramoops driver for DDR

2021-02-09 Thread Luck, Tony
> Can we use existing backend pstore ram driver (fs/pstore/ram.c) for DDR > instead of SRAM ? The expectation for pstore is that the system will go through a reset when it crashes. Most systems do not preserve DDR contents across reset. > Was the current driver written only to support

Re: [PATCH 02/49] x86/cpu: Describe hybrid CPUs in cpuinfo_x86

2021-02-08 Thread Luck, Tony
On Mon, Feb 08, 2021 at 02:04:24PM -0500, Liang, Kan wrote: > On 2/8/2021 12:56 PM, Borislav Petkov wrote: > > I think it's good enough for perf, but I'm not sure whether other codes need > the CPU type information. > > Ricardo, do you know? > > Maybe we should implement a generic function as

Re: [PATCH v3 1/7] seqnum_ops: Introduce Sequence Number Ops

2021-02-05 Thread Luck, Tony
On Fri, Feb 05, 2021 at 01:03:18PM -0700, Shuah Khan wrote:
> On 2/5/21 2:58 AM, Greg KH wrote:
> > On Wed, Feb 03, 2021 at 11:11:57AM -0700, Shuah Khan wrote:
> > > +static inline u32 seqnum32_inc(struct seqnum32 *seq)
> > > +{
> > > +	atomic_t val = ATOMIC_INIT(seq->seqnum);
> > > +

RE: [PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-02-02 Thread Luck, Tony
> So that "system hang or panic" which the validation folks triggered, > that cannot be reproduced anymore? Did they run the latest version of > the patch? I will get the validation folks to run the latest version (and play around with hyperthreading if they see problems). -Tony

RE: [PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-02-02 Thread Luck, Tony
> And the much more important question is, what is the code supposed to > do when that overflow *actually* happens in real life? Because IINM, > an overflow condition on the same page would mean killing the task to > contain the error and not killing the machine... Correct. The cases I've

RE: [PATCH v2] x86/fault: Send a SIGBUS to user process always for hwpoison page access.

2021-02-01 Thread Luck, Tony
> In any case, this patch needs rebasing on top of my big fault series Is that series in some GIT tree? Or can you give a lore.kernel.org link? Thanks -Tony

Re: [PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-02-01 Thread Luck, Tony
On Thu, Jan 28, 2021 at 06:57:35PM +0100, Borislav Petkov wrote: > Crazy idea: if you still can reproduce on -rc3, you could bisect: i.e., > if you apply the patch on -rc3 and it explodes and if you apply the same > patch on -rc5 and it works, then that could be a start... Yeah, don't > have a

Re: [PATCH] x86/fault: Send SIGBUS to user process always for hwpoison page access.

2021-01-29 Thread Luck, Tony
Thanks for the explanation and test code. I think I see better what is going on here. [I took your idea for using madvise(...MADV_HWPOISON) and added a new "-S" option to my einj_mem_uc test program to use madvise instead of ACPI/EINJ for injections. Update pushed here:

RE: linux-next: removal of the ia64 tree

2021-01-28 Thread Luck, Tony
Stephen, Yes. Most stuff I do these days goes through the RAS tree I share with Boris. -Tony -Original Message- From: Stephen Rothwell Sent: Thursday, January 28, 2021 11:53 AM To: Luck, Tony Cc: Arnd Bergmann ; Linux Kernel Mailing List ; Linux Next Mailing List Subject: linux

Re: [PATCH] x86/fault: Send SIGBUS to user process always for hwpoison page access.

2021-01-28 Thread Luck, Tony
On Thu, Jan 28, 2021 at 07:43:26PM +0800, Aili Yao wrote: > when one page is already hwpoisoned by AO action, process may not be > killed, the process mapping this page may make a syscall include this > page and result to trigger a VM_FAULT_HWPOISON fault, as it's in kernel > mode it may be fixed

Re: [PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-27 Thread Luck, Tony
On Tue, Jan 26, 2021 at 12:03:14PM +0100, Borislav Petkov wrote: > On Mon, Jan 25, 2021 at 02:55:09PM -0800, Luck, Tony wrote: > > And now I've changed it back to non-atomic (but keeping the > > slightly cleaner looking code style that I used for the atomic > > version).

[PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-25 Thread Luck, Tony
On Thu, Jan 21, 2021 at 01:09:59PM -0800, Luck, Tony wrote:
> On Wed, Jan 20, 2021 at 01:18:12PM +0100, Borislav Petkov wrote:
> But, on a whim, I changed the type of mce_count from "int" to "atomic_t" and fixed up the increment & clear to use atomic_inc_return(

Re: [PATCH v4] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-21 Thread Luck, Tony
On Wed, Jan 20, 2021 at 01:18:12PM +0100, Borislav Petkov wrote: > On Tue, Jan 19, 2021 at 03:57:59PM -0800, Luck, Tony wrote: > > But the real validation folks took my patch and found that it has > > destabilized cases 1 & 2 (and case 3 also chokes if you repeat a few >

RE: [PATCH v4] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-20 Thread Luck, Tony
> Yah, some printk sprinkling might be a good start. But debugging in that > atomic context is always nasty. ;-\ Some very light printk sprinkling (one message in queue_task_work() in atomic context, one each in kill_me_now() and kill_me_maybe() to check when task_work actually called them.

Re: [PATCH v4] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-19 Thread Luck, Tony
On Tue, Jan 19, 2021 at 11:56:32AM +0100, Borislav Petkov wrote: > On Fri, Jan 15, 2021 at 03:23:46PM -0800, Luck, Tony wrote: > > On Fri, Jan 15, 2021 at 12:51:03PM -0800, Luck, Tony wrote: > > > static void kill_me_now(struct callback_head *ch) > > >

Re: [PATCH v4] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-15 Thread Luck, Tony
On Fri, Jan 15, 2021 at 12:51:03PM -0800, Luck, Tony wrote:
> static void kill_me_now(struct callback_head *ch)
> {
> +	p->mce_count = 0;
> 	force_sig(SIGBUS);
> }

Brown paper bag time ... I just pasted that line from kill_me_maybe() and I thought I did a re-compile

[PATCH v4] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-15 Thread Luck, Tony
Recovery action when get_user() triggers a machine check uses the fixup path to make get_user() return -EFAULT. Also queue_task_work() sets up so that kill_me_maybe() will be called on return to user mode to send a SIGBUS to the current process. But there are places in the kernel where the code

Re: [PATCH v3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-15 Thread Luck, Tony
On Fri, Jan 15, 2021 at 04:27:54PM +0100, Borislav Petkov wrote: > On Thu, Jan 14, 2021 at 04:38:17PM -0800, Tony Luck wrote: > > Add a "mce_busy" counter so that task_work_add() is only called once > > per faulty page in this task. > > Yeah, that sentence can be removed now too. I will update

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-14 Thread Luck, Tony
On Thu, Jan 14, 2021 at 09:22:13PM +0100, Borislav Petkov wrote:
> On Mon, Jan 11, 2021 at 01:44:50PM -0800, Tony Luck wrote:
> > @@ -1431,8 +1433,11 @@ noinstr void do_machine_check(struct pt_regs *regs)
> > 			mce_panic("Failed kernel mode recovery", &m, msg);
> >

RE: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-13 Thread Luck, Tony
>> Maybe the other difference in approach between Andy and me is whether to >> go for a solution that covers all the corner cases, or just make an >> incremental >> improvement that allows for recover in some useful subset of remaining fatal >> cases, but still dies in other cases. > > Does that

RE: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-13 Thread Luck, Tony
> I think you guys have veered off into the weeds with this. The current > solution - modulo error messages not destined for humans :) - is not soo > bad, considering the whole MCA trainwreck. > > Or am I missing something completely unacceptable? Maybe the other difference in approach between

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-12 Thread Luck, Tony
On Tue, Jan 12, 2021 at 02:04:55PM -0800, Andy Lutomirski wrote: > > But we know that the fault happend in a get_user() or copy_from_user() call > > (i.e. an RIP with an extable recovery address). Does context switch > > access user memory? > > No, but NMI can. > > The case that would be very

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-12 Thread Luck, Tony
On Tue, Jan 12, 2021 at 10:57:07AM -0800, Andy Lutomirski wrote: > On Tue, Jan 12, 2021 at 10:24 AM Luck, Tony wrote: > > > > On Tue, Jan 12, 2021 at 09:21:21AM -0800, Andy Lutomirski wrote: > > > Well, we need to do *something* when the first __get_user() trips the >

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-12 Thread Luck, Tony
On Tue, Jan 12, 2021 at 09:21:21AM -0800, Andy Lutomirski wrote: > Well, we need to do *something* when the first __get_user() trips the > #MC. It would be nice if we could actually fix up the page tables > inside the #MC handler, but, if we're in a pagefault_disable() context > we might have

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-12 Thread Luck, Tony
On Tue, Jan 12, 2021 at 09:00:14AM -0800, Andy Lutomirski wrote: > > On Jan 11, 2021, at 2:21 PM, Luck, Tony wrote: > > > > On Mon, Jan 11, 2021 at 02:11:56PM -0800, Andy Lutomirski wrote: > >> > >>>> On Jan 11, 2021, at 1:45 PM, Tony Luck wrote:

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-11 Thread Luck, Tony
On Mon, Jan 11, 2021 at 02:11:56PM -0800, Andy Lutomirski wrote: > > > On Jan 11, 2021, at 1:45 PM, Tony Luck wrote: > > > > Recovery action when get_user() triggers a machine check uses the fixup > > path to make get_user() return -EFAULT. Also queue_task_work() sets up > > so that

RE: [PATCH 2/2] futex, x86/mce: Avoid double machine checks

2021-01-08 Thread Luck, Tony
> Yeah, saw that, but that should be trivially fixable I'm thinking.

Trivial, maybe ... but then follows the audit of other get_user() calls:

	git grep get_user\( | wc -l
	2003

:-(

-Tony

RE: [PATCH 2/2] futex, x86/mce: Avoid double machine checks

2021-01-08 Thread Luck, Tony
> I think this is horrid; why can't we have it return something different > then -EFAULT instead? I did consider this ... but it appears that architectures aren't unified in the return value from get_user() Here's another function involved in the futex call chain leading to this: static int

RE: [PATCH RFC x86/mce] Make mce_timed_out() identify holdout CPUs

2021-01-06 Thread Luck, Tony
res there. But that's a useful improvement (eliminating the other three sockets on this system). Tested-by: Tony Luck -Tony

Re: [PATCH RFC x86/mce] Make mce_timed_out() identify holdout CPUs

2021-01-06 Thread Luck, Tony
On Wed, Jan 06, 2021 at 11:17:08AM -0800, Paul E. McKenney wrote: > On Wed, Jan 06, 2021 at 06:39:30PM +0000, Luck, Tony wrote: > > > The "Timeout: Not all CPUs entered broadcast exception handler" message > > > will appear from time to time given enough

RE: [PATCH RFC x86/mce] Make mce_timed_out() identify holdout CPUs

2021-01-06 Thread Luck, Tony
> The "Timeout: Not all CPUs entered broadcast exception handler" message > will appear from time to time given enough systems, but this message does > not identify which CPUs failed to enter the broadcast exception handler. > This information would be valuable if available, for example, in order

Re: [PATCH] ia64: fix xchg() warning

2021-01-05 Thread Luck, Tony
On Tue, Jan 05, 2021 at 02:17:41PM +0100, Arnd Bergmann wrote: > On Mon, Jan 4, 2021 at 5:00 PM Luck, Tony wrote: > > > > > I have not received any reply from the ia64 maintainers, I assume they > > > were > > > both out of office for Christmas.

RE: [PATCH] ia64: fix xchg() warning

2021-01-04 Thread Luck, Tony
> I have not received any reply from the ia64 maintainers, I assume they were > both out of office for Christmas. I'm back in the office ... but have no working ia64 machines, nor time to look at patches :-( Should drop me from the MAINTAINERS file. -Tony

RE: linux-next: Tree for Nov 19 (drivers/edac/igen6_edac.c)

2020-11-19 Thread Luck, Tony
> ../drivers/edac/igen6_edac.c: In function 'ecclog_nmi_handler': > ../drivers/edac/igen6_edac.c:525:10: error: 'NMI_DONE' undeclared (first use > in this function); did you mean 'DMI_NONE'? >return NMI_DONE; This driver has a #include But inside that file it says: #if

RE: [PATCH v4 06/17] PCI: add SIOV and IMS capability detection

2020-11-13 Thread Luck, Tony
> Of course is this not only an x86 problem. Every architecture which > supports virtualization has the same issue. ARM(64) has no way to tell > for sure whether the machine runs bare metal either. No idea about the > other architectures. Sounds like a hypervisor problem. If the VMM provides

[PATCH v2] x86/mce: Use "safe" MSR functions when enabling additional error logging

2020-11-10 Thread Luck, Tony
Booting as a guest under KVM results in error messages about unchecked MSR access: [6.814328][T0] unchecked MSR access error: RDMSR from 0x17f at rIP: 0x84483f16 (mce_intel_feature_init+0x156/0x270) because KVM doesn't provide emulation for random model specific registers.

RE: [PATCH] x86/mce: Check for hypervisor before enabling additional error logging

2020-11-10 Thread Luck, Tony
I still think there is a reasonable case to claim that this usage is right to check whether it is running as a guest. Look at what it is trying to do ... change the behavior of the platform w.r.t. logging of memory errors. How does that make any sense for a guest ... that doesn't even know

RE: [tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs

2020-11-09 Thread Luck, Tony
> I wouldn't expect all hypervisors to necessarily set CPUID.01H:ECX[bit > 31]. Architecturally, on Intel CPUs, that bit is simply defined as > "not used." There is no documented contract between Intel and > hypervisor vendors regarding the use of that bit. (AMD, on the other > hand, *does*

[PATCH] x86/mce: Check for hypervisor before enabling additional error logging

2020-11-09 Thread Luck, Tony
Booting as a guest under KVM results in error messages about unchecked MSR access: [6.814328][T0] unchecked MSR access error: RDMSR from 0x17f at rIP: 0x84483f16 (mce_intel_feature_init+0x156/0x270) because KVM doesn't provide emulation for random model specific registers.

RE: [tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs

2020-11-09 Thread Luck, Tony
> I thought Linux had long ago gone the route of turning rdmsr/wrmsr > into rdmsr_safe/wrmsr_safe, so that the guest would ignore the #GPs on > writes and return zero to the caller for #GPs on reads. Linux just switched that around for the machine check banks ... if they #GP fault, then something

RE: [tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs

2020-11-09 Thread Luck, Tony
What does KVM do with model specific MSRs? Looks like you let the guest believe it was running on one of Sandy Bridge, Ivy Bridge, Haswell (Xeon). So, the core MCE code tried to enable extended error reporting. If there is a mode to have KVM let the guest think that it read/wrote MSR 0x17F,

Re: [PATCH] x86/mce: Enable additional error logging on certain Intel CPUs

2020-10-30 Thread Luck, Tony
On Fri, Oct 30, 2020 at 12:04:03PM -0700, Luck, Tony wrote: Bah, didn't notice this conversation didn't include LKML. > The Xeon versions of Sandy Bridge, Ivy Bridge and Haswell support an > optional additional error logging mode which is enabled by an MSR. > > Previously this mode

[PLEASE PULL] ia64 for v5.10

2020-10-12 Thread Luck, Tony
The following changes since commit f4d51dffc6c01a9e94650d95ce0104964f8ae822: Linux 5.9-rc4 (2020-09-06 17:11:40 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git tags/ia64_for_5.10 for you to fetch changes up to

RE: [PATCH v3 4/6] x86/mce: Avoid tail copy when machine check terminated a copy from user

2020-10-07 Thread Luck, Tony
>> Machine checks are more serious. Just give up at the point where the >> main copy loop triggered the #MC and return from the copy code as if >> the copy succeeded. The machine check handler will use task_work_add() to >> make sure that the task is sent a SIGBUS. > > Isn't that just plain wrong?

Re: [PATCH v2 6/7] x86/mce: Recover from poison found while copying from user space

2020-10-05 Thread Luck, Tony
On Mon, Oct 05, 2020 at 06:32:47PM +0200, Borislav Petkov wrote: > On Wed, Sep 30, 2020 at 04:26:10PM -0700, Tony Luck wrote: > > arch/x86/kernel/cpu/mce/core.c | 33 + > > include/linux/sched.h | 2 ++ > > 2 files changed, 23 insertions(+), 12

Re: [PATCH 0/3] x86: Add initial support to discover Intel hybrid CPUs

2020-10-02 Thread Luck, Tony
On Sat, Oct 03, 2020 at 03:39:29AM +0200, Thomas Gleixner wrote: > On Fri, Oct 02 2020 at 13:19, Ricardo Neri wrote: > > Add support to discover and enumerate CPUs in Intel hybrid parts. A hybrid > > part has CPUs with more than one type of micro-architecture. Thus, certain > > features may only
