Re: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-27 Thread Borislav Petkov
On Fri, Jan 26, 2024 at 10:01:29PM +, Luck, Tony wrote: > PPIN: Nice to have. People that send stuff to me are terrible about > providing surrounding details. The record already includes > CPUID(1).EAX ... so I can at least skip the step of asking them which > CPU family/model/stepping they

RE: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Luck, Tony
> But no, that's not the right question to ask. > > It is rather: which bits of information are very relevant to an error > record and which are transient enough so that they cannot be gathered > from a system by other means or only gathered in a difficult way, and > should be part of that record.

Re: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Borislav Petkov
On Fri, Jan 26, 2024 at 08:49:03PM +, Luck, Tony wrote: > Every patch that adds new code or data structures adds to the kernel > memory footprint. Each should be considered on its merits. The basic > question being: > >"Is the new functionality worth the cost?" > > Where does it end? It

RE: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Luck, Tony
> > Is it so very different to add this to a trace record so that rasdaemon > > can have feature parity with mcelog(8)? > > I knew you were gonna say that. When someone decides that it is > a splendid idea to add more stuff to struct mce then said someone would > want it in the tracepoint too. > >

Re: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Borislav Petkov
On Fri, Jan 26, 2024 at 07:15:50PM +, Luck, Tony wrote: > If deployment of a microcode update across a fleet always went > flawlessly, life would be simpler. But things can fail. And maybe the > failure wasn't noticed. Perhaps a node was rebooting when the sysadmin > pushed the update to the

RE: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Luck, Tony
> > You've spent enough time with Ashok and Thomas tweaking the Linux > > microcode driver to know that going back to the machine the next day > > to ask about microcode version has a bunch of ways to get a wrong > > answer. > > Huh, what does that have to do with this? If deployment of a

Re: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Borislav Petkov
On Fri, Jan 26, 2024 at 05:10:20PM +, Luck, Tony wrote: > 12 extra bytes divided by (say) 64GB (a very small server these days, may > laptop has that much) >= 0.0001746% > > We will need 57000 changes like this one before we get to 0.001% :-) You're forgetting that those 12 bytes

RE: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Luck, Tony
> > 8 bytes for PPIN, 4 more for microcode. > > I know, nothing leads to bloat like 0.01% here, 0.001% there... 12 extra bytes divided by (say) 64GB (a very small server these days, may laptop has that much) = 0.0001746% We will need 57000 changes like this one before we get to 0.001%

Re: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-26 Thread Borislav Petkov
On Thu, Jan 25, 2024 at 07:19:22PM +, Luck, Tony wrote: > 8 bytes for PPIN, 4 more for microcode. I know, nothing leads to bloat like 0.01% here, 0.001% there... > Number of recoverable machine checks per system I hope the > monthly rate should be countable on my fingers... That's not

Re: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-25 Thread Naik, Avadhut
Hi, On 1/25/2024 1:19 PM, Luck, Tony wrote: >>> The first patch adds PPIN (Protected Processor Inventory Number) field to >>> the tracepoint. >>> >>> The second patch adds the microcode field (Microcode Revision) to the >>> tracepoint. >> >> This is a lot of static information to add to *every*

RE: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-25 Thread Luck, Tony
> > The first patch adds PPIN (Protected Processor Inventory Number) field to > > the tracepoint. > > > > The second patch adds the microcode field (Microcode Revision) to the > > tracepoint. > > This is a lot of static information to add to *every* MCE. 8 bytes for PPIN, 4 more for microcode.

Re: [PATCH v2 0/2] Update mce_record tracepoint

2024-01-25 Thread Borislav Petkov
On Thu, Jan 25, 2024 at 12:48:55PM -0600, Avadhut Naik wrote: > This patchset updates the mce_record tracepoint so that the recently added > fields of struct mce are exported through it to userspace. > > The first patch adds PPIN (Protected Processor Inventory Number) field to > the tracepoint. >

[PATCH v2 0/2] Update mce_record tracepoint

2024-01-25 Thread Avadhut Naik
This patchset updates the mce_record tracepoint so that the recently added fields of struct mce are exported through it to userspace. The first patch adds PPIN (Protected Processor Inventory Number) field to the tracepoint. The second patch adds the microcode field (Microcode Revision) to the