Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-25 Thread Kani, Toshimitsu
On Mon, 2017-07-24 at 20:30 +0200, Borislav Petkov wrote: : > > So I don't want to break existing users and thus make only explicitly > known platforms load ghes_edac. In the current case, the HPE > machines. All the rest will simply use the platform drivers and > nothing will change for them. >

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Borislav Petkov
(Sending to your other mail address because there's some temporary resolution issue: msmtp: recipient address mche...@s-opensource.com not accepted by the server msmtp: server message: 451 4.3.0 : Temporary lookup failure msmtp: could not send mail (account alien8.de from /home/boris/.msmtprc)

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Borislav Petkov
On Mon, Jul 24, 2017 at 05:54:52PM +, Kani, Toshimitsu wrote: > Umm... I was under impression that we are adding the OSC bit check in > addition to the current GHES filtering. Read the parallel subthread again. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. --

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Kani, Toshimitsu
On Mon, 2017-07-24 at 14:56 -0300, Mauro Carvalho Chehab wrote: > On Mon, 24 Jul 2017 15:56:27 + : > That's probably too late for me as I received a new HP machine > we bought just last week, but for the next time I would need to > get a new hardware, what would be the non-RAS equivalent to >

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Mauro Carvalho Chehab
On Mon, 24 Jul 2017 18:44:00 +0200 Borislav Petkov wrote: > On Mon, Jul 24, 2017 at 01:04:13PM -0300, Mauro Carvalho Chehab wrote: > > If the Kernel forces those users to use ghes_edac by default, > > they won't see the error counts anymore, but, instead, > > hardware reports that the memo

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Mauro Carvalho Chehab
On Mon, 24 Jul 2017 15:56:27 + "Kani, Toshimitsu" wrote: > On Mon, 2017-07-24 at 17:37 +0200, Borislav Petkov wrote: > > On Mon, Jul 24, 2017 at 03:25:34PM +, Kani, Toshimitsu wrote: > : > > > > > We've been providing this model for many years now. > > > > Dude, relax, I'm onl

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Kani, Toshimitsu
On Mon, 2017-07-24 at 20:50 +0300, Boris Petkov wrote: > On July 24, 2017 8:44:03 PM GMT+03:00, "Kani, Toshimitsu" @hpe.com> wrote: > > I assumed our platforms w/o build-in RAS do not implement GHES, > > If we make it a normal module, it will be decoupled from GHES and it > will rely only on the

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Boris Petkov
On July 24, 2017 8:44:03 PM GMT+03:00, "Kani, Toshimitsu" wrote: >I assumed our platforms w/o build-in RAS do not implement GHES, If we make it a normal module, it will be decoupled from GHES and it will rely only on the whitelist to load. -- Sent from a small device: formatting sux and brevi

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Kani, Toshimitsu
On Mon, 2017-07-24 at 18:37 +0200, Borislav Petkov wrote: > On Mon, Jul 24, 2017 at 03:56:27PM +, Kani, Toshimitsu wrote: > > Yes, Mauro has already pointed this out.  As I replied to him, we > > do have a separate series of platforms that do not have built-in > > RAS, and > > So this whitelis

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Borislav Petkov
On Mon, Jul 24, 2017 at 01:04:13PM -0300, Mauro Carvalho Chehab wrote: > If the Kernel forces those users to use ghes_edac by default, > they won't see the error counts anymore, but, instead, > hardware reports that the memories need to be replaced. This is exactly why I'm trying to load ghes_

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Borislav Petkov
On Mon, Jul 24, 2017 at 03:56:27PM +, Kani, Toshimitsu wrote: > Yes, Mauro has already pointed this out. As I replied to him, we do > have a separate series of platforms that do not have built-in RAS, and So this whitelist entry +static struct acpi_oemlist oemlist[] = { + {"HPE ", "S
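The whitelist idea in this message can be illustrated with a small userspace sketch. This is not the code from the patch: the struct name `oem_entry`, its fields, the function `platform_is_whitelisted()` and the space padding of the strings are all assumptions made for illustration. The "HPE" / "Server" pair is taken from elsewhere in this thread; the fixed widths follow the ACPI table header, where the OEM ID is 6 bytes and the OEM table ID is 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Field widths of the ACPI table header OEM fields. */
#define OEM_ID_LEN        6
#define OEM_TABLE_ID_LEN  8

/* Hypothetical whitelist entry; not the struct used in the patch. */
struct oem_entry {
	char oem_id[OEM_ID_LEN + 1];
	char oem_table_id[OEM_TABLE_ID_LEN + 1];
};

/* Illustrative content only: space padding is an assumption. */
static const struct oem_entry oem_whitelist[] = {
	{ "HPE   ", "Server  " },
};

/* Return 1 if the (OEM ID, OEM table ID) pair is on the whitelist. */
int platform_is_whitelisted(const char *oem_id, const char *oem_table_id)
{
	size_t i;

	assert(oem_id != NULL && oem_table_id != NULL);

	for (i = 0; i < sizeof(oem_whitelist) / sizeof(oem_whitelist[0]); i++) {
		if (!strncmp(oem_id, oem_whitelist[i].oem_id, OEM_ID_LEN) &&
		    !strncmp(oem_table_id, oem_whitelist[i].oem_table_id,
			     OEM_TABLE_ID_LEN))
			return 1;
	}
	return 0;
}
```

With such a table, ghes_edac would load only when the firmware's table header matches an entry, and every other platform would keep using its platform-specific EDAC driver.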

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Mauro Carvalho Chehab
On Mon, 24 Jul 2017 17:37:16 +0200 Borislav Petkov wrote: > > Customers do not see error counts.  I do not think it's bogus. > > I am just trying to enable OS error reporting with ghes_edac. > > I know, you don't have to state the obvious constantly. The problem I see is that, currently,

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Kani, Toshimitsu
On Mon, 2017-07-24 at 17:37 +0200, Borislav Petkov wrote: > On Mon, Jul 24, 2017 at 03:25:34PM +, Kani, Toshimitsu wrote: : > > > We've been providing this model for many years now. > > Dude, relax, I'm only trying to point out to you that there are > customers who want to see *every* error

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Borislav Petkov
On Mon, Jul 24, 2017 at 03:25:34PM +, Kani, Toshimitsu wrote: > Customers do not see error counts.  I do not think it's bogus. Not showing the real error counts but something contrived is the definition of bogus numbers. But you're not showing anything - only when some thresholds are bei

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Kani, Toshimitsu
On Mon, 2017-07-24 at 17:04 +0200, Borislav Petkov wrote: > On Mon, Jul 24, 2017 at 02:49:30PM +, Kani, Toshimitsu wrote: > > We do not tell the error counts to customers. > > Please read what I said: do you tell your customers that the error > counts they're seeing (or are *not* seeing) is bo

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Borislav Petkov
On Mon, Jul 24, 2017 at 02:49:30PM +, Kani, Toshimitsu wrote: > We do not tell the error counts to customers. Please read what I said: do you tell your customers that the error counts they're seeing (or are *not* seeing) is bogus because the BIOS is hiding them? Not the *actual* numbers! > We

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-24 Thread Kani, Toshimitsu
On Sat, 2017-07-22 at 08:28 +0200, Borislav Petkov wrote: > On Fri, Jul 21, 2017 at 06:38:52PM +, Kani, Toshimitsu wrote: > > Enterprise platforms have very different model (I do not say it's > > better for everyone from the cost perspective).  Typically, such > > But you do tell your customer

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Borislav Petkov
On Fri, Jul 21, 2017 at 06:38:52PM +, Kani, Toshimitsu wrote: > Enterprise platforms have very different model (I do not say it's > better for everyone from the cost perspective). Typically, such But you do tell your customers that the error counts they see are not really what *actually* happ

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Kani, Toshimitsu
On Fri, 2017-07-21 at 19:23 +0200, Borislav Petkov wrote: : > Not only that: thresholds depend on the DIMM types which means, BIOS > must know what DIMM types are in there which I doubt. BIOS knows DIMM model from the SPD data. > So exposing that to configuration instead of "deciding" for peopl

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Borislav Petkov
On Fri, Jul 21, 2017 at 02:01:31PM -0300, Mauro Carvalho Chehab wrote: > I see the value of having a threshold in BIOS, provided that it is > well documented, and whose value can be adjusted, if needed. > > One of the things I wanted to implement in ras-daemon were an > algorithm that would be doi

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Kani, Toshimitsu
On Fri, 2017-07-21 at 14:01 -0300, Mauro Carvalho Chehab wrote: > On Fri, 21 Jul 2017 16:40:20 + > "Kani, Toshimitsu" wrote: > > > On Fri, 2017-07-21 at 12:44 -0300, Mauro Carvalho Chehab wrote: > > > On Fri, 21 Jul 2017 15:34:50 + > > > "Kani, Toshimitsu" wrote: > > >    > > > > O

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Mauro Carvalho Chehab
On Fri, 21 Jul 2017 16:40:20 + "Kani, Toshimitsu" wrote: > On Fri, 2017-07-21 at 12:44 -0300, Mauro Carvalho Chehab wrote: > > On Fri, 21 Jul 2017 15:34:50 + > > "Kani, Toshimitsu" wrote: > > > > > On Fri, 2017-07-21 at 17:13 +0200, Borislav Petkov wrote: > > > > On Fri, Jul 2

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Kani, Toshimitsu
On Fri, 2017-07-21 at 12:44 -0300, Mauro Carvalho Chehab wrote: > On Fri, 21 Jul 2017 15:34:50 + > "Kani, Toshimitsu" wrote: > > > On Fri, 2017-07-21 at 17:13 +0200, Borislav Petkov wrote: > > > On Fri, Jul 21, 2017 at 03:08:41PM +, Kani, Toshimitsu > > > wrote:   > > > > Yes, that is

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Kani, Toshimitsu
On Fri, 2017-07-21 at 17:53 +0200, Borislav Petkov wrote: > On Fri, Jul 21, 2017 at 03:34:50PM +, Kani, Toshimitsu wrote: > > I suppose it'd depend on vendors, but I do not think users can do > > it properly unless they have depth knowledge about the hardware. > > I'm talking about a menu in t

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Borislav Petkov
On Fri, Jul 21, 2017 at 03:34:50PM +, Kani, Toshimitsu wrote: > I suppose it'd depend on vendors, but I do not think users can do it > properly unless they have depth knowledge about the hardware. I'm talking about a menu in the BIOS where you can set the thresholding levels on the system. Doe

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Mauro Carvalho Chehab
On Fri, 21 Jul 2017 15:34:50 + "Kani, Toshimitsu" wrote: > On Fri, 2017-07-21 at 17:13 +0200, Borislav Petkov wrote: > > On Fri, Jul 21, 2017 at 03:08:41PM +, Kani, Toshimitsu wrote: > > > Yes, that is correct.  Corrected errors are reported to the OS when > > > they exceeded the pla

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Kani, Toshimitsu
On Fri, 2017-07-21 at 17:13 +0200, Borislav Petkov wrote: > On Fri, Jul 21, 2017 at 03:08:41PM +, Kani, Toshimitsu wrote: > > Yes, that is correct.  Corrected errors are reported to the OS when > > they exceeded the platform's threshold. > > Are those thresholds user-configurable? I suppose i

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Borislav Petkov
On Fri, Jul 21, 2017 at 03:08:41PM +, Kani, Toshimitsu wrote: > Yes, that is correct. Corrected errors are reported to the OS when > they exceeded the platform's threshold. Are those thresholds user-configurable? If not, what are you telling users who want to see *every* corrected error for

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Kani, Toshimitsu
On Fri, 2017-07-21 at 15:47 +0200, Borislav Petkov wrote: > On Fri, Jul 21, 2017 at 10:40:01AM -0300, Mauro Carvalho Chehab > wrote: > > What happens when the error can be corrected? Does it still report > > it to userspace, or just silently hide the error? > > > > If I remember well about a past

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Borislav Petkov
On Fri, Jul 21, 2017 at 10:40:01AM -0300, Mauro Carvalho Chehab wrote: > What happens when the error can be corrected? Does it still report it to > userspace, or just silently hide the error? > > If I remember well about a past discussion with some vendor, I was told > that the firmware can hide s

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Mauro Carvalho Chehab
On Fri, 21 Jul 2017 15:34:41 +0200 Borislav Petkov wrote: > On Thu, Jul 20, 2017 at 07:50:03PM +, Kani, Toshimitsu wrote: > > GHES / firmware-first still requires OS recovery actions when an error > > cannot be corrected by the platform. They are handled by ghes_proc(), > > and ghes_edac

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-21 Thread Borislav Petkov
On Thu, Jul 20, 2017 at 07:50:03PM +, Kani, Toshimitsu wrote: > GHES / firmware-first still requires OS recovery actions when an error > cannot be corrected by the platform. They are handled by ghes_proc(), > and ghes_edac remains its error-reporting wrapper. I mean all the recovery actions t

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Kani, Toshimitsu
On Thu, 2017-07-20 at 17:15 -0300, Mauro Carvalho Chehab wrote: > On Thu, 20 Jul 2017 19:50:03 + > "Kani, Toshimitsu" wrote: : > > Firmware has better knowledge about the platform and can provide > > better RAS when implemented properly.  I agree that user > > experiences may vary on platf

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Mauro Carvalho Chehab
On Thu, 20 Jul 2017 19:50:03 + "Kani, Toshimitsu" wrote: > On Thu, 2017-07-20 at 06:33 +0200, Borislav Petkov wrote: > > On Wed, Jul 19, 2017 at 04:40:25PM +, Kani, Toshimitsu wrote: > > >  ghes_edac allows to report errors to OS management tools like > > > rasdaemon in addition to p

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Kani, Toshimitsu
On Thu, 2017-07-20 at 06:33 +0200, Borislav Petkov wrote: > On Wed, Jul 19, 2017 at 04:40:25PM +, Kani, Toshimitsu wrote: > >  ghes_edac allows to report errors to OS management tools like > > rasdaemon in addition to platform- specific managements. > > So ghes_edac *is* a poor man's driver in

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Mauro Carvalho Chehab
On Thu, 20 Jul 2017 19:05:04 +0200 Borislav Petkov wrote: > On Thu, Jul 20, 2017 at 04:55:59PM +, Luck, Tony wrote: > > Add a module parameter to those edac drivers that can override the check > > and let them load anyway. I'm not paranoid, I just assume that there is a > > BIOS > > out

RE: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Luck, Tony
> Or add that parameter to edac_core.ko and let it control which EDAC > driver gets loaded? Something like > > edac=ignore_ghes > > or so. And then the other EDAC drivers query it. Sure ... one central place is better than adding code to each driver. -Tony
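The "one central place" idea could look roughly like the following userspace sketch. It only models the logic. `edac_parse_param()` and `platform_driver_may_load()` are hypothetical names, and the way the real edac_core would register and parse an `edac=` option is not shown or assumed here. The only detail taken from the thread is the option value `ignore_ghes`.

```c
#include <assert.h>
#include <string.h>

/* Central override flag, set once from the "edac=" option value. */
static int ignore_ghes;

/* Hypothetical parser: remember whether "ignore_ghes" was requested. */
void edac_parse_param(const char *val)
{
	assert(val != NULL);
	ignore_ghes = (strcmp(val, "ignore_ghes") == 0);
}

/*
 * Hypothetical query used by each platform EDAC driver at init time:
 * load if the user asked to ignore GHES, or if ghes_edac is not in use.
 */
int platform_driver_may_load(int ghes_edac_active)
{
	return ignore_ghes || !ghes_edac_active;
}
```

Keeping the flag in one module means each platform driver needs a single query call instead of its own module parameter.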

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Borislav Petkov
On Thu, Jul 20, 2017 at 04:55:59PM +, Luck, Tony wrote: > Add a module parameter to those edac drivers that can override the check > and let them load anyway. I'm not paranoid, I just assume that there is a > BIOS > out there that sets the OSC/WHEA bits, but isn't generating useful GHES logs.

RE: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Luck, Tony
>> Yes, the following message is shown on HP systems. Please note that >> WHEA is a Windows-defined interface. > > Ok, so let's couple ghes_edac loading to that and see how far we could > go. I guess we should add checks for that to the major x86 EDAC drivers > to not load and this way ghes_edac w

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Borislav Petkov
On Thu, Jul 20, 2017 at 02:42:25PM +, Kani, Toshimitsu wrote: > Yes, the following message is shown on HP systems. Please note that > WHEA is a Windows-defined interface. Ok, so let's couple ghes_edac loading to that and see how far we could go. I guess we should add checks for that to the ma
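As a rough illustration of coupling the driver choice to the firmware's _OSC/WHEA acknowledgement, the decision proposed at this point in the thread reduces to a single predicate. The sketch below is a userspace model under stated assumptions: `struct fw_state`, `enum edac_choice` and `choose_edac_driver()` are made-up names, and the real kernel state (the thread mentions `osc_sb_apei_support_acked`) is represented by a plain integer field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of what firmware acknowledged via _OSC. */
struct fw_state {
	int osc_apei_acked;	/* nonzero if APEI/WHEA support was acked */
};

enum edac_choice {
	EDAC_USE_PLATFORM_DRIVER,	/* e.g. sb_edac, skx_edac */
	EDAC_USE_GHES,			/* ghes_edac */
};

/* Pick ghes_edac only when firmware acked APEI support. */
enum edac_choice choose_edac_driver(const struct fw_state *fw)
{
	assert(fw != NULL);
	return fw->osc_apei_acked ? EDAC_USE_GHES : EDAC_USE_PLATFORM_DRIVER;
}
```

Later messages in the thread move toward an explicit platform whitelist instead, so this models only the proposal made in this message.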

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-20 Thread Kani, Toshimitsu
On Thu, 2017-07-20 at 06:16 +0200, Borislav Petkov wrote: > On Wed, Jul 19, 2017 at 04:56:17PM +, Kani, Toshimitsu wrote: > > Since ghes_edac has not been used for a long time, I have a feeling > > that not so many vendors want to use it.  In the case of HPE, we do > > not need to update with e

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Borislav Petkov
On Wed, Jul 19, 2017 at 04:40:25PM +, Kani, Toshimitsu wrote: > ghes_edac allows to report errors to OS management tools like > rasdaemon in addition to platform- specific managements. So ghes_edac *is* a poor man's driver in the sense that it doesn't do anything fancy but repeat like a parro

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Borislav Petkov
On Wed, Jul 19, 2017 at 02:55:08PM -0400, Aristeu Rozanski wrote: > That would also need to keep an eye on versions. A newer version of BIOS > on a whitelisted platform might be broken. Yeah, that would be a nasty, back-stabbing SNAFU. So I'm thinking of adding a bunch of FW_ERR sanity checks to

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Borislav Petkov
On Wed, Jul 19, 2017 at 04:56:17PM +, Kani, Toshimitsu wrote: > Since ghes_edac has not been used for a long time, I have a feeling > that not so many vendors want to use it. In the case of HPE, we do not > need to update with each platform since "HPE" "Server" will cover all > platforms we ne

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Kani, Toshimitsu
On Wed, 2017-07-19 at 14:55 -0400, Aristeu Rozanski wrote: > On Wed, Jul 19, 2017 at 06:22:04PM +0200, Borislav Petkov wrote: > > On Wed, Jul 19, 2017 at 04:10:07PM +, Kani, Toshimitsu wrote: > > > I do prefer to avoid any white / black listing.  But I do not see > > > how > > > it solves the b

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Aristeu Rozanski
On Wed, Jul 19, 2017 at 06:22:04PM +0200, Borislav Petkov wrote: > On Wed, Jul 19, 2017 at 04:10:07PM +, Kani, Toshimitsu wrote: > > I do prefer to avoid any white / black listing. But I do not see how > > it solves the buggy DMI/SMBIOS info as an example of firmware bugs we > > may have to de

RE: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Luck, Tony
>> Later when GHES gives you a NODE/CARD/MODULE) in an error record. You need >> to match these up. But SMBIOS only gave you two strings "Locator" and "Bank >> Locator" which have no defined syntax. You are at the mercy of the BIOS >> writer >> to put in something parseable. > > Well, at some poi

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Kani, Toshimitsu
On Wed, 2017-07-19 at 18:22 +0200, Borislav Petkov wrote: > On Wed, Jul 19, 2017 at 04:10:07PM +, Kani, Toshimitsu wrote: > > I do prefer to avoid any white / black listing.  But I do not see > > how it solves the buggy DMI/SMBIOS info as an example of firmware > > bugs we may have to deal with

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Kani, Toshimitsu
On Tue, 2017-07-18 at 18:15 -0300, Mauro Carvalho Chehab wrote: > On Tue, 18 Jul 2017 19:58:54 + : > We had a similar discussion several years ago when I wrote this > driver. On that time, I talked with Red Hat, HP, Dell, Intel people > and with some customers with large clusters. > > The way

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Borislav Petkov
On Wed, Jul 19, 2017 at 04:10:07PM +, Kani, Toshimitsu wrote: > I do prefer to avoid any white / black listing. But I do not see how > it solves the buggy DMI/SMBIOS info as an example of firmware bugs we > may have to deal with. So how do you want to deal with this? Maintain an evergrowing

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Kani, Toshimitsu
On Wed, 2017-07-19 at 07:52 +0200, Borislav Petkov wrote: > On Tue, Jul 18, 2017 at 09:20:44PM +, Kani, Toshimitsu wrote: > > I agree that 'osc_sb_apei_support_acked' should be checked when > > enabling ghes_edac.  I do not know the details of existing issues, > > but it sounds unlikely that th

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Borislav Petkov
On Wed, Jul 19, 2017 at 03:14:32PM +, Luck, Tony wrote: > Later when GHES gives you a NODE/CARD/MODULE) in an error record. You need > to match these up. But SMBIOS only gave you two strings "Locator" and "Bank > Locator" which have no defined syntax. You are at the mercy of the BIOS writer >

RE: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-19 Thread Luck, Tony
> "The module number of the memory error location. (NODE, CARD, and MODULE > should provide the information necessary to identify the failing FRU)." > > So this tuple is sufficient to pinpoint the DIMM, IIUC. > > Which means, ghes_edac can have a single layer of DIMMs without channels. The tricky
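A "single layer of DIMMs" keyed by the (NODE, CARD, MODULE) tuple can be pictured as a flat table lookup. The sketch below assumes the hard part discussed in this thread, building that table from the free-form SMBIOS locator strings, has already been done somehow. `struct dimm_loc`, `find_dimm_label()` and the sample labels are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flat DIMM record keyed by the GHES location tuple. */
struct dimm_loc {
	unsigned int node;
	unsigned int card;
	unsigned int module;
	const char *label;	/* made-up silkscreen label */
};

/* Sample data only; real labels would come from firmware tables. */
const struct dimm_loc sample_dimms[] = {
	{ 0, 0, 0, "DIMM_A1" },
	{ 0, 0, 1, "DIMM_A2" },
	{ 1, 0, 0, "DIMM_B1" },
};

/* Return the label of the DIMM matching the tuple, or NULL if none. */
const char *find_dimm_label(const struct dimm_loc *dimms, size_t n,
			    unsigned int node, unsigned int card,
			    unsigned int module)
{
	size_t i;

	assert(dimms != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (dimms[i].node == node && dimms[i].card == card &&
		    dimms[i].module == module)
			return dimms[i].label;
	}
	return NULL;
}
```

A NULL result corresponds to the failure mode raised in the thread: the error record names a location that the firmware-provided inventory cannot be matched against.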

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-18 Thread Borislav Petkov
On Tue, Jul 18, 2017 at 10:13:42PM +, Luck, Tony wrote: > Historically we've had complaints that sb_edac won't load that have been > tracked to BIOS hiding one of the (many) PCI devices that it needs. But > device hiding is orthogonal to providing GHES error records. A BIOS might > do that, b

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-18 Thread Borislav Petkov
On Tue, Jul 18, 2017 at 06:15:45PM -0300, Mauro Carvalho Chehab wrote: > The way it is, ghes_edac is a poor man's driver. What it hopefully > provide is a detection that an error happened, without really telling > the user what component should be replaced. I beg to differ. From the UEFI spec: "T

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-18 Thread Borislav Petkov
On Tue, Jul 18, 2017 at 07:58:54PM +, Kani, Toshimitsu wrote: > I have HPE Haswell and Skylake test systems with GHES, but they do not > hide IMCs from the OS. So, the sb_edac and skx_edac drivers get > attached on these systems when ghes_edac is disabled. That's how it is supposed to work. T

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-18 Thread Borislav Petkov
On Tue, Jul 18, 2017 at 09:20:44PM +, Kani, Toshimitsu wrote: > I agree that 'osc_sb_apei_support_acked' should be checked when > enabling ghes_edac. I do not know the details of existing issues, but > it sounds unlikely that this will address all of them since bugs can be > everywhere. No, s

RE: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-18 Thread Luck, Tony
> The question is: does the platform do this disabling now? > > Tony, I'm looking at sb_edac and there we don't do something like that > or maybe I'm missing it. Historically we've had complaints that sb_edac won't load that have been tracked to BIOS hiding one of the (many) PCI devices that it ne

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-18 Thread Kani, Toshimitsu
On Tue, 2017-07-18 at 10:08 +0200, Borislav Petkov wrote: > On Tue, Jul 18, 2017 at 08:00:07AM +0200, Borislav Petkov wrote: > > And I think we should try this first: have the firmware disable > > detection methods so that the platform drivers don't load. > > Btw, in looking at this more, what abo

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-18 Thread Mauro Carvalho Chehab
On Tue, 18 Jul 2017 19:58:54 + "Kani, Toshimitsu" wrote: > On Tue, 2017-07-18 at 08:00 +0200, Borislav Petkov wrote: > > On Mon, Jul 17, 2017 at 03:59:12PM -0600, Toshi Kani wrote: > > > The ghes_edac driver was introduced in 2013 [1], but it has not > > > been enabled by any distro yet.

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-18 Thread Kani, Toshimitsu
On Tue, 2017-07-18 at 08:00 +0200, Borislav Petkov wrote: > On Mon, Jul 17, 2017 at 03:59:12PM -0600, Toshi Kani wrote: > > The ghes_edac driver was introduced in 2013 [1], but it has not > > been enabled by any distro yet.  This driver obtains error info > > from firmware interfaces, which are not

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-18 Thread Kani, Toshimitsu
On Tue, 2017-07-18 at 10:24 -0600, Jeffrey Hugo wrote: > On 7/18/2017 9:36 AM, Kani, Toshimitsu wrote: > > On Tue, 2017-07-18 at 08:39 -0600, Jeffrey Hugo wrote: > > > On 7/17/2017 3:59 PM, Toshi Kani wrote: > > > > The ghes_edac driver was introduced in 2013 [1], but it has not > > > > been enable

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-18 Thread Jeffrey Hugo
On 7/18/2017 9:36 AM, Kani, Toshimitsu wrote: > On Tue, 2017-07-18 at 08:39 -0600, Jeffrey Hugo wrote: > > On 7/17/2017 3:59 PM, Toshi Kani wrote: > > > The ghes_edac driver was introduced in 2013 [1], but it has not > > > been enabled by any distro yet. > > Ubuntu is expected to enable this soon. > Interesting.

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-18 Thread Kani, Toshimitsu
On Tue, 2017-07-18 at 08:39 -0600, Jeffrey Hugo wrote: > On 7/17/2017 3:59 PM, Toshi Kani wrote: > > The ghes_edac driver was introduced in 2013 [1], but it has not > > been enabled by any distro yet.    > > Ubuntu is expected to enable this soon. Interesting. I was told from other distro that t

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-18 Thread Jeffrey Hugo
On 7/17/2017 3:59 PM, Toshi Kani wrote: > The ghes_edac driver was introduced in 2013 [1], but it has not > been enabled by any distro yet. Ubuntu is expected to enable this soon. -- Jeffrey Hugo Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technolo

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-18 Thread Borislav Petkov
On Tue, Jul 18, 2017 at 08:00:07AM +0200, Borislav Petkov wrote: > And I think we should try this first: have the firmware disable > detection methods so that the platform drivers don't load. Btw, in looking at this more, what about the firmware-first thing? I.e., the firmware-first detection wit

Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

2017-07-17 Thread Borislav Petkov
On Mon, Jul 17, 2017 at 03:59:12PM -0600, Toshi Kani wrote: > The ghes_edac driver was introduced in 2013 [1], but it has not > been enabled by any distro yet. This driver obtains error info > from firmware interfaces, which are not properly implemented on > many platforms, as the driver always em