Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-21 Thread Borislav Petkov
On Fri, Sep 21, 2012 at 09:54:09AM -0500, Josh Hunt wrote:
> So this is correct now:
> root@x.x.x.x:/sys/devices/system/edac/mc/mc0# grep . csrow*/size_mb
> csrow0/size_mb:1024
> csrow1/size_mb:1024
> csrow2/size_mb:1024
> csrow3/size_mb:1024

Ok, that's the same box where free reports 4G. Btw, it
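As a quick cross-check of the numbers quoted above, here is a minimal Python sketch that sums the per-csrow sizes. The sample text is copied from the post; the helper name total_size_mb is hypothetical (it is not part of EDAC or any tool), and it assumes every line has the form csrowN/size_mb:VALUE.

```python
# Sample output copied from the quoted post (grep . csrow*/size_mb).
SAMPLE = """\
csrow0/size_mb:1024
csrow1/size_mb:1024
csrow2/size_mb:1024
csrow3/size_mb:1024
"""


def total_size_mb(grep_output: str) -> int:
    """Sum the size_mb values from 'csrowN/size_mb:VALUE' lines.

    Hypothetical helper for illustration; not part of EDAC or any tool.
    """
    total = 0
    for line in grep_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last ':' so the path part is left untouched.
        _path, _, value = line.rpartition(":")
        total += int(value)
    return total


if __name__ == "__main__":
    mb = total_size_mb(SAMPLE)
    print(f"{mb} MB total = {mb // 1024} GB")
```

Four csrows of 1024 MB add up to 4096 MB, which is consistent with the 4G that free reports on that box.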

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-21 Thread Borislav Petkov
On Fri, Sep 21, 2012 at 08:02:16AM -0500, Josh Hunt wrote:
> On 09/21/2012 07:36 AM, Borislav Petkov wrote:
> >Ok, I think this is still the old code you're looking at so it would be
> >cool if you could test with my patchset I sent you last week.
> >
> >Because with it, it all looks fine on my K8

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-21 Thread Borislav Petkov
On Fri, Sep 14, 2012 at 09:39:00AM -0500, Josh Hunt wrote:
> On 09/14/2012 07:55 AM, Josh Hunt wrote:
> >
> >Thanks to your help I was able to test your branch, but it still does
> >not resolve the problem. Removal of the "factor=1" workaround fixes the
> >memory size reporting on boot, but the sys

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-14 Thread Borislav Petkov
On Fri, Sep 14, 2012 at 09:39:00AM -0500, Josh Hunt wrote:
> On 09/14/2012 07:55 AM, Josh Hunt wrote:
> >
> >Thanks to your help I was able to test your branch, but it still does
> >not resolve the problem. Removal of the "factor=1" workaround fixes the
> >memory size reporting on boot, but the sys

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-14 Thread Josh Hunt
On 09/14/2012 07:55 AM, Josh Hunt wrote: Thanks to your help I was able to test your branch, but it still does not resolve the problem. Removal of the "factor=1" workaround fixes the memory size reporting on boot, but the sysfs values are still incorrect. Please disregard what I said earlier

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-14 Thread Josh Hunt
On 09/12/2012 12:23 PM, Borislav Petkov wrote: Ok, I have something preliminary which seems to work fine on my K8 here. If you'd like, you can give it a run: git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git error-queue I've changed also debug messages, etc, so pls take a look at those an

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-12 Thread Borislav Petkov
On Wed, Sep 12, 2012 at 07:06:29PM +0200, Borislav Petkov wrote:
> I'll ping you when I have something ready to test for the number of
> pages per csrow reporting.

Ok, I have something preliminary which seems to work fine on my K8 here. If you'd like, you can give it a run:

git://git.kernel.org/p

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-12 Thread Borislav Petkov
On Wed, Sep 12, 2012 at 11:58:07AM -0500, Josh Hunt wrote:
> Actually my apologies. I was looking at the 3.0 code. This issue is
> fixed in the latest kernel.
>
> Sorry for the noise on that.

That's fine. I'm happy you guys are testing this and reporting issues so thanks anyway. I'll ping you whe

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-12 Thread Josh Hunt
On 09/12/2012 11:48 AM, Borislav Petkov wrote:
Can I have /proc/cpuinfo and full dmesg with EDAC_DEBUG enabled from that machine please?

Actually my apologies. I was looking at the 3.0 code. This issue is fixed in the latest kernel.

Sorry for the noise on that.

Josh

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-12 Thread Borislav Petkov
On Wed, Sep 12, 2012 at 11:23:36AM -0500, Josh Hunt wrote:
> Looks like we're seeing an issue on another machine. Still 0Fh family,
> but the model is reported as 2, with cs_mode 7.

Can I have /proc/cpuinfo and full dmesg with EDAC_DEBUG enabled from that machine please?

Thanks.

--
Regards/Grus

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-12 Thread Josh Hunt
On 09/12/2012 10:49 AM, Borislav Petkov wrote: Yes, that's because the whole init_csrows thing has been b0rked since forever. In your case, amd64_csrow_nr_pages() should pay attention to the dct (second argument) which on K8 is always 0 (we have only one DCT aka Dram ConTroller on K8) and the fu

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-12 Thread Borislav Petkov
On Wed, Sep 12, 2012 at 10:37:18AM -0500, Josh Hunt wrote:
> Well from what I see 603ad... would only fix the case of printing the
> values correctly on boot, by removing the factor=1 shift. However,
> that is merely cosmetic as it does not affect the actual calculation
> of nr_pages. I guess maybe
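The distinction drawn here between the value printed at boot and the computed nr_pages can be illustrated with a small sketch. This is not the driver code: it assumes 4 KiB pages (PAGE_SHIFT = 12) and that a chip-select size in MB is turned into a page count by shifting left by (20 - PAGE_SHIFT). The function names and the extra_shift parameter are hypothetical, chosen only to show how one stray left shift doubles the reported size.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, as on x86-64


def mb_to_pages(size_mb: int, extra_shift: int = 0) -> int:
    """Convert a chip-select size in MB to a page count.

    extra_shift models a hypothetical stray left shift applied to the
    size before conversion; each extra bit doubles the result.
    """
    return (size_mb << extra_shift) << (20 - PAGE_SHIFT)


def pages_to_mb(nr_pages: int) -> int:
    """Convert a page count back to MB, as a size_mb style report would."""
    return nr_pages >> (20 - PAGE_SHIFT)


if __name__ == "__main__":
    correct = mb_to_pages(1024)
    doubled = mb_to_pages(1024, extra_shift=1)
    print(pages_to_mb(correct), pages_to_mb(doubled))
```

Under these assumptions, a 1024 MB chip select with one extra shift would be reported as 2048 MB, which is the kind of doubling the thread title describes. Fixing only the boot-time print would leave a page count computed this way unchanged.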

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-12 Thread Josh Hunt
On 09/12/2012 10:30 AM, Borislav Petkov wrote:
Yes, you're basically right. Here's what I see from here:

In 2009 I added

commit 603adaf6b3e37450235f0ddb5986b961b3146a79
Author: Borislav Petkov
Date:   Mon Dec 21 14:52:53 2009 +0100

    amd64_edac: fix K8 chip select reporting

    Fix the c

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-12 Thread Borislav Petkov
On Wed, Sep 12, 2012 at 07:52:15AM -0500, Josh Hunt wrote:
> I wanted to add that we started seeing this back in 3.0. I didn't go
> back any farther, but know it was not occurring in 2.6.38. The issue
> in 3.0 appeared to be that we shift left k8_dbam_to_chip_select()
> and there was also another s

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-12 Thread Josh Hunt
On 09/12/2012 07:38 AM, Josh Hunt wrote:
> On 09/12/2012 03:51 AM, Borislav Petkov wrote:
>> On Tue, Sep 11, 2012 at 06:02:01PM -0500, Josh Hunt wrote:
>>> On 09/11/2012 05:52 PM, Josh Hunt wrote:
With recent kernels we noticed that edac was reporting double the memory size on syste

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-12 Thread Josh Hunt
On 09/12/2012 03:51 AM, Borislav Petkov wrote:
> On Tue, Sep 11, 2012 at 06:02:01PM -0500, Josh Hunt wrote:
>> On 09/11/2012 05:52 PM, Josh Hunt wrote:
>>> With recent kernels we noticed that edac was reporting double the memory
>>> size on
>>> systems running with AMD family 0Fh processors. I'm n

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-12 Thread Borislav Petkov
On Tue, Sep 11, 2012 at 06:02:01PM -0500, Josh Hunt wrote:
> On 09/11/2012 05:52 PM, Josh Hunt wrote:
> >With recent kernels we noticed that edac was reporting double the memory
> >size on
> >systems running with AMD family 0Fh processors. I'm not very familiar with
> >the
> >code, but this resol

Re: [PATCH] amd64_edac: Memory size reported double on processor family 0Fh

2012-09-11 Thread Josh Hunt
[fixing lkml address] On 09/11/2012 05:52 PM, Josh Hunt wrote: With recent kernels we noticed that edac was reporting double the memory size on systems running with AMD family 0Fh processors. I'm not very familiar with the code, but this resolves it from what I can see on my systems. At least in