Re: [PATCH v2] x86, msr: Document AMD "tweak MSRs", use MSR_FnnH_NAME scheme for them

2017-04-25 Thread Denys Vlasenko

On 04/25/2017 06:23 PM, Borislav Petkov wrote:
> On Tue, Apr 25, 2017 at 06:15:23PM +0200, Denys Vlasenko wrote:
> > On 04/25/2017 06:06 PM, Borislav Petkov wrote:
> > > Pls no. Not every MSR for every family. Only the 4 which are actually
> > > being used. We can't hold in here the full 32-bit MSR space.
> >
> > The replacement of four define names is not the purpose
> > of the proposed patch.
> >
> > The patch was prompted by the realization that these particular MSRs
> > are so badly and inconsistently documented that it takes many hours
> > of work and requires reading of literally a dozen PDFs to figure out
> > what are their names, which CPUs have them, and what bits are known.
>
> They're all documented in the respective BKDGs or revision guides.


Yes. For some definition of "documented".

Let's say you are looking at all available documentation for Fam10h CPUs -
BKDG, Revision Guide, five volumes of APM, software optimization guide.
Eight documents.

If you read all of them, you can find exactly one mention that
MSR 0xC0011029 exists. It is mentioned by number.

As a reader of this documentation, can you find out what it is?
Does it have a name, at least?

You are right that the kernel is not exactly the best place to store more info
about such things, but AMD probably won't accept my edits to their
documentation.


Re: [PATCH v2] x86, msr: Document AMD "tweak MSRs", use MSR_FnnH_NAME scheme for them

2017-04-25 Thread Borislav Petkov
On Tue, Apr 25, 2017 at 06:15:23PM +0200, Denys Vlasenko wrote:
> On 04/25/2017 06:06 PM, Borislav Petkov wrote:
> > Pls no. Not every MSR for every family. Only the 4 which are actually
> > being used. We can't hold in here the full 32-bit MSR space.
> 
> The replacement of four define names is not the purpose
> of the proposed patch.
> 
> The patch was prompted by the realization that these particular MSRs
> are so badly and inconsistently documented that it takes many hours
> of work and requires reading of literally a dozen PDFs to figure out
> what are their names, which CPUs have them, and what bits are known.

They're all documented in the respective BKDGs or revision guides.

> Anyone who looks at only one document won't see the full picture.

And what is the big picture?

To me it is just a bunch of MSRs. What's so special about them?

> Patch does not document bits, but at least documents all MSR names
> and explains why documentation is so sparse.

No, we don't document MSRs in the kernel - we collect all the MSRs the
kernel uses in msr-index.h.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [PATCH v2] x86, msr: Document AMD "tweak MSRs", use MSR_FnnH_NAME scheme for them

2017-04-25 Thread Denys Vlasenko

On 04/25/2017 06:06 PM, Borislav Petkov wrote:
> Pls no. Not every MSR for every family. Only the 4 which are actually
> being used. We can't hold in here the full 32-bit MSR space.


The replacement of four define names is not the purpose
of the proposed patch.

The patch was prompted by the realization that these particular MSRs
are so badly and inconsistently documented that it takes many hours
of work and requires reading of literally a dozen PDFs to figure out
what are their names, which CPUs have them, and what bits are known.

Anyone who looks at only one document won't see the full picture.

Patch does not document bits, but at least documents all MSR names
and explains why documentation is so sparse.

If you think it's not useful, so be it.


Re: [PATCH v2] x86, msr: Document AMD "tweak MSRs", use MSR_FnnH_NAME scheme for them

2017-04-25 Thread Borislav Petkov
On Tue, Apr 25, 2017 at 05:27:04PM +0200, Denys Vlasenko wrote:
> MSRs in 0xC001102x range (and a few close to this range)
> allow modifying some internal actions of the pipeline.
> 
> (There is one non-debug MSR in this range, introduced in Fam15h:
> MSR 0xC0011027 Address Mask For DR0 Breakpoints, aka DR0_ADDR_MASK).
> 
> Sometimes these MSRs are used to work around errata.
> 
> Let's have a comment about that.
> 
> Let's use the following naming scheme for all of them: MSR_FnnH_REGNAME
> This introduces some redundant names, but documents CPU family where
> we are reasonably sure a particular register exists, and avoids the need
> to explain why the same register is either "Combined Unit Cfg"
> or "Bus Unit Cfg" - obviously, because the name depends on the CPU family.
> 
> Renaming:
> MSR_AMD64_DC_CFG  -> MSR_F10H_DC_CFG
> MSR_AMD64_BU_CFG2 -> MSR_F10H_BU_CFG2
> MSR_AMD64_LS_CFG  -> MSR_F16H_LS_CFG
> MSR_AMD64_DE_CFG  -> MSR_F12H_DE_CFG (and moving to msr-index.h)
> 
> Here is a little compilation from about a dozen documents.
> 
> C001_1000:
> 15h Errata 608 "P-state Limit Changes May Not Generate Interrupts"
> - worked around by setting bit 16.
> 15h Errata 671 "Debug Breakpoint on Misaligned Store May Cause System Hang"
> - worked around by setting bit 17 to 0.
> - AMD is really reluctant to recommend this workaround, must be painful
> 15h Errata 727 "Processor Core May Hang During CC6 Resume"
> - worked around by setting bit 15.
> - HW fixed in models 10h?
> 
> C001_1020:
> K8 Errata 106
> "Potential Deadlock with Tightly Coupled Semaphores in an MP System"
> - worked around by setting bit 25.
> 10h,12h Errata 670
> "Segment Load May Cause System Hang or Fault After State Change"
> - worked around by setting bit 8.
> - this bit has something to do with handling of LOCK prefix.
> 14h Errata 530
> "Potential Violation of Read Ordering Rules Between Semaphore Operation
> and Subsequent Load Operations"
> - worked around by setting bit 36.
> 14h Errata 551
> "Processor May Not Forward Data From Store to a Page Crossing
> Read-Modify-Write Operation"
> - worked around by setting bit 25.
> 14h Errata 560
> "Processor May Incorrectly Forward Data with Non-cacheable Floating-Point
> 128-bit SSE Operation"
> - worked around by setting bit 18.
> 16h Errata 793
> "Specific Combination of Writes to Write Combined Memory Types and Locked
> Instructions May Cause Core Hang"
> - worked around by setting bit 15.
> 
> C001_1021:
> K8 Errata 94
> "Sequential Prefetch Feature May Cause Incorrect Processor Operation"
> - worked around by setting bit 11.
> 14h Errata 688
> "Processor May Cause Unpredictable Program Behavior Under Highly Specific
> Branch Conditions"
> - worked around by setting bits 14 and 3.
> 16h Errata 776
> "Incorrect Processor Branch Prediction for Two Consecutive Linear Pages"
> - worked around by setting bit 26.
> - HW fixed in models 30h?
> 
> C001_1022:
> K8 Errata 97 "128-Bit Streaming Stores May Cause Coherency Failure"
> - worked around by setting bit 3.
> K8 Errata 81
> "Cache Coherency Problem with Hardware Prefetching and Streaming Stores"
> - worked around by setting bit 10.
> 10h Errata 261
> "Processor May Stall Entering Stop-Grant Due to Pending Data Cache Scrub"
> - worked around by setting bit 24.
> 10h Errata 326 "Misaligned Load Operation May Cause Processor Core Hang"
> - worked around by setting bits 43:42 to 00.
> 10h Errata 383
> "CPU Core May Machine Check When System Software Changes Page Tables
> Dynamically"
> - worked around by setting bit 47.
> 15h Errata 674
> "Processor May Cache Prefetched Data from Remapped Memory Region"
> - worked around by setting bit 13.
> 
> C001_1023:
> K8 Errata 69
> "Multiprocessor Coherency Problem with Hardware Prefetch Mechanism"
> - worked around by setting bit 45.
> K8 Errata 113 "Enhanced Write-Combining Feature Causes System Hang"
> - worked around by setting bit 48.
> 10h Errata 254 "Internal Resource Livelock Involving Cached TLB Reload"
> - worked around by setting bit 21.
> 10h Errata 298
> "L2 Eviction May Occur During Processor Operation To Set Accessed or Dirty 
> Bit"
> - worked around by setting bit 1.
> 10h Errata 309
> "Processor Core May Execute Incorrect Instructions on Concurrent L2 and
> Northbridge Response"
> - worked around by setting bit 23.
> 
> C001_1029:
> 10h,12h Errata 721 "Processor May Incorrectly Update Stack Pointer"
> - worked around by setting bit 0.
> 12h Errata 665 "Integer Divide Instruction May Cause Unpredictable Behavior"
> - worked around by setting bit 31.
> Bit 23 serializes CLFLUSH instruction.
> 
> C001_102A:
> 15h Errata 503 "APIC Task-Priority Register May Be Incorrect"
> - worked around by setting bit 11.
> 
> K8_BKDG documents none of these registers, but Revision Guide mentions
> them a lot.
> 
> 10h_BKDG documents them as:
> MSRC001_1021 Instruction Cache Configuration Register (IC_CFG)
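
As an illustration of the pattern that recurs in the list above ("worked
around by setting bit N"), here is a minimal C sketch. It is not taken from
the patch. The helper names msr_val_set_bit(), msr_val_clear_bits(),
erratum_721_value() and erratum_326_value() are hypothetical. The two macro
names are the ones proposed in the renaming list; pairing them with
C001_1029 and C001_1022 is an assumption based on the existing
MSR_AMD64_DE_CFG and MSR_AMD64_DC_CFG defines. The code only computes the
new 64-bit register value; the privileged read and write of the MSR itself
is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Proposed names from the renaming list; the addresses are an assumption
 * taken from the existing MSR_AMD64_DE_CFG / MSR_AMD64_DC_CFG defines. */
#define MSR_F12H_DE_CFG 0xc0011029u
#define MSR_F10H_DC_CFG 0xc0011022u

/* Hypothetical helper: return val with bit `bit` (0..63) set. */
static inline uint64_t msr_val_set_bit(uint64_t val, unsigned int bit)
{
	return val | (UINT64_C(1) << bit);
}

/* Hypothetical helper: return val with bits hi..lo (inclusive) cleared. */
static inline uint64_t msr_val_clear_bits(uint64_t val, unsigned int hi,
					  unsigned int lo)
{
	unsigned int width = hi - lo + 1;
	uint64_t mask = (width >= 64) ? ~UINT64_C(0)
				      : ((UINT64_C(1) << width) - 1);

	return val & ~(mask << lo);
}

/* 10h,12h Errata 721: set bit 0 of C001_1029. */
static inline uint64_t erratum_721_value(uint64_t de_cfg)
{
	return msr_val_set_bit(de_cfg, 0);
}

/* 10h Errata 326: set bits 43:42 of C001_1022 to 00. */
static inline uint64_t erratum_326_value(uint64_t dc_cfg)
{
	return msr_val_clear_bits(dc_cfg, 43, 42);
}

/* Quick self-check of the helpers above. */
static inline void msr_sketch_selftest(void)
{
	assert(erratum_721_value(0) == UINT64_C(1));
	assert(erratum_326_value(UINT64_C(3) << 42) == 0);
}
```

In kernel code the computed value would be written back through the usual
MSR write path on each affected core. Whether a given workaround applies
depends on family, model and stepping, which this sketch does not check.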
