[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-20 Thread annita.zhang via Phabricator via cfe-commits
annita.zhang added a comment. In D70157#1793280 , @reames wrote: > I've gone ahead and landed the patch so that we can iterate in tree. See > commit 14fc20ca62821b5f85582bf76a467d412248c248 >

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-20 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added a comment. In D70157#1793280 , @reames wrote: > I've gone ahead and landed the patch so that we can iterate in tree. See > commit 14fc20ca62821b5f85582bf76a467d412248c248 >

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-20 Thread Philip Reames via Phabricator via cfe-commits
reames added a comment. I've gone ahead and landed the patch so that we can iterate in tree. See commit 14fc20ca62821b5f85582bf76a467d412248c248 . I've also landed a couple of follow up patches to address issues which would

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-20 Thread Philip Reames via Phabricator via cfe-commits
This revision was not accepted when it landed; it landed in state "Needs Review". This revision was automatically updated to reflect the committed changes. Closed by commit rG14fc20ca6282: Align branches within 32-Byte boundary (NOP padding) (authored by reames). Repository: rG LLVM Github

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-20 Thread Kan Shengchen via Phabricator via cfe-commits
skan updated this revision to Diff 234846. skan added a comment. 1. Add more tests for `!VK_NONE` cases. 2. Reduce pervasive `auto` CHANGES SINCE LAST ACTION https://reviews.llvm.org/D70157/new/ https://reviews.llvm.org/D70157 Files: llvm/include/llvm/MC/MCAsmBackend.h

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-20 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added a comment. In D70157#1792262 , @skan wrote: > Do you think this patch is ready to land? @MaskRay It is 00:35 here and I should confess that I haven't read through this yet. There are some minor things like (1) pervasive `auto` (2)

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-20 Thread Kan Shengchen via Phabricator via cfe-commits
skan added a comment. Do you think this patch is ready to land? @MaskRay CHANGES SINCE LAST ACTION https://reviews.llvm.org/D70157/new/ https://reviews.llvm.org/D70157 ___ cfe-commits mailing list cfe-commits@lists.llvm.org

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-20 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added inline comments. Comment at: llvm/lib/MC/MCAssembler.cpp:1003 + // exists) also marks the end of the branch. + for (auto i = 0U, N = BF.isFused() ? 2U : 1U; + i != N && !isa(F); ++i, F = F->getNextNode()) { skan wrote: > MaskRay wrote: > >

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-19 Thread Kan Shengchen via Phabricator via cfe-commits
skan updated this revision to Diff 234830. skan added a comment. 1. Simplify the test cases. 2. Add some comments CHANGES SINCE LAST ACTION https://reviews.llvm.org/D70157/new/ https://reviews.llvm.org/D70157 Files: llvm/include/llvm/MC/MCAsmBackend.h llvm/include/llvm/MC/MCAssembler.h

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-19 Thread Kan Shengchen via Phabricator via cfe-commits
skan marked an inline comment as done. skan added inline comments. Comment at: llvm/lib/MC/MCAssembler.cpp:1003 + // exists) also marks the end of the branch. + for (auto i = 0U, N = BF.isFused() ? 2U : 1U; + i != N && !isa(F); ++i, F = F->getNextNode()) {

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-19 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added inline comments. Comment at: llvm/lib/MC/MCAssembler.cpp:993 + MCBoundaryAlignFragment ) { + // The MCBoundaryAlignFragment that doesn't emit NOP should not relax. + if (!BF.canEmitNops()) be relaxed

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-19 Thread Kan Shengchen via Phabricator via cfe-commits
skan added a comment. In D70157#1791347 , @reames wrote: > The general question is why a *range* of fragments can't be defined. > Computing the instruction size for the entire range then just requires > walking from first to last fragment in the range

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-19 Thread annita.zhang via Phabricator via cfe-commits
annita.zhang added a comment. > To be concrete, I propose: > ".autopad", ".noautopad": allow/disallow the assembler to emit padding via > inserting a nop or prefix before any instruction, as needed. > ".align_branch_boundary [N,] [instruction,]": Enable branch-boundary padding > (per previous

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-19 Thread Philip Reames via Phabricator via cfe-commits
reames accepted this revision. reames added a comment. LGTM. The patch isn't perfect, but this has reached the point where landing and iterating in tree is the fastest way to convergence. To be clear, this LGTM *only* applies to the current patch contents, and the *internal only* flag names

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-19 Thread Philip Reames via Phabricator via cfe-commits
reames added a comment. In D70157#1789159 , @skan wrote: > In D70157#1788445 , @reames wrote: > > > Specifically on the revised patch, I remain confused by the need for > > multiple subtypes. The need for

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-19 Thread Kan Shengchen via Phabricator via cfe-commits
skan updated this revision to Diff 234697. skan added a comment. Add more comment CHANGES SINCE LAST ACTION https://reviews.llvm.org/D70157/new/ https://reviews.llvm.org/D70157 Files: llvm/include/llvm/MC/MCAsmBackend.h llvm/include/llvm/MC/MCAssembler.h

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-18 Thread Kan Shengchen via Phabricator via cfe-commits
skan updated this revision to Diff 234650. skan added a comment. move the code that checks if we can reuse the current `MCBoundaryAlignFragment` into the function `X86AsmBackend::getOrCreateBoundaryAlignFragment` CHANGES SINCE LAST ACTION https://reviews.llvm.org/D70157/new/

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-18 Thread Kan Shengchen via Phabricator via cfe-commits
skan updated this revision to Diff 234520. skan added a comment. Fix a typo in `MCFragment::dump()` CHANGES SINCE LAST ACTION https://reviews.llvm.org/D70157/new/ https://reviews.llvm.org/D70157 Files: llvm/include/llvm/MC/MCAsmBackend.h llvm/include/llvm/MC/MCAssembler.h

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-18 Thread Kan Shengchen via Phabricator via cfe-commits
skan updated this revision to Diff 234515. skan added a comment. 1. rename `MCBoundaryAlignFragment::hasEmitNop()` to `MCBoundaryAlignFragment::canEmitNop()` 2. reduce the number of `MCBoundaryAlignFragment` emitted as possible CHANGES SINCE LAST ACTION https://reviews.llvm.org/D70157/new/

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-18 Thread Kan Shengchen via Phabricator via cfe-commits
skan updated this revision to Diff 234488. skan edited the summary of this revision. skan added a comment. **Simplify** Drop the subtype of `MCBoundaryAlignFragment` and add data member `EmitNops` to indicate whether NOPs should be emitted. CHANGES SINCE LAST ACTION

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-17 Thread Kan Shengchen via Phabricator via cfe-commits
skan added a comment. In D70157#1788445 , @reames wrote: > Specifically on the revised patch, I remain confused by the need for multiple > subtypes. The need for fragments *between* the potentially fused > instructions doesn't make sense to me. What I

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-17 Thread Philip Reames via Phabricator via cfe-commits
reames added a comment. I've updated the draft assembler support in https://reviews.llvm.org/D71315 to match the proposal here. Again, this is very much WIP and mostly just to show proposed syntax/usage. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D70157/new/

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-17 Thread James Y Knight via Phabricator via cfe-commits
jyknight added a comment. In D70157#1788418 , @reames wrote: > In D70157#1788025 , @jyknight wrote: > > > > .push_align_branch_boundary [N,] [instruction,]* > > > > I'd like to raise again the possibility of using

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-17 Thread Philip Reames via Phabricator via cfe-commits
reames added a comment. Specifically on the revised patch, I remain confused by the need for multiple subtypes. The need for fragments *between* the potentially fused instructions doesn't make sense to me. What I was expecting to see was the following: BoundaryAlign w/target=the branch

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-17 Thread Philip Reames via Phabricator via cfe-commits
reames added a comment. In D70157#1788025 , @jyknight wrote: > > .push_align_branch_boundary [N,] [instruction,]* > > I'd like to raise again the possibility of using a more general region > directive to denote "It is allowable to add prefixes/nops

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-17 Thread Philip Reames via Phabricator via cfe-commits
reames added a comment. In D70157#1787403 , @annita.zhang wrote: > In D70157#1787160 , @MaskRay wrote: > > > > > > > > > There is a precedant: .pushsection/.popsection (MCStreamer::SectionStack). > > With

[PATCH] D70157: Align branches within 32-Byte boundary(NOP padding)

2019-12-17 Thread Kan Shengchen via Phabricator via cfe-commits
skan updated this revision to Diff 234312. skan retitled this revision from "Align branches within 32-Byte boundary" to "Align branches within 32-Byte boundary(NOP padding)". skan edited the summary of this revision. skan added a comment. **Simplify** 1. Drop prefix padding support 2. Drop

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-17 Thread James Y Knight via Phabricator via cfe-commits
jyknight added a comment. > .push_align_branch_boundary [N,] [instruction,]* I'd like to raise again the possibility of using a more general region directive to denote "It is allowable to add prefixes/nops before instructions in this region if the assembler wants to", as I'd started discussing

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-17 Thread annita.zhang via Phabricator via cfe-commits
annita.zhang added a comment. In D70157#1787160 , @MaskRay wrote: > > There is a precedant: .pushsection/.popsection (MCStreamer::SectionStack). > With .push_align_branch/.pop_align_branch, we probably don't need the > 'switch-to-default' action. >

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-17 Thread Kan Shengchen via Phabricator via cfe-commits
skan added a comment. In D70157#1786901 , @reames wrote: > Noting another issue we found in local testing (with an older version of this > patch). This interacts badly with the implicit exception mechanism in LLVM. > For that mechanism, we end up

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-16 Thread annita.zhang via Phabricator via cfe-commits
annita.zhang added a comment. In D70157#1787139 , @reames wrote: > Here are the minutes from our phone call a few minutes ago. Thanks for coordinating the meeting and having a clear summary. It helps a lot to accelerate the patch review. I really

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-16 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added a comment. In D70157#1787139 , @reames wrote: > Here are the minutes from our phone call a few minutes ago. > > Attendees: Andy Kaylor, Craig Topper, Annita Zhang, Tom Stellard, Chandler > Carruth, Fedor Sergeev, Philip Reames, Yuanake Luo

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-16 Thread Philip Reames via Phabricator via cfe-commits
reames added a comment. Here are the minutes from our phone call a few minutes ago. Attendees: Andy Kaylor, Craig Topper, Annita Zhang, Tom Stellard, Chandler Carruth, Fedor Sergeev, Philip Reames, Yuanake Luo Status Summary == Performance data has been posted to llvm-dev. We had

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-16 Thread Kan Shengchen via Phabricator via cfe-commits
skan marked an inline comment as done. skan added inline comments. Comment at: llvm/lib/MC/MCFragment.cpp:426 + case MCFragment::FT_MachineDependent: { +const MCMachineDependentFragment *MF = +cast(this); MaskRay wrote: > `const auto *`. The type is

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-16 Thread Chandler Carruth via Phabricator via cfe-commits
chandlerc added a comment. Just wanted to say thanks for the performance data! I know it was hard to get, but it is really, really useful to help folks evaluate these kinds of changes with actual data around the options available. CHANGES SINCE LAST ACTION

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-16 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added a comment. Haven't looked into too many details yet but made some suggestions anyway... Comment at: llvm/include/llvm/MC/MCFragment.h:634 + + uint64_t getBoundarySize() const { +return AlignBoundarySize; Store AlignBoundarySize as a shift

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-16 Thread Philip Reames via Phabricator via cfe-commits
reames added a comment. Noting another issue we found in local testing (with an older version of this patch). This interacts badly with the implicit exception mechanism in LLVM. For that mechanism, we end up generating assembly which looks more or less like this: Ltmp: cmp %rsi, (%rdi)

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-16 Thread Philip Reames via Phabricator via cfe-commits
reames added a comment. In offline discussion, there was an agreement that we needed further coordination to make sure this patch moves forward quickly. For that reason, there will be a call happening today at 4pm Pacific. Interested parties are welcome to attend. Zoom Meeting ID:

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-16 Thread annita.zhang via Phabricator via cfe-commits
annita.zhang added a comment. > More performance data was posted on > http://lists.llvm.org/pipermail/llvm-dev/2019-December/137609.html and > http://lists.llvm.org/pipermail/llvm-dev/2019-December/137610.html. Let's > move on based on the data. Based on the SPEC data, we observed 2.6% and

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-16 Thread annita.zhang via Phabricator via cfe-commits
annita.zhang added a comment. In D70157#1768338 , @annita.zhang wrote: > In D70157#1768319 , @chandlerc wrote: > > > I'm seeing lots of updates to fix bugs, but no movement for many days on > > both my meta

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-11 Thread Kan Shengchen via Phabricator via cfe-commits
skan added a comment. In D70157#1777272 , @fedor.sergeev wrote: > In D70157#1776424 , @skan wrote: > > > > What if I insert explicit align(8) right *after* the sequence? > > > > If your insert explicit `.align

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-10 Thread Fedor Sergeev via Phabricator via cfe-commits
fedor.sergeev added a comment. In D70157#1776424 , @skan wrote: > > What if I insert explicit align(8) right *after* the sequence? > > If your insert explicit `.align 8` after the sequence, and the sequence > doesn't has any branch to be aligned, the

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-09 Thread Kan Shengchen via Phabricator via cfe-commits
skan added a comment. In D70157#1775481 , @fedor.sergeev wrote: > In D70157#1775016 , @annita.zhang > wrote: > > > > The point is that we have explicit requirement at the start and we have a > > > lowering into

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-09 Thread Philip Reames via Phabricator via cfe-commits
reames added a comment. I just posted an alternate review (https://reviews.llvm.org/D71238) which attempts to carve out a minimum reviewable piece of complexity. The hope is that we can review that one quickly (as there are fewer interacting concerns), and then rebase this one (possibly

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-09 Thread Fedor Sergeev via Phabricator via cfe-commits
fedor.sergeev added a comment. In D70157#1775016 , @annita.zhang wrote: > > The point is that we have explicit requirement at the start and we have a > > lowering into 16-byte sequence that we need to be preserved exactly as it > > is. > > Essentially

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-09 Thread annita.zhang via Phabricator via cfe-commits
annita.zhang added a comment. > The point is that we have explicit requirement at the start and we have a > lowering into 16-byte sequence that we need to be preserved exactly as it is. > Essentially what we need is a "protection" for this sequence from any > changes by machinery that

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-09 Thread Fedor Sergeev via Phabricator via cfe-commits
fedor.sergeev added a comment. In D70157#1774561 , @skan wrote: > I could not reproduce the phenomenon that N-byte nop becomes (N+M) bytes with > your example. So according to my understanding, I slightly modified your > case. (If my understand is

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-08 Thread Kan Shengchen via Phabricator via cfe-commits
skan added a comment. In D70157#1771771 , @reames wrote: > We uncovered another functional issue with this patch, or at least, the > interaction of this patch and other parts of LLVM. In our support for > STATEPOINT, PATCHPOINT, and STACKMAP we use

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-07 Thread annita.zhang via Phabricator via cfe-commits
annita.zhang added a comment. In D70157#1760278 , @jyknight wrote: > > Branch alignment > > --- > > The primary goal of this patch, restricting the placement of branch > instructions, is a performance optimization. Similar to loop alignment, the

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-06 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D70157#1773180 , @reames wrote: > Recording something so I don't forget it when we get back to the prefix > padding version. The write up on the bundle align mode stuff mentions a > concerning memory overhead for the

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-06 Thread Philip Reames via Phabricator via cfe-commits
reames added a comment. Recording something so I don't forget it when we get back to the prefix padding version. The write up on the bundle align mode stuff mentions a concerning memory overhead for the feature. Since the basic implementation techniques are similar, we need to make sure we

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-06 Thread David Zarzycki via Phabricator via cfe-commits
davezarzycki added a comment. In D70157#1772227 , @annita.zhang wrote: > > > > > >>> Third, I have not see a justification for why complexity for instruction > >>> prefix padding is necessary. All the effected CPUs support multi-byte > >>> nops, so

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-06 Thread annita.zhang via Phabricator via cfe-commits
annita.zhang added a comment. > > >>> Third, I have not see a justification for why complexity for instruction >>> prefix padding is necessary. All the effected CPUs support multi-byte >>> nops, so we're talking about a *single micro op* difference between the nop >>> form and prefix form.

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-05 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added a comment. In D70157#1772019 , @reames wrote: > In D70157#1771841 , @jyknight wrote: > > > In D70157#1771832 , @reames wrote: > > > > > I've been digging

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-05 Thread Philip Reames via Phabricator via cfe-commits
reames added a comment. In D70157#1771841 , @jyknight wrote: > In D70157#1771832 , @reames wrote: > > > I've been digging through the code for this for the last day or so. This > > is a new area for me, so it's

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-05 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added a subscriber: opaparo. MaskRay added a comment. In D70157#1771841 , @jyknight wrote: > In D70157#1771832 , @reames wrote: > > > I've been digging through the code for this for the last day or so.

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-05 Thread James Y Knight via Phabricator via cfe-commits
jyknight added a comment. In D70157#1771832 , @reames wrote: > I've been digging through the code for this for the last day or so. This is > a new area for me, so it's possible I'm off base, but I have some concerns > about the current design. > >

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-05 Thread Philip Reames via Phabricator via cfe-commits
reames added a comment. I've been digging through the code for this for the last day or so. This is a new area for me, so it's possible I'm off base, but I have some concerns about the current design. First, there appears to already be support for instruction bundling and alignment in the

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-05 Thread Philip Reames via Phabricator via cfe-commits
reames added a comment. We uncovered another functional issue with this patch, or at least, the interaction of this patch and other parts of LLVM. In our support for STATEPOINT, PATCHPOINT, and STACKMAP we use N-byte nop sequences for regions of code which might be patched out. It's

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-05 Thread Fedor Sergeev via Phabricator via cfe-commits
fedor.sergeev added a comment. In D70157#1768310 , @skan wrote: > fix the bug > > https://bugs.llvm.org/show_bug.cgi?id=44215 FYI: I did close the bug as fixed after verifying that the fix works for me. CHANGES SINCE LAST ACTION

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-05 Thread Kan Shengchen via Phabricator via cfe-commits
skan marked 3 inline comments as done. skan added inline comments. Comment at: llvm/test/MC/X86/x86-64-align-branch-1b.s:10 +# CHECK: foo: +# CHECK-NEXT:0: 64 89 04 25 01 00 00 00 movl%eax, %fs:1 +# CHECK-NEXT:8: 2e 55

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-05 Thread Kan Shengchen via Phabricator via cfe-commits
skan added a comment. In D70157#1769932 , @MaskRay wrote: > I find another deficiency (infinite loop) with the current approach. > > Say, there is a `je 0` (0x0F 0x84 0x00 0x00 0x00 0x00) at byte 0x90. > (0x90+6)%32 == 0, so it ends on a 32-byte

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-04 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added inline comments. Comment at: llvm/lib/MC/MCAssembler.cpp:1014 + unsigned EndAddr = StartAddr + Size; + return StartAddr / BoundarySize != ((EndAddr - 1) / BoundarySize); +} Division is slow. Pass in the power of 2 and use right shift instead.

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-04 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added a comment. I find another deficiency (infinite loop) with the current approach. Say, there is a `je 0` (0x0F 0x84 0x00 0x00 0x00 0x00) at byte 0x90. (0x90+6)%32 == 0, so it ends on a 32-byte boundary. MF.getMaxPrefixSize() is 4, so the size of MCMachineDependentFragment may vary

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-04 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added a comment. I am still trying to understand the patch. Just made some comments about the tests. Comment at: llvm/include/llvm/MC/MCFragment.h:663 + enum SubType : uint8_t { +// BranchPadding - The variable size fragment to insert NOP before branch. +

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-04 Thread Kan Shengchen via Phabricator via cfe-commits
skan added a comment. In D70157#1768389 , @craig.topper wrote: > Can you please put the macro fusion changes in a separate phabricator review. > I’ll review it in the morning US time and if it all looks good we can get > that part committed while the

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-04 Thread Craig Topper via Phabricator via cfe-commits
craig.topper added a comment. Can you please put the macro fusion changes in a separate phabricator review. I’ll review it in the morning US time and if it all looks good we can get that part committed while the other comments are being addressed. CHANGES SINCE LAST ACTION

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-04 Thread Kan Shengchen via Phabricator via cfe-commits
skan marked an inline comment as done. skan added inline comments. Comment at: llvm/lib/Target/X86/MCTargetDesc/X86BaseInfo.h:158 +case X86::AND16i16: +case X86::AND16mr: +case X86::AND16ri: craig.topper wrote: > None of the AND/ADD/SUB instructions

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-03 Thread annita.zhang via Phabricator via cfe-commits
annita.zhang added a comment. In D70157#1768319 , @chandlerc wrote: > I'm seeing lots of updates to fix bugs, but no movement for many days on both > my meta comments and (in some ways more importantly) James's meta comments. > (And thanks Philip for

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-03 Thread Craig Topper via Phabricator via cfe-commits
craig.topper added inline comments. Comment at: llvm/lib/Target/X86/MCTargetDesc/X86BaseInfo.h:158 +case X86::AND16i16: +case X86::AND16mr: +case X86::AND16ri: None of the AND/ADD/SUB instructions ending in mr are eligible for macrofusion as far as

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-03 Thread Chandler Carruth via Phabricator via cfe-commits
chandlerc added a comment. I'm seeing lots of updates to fix bugs, but no movement for many days on both my meta comments and (in some ways more importantly) James's meta comments. (And thanks Philip for chiming in too!) Meanwhile, we really, really need to get this functionality in place. The

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-03 Thread Fedor Sergeev via Phabricator via cfe-commits
fedor.sergeev requested changes to this revision. fedor.sergeev added a comment. This revision now requires changes to proceed. In D70157#1767212 , @fedor.sergeev wrote: > Working on getting upstream reproducer Hangs on a moderately small piece

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-03 Thread Fedor Sergeev via Phabricator via cfe-commits
fedor.sergeev added a comment. Just FYI for now as I'm trying to dig futher... I have been trying this fix in our downstream environment and managed to get a hang with this backtrace: #0 needPadding (BoundarySize=, Size=, StartAddr=) at ./llvm/lib/MC/MCAssembler.cpp:1028 #1

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-02 Thread Philip Reames via Phabricator via cfe-commits
reames added a comment. I want to chime in support of jyknight's meta comments - particularly the one about the need to balance execution speed vs code size differently in hot vs cold code. For our use case, we have a very large amount of branch dense known cold paths, and being able to only

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-26 Thread James Y Knight via Phabricator via cfe-commits
jyknight added a comment. I have some other concerns about the code itself, but after pondering this a little bit, I'd like to instead bring up some rather more general concerns about the overall approach used here -- with some suggestions. (As these comments are not really comments on the

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-24 Thread Chandler Carruth via Phabricator via cfe-commits
chandlerc added a comment. (Just a reminder that we need to have both performance and code size numbers for this patch. And given that there are a few options, may need a few examples.) CHANGES SINCE LAST ACTION https://reviews.llvm.org/D70157/new/ https://reviews.llvm.org/D70157

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-22 Thread Kan Shengchen via Phabricator via cfe-commits
skan added a comment. In D70157#1755927 , @jyknight wrote: > An alternative would be to simply emit NOPs before branches as needed. That > would be substantially simpler, since it would only require special handling > for a branch or a fused-branch. I

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-21 Thread Kan Shengchen via Phabricator via cfe-commits
skan marked an inline comment as done. skan added a comment. In D70157#1755927 , @jyknight wrote: > Thanks for the comments, they help a little. But it's still somewhat > confusing, so let me write down what seems to be happening: > > - Before emitting

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-21 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments. Comment at: llvm/include/llvm/MC/MCFragment.h:684 + // need to be aligned. + static unsigned AlignBoundarySize; + // AlignMaxPrefixSize - The maximum size of prefixes per instruction. Global variables are forbidden in LLVM

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-21 Thread Craig Topper via Phabricator via cfe-commits
craig.topper added a comment. In D70157#1755927 , @jyknight wrote: > Thanks for the comments, they help a little. But it's still somewhat > confusing, so let me write down what seems to be happening: > > - Before emitting every instruction, a new

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-21 Thread James Y Knight via Phabricator via cfe-commits
jyknight added a comment. Thanks for the comments, they help a little. But it's still somewhat confusing, so let me write down what seems to be happening: - Before emitting every instruction, a new MCMachineDependentFragment is now emitted, of one of the multiple types: - For most

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-16 Thread James Y Knight via Phabricator via cfe-commits
jyknight added a comment. Overall comment: this whole change needs more comments, everywhere. Both for the added functions, and for the test cases. There is almost no description of what's happening, and it could really use it. Comment at: llvm/lib/MC/MCAssembler.cpp:1041 +

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-16 Thread Ed Maste via Phabricator via cfe-commits
emaste added a comment. > I think the default policy discussion might be better had on llvm-dev than a > Phab review. Ok, I agree with that, and also think it's useful to get this patch committed independent of a change to the defaults, which can be handled in a subsequent review. CHANGES

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-16 Thread Craig Topper via Phabricator via cfe-commits
craig.topper added a comment. In D70157#1747726 , @davezarzycki wrote: > In D70157#1747640 , @jyknight wrote: > > > In D70157#1747551 , @davezarzycki > > wrote: > > > > >

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-16 Thread David Zarzycki via Phabricator via cfe-commits
davezarzycki added a comment. In D70157#1747640 , @jyknight wrote: > In D70157#1747551 , @davezarzycki > wrote: > > > In D70157#1747510 , @xbolva00 > > wrote: > > > > > >

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-16 Thread James Y Knight via Phabricator via cfe-commits
jyknight added a comment. In D70157#1747551 , @davezarzycki wrote: > In D70157#1747510 , @xbolva00 wrote: > > > > Even though core2 isn't affected by the erratum, core2 code can run on > > > CPUs that do have the

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-16 Thread David Zarzycki via Phabricator via cfe-commits
davezarzycki added a comment. In D70157#1746793 , @MaskRay wrote: > On x86, the preferred function alignment is 16 > (https://github.com/llvm/llvm-project/blob/arcpatch-D70157/llvm/lib/Target/X86/X86ISelLowering.cpp#L1893), > which is the default

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-16 Thread Kan Shengchen via Phabricator via cfe-commits
skan added a comment. In D70157#1746793 , @MaskRay wrote: > On x86, the preferred function alignment is 16 > (https://github.com/llvm/llvm-project/blob/arcpatch-D70157/llvm/lib/Target/X86/X86ISelLowering.cpp#L1893), > which is the default function

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-16 Thread Ed Maste via Phabricator via cfe-commits
emaste added a comment. > Shall we default to -mbranches-within-32B-boundaries if the specified -march= > or -mtune= may be affected by the erratum? I think we should enable it based on `-mtune` specifying an affected CPU (and implicitly based on `-march` if `-mtune` is not specified).

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-16 Thread David Zarzycki via Phabricator via cfe-commits
davezarzycki added a comment. In D70157#1747510 , @xbolva00 wrote: > > Even though core2 isn't affected by the erratum, core2 code can run on CPUs > > that do have the bug (and core2 is a popular target for code that needs to > > run "everywhere"),

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-16 Thread Kan Shengchen via Phabricator via cfe-commits
skan marked 3 inline comments as done. skan added inline comments. Comment at: llvm/test/MC/X86/i386-align-branch-1a.s:1 +# RUN: llvm-mc -filetype=obj -triple i386-unknown-unknown --x86-align-branch-boundary=32 --x86-align-branch=fused-jcc-jmp --x86-align-branch-prefix-size=5

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-16 Thread Dávid Bolvanský via Phabricator via cfe-commits
xbolva00 added a comment. > Even though core2 isn't affected by the erratum, core2 code can run on CPUs > that do have the bug (and core2 is a popular target for code that needs to > run "everywhere"), therefore all target CPUs that predate a hardware fix > really So perf of all code/generic

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-16 Thread Kan Shengchen via Phabricator via cfe-commits
skan added a comment. In D70157#1747428 , @davezarzycki wrote: > In D70157#1746793 , @MaskRay wrote: > > > On x86, the preferred function alignment is 16 > >

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-14 Thread annita.zhang via Phabricator via cfe-commits
annita.zhang added a comment. In D70157#1744243 , @chandlerc wrote: > Thanks for sending this out! > > I've made a minor comment on the flag structure, but my bigger question would > be: can you summarize the performance hit and code size hit you've seen

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-14 Thread Xiang Zhang via Phabricator via cfe-commits
xiangzhangllvm added a comment. In D70157#1746793 , @MaskRay wrote: > On x86, the preferred function alignment is 16 > (https://github.com/llvm/llvm-project/blob/arcpatch-D70157/llvm/lib/Target/X86/X86ISelLowering.cpp#L1893), > which is the default

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-14 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added inline comments. Comment at: llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp:173 + + bool needAlignBranch() const; + bool needAlignJcc() const; We can generalize these functions with a function that takes a parameter.

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-14 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added a comment. On x86, the preferred function alignment is 16 (https://github.com/llvm/llvm-project/blob/arcpatch-D70157/llvm/lib/Target/X86/X86ISelLowering.cpp#L1893), which is the default function alignment in text sections. If the cross-boundary decision is made with alignment=32

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-14 Thread Fangrui Song via Phabricator via cfe-commits
MaskRay added inline comments. Comment at: llvm/lib/MC/MCFragment.cpp:435 +OS << "\n "; +OS << "Subtype:" << MF->SubKind << "Size: " << MF->getSize(); +break; ``` error: use of overloaded operator '<<' is ambiguous (with opera nd types

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-14 Thread Tom Stellard via Phabricator via cfe-commits
tstellar added a comment. Are there any optimizations in lld that might undo the 32-byte alignment emitted by the compiler? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D70157/new/ https://reviews.llvm.org/D70157

[PATCH] D70157: Align branches within 32-Byte boundary

2019-11-13 Thread Kan Shengchen via Phabricator via cfe-commits
skan added a comment. In D70157#1745128 , @craig.topper wrote: > In D70157#1745127 , @skan wrote: > > > In D70157#1745125 , @craig.topper > > wrote: > > > > > I've

  1   2   >