[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

Eric Gallager changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |egallager at gcc dot gnu.org

--- Comment #27 from Eric Gallager ---
is this really a meta-bug? Normally meta-bugs depend on other bugs...
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

--- Comment #26 from LIU Hao ---
Created attachment 57199
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57199&action=edit
Draft patch Ver. 2

1. Fix a typo in `ASM_OUTPUT_SYMBOL_REF` (`x` => `SYM`).
2. For Intel syntax, if the name does not start with a `*`, it is taken as a symbol and is quoted.
3. If the name starts with a `*`, it is a request for verbatim output. However, a comment in 'dwarf2cfi.cc' says 'dwarf2out.cc might give us a label expression (e.g. .LVL548-1) as second argument. If so, make it a subexpression, ...', so the name may be a combined expression. In that case, parse it for the `+` or `-` where the symbol stops, then quote the symbol and print the remaining part verbatim.
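
For readers without the attachment at hand, the quoting rule described in points 2 and 3 can be sketched roughly as follows. This is not the patch itself: `format_intel_symbol` is a hypothetical standalone helper that writes into a buffer, whereas the real change lives in GCC's output macros, and the sketch assumes the symbol part of a combined expression never contains `+` or `-`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not GCC code: format NAME for Intel-syntax output.
   Without a leading '*', the whole name is a symbol and is quoted.
   With a leading '*', the text may be a combined expression such as
   ".LVL548-1"; the symbol part is quoted and the rest copied unchanged.
   Returns what snprintf returns.  */
static int
format_intel_symbol (char *buf, size_t size, const char *name)
{
  assert (buf != NULL && name != NULL);

  if (name[0] != '*')
    return snprintf (buf, size, "\"%s\"", name);

  /* Skip the '*' and find where the symbol stops.  */
  const char *expr = name + 1;
  size_t symlen = strcspn (expr, "+-");
  return snprintf (buf, size, "\"%.*s\"%s", (int) symlen, expr,
                   expr + symlen);
}
```

With this sketch, `rax` becomes `"rax"` and `*.LVL548-1` becomes `".LVL548"-1`.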
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

--- Comment #25 from LIU Hao ---
Created attachment 57191
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57191&action=edit
Draft patch

This is a draft patch, bootstrapped on {i686,x86_64}-w64-mingw32 successfully. Haven't run tests though.
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

--- Comment #24 from LIU Hao ---
I've composed a proposal to address this issue:
https://github.com/lhmouse/mcfgthread/wiki/Formalized-Intel-Syntax-for-x86#the-proposal

The proposal is to treat names between `ptr` and `[` as symbols, and to treat names between `[` and `]` as registers. This

   lea rax, bx[rip]

should be rejected as invalid, while

   lea rax, BYTE PTR bx[rip]

can be parsed as referencing the symbol `bx` with no ambiguity.
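
A rough illustration of the positional rule, for discussion only: `symbol_after_ptr` below is a hypothetical helper, not gas or GCC code. It assumes the keyword is spelled `PTR` in upper case and does a naive substring search, so it is a sketch of the idea rather than a parser.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of the proposed rule.  Looks for "PTR" in OPERAND
   and copies the name found between it and '[' (or the end of the
   operand) into SYM.  Returns 1 if a symbol name was found, 0 if there
   is no "PTR" or nothing but a bracketed expression follows it.  */
static int
symbol_after_ptr (const char *operand, char *sym, size_t size)
{
  const char *p = strstr (operand, "PTR");
  if (p == NULL)
    return 0;

  /* Skip the keyword and any blanks after it.  */
  p += 3;
  while (*p == ' ' || *p == '\t')
    p++;

  /* The name ends at '[' or at the next blank.  */
  size_t len = strcspn (p, "[ \t");
  if (len == 0 || len >= size)
    return 0;

  memcpy (sym, p, len);
  sym[len] = '\0';
  return 1;
}
```

For `BYTE PTR bx[rip]` this yields the symbol `bx`; for `bx[rip]` and `DWORD PTR [rax+rcx*4]` it reports that no symbol is present.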
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

--- Comment #23 from LIU Hao ---
Changes to GCC should look like this I suspect (I didn't test this):

```
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index fbd33a6bfd1..de80c7a805f 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -14080,7 +14080,11 @@ ix86_print_operand_address_as (FILE *file, rtx addr,
          if (flag_pic)
            output_pic_addr_const (file, disp, 0);
          else if (GET_CODE (disp) == LABEL_REF)
-           output_asm_label (disp);
+           {
+             putc ('\"', file);
+             output_asm_label (disp);
+             putc ('\"', file);
+           }
          else if (CONST_INT_P (disp))
            offset = disp;
          else
```

It's a bit strange that `output_asm_label` writes output via a global `FILE*`.
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

--- Comment #22 from jbeulich at suse dot com ---
(In reply to LIU Hao from comment #21)
> oh really? I thought it would have to be implemented. If it's readily
> available, we can start making use of it right now.

Well, the general symbol part of it is there (with a few quirks, which I don't think would matter here). The missing part for quoted symbols matching register names was posted, see e.g. https://sourceware.org/pipermail/binutils/2023-May/127318.html.
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

--- Comment #21 from LIU Hao ---
(In reply to jbeulich from comment #20)
> This is assembly; I don't see how (dis)similarity with C would matter. I
> also don't see how your example is any different in this regard from
>
>    mov eax, "symbol"
>
> which gas has been supporting for quite some time.

oh really? I thought it would have to be implemented. If it's readily available, we can start making use of it right now.
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

--- Comment #20 from jbeulich at suse dot com ---
(In reply to LIU Hao from comment #19)
> (In reply to jbeulich from comment #11)
> > I have a rough plan on the gas side, but that will then need a gcc side
> > change as well: For a couple of years we have had quoted symbol names there.
> > While this doesn't currently work right in a number of cases (including the
> > one needed here) the plan is to make e.g.
> >
> >    mov eax, "ecx"
> >
> > not be treated the same as
> >
> >    mov eax, ecx
> >
> > but considering "ecx" a symbol name due to the quotation. Obviously gcc's
>
> I don't like double quotes here, because it looks something different, like
> in C.

This is assembly; I don't see how (dis)similarity with C would matter. I also don't see how your example is any different in this regard from

   mov eax, "symbol"

which gas has been supporting for quite some time.

> Would it make some sense if we take the approach for MIPS and AArch64
> [1], so
>
>    mov eax, %ecx
>
> or
>
>    mov eax, :ecx
>
> denotes `ecx` is the name of a label, and otherwise a register. Also, such a
> prefix should be optional, so people who write assembly can omit it if they
> carefully avoid such names.

I can't find any indication of such syntax being supported by gas for either of these architectures. % on MIPS and : on Arm64 actually are involved in relocation specifiers instead. Are you suggesting to overload them? (Note that % is out of game here, for being the register prefix on x86.)
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

--- Comment #19 from LIU Hao ---
(In reply to jbeulich from comment #11)
> I have a rough plan on the gas side, but that will then need a gcc side
> change as well: For a couple of years we have had quoted symbol names there.
> While this doesn't currently work right in a number of cases (including the
> one needed here) the plan is to make e.g.
>
>    mov eax, "ecx"
>
> not be treated the same as
>
>    mov eax, ecx
>
> but considering "ecx" a symbol name due to the quotation. Obviously gcc's

I don't like double quotes here, because it looks something different, like in C.

Would it make some sense if we take the approach for MIPS and AArch64 [1], so

   mov eax, %ecx

or

   mov eax, :ecx

denotes `ecx` is the name of a label, and otherwise a register. Also, such a prefix should be optional, so people who write assembly can omit it if they carefully avoid such names.

[1] https://maskray.me/blog/2023-05-08-assemblers
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

--- Comment #18 from LIU Hao ---
Would it make any sense to have GAS be more permissive about such labels,

1. unconditionally? or
2. when input is from a pipe? or
3. when a special option is in effect e.g. `--output-from-gcc`?
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

--- Comment #17 from LIU Hao ---
Yeah. It looks to me like the Microsoft compiler doesn't actually use the assembler (like LLVM).

Given the C source:

```
extern int rax;
int main() { return rax; }
```

which compiled without errors:

```
> cl /O2 /c test.c /Fatest.asm
Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30148 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

test.c
```

and produced this assembly file

```
include listing.inc

INCLUDELIB LIBCMT
INCLUDELIB OLDNAMES

PUBLIC  main
EXTRN   rax:DWORD
_TEXT   SEGMENT
main    PROC                    ; COMDAT
        mov     eax, DWORD PTR rax
        ret     0
main    ENDP
_TEXT   ENDS
END
```

which can't be assembled

```
> ml64 /c test.asm
Microsoft (R) Macro Assembler (x64) Version 14.29.30148.0
Copyright (C) Microsoft Corporation. All rights reserved.

 Assembling: test.asm
test.asm(9) : error A2008:syntax error : rax
test.asm(16) : error A2032:invalid use of register
```
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

--- Comment #16 from jbeulich at suse dot com ---
(In reply to LIU Hao from comment #15)
> This is accepted by ML64:
>
> ```
> PUBLIC  main
> EXTRN   rip:DWORD
> _TEXT   SEGMENT
> main    PROC
>         mov     eax, DWORD PTR rip
>         ret     0
> main    ENDP
> _TEXT   ENDS
> END
> ```

Which version? And did you try other register names? Unfortunately the newest I have access to right now is 12.x, and as said in #14 register names other than "rip" won't work there when (attempted to be) used as symbols. Clearly there's little point in dealing with "rip" alone.

> Does it make sense to create kinda compatibility mode for ML, in addition to
> MASM, if they are deemed to be incompatible?

ML == MASM, at least for me (ML and ML64 are merely the names of the [non-ancient] executables).
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

--- Comment #15 from LIU Hao ---
> Which at least MASM up to 12.x won't assemble. For one it complains about
> "rip" being undeclared. And then the load of "ecx" is _not_ a memory access
> (i.e. the "DWORD PTR" is ignored there). Which is in line with it also
> objecting to something like "extrn eax:dword".

This is accepted by ML64:

```
PUBLIC  main
EXTRN   rip:DWORD
_TEXT   SEGMENT
main    PROC
        mov     eax, DWORD PTR rip
        ret     0
main    ENDP
_TEXT   ENDS
END
```

Does it make sense to create kinda compatibility mode for ML, in addition to MASM, if they are deemed to be incompatible?

> I say this because I'd be happy to help this on the gas side, but only
> without breaking MASM compatibility. My present plan for gas is (as already
> outlined in #11) to make quoted identifiers unambiguously mean symbols, not
> registers. But of course that would still require a gcc side change as well.
> Unfortunately there continue to be inconsistencies in gas with quoted
> identifiers in general, and it's not entirely clear yet whether those may
> need addressing first.

That quoting thing will be yet another extension. I think we had better keep extensions as few as possible.
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

--- Comment #14 from jbeulich at suse dot com ---
(In reply to LIU Hao from comment #13)
> MSVC outputs:
> ```
> get_value PROC ; COMDAT
>         mov     ecx, DWORD PTR eax
>         mov     rax, QWORD PTR rip
>         mov     eax, DWORD PTR [rax+rcx*4]
>         ret     0
> get_value ENDP
> ```

Which at least MASM up to 12.x won't assemble. For one it complains about "rip" being undeclared. And then the load of "ecx" is _not_ a memory access (i.e. the "DWORD PTR" is ignored there). Which is in line with it also objecting to something like "extrn eax:dword".

I say this because I'd be happy to help this on the gas side, but only without breaking MASM compatibility. My present plan for gas is (as already outlined in #11) to make quoted identifiers unambiguously mean symbols, not registers. But of course that would still require a gcc side change as well. Unfortunately there continue to be inconsistencies in gas with quoted identifiers in general, and it's not entirely clear yet whether those may need addressing first.
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

--- Comment #13 from LIU Hao ---
dup notwithstanding, I think I had better copy my recommendation here for reference:

This is how MSVC handles such names: (https://gcc.godbolt.org/z/TonjYaxqj)

```
static int* volatile rip;
static unsigned int volatile eax;

int get_value(void)
{
  return rip[eax];
}
```

MSVC outputs:

```
get_value PROC ; COMDAT
        mov     ecx, DWORD PTR eax
        mov     rax, QWORD PTR rip
        mov     eax, DWORD PTR [rax+rcx*4]
        ret     0
get_value ENDP
```

GCC outputs:

```
get_value:
        mov     rdx, QWORD PTR rip[rip]
        mov     eax, DWORD PTR eax[rip]
        mov     eax, DWORD PTR [rdx+rax*4]
        ret
```

In the case of MSVC, `DWORD PTR eax` is unambiguously parsed as the label `eax` and `DWORD PTR [eax]` is unambiguously parsed as the register `eax`. The addresses of all labels are always relative to RIP, but this is implied, and brackets are not written explicitly.

Maybe GCC can follow MSVC and omit the RIP register and brackets. The x86_64 memory reference syntax matches x86; the only change is in the semantics of the immediate offset (for x86_64 it is relative to the next instruction, while for i686 it is absolute), but the opcode is the same.
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lh_mouse at 126 dot com

--- Comment #12 from Andrew Pinski ---
*** Bug 109726 has been marked as a duplicate of this bug. ***
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

--- Comment #11 from jbeulich at suse dot com ---
I have a rough plan on the gas side, but that will then need a gcc side change as well: For a couple of years we have had quoted symbol names there. While this doesn't currently work right in a number of cases (including the one needed here) the plan is to make e.g.

   mov eax, "ecx"

not be treated the same as

   mov eax, ecx

but considering "ecx" a symbol name due to the quotation. Obviously gcc's configure mechanism would then need to detect the assembler's capability of understanding this, and quote symbol names accordingly (perhaps universally rather than special-casing any particular names).

While this isn't MASM-compatible (MASM treats "ecx" in such a case as an immediate), I view this as less of a problem than using e.g. Arm's model of enclosing a register name in parentheses to designate it as a symbol name:

   mov eax, (ecx)

MASM treats this the same as with no parentheses present, and I consider this form to be more likely to be used in code ported from MASM than double quoted literals used as immediate constants. (Regardless of the choice in the end it may turn out necessary to hide the new behavior behind a new command line option and/or directive extension.)
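
To make the "universally rather than special-casing" remark concrete, here is a small sketch contrasting the two quoting policies. Everything in it is hypothetical and not taken from gcc or gas: the table is a deliberately incomplete sample (a few registers plus the operator `and` from the original report), and case-insensitive matching is an assumption.

```c
#include <ctype.h>
#include <stddef.h>

/* Deliberately incomplete sample of names reserved in Intel syntax.  */
static const char *const reserved_sample[] =
  { "eax", "ecx", "rax", "rip", "bx", "and" };

/* Case-insensitive string equality.  */
static int
same_name (const char *a, const char *b)
{
  while (*a != '\0' && *b != '\0')
    {
      if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
        return 0;
      a++;
      b++;
    }
  return *a == *b;
}

/* Policy A: quote only names that collide with a reserved word.
   The table has to track every register and operator the assembler
   knows, including ones added in the future.  */
static int
needs_quotes_special_cased (const char *name)
{
  size_t n = sizeof reserved_sample / sizeof reserved_sample[0];
  for (size_t i = 0; i < n; i++)
    if (same_name (name, reserved_sample[i]))
      return 1;
  return 0;
}

/* Policy B: quote every symbol name; there is no table to maintain.  */
static int
needs_quotes_universal (const char *name)
{
  (void) name;
  return 1;
}
```

Policy A keeps the output closer to what is emitted today but can silently miss a newly added register name; policy B avoids that at the cost of quoting everything.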
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

H.J. Lu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |teo.samarzija at gmail dot com

--- Comment #10 from H.J. Lu ---
*** Bug 95652 has been marked as a duplicate of this bug. ***
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

--- Comment #9 from H.J. Lu ---
*** Bug 87986 has been marked as a duplicate of this bug. ***
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

H.J. Lu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |umrihinva123 at gmail dot com

--- Comment #8 from H.J. Lu ---
*** Bug 98488 has been marked as a duplicate of this bug. ***
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929

H.J. Lu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2020-12-31
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
            Summary|Bug in the use of Intel asm |[meta-bug] -masm=intel with
                   |syntax when a global is     |global symbol
                   |named "and"                 |