ibookstein added a comment.

I'll first explain my thought process about the representation of aliases and 
ifuncs in the IR, and why I think both aliasees and resolvers must always be 
defined; I hope I'm not completely off track and would love it if @MaskRay 
could weigh in as to whether I make sense.
Let's start at the level of the object file; My understanding is that aliases 
apply to ELF and MachO, and ifuncs apply only to ELF. I'm not at all acquainted 
with MachO, but on the ELF side, my understanding is that:

1. Aliases are simply lowered to additional symbols with the same st_value as 
their aliasee. As long as the value of a symbol has to be concrete/numeric and 
cannot express a way to refer to another symbol, for aliases to make sense and 
have the correct semantics at this level, their aliasee must be defined at the 
IR level. Otherwise all you're left with at the object file level is an 
undefined symbol with no way to express that it 'wants to' alias an external 
symbol with some specified name. In other words, symbols are either undefined 
(st_shndx == 0, st_value meaningless) or defined (st_shndx != 0, st_value 
meaningful and holds a section offset). If we were to allow aliases to have 
undefined aliasees, they would decay to simple undefined symbols and lose their 
aliasee information.
2. IFuncs are lowered to specially typed symbols whose st_value is the 
resolver. In much the same way as aliases, for this to actually have any 
meaning, the resolver must be defined (because you have no way to specify "the 
value is in another castle named 'XYZ'", only "defined at offset X" or 
"undefined"). When we allow ifuncs to have undefined resolvers, they decay to 
simple undefined symbols with the additional wart of having a special symbol 
type, but the desired resolver name is lost. Concretely, as long as the linker 
doesn't throw a fuss at said wart, for the references against that symbol from 
within the object file this will behave like a simple undefined external 
function. Because in your implementation one TU will have a cpu_dispatch and 
therefore a defined resolver, it will 'win' and intra-EXE/intra-DSO references 
against the ifunc will indeed be bound against the return value of the 
resolver. If no translation unit in the EXE/DSO had an ifunc with the same name 
and a defined resolver, you'd end up with a peculiar undefined symbol of type 
ifunc in the EXE/DSO (same as the .o).

It is my conclusion therefore that ifuncs with undefined resolvers behave 
exactly like function declarations (and lose the name of the resolver), as long 
as the linker is willing to accept such weird symbols.
Therefore, at the IR level, they're representational slack at best, and don't 
do what you want (possibly binding against a differently-named resolver) at 
worst, so they should not be allowed.

> My understanding is the frontend's semantic rules are/were different from the 
> IR rules, which is why we implemented it that way.

As I understand it, features like aliases and ifuncs consist mostly of vertical 
plumbing to expose low-level object-file semantics, and their design must be 
informed by them.

> I'm not sure what you mean here?  Are you suggesting that an undefined 
> resolver should instead just implement an undefined 'function' for the 
> .ifunc?  This doesn't seem right?  Wouldn't that cause ODR issues?

As I understand it, making the symbol your current implementation calls 
"x.ifunc" a function **declaration** which gets upgraded to an ifunc with a 
defined resolver on encountering cpu_dispatch would yield the correct behavior.

> I guess I still don't understand what the practical limitation that requires 
> ifuncs to have a defined resolver?  The resolver is just a normal function, 
> so it seems to me that allowing them to have normal linking rules makes 
> sense?  I personally think this is the least obtrusive change; this patch is 
> adding a limitation that didn't exist previously unnecessarily.

I //think// I've addressed this in the wall of text above

> It just seems odd I guess to name a function .ifunc, and not have it be an 
> ifunc?  What does our linker think about that?

Ah, the name is just a name :)
As far as the linker is concerned, it encounters an object file with an 
undefined symbol (of type STT_NOTYPE) and an object file with a defined symbol 
with the same name, of type STT_GNU_IFUNC. It will bind references in the 
former against the definition in the latter.
Here's my trying it out:

  itay@CTHULHU ~/tmp/ifuncdecl/tu> cat main.c
  int foo(void);
  int main() { return foo(); }
  
  itay@CTHULHU ~/tmp/ifuncdecl/tu> cat foo.c
  static int foo_impl(void) { return 42; }
  static void *foo_resolver(void) { return &foo_impl; }
  int foo(void) __attribute__((ifunc("foo_resolver")));
  
  itay@CTHULHU ~/tmp/ifuncdecl/tu> clang-14 -c main.c -o main.c.o
  itay@CTHULHU ~/tmp/ifuncdecl/tu> clang-14 -c foo.c -o foo.c.o
  itay@CTHULHU ~/tmp/ifuncdecl/tu> clang-14 main.c.o foo.c.o -o main
  itay@CTHULHU ~/tmp/ifuncdecl/tu> ./main
  itay@CTHULHU ~/tmp/ifuncdecl/tu [42]>
  itay@CTHULHU ~/tmp/ifuncdecl/tu> llvm-readobj-14 --elf-output-style=GNU 
main.c.o --symbols | grep foo
       4: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT   UND foo
  itay@CTHULHU ~/tmp/ifuncdecl/tu> llvm-readobj-14 --elf-output-style=GNU 
foo.c.o --symbols | grep foo
       1: 0000000000000000     0 FILE    LOCAL  DEFAULT   ABS foo.c
       3: 0000000000000000    16 FUNC    LOCAL  DEFAULT     2 foo_resolver
       4: 0000000000000010    11 FUNC    LOCAL  DEFAULT     2 foo_impl
       5: 0000000000000000    16 IFUNC   GLOBAL DEFAULT     2 foo
  itay@CTHULHU ~/tmp/ifuncdecl/tu> llvm-readobj-14 --elf-output-style=GNU main 
--symbols | grep foo
      35: 0000000000000000     0 FILE    LOCAL  DEFAULT   ABS foo.c
      36: 0000000000401150    16 FUNC    LOCAL  DEFAULT    13 foo_resolver
      37: 0000000000401160    11 FUNC    LOCAL  DEFAULT    13 foo_impl
      56: 0000000000401150    16 IFUNC   GLOBAL DEFAULT    13 foo
  itay@CTHULHU ~/tmp/ifuncdecl/tu> llvm-readobj-14 --elf-output-style=GNU main 
--relocations
  
  Relocation section '.rela.dyn' at offset 0x3d0 contains 2 entries:
      Offset             Info             Type               Symbol's Value  
Symbol's Name + Addend
  0000000000403ff0  0000000100000006 R_X86_64_GLOB_DAT      0000000000000000 
__libc_start_main@GLIBC_2.2.5 + 0
  0000000000403ff8  0000000200000006 R_X86_64_GLOB_DAT      0000000000000000 
__gmon_start__ + 0
  
  Relocation section '.rela.plt' at offset 0x400 contains 1 entries:
      Offset             Info             Type               Symbol's Value  
Symbol's Name + Addend
  0000000000404018  0000000000000025 R_X86_64_IRELATIVE                401150



> Thats correct, these aren't 'aliases' or 'ifuncs' as far as the CFE is 
> concerned; they are multiversioned functions.  That 'Aliases' and 'ifunc' 
> list in the CFE are the AST-constructs of those, not the IR constructs, so 
> there is no reason to put the multiversioned thinks in that list, since they 
> are implementation details.  Emitting an error "invalid alias!"/etc for

I see, makes sense, thanks for the explanation.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112349/new/

https://reviews.llvm.org/D112349

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to