rnk added a comment.

I guess my initial reaction is that it is disappointing that downstream 
consumers can't cope with non-unique symbol names (What is the point of having 
internal linkage if we can't rely on it?), but given @mtrofin's input about 
AFDO, this seems like it may be a practical solution. I can think of at least 
two other cases where I have wanted similar functionality, the first being 
unique names for CodeView types in anonymous namespaces, the second being Jumbo 
support, producing one object from two .cpp source inputs.

To handle the case for type identifiers, we came up with this solution:
https://github.com/llvm/llvm-project/blob/master/clang/lib/AST/MicrosoftMangle.cpp#L401
Your solution is pretty similar.

In D73307#1839829 <https://reviews.llvm.org/D73307#1839829>, @MaskRay wrote:

> I haven't followed Propeller development, but hope someone with profile 
> experience can confirm InternalLinkage is the only linkage we need to care 
> about (otherwise the option will be a misnomer if we ever extend it) and 
> check whether this feature is useful on its own.


There is one other "local" linkage type, `private` linkage, which I believe 
corresponds with temporary assembler labels (.L*) that don't survive into the 
object file symbol table. I think the main use case for private globals is to 
save on symbol table size, so if the user passes this flag, it seems reasonable 
to upgrade the linkage to internal, and assign a unique name.

There is also the case of unnamed globals, which I think are private. I don't 
think clang ever generates these, so they are not very interesting.

I grepped the clang/test subdirectory for 'define.*private', and it seems there 
are no tests for private functions, so I doubt clang ever emits them.

---

At a higher level, should this just be an IR pass that clang adds into the 
pipeline when the flag is set? It should be safe to rename internal functions 
and give private functions internal linkage. It would be less invasive to clang 
and have better separation of concerns. As written, this is based on the source 
filename on the module, which is accessible from IR. The only reason I can 
think of against this is that the debug info might refer to the function 
linkage name, but maybe that is calculated later during codegen.



================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1062
 
   return std::string(Out.str());
 }
----------------
We seem to be defeating NRVO with our "clever" small string type. :(


================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1125
   const auto *ND = cast<NamedDecl>(GD.getDecl());
   std::string MangledName = getMangledNameImpl(*this, GD, ND);
 
----------------
There are several other callers of getMangledNameImpl. Should they also use 
this suffix? They mostly have to do with function multiversioning. Those seem 
like they could be internal and need this treatment.


================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1131
+  if (getCodeGenOpts().UniqueInternalFuncNames &&
+      dyn_cast<FunctionDecl>(GD.getDecl()) &&
+      getFunctionLinkage(GD) == llvm::GlobalValue::InternalLinkage &&
----------------
`isa` would be more idiomatic than testing `dyn_cast` results for truth.


================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1135
+    llvm::MD5 Md5;
+    Md5.update(getModule().getSourceFileName());
+    llvm::MD5::MD5Result R;
----------------
davidxl wrote:
> Source filenames are not guaranteed to be unique, or it does contain the path 
> as well?
It appears to contain the path as the compiler receives it on the command line. 
However, it is possible to design a build where this string is not unique:
```
cd foo
clang -c name_conflict.c 
cd ../bar
clang -c name_conflict.c
clang foo/name_conflict.o bar/name_conflict.o
```

I don't think we should try to handle this case, but we should document the 
limitation somewhere. Some build systems do operate changing the working 
directory as they go.

Can you add some to the clang/docs/UsersManual.rst file? I'd expect it to show 
up here:
https://clang.llvm.org/docs/UsersManual.html#command-line-options

You can elaborate that the unique names are calculated using the source path 
passed to the compiler, and in a build system where that is not unique, the 
function names may not be unique.

---

I have also used this construct in the past for building single-file ABI 
compatibility test cases:

```
// foo.cpp
#ifdef PART1
// ...
#elif defined(PART2)
// ...
#endif

$ cc -c foo.cpp -DPART1
$ cc -c foo.cpp -DPART2
```


================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1139
+    SmallString<32> Str;
+    llvm::MD5::stringifyResult(R, Str);
+    std::string UniqueSuffix = ("." + Str).str();
----------------
I think it would be preferable to calculate the source filename MD5 hash once, 
when CodeGenModule is constructed, stringify it, and then use that every time 
we mangle a name.


================
Comment at: clang/test/CodeGen/unique-internal-funcnames.c:3
+
+// RUN: %clang -target x86_64 -S -o - %s | FileCheck %s --check-prefix=PLAIN
+// RUN: %clang -target x86_64 -S -funique-internal-funcnames -o -  %s | 
FileCheck %s --check-prefix=UNIQUE
----------------
tmsriram wrote:
> davidxl wrote:
> > MaskRay wrote:
> > > You can hardly find any .c -> .s test in clang/test. We mostly do .c -> 
> > > .ll testing. `.ll` -> `.s` are in llvm/test/CodeGen. 
> > Is this convention documented somewhere? Any concerns of producing .s?
> There are more than 200 tests in clang that do -S from .c -> .s and tens of 
> tests in CodeGen alone.  I can change this to .ll but curious where this 
> "hardly any" comes from?
I can't specifically find documentation for this convention, but it is 
generally understood that every feature should have targeted unit tests that 
exercise the minimum amount of functionality. In this case, that means using 
`%clang_cc1 -emit-llvm -triple NNN` and checking the IR.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73307/new/

https://reviews.llvm.org/D73307



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to