[PATCH] D149119: [CMake] Use llvm-nm to extract symbols for staged LTO builds on Windows

2023-05-08 Thread Igor Kudrin via Phabricator via cfe-commits
ikudrin added a comment.

In D149119#4312518 , @simon_tatham 
wrote:

> Do LLVM's current portability goals include the constraint that you can only 
> build LLVM for a platform it can also target? If not, then there surely still 
> needs to be //some// kind of escape hatch so that you can avoid needing 
> `llvm-nm` to already support the object file format of the host platform.
>
> I suppose you could say that in that unusual situation it's up to you to 
> adapt `extract_symbols.py` so that it has some other way to get the answers.

If I understand it right, the main targets to use `extract_symbols.py` are AIX 
and Windows. For both these platforms `llvm-nm` and `llvm-readobj` can be 
built. Support for more exotic situations can be added relatively easily should 
it be required.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149119/new/

https://reviews.llvm.org/D149119

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149119: [CMake] Use llvm-nm to extract symbols for staged LTO builds on Windows

2023-05-02 Thread Simon Tatham via Phabricator via cfe-commits
simon_tatham added a comment.

Do LLVM's current portability goals include the constraint that you can only 
build LLVM for a platform it can also target? If not, then there surely still 
needs to be //some// kind of escape hatch so that you can avoid needing 
`llvm-nm` to already support the object file format of the host platform.

I suppose you could say that in that unusual situation it's up to you to adapt 
`extract_symbols.py` so that it has some other way to get the answers.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149119/new/

https://reviews.llvm.org/D149119

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149119: [CMake] Use llvm-nm to extract symbols for staged LTO builds on Windows

2023-04-26 Thread Tomas Matheson via Phabricator via cfe-commits
tmatheson added a comment.

In D149119#4297540 , @ikudrin wrote:

> If I understand it right, we might not be able to build `llvm-nm` in cases 
> like cross-platform building, right?

LLVM has a way to build tools that need to run on the build machine as part of 
the build (`tablegen` for example), `llvm-nm` could be added to that system and 
then it would be available when `extract_symbols.py` is run. It would be an 
issue if `llvm-nm` ever had to depend on `extract_symbols.py` but that is not 
currently the case afaik.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149119/new/

https://reviews.llvm.org/D149119

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149119: [CMake] Use llvm-nm to extract symbols for staged LTO builds on Windows

2023-04-26 Thread Martin Storsjö via Phabricator via cfe-commits
mstorsjo added a comment.

In D149119#4298024 , @phosek wrote:

> Supporting only a single tool and simplifying the script would be my 
> preference as well. I see that the script already supports `llvm-readobj`, do 
> we need the `llvm-nm` support in that case?

I remember seeing it mentioned somewhere (I don't immediately see where right 
now though), `llvm-readobj` only handles regular object files, while `llvm-nm` 
handles bitcode too.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149119/new/

https://reviews.llvm.org/D149119

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149119: [CMake] Use llvm-nm to extract symbols for staged LTO builds on Windows

2023-04-26 Thread Petr Hosek via Phabricator via cfe-commits
phosek added a comment.

Supporting only a single tool and simplifying the script would be my preference 
as well. I see that the script already supports `llvm-readobj`, do we need the 
`llvm-nm` support in that case?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149119/new/

https://reviews.llvm.org/D149119

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149119: [CMake] Use llvm-nm to extract symbols for staged LTO builds on Windows

2023-04-25 Thread Igor Kudrin via Phabricator via cfe-commits
ikudrin added a comment.

In D149119#4295110 , @tmatheson wrote:

> This looks like a nice addition. Would it make sense to use llvm-nm always, 
> not restricted to bootstrap builds? And would that work on Windows and allow 
> us to simplify this script substantially by using one tool for all platforms?

If I understand it right, we might not be able to build `llvm-nm` in cases like 
cross-platform building, right?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149119/new/

https://reviews.llvm.org/D149119

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149119: [CMake] Use llvm-nm to extract symbols for staged LTO builds on Windows

2023-04-25 Thread Tomas Matheson via Phabricator via cfe-commits
tmatheson added a comment.

This looks like a nice addition. Would it make sense to use llvm-nm always, not 
restricted to bootstrap builds? And would that work on Windows and allow us to 
simplify this script substantially by using one tool for all platforms?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149119/new/

https://reviews.llvm.org/D149119

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D149119: [CMake] Use llvm-nm to extract symbols for staged LTO builds on Windows

2023-04-24 Thread Igor Kudrin via Phabricator via cfe-commits
ikudrin created this revision.
ikudrin added reviewers: john.brawn, daltenty, jsji, simon_tatham, tmatheson, 
mstorsjo, phosek.
ikudrin added projects: LLVM, clang.
Herald added subscribers: ekilmer, inglorion.
Herald added a project: All.
ikudrin requested review of this revision.

As for now, `extract_symbols.py` uses a predefined set of tools, none of which 
can read bitcode files. The patch makes it possible to override the used tool 
and passes a fresh built `llvm-nm` for that for multi-staged LTO builds. This 
fixes building plugins with LTO builds and subsequently makes 
`clang/test/Frontend/plugin-*` tests pass.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D149119

Files:
  clang/CMakeLists.txt
  llvm/utils/extract_symbols.py

Index: llvm/utils/extract_symbols.py
===
--- llvm/utils/extract_symbols.py
+++ llvm/utils/extract_symbols.py
@@ -29,8 +29,8 @@
 # as, especially on Windows, waiting for the entire output to be ready can take
 # a significant amount of time.
 
-def dumpbin_get_symbols(lib):
-process = subprocess.Popen(['dumpbin','/symbols',lib], bufsize=1,
+def dumpbin_get_symbols(tool, lib):
+process = subprocess.Popen([tool,'/symbols',lib], bufsize=1,
stdout=subprocess.PIPE, stdin=subprocess.PIPE,
universal_newlines=True)
 process.stdin.close()
@@ -41,10 +41,10 @@
 yield (match.group(2), match.group(1) != "UNDEF")
 process.wait()
 
-def nm_get_symbols(lib):
+def nm_get_symbols(tool, lib):
 # -P means the output is in portable format, and -g means we only get global
 # symbols.
-cmd = ['nm','-P','-g']
+cmd = [tool,'-P','-g']
 if sys.platform.startswith('aix'):
 cmd += ['-Xany','-C','-p']
 process = subprocess.Popen(cmd+[lib], bufsize=1,
@@ -68,8 +68,8 @@
 yield (match.group(1), False)
 process.wait()
 
-def readobj_get_symbols(lib):
-process = subprocess.Popen(['llvm-readobj','--symbols',lib], bufsize=1,
+def readobj_get_symbols(tool, lib):
+process = subprocess.Popen([tool,'--symbols',lib], bufsize=1,
stdout=subprocess.PIPE, stdin=subprocess.PIPE,
universal_newlines=True)
 process.stdin.close()
@@ -95,10 +95,10 @@
 # Define functions which determine if the target is 32-bit Windows (as that's
 # where calling convention name decoration happens).
 
-def dumpbin_is_32bit_windows(lib):
+def dumpbin_is_32bit_windows(tool, lib):
 # dumpbin /headers can output a huge amount of data (>100MB in a debug
 # build) so we read only up to the 'machine' line then close the output.
-process = subprocess.Popen(['dumpbin','/headers',lib], bufsize=1,
+process = subprocess.Popen([tool,'/headers',lib], bufsize=1,
stdout=subprocess.PIPE, stdin=subprocess.PIPE,
universal_newlines=True)
 process.stdin.close()
@@ -112,8 +112,8 @@
 process.wait()
 return retval
 
-def objdump_is_32bit_windows(lib):
-output = subprocess.check_output(['objdump','-f',lib],
+def objdump_is_32bit_windows(tool, lib):
+output = subprocess.check_output([tool,'-f',lib],
  universal_newlines=True)
 for line in output.splitlines():
 match = re.match('.+file format (\S+)', line)
@@ -121,8 +121,8 @@
 return (match.group(1) == 'pe-i386')
 return False
 
-def readobj_is_32bit_windows(lib):
-output = subprocess.check_output(['llvm-readobj','--file-header',lib],
+def readobj_is_32bit_windows(tool, lib):
+output = subprocess.check_output([tool,'--file-header',lib],
  universal_newlines=True)
 for line in output.splitlines():
 match = re.match('Format: (\S+)', line)
@@ -132,7 +132,7 @@
 
 # On AIX, there isn't an easy way to detect 32-bit windows objects with the system toolchain,
 # so just assume false.
-def aix_is_32bit_windows(lib):
+def aix_is_32bit_windows(tool, lib):
 return False
 
 # MSVC mangles names to ?@. By examining the
@@ -355,10 +355,10 @@
 return components
 
 def extract_symbols(arg):
-get_symbols, should_keep_symbol, calling_convention_decoration, lib = arg
+get_symbols, get_symbols_tool, should_keep_symbol, calling_convention_decoration, lib = arg
 symbol_defs = dict()
 symbol_refs = set()
-for (symbol, is_def) in get_symbols(lib):
+for (symbol, is_def) in get_symbols(get_symbols_tool, lib):
 symbol = should_keep_symbol(symbol, calling_convention_decoration)
 if symbol:
 if is_def:
@@ -392,8 +392,20 @@
 # Not a template
 return None
 
+def parse_arg_override(parser, val):
+tool, _, path = val.partition('=')
+if not tool in known_tools:
+parser.error(f'Unknown tool: {tool}')
+if not path or not os.path.isfile(path):
+