[clang] Switch builtin strings to use string tables (PR #118734)
chandlerc wrote: I see the build bot up and with the original failure, going to merge the revert to hopefully make sure it goes green after that. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
chandlerc wrote: > That was a mistake. I originally was going to update the build bot to a later > version of Visual Studio as while it wouldn't match our internal > configuration, it was still useful to have some Windows coverage of the > components we build/test. I then changed my mind and was just going to > restore the original configuration (and thought I had) but had accidentally > replaced the compiler with a newer version. I'm going to put the original one > back. Ah, no problem. Let me know when its up and I can wait for at least one build to cycle before reverting. > > I have started the process to try and upgrade the version we are using > internally, but that will likely take a while. Of course, that's part of why I asked -- I was quite surprised to see the update. This does at least seem to isolate to MSVC version though and not some other aspect of the machine or configuration, which is actually really useful to know. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
dyung wrote: > So, the builder came back up, and is now passing without this landing... > > Here is the first build after it came back up: > https://lab.llvm.org/buildbot/#/builders/46/builds/9173 > > And I went through the output and found a specific test that was failing, and > it seems to pass? > > ``` > PASS: Clang :: AST/builtins-arm-strex-rettype.c (433 of 81898) > ``` > > It looks like the builder is now using MSVC version 14.29.30133 -- so that > might explain it. This maybe means that there is some issue with later patch > releases of 14.28 (in Visual Studio v16.9 as I understand it)? > > Are all of your builders OK moving to 14.29 (Visual Studio 16.11 from my > reading)? That was a mistake. I originally was going to update the build bot to a later version of Visual Studio as while it wouldn't match our internal configuration, it was still useful to have some Windows coverage of the components we build/test. I then changed my mind and was just going to restore the original configuration (and thought I had) but had accidentally replaced the compiler with a newer version. I'm going to put the original one back. I have started the process to try and upgrade the version we are using internally, but that will likely take a while. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
chandlerc wrote: So, the builder came back up, and is now passing without this landing... Here is the first build after it came back up: https://lab.llvm.org/buildbot/#/builders/46/builds/9173 And I went through the output and found a specific test that was failing, and it seems to pass? ``` PASS: Clang :: AST/builtins-arm-strex-rettype.c (433 of 81898) ``` It looks like the builder is now using MSVC version 14.29.30133 -- so that might explain it. This maybe means that there is some issue with later patch releases of 14.28 (in Visual Studio v16.9 as I understand it)? Are all of your builders OK moving to 14.29 (Visual Studio 16.11 from my reading)? https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
dyung wrote: > > No real updates here, but our internal builder did catch up to this commit > > and we are seeing the same (and a lot more) failures when this commit is > > merged into our downstream codebase. I was kind of hoping that it would > > pass so that it might indicate that the problem might be a configuration > > issue with our upstream build bot, but the fact that it fails in a similar > > manner likely points to some kind of bug in the version of MSVC that we are > > using. > > It really would be helpful to try a different version of MSVC. > > As I pointed out in the original reply to the build bot failure, this bot > seems to be using a version of MSVC that isn't even a listed release. The > last patch release of 16.9 is `192829923` according to Wikipedia, but this > builder reports `192829924`. To me, this seems to indicate a build that may > not have gone through full Q&A and such. The build was released as far as I'm aware (we don't install pre-release versions on our production machines). That being said, I'm guessing it is probably a bug with the specific version of MSVC that we are using. I have started the conversation internally about upgrading the version of MSVC we use to build, but that will take time. > Would it help for me to post a PR that adds some validation of the string > table? I can easily do that, and maybe that would let you isolate whether > this is definitely an MSVC version issue vs. something with the code. Sure, I'm happy to try anything that you think might help and this is one of my higher priority issues at the moment. > > @chandlerc could we consider reverting the change until we can figure out > > what the problem is? I will continue to take a look into it, but I don't > > really have any ideas what might be the cause at the moment, so have no ETA > > on a work-around, and in the meantime, this will continue to cause test > > failures for us internally. > > I'm happy to revert, and I'll start working on that. But I think there needs > to be a concrete ETA on being able to make progress. Especially given the > fact that no other builders seem to hit this and it may be specific to an > MSVC version that even other windows devs don't realistically have access > to... > > Can you give some fairly concrete timeline? 1 week? maybe 2? Otherwise, there > is no actual path to resolving this. I will look into it the rest of this week and into next week. I don't really know the problem here, so I cannot promise a resolution date, but I will try my best to get this resolved, it is making our internal automation fail, so I want to unblock that! https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
chandlerc wrote: > No real updates here, but our internal builder did catch up to this commit > and we are seeing the same (and a lot more) failures when this commit is > merged into our downstream codebase. I was kind of hoping that it would pass > so that it might indicate that the problem might be a configuration issue > with our upstream build bot, but the fact that it fails in a similar manner > likely points to some kind of bug in the version of MSVC that we are using. It really would be helpful to try a different version of MSVC. As I pointed out in the original reply to the build bot failure, this bot seems to be using a version of MSVC that isn't even a listed release. The last patch release of 16.9 is `192829923` according to Wikipedia, but this builder reports `192829924`. To me, this seems to indicate a build that may not have gone through full Q&A and such. Would it help for me to post a PR that adds some validation of the string table? I can easily do that, and maybe that would let you isolate whether this is definitely an MSVC version issue vs. something with the code. > @chandlerc could we consider reverting the change until we can figure out > what the problem is? I will continue to take a look into it, but I don't > really have any ideas what might be the cause at the moment, so have no ETA > on a work-around, and in the meantime, this will continue to cause test > failures for us internally. I'm happy to revert, and I'll start working on that. But I think there needs to be a concrete ETA on being able to make progress. Especially given the fact that no other builders seem to hit this and it may be specific to an MSVC version that even other windows devs don't realistically have access to... Can you give some fairly concrete timeline? 1 week? maybe 2? Otherwise, there is no actual path to resolving this. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
dyung wrote: No real updates here, but our internal builder did catch up to this commit and we are seeing the same (and a lot more) failures when this commit is merged into our downstream codebase. I was kind of hoping that it would pass so that it might indicate that the problem might be a configuration issue with our upstream build bot, but the fact that it fails in a similar manner likely points to some kind of bug in the version of MSVC that we are using. @chandlerc could we consider reverting the change until we can figure out what the problem is? I will continue to take a look into it, but I don't really have any ideas what might be the cause at the moment, so have no ETA on a work-around, and in the meantime, this will continue to cause test failures for us internally. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
zmodem wrote: > I've put it up on Dropbox I looked at Builtins.cpp.obj and compared to the one in my build dir. They're very similar, in particular the string table is there in both files, and looks identical. I don't really have any more ideas for debugging this. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
dyung wrote: > > I've pulled out the entire tools\clang\lib\Basic directory from a failing > > build, what would you like me to do with it? > > Can you put it in a zip somewhere? It would be interesting to check if > there's some major difference compared to using a different MSVC version. I've put it up on Dropbox, let me know if you cannot access the zipfile: https://www.dropbox.com/scl/fi/pbc02zyba2wtcwhun27fz/Basic.zip?rlkey=v4vqd6bqoewubmkzca8508wxo&dl=0 https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
zmodem wrote: > I've pulled out the entire tools\clang\lib\Basic directory from a failing > build, what would you like me to do with it? Can you put it in a zip somewhere? It would be interesting to check if there's some major difference compared to using a different MSVC version. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
dyung wrote: > > > Not sure what to do debug this... @zmodem maybe has some idea? > > > > > > Sorry, I don't have a lot to add. Things look good on my end so far (local > > builds and https://lab.llvm.org/buildbot/#/builders/63 + > > https://lab.llvm.org/buildbot/#/builders/107). > > Besides the particular MSVC version on the failing bot, it's also using > > ccache. Perhaps that could be related somehow? > > @dyung would it be possible to extract > > `tools\clang\lib\Basic\CMakeFiles\obj.clangBasic.dir\Builtins.cpp.obj` from > > one of the failing builds? > > Hmm... maybe it is ccache. Let me try doing a build with an empty cache and > see if it still fails, otherwise, I'll extract that file. A clean build with an empty ccache does not seem to have made a difference. :( I've pulled out the entire `tools\clang\lib\Basic` directory from a failing build, what would you like me to do with it? https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
dyung wrote: > > Not sure what to do debug this... @zmodem maybe has some idea? > > Sorry, I don't have a lot to add. Things look good on my end so far (local > builds and https://lab.llvm.org/buildbot/#/builders/63 + > https://lab.llvm.org/buildbot/#/builders/107). > > Besides the particular MSVC version on the failing bot, it's also using > ccache. Perhaps that could be related somehow? > > @dyung would it be possible to extract > `tools\clang\lib\Basic\CMakeFiles\obj.clangBasic.dir\Builtins.cpp.obj` from > one of the failing builds? Hmm... maybe it is ccache. Let me try doing a build with an empty cache and see if it still fails, otherwise, I'll extract that file. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
chandlerc wrote: The arm64-windows failures are from `DirectoryWatcherTest` that seems exceedingly unlikely to be related. I think the only real failures here are the originally discussed ones on the one windows build bot. All the other windows bots seems to be OK here. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
uweigand wrote: The s390x failure is just an unstable test that occasionally fails - that woudn't be a reason to revert. Cannot say about the arm64-windows failure. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `clang-arm64-windows-msvc` running on `linaro-armv8-windows-msvc-04` while building `clang` at step 5 "ninja check 1". Full details are available at: https://lab.llvm.org/buildbot/#/builders/161/builds/3580 Here is the relevant piece of the build log for the reference ``` Step 5 (ninja check 1) failure: stage 1 checked (failure) TEST 'Clang-Unit :: DirectoryWatcher/./DirectoryWatcherTests.exe/5/8' FAILED Script(shard): -- GTEST_OUTPUT=json:C:\Users\tcwg\llvm-worker\clang-arm64-windows-msvc\stage1\tools\clang\unittests\DirectoryWatcher\.\DirectoryWatcherTests.exe-Clang-Unit-4944-5-8.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=8 GTEST_SHARD_INDEX=5 C:\Users\tcwg\llvm-worker\clang-arm64-windows-msvc\stage1\tools\clang\unittests\DirectoryWatcher\.\DirectoryWatcherTests.exe -- Script: -- C:\Users\tcwg\llvm-worker\clang-arm64-windows-msvc\stage1\tools\clang\unittests\DirectoryWatcher\.\DirectoryWatcherTests.exe --gtest_filter=DirectoryWatcherTest.DeleteWatchedDir -- C:/Users/tcwg/llvm-worker/clang-arm64-windows-msvc/llvm/clang/unittests/DirectoryWatcher/DirectoryWatcherTest.cpp(259): error: Value of: WaitForExpectedStateResult.wait_for(EventualResultTimeout) == std::future_status::ready Actual: false Expected: true The expected result state wasn't reached before the time-out. C:/Users/tcwg/llvm-worker/clang-arm64-windows-msvc/llvm/clang/unittests/DirectoryWatcher/DirectoryWatcherTest.cpp(262): error: Value of: TestConsumer.result().has_value() Actual: false Expected: true C:/Users/tcwg/llvm-worker/clang-arm64-windows-msvc/llvm/clang/unittests/DirectoryWatcher/DirectoryWatcherTest.cpp:259 Value of: WaitForExpectedStateResult.wait_for(EventualResultTimeout) == std::future_status::ready Actual: false Expected: true The expected result state wasn't reached before the time-out. C:/Users/tcwg/llvm-worker/clang-arm64-windows-msvc/llvm/clang/unittests/DirectoryWatcher/DirectoryWatcherTest.cpp:262 Value of: TestConsumer.result().has_value() Actual: false Expected: true ``` https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
chandlerc wrote: (FYI, I'm about to drop offline -- Hans, if you think it makes sense to revert temporarily, please feel free to do so. I don't have a good sense of whether the fallout here is enough to warrant that.) https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `clang-s390x-linux` running on `systemz-1` while building `clang` at step 5 "ninja check 1". Full details are available at: https://lab.llvm.org/buildbot/#/builders/42/builds/2256 Here is the relevant piece of the build log for the reference ``` Step 5 (ninja check 1) failure: stage 1 checked (failure) TEST 'libFuzzer-s390x-default-Linux :: fuzzer-timeout.test' FAILED Exit Code: 1 Command Output (stderr): -- RUN: at line 1: /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/lib/fuzzer /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/TimeoutTest.cpp -o /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest + /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/lib/fuzzer /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/TimeoutTest.cpp -o /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest RUN: at line 2: /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/lib/fuzzer /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/TimeoutEmptyTest.cpp -o /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutEmptyTest + /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/lib/fuzzer /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/TimeoutEmptyTest.cpp -o /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutEmptyTest RUN: at line 3: not /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 2>&1 | FileCheck /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test --check-prefix=TimeoutTest + not /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 + FileCheck /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test --check-prefix=TimeoutTest RUN: at line 12: not /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/hi.txt 2>&1 | FileCheck /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test --check-prefix=SingleInputTimeoutTest + not /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/hi.txt + FileCheck /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test --check-prefix=SingleInputTimeoutTest RUN: at line 16: /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 -timeout_exitcode=0 + /home/uweigand/sandbox/buildbot/clang-s390x-linux/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 -timeout_exitcode=0 INFO: Running with entropic power schedule (0xFF, 100). INFO: Seed: 2585351529 INFO: Loaded 1 modules (13 inline 8-bit counters): 13 [0x2aa27850e70, 0x2aa27850e7d), INFO: Loaded 1 PC tables (
[clang] Switch builtin strings to use string tables (PR #118734)
zmodem wrote: > Not sure what to do debug this... @zmodem maybe has some idea? Sorry, I don't have a lot to add. Things look good on my end so far (local builds and https://lab.llvm.org/buildbot/#/builders/63 + https://lab.llvm.org/buildbot/#/builders/107). Besides the particular MSVC version on the failing bot, it's also using ccache. Perhaps that could be related somehow? @dyung would it be possible to extract `tools\clang\lib\Basic\CMakeFiles\obj.clangBasic.dir\Builtins.cpp.obj` from one of the failing builds? https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
dyung wrote: I’m still looking into the failure, but may be a bit slow as the machine isn’t really setup for debugging, and I’m traveling today. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
chandlerc wrote: > Could the version of VC that we are using possibly have the issue with long > strings that you mention? Is there a simple way to check that? It's a compile time error, so no, that'd be really clear cut. The only other thing I've seen is running out of heap, but that seems likely exclusively a compiler explorer thing. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
dyung wrote: > > > > > > LLVM Buildbot has detected a new failure on builder > > > > > > `llvm-clang-x86_64-sie-win` running on `sie-win-worker` while > > > > > > building `clang` at step 7 "test-build-unified-tree-check-all". > > > > > > Full details are available at: > > > > > > [lab.llvm.org/buildbot#/builders/46/builds/9169](https://lab.llvm.org/buildbot/#/builders/46/builds/9169) > > > > > > > > > > > > > > > This is a Windows build bot with a version of MSVC '19.28.29924.0' -- > > > > > that's not even one of the numbers listed here: > > > > > [en.wikipedia.org/wiki/Microsoft_Visual_C%2B%2B#Internal_version_numbering](https://en.wikipedia.org/wiki/Microsoft_Visual_C%2B%2B#Internal_version_numbering) > > > > > (edit because I can't count) > > > > > Looks like it might be a build after 16.9.19 which is listed as > > > > > 19.28.29923. Not sure what to do debug this... @zmodem maybe has some > > > > > idea? > > > > > It doesn't seem to be the issue I know about on older versions of > > > > > MSVC as there are no errors on the long string literal. But somehow > > > > > it seems to be miscompiling the string literals? > > > > > > > > > > > > I manage this build bot, is there anything I can do to help you debug > > > > this problem? We use this build as our "blessed" build internally for > > > > building on Windows. > > > > > > > > > Some indication of what's causing this? Or maybe whether there are any > > > other versions of MSVC available with different results? > > > I don't have a Windows machine to debug on at all, but the premerge > > > windows build was successful, so it seems to work on some builders but > > > not others. So far I don't have any other failures to look at to > > > cross-compare... > > > > > > I'll pull the machine offline and try to take a look. There appear to be > > two types of problems, 1 is an assertion failure in the pch related tests, > > the other use of an "unknown" builtin. > > I'm guessing they're the same issue, just a difference of whether hitting > assert or a diagnostic first. Some strings for builtin names aren't coming > back out of the tables correctly -- we fail to recognize it in one place, and > hit an assert in the other. > > My best guess is something going wrong with the string literals generated by > the macros in this PR, or something going wrong with `sizeof` computing > offsets... I'll do some experiments with compiler explorer, and see if I can > get lucky... > > If we need to revert the PR, totally understand, just a bit odd given only > one compiler seems to be hitting the issue here. Could the version of VC that we are using possibly have the issue with long strings that you mention? Is there a simple way to check that? https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
chandlerc wrote: > > > > > LLVM Buildbot has detected a new failure on builder > > > > > `llvm-clang-x86_64-sie-win` running on `sie-win-worker` while > > > > > building `clang` at step 7 "test-build-unified-tree-check-all". > > > > > Full details are available at: > > > > > [lab.llvm.org/buildbot#/builders/46/builds/9169](https://lab.llvm.org/buildbot/#/builders/46/builds/9169) > > > > > > > > > > > > This is a Windows build bot with a version of MSVC '19.28.29924.0' -- > > > > that's not even one of the numbers listed here: > > > > [en.wikipedia.org/wiki/Microsoft_Visual_C%2B%2B#Internal_version_numbering](https://en.wikipedia.org/wiki/Microsoft_Visual_C%2B%2B#Internal_version_numbering) > > > > (edit because I can't count) > > > > Looks like it might be a build after 16.9.19 which is listed as > > > > 19.28.29923. Not sure what to do debug this... @zmodem maybe has some > > > > idea? > > > > It doesn't seem to be the issue I know about on older versions of MSVC > > > > as there are no errors on the long string literal. But somehow it seems > > > > to be miscompiling the string literals? > > > > > > > > > I manage this build bot, is there anything I can do to help you debug > > > this problem? We use this build as our "blessed" build internally for > > > building on Windows. > > > > > > Some indication of what's causing this? Or maybe whether there are any > > other versions of MSVC available with different results? > > I don't have a Windows machine to debug on at all, but the premerge windows > > build was successful, so it seems to work on some builders but not others. > > So far I don't have any other failures to look at to cross-compare... > > I'll pull the machine offline and try to take a look. There appear to be two > types of problems, 1 is an assertion failure in the pch related tests, the > other use of an "unknown" builtin. I'm guessing they're the same issue, just a difference of whether hitting assert or a diagnostic first. Some strings for builtin names aren't coming back out of the tables correctly -- we fail to recognize it in one place, and hit an assert in the other. My best guess is something going wrong with the string literals generated by the macros in this PR, or something going wrong with `sizeof` computing offsets... I'll do some experiments with compiler explorer, and see if I can get lucky... If we need to revert the PR, totally understand, just a bit odd given only one compiler seems to be hitting the issue here. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
dyung wrote: > > > > LLVM Buildbot has detected a new failure on builder > > > > `llvm-clang-x86_64-sie-win` running on `sie-win-worker` while building > > > > `clang` at step 7 "test-build-unified-tree-check-all". > > > > Full details are available at: > > > > [lab.llvm.org/buildbot#/builders/46/builds/9169](https://lab.llvm.org/buildbot/#/builders/46/builds/9169) > > > > > > > > > This is a Windows build bot with a version of MSVC '19.28.29924.0' -- > > > that's not even one of the numbers listed here: > > > [en.wikipedia.org/wiki/Microsoft_Visual_C%2B%2B#Internal_version_numbering](https://en.wikipedia.org/wiki/Microsoft_Visual_C%2B%2B#Internal_version_numbering) > > > (edit because I can't count) > > > Looks like it might be a build after 16.9.19 which is listed as > > > 19.28.29923. Not sure what to do debug this... @zmodem maybe has some > > > idea? > > > It doesn't seem to be the issue I know about on older versions of MSVC as > > > there are no errors on the long string literal. But somehow it seems to > > > be miscompiling the string literals? > > > > > > I manage this build bot, is there anything I can do to help you debug this > > problem? We use this build as our "blessed" build internally for building > > on Windows. > > Some indication of what's causing this? Or maybe whether there are any other > versions of MSVC available with different results? > > I don't have a Windows machine to debug on at all, but the premerge windows > build was successful, so it seems to work on some builders but not others. So > far I don't have any other failures to look at to cross-compare... I'll pull the machine offline and try to take a look. There appear to be two types of problems, 1 is an assertion failure in the pch related tests, the other use of an "unknown" builtin. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
chandlerc wrote: > > > LLVM Buildbot has detected a new failure on builder > > > `llvm-clang-x86_64-sie-win` running on `sie-win-worker` while building > > > `clang` at step 7 "test-build-unified-tree-check-all". > > > Full details are available at: > > > [lab.llvm.org/buildbot#/builders/46/builds/9169](https://lab.llvm.org/buildbot/#/builders/46/builds/9169) > > > > > > This is a Windows build bot with a version of MSVC '19.28.29924.0' -- > > that's not even one of the numbers listed here: > > [en.wikipedia.org/wiki/Microsoft_Visual_C%2B%2B#Internal_version_numbering](https://en.wikipedia.org/wiki/Microsoft_Visual_C%2B%2B#Internal_version_numbering) > > (edit because I can't count) > > Looks like it might be a build after 16.9.19 which is listed as > > 19.28.29923. Not sure what to do debug this... @zmodem maybe has some idea? > > It doesn't seem to be the issue I know about on older versions of MSVC as > > there are no errors on the long string literal. But somehow it seems to be > > miscompiling the string literals? > > I manage this build bot, is there anything I can do to help you debug this > problem? We use this build as our "blessed" build internally for building on > Windows. Some indication of what's causing this? Or maybe whether there are any other versions of MSVC available with different results? I don't have a Windows machine to debug on at all, but the premerge windows build was successful, so it seems to work on some builders but not others. So far I don't have any other failures to look at to cross-compare... https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
dyung wrote: > > LLVM Buildbot has detected a new failure on builder > > `llvm-clang-x86_64-sie-win` running on `sie-win-worker` while building > > `clang` at step 7 "test-build-unified-tree-check-all". > > Full details are available at: > > [lab.llvm.org/buildbot#/builders/46/builds/9169](https://lab.llvm.org/buildbot/#/builders/46/builds/9169) > > This is a Windows build bot with a version of MSVC '19.28.29924.0' -- that's > not even one of the numbers listed here: > https://en.wikipedia.org/wiki/Microsoft_Visual_C%2B%2B#Internal_version_numbering > > (edit because I can't count) > > Looks like it might be a build after 16.9.19 which is listed as 19.28.29923. > Not sure what to do debug this... @zmodem maybe has some idea? > > It doesn't seem to be the issue I know about on older versions of MSVC as > there are no errors on the long string literal. But somehow it seems to be > miscompiling the string literals? I manage this build bot, is there anything I can do to help you debug this problem? We use this build as our "blessed" build internally for building on Windows. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
chandlerc wrote: > LLVM Buildbot has detected a new failure on builder > `llvm-clang-x86_64-sie-win` running on `sie-win-worker` while building > `clang` at step 7 "test-build-unified-tree-check-all". > > Full details are available at: > [lab.llvm.org/buildbot#/builders/46/builds/9169](https://lab.llvm.org/buildbot/#/builders/46/builds/9169) This is a Windows build bot with a version of MSVC '19.28.29924.0' -- that's not even one of the numbers listed here: https://en.wikipedia.org/wiki/Microsoft_Visual_C%2B%2B#Internal_version_numbering Looks like it might be a pre-release of 16.9.0, but the first listed version of that is '19.28.29910' at least on the Wikipedia page. Not sure what to do debug this... @zmodem maybe has some idea? https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `llvm-clang-x86_64-sie-win` running on `sie-win-worker` while building `clang` at step 7 "test-build-unified-tree-check-all". Full details are available at: https://lab.llvm.org/buildbot/#/builders/46/builds/9169 Here is the relevant piece of the build log for the reference ``` Step 7 (test-build-unified-tree-check-all) failure: test (failure) TEST 'Clang :: AST/builtins-arm-strex-rettype.c' FAILED Exit Code: 1 Command Output (stdout): -- # RUN: at line 1 z:\b\llvm-clang-x86_64-sie-win\build\bin\clang.exe -cc1 -internal-isystem Z:\b\llvm-clang-x86_64-sie-win\build\lib\clang\20\include -nostdsysteminc -triple thumbv7m-apple-darwin-eabi -ast-dump Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\AST\builtins-arm-strex-rettype.c | z:\b\llvm-clang-x86_64-sie-win\build\bin\filecheck.exe Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\AST\builtins-arm-strex-rettype.c # executed command: 'z:\b\llvm-clang-x86_64-sie-win\build\bin\clang.exe' -cc1 -internal-isystem 'Z:\b\llvm-clang-x86_64-sie-win\build\lib\clang\20\include' -nostdsysteminc -triple thumbv7m-apple-darwin-eabi -ast-dump 'Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\AST\builtins-arm-strex-rettype.c' # .---command stderr # | Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\AST\builtins-arm-strex-rettype.c:7:12: error: use of unknown builtin '__builtin_arm_strex' [-Wimplicit-function-declaration] # | 7 | } while (__builtin_arm_strex(a, b)); # | |^ # | 1 error generated. # `- # error: command failed with exit status: 1 # executed command: 'z:\b\llvm-clang-x86_64-sie-win\build\bin\filecheck.exe' 'Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\AST\builtins-arm-strex-rettype.c' -- ``` https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
https://github.com/chandlerc closed https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
androm3da wrote: cc @llvm/pr-subscribers-backend-hexagon and @iajbar https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
https://github.com/chandlerc updated https://github.com/llvm/llvm-project/pull/118734 >From 9553557e87ec5d9ae5ce5636f6227150fcd080bc Mon Sep 17 00:00:00 2001 From: Chandler Carruth Date: Thu, 28 Nov 2024 09:56:40 + Subject: [PATCH 1/6] Switch builtin strings to use string tables MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The Clang binary (and any binary linking Clang as a library), when built using PIE, ends up with a pretty shocking number of dynamic relocations to apply to the executable image: roughly 400k. Each of these takes up binary space in the executable, and perhaps most interestingly takes start-up time to apply the relocations. The largest pattern I identified were the strings used to describe target builtins. The addresses of these string literals were stored into huge arrays, each one requiring a dynamic relocation. The way to avoid this is to design the target builtins to use a single large table of strings and offsets within the table for the individual strings. This switches the builtin management to such a scheme. This saves over 100k dynamic relocations by my measurement, an over 25% reduction. Just looking at byte size improvements, using the `bloaty` tool to compare a newly built `clang` binary to an old one: ``` FILE SIZEVM SIZE -- -- +1.4% +653Ki +1.4% +653Ki.rodata +0.0%+960 +0.0%+960.text +0.0%+197 +0.0%+197.dynstr +0.0%+184 +0.0%+184.eh_frame +0.0% +96 +0.0% +96.dynsym +0.0% +40 +0.0% +40.eh_frame_hdr +114% +32 [ = ] 0[Unmapped] +0.0% +20 +0.0% +20.gnu.hash +0.0% +8 +0.0% +8.gnu.version +0.9% +7 +0.9% +7[LOAD #2 [R]] [ = ] 0 -75.4% -3.00Ki.relro_padding -16.1% -802Ki -16.1% -802Ki.data.rel.ro -27.3% -2.52Mi -27.3% -2.52Mi.rela.dyn -1.6% -2.66Mi -1.6% -2.66MiTOTAL ``` We get a 16% reduction in the `.data.rel.ro` section, and nearly 30% reduction in `.rela.dyn` where those reloctaions are stored. This is also visible in my benchmarking of binary start-up overhead at least: ``` Benchmark 1: ./old_clang --version Time (mean ± σ): 17.6 ms ± 1.5 ms[User: 4.1 ms, System: 13.3 ms] Range (min … max):14.2 ms … 22.8 ms162 runs Benchmark 2: ./new_clang --version Time (mean ± σ): 15.5 ms ± 1.4 ms[User: 3.6 ms, System: 11.8 ms] Range (min … max):12.4 ms … 20.3 ms216 runs Summary './new_clang --version' ran 1.13 ± 0.14 times faster than './old_clang --version' ``` We get about 2ms faster `--version` runs. While there is a lot of noise in binary execution time, this delta is pretty consistent, and represents over 10% improvement. This is particularly interesting to me because for very short source files, repeatedly starting the `clang` binary is actually the dominant cost. For example, `configure` scripts running against the `clang` compiler are slow in large part because of binary start up time, not the time to process the actual inputs to the compiler. This PR implements the string tables using `constexpr` code and the existing macro system. I understand that the builtins are moving towards a TableGen model, and if complete that would provide more options for modeling this. Unfortunately, that migration isn't complete, and even the parts that are migrated still rely on the ability to break out of the TableGen model and directly expand an X-macro style `BUILTIN(...)` textually. I looked at trying to complete the move to TableGen, but it would both require the difficult migration of the remaining targets, and solving some tricky problems with how to move away from any macro-based expansion. I was also able to find a reasonably clean and effective way of doing this with the existing macros and some `constexpr` code that I think is clean enough to be a pretty good intermediate state, and maybe give a good target for the eventual TableGen solution. I was also able to factor the macros into set of consistent patterns that avoids a significant regression in overall boilerplate. There is one challenge with this approach: it requires the host compiler to support (very) long string literals, a bit over half a meg. =/ The current minimum MSVC version rejects these, but the very next patch release (16.8) removed that restriction. I'm going to send out a separate PR / RFC to raise the minimum version by one patch release, which I hope is acceptable as the current version was set years ago. FWIW, there are a few more low-hanging fruit sources of excessive dynamic relocations, maybe as many as 50k to 100k more that I'll take a look at to see if I can identify easy fixes. Beyond that, it seems to get quite difficult. It might be worth adding some guidance to developer documentation to try to avoid creating global data structures that _repeatedly_ store pointers to oth
[clang] Switch builtin strings to use string tables (PR #118734)
chandlerc wrote: > Confirmed that this works on GCC now. I'd suggest to replace the use of > StringLiteral::size with plain sizeof(). The build time overhead of going > through StringLiteral here is substantial. Sure. I was initially worried about the subtlety of this use of `sizeof`, but that was before everything got nicely factored into macros. And yeah, the compile time cost is unfortunate. Will merge once updated (including this fix) and builds passing. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
https://github.com/nikic edited https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
@@ -68,23 +69,156 @@ enum ID { FirstTSBuiltin }; +// The info used to represent each builtin. struct Info { - llvm::StringLiteral Name; - const char *Type, *Attributes; - const char *Features; + // Rather than store pointers to the string literals describing these four + // aspects of builtins, we store offsets into a common string table. + struct StrOffsets { +int Name; +int Type; +int Attributes; +int Features; + } Offsets; + HeaderDesc Header; LanguageID Langs; }; +// The storage for `N` builtins. This contains a single pointer to the string +// table used for these builtins and an array of metadata for each builtin. +template struct Storage { + const char *StringTable; + + std::array Infos; + + // A constexpr function to construct the storage for a a given string table in + // the first argument and an array in the second argument. This is *only* + // expected to be used at compile time, we should mark it `consteval` when + // available. + // + // The `Infos` array is particularly special. This function expects an array + // of `Info` structs, where the string offsets of each entry refer to the + // *sizes* of those strings rather than their offsets, and for the target + // string to be in the provided string table at an offset the sum of all + // previous string sizes. This function walks the `Infos` array computing the + // running sum and replacing the sizes with the actual offsets in the string + // table that should be used. This arrangement is designed to make it easy to + // expand `.def` and `.inc` files with X-macros to construct both the string + // table and the `Info` structs in the arguments to this function. + static constexpr Storage Make(const char *Strings, + std::array Infos) { +// Translate lengths to offsets. +int Offset = 0; +for (auto &I : Infos) { + Info::StrOffsets NewOffsets = {}; + NewOffsets.Name = Offset; + Offset += I.Offsets.Name; + NewOffsets.Type = Offset; + Offset += I.Offsets.Type; + NewOffsets.Attributes = Offset; + Offset += I.Offsets.Attributes; + NewOffsets.Features = Offset; + Offset += I.Offsets.Features; + I.Offsets = NewOffsets; +} +return {Strings, Infos}; + } +}; + +// A detail macro used below to emit a string literal that, after string literal +// concatenation, ends up triggering the `-Woverlength-strings` warning. While +// the warning is useful in general to catch accidentally excessive strings, +// here we are creating them intentionally. +// +// This relies on a subtle aspect of `_Pragma`: that the *diagnostic* ones don't +// turn into actual tokens that would disrupt string literal concatenation. +#ifdef __clang__ +#define CLANG_BUILTIN_DETAIL_STR_TABLE(S) \ + _Pragma("clang diagnostic push") \ + _Pragma("clang diagnostic ignored \"-Woverlength-strings\"") \ + S _Pragma("clang diagnostic pop") +#else +#define CLANG_BUILTIN_DETAIL_STR_TABLE(S) S +#endif + +// A macro that can be used with `Builtins.def` and similar files as an X-macro +// to add the string arguments to a builtin string table. This is typically the +// target for the `BUILTIN`, `LANGBUILTIN`, or `LIBBUILTIN` macros in those +// files. +#define CLANG_BUILTIN_STR_TABLE(ID, TYPE, ATTRS) \ + CLANG_BUILTIN_DETAIL_STR_TABLE(#ID "\0" TYPE "\0" ATTRS "\0" /*FEATURE*/ "\0") + +// A macro that can be used with target builtin `.def` and `.inc` files as an +// X-macro to add the string arguments to a builtin string table. this is +// typically the target for the `TARGET_BUILTIN` macro. +#define CLANG_TARGET_BUILTIN_STR_TABLE(ID, TYPE, ATTRS, FEATURE) \ + CLANG_BUILTIN_DETAIL_STR_TABLE(#ID "\0" TYPE "\0" ATTRS "\0" FEATURE "\0") + +// A macro that can be used with target builtin `.def` and `.inc` files as an +// X-macro to add the string arguments to a builtin string table. this is +// typically the target for the `TARGET_HEADER_BUILTIN` macro. We can't delegate +// to `TARGET_BUILTIN` because the `FEATURE` string changes position. +#define CLANG_TARGET_HEADER_BUILTIN_STR_TABLE(ID, TYPE, ATTRS, HEADER, LANGS, \ + FEATURE) \ + CLANG_BUILTIN_DETAIL_STR_TABLE(#ID "\0" TYPE "\0" ATTRS "\0" FEATURE "\0") + +// A detail macro used internally to compute the desired string table +// `StrOffsets` struct for arguments to `Storage::Make`. +#define CLANG_BUILTIN_DETAIL_STR_OFFSETS(ID, TYPE, ATTRS) \ + Builtin::Info::StrOffsets { \ +llvm::StringLiteral(#ID).size() + 1, llvm::StringLiteral(TYPE).size() + 1, \ +llvm::StringLiteral(ATTRS).size() + 1, \ +llvm::StringLiteral("").size
[clang] Switch builtin strings to use string tables (PR #118734)
https://github.com/nikic approved this pull request. Confirmed that this works on GCC now. I'd suggest to replace the use of StringLiteral::size with plain sizeof(). The build time overhead of going through StringLiteral here is substantial. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
@@ -68,23 +69,156 @@ enum ID { FirstTSBuiltin }; +// The info used to represent each builtin. struct Info { - llvm::StringLiteral Name; - const char *Type, *Attributes; - const char *Features; + // Rather than store pointers to the string literals describing these four + // aspects of builtins, we store offsets into a common string table. + struct StrOffsets { +int Name; +int Type; +int Attributes; +int Features; + } Offsets; + HeaderDesc Header; LanguageID Langs; }; +// The storage for `N` builtins. This contains a single pointer to the string +// table used for these builtins and an array of metadata for each builtin. +template struct Storage { + const char *StringTable; + + std::array Infos; + + // A constexpr function to construct the storage for a a given string table in + // the first argument and an array in the second argument. This is *only* + // expected to be used at compile time, we should mark it `consteval` when + // available. + // + // The `Infos` array is particularly special. This function expects an array + // of `Info` structs, where the string offsets of each entry refer to the + // *sizes* of those strings rather than their offsets, and for the target + // string to be in the provided string table at an offset the sum of all + // previous string sizes. This function walks the `Infos` array computing the + // running sum and replacing the sizes with the actual offsets in the string + // table that should be used. This arrangement is designed to make it easy to + // expand `.def` and `.inc` files with X-macros to construct both the string + // table and the `Info` structs in the arguments to this function. + static constexpr Storage Make(const char *Strings, + std::array Infos) { +// Translate lengths to offsets. +int Offset = 0; +for (auto &I : Infos) { + Info::StrOffsets NewOffsets = {}; + NewOffsets.Name = Offset; + Offset += I.Offsets.Name; + NewOffsets.Type = Offset; + Offset += I.Offsets.Type; + NewOffsets.Attributes = Offset; + Offset += I.Offsets.Attributes; + NewOffsets.Features = Offset; + Offset += I.Offsets.Features; + I.Offsets = NewOffsets; +} +return {Strings, Infos}; + } +}; + +// A detail macro used below to emit a string literal that, after string literal +// concatenation, ends up triggering the `-Woverlength-strings` warning. While +// the warning is useful in general to catch accidentally excessive strings, +// here we are creating them intentionally. +// +// This relies on a subtle aspect of `_Pragma`: that the *diagnostic* ones don't +// turn into actual tokens that would disrupt string literal concatenation. +#ifdef __clang__ +#define CLANG_BUILTIN_DETAIL_STR_TABLE(S) \ + _Pragma("clang diagnostic push") \ + _Pragma("clang diagnostic ignored \"-Woverlength-strings\"") \ + S _Pragma("clang diagnostic pop") +#else +#define CLANG_BUILTIN_DETAIL_STR_TABLE(S) S +#endif + +// A macro that can be used with `Builtins.def` and similar files as an X-macro +// to add the string arguments to a builtin string table. This is typically the +// target for the `BUILTIN`, `LANGBUILTIN`, or `LIBBUILTIN` macros in those +// files. +#define CLANG_BUILTIN_STR_TABLE(ID, TYPE, ATTRS) \ + CLANG_BUILTIN_DETAIL_STR_TABLE(#ID "\0" TYPE "\0" ATTRS "\0" /*FEATURE*/ "\0") + +// A macro that can be used with target builtin `.def` and `.inc` files as an +// X-macro to add the string arguments to a builtin string table. this is +// typically the target for the `TARGET_BUILTIN` macro. +#define CLANG_TARGET_BUILTIN_STR_TABLE(ID, TYPE, ATTRS, FEATURE) \ + CLANG_BUILTIN_DETAIL_STR_TABLE(#ID "\0" TYPE "\0" ATTRS "\0" FEATURE "\0") + +// A macro that can be used with target builtin `.def` and `.inc` files as an +// X-macro to add the string arguments to a builtin string table. this is +// typically the target for the `TARGET_HEADER_BUILTIN` macro. We can't delegate +// to `TARGET_BUILTIN` because the `FEATURE` string changes position. +#define CLANG_TARGET_HEADER_BUILTIN_STR_TABLE(ID, TYPE, ATTRS, HEADER, LANGS, \ + FEATURE) \ + CLANG_BUILTIN_DETAIL_STR_TABLE(#ID "\0" TYPE "\0" ATTRS "\0" FEATURE "\0") + +// A detail macro used internally to compute the desired string table +// `StrOffsets` struct for arguments to `Storage::Make`. +#define CLANG_BUILTIN_DETAIL_STR_OFFSETS(ID, TYPE, ATTRS) \ + Builtin::Info::StrOffsets { \ +llvm::StringLiteral(#ID).size() + 1, llvm::StringLiteral(TYPE).size() + 1, \ +llvm::StringLiteral(ATTRS).size() + 1, \ +llvm::StringLiteral("").size
[clang] Switch builtin strings to use string tables (PR #118734)
@@ -68,23 +69,156 @@ enum ID { FirstTSBuiltin }; +// The info used to represent each builtin. struct Info { - llvm::StringLiteral Name; - const char *Type, *Attributes; - const char *Features; + // Rather than store pointers to the string literals describing these four + // aspects of builtins, we store offsets into a common string table. + struct StrOffsets { +int Name; +int Type; +int Attributes; +int Features; + } Offsets; + HeaderDesc Header; LanguageID Langs; }; +// The storage for `N` builtins. This contains a single pointer to the string +// table used for these builtins and an array of metadata for each builtin. +template struct Storage { + const char *StringTable; + + std::array Infos; + + // A constexpr function to construct the storage for a a given string table in + // the first argument and an array in the second argument. This is *only* + // expected to be used at compile time, we should mark it `consteval` when + // available. + // + // The `Infos` array is particularly special. This function expects an array + // of `Info` structs, where the string offsets of each entry refer to the + // *sizes* of those strings rather than their offsets, and for the target + // string to be in the provided string table at an offset the sum of all + // previous string sizes. This function walks the `Infos` array computing the + // running sum and replacing the sizes with the actual offsets in the string + // table that should be used. This arrangement is designed to make it easy to + // expand `.def` and `.inc` files with X-macros to construct both the string + // table and the `Info` structs in the arguments to this function. + static constexpr auto Make(const char *Strings, + std::array Infos) -> Storage { chandlerc wrote: Sorry, a new habit of mine and I hadn't gone through thoroughly enough to fix everywhere. Should be fixed now throughout, I checked all the `auto` keywords in the diff. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
https://github.com/chandlerc commented: > Fails to build with GCC: Doh, I had hoped GCC would support this. Anyways, fixed back to just suppress the diagnostic with Clang. Did some builds with GCC and everything seems fine, and it doesn't seem like we're GCC warning clean these days anyways. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
https://github.com/chandlerc edited https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
https://github.com/chandlerc updated https://github.com/llvm/llvm-project/pull/118734 >From 73a0b5c796881d1e52f8336eb69f678fd4c9f4c4 Mon Sep 17 00:00:00 2001 From: Chandler Carruth Date: Thu, 28 Nov 2024 09:56:40 + Subject: [PATCH 1/5] Switch builtin strings to use string tables MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The Clang binary (and any binary linking Clang as a library), when built using PIE, ends up with a pretty shocking number of dynamic relocations to apply to the executable image: roughly 400k. Each of these takes up binary space in the executable, and perhaps most interestingly takes start-up time to apply the relocations. The largest pattern I identified were the strings used to describe target builtins. The addresses of these string literals were stored into huge arrays, each one requiring a dynamic relocation. The way to avoid this is to design the target builtins to use a single large table of strings and offsets within the table for the individual strings. This switches the builtin management to such a scheme. This saves over 100k dynamic relocations by my measurement, an over 25% reduction. Just looking at byte size improvements, using the `bloaty` tool to compare a newly built `clang` binary to an old one: ``` FILE SIZEVM SIZE -- -- +1.4% +653Ki +1.4% +653Ki.rodata +0.0%+960 +0.0%+960.text +0.0%+197 +0.0%+197.dynstr +0.0%+184 +0.0%+184.eh_frame +0.0% +96 +0.0% +96.dynsym +0.0% +40 +0.0% +40.eh_frame_hdr +114% +32 [ = ] 0[Unmapped] +0.0% +20 +0.0% +20.gnu.hash +0.0% +8 +0.0% +8.gnu.version +0.9% +7 +0.9% +7[LOAD #2 [R]] [ = ] 0 -75.4% -3.00Ki.relro_padding -16.1% -802Ki -16.1% -802Ki.data.rel.ro -27.3% -2.52Mi -27.3% -2.52Mi.rela.dyn -1.6% -2.66Mi -1.6% -2.66MiTOTAL ``` We get a 16% reduction in the `.data.rel.ro` section, and nearly 30% reduction in `.rela.dyn` where those reloctaions are stored. This is also visible in my benchmarking of binary start-up overhead at least: ``` Benchmark 1: ./old_clang --version Time (mean ± σ): 17.6 ms ± 1.5 ms[User: 4.1 ms, System: 13.3 ms] Range (min … max):14.2 ms … 22.8 ms162 runs Benchmark 2: ./new_clang --version Time (mean ± σ): 15.5 ms ± 1.4 ms[User: 3.6 ms, System: 11.8 ms] Range (min … max):12.4 ms … 20.3 ms216 runs Summary './new_clang --version' ran 1.13 ± 0.14 times faster than './old_clang --version' ``` We get about 2ms faster `--version` runs. While there is a lot of noise in binary execution time, this delta is pretty consistent, and represents over 10% improvement. This is particularly interesting to me because for very short source files, repeatedly starting the `clang` binary is actually the dominant cost. For example, `configure` scripts running against the `clang` compiler are slow in large part because of binary start up time, not the time to process the actual inputs to the compiler. This PR implements the string tables using `constexpr` code and the existing macro system. I understand that the builtins are moving towards a TableGen model, and if complete that would provide more options for modeling this. Unfortunately, that migration isn't complete, and even the parts that are migrated still rely on the ability to break out of the TableGen model and directly expand an X-macro style `BUILTIN(...)` textually. I looked at trying to complete the move to TableGen, but it would both require the difficult migration of the remaining targets, and solving some tricky problems with how to move away from any macro-based expansion. I was also able to find a reasonably clean and effective way of doing this with the existing macros and some `constexpr` code that I think is clean enough to be a pretty good intermediate state, and maybe give a good target for the eventual TableGen solution. I was also able to factor the macros into set of consistent patterns that avoids a significant regression in overall boilerplate. There is one challenge with this approach: it requires the host compiler to support (very) long string literals, a bit over half a meg. =/ The current minimum MSVC version rejects these, but the very next patch release (16.8) removed that restriction. I'm going to send out a separate PR / RFC to raise the minimum version by one patch release, which I hope is acceptable as the current version was set years ago. FWIW, there are a few more low-hanging fruit sources of excessive dynamic relocations, maybe as many as 50k to 100k more that I'll take a look at to see if I can identify easy fixes. Beyond that, it seems to get quite difficult. It might be worth adding some guidance to developer documentation to try to avoid creating global data structures that _repeatedly_ store pointers to oth
[clang] Switch builtin strings to use string tables (PR #118734)
https://github.com/nikic requested changes to this pull request. Fails to build with GCC: ``` FAILED: tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Builtins.cpp.o CCACHE_CPP2=yes CCACHE_HASHDIR=yes /usr/bin/ccache /usr/bin/c++ -DCLANG_EXPORTS -DGTEST_HAS_RTTI=0 -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/tmp/llvm-project-build-stage1/tools/clang/lib/Basic -I/var/llvm-compile-time-tracker/llvm-project/clang/lib/Basic -I/var/llvm-compile-time-tracker/llvm-project/clang/include -I/tmp/llvm-project-build-stage1/tools/clang/include -I/tmp/llvm-project-build-stage1/include -I/var/llvm-compile-time-tracker/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -fno-lifetime-dse -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -fno-strict-aliasing -O3 -DNDEBUG -fno-exceptions -funwind-tables -fno-rtti -std=c++17 -MD -MT tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Builtins.cpp.o -MF tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Builtins.cpp.o.d -o tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Builtins.cpp.o -c /var/llvm-compile-time-tracker/llvm-project/clang/lib/Basic/Builtins.cpp In file included from /var/llvm-compile-time-tracker/llvm-project/clang/lib/Basic/Builtins.cpp:13: /var/llvm-compile-time-tracker/llvm-project/clang/include/clang/Basic/Builtins.h:137:3: error: ‘#pragma’ is not allowed here 137 | _Pragma("GCC diagnostic push") \ | ^~~ /var/llvm-compile-time-tracker/llvm-project/clang/include/clang/Basic/Builtins.h:149:3: note: in expansion of macro ‘CLANG_BUILTIN_DETAIL_STR_TABLE’ 149 | CLANG_BUILTIN_DETAIL_STR_TABLE(#ID "\0" TYPE "\0" ATTRS "\0" /*FEATURE*/ "\0") | ^~ /var/llvm-compile-time-tracker/llvm-project/clang/lib/Basic/Builtins.cpp:34:9: note: in expansion of macro ‘CLANG_BUILTIN_STR_TABLE’ 34 | CLANG_BUILTIN_STR_TABLE("not a builtin function", "", "") | ^~~ /var/llvm-compile-time-tracker/llvm-project/clang/lib/Basic/Builtins.cpp: In member function ‘std::pair clang::Builtin::Context::getStrTableAndInfo(unsigned int) const’: /var/llvm-compile-time-tracker/llvm-project/clang/lib/Basic/Builtins.cpp:48:65: error: could not convert ‘{, }’ from ‘’ to ‘std::pair’ 48 | return {BuiltinStorage.StringTable, BuiltinStorage.Infos[ID]}; | ^ | | | ``` https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
@@ -68,23 +69,156 @@ enum ID { FirstTSBuiltin }; +// The info used to represent each builtin. struct Info { - llvm::StringLiteral Name; - const char *Type, *Attributes; - const char *Features; + // Rather than store pointers to the string literals describing these four + // aspects of builtins, we store offsets into a common string table. + struct StrOffsets { +int Name; +int Type; +int Attributes; +int Features; + } Offsets; + HeaderDesc Header; LanguageID Langs; }; +// The storage for `N` builtins. This contains a single pointer to the string +// table used for these builtins and an array of metadata for each builtin. +template struct Storage { + const char *StringTable; + + std::array Infos; + + // A constexpr function to construct the storage for a a given string table in + // the first argument and an array in the second argument. This is *only* + // expected to be used at compile time, we should mark it `consteval` when + // available. + // + // The `Infos` array is particularly special. This function expects an array + // of `Info` structs, where the string offsets of each entry refer to the + // *sizes* of those strings rather than their offsets, and for the target + // string to be in the provided string table at an offset the sum of all + // previous string sizes. This function walks the `Infos` array computing the + // running sum and replacing the sizes with the actual offsets in the string + // table that should be used. This arrangement is designed to make it easy to + // expand `.def` and `.inc` files with X-macros to construct both the string + // table and the `Info` structs in the arguments to this function. + static constexpr auto Make(const char *Strings, + std::array Infos) -> Storage { tbaederr wrote: This applies to a few functions added in this PR, but we don't really use trailing return types unless it's necessary https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
@@ -41,6 +41,10 @@ class LLVM_LIBRARY_VISIBILITY ARCTargetInfo : public TargetInfo { MacroBuilder &Builder) const override; ArrayRef getTargetBuiltins() const override { return {}; } chandlerc wrote: Doh, yes. I meant to remove the base class one once they were all gone to make sure, and just forgot. Thanks for catching. Fixed. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
https://github.com/chandlerc updated https://github.com/llvm/llvm-project/pull/118734 >From 73a0b5c796881d1e52f8336eb69f678fd4c9f4c4 Mon Sep 17 00:00:00 2001 From: Chandler Carruth Date: Thu, 28 Nov 2024 09:56:40 + Subject: [PATCH 1/2] Switch builtin strings to use string tables MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The Clang binary (and any binary linking Clang as a library), when built using PIE, ends up with a pretty shocking number of dynamic relocations to apply to the executable image: roughly 400k. Each of these takes up binary space in the executable, and perhaps most interestingly takes start-up time to apply the relocations. The largest pattern I identified were the strings used to describe target builtins. The addresses of these string literals were stored into huge arrays, each one requiring a dynamic relocation. The way to avoid this is to design the target builtins to use a single large table of strings and offsets within the table for the individual strings. This switches the builtin management to such a scheme. This saves over 100k dynamic relocations by my measurement, an over 25% reduction. Just looking at byte size improvements, using the `bloaty` tool to compare a newly built `clang` binary to an old one: ``` FILE SIZEVM SIZE -- -- +1.4% +653Ki +1.4% +653Ki.rodata +0.0%+960 +0.0%+960.text +0.0%+197 +0.0%+197.dynstr +0.0%+184 +0.0%+184.eh_frame +0.0% +96 +0.0% +96.dynsym +0.0% +40 +0.0% +40.eh_frame_hdr +114% +32 [ = ] 0[Unmapped] +0.0% +20 +0.0% +20.gnu.hash +0.0% +8 +0.0% +8.gnu.version +0.9% +7 +0.9% +7[LOAD #2 [R]] [ = ] 0 -75.4% -3.00Ki.relro_padding -16.1% -802Ki -16.1% -802Ki.data.rel.ro -27.3% -2.52Mi -27.3% -2.52Mi.rela.dyn -1.6% -2.66Mi -1.6% -2.66MiTOTAL ``` We get a 16% reduction in the `.data.rel.ro` section, and nearly 30% reduction in `.rela.dyn` where those reloctaions are stored. This is also visible in my benchmarking of binary start-up overhead at least: ``` Benchmark 1: ./old_clang --version Time (mean ± σ): 17.6 ms ± 1.5 ms[User: 4.1 ms, System: 13.3 ms] Range (min … max):14.2 ms … 22.8 ms162 runs Benchmark 2: ./new_clang --version Time (mean ± σ): 15.5 ms ± 1.4 ms[User: 3.6 ms, System: 11.8 ms] Range (min … max):12.4 ms … 20.3 ms216 runs Summary './new_clang --version' ran 1.13 ± 0.14 times faster than './old_clang --version' ``` We get about 2ms faster `--version` runs. While there is a lot of noise in binary execution time, this delta is pretty consistent, and represents over 10% improvement. This is particularly interesting to me because for very short source files, repeatedly starting the `clang` binary is actually the dominant cost. For example, `configure` scripts running against the `clang` compiler are slow in large part because of binary start up time, not the time to process the actual inputs to the compiler. This PR implements the string tables using `constexpr` code and the existing macro system. I understand that the builtins are moving towards a TableGen model, and if complete that would provide more options for modeling this. Unfortunately, that migration isn't complete, and even the parts that are migrated still rely on the ability to break out of the TableGen model and directly expand an X-macro style `BUILTIN(...)` textually. I looked at trying to complete the move to TableGen, but it would both require the difficult migration of the remaining targets, and solving some tricky problems with how to move away from any macro-based expansion. I was also able to find a reasonably clean and effective way of doing this with the existing macros and some `constexpr` code that I think is clean enough to be a pretty good intermediate state, and maybe give a good target for the eventual TableGen solution. I was also able to factor the macros into set of consistent patterns that avoids a significant regression in overall boilerplate. There is one challenge with this approach: it requires the host compiler to support (very) long string literals, a bit over half a meg. =/ The current minimum MSVC version rejects these, but the very next patch release (16.8) removed that restriction. I'm going to send out a separate PR / RFC to raise the minimum version by one patch release, which I hope is acceptable as the current version was set years ago. FWIW, there are a few more low-hanging fruit sources of excessive dynamic relocations, maybe as many as 50k to 100k more that I'll take a look at to see if I can identify easy fixes. Beyond that, it seems to get quite difficult. It might be worth adding some guidance to developer documentation to try to avoid creating global data structures that _repeatedly_ store pointers to oth
[clang] Switch builtin strings to use string tables (PR #118734)
@@ -41,6 +41,10 @@ class LLVM_LIBRARY_VISIBILITY ARCTargetInfo : public TargetInfo { MacroBuilder &Builder) const override; ArrayRef getTargetBuiltins() const override { return {}; } topperc wrote: Should this override of getTargetBuiltins be removed? It looks like it matches the default. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
https://github.com/topperc edited https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
https://github.com/chandlerc updated https://github.com/llvm/llvm-project/pull/118734 >From 73a0b5c796881d1e52f8336eb69f678fd4c9f4c4 Mon Sep 17 00:00:00 2001 From: Chandler Carruth Date: Thu, 28 Nov 2024 09:56:40 + Subject: [PATCH] Switch builtin strings to use string tables MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The Clang binary (and any binary linking Clang as a library), when built using PIE, ends up with a pretty shocking number of dynamic relocations to apply to the executable image: roughly 400k. Each of these takes up binary space in the executable, and perhaps most interestingly takes start-up time to apply the relocations. The largest pattern I identified were the strings used to describe target builtins. The addresses of these string literals were stored into huge arrays, each one requiring a dynamic relocation. The way to avoid this is to design the target builtins to use a single large table of strings and offsets within the table for the individual strings. This switches the builtin management to such a scheme. This saves over 100k dynamic relocations by my measurement, an over 25% reduction. Just looking at byte size improvements, using the `bloaty` tool to compare a newly built `clang` binary to an old one: ``` FILE SIZEVM SIZE -- -- +1.4% +653Ki +1.4% +653Ki.rodata +0.0%+960 +0.0%+960.text +0.0%+197 +0.0%+197.dynstr +0.0%+184 +0.0%+184.eh_frame +0.0% +96 +0.0% +96.dynsym +0.0% +40 +0.0% +40.eh_frame_hdr +114% +32 [ = ] 0[Unmapped] +0.0% +20 +0.0% +20.gnu.hash +0.0% +8 +0.0% +8.gnu.version +0.9% +7 +0.9% +7[LOAD #2 [R]] [ = ] 0 -75.4% -3.00Ki.relro_padding -16.1% -802Ki -16.1% -802Ki.data.rel.ro -27.3% -2.52Mi -27.3% -2.52Mi.rela.dyn -1.6% -2.66Mi -1.6% -2.66MiTOTAL ``` We get a 16% reduction in the `.data.rel.ro` section, and nearly 30% reduction in `.rela.dyn` where those reloctaions are stored. This is also visible in my benchmarking of binary start-up overhead at least: ``` Benchmark 1: ./old_clang --version Time (mean ± σ): 17.6 ms ± 1.5 ms[User: 4.1 ms, System: 13.3 ms] Range (min … max):14.2 ms … 22.8 ms162 runs Benchmark 2: ./new_clang --version Time (mean ± σ): 15.5 ms ± 1.4 ms[User: 3.6 ms, System: 11.8 ms] Range (min … max):12.4 ms … 20.3 ms216 runs Summary './new_clang --version' ran 1.13 ± 0.14 times faster than './old_clang --version' ``` We get about 2ms faster `--version` runs. While there is a lot of noise in binary execution time, this delta is pretty consistent, and represents over 10% improvement. This is particularly interesting to me because for very short source files, repeatedly starting the `clang` binary is actually the dominant cost. For example, `configure` scripts running against the `clang` compiler are slow in large part because of binary start up time, not the time to process the actual inputs to the compiler. This PR implements the string tables using `constexpr` code and the existing macro system. I understand that the builtins are moving towards a TableGen model, and if complete that would provide more options for modeling this. Unfortunately, that migration isn't complete, and even the parts that are migrated still rely on the ability to break out of the TableGen model and directly expand an X-macro style `BUILTIN(...)` textually. I looked at trying to complete the move to TableGen, but it would both require the difficult migration of the remaining targets, and solving some tricky problems with how to move away from any macro-based expansion. I was also able to find a reasonably clean and effective way of doing this with the existing macros and some `constexpr` code that I think is clean enough to be a pretty good intermediate state, and maybe give a good target for the eventual TableGen solution. I was also able to factor the macros into set of consistent patterns that avoids a significant regression in overall boilerplate. There is one challenge with this approach: it requires the host compiler to support (very) long string literals, a bit over half a meg. =/ The current minimum MSVC version rejects these, but the very next patch release (16.8) removed that restriction. I'm going to send out a separate PR / RFC to raise the minimum version by one patch release, which I hope is acceptable as the current version was set years ago. FWIW, there are a few more low-hanging fruit sources of excessive dynamic relocations, maybe as many as 50k to 100k more that I'll take a look at to see if I can identify easy fixes. Beyond that, it seems to get quite difficult. It might be worth adding some guidance to developer documentation to try to avoid creating global data structures that _repeatedly_ store pointers to other g
[clang] Switch builtin strings to use string tables (PR #118734)
github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning: You can test this locally with the following command: ``bash git-clang-format --diff 983f88c1ec9cee51cf0fb4e6a0f00074a7cf1b60 16e70ef753de962e76b5232912094a51ca24 --extensions cpp,h -- clang/include/clang/Basic/Builtins.h clang/include/clang/Basic/TargetInfo.h clang/lib/Basic/Builtins.cpp clang/lib/Basic/Targets/AArch64.cpp clang/lib/Basic/Targets/AArch64.h clang/lib/Basic/Targets/AMDGPU.cpp clang/lib/Basic/Targets/AMDGPU.h clang/lib/Basic/Targets/ARC.h clang/lib/Basic/Targets/ARM.cpp clang/lib/Basic/Targets/ARM.h clang/lib/Basic/Targets/AVR.h clang/lib/Basic/Targets/BPF.cpp clang/lib/Basic/Targets/BPF.h clang/lib/Basic/Targets/CSKY.cpp clang/lib/Basic/Targets/CSKY.h clang/lib/Basic/Targets/DirectX.h clang/lib/Basic/Targets/Hexagon.cpp clang/lib/Basic/Targets/Hexagon.h clang/lib/Basic/Targets/Lanai.h clang/lib/Basic/Targets/LoongArch.cpp clang/lib/Basic/Targets/LoongArch.h clang/lib/Basic/Targets/M68k.cpp clang/lib/Basic/Targets/M68k.h clang/lib/Basic/Targets/MSP430.h clang/lib/Basic/Targets/Mips.cpp clang/lib/Basic/Targets/Mips.h clang/lib/Basic/Targets/NVPTX.cpp clang/lib/Basic/Targets/NVPTX.h clang/lib/Basic/Targets/PNaCl.h clang/lib/Basic/Targets/PPC.cpp clang/lib/Basic/Targets/PPC.h clang/lib/Basic/Targets/RISCV.cpp clang/lib/Basic/Targets/RISCV.h clang/lib/Basic/Targets/SPIR.cpp clang/lib/Basic/Targets/SPIR.h clang/lib/Basic/Targets/Sparc.h clang/lib/Basic/Targets/SystemZ.cpp clang/lib/Basic/Targets/SystemZ.h clang/lib/Basic/Targets/TCE.h clang/lib/Basic/Targets/VE.cpp clang/lib/Basic/Targets/VE.h clang/lib/Basic/Targets/WebAssembly.cpp clang/lib/Basic/Targets/WebAssembly.h clang/lib/Basic/Targets/X86.cpp clang/lib/Basic/Targets/X86.h clang/lib/Basic/Targets/XCore.cpp clang/lib/Basic/Targets/XCore.h `` View the diff from clang-format here. ``diff diff --git a/clang/include/clang/Basic/Builtins.h b/clang/include/clang/Basic/Builtins.h index d5d76cbb93..fe3a864a68 100644 --- a/clang/include/clang/Basic/Builtins.h +++ b/clang/include/clang/Basic/Builtins.h @@ -134,8 +134,8 @@ template struct Storage { // turn into actual tokens that would disrupt string literal concatenation. #ifdef __GNUC__ #define CLANG_BUILTIN_DETAIL_STR_TABLE(S) \ - _Pragma("GCC diagnostic push") \ - _Pragma("GCC diagnostic ignored \"-Woverlength-strings\"") \ + _Pragma("GCC diagnostic push") \ + _Pragma("GCC diagnostic ignored \"-Woverlength-strings\"") \ S _Pragma("GCC diagnostic pop") #else #define CLANG_BUILTIN_DETAIL_STR_TABLE(S) S `` https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
chandlerc wrote: Updated to use the GCC diagnostic push the same as TableGen does for long string literals. Also added Aaron to take a look as well (unless he's comfortable already). https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
https://github.com/chandlerc updated https://github.com/llvm/llvm-project/pull/118734 >From 16e70ef753de962e76b5232912094a51ca24 Mon Sep 17 00:00:00 2001 From: Chandler Carruth Date: Thu, 28 Nov 2024 09:56:40 + Subject: [PATCH] Switch builtin strings to use string tables MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The Clang binary (and any binary linking Clang as a library), when built using PIE, ends up with a pretty shocking number of dynamic relocations to apply to the executable image: roughly 400k. Each of these takes up binary space in the executable, and perhaps most interestingly takes start-up time to apply the relocations. The largest pattern I identified were the strings used to describe target builtins. The addresses of these string literals were stored into huge arrays, each one requiring a dynamic relocation. The way to avoid this is to design the target builtins to use a single large table of strings and offsets within the table for the individual strings. This switches the builtin management to such a scheme. This saves over 100k dynamic relocations by my measurement, an over 25% reduction. Just looking at byte size improvements, using the `bloaty` tool to compare a newly built `clang` binary to an old one: ``` FILE SIZEVM SIZE -- -- +1.4% +653Ki +1.4% +653Ki.rodata +0.0%+960 +0.0%+960.text +0.0%+197 +0.0%+197.dynstr +0.0%+184 +0.0%+184.eh_frame +0.0% +96 +0.0% +96.dynsym +0.0% +40 +0.0% +40.eh_frame_hdr +114% +32 [ = ] 0[Unmapped] +0.0% +20 +0.0% +20.gnu.hash +0.0% +8 +0.0% +8.gnu.version +0.9% +7 +0.9% +7[LOAD #2 [R]] [ = ] 0 -75.4% -3.00Ki.relro_padding -16.1% -802Ki -16.1% -802Ki.data.rel.ro -27.3% -2.52Mi -27.3% -2.52Mi.rela.dyn -1.6% -2.66Mi -1.6% -2.66MiTOTAL ``` We get a 16% reduction in the `.data.rel.ro` section, and nearly 30% reduction in `.rela.dyn` where those reloctaions are stored. This is also visible in my benchmarking of binary start-up overhead at least: ``` Benchmark 1: ./old_clang --version Time (mean ± σ): 17.6 ms ± 1.5 ms[User: 4.1 ms, System: 13.3 ms] Range (min … max):14.2 ms … 22.8 ms162 runs Benchmark 2: ./new_clang --version Time (mean ± σ): 15.5 ms ± 1.4 ms[User: 3.6 ms, System: 11.8 ms] Range (min … max):12.4 ms … 20.3 ms216 runs Summary './new_clang --version' ran 1.13 ± 0.14 times faster than './old_clang --version' ``` We get about 2ms faster `--version` runs. While there is a lot of noise in binary execution time, this delta is pretty consistent, and represents over 10% improvement. This is particularly interesting to me because for very short source files, repeatedly starting the `clang` binary is actually the dominant cost. For example, `configure` scripts running against the `clang` compiler are slow in large part because of binary start up time, not the time to process the actual inputs to the compiler. This PR implements the string tables using `constexpr` code and the existing macro system. I understand that the builtins are moving towards a TableGen model, and if complete that would provide more options for modeling this. Unfortunately, that migration isn't complete, and even the parts that are migrated still rely on the ability to break out of the TableGen model and directly expand an X-macro style `BUILTIN(...)` textually. I looked at trying to complete the move to TableGen, but it would both require the difficult migration of the remaining targets, and solving some tricky problems with how to move away from any macro-based expansion. I was also able to find a reasonably clean and effective way of doing this with the existing macros and some `constexpr` code that I think is clean enough to be a pretty good intermediate state, and maybe give a good target for the eventual TableGen solution. I was also able to factor the macros into set of consistent patterns that avoids a significant regression in overall boilerplate. There is one challenge with this approach: it requires the host compiler to support (very) long string literals, a bit over half a meg. =/ The current minimum MSVC version rejects these, but the very next patch release (16.8) removed that restriction. I'm going to send out a separate PR / RFC to raise the minimum version by one patch release, which I hope is acceptable as the current version was set years ago. FWIW, there are a few more low-hanging fruit sources of excessive dynamic relocations, maybe as many as 50k to 100k more that I'll take a look at to see if I can identify easy fixes. Beyond that, it seems to get quite difficult. It might be worth adding some guidance to developer documentation to try to avoid creating global data structures that _repeatedly_ store pointers to other g
[clang] Switch builtin strings to use string tables (PR #118734)
https://github.com/zygoloid approved this pull request. LGTM too. I do wonder if we can make tablegen generate the data directly in the desired format here (perhaps with some additional `.td` files to define exactly which files we want to include in which targets and to define the placeholder for the target-independent builtins), but I don't think this (huge) improvement needs to block on that. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
https://github.com/dwblaikie approved this pull request. This looks fine to me - though given the broad impact (to every target in clang) maybe @AaronBallman wants to weigh in. If you wanted to front-load the things like `getRecord(ID).Attributes` -> `getAttributesString(ID)` those things could possibly be done in advance. https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
chandlerc wrote: Discussion thread about the MSVC version change: https://discourse.llvm.org/t/rfc-raising-minimum-msvc-version-by-one-patch-release/83490 https://github.com/llvm/llvm-project/pull/118734 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Switch builtin strings to use string tables (PR #118734)
llvmbot wrote: @llvm/pr-subscribers-backend-arm Author: Chandler Carruth (chandlerc) Changes The Clang binary (and any binary linking Clang as a library), when built using PIE, ends up with a pretty shocking number of dynamic relocations to apply to the executable image: roughly 400k. Each of these takes up binary space in the executable, and perhaps most interestingly takes start-up time to apply the relocations. The largest pattern I identified were the strings used to describe target builtins. The addresses of these string literals were stored into huge arrays, each one requiring a dynamic relocation. The way to avoid this is to design the target builtins to use a single large table of strings and offsets within the table for the individual strings. This switches the builtin management to such a scheme. This saves over 100k dynamic relocations by my measurement, an over 25% reduction. Just looking at byte size improvements, using the `bloaty` tool to compare a newly built `clang` binary to an old one: ``` FILE SIZEVM SIZE -- -- +1.4% +653Ki +1.4% +653Ki.rodata +0.0%+960 +0.0%+960.text +0.0%+197 +0.0%+197.dynstr +0.0%+184 +0.0%+184.eh_frame +0.0% +96 +0.0% +96.dynsym +0.0% +40 +0.0% +40.eh_frame_hdr +114% +32 [ = ] 0[Unmapped] +0.0% +20 +0.0% +20.gnu.hash +0.0% +8 +0.0% +8.gnu.version +0.9% +7 +0.9% +7[LOAD #2 [R]] [ = ] 0 -75.4% -3.00Ki.relro_padding -16.1% -802Ki -16.1% -802Ki.data.rel.ro -27.3% -2.52Mi -27.3% -2.52Mi.rela.dyn -1.6% -2.66Mi -1.6% -2.66MiTOTAL ``` We get a 16% reduction in the `.data.rel.ro` section, and nearly 30% reduction in `.rela.dyn` where those reloctaions are stored. This is also visible in my benchmarking of binary start-up overhead at least: ``` Benchmark 1: ./old_clang --version Time (mean ± σ): 17.6 ms ± 1.5 ms[User: 4.1 ms, System: 13.3 ms] Range (min … max):14.2 ms … 22.8 ms162 runs Benchmark 2: ./new_clang --version Time (mean ± σ): 15.5 ms ± 1.4 ms[User: 3.6 ms, System: 11.8 ms] Range (min … max):12.4 ms … 20.3 ms216 runs Summary './new_clang --version' ran 1.13 ± 0.14 times faster than './old_clang --version' ``` We get about 2ms faster `--version` runs. While there is a lot of noise in binary execution time, this delta is pretty consistent, and represents over 10% improvement. This is particularly interesting to me because for very short source files, repeatedly starting the `clang` binary is actually the dominant cost. For example, `configure` scripts running against the `clang` compiler are slow in large part because of binary start up time, not the time to process the actual inputs to the compiler. This PR implements the string tables using `constexpr` code and the existing macro system. I understand that the builtins are moving towards a TableGen model, and if complete that would provide more options for modeling this. Unfortunately, that migration isn't complete, and even the parts that are migrated still rely on the ability to break out of the TableGen model and directly expand an X-macro style `BUILTIN(...)` textually. I looked at trying to complete the move to TableGen, but it would both require the difficult migration of the remaining targets, and solving some tricky problems with how to move away from any macro-based expansion. I was also able to find a reasonably clean and effective way of doing this with the existing macros and some `constexpr` code that I think is clean enough to be a pretty good intermediate state, and maybe give a good target for the eventual TableGen solution. I was also able to factor the macros into set of consistent patterns that avoids a significant regression in overall boilerplate. There is one challenge with this approach: it requires the host compiler to support (very) long string literals, a bit over half a meg. =/ The current minimum MSVC version rejects these, but the very next patch release (16.8) removed that restriction. I'm going to send out a separate PR / RFC to raise the minimum version by one patch release, which I hope is acceptable as the current version was set years ago. FWIW, there are a few more low-hanging fruit sources of excessive dynamic relocations, maybe as many as 50k to 100k more that I'll take a look at to see if I can identify easy fixes. Beyond that, it seems to get quite difficult. It might be worth adding some guidance to developer documentation to try to avoid creating global data structures that _repeatedly_ store pointers to other globals. --- Patch is 77.35 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/118734.diff 48 Files Affected: - (modified) clang/include/clang/Basic/Builtins.h (+174-36) - (modified) clang
[clang] Switch builtin strings to use string tables (PR #118734)
llvmbot wrote: @llvm/pr-subscribers-backend-msp430 Author: Chandler Carruth (chandlerc) Changes The Clang binary (and any binary linking Clang as a library), when built using PIE, ends up with a pretty shocking number of dynamic relocations to apply to the executable image: roughly 400k. Each of these takes up binary space in the executable, and perhaps most interestingly takes start-up time to apply the relocations. The largest pattern I identified were the strings used to describe target builtins. The addresses of these string literals were stored into huge arrays, each one requiring a dynamic relocation. The way to avoid this is to design the target builtins to use a single large table of strings and offsets within the table for the individual strings. This switches the builtin management to such a scheme. This saves over 100k dynamic relocations by my measurement, an over 25% reduction. Just looking at byte size improvements, using the `bloaty` tool to compare a newly built `clang` binary to an old one: ``` FILE SIZEVM SIZE -- -- +1.4% +653Ki +1.4% +653Ki.rodata +0.0%+960 +0.0%+960.text +0.0%+197 +0.0%+197.dynstr +0.0%+184 +0.0%+184.eh_frame +0.0% +96 +0.0% +96.dynsym +0.0% +40 +0.0% +40.eh_frame_hdr +114% +32 [ = ] 0[Unmapped] +0.0% +20 +0.0% +20.gnu.hash +0.0% +8 +0.0% +8.gnu.version +0.9% +7 +0.9% +7[LOAD #2 [R]] [ = ] 0 -75.4% -3.00Ki.relro_padding -16.1% -802Ki -16.1% -802Ki.data.rel.ro -27.3% -2.52Mi -27.3% -2.52Mi.rela.dyn -1.6% -2.66Mi -1.6% -2.66MiTOTAL ``` We get a 16% reduction in the `.data.rel.ro` section, and nearly 30% reduction in `.rela.dyn` where those reloctaions are stored. This is also visible in my benchmarking of binary start-up overhead at least: ``` Benchmark 1: ./old_clang --version Time (mean ± σ): 17.6 ms ± 1.5 ms[User: 4.1 ms, System: 13.3 ms] Range (min … max):14.2 ms … 22.8 ms162 runs Benchmark 2: ./new_clang --version Time (mean ± σ): 15.5 ms ± 1.4 ms[User: 3.6 ms, System: 11.8 ms] Range (min … max):12.4 ms … 20.3 ms216 runs Summary './new_clang --version' ran 1.13 ± 0.14 times faster than './old_clang --version' ``` We get about 2ms faster `--version` runs. While there is a lot of noise in binary execution time, this delta is pretty consistent, and represents over 10% improvement. This is particularly interesting to me because for very short source files, repeatedly starting the `clang` binary is actually the dominant cost. For example, `configure` scripts running against the `clang` compiler are slow in large part because of binary start up time, not the time to process the actual inputs to the compiler. This PR implements the string tables using `constexpr` code and the existing macro system. I understand that the builtins are moving towards a TableGen model, and if complete that would provide more options for modeling this. Unfortunately, that migration isn't complete, and even the parts that are migrated still rely on the ability to break out of the TableGen model and directly expand an X-macro style `BUILTIN(...)` textually. I looked at trying to complete the move to TableGen, but it would both require the difficult migration of the remaining targets, and solving some tricky problems with how to move away from any macro-based expansion. I was also able to find a reasonably clean and effective way of doing this with the existing macros and some `constexpr` code that I think is clean enough to be a pretty good intermediate state, and maybe give a good target for the eventual TableGen solution. I was also able to factor the macros into set of consistent patterns that avoids a significant regression in overall boilerplate. There is one challenge with this approach: it requires the host compiler to support (very) long string literals, a bit over half a meg. =/ The current minimum MSVC version rejects these, but the very next patch release (16.8) removed that restriction. I'm going to send out a separate PR / RFC to raise the minimum version by one patch release, which I hope is acceptable as the current version was set years ago. FWIW, there are a few more low-hanging fruit sources of excessive dynamic relocations, maybe as many as 50k to 100k more that I'll take a look at to see if I can identify easy fixes. Beyond that, it seems to get quite difficult. It might be worth adding some guidance to developer documentation to try to avoid creating global data structures that _repeatedly_ store pointers to other globals. --- Patch is 77.35 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/118734.diff 48 Files Affected: - (modified) clang/include/clang/Basic/Builtins.h (+174-36) - (modified) cl
[clang] Switch builtin strings to use string tables (PR #118734)
llvmbot wrote: @llvm/pr-subscribers-backend-sparc Author: Chandler Carruth (chandlerc) Changes The Clang binary (and any binary linking Clang as a library), when built using PIE, ends up with a pretty shocking number of dynamic relocations to apply to the executable image: roughly 400k. Each of these takes up binary space in the executable, and perhaps most interestingly takes start-up time to apply the relocations. The largest pattern I identified were the strings used to describe target builtins. The addresses of these string literals were stored into huge arrays, each one requiring a dynamic relocation. The way to avoid this is to design the target builtins to use a single large table of strings and offsets within the table for the individual strings. This switches the builtin management to such a scheme. This saves over 100k dynamic relocations by my measurement, an over 25% reduction. Just looking at byte size improvements, using the `bloaty` tool to compare a newly built `clang` binary to an old one: ``` FILE SIZEVM SIZE -- -- +1.4% +653Ki +1.4% +653Ki.rodata +0.0%+960 +0.0%+960.text +0.0%+197 +0.0%+197.dynstr +0.0%+184 +0.0%+184.eh_frame +0.0% +96 +0.0% +96.dynsym +0.0% +40 +0.0% +40.eh_frame_hdr +114% +32 [ = ] 0[Unmapped] +0.0% +20 +0.0% +20.gnu.hash +0.0% +8 +0.0% +8.gnu.version +0.9% +7 +0.9% +7[LOAD #2 [R]] [ = ] 0 -75.4% -3.00Ki.relro_padding -16.1% -802Ki -16.1% -802Ki.data.rel.ro -27.3% -2.52Mi -27.3% -2.52Mi.rela.dyn -1.6% -2.66Mi -1.6% -2.66MiTOTAL ``` We get a 16% reduction in the `.data.rel.ro` section, and nearly 30% reduction in `.rela.dyn` where those reloctaions are stored. This is also visible in my benchmarking of binary start-up overhead at least: ``` Benchmark 1: ./old_clang --version Time (mean ± σ): 17.6 ms ± 1.5 ms[User: 4.1 ms, System: 13.3 ms] Range (min … max):14.2 ms … 22.8 ms162 runs Benchmark 2: ./new_clang --version Time (mean ± σ): 15.5 ms ± 1.4 ms[User: 3.6 ms, System: 11.8 ms] Range (min … max):12.4 ms … 20.3 ms216 runs Summary './new_clang --version' ran 1.13 ± 0.14 times faster than './old_clang --version' ``` We get about 2ms faster `--version` runs. While there is a lot of noise in binary execution time, this delta is pretty consistent, and represents over 10% improvement. This is particularly interesting to me because for very short source files, repeatedly starting the `clang` binary is actually the dominant cost. For example, `configure` scripts running against the `clang` compiler are slow in large part because of binary start up time, not the time to process the actual inputs to the compiler. This PR implements the string tables using `constexpr` code and the existing macro system. I understand that the builtins are moving towards a TableGen model, and if complete that would provide more options for modeling this. Unfortunately, that migration isn't complete, and even the parts that are migrated still rely on the ability to break out of the TableGen model and directly expand an X-macro style `BUILTIN(...)` textually. I looked at trying to complete the move to TableGen, but it would both require the difficult migration of the remaining targets, and solving some tricky problems with how to move away from any macro-based expansion. I was also able to find a reasonably clean and effective way of doing this with the existing macros and some `constexpr` code that I think is clean enough to be a pretty good intermediate state, and maybe give a good target for the eventual TableGen solution. I was also able to factor the macros into set of consistent patterns that avoids a significant regression in overall boilerplate. There is one challenge with this approach: it requires the host compiler to support (very) long string literals, a bit over half a meg. =/ The current minimum MSVC version rejects these, but the very next patch release (16.8) removed that restriction. I'm going to send out a separate PR / RFC to raise the minimum version by one patch release, which I hope is acceptable as the current version was set years ago. FWIW, there are a few more low-hanging fruit sources of excessive dynamic relocations, maybe as many as 50k to 100k more that I'll take a look at to see if I can identify easy fixes. Beyond that, it seems to get quite difficult. It might be worth adding some guidance to developer documentation to try to avoid creating global data structures that _repeatedly_ store pointers to other globals. --- Patch is 77.35 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/118734.diff 48 Files Affected: - (modified) clang/include/clang/Basic/Builtins.h (+174-36) - (modified) cla
[clang] Switch builtin strings to use string tables (PR #118734)
llvmbot wrote: @llvm/pr-subscribers-backend-risc-v Author: Chandler Carruth (chandlerc) Changes The Clang binary (and any binary linking Clang as a library), when built using PIE, ends up with a pretty shocking number of dynamic relocations to apply to the executable image: roughly 400k. Each of these takes up binary space in the executable, and perhaps most interestingly takes start-up time to apply the relocations. The largest pattern I identified were the strings used to describe target builtins. The addresses of these string literals were stored into huge arrays, each one requiring a dynamic relocation. The way to avoid this is to design the target builtins to use a single large table of strings and offsets within the table for the individual strings. This switches the builtin management to such a scheme. This saves over 100k dynamic relocations by my measurement, an over 25% reduction. Just looking at byte size improvements, using the `bloaty` tool to compare a newly built `clang` binary to an old one: ``` FILE SIZEVM SIZE -- -- +1.4% +653Ki +1.4% +653Ki.rodata +0.0%+960 +0.0%+960.text +0.0%+197 +0.0%+197.dynstr +0.0%+184 +0.0%+184.eh_frame +0.0% +96 +0.0% +96.dynsym +0.0% +40 +0.0% +40.eh_frame_hdr +114% +32 [ = ] 0[Unmapped] +0.0% +20 +0.0% +20.gnu.hash +0.0% +8 +0.0% +8.gnu.version +0.9% +7 +0.9% +7[LOAD #2 [R]] [ = ] 0 -75.4% -3.00Ki.relro_padding -16.1% -802Ki -16.1% -802Ki.data.rel.ro -27.3% -2.52Mi -27.3% -2.52Mi.rela.dyn -1.6% -2.66Mi -1.6% -2.66MiTOTAL ``` We get a 16% reduction in the `.data.rel.ro` section, and nearly 30% reduction in `.rela.dyn` where those reloctaions are stored. This is also visible in my benchmarking of binary start-up overhead at least: ``` Benchmark 1: ./old_clang --version Time (mean ± σ): 17.6 ms ± 1.5 ms[User: 4.1 ms, System: 13.3 ms] Range (min … max):14.2 ms … 22.8 ms162 runs Benchmark 2: ./new_clang --version Time (mean ± σ): 15.5 ms ± 1.4 ms[User: 3.6 ms, System: 11.8 ms] Range (min … max):12.4 ms … 20.3 ms216 runs Summary './new_clang --version' ran 1.13 ± 0.14 times faster than './old_clang --version' ``` We get about 2ms faster `--version` runs. While there is a lot of noise in binary execution time, this delta is pretty consistent, and represents over 10% improvement. This is particularly interesting to me because for very short source files, repeatedly starting the `clang` binary is actually the dominant cost. For example, `configure` scripts running against the `clang` compiler are slow in large part because of binary start up time, not the time to process the actual inputs to the compiler. This PR implements the string tables using `constexpr` code and the existing macro system. I understand that the builtins are moving towards a TableGen model, and if complete that would provide more options for modeling this. Unfortunately, that migration isn't complete, and even the parts that are migrated still rely on the ability to break out of the TableGen model and directly expand an X-macro style `BUILTIN(...)` textually. I looked at trying to complete the move to TableGen, but it would both require the difficult migration of the remaining targets, and solving some tricky problems with how to move away from any macro-based expansion. I was also able to find a reasonably clean and effective way of doing this with the existing macros and some `constexpr` code that I think is clean enough to be a pretty good intermediate state, and maybe give a good target for the eventual TableGen solution. I was also able to factor the macros into set of consistent patterns that avoids a significant regression in overall boilerplate. There is one challenge with this approach: it requires the host compiler to support (very) long string literals, a bit over half a meg. =/ The current minimum MSVC version rejects these, but the very next patch release (16.8) removed that restriction. I'm going to send out a separate PR / RFC to raise the minimum version by one patch release, which I hope is acceptable as the current version was set years ago. FWIW, there are a few more low-hanging fruit sources of excessive dynamic relocations, maybe as many as 50k to 100k more that I'll take a look at to see if I can identify easy fixes. Beyond that, it seems to get quite difficult. It might be worth adding some guidance to developer documentation to try to avoid creating global data structures that _repeatedly_ store pointers to other globals. --- Patch is 77.35 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/118734.diff 48 Files Affected: - (modified) clang/include/clang/Basic/Builtins.h (+174-36) - (modified) cl
[clang] Switch builtin strings to use string tables (PR #118734)
https://github.com/chandlerc created https://github.com/llvm/llvm-project/pull/118734 The Clang binary (and any binary linking Clang as a library), when built using PIE, ends up with a pretty shocking number of dynamic relocations to apply to the executable image: roughly 400k. Each of these takes up binary space in the executable, and perhaps most interestingly takes start-up time to apply the relocations. The largest pattern I identified were the strings used to describe target builtins. The addresses of these string literals were stored into huge arrays, each one requiring a dynamic relocation. The way to avoid this is to design the target builtins to use a single large table of strings and offsets within the table for the individual strings. This switches the builtin management to such a scheme. This saves over 100k dynamic relocations by my measurement, an over 25% reduction. Just looking at byte size improvements, using the `bloaty` tool to compare a newly built `clang` binary to an old one: ``` FILE SIZEVM SIZE -- -- +1.4% +653Ki +1.4% +653Ki.rodata +0.0%+960 +0.0%+960.text +0.0%+197 +0.0%+197.dynstr +0.0%+184 +0.0%+184.eh_frame +0.0% +96 +0.0% +96.dynsym +0.0% +40 +0.0% +40.eh_frame_hdr +114% +32 [ = ] 0[Unmapped] +0.0% +20 +0.0% +20.gnu.hash +0.0% +8 +0.0% +8.gnu.version +0.9% +7 +0.9% +7[LOAD #2 [R]] [ = ] 0 -75.4% -3.00Ki.relro_padding -16.1% -802Ki -16.1% -802Ki.data.rel.ro -27.3% -2.52Mi -27.3% -2.52Mi.rela.dyn -1.6% -2.66Mi -1.6% -2.66MiTOTAL ``` We get a 16% reduction in the `.data.rel.ro` section, and nearly 30% reduction in `.rela.dyn` where those reloctaions are stored. This is also visible in my benchmarking of binary start-up overhead at least: ``` Benchmark 1: ./old_clang --version Time (mean ± σ): 17.6 ms ± 1.5 ms[User: 4.1 ms, System: 13.3 ms] Range (min … max):14.2 ms … 22.8 ms162 runs Benchmark 2: ./new_clang --version Time (mean ± σ): 15.5 ms ± 1.4 ms[User: 3.6 ms, System: 11.8 ms] Range (min … max):12.4 ms … 20.3 ms216 runs Summary './new_clang --version' ran 1.13 ± 0.14 times faster than './old_clang --version' ``` We get about 2ms faster `--version` runs. While there is a lot of noise in binary execution time, this delta is pretty consistent, and represents over 10% improvement. This is particularly interesting to me because for very short source files, repeatedly starting the `clang` binary is actually the dominant cost. For example, `configure` scripts running against the `clang` compiler are slow in large part because of binary start up time, not the time to process the actual inputs to the compiler. This PR implements the string tables using `constexpr` code and the existing macro system. I understand that the builtins are moving towards a TableGen model, and if complete that would provide more options for modeling this. Unfortunately, that migration isn't complete, and even the parts that are migrated still rely on the ability to break out of the TableGen model and directly expand an X-macro style `BUILTIN(...)` textually. I looked at trying to complete the move to TableGen, but it would both require the difficult migration of the remaining targets, and solving some tricky problems with how to move away from any macro-based expansion. I was also able to find a reasonably clean and effective way of doing this with the existing macros and some `constexpr` code that I think is clean enough to be a pretty good intermediate state, and maybe give a good target for the eventual TableGen solution. I was also able to factor the macros into set of consistent patterns that avoids a significant regression in overall boilerplate. There is one challenge with this approach: it requires the host compiler to support (very) long string literals, a bit over half a meg. =/ The current minimum MSVC version rejects these, but the very next patch release (16.8) removed that restriction. I'm going to send out a separate PR / RFC to raise the minimum version by one patch release, which I hope is acceptable as the current version was set years ago. FWIW, there are a few more low-hanging fruit sources of excessive dynamic relocations, maybe as many as 50k to 100k more that I'll take a look at to see if I can identify easy fixes. Beyond that, it seems to get quite difficult. It might be worth adding some guidance to developer documentation to try to avoid creating global data structures that _repeatedly_ store pointers to other globals. >From 02d05d256a1f5158be5bd924e66f820fee15ceec Mon Sep 17 00:00:00 2001 From: Chandler Carruth Date: Thu, 28 Nov 2024 09:56:40 + Subject: [PATCH] Switch builtin strings to use string tables MIME-Version: 1.0 Content-Type: text/plain;