Anastasia added a comment.

I have done some measurements using the test produced from this Tablegen emitter (59K lines).

I have used the test in two files:

1. `SemaOpenCL/all-std-buitins.cl`, which has the following RUN line repeated 6 times (once for every supported OpenCL version: v1.0, v1.1, v1.2, v2.0, v3.0, C++)

  //RUN: %clang_cc1 %s -triple=spir -fsyntax-only -verify -cl-std=CL2.0 -finclude-default-header -fdeclare-opencl-builtins

2. `SemaOpenCL/all-std-buitins-slow-header.cl`, which has the following RUN line repeated 6 times (once for every supported OpenCL version: v1.0, v1.1, v1.2, v2.0, v3.0, C++)

  //RUN: %clang_cc1 %s -triple=spir -fsyntax-only -verify -cl-std=CL2.0 -finclude-default-header

With these I am getting the following testing time breakdown:

  201.61s: Clang :: SemaOpenCL/all-std-buitins-slow-header.cl
  199.70s: Clang :: SemaOpenCL/all-std-buitins.cl
  85.14s: Clang :: Headers/arm-neon-header.c
  68.06s: Clang :: OpenMP/nesting_of_regions.cpp
  65.23s: Clang :: Driver/crash-report.c
  60.26s: Clang :: Analysis/PR24184.cpp
  57.80s: Clang :: CodeGen/X86/rot-intrinsics.c
  57.58s: Clang :: CodeGen/X86/x86_64-xsave.c
  56.34s: Clang :: Headers/opencl-c-header.cl
  55.68s: Clang :: CodeGen/X86/x86_32-xsave.c
  44.83s: Clang :: Driver/crash-report-with-asserts.c
  40.38s: Clang :: Lexer/SourceLocationsOverflow.c
  37.44s: Clang :: Headers/x86intrin-2.c
  36.53s: Clang :: OpenMP/target_teams_distribute_parallel_for_simd_codegen_registration.cpp
  34.09s: Clang :: CodeGen/X86/avx512f-builtins-constrained.c
  33.41s: Clang :: CodeGen/X86/sse-builtins-constrained.c
  32.82s: Clang :: Analysis/iterator-modeling.cpp
  31.37s: Clang :: OpenMP/target_teams_distribute_simd_codegen_registration.cpp
  31.10s: Clang :: OpenMP/target_parallel_for_simd_codegen_registration.cpp
  30.78s: Clang :: Analysis/use-after-move.cpp

I am very confused, though, about why the difference between Tablegen and `opencl-c.h` is so insignificant. FYI, for a single clang invocation the difference between Tablegen and `opencl-c.h` in the parsing time of this test is also very small: 20.794s vs 21.401s.
This is really interesting because with small files the difference is huge: 0.043s vs 3.990s on a test with an empty kernel.

---------------------------------------

I also timed the `check-clang` invocation on my 8 core machine:

1. with both tests - 697.70s
2. with all-std-buitins.cl only - 684.43s
3. without any new tests - 673.00s

The change in total testing time appears to be insignificant. I guess this is due to parallel execution?

Btw, one thing I have thought of: since the builtin functions don't differ a lot between OpenCL v1.0, v1.1 and v1.2, and they are not modified much either, perhaps we only need to test v1.2 out of those? That would reduce the number of clang invocations to 4 in each test. Then the measurements are as follows:

  134.13s: Clang :: SemaOpenCL/all-std-buitins-slow-header.cl
  131.52s: Clang :: SemaOpenCL/all-std-buitins.cl
  85.81s: Clang :: Headers/arm-neon-header.c
  69.14s: Clang :: OpenMP/nesting_of_regions.cpp
  60.08s: Clang :: Driver/crash-report.c
  59.67s: Clang :: Analysis/PR24184.cpp
  57.27s: Clang :: CodeGen/X86/rot-intrinsics.c
  56.93s: Clang :: CodeGen/X86/x86_32-xsave.c
  56.59s: Clang :: CodeGen/X86/x86_64-xsave.c
  55.68s: Clang :: Headers/opencl-c-header.cl
  40.71s: Clang :: Driver/crash-report-with-asserts.c
  39.44s: Clang :: Lexer/SourceLocationsOverflow.c
  38.02s: Clang :: OpenMP/target_teams_distribute_parallel_for_simd_codegen_registration.cpp
  37.07s: Clang :: Headers/x86intrin-2.c
  32.61s: Clang :: CodeGen/X86/avx512f-builtins-constrained.c
  32.58s: Clang :: CodeGen/X86/sse-builtins-constrained.c
  32.19s: Clang :: Analysis/use-after-move.cpp
  31.96s: Clang :: Analysis/iterator-modeling.cpp
  31.02s: Clang :: OpenMP/target_teams_distribute_simd_codegen_registration.cpp
  30.59s: Clang :: OpenMP/target_parallel_for_simd_codegen_registration.cpp

with a total testing time of 688.61s.

**Conclusion:**

- If we test the whole functionality, the test will be at least 2x slower than the slowest clang test so far, but it hardly affects the full testing time of `check-clang` on modern HW due to the parallel execution. Related to this, partitioning the test files could help with the latency, again thanks to the parallel execution.
- Testing of `opencl-c.h` only doubles the testing time.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97869/new/

https://reviews.llvm.org/D97869
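
For reference, the `check-clang` wall-clock figures quoted in the comment can be put in relative terms with a small script. This is only a sketch over the numbers reported above; the labels and the helper name `overhead` are made up for illustration, and it assumes the 688.61s total was measured with both new tests present (both appear in that listing).

```python
# Wall-clock times for check-clang as reported in the comment (seconds).
BASELINE = 673.00  # without any new tests

# Labels are illustrative; the values are copied from the measurements above.
runs = {
    "both tests, 6 invocations each": 697.70,
    "all-std-buitins.cl only, 6 invocations": 684.43,
    "both tests, 4 invocations each": 688.61,
}


def overhead(total, base):
    """Return (extra seconds, extra percent) relative to the baseline."""
    extra = total - base
    return extra, 100.0 * extra / base


results = {label: overhead(total, BASELINE) for label, total in runs.items()}
for label, (extra, pct) in results.items():
    print(f"{label}: +{extra:.2f}s ({pct:.1f}%)")

# Per-test time when going from 6 to 4 clang invocations, compared with
# the 4/6 ratio expected if the cost scaled with the invocation count.
per_test = {
    "all-std-buitins-slow-header.cl": (201.61, 134.13),
    "all-std-buitins.cl": (199.70, 131.52),
}
ratios = {name: after / before for name, (before, after) in per_test.items()}
for name, ratio in ratios.items():
    print(f"{name}: ratio {ratio:.3f} (4/6 = {4 / 6:.3f})")
```

Under these assumptions every variant adds less than 4% to the total `check-clang` time on this machine, and the per-test time drops roughly in proportion to the number of clang invocations.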