grokos created this revision.
grokos added reviewers: ABataev, vzakhari.
grokos added projects: OpenMP, clang.
Herald added a reviewer: jdoerfert.

Currently, the offload-wrapper tool inserts `__tgt_register_lib` into the list of global ctors of a target module with `Priority=0`. This means that it has the same priority as `__tgt_register_requires`, and the order in which these two functions are called is not guaranteed. Ideally, we'd like to call `__tgt_register_requires` BEFORE loading a libomptarget plugin (which is one of the actions happening inside `__tgt_register_lib`). The reason is that we want to know which requirements the user has asked for, so that upon loading the plugin libomptarget can report how many devices can satisfy those requirements.

E.g. with the current implementation we can run into the following problem:

1. The user requests `unified_shared_memory` but the available devices on the system do not support this feature.
2. Initially, the offload policy is set to `tgt_default`.
3. `__tgt_register_lib` is called and the plugin for the specific target device reports there are N>0 available devices.
4. Consequently, the offload policy is set to `tgt_mandatory`.
5. `__tgt_register_requires` is called and we find out that the `unified_shared_memory` requirement cannot be satisfied.
6. Offload fails and, because the offload policy had been set to mandatory, libomptarget terminates the application.

With the proposed change things will proceed as follows:

1. The user requests `unified_shared_memory` but the available devices on the system do not support this feature.
2. Initially, the offload policy is set to `tgt_default`.
3. `__tgt_register_requires` is called and registers the `unified_shared_memory` requirement with libomptarget.
4. `__tgt_register_lib` is called and the plugin for the specific target device reports that the `unified_shared_memory` requirement cannot be satisfied, so there are N=0 available devices.
5. Consequently, the offload policy is set to `tgt_disabled`.
6. Execution falls back on the host instead of terminating the application.

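The two sequences above can be sketched as a toy state machine. This is only an illustrative model, not libomptarget's code: the types `Policy`, `Outcome`, `System`, `Runtime` and the functions `registerRequires`, `registerLib`, `simulate` are hypothetical stand-ins for `__tgt_register_requires`, `__tgt_register_lib` and the offload policy described here, and it assumes the policy is decided once, when the plugin is loaded.

```cpp
#include <cassert>

// Hypothetical model of the offload policy described above.
enum class Policy { Default, Mandatory, Disabled };
enum class Outcome { RunsOnDevice, HostFallback, Terminated };

// What the machine offers (assumed inputs for the sketch).
struct System {
  int physicalDevices;
  bool devicesSupportUSM;
};

struct Runtime {
  Policy policy = Policy::Default;
  bool usmRequired = false;           // known only after registerRequires
  bool pluginLoaded = false;          // set by registerLib
  bool requirementUnsatisfied = false;
};

// Stand-in for __tgt_register_requires: records the requirement and, if the
// plugin is already loaded, discovers too late that it cannot be met.
void registerRequires(Runtime &rt, const System &sys, bool wantUSM) {
  rt.usmRequired = wantUSM;
  if (rt.pluginLoaded && wantUSM && !sys.devicesSupportUSM)
    rt.requirementUnsatisfied = true;
}

// Stand-in for __tgt_register_lib: loads the plugin, counts the devices that
// satisfy the requirements known so far, and fixes the offload policy.
void registerLib(Runtime &rt, const System &sys) {
  assert(sys.physicalDevices >= 0);
  int usable = sys.physicalDevices;
  if (rt.usmRequired && !sys.devicesSupportUSM)
    usable = 0;
  rt.pluginLoaded = true;
  if (rt.policy == Policy::Default)
    rt.policy = usable > 0 ? Policy::Mandatory : Policy::Disabled;
}

// Runs the two registration calls in the given order and reports what
// happens to the first offload attempt.
Outcome simulate(bool libFirst, const System &sys, bool wantUSM) {
  Runtime rt;
  if (libFirst) {
    registerLib(rt, sys);
    registerRequires(rt, sys, wantUSM);
  } else {
    registerRequires(rt, sys, wantUSM);
    registerLib(rt, sys);
  }
  if (rt.policy == Policy::Disabled)
    return Outcome::HostFallback;
  if (rt.requirementUnsatisfied)
    return rt.policy == Policy::Mandatory ? Outcome::Terminated
                                          : Outcome::HostFallback;
  return Outcome::RunsOnDevice;
}
```

In this model, `simulate(true, System{2, false}, true)` reproduces the first sequence (termination), while `simulate(false, System{2, false}, true)` reproduces the second (host fallback).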
Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D75223

Files:
  clang/test/Driver/clang-offload-wrapper.c
  clang/tools/clang-offload-wrapper/ClangOffloadWrapper.cpp

Index: clang/tools/clang-offload-wrapper/ClangOffloadWrapper.cpp
===================================================================
--- clang/tools/clang-offload-wrapper/ClangOffloadWrapper.cpp
+++ clang/tools/clang-offload-wrapper/ClangOffloadWrapper.cpp
@@ -262,7 +262,12 @@
     Builder.CreateRetVoid();
 
     // Add this function to constructors.
-    appendToGlobalCtors(M, Func, 0);
+    // Set priority to 1 so that __tgt_register_lib is executed AFTER
+    // __tgt_register_requires (we want to know what requirements have been
+    // asked for before we load a libomptarget plugin so that by the time the
+    // plugin is loaded it can report how many devices there are which can
+    // satisfy these requirements).
+    appendToGlobalCtors(M, Func, /*Priority*/ 1);
   }
 
   void createUnregisterFunction(GlobalVariable *BinDesc) {
Index: clang/test/Driver/clang-offload-wrapper.c
===================================================================
--- clang/test/Driver/clang-offload-wrapper.c
+++ clang/test/Driver/clang-offload-wrapper.c
@@ -39,7 +39,7 @@
 
 // CHECK-IR: [[DESC:@.+]] = internal constant [[DESCTY]] { i32 1, [[IMAGETY]]* getelementptr inbounds ([1 x [[IMAGETY]]], [1 x [[IMAGETY]]]* [[IMAGES]], i64 0, i64 0), [[ENTTY]]* [[ENTBEGIN]], [[ENTTY]]* [[ENTEND]] }
 
-// CHECK-IR: @llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 0, void ()* [[REGFN:@.+]], i8* null }]
+// CHECK-IR: @llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 1, void ()* [[REGFN:@.+]], i8* null }]
 // CHECK-IR: @llvm.global_dtors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 0, void ()* [[UNREGFN:@.+]], i8* null }]
 
 // CHECK-IR: define internal void [[REGFN]]()
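
For readers unfamiliar with `@llvm.global_ctors` priorities: entries are called in ascending order of priority, and the order among entries with the same priority is not defined. The sketch below models that rule for the two registration functions. `CtorEntry`, `callOrder` and `runsBefore` are hypothetical helpers written for illustration; they are not part of LLVM or of this patch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mirror of one { i32 priority, void ()* fn, i8* data } entry;
// only the priority and a printable name are kept.
struct CtorEntry {
  int priority;
  std::string name;
};

// Returns the names in the order a loader would call them: ascending
// priority. std::stable_sort keeps equal-priority entries in insertion
// order, which is only ONE of the orders a real loader is allowed to pick.
std::vector<std::string> callOrder(std::vector<CtorEntry> ctors) {
  std::stable_sort(ctors.begin(), ctors.end(),
                   [](const CtorEntry &a, const CtorEntry &b) {
                     return a.priority < b.priority;
                   });
  std::vector<std::string> names;
  for (const auto &c : ctors)
    names.push_back(c.name);
  return names;
}

// True if `a` is called before `b`; both names must be present in `order`.
bool runsBefore(const std::vector<std::string> &order, const std::string &a,
                const std::string &b) {
  auto ia = std::find(order.begin(), order.end(), a);
  auto ib = std::find(order.begin(), order.end(), b);
  assert(ia != order.end() && ib != order.end());
  return ia < ib;
}
```

With `__tgt_register_lib` at priority 1 and `__tgt_register_requires` at priority 0, `callOrder` always places `__tgt_register_requires` first, regardless of the order in which the entries were appended. With both at priority 0, the result of the sketch depends on insertion order, which is exactly the unspecified behaviour the patch removes.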
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits