changpeng wrote:
I am fine with removing the f16/bf16 versions. Enumerating all possible types could be very painful. For example, we gave up on enumerating types for B64 and ended up using v2i32 only.

What do others think about removing the f16/bf16 versions?

Thanks

________________________________
From: Matt Arsenault ***@***.***>
Sent: Friday, March 22, 2024 3:45:53 AM
To: llvm/llvm-project ***@***.***>
Cc: Fang, Changpeng ***@***.***>; Author ***@***.***>
Subject: Re: [llvm/llvm-project] AMDGPU: Rename and add bf16 support for global_load_tr builtins (PR #86202)

@arsenm commented on this pull request.

In clang/include/clang/Basic/BuiltinsAMDGPU.def <https://github.com/llvm/llvm-project/pull/86202#discussion_r1535389287>:

> -TARGET_BUILTIN(__builtin_amdgcn_global_load_tr_v2i32, "V2iV2i*1", "nc", "gfx12-insts,wavefrontsize32")
> -TARGET_BUILTIN(__builtin_amdgcn_global_load_tr_v8i16, "V8sV8s*1", "nc", "gfx12-insts,wavefrontsize32")
> -TARGET_BUILTIN(__builtin_amdgcn_global_load_tr_v8f16, "V8hV8h*1", "nc", "gfx12-insts,wavefrontsize32")
> -
> -TARGET_BUILTIN(__builtin_amdgcn_global_load_tr_i32, "ii*1", "nc", "gfx12-insts,wavefrontsize64")
> -TARGET_BUILTIN(__builtin_amdgcn_global_load_tr_v4i16, "V4sV4s*1", "nc", "gfx12-insts,wavefrontsize64")
> -TARGET_BUILTIN(__builtin_amdgcn_global_load_tr_v4f16, "V4hV4h*1", "nc", "gfx12-insts,wavefrontsize64")
> +TARGET_BUILTIN(__builtin_amdgcn_global_load_tr_b64_v2i32, "V2iV2i*1", "nc", "gfx12-insts,wavefrontsize32")
> +TARGET_BUILTIN(__builtin_amdgcn_global_load_tr_b128_v8i16, "V8sV8s*1", "nc", "gfx12-insts,wavefrontsize32")
> +TARGET_BUILTIN(__builtin_amdgcn_global_load_tr_b128_v8f16, "V8hV8h*1", "nc", "gfx12-insts,wavefrontsize32")
> +TARGET_BUILTIN(__builtin_amdgcn_global_load_tr_b128_v8bf16, "V8yV8y*1", "nc", "gfx12-insts,wavefrontsize32")
> +
> +TARGET_BUILTIN(__builtin_amdgcn_global_load_tr_b64_i32, "ii*1", "nc", "gfx12-insts,wavefrontsize64")
> +TARGET_BUILTIN(__builtin_amdgcn_global_load_tr_b128_v4i16, "V4sV4s*1", "nc", "gfx12-insts,wavefrontsize64")
> +TARGET_BUILTIN(__builtin_amdgcn_global_load_tr_b128_v4f16, "V4hV4h*1", "nc", "gfx12-insts,wavefrontsize64")
> +TARGET_BUILTIN(__builtin_amdgcn_global_load_tr_b128_v4bf16, "V4yV4y*1", "nc", "gfx12-insts,wavefrontsize64")

Do we really need the f16/bf16 versions? You can always bitcast the i16 versions.

https://github.com/llvm/llvm-project/pull/86202

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
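
For readers following the bitcast suggestion, here is a rough host-side sketch in C of what "load as i16, then reinterpret" could look like. It is not code from the PR. The helper names load_tr_v8i16_stub, bf16_bits_to_float, and load_tr_as_bf16 are hypothetical, and the stub is a plain load standing in for the gfx12 builtin so the example can run on an ordinary host. The only point it illustrates is that the i16 result already carries the bf16 bit patterns unchanged.

```c
#include <stdint.h>
#include <string.h>

/* Clang/GCC vector extension: eight 16-bit integer lanes (128 bits). */
typedef int16_t v8i16 __attribute__((vector_size(16)));

/* Hypothetical stand-in for the i16 builtin so the sketch runs on a host.
   On gfx12 (wave32) the call would instead be
   __builtin_amdgcn_global_load_tr_b128_v8i16 on a global-address-space
   pointer; that part is not exercised here. */
static v8i16 load_tr_v8i16_stub(const v8i16 *p) { return *p; }

/* Widen one bf16 bit pattern to float: bf16 holds the upper 16 bits of an
   IEEE-754 binary32 value, so a 16-bit left shift recovers the float. */
static float bf16_bits_to_float(uint16_t bits) {
    uint32_t wide = (uint32_t)bits << 16;
    float f;
    memcpy(&f, &wide, sizeof f);
    return f;
}

/* Load through the i16 path, then reinterpret each lane as bf16.
   The memcpy is the "bitcast": no bits are changed. */
static void load_tr_as_bf16(const v8i16 *p, float out[8]) {
    v8i16 raw = load_tr_v8i16_stub(p);
    uint16_t lanes[8];
    memcpy(lanes, &raw, sizeof lanes);
    for (int i = 0; i < 8; ++i)
        out[i] = bf16_bits_to_float(lanes[i]);
}
```

In device code the reinterpretation would more likely be a single vector-level cast on the builtin's result rather than a per-lane loop; the loop here only exists to make the bit-for-bit equivalence checkable on a host.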