pengfei added a comment. LGTM.
================ Comment at: clang/lib/Headers/avx512vlbf16intrin.h:164 +#define _mm_cvtneps_pbh(A) \ + ((__m128bh)__builtin_ia32_vcvtneps2bf16128((__v4sf)(A))) ---------------- FreddyYe wrote: > pengfei wrote: > > FreddyYe wrote: > > > pengfei wrote: > > > > pengfei wrote: > > > > > RKSimon wrote: > > > > > > Is there no way for __attribute__ to allow different attribute > > > > > > permutations? > > > > > > > > > > > > Also, can we keep the __builtin_ia32_cvtneps2bf16_128 naming > > > > > > convention? > > > > > > Is there no way for attribute to allow different attribute > > > > > > permutations? > > > > > > > > > > We have discussed this problem with GCC folks. There are two problems > > > > > here: > > > > > 1. Unlike builtins, function attributes are more generic. It may > > > > > introduce a lot of checks between callers and callees. I had a > > > > > research to limit it to `__always_inline__` functions only. However, > > > > > Clang handles inlining in middle-end, we don't have such information > > > > > in the front-end. Besides, we don't know how to merge different > > > > > permutations if they are inlining to the same function. > > > > > 2. We don't know how to put the permutations into IR's function > > > > > attributes. We need to preserve all permutations for inlining > > > > > reference, but the backend needs a determine feature list rather than > > > > > selective. > > > > It's better to use `__builtin_ia32_cvtneps2bf16_128`. > > > I think __builtin_ia32_vcvtneps2bf16128 is also a "right" name. > > > > > > See __builtin_ia32_vfmaddsubph256, __builtin_ia32_minph256... > > > > > > And I admit naming conventions of clang builtins as well as LLVM IR > > > builtins are confusing right now. > > The problem here is `16128` is a bit confusing, a `_` breaks it into 2 > > number. > > But I'm not insist on it :) > I did a try but found __builtin_ia32_cvtneps2bf16_256 existed for avx512bf16, > and it's used for mask intrinsic lowering currently. What about not change > this time? We can do a refine patch later for avx512bf16 builtins since they > also have some redundant FE/codegen logics for 256/512 mask intrinsics. No problem. ================ Comment at: llvm/test/CodeGen/X86/avxneconvert-intrinsics-shared.ll:2 +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown --show-mc-encoding -mattr=+avxneconvert,+avx512bf16,+avx512vl | FileCheck %s --check-prefix=AVX512BF16-COMMON +; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown --show-mc-encoding -mattr=+avxneconvert,+avx512bf16,+avx512vl | FileCheck %s --check-prefix=AVX512BF16-COMMON ---------------- Remove `-O0` ================ Comment at: llvm/test/CodeGen/X86/avxneconvert-intrinsics.ll:2 +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown --show-mc-encoding -mattr=+avxneconvert | FileCheck %s --check-prefixes=CHECK,X64 +; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown --show-mc-encoding -mattr=+avxneconvert | FileCheck %s --check-prefixes=CHECK,X86 ---------------- ditto. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135930/new/ https://reviews.llvm.org/D135930 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits