[PATCH] D33406: PR28129 expand vector oparation to an IR constant.
RKSimon closed this revision. RKSimon added a comment. https://reviews.llvm.org/rL305551 https://reviews.llvm.org/D33406 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D33406: PR28129 expand vector oparation to an IR constant.
spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. LGTM. https://reviews.llvm.org/D33406 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D33406: PR28129 expand vector oparation to an IR constant.
dtemirbulatov updated this revision to Diff 102717. dtemirbulatov added a comment. Update formatting, comments https://reviews.llvm.org/D33406 Files: lib/CodeGen/CGBuiltin.cpp test/CodeGen/avx-builtins.c Index: test/CodeGen/avx-builtins.c === --- test/CodeGen/avx-builtins.c +++ test/CodeGen/avx-builtins.c @@ -1427,3 +1427,51 @@ // CHECK: extractelement <8 x float> %{{.*}}, i32 0 return _mm256_cvtss_f32(__a); } + +__m256 test_mm256_cmp_ps_true(__m256 a, __m256 b) { + // CHECK-LABEL: @test_mm256_cmp_ps_true + // CHECK: store <8 x float> zeroinitializer, <8 x float>* %tmp, align 32 + return _mm256_cmp_ps(a, b, _CMP_FALSE_OQ); +} + +__m256 test_mm256_cmp_pd_false(__m256 a, __m256 b) { + // CHECK-LABEL: @test_mm256_cmp_pd_false + // CHECK: store <4 x double> zeroinitializer, <4 x double>* %tmp, align 32 + return _mm256_cmp_pd(a, b, _CMP_FALSE_OQ); +} + +__m256 test_mm256_cmp_ps_strue(__m256 a, __m256 b) { + // CHECK-LABEL: @test_mm256_cmp_ps_strue + // CHECK: store <8 x float> zeroinitializer, <8 x float>* %tmp, align 32 + return _mm256_cmp_ps(a, b, _CMP_FALSE_OS); +} + +__m256 test_mm256_cmp_pd_sfalse(__m256 a, __m256 b) { + // CHECK-LABEL: @test_mm256_cmp_pd_sfalse + // CHECK: store <4 x double> zeroinitializer, <4 x double>* %tmp, align 32 + return _mm256_cmp_pd(a, b, _CMP_FALSE_OS); +} Index: lib/CodeGen/CGBuiltin.cpp === --- lib/CodeGen/CGBuiltin.cpp +++ lib/CodeGen/CGBuiltin.cpp @@ -7923,19 +7923,40 @@ } // We can't handle 8-31 immediates with native IR, use the intrinsic. +// Except for predicates that create constants. Intrinsic::ID ID; switch (BuiltinID) { default: llvm_unreachable("Unsupported intrinsic!"); case X86::BI__builtin_ia32_cmpps: ID = Intrinsic::x86_sse_cmp_ps; break; case X86::BI__builtin_ia32_cmpps256: + // _CMP_TRUE_UQ, _CMP_TRUE_US produce -1,-1... vector + // on any input and _CMP_FALSE_OQ, _CMP_FALSE_OS produce 0, 0... + if (CC == 0xf || CC == 0xb || CC == 0x1b || CC == 0x1f) { + Value *Constant = (CC == 0xf || CC == 0x1f) ? +llvm::Constant::getAllOnesValue(Builder.getInt32Ty()) : +llvm::Constant::getNullValue(Builder.getInt32Ty()); + Value *Vec = Builder.CreateVectorSplat( +Ops[0]->getType()->getVectorNumElements(), Constant); + return Builder.CreateBitCast(Vec, Ops[0]->getType()); + } ID = Intrinsic::x86_avx_cmp_ps_256; break; case X86::BI__builtin_ia32_cmppd: ID = Intrinsic::x86_sse2_cmp_pd; break; case X86::BI__builtin_ia32_cmppd256: + // _CMP_TRUE_UQ, _CMP_TRUE_US produce -1,-1... vector + // on any input and _CMP_FALSE_OQ, _CMP_FALSE_OS produce 0, 0... + if (CC == 0xf || CC == 0xb || CC == 0x1b || CC == 0x1f) { + Value *Constant = (CC == 0xf || CC == 0x1f) ? +llvm::Constant::getAllOnesValue(Builder.getInt64Ty()) : +llvm::Constant::getNullValue(Builder.getInt64Ty()); + Value *Vec = Builder.CreateVectorSplat( +Ops[0]->getType()->getVectorNumElements(), Constant); + return Builder.CreateBitCast(Vec, Ops[0]->getType()); + } ID = Intrinsic::x86_avx_cmp_pd_256; break; } ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D33406: PR28129 expand vector oparation to an IR constant.
spatel added a comment. Functionally, I think this is correct and complete now. See inline for some nits. Comment at: lib/CodeGen/CGBuiltin.cpp:7925 // We can't handle 8-31 immediates with native IR, use the intrinsic. Intrinsic::ID ID; Fix comment to something like: "Except for predicates that create constants, ..." Comment at: lib/CodeGen/CGBuiltin.cpp:7933 case X86::BI__builtin_ia32_cmpps256: + // _CMP_TRUE_UQ, _CMP_TRUE_US would produce -1,-1... vector + // on any input and _CMP_FALSE_OQ, _CMP_FALSE_OS produces 0, 0... would produce --> produces Comment at: lib/CodeGen/CGBuiltin.cpp:7939 +llvm::Constant::getNullValue(Builder.getInt32Ty()); + Value *Vec = Builder.CreateVectorSplat(Ops[0]->getType()->getVectorNumElements(), + Constant); Formatting: over 80-col limit. Comment at: lib/CodeGen/CGBuiltin.cpp:7949 case X86::BI__builtin_ia32_cmppd256: + // _CMP_TRUE_UQ, _CMP_TRUE_US would produce -1,-1... vector + // on any input and _CMP_FALSE_OQ, _CMP_FALSE_OS produces 0, 0... would produce --> produces Comment at: lib/CodeGen/CGBuiltin.cpp:7955 +llvm::Constant::getNullValue(Builder.getInt64Ty()); + Value *Vec = Builder.CreateVectorSplat(Ops[0]->getType()->getVectorNumElements(), + Constant); Formatting: over 80-col limit. https://reviews.llvm.org/D33406 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D33406: PR28129 expand vector oparation to an IR constant.
dtemirbulatov updated this revision to Diff 102673. dtemirbulatov added a reviewer: hfinkel. dtemirbulatov added a comment. Update after http://lists.llvm.org/pipermail/llvm-dev/2017-June/114120.html. Added 0x1b(_CMP_FALSE_OS), 0x1f(_CMP_TRUE_US) handling. https://reviews.llvm.org/D33406 Files: lib/CodeGen/CGBuiltin.cpp test/CodeGen/avx-builtins.c Index: test/CodeGen/avx-builtins.c === --- test/CodeGen/avx-builtins.c +++ test/CodeGen/avx-builtins.c @@ -1427,3 +1427,51 @@ // CHECK: extractelement <8 x float> %{{.*}}, i32 0 return _mm256_cvtss_f32(__a); } + +__m256 test_mm256_cmp_ps_true(__m256 a, __m256 b) { + // CHECK-LABEL: @test_mm256_cmp_ps_true + // CHECK: store <8 x float> zeroinitializer, <8 x float>* %tmp, align 32 + return _mm256_cmp_ps(a, b, _CMP_FALSE_OQ); +} + +__m256 test_mm256_cmp_pd_false(__m256 a, __m256 b) { + // CHECK-LABEL: @test_mm256_cmp_pd_false + // CHECK: store <4 x double> zeroinitializer, <4 x double>* %tmp, align 32 + return _mm256_cmp_pd(a, b, _CMP_FALSE_OQ); +} + +__m256 test_mm256_cmp_ps_strue(__m256 a, __m256 b) { + // CHECK-LABEL: @test_mm256_cmp_ps_strue + // CHECK: store <8 x float> zeroinitializer, <8 x float>* %tmp, align 32 + return _mm256_cmp_ps(a, b, _CMP_FALSE_OS); +} + +__m256 test_mm256_cmp_pd_sfalse(__m256 a, __m256 b) { + // CHECK-LABEL: @test_mm256_cmp_pd_sfalse + // CHECK: store <4 x double> zeroinitializer, <4 x double>* %tmp, align 32 + return _mm256_cmp_pd(a, b, _CMP_FALSE_OS); +} Index: lib/CodeGen/CGBuiltin.cpp === --- lib/CodeGen/CGBuiltin.cpp +++ lib/CodeGen/CGBuiltin.cpp @@ -7930,12 +7930,32 @@ ID = Intrinsic::x86_sse_cmp_ps; break; case X86::BI__builtin_ia32_cmpps256: + // _CMP_TRUE_UQ, _CMP_TRUE_US would produce -1,-1... vector + // on any input and _CMP_FALSE_OQ, _CMP_FALSE_OS produces 0, 0... + if (CC == 0xf || CC == 0xb || CC == 0x1b || CC == 0x1f) { + Value *Constant = (CC == 0xf || CC == 0x1f) ? +llvm::Constant::getAllOnesValue(Builder.getInt32Ty()) : +llvm::Constant::getNullValue(Builder.getInt32Ty()); + Value *Vec = Builder.CreateVectorSplat(Ops[0]->getType()->getVectorNumElements(), + Constant); + return Builder.CreateBitCast(Vec, Ops[0]->getType()); + } ID = Intrinsic::x86_avx_cmp_ps_256; break; case X86::BI__builtin_ia32_cmppd: ID = Intrinsic::x86_sse2_cmp_pd; break; case X86::BI__builtin_ia32_cmppd256: + // _CMP_TRUE_UQ, _CMP_TRUE_US would produce -1,-1... vector + // on any input and _CMP_FALSE_OQ, _CMP_FALSE_OS produces 0, 0... + if (CC == 0xf || CC == 0xb || CC == 0x1b || CC == 0x1f) { + Value *Constant = (CC == 0xf || CC == 0x1f) ? +llvm::Constant::getAllOnesValue(Builder.getInt64Ty()) : +llvm::Constant::getNullValue(Builder.getInt64Ty()); + Value *Vec = Builder.CreateVectorSplat(Ops[0]->getType()->getVectorNumElements(), + Constant); + return Builder.CreateBitCast(Vec, Ops[0]->getType()); + } ID = Intrinsic::x86_avx_cmp_pd_256; break; } Index: test/CodeGen/avx-builtins.c === --- test/CodeGen/avx-builtins.c +++ test/CodeGen/avx-builtins.c @@ -1427,3 +1427,51 @@ // CHECK: extractelement <8 x float> %{{.*}}, i32 0 return _mm256_cvtss_f32(__a); } + +__m256 test_mm256_cmp_ps_true(__m256 a, __m256 b) { + // CHECK-LABEL: @test_mm256_cmp_ps_true + // CHECK: store <8 x float> zeroinitializer, <8 x float>* %tmp, align 32 + return _mm256_cmp_ps(a, b, _CMP_FALSE_OQ); +} + +__m256 test_mm256_cmp_pd_false(__m256 a, __m256 b) { + // CHECK-LABEL: @test_mm256_cmp_pd_false + // CHECK: store <4 x double> zeroinitializer, <4 x double>* %tmp, align 32 + return _mm256_cmp_pd(a, b, _CMP_FALSE_OQ); +} + +__m256 test_mm256_cmp_ps_strue(__m256 a, __m256 b) { + // CHECK-LABEL: @test_mm256_cmp_ps_strue + // CHECK: store <8 x float> zeroinitializer, <8 x float>* %tmp, align 32 + return _mm256_cmp_ps(a, b, _CMP_FALSE_OS); +} + +__m256 test_mm256_cmp_pd_sfalse(__m256 a, __m256 b) { + // CHECK-LABEL: @test_mm256_cmp_pd_sfalse + // CHECK: store <4 x double> zeroinitializer, <4 x double>* %tmp, align 32 + return _mm256_cmp_pd(a, b, _CMP_FALSE_OS); +} Index: lib/CodeGen/CGBuiltin.cpp === --- lib/CodeGen/CGBuiltin.cpp +++ lib/CodeGen/CGBuiltin.cpp @@ -7930,12 +7930,32 @@ ID = Intrinsic::x86_sse_cmp_ps; break; case X86::BI__builtin_ia32_cmpps256: + // _CMP_TRUE_UQ, _CMP_TRUE_US would produce -1,-1... vector + // on any input and _CMP_FALSE_OQ, _CMP_FALSE_OS produces 0, 0... + if (CC == 0xf || CC == 0xb || CC == 0x1b || CC == 0x1f) { + Value *Constant = (CC == 0xf || CC == 0x1f) ? +
[PATCH] D33406: PR28129 expand vector oparation to an IR constant.
dtemirbulatov added a comment. Ping. [andrew.w.kaylor, scanon] Is it OK to assume that FP exceptions are off by default and allow such transformation to constants in the IR since we know that we would have exception with "1.00 -nan" for _mm256_cmp_ps(a, b, 15)? https://reviews.llvm.org/D33406 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D33406: PR28129 expand vector oparation to an IR constant.
dtemirbulatov added a comment. > We should've asked this first: is that fold allowed in the default FPENV > state that we assume that clang is operating in? I suppose it is FE_ALL_EXCEPT. https://reviews.llvm.org/D33406 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D33406: PR28129 expand vector oparation to an IR constant.
spatel added inline comments. Comment at: lib/CodeGen/CGBuiltin.cpp:7932 break; case X86::BI__builtin_ia32_cmppd256: ID = Intrinsic::x86_avx_cmp_pd_256; 1. Should we handle the 'pd256' version the same way? 2. How about the 0xb ('false') constant? It should produce a zero here? 3. Can or should we deal with the signalling versions (0x1b, 0x1f) too? https://reviews.llvm.org/D33406 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D33406: PR28129 expand vector oparation to an IR constant.
RKSimon added a comment. Test _mm256_cmp_pd as well? Comment at: lib/CodeGen/CGBuiltin.cpp:7922 case X86::BI__builtin_ia32_cmpps256: + if (CC == 0xf) { + Value *Vec = Builder.CreateVectorSplat(Ops[0]->getType()->getVectorNumElements(), You need a comment here - explain what the constant represents and what the transform does. Comment at: test/CodeGen/avx-builtins.c:1434 + // CHECK: store <8 x float> https://reviews.llvm.org/D33406 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits