[PATCH] D156040: [AMDGPU] Add dynamic stack bit info to kernel-resource-usage Rpass output

2023-07-24 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In D156040#4528606 , @arsenm wrote: > In D156040#4526036 , > @JonChesterfield wrote: > >> I don't see how this conveys any information. The compiler writes the stack >> size to be alloc

[PATCH] D154790: [HIP] Use native math functions for `-fcuda-approx-transcendentals`

2023-07-09 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added inline comments. Comment at: clang/lib/Headers/__clang_hip_math.h:304 +__DEVICE__ +float __tanf(float __x) { return __ocml_tan_f32(__x); } +// END INTRINSICS We could consider multiplying native_sin here with the native_recip of native_cos. CHAN

[PATCH] D154123: [HIP] Start document HIP support by clang

2023-06-29 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added inline comments. Comment at: clang/docs/HIPSupport.rst:104 + - This macro is defined when the GPU default stream kind is set to per-thread. + Should we include the __gfxNNN__ or __GFXN__ macros here? What about wave size, and CU mode? And w

[PATCH] D148796: [AMDGPU][GFX908] Add builtin support for global add atomic f16/f32

2023-04-20 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In D148796#4284504 , @rampitec wrote: > We used to support it that way and decided just not doing it. It is very hard > to explain why a supported atomic results in error. Someone who really needs > it can use intrinsic. I ten

[PATCH] D146840: [AMDGPU] Replace target feature for global fadd32

2023-03-28 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. No objection here. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D146840/new/ https://reviews.llvm.org/D146840 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lis

[PATCH] D145343: [AMDGPU] Emit predefined macro `__AMDGCN_CUMODE_OPTION`

2023-03-05 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In D145343#4170305 , @yaxunl wrote: > In D145343#4170250 , @arsenm wrote: > >> I think exposing whether or not the flag was used is weird/broken, as is >> including _OPTION in the name.

[PATCH] D142507: [AMDGPU] Split dot7 feature

2023-02-14 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. > My current understanding is the c-p will go into already forked clang-16, but > not to rocm 5.4. So rocm device-libs will be accompanied by the older > clang-16 w/o this and stay compatible. Someone building from scratch will use > latest clang-16 and staging devic

[PATCH] D142507: [AMDGPU] Split dot7 feature

2023-02-14 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In D142507#4127382 , @rampitec wrote: > In D142507#4127374 , @aaronmondal > wrote: > >> I think unless conflicts arise creating an issue similar to this >> https://github.com/llvm/llvm-

[PATCH] D138507: HIP: Directly use sqrt builtins instead of calling ocml (f32 case)

2022-11-22 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. __builtin_sqrtf does not produce a correctly rounded result. I don't recommend this change. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138507/new/ https://reviews.llvm.org/D138507 ___ cfe-commits mailing list cf

[PATCH] D136981: [HIP] add float to fp16 convert functions

2022-10-28 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Thank you! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D136981/new/ https://reviews.llvm.org/D136981 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commit

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-10-11 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. > Different functions providing different behaviors can be handled at link time > like any other function, instead of the same functions providing different > behaviors per translation unit and requires cloning. The current scheme > transfers complexity from the device

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-10-11 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In D130096#3850473 , @arsenm wrote: > In D130096#3850472 , @jhuber6 wrote: > >> I don't like the fact that we need to have two different kinds of control >> constants, one per-TU and oth

[PATCH] D135614: [OpenMP][CUDA][AMDGPU] Accept case insensitive subarchitecture names

2022-10-11 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Also, we may want to use uppercase for other purposes in the future. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135614/new/ https://reviews.llvm.org/D135614 ___ cfe-commits m

[PATCH] D135614: [OpenMP][CUDA][AMDGPU] Accept case insensitive subarchitecture names

2022-10-11 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. I don't particularly see a need for this. I am not opposed to a "did you mean" in the error diagnostic. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135614/new/ https://reviews.llvm.org/D135614

[PATCH] D134355: [AMDGPU] Emit module flag for all code object versions

2022-09-22 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In D134355#3809294 , @yaxunl wrote: > In D134355#3807435 , @cfang wrote: > >> LGTM >> >> Should the module flag name be amdgpu_code_object_version or >> amdhsa_code_object_version? > > G

[PATCH] D132140: [AMDGPU] Add builtin s_sendmsg_rtn

2022-08-19 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In D132140#3732337 , @yaxunl wrote: > revised by Brian's comments Thank you. Looks good to me. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D132140/new/ https://reviews.llvm.org/D132140 _

[PATCH] D132140: [AMDGPU] Add builtin s_sendmsg_rtn_b{32|64}

2022-08-18 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Following existing naming, it might make sense to rename "rtn_b32" --> "rtn" and "rtn_b64" --> "rtnl". CHANGES SINCE LAST ACTION https://reviews.llvm.org/D132140/new/ https://reviews.llvm.org/D132140 ___ cfe-commits maili

[PATCH] D128022: [HIP] add -fhip-kernel-arg-name

2022-06-23 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added inline comments. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1845-1846 + } + if (getCodeGenOpts().EmitOpenCLArgMetadata || + getCodeGenOpts().HIPSaveKernelArgName) Fn->setMetadata("kernel_arg_name", yaxunl wrote: > tra wrote: > > ya

[PATCH] D124537: [AMDGPU][clang] Definition of gfx11 subtarget

2022-04-27 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. t-tye should review this too. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D124537/new/ https://reviews.llvm.org/D124537 ___ cfe-commits mailing list cfe-commits@lists.llvm.org

[PATCH] D114957: [AMDGPU] Change llvm.amdgcn.image.bvh.intersect.ray to take vec3 args

2021-12-02 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Can we agree to drop this in LLVM 15 and note that in a comment or elsewhere? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D114957/new/ https://reviews.llvm.org/D114957 ___ cfe-

[PATCH] D114957: [AMDGPU] Change llvm.amdgcn.image.bvh.intersect.ray to take vec3 args

2021-12-02 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In D114957#3166936 , @foad wrote: > In D114957#3166858 , @yaxunl wrote: > >> In D114957#3166817 , @foad wrote: >> >>> This is a flag-day change t

[PATCH] D114957: [AMDGPU] Change llvm.amdgcn.image.bvh.intersect.ray to take vec3 args

2021-12-02 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In D114957#3166882 , @yaxunl wrote: > In D114957#3166861 , @arsenm wrote: > >> In D114957#3166858 , @yaxunl wrote: >> >>> In D114957#3166817

[PATCH] D90809: [amdgpu] Add `llvm.amdgcn.endpgm` support.

2020-11-05 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Should this also be IntrConvergent? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D90809/new/ https://reviews.llvm.org/D90809 ___ cfe-commits mailing list cfe-commits@lists.llvm.

[PATCH] D77910: AMDGPU: Define cl_khr_gl_sharing as a supported extension

2020-05-02 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In D77910#2015225 , @arsenm wrote: > In D77910#1989163 , @b-sumner wrote: > > > In D77910#1988807 , @arsenm wrote: > > > > > In D77910#1981828

[PATCH] D75917: Expose llvm fence instruction as clang intrinsic

2020-04-22 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added inline comments. Comment at: clang/test/CodeGenCXX/builtin-amdgcn-fence-failure.cpp:5 + +void test_amdgcn_fence_failure() { + JonChesterfield wrote: > arsenm wrote: > > Does this really depend on C++? Can it use OpenCL like the other builtin > > t

[PATCH] D77910: AMDGPU: Define cl_khr_gl_sharing as a supported extension

2020-04-17 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In D77910#1988807 , @arsenm wrote: > In D77910#1981828 , @b-sumner wrote: > > > In D77910#1981429 , @arsenm wrote: > > > > > In D77910#1976171

[PATCH] D77910: AMDGPU: Define cl_khr_gl_sharing as a supported extension

2020-04-14 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In D77910#1981429 , @arsenm wrote: > In D77910#1976171 , @b-sumner wrote: > > > I don't think we can guarantee this is or will be supported on all devices. > > The language runtime makes

[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-13 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In my opinion, for on-line compile for OpenCL, the platform is responsible for setting `__OPENCL_VERSION__`. Also, it should be the platform's choice as to how to respond the image support query and how `__IMAGE_SUPPORT__` is set. For offline compile, it doesn't seem

[PATCH] D77910: AMDGPU: Define cl_khr_gl_sharing as a supported extension

2020-04-11 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. I don't think we can guarantee this is or will be supported on all devices. The language runtime makes this decision. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77910/new/ https://reviews.llvm.org/D77910 ___ c

[PATCH] D75917: Expose llvm fence instruction as clang intrinsic

2020-04-08 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In addition to predefining `__ATOMIC_RELAXED`, etc., clang also predefines `__OPENCL_MEMORY_SCOPE_WORK_ITEM` and friends. So it doesn't really seem unreasonable for clang to also predefine its known syncscopes, and to require the argument to be one of those integers.

[PATCH] D77390: Fix __builtin_amdgcn_workgroup_size_x/y/z return type

2020-04-03 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77390/new/ https://reviews.llvm.org/D77390 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D75917: Expose llvm fence instruction as clang intrinsic

2020-04-02 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Please go ahead and update to a string for the scope. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D75917/new/ https://reviews.llvm.org/D75917 ___ cfe-commits mailing list cfe-commits@lists.

[PATCH] D76795: [HIP] Change default --gpu-max-threads-per-block value to 1024

2020-03-25 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Thanks. This looks fine to me. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76795/new/ https://reviews.llvm.org/D76795 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailma

[PATCH] D76795: [HIP] Change default --gpu-max-threads-per-block value to 1024

2020-03-25 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:8123 +// --gpu-max-threads-per-block=n or its default value for HIP. +const unsigned OpenCLMaxWorkGroupSize = 256; +const unsigned MaxWorkGroupSize = I'd like to see the word

[PATCH] D74910: [OpenCL] Remove spurious atomic_fetch_min/max builtins

2020-02-20 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. I recall we agreed that conformance tests using mixed types were broken, so this change should be OK. Hopefully this will not affect users. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D74910/new/ https://reviews.llvm.org/D74910 ___

[PATCH] D74807: Add cl_khr_mipmap_image_writes as supported to AMDGPU

2020-02-19 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner accepted this revision. b-sumner added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D74807/new/ https://reviews.llvm.org/D74807 ___ cfe-commits mailing list cfe-commi

[PATCH] D71365: expand printf when compiling HIP to AMDGPU

2020-01-07 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Should this be looking forward to also handling OpenCL, which does require vector support? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D71365/new/ https://reviews.llvm.org/D71365 __

[PATCH] D66198: AMDGPU: Add builtins for is_local/is_private

2019-08-14 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Looks fine to me. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66198/new/ https://reviews.llvm.org/D66198 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe

[PATCH] D66198: AMDGPU: Add builtins for is_local/is_private

2019-08-14 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Looks fine to me. Thanks! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66198/new/ https://reviews.llvm.org/D66198 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/lis

[PATCH] D62739: AMDGPU: Always emit amdgpu-flat-work-group-size

2019-06-14 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In D62739#1543438 , @arsenm wrote: > In D62739#1536428 , @b-sumner wrote: > > > We need to communicate with anyone generating IR to ensure this is being > > generated before we change the

[PATCH] D62739: AMDGPU: Always emit amdgpu-flat-work-group-size

2019-06-10 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. We need to communicate with anyone generating IR to ensure this is being generated before we change the default. clang is only one of those generators. This change will also need to be documented in the usage document. CHANGES SINCE LAST ACTION https://reviews.llv

[PATCH] D62739: AMDGPU: Always emit amdgpu-flat-work-group-size

2019-05-31 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added inline comments. Comment at: lib/CodeGen/TargetInfo.cpp:7885 +// By default, restrict the maximum size to 256. +F->addFnAttr("amdgpu-flat-work-group-size", "128,256"); } Theoretically, shouldn't the minimum be 1? CHANGES SINCE LAST ACT

[PATCH] D61112: AMDGPU: Enable _Float16

2019-04-25 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Looks good to me. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D61112/new/ https://reviews.llvm.org/D61112 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe

[PATCH] D59494: AMDGPU: Add support for cross address space synchronization scopes (clang)

2019-03-18 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added inline comments. Comment at: lib/CodeGen/TargetInfo.cpp:7976 + +Name = Twine(Twine(Name) + Twine("one-as")).str(); + } rampitec wrote: > b-sumner wrote: > > kzhuravl wrote: > > > rampitec wrote: > > > > I think subgroup is in the single addres

[PATCH] D59494: AMDGPU: Add support for cross address space synchronization scopes (clang)

2019-03-18 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Comment at: lib/CodeGen/TargetInfo.cpp:7976 + +Name = Twine(Twine(Name) + Twine("one-as")).str(); + } kzhuravl wrote: > rampitec wrote: > > I think subgroup is in the single address space even if sequentially > > consistent. > I

[PATCH] D57831: AMDGPU: set wchar_t and wint_t to be unsigned short on windows

2019-02-06 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Maybe there are already other types like this, but it saddens me that an offline compiled code object could potentially not work properly if the application is using any of these types. Or should the runtime try to detect a problem using argument metadata? CHANGES S

[PATCH] D52320: AMDGPU: add __builtin_amdgcn_update_dpp

2018-10-16 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Ping. There's quite a bit of interest in getting this exposed by clang. https://reviews.llvm.org/D52320 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D52320: AMDGPU: add __builtin_amdgcn_update_dpp

2018-09-28 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added inline comments. Comment at: lib/CodeGen/CGBuiltin.cpp:11313-11315 + case AMDGPU::BI__builtin_amdgcn_update_dpp: { +llvm::SmallVector Args; +for (unsigned I = 0; I != 6; ++I) arsenm wrote: > The only difference between this and mov_dpp is

[PATCH] D50376: AMDGPU: Fix enabling denormals by default on pre-VI targets

2018-08-07 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. This approach seems fine to me. https://reviews.llvm.org/D50376 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D48667: [HIP] Fix ordering of device-lib linking

2018-06-27 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Thanks, looks good. Repository: rC Clang https://reviews.llvm.org/D48667 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D48493: [HIP] Support flush denorms bitcode

2018-06-22 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. LGTM Repository: rC Clang https://reviews.llvm.org/D48493 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D46601: [OpenCL] Fix typos in emitted enqueue kernel function names

2018-05-08 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Thanks! Looks good to me. https://reviews.llvm.org/D46601 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D39739: [HCC] Add flag to Import Weak Functions in Function Importer

2018-03-22 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In https://reviews.llvm.org/D39739#1045611, @ashi1 wrote: > Is first one encountered a poor design? Strong or first weak is the standard behavior for ISA level linkers. Repository: rL LLVM https://reviews.llvm.org/D39739

[PATCH] D43911: [AMDGPU] Clean up old address space mapping and fix constant address space value

2018-03-02 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Looks fine to me. https://reviews.llvm.org/D43911 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D43340: Clean up AMDGCN tests

2018-02-15 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner accepted this revision. b-sumner added a comment. Looks good to me. https://reviews.llvm.org/D43340 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D43281: [AMDGPU] fixes for lds f32 builtins

2018-02-14 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added inline comments. Comment at: lib/CodeGen/CGBuiltin.cpp:9866 + case AMDGPU::BI__builtin_amdgcn_ds_fmax: { +llvm::SmallVector Args; +for (unsigned I = 0; I != 5; ++I) Can the pointer argument address space be checked here? Repository: rC

[PATCH] D42578: [AMDGPU] Add ds_fadd, ds_fmin, ds_fmax builtins functions

2018-01-29 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Should we expect that the last 3 arguments have any effect? Do we want to test to ensure they have the expected effects? Repository: rC Clang https://reviews.llvm.org/D42578 ___ cfe-commits mailing list cfe-commits@lis

[PATCH] D42578: [AMDGPU] Add ds_fadd builtin function

2018-01-26 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Were you going to add min and max separately? Repository: rC Clang https://reviews.llvm.org/D42578 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D39739: [HCC] Add flag to Import Weak Functions in Function Importer

2017-12-05 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. The usual rule is to take the first weak definition encountered. Repository: rL LLVM https://reviews.llvm.org/D39739 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/

[PATCH] D37568: [AMDGPU] Allow flexible register names in inline asm constraints

2017-09-28 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner accepted this revision. b-sumner added a comment. This revision is now accepted and ready to land. LGTM. I think we can leave immediates to another patch. https://reviews.llvm.org/D37568 ___ cfe-commits mailing list cfe-commits@lists.llvm.

[PATCH] D37822: [OpenCL] Clean up and add missing fields for block struct

2017-09-21 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In https://reviews.llvm.org/D37822#877903, @Anastasia wrote: > In https://reviews.llvm.org/D37822#877572, @yaxunl wrote: > > > In https://reviews.llvm.org/D37822#873876, @Anastasia wrote: > > > > > In https://reviews.llvm.org/D37822#872446, @yaxunl wrote: > > > > > > > I

[PATCH] D37703: [AMDGPU] Change addr space of clk_event_t, queue_t and reserve_id_t to global

2017-09-13 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner accepted this revision. b-sumner added a comment. This revision is now accepted and ready to land. Looks good to me. https://reviews.llvm.org/D37703 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailm

[PATCH] D37568: [AMDGPU] Allow flexible register names in inline asm constraints

2017-09-07 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. The assembler accepts v[N] in addition to vN. I'm not sure if that is needed here. https://reviews.llvm.org/D37568 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe

[PATCH] D36802: AMDGPU: Cleanup most of the macros

2017-08-28 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added inline comments. Comment at: lib/Basic/Targets/AMDGPU.cpp:362 +Builder.defineMacro(Twine("__") + Twine(GPUName)); +Builder.defineMacro(Twine("__") + Twine(GPUName) + Twine("__")); + } At the meeting we discussed defining every alias of the

[PATCH] D32896: [OpenCL] Make CLK_NULL_RESERVE_ID invalid reserve id.

2017-08-09 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added inline comments. Comment at: lib/Headers/opencl-c.h:16020 +// The macro CLK_NULL_RESERVE_ID refers to an invalid reservation ID. +#define CLK_NULL_RESERVE_ID (__builtin_astype((void *)0, reserve_id_t)) bool __ovld is_valid_reserve_id(reserve_id_t reserve_id);

[PATCH] D36171: AMDGPU: Use direct struct returns

2017-08-08 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added inline comments. Comment at: lib/CodeGen/TargetInfo.cpp:7571 + + // XXX: Should this be i64 instead, and should the limit increase? + llvm::Type *I32Ty = llvm::Type::getInt32Ty(getVMContext()); arsenm wrote: > b-sumner wrote: > > What we

[PATCH] D36171: AMDGPU: Use direct struct returns

2017-08-08 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added inline comments. Comment at: lib/CodeGen/TargetInfo.cpp:7571 + + // XXX: Should this be i64 instead, and should the limit increase? + llvm::Type *I32Ty = llvm::Type::getInt32Ty(getVMContext()); What we do here depends on NumRegsLeft when

[PATCH] D36327: [OpenCL] Allow targets emit optimized pipe functions for power of 2 type sizes

2017-08-08 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In https://reviews.llvm.org/D36327#834032, @Anastasia wrote: > In https://reviews.llvm.org/D36327#833891, @yaxunl wrote: > > > In https://reviews.llvm.org/D36327#833653, @bader wrote: > > > > > Hi Sam, > > > > > > What do you think about implementing this optimization in

[PATCH] D36171: AMDGPU: Use direct struct returns

2017-08-07 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added inline comments. Comment at: lib/CodeGen/TargetInfo.cpp:7555 + if (NumRegsLeft > 0) +NumRegsLeft -= (Size + 31) / 32; + Won't NumRegsLeft wrap if size==64 and NumRegsLeft == 1 potentially causing an assert later? https://reviews.llv

[PATCH] D28691: Add OpenCL 2.0 atomic builtin functions as Clang builtin

2017-07-25 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In https://reviews.llvm.org/D28691#820595, @rjmccall wrote: > In https://reviews.llvm.org/D28691#820541, @b-sumner wrote: > > > There are other languages for heterogeneous compute that have scopes, > > although not exposed quite as explicitly as OpenCL. For example AMD

[PATCH] D28691: Add OpenCL 2.0 atomic builtin functions as Clang builtin

2017-07-25 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. In https://reviews.llvm.org/D28691#820526, @rjmccall wrote: > In https://reviews.llvm.org/D28691#820489, @yaxunl wrote: > > > In https://reviews.llvm.org/D28691#820466, @b-sumner wrote: > > > > > Can we drop the "opencl" part of the name and use something like > > > __s

[PATCH] D28691: Add OpenCL 2.0 atomic builtin functions as Clang builtin

2017-07-25 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. Can we drop the "opencl" part of the name and use something like __scoped_atomic_*? Also, it may not make sense to support non-constant scope here since we can't predict what other scopes may be added by other languages in the future. https://reviews.llvm.org/D2869

[PATCH] D28691: Add OpenCL 2.0 atomic builtin functions as Clang builtin

2017-07-25 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added inline comments. Comment at: include/clang/Basic/Builtins.def:713 +ATOMIC_BUILTIN(__opencl_atomic_fetch_or, "v.", "t") +ATOMIC_BUILTIN(__opencl_atomic_fetch_xor, "v.", "t") + yaxunl wrote: > Anastasia wrote: > > What about min/max? I believe they w

[PATCH] D30551: [AMDGPU] Add builtin functions readlane ds_permute mov_dpp

2017-03-02 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment. mov_dpp should be under the VI+ comment https://reviews.llvm.org/D30551 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits