[PATCH] D77923: OpenCL: Fix some missing predefined macros

2021-03-10 Thread Anastasia Stulova via Phabricator via cfe-commits
Anastasia added a comment.
Herald added a subscriber: dexonsmith.

FYI there has been a related spec change that makes this functionality optional 
in the offline compilation: 
https://github.com/KhronosGroup/OpenCL-Docs/pull/522/files


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-28 Thread Anastasia Stulova via Phabricator via cfe-commits
Anastasia added a comment.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

The current solution seems very specific and doesn't provide generic uniform 
mechanisms for the targets. Would something based on a triple component not 
work?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-16 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In D77923#1987795 , @jvesely wrote:

> will this trigger extra warnings for the user?


No.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-16 Thread Jan Vesely via Phabricator via cfe-commits
jvesely added a comment.

In D77923#1983787 , @yaxunl wrote:

> LGTM. The objective of the change is to simplify offline compilation. We want 
> to avoid asking users to add -D if a proper macro can be inferred from triple 
> and cpu. Ideally, users only need to specify triple and cpu with clang 
> driver. In case users want to override the default macros for the triple, 
> they can always override them by -D or -U in command line.


will this trigger extra warnings for the user? 
as long as there's a way to set custom values I'm fine.
I just don't understand why this can' be appended to the command line by the 
offline compiler distributed with the platform (and based on release if needed).
The platform will need to get the same number + any additional information it 
wants to report in `clGetDeviceInfo(CL_DEVICE_VERSION)` anyway so having it in 
one place would be prudent.

> The question is whether the default value of the macro is proper. To me this 
> change maintained the old behavior for spir and other targets and the default 
> value seems fine for amdgcn.




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

LGTM. The objective of the change is to simplify offline compilation. We want 
to avoid asking users to add -D if a proper macro can be inferred from triple 
and cpu. Ideally, users only need to specify triple and cpu with clang driver. 
In case users want to override the default macros for the triple, they can 
always override them by -D or -U in command line.

The question is whether the default value of the macro is proper. To me this 
change maintained the old behavior for spir and other targets and the default 
value seems fine for amdgcn.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-15 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 257684.
arsenm added a comment.

Attach correct diff


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923

Files:
  clang/include/clang/Basic/TargetInfo.h
  clang/lib/Basic/Targets/AMDGPU.cpp
  clang/lib/Basic/Targets/AMDGPU.h
  clang/lib/Basic/Targets/SPIR.h
  clang/lib/Frontend/InitPreprocessor.cpp
  clang/test/Preprocessor/predefined-macros.c
  llvm/include/llvm/Support/TargetParser.h
  llvm/lib/Support/TargetParser.cpp

Index: llvm/lib/Support/TargetParser.cpp
===
--- llvm/lib/Support/TargetParser.cpp
+++ llvm/lib/Support/TargetParser.cpp
@@ -72,36 +72,36 @@
   {{"oland"}, {"gfx601"},  GK_GFX601,  FEATURE_NONE},
   {{"pitcairn"},  {"gfx601"},  GK_GFX601,  FEATURE_NONE},
   {{"verde"}, {"gfx601"},  GK_GFX601,  FEATURE_NONE},
-  {{"gfx700"},{"gfx700"},  GK_GFX700,  FEATURE_NONE},
-  {{"kaveri"},{"gfx700"},  GK_GFX700,  FEATURE_NONE},
-  {{"gfx701"},{"gfx701"},  GK_GFX701,  FEATURE_FAST_FMA_F32},
-  {{"hawaii"},{"gfx701"},  GK_GFX701,  FEATURE_FAST_FMA_F32},
-  {{"gfx702"},{"gfx702"},  GK_GFX702,  FEATURE_FAST_FMA_F32},
-  {{"gfx703"},{"gfx703"},  GK_GFX703,  FEATURE_NONE},
-  {{"kabini"},{"gfx703"},  GK_GFX703,  FEATURE_NONE},
-  {{"mullins"},   {"gfx703"},  GK_GFX703,  FEATURE_NONE},
-  {{"gfx704"},{"gfx704"},  GK_GFX704,  FEATURE_NONE},
-  {{"bonaire"},   {"gfx704"},  GK_GFX704,  FEATURE_NONE},
-  {{"gfx801"},{"gfx801"},  GK_GFX801,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32},
-  {{"carrizo"},   {"gfx801"},  GK_GFX801,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32},
-  {{"gfx802"},{"gfx802"},  GK_GFX802,  FEATURE_FAST_DENORMAL_F32},
-  {{"iceland"},   {"gfx802"},  GK_GFX802,  FEATURE_FAST_DENORMAL_F32},
-  {{"tonga"}, {"gfx802"},  GK_GFX802,  FEATURE_FAST_DENORMAL_F32},
-  {{"gfx803"},{"gfx803"},  GK_GFX803,  FEATURE_FAST_DENORMAL_F32},
-  {{"fiji"},  {"gfx803"},  GK_GFX803,  FEATURE_FAST_DENORMAL_F32},
-  {{"polaris10"}, {"gfx803"},  GK_GFX803,  FEATURE_FAST_DENORMAL_F32},
-  {{"polaris11"}, {"gfx803"},  GK_GFX803,  FEATURE_FAST_DENORMAL_F32},
-  {{"gfx810"},{"gfx810"},  GK_GFX810,  FEATURE_FAST_DENORMAL_F32},
-  {{"stoney"},{"gfx810"},  GK_GFX810,  FEATURE_FAST_DENORMAL_F32},
-  {{"gfx900"},{"gfx900"},  GK_GFX900,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32},
-  {{"gfx902"},{"gfx902"},  GK_GFX902,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32},
-  {{"gfx904"},{"gfx904"},  GK_GFX904,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32},
-  {{"gfx906"},{"gfx906"},  GK_GFX906,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32},
-  {{"gfx908"},{"gfx908"},  GK_GFX908,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32},
-  {{"gfx909"},{"gfx909"},  GK_GFX909,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32},
-  {{"gfx1010"},   {"gfx1010"}, GK_GFX1010, FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32|FEATURE_WAVE32},
-  {{"gfx1011"},   {"gfx1011"}, GK_GFX1011, FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32|FEATURE_WAVE32},
-  {{"gfx1012"},   {"gfx1012"}, GK_GFX1012, FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32|FEATURE_WAVE32},
+  {{"gfx700"},{"gfx700"},  GK_GFX700,  FEATURE_FLAT_ADDRESS_SPACE},
+  {{"kaveri"},{"gfx700"},  GK_GFX700,  FEATURE_FLAT_ADDRESS_SPACE},
+  {{"gfx701"},{"gfx701"},  GK_GFX701,  FEATURE_FAST_FMA_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"hawaii"},{"gfx701"},  GK_GFX701,  FEATURE_FAST_FMA_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"gfx702"},{"gfx702"},  GK_GFX702,  FEATURE_FAST_FMA_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"gfx703"},{"gfx703"},  GK_GFX703,  FEATURE_FLAT_ADDRESS_SPACE},
+  {{"kabini"},{"gfx703"},  GK_GFX703,  FEATURE_FLAT_ADDRESS_SPACE},
+  {{"mullins"},   {"gfx703"},  GK_GFX703,  FEATURE_FLAT_ADDRESS_SPACE},
+  {{"gfx704"},{"gfx704"},  GK_GFX704,  FEATURE_FLAT_ADDRESS_SPACE},
+  {{"bonaire"},   {"gfx704"},  GK_GFX704,  FEATURE_FLAT_ADDRESS_SPACE},
+  {{"gfx801"},{"gfx801"},  GK_GFX801,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"carrizo"},   {"gfx801"},  GK_GFX801,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"gfx802"},{"gfx802"},  GK_GFX802,  FEATURE_FAST_DENORMAL_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"iceland"},   {"gfx802"},  GK_GFX802,  FEATURE_FAST_DENORMAL_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"tonga"}, {"gfx802"},  GK_GFX802,  FEATURE_FAST_DENORMAL_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"gfx803"},{"gfx803"},  GK_GFX803,  FEATURE_FAST_DENORMAL_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"fiji"},  {"gfx803"},  GK_GFX803,  FEATURE_FAST_DENORMAL_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"polaris10"}, {"gfx803"},  GK_GFX803,  FEATURE_FAST_DENORMAL_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"polaris11"}, {"gfx803"},  GK_GFX803,  FEATURE_FAST_DENORMAL_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"gfx810"},{"gfx810"},  GK_GFX810,  

[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-15 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 257681.
arsenm added a comment.

Check triple for support. Report 2.0 for -amdhsa and -amdpal with flat support, 
but 1.2 for clover/-mesa3d. Also require targets to explicitly set a value to 
define, rather than defaulting.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923

Files:
  clang/include/clang/Basic/TargetInfo.h
  clang/lib/Basic/Targets/AMDGPU.cpp
  clang/lib/Frontend/InitPreprocessor.cpp
  clang/test/Preprocessor/predefined-macros.c


Index: clang/test/Preprocessor/predefined-macros.c
===
--- clang/test/Preprocessor/predefined-macros.c
+++ clang/test/Preprocessor/predefined-macros.c
@@ -183,10 +183,21 @@
 // CHECK-AMDGCN-GFX6: #define __IMAGE_SUPPORT__ 1
 // CHECK-AMDGCN-GFX6: #define __OPENCL_VERSION__ 120{{$}}
 
+// No set OS or mesa3d, assume CL1.2
 // RUN: %clang_cc1 %s -E -dM -o - -x cl -triple amdgcn-unknown-unknown 
-target-cpu gfx700 \
 // RUN:   | FileCheck -match-full-lines %s --check-prefix=CHECK-AMDGCN-GFX7
+// RUN: %clang_cc1 %s -E -dM -o - -x cl -triple amdgcn-unknown-mesa3d 
-target-cpu gfx700 \
+// RUN:   | FileCheck -match-full-lines %s --check-prefix=CHECK-AMDGCN-GFX7
 // CHECK-AMDGCN-GFX7: #define __IMAGE_SUPPORT__ 1
-// CHECK-AMDGCN-GFX7: #define __OPENCL_VERSION__ 200{{$}}
+// CHECK-AMDGCN-GFX7: #define __OPENCL_VERSION__ 120{{$}}
+
+// Assume CL2.0 support for HSA and PAL
+// RUN: %clang_cc1 %s -E -dM -o - -x cl -triple amdgcn-unknown-amdhsa 
-target-cpu gfx700 \
+// RUN:   | FileCheck -match-full-lines %s 
--check-prefix=CHECK-AMDGCN-AMDHSAPAL
+// RUN: %clang_cc1 %s -E -dM -o - -x cl -triple amdgcn-unknown-amdpal 
-target-cpu gfx700 \
+// RUN:   | FileCheck -match-full-lines %s 
--check-prefix=CHECK-AMDGCN-AMDHSAPAL
+// CHECK-AMDGCN-AMDHSAPAL: #define __IMAGE_SUPPORT__ 1
+// CHECK-AMDGCN-AMDHSAPAL: #define __OPENCL_VERSION__ 200{{$}}
 
 
 // RUN: %clang_cc1 %s -E -dM -o - -x hip -triple amdgcn-amd-amdhsa \
Index: clang/lib/Frontend/InitPreprocessor.cpp
===
--- clang/lib/Frontend/InitPreprocessor.cpp
+++ clang/lib/Frontend/InitPreprocessor.cpp
@@ -450,8 +450,10 @@
   }
 }
 
-Builder.defineMacro("__OPENCL_VERSION__",
-Twine(TI.getMaxOpenCLSupportedVersion()));
+if (TI.getMaxOpenCLSupportedVersion() != 0) {
+  Builder.defineMacro("__OPENCL_VERSION__",
+  Twine(TI.getMaxOpenCLSupportedVersion()));
+}
 
 Builder.defineMacro("CL_VERSION_1_0", "100");
 Builder.defineMacro("CL_VERSION_1_1", "110");
Index: clang/lib/Basic/Targets/AMDGPU.cpp
===
--- clang/lib/Basic/Targets/AMDGPU.cpp
+++ clang/lib/Basic/Targets/AMDGPU.cpp
@@ -306,7 +306,13 @@
   MaxAtomicPromoteWidth = MaxAtomicInlineWidth = 64;
 
   SupportsOpenCLImages = true;
-  MaxOpenCLSupportedVersion =  hasFlatAddressSpace() ? 200 : 120;
+
+  // CL2.0 features require flat instruction support. ROCm supports CL2.0, but
+  // Clover does not.
+  const bool SupportsCL2 = hasFlatAddressSpace() &&
+(Triple.getOS() == llvm::Triple::AMDHSA ||
+ Triple.getOS() == llvm::Triple::AMDPAL);
+  MaxOpenCLSupportedVersion =  SupportsCL2 ? 200 : 120;
 }
 
 void AMDGPUTargetInfo::adjust(LangOptions ) {
Index: clang/include/clang/Basic/TargetInfo.h
===
--- clang/include/clang/Basic/TargetInfo.h
+++ clang/include/clang/Basic/TargetInfo.h
@@ -215,7 +215,10 @@
   unsigned MaxOpenCLWorkGroupSize;
 
   bool SupportsOpenCLImages = false;
-  unsigned MaxOpenCLSupportedVersion = 100;
+
+  /// Maximum supported device OpenCL version, corresponding to the values
+  /// expected for __OPENCL_VERSION__. If 0, the macro is not defined.
+  unsigned MaxOpenCLSupportedVersion = 0;
 
   // TargetInfo Constructor.  Default initializes all fields.
   TargetInfo(const llvm::Triple );


Index: clang/test/Preprocessor/predefined-macros.c
===
--- clang/test/Preprocessor/predefined-macros.c
+++ clang/test/Preprocessor/predefined-macros.c
@@ -183,10 +183,21 @@
 // CHECK-AMDGCN-GFX6: #define __IMAGE_SUPPORT__ 1
 // CHECK-AMDGCN-GFX6: #define __OPENCL_VERSION__ 120{{$}}
 
+// No set OS or mesa3d, assume CL1.2
 // RUN: %clang_cc1 %s -E -dM -o - -x cl -triple amdgcn-unknown-unknown -target-cpu gfx700 \
 // RUN:   | FileCheck -match-full-lines %s --check-prefix=CHECK-AMDGCN-GFX7
+// RUN: %clang_cc1 %s -E -dM -o - -x cl -triple amdgcn-unknown-mesa3d -target-cpu gfx700 \
+// RUN:   | FileCheck -match-full-lines %s --check-prefix=CHECK-AMDGCN-GFX7
 // CHECK-AMDGCN-GFX7: #define __IMAGE_SUPPORT__ 1
-// CHECK-AMDGCN-GFX7: #define __OPENCL_VERSION__ 200{{$}}
+// CHECK-AMDGCN-GFX7: #define __OPENCL_VERSION__ 120{{$}}
+
+// Assume CL2.0 support for HSA and PAL
+// RUN: %clang_cc1 %s -E -dM -o - 

[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-14 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In D77923#1981698 , @jvesely wrote:

> does passing `-U__OPENCL_VERSION__ -D __OPENCL_VERSION__=x.y` on the 
> commadline work as well? the header trick was my fallback path.


Yes the command line option -U or -D override predefined macros.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-14 Thread Jan Vesely via Phabricator via cfe-commits
jvesely added a comment.

In D77923#1981398 , @arsenm wrote:

> In D77923#1979286 , @jvesely wrote:
>
> > > In D77923#1976497 , @jvesely 
> > > wrote:
> > > 
> > >> OPENCL_VERSION is something that should be really set by the opencl 
> > >> driver rather than the compiler.
> > >>  coarse-grained SVM can be done without FEATURE_FLAT_ADDRESS_SPACE, so 
> > >> those gpus can get over 1.2, while the compiler can be used in an 
> > >> implementation that doesn't go above 1.2 even for gfx700+
> > > 
> > > 
> > > I don't know why this would imply we can't just have a static 
> > > `__OPENCL_VERSION__` for each device in Clang? I don't see anything in 
> > > the standard that says you are not allowed to have (going with your 
> > > example) `__OPENCL_VERSION__=200` and `__OPENCL_C_VERSION=120` at the 
> > > same time.
> > > 
> > > It seems like we should be able to just have `__OPENCL_VERSION__` defined 
> > > in Clang, and let the runtime control setting `__OPENCL_C_VERSION__` via 
> > > `-cl-std=`. If the current patch sets `__OPENCL_VERSION__` differently 
> > > than the runtime does then it seems like we should make it match, but I'm 
> > > not sure how best to confirm that.
> >
> > The `__OPENCL_VERSION__` macro in the kernel needs to match the version 
> > returned by `clGetDeviceInfo` (both refer to OpenCL version supported by 
> > the device),
> >  which in turn must be <= version returned by `clGetPlatformInfo`.
> >  This part makes it unsuitable for hardcoding in the compiler.
> >
> > `__OPENCL_VERSION__` should indicate device capabilities rather than 
> > language features. specs talk about available resources.
> >  For example, `CL_DEVICE_IMAGE2D_MAX_HEIGHT` is required to be >= 8192 for 
> > OpenCL 1.2 while it's required to be >=16384 for OpenCL 2.1.
>
>
> But why would the driver not report the device values which we do know in the 
> compiler? A discrepancy between these sounds like a driver bug.


because it needs to be capped by the platform opencl version. devices 
supporting opencl x.y need to support opencl > if there's to be a default value hardcoded in the compiler it'd be nice if 
>> platform drivers had an easy way to override (lower) it.
> 
> I found a hacky way to do this. You can -include a header, which happens 
> after the predefines, with undef and redef of the predefined value. This does 
> trigger -Wreserved-id-macro (which I only found through -Weverything), but 
> you should be able to pragma push/pop the warning off inside the special 
> header.

does passing `-U__OPENCL_VERSION__ -D __OPENCL_VERSION__=x.y` on the commadline 
work as well? the header trick was my fallback path.

> For now I could assume the platform is 2.0 based on -amdhsa, and not for 
> -mesa3d to work around clover not currently supporting CL2.0.




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D77923#1979286 , @jvesely wrote:

> > In D77923#1976497 , @jvesely wrote:
> > 
> >> OPENCL_VERSION is something that should be really set by the opencl driver 
> >> rather than the compiler.
> >>  coarse-grained SVM can be done without FEATURE_FLAT_ADDRESS_SPACE, so 
> >> those gpus can get over 1.2, while the compiler can be used in an 
> >> implementation that doesn't go above 1.2 even for gfx700+
> > 
> > 
> > I don't know why this would imply we can't just have a static 
> > `__OPENCL_VERSION__` for each device in Clang? I don't see anything in the 
> > standard that says you are not allowed to have (going with your example) 
> > `__OPENCL_VERSION__=200` and `__OPENCL_C_VERSION=120` at the same time.
> > 
> > It seems like we should be able to just have `__OPENCL_VERSION__` defined 
> > in Clang, and let the runtime control setting `__OPENCL_C_VERSION__` via 
> > `-cl-std=`. If the current patch sets `__OPENCL_VERSION__` differently than 
> > the runtime does then it seems like we should make it match, but I'm not 
> > sure how best to confirm that.
>
> The `__OPENCL_VERSION__` macro in the kernel needs to match the version 
> returned by `clGetDeviceInfo` (both refer to OpenCL version supported by the 
> device),
>  which in turn must be <= version returned by `clGetPlatformInfo`.
>  This part makes it unsuitable for hardcoding in the compiler.
>
> `__OPENCL_VERSION__` should indicate device capabilities rather than language 
> features. specs talk about available resources.
>  For example, `CL_DEVICE_IMAGE2D_MAX_HEIGHT` is required to be >= 8192 for 
> OpenCL 1.2 while it's required to be >=16384 for OpenCL 2.1.


But why would the driver not report the device values which we do know in the 
compiler? A discrepancy between these sounds like a driver bug.

> if there's to be a default value hardcoded in the compiler it'd be nice if 
> platform drivers had an easy way to override (lower) it.

I found a hacky way to do this. You can -include a header, which happens after 
the predefines, with undef and redef of the predefined value. This does trigger 
-Wreserved-id-macro (which I only found through -Weverything), but you should 
be able to pragma push/pop the warning off inside the special header.

For now I could assume the platform is 2.0 based on -amdhsa, and not for 
-mesa3d to work around clover not currently supporting CL2.0.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-14 Thread Anastasia Stulova via Phabricator via cfe-commits
Anastasia added a comment.

As for OpenCL C spec we should ideally add `__OPENCL_VERSION__` but I feel it 
should be taken from architecture or OS type... I think there were other cases 
where we needed something like an OpenCL runtime version in the triple too but 
I can't remember this now.

I can't think of any use case where this macro would be useful in the kernel 
code though.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-13 Thread Jan Vesely via Phabricator via cfe-commits
jvesely added a comment.

> In D77923#1976497 , @jvesely wrote:
> 
>> OPENCL_VERSION is something that should be really set by the opencl driver 
>> rather than the compiler.
>>  coarse-grained SVM can be done without FEATURE_FLAT_ADDRESS_SPACE, so those 
>> gpus can get over 1.2, while the compiler can be used in an implementation 
>> that doesn't go above 1.2 even for gfx700+
> 
> 
> I don't know why this would imply we can't just have a static 
> `__OPENCL_VERSION__` for each device in Clang? I don't see anything in the 
> standard that says you are not allowed to have (going with your example) 
> `__OPENCL_VERSION__=200` and `__OPENCL_C_VERSION=120` at the same time.
> 
> It seems like we should be able to just have `__OPENCL_VERSION__` defined in 
> Clang, and let the runtime control setting `__OPENCL_C_VERSION__` via 
> `-cl-std=`. If the current patch sets `__OPENCL_VERSION__` differently than 
> the runtime does then it seems like we should make it match, but I'm not sure 
> how best to confirm that.

The `__OPENCL_VERSION__` macro in the kernel needs to match the version 
returned by `clGetDeviceInfo` (both refer to OpenCL version supported by the 
device),
which in turn must be <= version returned by `clGetPlatformInfo`.
This part makes it unsuitable for hardcoding in the compiler.

`__OPENCL_VERSION__` should indicate device capabilities rather than language 
features. specs talk about available resources.
For example, `CL_DEVICE_IMAGE2D_MAX_HEIGHT` is required to be >= 8192 for 
OpenCL 1.2 while it's required to be >=16384 for OpenCL 2.1.

if there's to be a default value hardcoded in the compiler it'd be nice if 
platform drivers had an easy way to override (lower) it.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-13 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment.

In my opinion, for on-line compile for OpenCL, the platform is responsible for 
setting `__OPENCL_VERSION__`.  Also, it should be the platform's choice as to 
how to respond the image support query and how `__IMAGE_SUPPORT__` is set.  For 
offline compile, it doesn't seem unreasonable to ask the developer to set these.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-13 Thread Scott Linder via Phabricator via cfe-commits
scott.linder added a comment.

In D77923#1978261 , @arsenm wrote:

> In D77923#1978172 , @scott.linder 
> wrote:
>
> > https://www.khronos.org/registry/OpenCL/sdk/2.0/docs/man/xhtml/preprocessorDirectives.html
> >  describes `__OPENCL_VERSION__` as "an integer reflecting the version 
> > number of the OpenCL supported by the OpenCL device." as contrasted with 
> > `__OPENCL_C_VERSION__` which is described as "an integer reflecting the 
> > OpenCL C version specified by the -cl-std build option to clBuildProgram or 
> > clCompileProgram. If the -cl-std build option is not specified, the OpenCL 
> > C version supported by the compiler for this OpenCL device will be used."
> >
> > But I don't see where the correct use of these is defined? How should the 
> > user decide which to use? It does seem like `__OPENCL_VERSION__` and 
> > `__OPENCL_C_VERSION__` can differ, e.g. if you compile for a device 
> > supporting 2.0 with `-cl-std=1.2`, but why would `__OPENCL_VERSION__` ever 
> > be referenced then?
>
>
> I don't know why you would ever use this; but the standard says it should be 
> defined, so I guess we have to define it


Right, I just want to understand the actual semantics and it doesn't seem 
clear. I agree that I don't see a use for `__OPENCL_VERSION__` but we define it 
so I'm sure people are using it, so likely we need to be careful that this 
patch preserves the existing behavior of the runtime.

In D77923#1976497 , @jvesely wrote:

> OPENCL_VERSION is something that should be really set by the opencl driver 
> rather than the compiler.
>  coarse-grained SVM can be done without FEATURE_FLAT_ADDRESS_SPACE, so those 
> gpus can get over 1.2, while the compiler can be used in an implementation 
> that doesn't go above 1.2 even for gfx700+


I don't know why this would imply we can't just have a static 
`__OPENCL_VERSION__` for each device in Clang? I don't see anything in the 
standard that says you are not allowed to have (going with your example) 
`__OPENCL_VERSION__=200` and `__OPENCL_C_VERSION=120` at the same time.

It seems like we should be able to just have `__OPENCL_VERSION__` defined in 
Clang, and let the runtime control setting `__OPENCL_C_VERSION__` via 
`-cl-std=`. If the current patch sets `__OPENCL_VERSION__` differently than the 
runtime does then it seems like we should make it match, but I'm not sure how 
best to confirm that.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D77923#1978172 , @scott.linder 
wrote:

> https://www.khronos.org/registry/OpenCL/sdk/2.0/docs/man/xhtml/preprocessorDirectives.html
>  describes `__OPENCL_VERSION__` as "an integer reflecting the version number 
> of the OpenCL supported by the OpenCL device." as contrasted with 
> `__OPENCL_C_VERSION__` which is described as "an integer reflecting the 
> OpenCL C version specified by the -cl-std build option to clBuildProgram or 
> clCompileProgram. If the -cl-std build option is not specified, the OpenCL C 
> version supported by the compiler for this OpenCL device will be used."
>
> But I don't see where the correct use of these is defined? How should the 
> user decide which to use? It does seem like `__OPENCL_VERSION__` and 
> `__OPENCL_C_VERSION__` can differ, e.g. if you compile for a device 
> supporting 2.0 with `-cl-std=1.2`, but why would `__OPENCL_VERSION__` ever be 
> referenced then?


I don't know why you would ever use this; but the standard says it should be 
defined, so I guess we have to define it


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-13 Thread Scott Linder via Phabricator via cfe-commits
scott.linder added a comment.

https://www.khronos.org/registry/OpenCL/sdk/2.0/docs/man/xhtml/preprocessorDirectives.html
 describes `__OPENCL_VERSION__` as "an integer reflecting the version number of 
the OpenCL supported by the OpenCL device." as contrasted with 
`__OPENCL_C_VERSION__` which is described as "an integer reflecting the OpenCL 
C version specified by the -cl-std build option to clBuildProgram or 
clCompileProgram. If the -cl-std build option is not specified, the OpenCL C 
version supported by the compiler for this OpenCL device will be used."

But I don't see where the correct use of these is defined? How should the user 
decide which to use? It does seem like `__OPENCL_VERSION__` and 
`__OPENCL_C_VERSION__` can differ, e.g. if you compile for a device supporting 
2.0 with `-cl-std=1.2`, but why would `__OPENCL_VERSION__` ever be referenced 
then?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

In D77923#1976497 , @jvesely wrote:

> OPENCL_VERSION is something that should be really set by the opencl driver 
> rather than the compiler.
>  coarse-grained SVM can be done without FEATURE_FLAT_ADDRESS_SPACE, so those 
> gpus can get over 1.2, while the compiler can be used in an implementation 
> that doesn't go above 1.2 even for gfx700+


The offline compiler needs to be perfectly usable, with no driver/runtime 
dependencies. Any information needed from the driver should be captured by the 
target triple/OS type. We would invent a new OS type if there was such a need


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-11 Thread Jan Vesely via Phabricator via cfe-commits
jvesely added a comment.

OPENCL_VERSION sounds like something that should be really set by the opencl 
driver rather than the compiler.
coarse-grained SVM can be done without FEATURE_FLAT_ADDRESS_SPACE, so those 
gpus can get over 1.2, while the compiler can be used in an implementation that 
doesn't go above 1.2 even for gfx700+


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77923/new/

https://reviews.llvm.org/D77923



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision.
arsenm added reviewers: Anastasia, yaxunl, scott.linder.
Herald added subscribers: kerbowa, hiraditya, nhaehnle, wdng, jvesely.

Some of the device specific standard predefines were missing.

  

__IMAGE_SUPPORT__ was only hardcoded for SPIR. __OPENCL_VERSION__
wasn't defined at all, so allow targets to pick values for these.


https://reviews.llvm.org/D77923

Files:
  clang/include/clang/Basic/TargetInfo.h
  clang/lib/Basic/Targets/AMDGPU.cpp
  clang/lib/Basic/Targets/AMDGPU.h
  clang/lib/Basic/Targets/SPIR.h
  clang/lib/Frontend/InitPreprocessor.cpp
  clang/test/Preprocessor/predefined-macros.c
  llvm/include/llvm/Support/TargetParser.h
  llvm/lib/Support/TargetParser.cpp

Index: llvm/lib/Support/TargetParser.cpp
===
--- llvm/lib/Support/TargetParser.cpp
+++ llvm/lib/Support/TargetParser.cpp
@@ -72,36 +72,36 @@
   {{"oland"}, {"gfx601"},  GK_GFX601,  FEATURE_NONE},
   {{"pitcairn"},  {"gfx601"},  GK_GFX601,  FEATURE_NONE},
   {{"verde"}, {"gfx601"},  GK_GFX601,  FEATURE_NONE},
-  {{"gfx700"},{"gfx700"},  GK_GFX700,  FEATURE_NONE},
-  {{"kaveri"},{"gfx700"},  GK_GFX700,  FEATURE_NONE},
-  {{"gfx701"},{"gfx701"},  GK_GFX701,  FEATURE_FAST_FMA_F32},
-  {{"hawaii"},{"gfx701"},  GK_GFX701,  FEATURE_FAST_FMA_F32},
-  {{"gfx702"},{"gfx702"},  GK_GFX702,  FEATURE_FAST_FMA_F32},
-  {{"gfx703"},{"gfx703"},  GK_GFX703,  FEATURE_NONE},
-  {{"kabini"},{"gfx703"},  GK_GFX703,  FEATURE_NONE},
-  {{"mullins"},   {"gfx703"},  GK_GFX703,  FEATURE_NONE},
-  {{"gfx704"},{"gfx704"},  GK_GFX704,  FEATURE_NONE},
-  {{"bonaire"},   {"gfx704"},  GK_GFX704,  FEATURE_NONE},
-  {{"gfx801"},{"gfx801"},  GK_GFX801,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32},
-  {{"carrizo"},   {"gfx801"},  GK_GFX801,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32},
-  {{"gfx802"},{"gfx802"},  GK_GFX802,  FEATURE_FAST_DENORMAL_F32},
-  {{"iceland"},   {"gfx802"},  GK_GFX802,  FEATURE_FAST_DENORMAL_F32},
-  {{"tonga"}, {"gfx802"},  GK_GFX802,  FEATURE_FAST_DENORMAL_F32},
-  {{"gfx803"},{"gfx803"},  GK_GFX803,  FEATURE_FAST_DENORMAL_F32},
-  {{"fiji"},  {"gfx803"},  GK_GFX803,  FEATURE_FAST_DENORMAL_F32},
-  {{"polaris10"}, {"gfx803"},  GK_GFX803,  FEATURE_FAST_DENORMAL_F32},
-  {{"polaris11"}, {"gfx803"},  GK_GFX803,  FEATURE_FAST_DENORMAL_F32},
-  {{"gfx810"},{"gfx810"},  GK_GFX810,  FEATURE_FAST_DENORMAL_F32},
-  {{"stoney"},{"gfx810"},  GK_GFX810,  FEATURE_FAST_DENORMAL_F32},
-  {{"gfx900"},{"gfx900"},  GK_GFX900,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32},
-  {{"gfx902"},{"gfx902"},  GK_GFX902,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32},
-  {{"gfx904"},{"gfx904"},  GK_GFX904,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32},
-  {{"gfx906"},{"gfx906"},  GK_GFX906,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32},
-  {{"gfx908"},{"gfx908"},  GK_GFX908,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32},
-  {{"gfx909"},{"gfx909"},  GK_GFX909,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32},
-  {{"gfx1010"},   {"gfx1010"}, GK_GFX1010, FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32|FEATURE_WAVE32},
-  {{"gfx1011"},   {"gfx1011"}, GK_GFX1011, FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32|FEATURE_WAVE32},
-  {{"gfx1012"},   {"gfx1012"}, GK_GFX1012, FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32|FEATURE_WAVE32},
+  {{"gfx700"},{"gfx700"},  GK_GFX700,  FEATURE_FLAT_ADDRESS_SPACE},
+  {{"kaveri"},{"gfx700"},  GK_GFX700,  FEATURE_FLAT_ADDRESS_SPACE},
+  {{"gfx701"},{"gfx701"},  GK_GFX701,  FEATURE_FAST_FMA_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"hawaii"},{"gfx701"},  GK_GFX701,  FEATURE_FAST_FMA_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"gfx702"},{"gfx702"},  GK_GFX702,  FEATURE_FAST_FMA_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"gfx703"},{"gfx703"},  GK_GFX703,  FEATURE_FLAT_ADDRESS_SPACE},
+  {{"kabini"},{"gfx703"},  GK_GFX703,  FEATURE_FLAT_ADDRESS_SPACE},
+  {{"mullins"},   {"gfx703"},  GK_GFX703,  FEATURE_FLAT_ADDRESS_SPACE},
+  {{"gfx704"},{"gfx704"},  GK_GFX704,  FEATURE_FLAT_ADDRESS_SPACE},
+  {{"bonaire"},   {"gfx704"},  GK_GFX704,  FEATURE_FLAT_ADDRESS_SPACE},
+  {{"gfx801"},{"gfx801"},  GK_GFX801,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"carrizo"},   {"gfx801"},  GK_GFX801,  FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"gfx802"},{"gfx802"},  GK_GFX802,  FEATURE_FAST_DENORMAL_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"iceland"},   {"gfx802"},  GK_GFX802,  FEATURE_FAST_DENORMAL_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"tonga"}, {"gfx802"},  GK_GFX802,  FEATURE_FAST_DENORMAL_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"gfx803"},{"gfx803"},  GK_GFX803,  FEATURE_FAST_DENORMAL_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"fiji"},  {"gfx803"},  GK_GFX803,  FEATURE_FAST_DENORMAL_F32|FEATURE_FLAT_ADDRESS_SPACE},
+  {{"polaris10"}, {"gfx803"},