Re: [PATCH] D21912: [CUDA] Don't assume that destructors can't be overloaded.

2016-07-12 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL275231: [CUDA] Don't assume that destructors can't be overloaded. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D21912?vs=62444&id=63749#toc Repository: rL LLVM http://revie

r275232 - [CUDA] Add additional testcases for EraseUnwantedCUDAMatches.

2016-07-12 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Tue Jul 12 18:23:12 2016 New Revision: 275232 URL: http://llvm.org/viewvc/llvm-project?rev=275232&view=rev Log: [CUDA] Add additional testcases for EraseUnwantedCUDAMatches. Summary: Specifically, this patch adds testcases for all three calls to EraseUnwantedCUDAMatches. The

r275231 - [CUDA] Don't assume that destructors can't be overloaded.

2016-07-12 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Tue Jul 12 18:23:01 2016 New Revision: 275231 URL: http://llvm.org/viewvc/llvm-project?rev=275231&view=rev Log: [CUDA] Don't assume that destructors can't be overloaded. Summary: You can overload a destructor in CUDA, and SemaOverload needs to be tweaked not to crash when it

r274897 - Fix flag name in comment in cuda-version-check.cu.

2016-07-08 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Fri Jul 8 12:59:24 2016 New Revision: 274897 URL: http://llvm.org/viewvc/llvm-project?rev=274897&view=rev Log: Fix flag name in comment in cuda-version-check.cu. Modified: cfe/trunk/test/Driver/cuda-version-check.cu Modified: cfe/trunk/test/Driver/cuda-version-check.cu

r274782 - [CUDA] s/OPT_nocuda_version_chec/OPT_no_cuda_version_check/.

2016-07-07 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jul 7 13:24:28 2016 New Revision: 274782 URL: http://llvm.org/viewvc/llvm-project?rev=274782&view=rev Log: [CUDA] s/OPT_nocuda_version_chec/OPT_no_cuda_version_check/. Fix build breakage. Modified: cfe/trunk/lib/Driver/ToolChains.cpp cfe/trunk/lib/Driver/Tools.c

Re: [PATCH] D21912: [CUDA] Don't assume that destructors can't be overloaded.

2016-07-07 Thread Justin Lebar via cfe-commits
jlebar added a comment. Friendly ping http://reviews.llvm.org/D21912 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Re: [PATCH] D21869: [CUDA] Check that our CUDA install supports the requested architectures.

2016-07-07 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL274781: [CUDA] Check that our CUDA install supports the requested architectures. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D21869?vs=63094&id=63100#toc Repository: rL LLV

r274781 - [CUDA] Check that our CUDA install supports the requested architectures.

2016-07-07 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jul 7 13:17:52 2016 New Revision: 274781 URL: http://llvm.org/viewvc/llvm-project?rev=274781&view=rev Log: [CUDA] Check that our CUDA install supports the requested architectures. Summary: Raise an error if you're using a CUDA installation that's too old for the requeste

r274780 - [CUDA] Rename the __nvvm_bar0 builtin back to __syncthreads.

2016-07-07 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jul 7 13:15:03 2016 New Revision: 274780 URL: http://llvm.org/viewvc/llvm-project?rev=274780&view=rev Log: [CUDA] Rename the __nvvm_bar0 builtin back to __syncthreads. The builtin was renamed in r274770. But __syncthreads is part of our user-facing API, so we need to ke

Re: [PATCH] D21869: [CUDA] Check that our CUDA install supports the requested architectures.

2016-07-07 Thread Justin Lebar via cfe-commits
jlebar added inline comments. Comment at: include/clang/Driver/Options.td:1722-1724 @@ -1721,2 +1721,5 @@ def nocudalib : Flag<["-"], "nocudalib">; +def nocuda_version_check : Flag<["-"], "nocuda-version-check">, + HelpText<"Don't error out if the detected version of the CUDA in

Re: [PATCH] D21869: [CUDA] Check that our CUDA install supports the requested architectures.

2016-07-07 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 63094. jlebar marked 3 inline comments as done. jlebar added a comment. Address review comments. http://reviews.llvm.org/D21869 Files: include/clang/Basic/DiagnosticDriverKinds.td include/clang/Driver/Options.td lib/Driver/ToolChains.cpp lib/Driver/T

r274713 - [CUDA] Fix "control reaches end of non-void function" warnings in Cuda.cpp.

2016-07-06 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Wed Jul 6 20:06:59 2016 New Revision: 274713 URL: http://llvm.org/viewvc/llvm-project?rev=274713&view=rev Log: [CUDA] Fix "control reaches end of non-void function" warnings in Cuda.cpp. Some compilers are too dumb to realize that the switch statement covers all cases. (Don
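
The warning described in this log message typically arises when a function returns from every case of a switch over an enum and has no statement after the switch. The following is a minimal sketch of one common way to silence it, not the committed change itself (the snippet above is truncated). The `Version` enum and `versionToString` are hypothetical stand-ins for the code in Cuda.cpp.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical enum, standing in for the version enum used in Cuda.cpp.
enum class Version { Unknown, V70, V75, V80 };

const char *versionToString(Version V) {
  switch (V) {
  case Version::Unknown:
    return "unknown";
  case Version::V70:
    return "7.0";
  case Version::V75:
    return "7.5";
  case Version::V80:
    return "8.0";
  }
  // Every enumerator is handled above, but some compilers still warn that
  // control can reach the end of a non-void function. An explicit,
  // never-taken exit after the switch removes the warning.
  assert(false && "unhandled Version enumerator");
  std::abort();
}
```

Leaving out a `default:` label keeps the compiler's "enumerator not handled in switch" warning available if a new enumerator is added later.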

Re: [PATCH] D21869: [CUDA] Check that our CUDA install supports the requested architectures.

2016-07-06 Thread Justin Lebar via cfe-commits
jlebar added a comment. Friendly ping. http://reviews.llvm.org/D21869

r274689 - [CUDA] Add missing namespace qualification on CudaArch in Action.cpp.

2016-07-06 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Wed Jul 6 16:45:44 2016 New Revision: 274689 URL: http://llvm.org/viewvc/llvm-project?rev=274689&view=rev Log: [CUDA] Add missing namespace qualification on CudaArch in Action.cpp. Fix build breakage with MSVC. Modified: cfe/trunk/lib/Driver/Action.cpp Modified: cfe/tr

Re: [PATCH] D21867: [CUDA] Add utility functions for dealing with CUDA versions / architectures.

2016-07-06 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL274681: [CUDA] Add utility functions for dealing with CUDA versions / architectures. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D21867?vs=62409&id=62975#toc Repository: rL

Re: [PATCH] D21868: [CUDA] Rename member variables in CudaInstallationDetector.

2016-07-06 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL274682: [CUDA] Rename member variables in CudaInstallationDetector. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D21868?vs=62300&id=62976#toc Repository: rL LLVM http://rev

r274682 - [CUDA] Rename member variables in CudaInstallationDetector.

2016-07-06 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Wed Jul 6 16:21:43 2016 New Revision: 274682 URL: http://llvm.org/viewvc/llvm-project?rev=274682&view=rev Log: [CUDA] Rename member variables in CudaInstallationDetector. Summary: Remove the "Cuda" prefix from these variables -- it's clear that they relate to CUDA given the

r274681 - [CUDA] Add utility functions for dealing with CUDA versions / architectures.

2016-07-06 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Wed Jul 6 16:21:39 2016 New Revision: 274681 URL: http://llvm.org/viewvc/llvm-project?rev=274681&view=rev Log: [CUDA] Add utility functions for dealing with CUDA versions / architectures. Summary: Currently our handling of CUDA architectures is scattered all around clang. T

Re: [PATCH] D21778: [CUDA] Add support for CUDA 8 and sm_60-62.

2016-07-06 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL274680: [CUDA] Add support for CUDA 8 and sm_60-62. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D21778?vs=62908&id=62974#toc Repository: rL LLVM http://reviews.llvm.org/D2

r274680 - [CUDA] Add support for CUDA 8 and sm_60-62.

2016-07-06 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Wed Jul 6 16:21:14 2016 New Revision: 274680 URL: http://llvm.org/viewvc/llvm-project?rev=274680&view=rev Log: [CUDA] Add support for CUDA 8 and sm_60-62. Summary: Also add sm_32, which was missing. Reviewers: tra Subscribers: cfe-commits Differential Revision: http://rev

Re: [PATCH] D21778: [CUDA] Add support for CUDA 8 and sm_60-62.

2016-07-06 Thread Justin Lebar via cfe-commits
jlebar added a comment. > They will need to wait for corresponding patch on LLVM side to deal with new > SM variants, though. This is in now, http://reviews.llvm.org/D22068. Comment at: lib/Driver/ToolChains.cpp:1715 @@ -1714,2 +1714,3 @@ CudaPathCandidates.push_back(D.S

Re: [PATCH] D21778: [CUDA] Add support for CUDA 8 and sm_60-62.

2016-07-06 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 62908. jlebar marked an inline comment as done. jlebar added a comment. Don't pull in the cuda 8 headers by default. http://reviews.llvm.org/D21778 Files: lib/Basic/Targets.cpp lib/Driver/Action.cpp lib/Driver/ToolChains.cpp Index: lib/Driver/ToolChai

Re: [PATCH] D21845: [Driver][OpenMP] Add specialized action builder for OpenMP offloading actions.

2016-07-01 Thread Justin Lebar via cfe-commits
jlebar added a comment. Hi, Alexey. Would you mind not asking for 'final' in additional reviews until we've resolved this thread elsewhere? Feel free to find me on IRC if you want to talk about it synchronously. Thanks! http://reviews.llvm.org/D21845

Re: [PATCH] D18172: [CUDA][OpenMP] Add a generic offload action builder

2016-07-01 Thread Justin Lebar via cfe-commits
jlebar added a subscriber: jlebar. jlebar added a comment. Yeah, I'd say that in the absence of a rule, consistency with surrounding code is king. Otherwise we're sending a message when we don't mean to be. I'm not at my machine, but my recollection is that most of the driver uses final sparingl

Re: [PATCH] D18172: [CUDA][OpenMP] Add a generic offload action builder

2016-07-01 Thread Justin Lebar via cfe-commits
Yeah, I'd say that in the absence of a rule, consistency with surrounding code is king. Otherwise we're sending a message when we don't mean to be. I'm not at my machine, but my recollection is that most of the driver uses final sparingly. But whatever the convention is we should do that, I thin

[PATCH] D21914: [CUDA] Use the multi-element remove function in EraseUnwantedCUDAMatches.

2016-06-30 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added subscribers: bkramer, cfe-commits. Bug pointed out by Benjamin Kramer in r264008. I think the bug is benign because by the time this is called, we should only have at most two overloads to consider (either a host and a devic

[PATCH] D21913: [CUDA] Add additional testcases for EraseUnwantedCUDAMatches.

2016-06-30 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added a subscriber: cfe-commits. Specifically, this patch adds testcases for all three calls to EraseUnwantedCUDAMatches. The addr-of-overloaded-fn test I accidentally neutered in r264207, which moved much of CodeGenCUDA/function-

[PATCH] D21912: [CUDA] Don't assume that destructors can't be overloaded.

2016-06-30 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: rsmith. jlebar added subscribers: tra, cfe-commits. You can overload a destructor in CUDA, and SemaOverload needs to be tweaked not to crash when it sees an explicit call to an overloaded destructor. http://reviews.llvm.org/D21912 Files: l

Re: [PATCH] D18172: [CUDA][OpenMP] Add a generic offload action builder

2016-06-30 Thread Justin Lebar via cfe-commits
jlebar added a comment. Alexey, it seems that you're asking for "final" on all classes that are not inherited from. Forgive my ignorance, but would you mind pointing me to the document that talks about our position on "final" in LLVM source? I don't see it in the style guide, but I may be mis

Re: r264008 - [sema] [CUDA] Use std algorithms in EraseUnwantedCUDAMatchesImpl.

2016-06-30 Thread Justin Lebar via cfe-commits
t 5:08 AM, Benjamin Kramer wrote: >> On Tue, Mar 22, 2016 at 1:09 AM, Justin Lebar via cfe-commits >> wrote: >>> Author: jlebar >>> Date: Mon Mar 21 19:09:25 2016 >>> New Revision: 264008 >>> >>> URL: http://llvm.or

Re: r274257 - Don't instantiate a full host toolchain in ASTMatchersTest.

2016-06-30 Thread Justin Lebar via cfe-commits
1 test from 1 test case ran. (20 ms total) > [ PASSED ] 0 tests. > [ FAILED ] 1 test, listed below: > [ FAILED ] DeclarationMatcher.MatchClass > > 1 FAILED TEST > > > > > > 2016-06-30 21:12 GMT+03:00 Justin Lebar via cfe-commits > : >>

Re: [PATCH] D21867: [CUDA] Add utility functions for dealing with CUDA versions / architectures.

2016-06-30 Thread Justin Lebar via cfe-commits
jlebar marked 4 inline comments as done. jlebar added a comment. http://reviews.llvm.org/D21867

Re: [PATCH] D21867: [CUDA] Add utility functions for dealing with CUDA versions / architectures.

2016-06-30 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 62409. jlebar added a comment. Address Art's review. http://reviews.llvm.org/D21867 Files: include/clang/Basic/Cuda.h include/clang/Driver/Action.h lib/Basic/CMakeLists.txt lib/Basic/Cuda.cpp lib/Basic/Targets.cpp lib/Driver/Action.cpp lib/Driv

Re: r264008 - [sema] [CUDA] Use std algorithms in EraseUnwantedCUDAMatchesImpl.

2016-06-30 Thread Justin Lebar via cfe-commits
Interestingly all the clang tests pass with that whole line commented out. So something *really* seems missing here. Thank you for finding this. On Thu, Jun 30, 2016 at 5:08 AM, Benjamin Kramer wrote: > On Tue, Mar 22, 2016 at 1:09 AM, Justin Lebar via cfe-commits > wrote: >> Au

r274269 - Fix ASTMatchersNodeTest to work on Windows.

2016-06-30 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jun 30 15:29:29 2016 New Revision: 274269 URL: http://llvm.org/viewvc/llvm-project?rev=274269&view=rev Log: Fix ASTMatchersNodeTest to work on Windows. It was failing because it had an explicit check for whether we're on Windows. There are a few other similar explicit ch

Re: [PATCH] D21867: [CUDA] Add utility functions for dealing with CUDA versions / architectures.

2016-06-30 Thread Justin Lebar via cfe-commits
jlebar marked an inline comment as done. Comment at: lib/Driver/Driver.cpp:1026-1028 @@ -1024,4 +1025,5 @@ } else if (CudaDeviceAction *CDA = dyn_cast<CudaDeviceAction>(A)) { -os << '"' - << (CDA->getGpuArchName() ? CDA->getGpuArchName() : "(multiple archs)") +os << '"' << (CDA->ge

Re: [PATCH] D21867: [CUDA] Add utility functions for dealing with CUDA versions / architectures.

2016-06-30 Thread Justin Lebar via cfe-commits
jlebar marked an inline comment as done. Comment at: lib/Basic/Cuda.cpp:8-19 @@ +7,14 @@ + +const char *CudaVersionToString(CudaVersion V) { + switch (V) { + case CudaVersion::UNKNOWN: +return "unknown"; + case CudaVersion::CUDA_70: +return "7.0"; + case CudaVersion::C

r274261 - [CUDA] Give templated device functions internal linkage, templated kernels external linkage.

2016-06-30 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jun 30 13:41:33 2016 New Revision: 274261 URL: http://llvm.org/viewvc/llvm-project?rev=274261&view=rev Log: [CUDA] Give templated device functions internal linkage, templated kernels external linkage. Summary: This lets LLVM perform IPO over these functions. In particul

Re: [PATCH] D21337: [CUDA] Give templated device functions internal linkage, templated kernels external linkage.

2016-06-30 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL274261: [CUDA] Give templated device functions internal linkage, templated kernels… (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D21337?vs=60728&id=62391#toc Repository: rL

Re: [PATCH] D21810: Don't instantiate a full host toolchain in ASTMatchersTest.

2016-06-30 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL274257: Don't instantiate a full host toolchain in ASTMatchersTest. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D21810?vs=62132&id=62385#toc Repository: rL LLVM http://rev

r274257 - Don't instantiate a full host toolchain in ASTMatchersTest.

2016-06-30 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jun 30 13:12:25 2016 New Revision: 274257 URL: http://llvm.org/viewvc/llvm-project?rev=274257&view=rev Log: Don't instantiate a full host toolchain in ASTMatchersTest. Summary: This test was stat()'ing large swaths of /usr/lib hundreds of times, as every invocation of mat

Re: [PATCH] D21810: Don't instantiate a full host toolchain in ASTMatchersTest.

2016-06-30 Thread Justin Lebar via cfe-commits
jlebar added a comment. > But I think this is a reasonable workaround until such an API can be provided. Should I take that as an LG, or are we waiting for someone else to approve this? http://reviews.llvm.org/D21810

[PATCH] D21868: [CUDA] Rename member variables in CudaInstallationDetector.

2016-06-29 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added a subscriber: cfe-commits. Remove the "Cuda" prefix from these variables -- it's clear that they relate to CUDA given their containing type. http://reviews.llvm.org/D21868 Files: lib/Driver/ToolChains.cpp lib/Driver/To

[PATCH] D21869: [CUDA] Check that our CUDA install supports the requested architectures.

2016-06-29 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added a subscriber: cfe-commits. Raise an error if you're using a CUDA installation that's too old for the requested architectures. In practice, this means that you need a CUDA 8 install to compile for sm_6*. http://reviews.llvm.
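
The check described in this summary can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch's code: the enums, `minVersionFor`, and `checkInstall` are hypothetical names, and the only mapping taken from the message is that sm_6* needs a CUDA 8 install. The 7.0 floor for every other architecture is an assumption made so the example is complete.

```cpp
#include <string>

// Hypothetical stand-ins for the driver's version and architecture enums.
// Enumerators are declared in increasing order so that '<' compares versions.
enum class CudaVersion { Unknown, Cuda70, Cuda75, Cuda80 };
enum class CudaArch { Sm20, Sm30, Sm35, Sm50, Sm60, Sm61, Sm62 };

// Oldest CUDA version assumed to support a given architecture.
CudaVersion minVersionFor(CudaArch A) {
  switch (A) {
  case CudaArch::Sm60:
  case CudaArch::Sm61:
  case CudaArch::Sm62:
    return CudaVersion::Cuda80; // from the summary: sm_6* needs CUDA 8
  default:
    return CudaVersion::Cuda70; // assumption made for this sketch only
  }
}

// Returns an empty string when the install is new enough, otherwise a
// diagnostic message. An Unknown install version is treated as too old here.
std::string checkInstall(CudaVersion Installed, CudaArch Requested) {
  if (Installed < minVersionFor(Requested))
    return "CUDA installation is too old for the requested GPU architecture";
  return "";
}
```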

[PATCH] D21867: [CUDA] Add utility functions for dealing with CUDA versions / architectures.

2016-06-29 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added a subscriber: cfe-commits. Currently our handling of CUDA architectures is scattered all around clang. This patch centralizes it. A key advantage of this centralization is that you can now write a C++ switch on e.g. CudaArc
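
As a rough illustration of the centralization described here, the sketch below keeps the architecture list in one enum and derives both string conversions from a single switch. The names `Arch`, `archToString`, and `stringToArch` and the short list of architectures are assumptions made for the example. They are not taken from the patch.

```cpp
#include <cstring>
#include <initializer_list>

// Hypothetical, abbreviated list of architectures.
enum class Arch { Unknown, Sm20, Sm30, Sm35 };

// One switch is the single source of truth for the spelling of each arch.
const char *archToString(Arch A) {
  switch (A) {
  case Arch::Unknown:
    return "unknown";
  case Arch::Sm20:
    return "sm_20";
  case Arch::Sm30:
    return "sm_30";
  case Arch::Sm35:
    return "sm_35";
  }
  return "unknown"; // not reached for valid enumerators
}

// Parsing reuses archToString, so the two directions cannot drift apart.
Arch stringToArch(const char *S) {
  for (Arch A : {Arch::Sm20, Arch::Sm30, Arch::Sm35})
    if (std::strcmp(S, archToString(A)) == 0)
      return A;
  return Arch::Unknown;
}
```

With a typed enum, callers can switch on the architecture directly instead of comparing strings at each use site.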

Re: [PATCH] D21810: Don't instantiate a full host toolchain in ASTMatchersTest.

2016-06-28 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 62132. jlebar marked 3 inline comments as done. jlebar added a comment. Fix typo in comment. http://reviews.llvm.org/D21810 Files: unittests/ASTMatchers/ASTMatchersTest.h Index: unittests/ASTMatchers/ASTMatchersTest.h =

Re: [PATCH] D21810: Don't instantiate a full host toolchain in ASTMatchersTest.

2016-06-28 Thread Justin Lebar via cfe-commits
jlebar added inline comments. Comment at: unittests/ASTMatchers/ASTMatchersTest.h:81-83 @@ +80,5 @@ + // + // FIXME: This is a hack to work around the fact that there's no way to do the + // equivalent of runToolOnCodeWithArgs without instantiating a full Driver. + // We shou

[PATCH] D21810: Don't instantiate a full host toolchain in ASTMatchersTest.

2016-06-28 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: chandlerc. jlebar added a subscriber: cfe-commits. Herald added a subscriber: klimek. This test was stat()'ing large swaths of /usr/lib hundreds of times, as every invocation of matchesConditionally*() created a new Linux toolchain. In additi

[PATCH] D21778: [CUDA] Add support for CUDA 8 and sm_60-62.

2016-06-27 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added a subscriber: cfe-commits. Also add sm_32, which was missing. http://reviews.llvm.org/D21778 Files: lib/Basic/Targets.cpp lib/Driver/Action.cpp lib/Driver/ToolChains.cpp Index: lib/Driver/ToolChains.cpp =

Re: [PATCH] D21337: [CUDA] Give templated device functions internal linkage, templated kernels external linkage.

2016-06-24 Thread Justin Lebar via cfe-commits
jlebar added a comment. Friendly ping. http://reviews.llvm.org/D21337

Re: [PATCH] D21507: Changes after running check modernize-use-emplace (D20964)

2016-06-20 Thread Justin Lebar via cfe-commits
jlebar added a subscriber: jlebar. jlebar added a comment. There seem to be many nontrivial whitespace errors introduced by this patch. For example, -Attrs.push_back(HTMLStartTagComment::Attribute(Ident.getLocation(), - Ident.ge

Re: [PATCH] D21419: [CUDA] Don't pass top-level -march down to device cc1 or ptxas.

2016-06-15 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL272857: [CUDA] Don't pass top-level -march down to device cc1 or ptxas. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D21419?vs=60932&id=60935#toc Repository: rL LLVM http:/

r272857 - [CUDA] Don't pass top-level -march down to device cc1 or ptxas.

2016-06-15 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Wed Jun 15 18:46:11 2016 New Revision: 272857 URL: http://llvm.org/viewvc/llvm-project?rev=272857&view=rev Log: [CUDA] Don't pass top-level -march down to device cc1 or ptxas. Summary: Previously if you did e.g. $ clang -march=haswell -x cuda foo.cu we would pass "-march=

Re: [PATCH] D21419: [CUDA] Don't pass top-level -march down to device cc1 or ptxas.

2016-06-15 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 60932. jlebar added a comment. Fix tests for real this time. http://reviews.llvm.org/D21419 Files: lib/Driver/ToolChains.cpp test/Driver/cuda-march.cu Index: test/Driver/cuda-march.cu === -

Re: [PATCH] D21419: [CUDA] Don't pass top-level -march down to device cc1 or ptxas.

2016-06-15 Thread Justin Lebar via cfe-commits
jlebar added inline comments. Comment at: test/Driver/cuda-march.cu:15-16 @@ +14,4 @@ + +// RUN: %clang -### -target x86_64-linux-gnu -c -march=skylake --cuda-gpu-arch=sm_30 %s 2>&1 | \ +// RUN: FileCheck -check-prefix SKYLAKE -check-prefix SM30 %s + tra wrote: >

Re: [PATCH] D21419: [CUDA] Don't pass top-level -march down to device cc1 or ptxas.

2016-06-15 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 60931. jlebar added a comment. Remove redundant test. http://reviews.llvm.org/D21419 Files: lib/Driver/ToolChains.cpp test/Driver/cuda-march.cu Index: test/Driver/cuda-march.cu === --- /dev

[PATCH] D21419: [CUDA] Don't pass top-level -march down to device cc1 or ptxas.

2016-06-15 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added subscribers: echristo, cfe-commits. Previously if you did e.g. $ clang -march=haswell -x cuda foo.cu we would pass "-march=haswell -march=sm_20" down to the ptxas tool. This causes it to assert, and rightly so! http://re
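
The behavior described above amounts to dropping the host's -march before the GPU architecture is appended for the device-side tools. A minimal sketch of that idea follows. `deviceArgs` is a hypothetical helper that works on plain strings. The real driver operates on its own argument-list types, which are not shown in this snippet.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: derive device-side arguments from the top-level ones.
std::vector<std::string> deviceArgs(const std::vector<std::string> &TopLevel,
                                    const std::string &GpuArch) {
  std::vector<std::string> Out;
  for (const std::string &Arg : TopLevel) {
    // A top-level -march names a host CPU (e.g. haswell), so skip it.
    if (Arg.rfind("-march=", 0) == 0)
      continue;
    Out.push_back(Arg);
  }
  // The device tools get exactly one -march, naming the GPU architecture.
  Out.push_back("-march=" + GpuArch);
  return Out;
}
```

For the command in the summary, this would turn `-march=haswell` plus a GPU arch of `sm_20` into a single `-march=sm_20` rather than two conflicting flags.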

Re: [PATCH] D21337: [CUDA] Give templated device functions internal linkage, templated kernels external linkage.

2016-06-14 Thread Justin Lebar via cfe-commits
jlebar added a comment. tra makes the good point that maybe this should be done in ASTContext, where we already have a special case for __global__. (I think I gravitated to doing it this way because the GVA* enums have zero documentation -- at least I have a vague idea of what the LLVM attribu

[PATCH] D21337: [CUDA] Give templated device functions internal linkage, templated kernels external linkage.

2016-06-14 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: rsmith. jlebar added subscribers: tra, cfe-commits. This lets LLVM perform IPO over these functions. In particular, it allows LLVM to emit ld.global.nc for loads to __restrict pointers in kernels that are never written to. http://reviews.llv

Re: [PATCH] D21162: [CUDA] Implement __shfl* intrinsics in clang headers.

2016-06-09 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL272299: [CUDA] Implement __shfl* intrinsics in clang headers. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D21162?vs=60223&id=60230#toc Repository: rL LLVM http://reviews.l

r272299 - [CUDA] Implement __shfl* intrinsics in clang headers.

2016-06-09 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jun 9 15:04:57 2016 New Revision: 272299 URL: http://llvm.org/viewvc/llvm-project?rev=272299&view=rev Log: [CUDA] Implement __shfl* intrinsics in clang headers. Summary: Clang changes to make use of the LLVM intrinsics added in D21160. Reviewers: tra Subscribers: jhole

Re: [PATCH] D21162: [CUDA] Implement __shfl* intrinsics in clang headers.

2016-06-09 Thread Justin Lebar via cfe-commits
jlebar added inline comments. Comment at: lib/Headers/__clang_cuda_intrinsics.h:77-80 @@ +76,6 @@ +_Static_assert(sizeof(__tmp) == sizeof(__in)); \ +memcpy(&__tmp, &__in, sizeof(__in)); \ +__tmp = ::__

Re: [PATCH] D21162: [CUDA] Implement __shfl* intrinsics in clang headers.

2016-06-09 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 60223. jlebar marked 2 inline comments as done. jlebar added a comment. Update after tra's review. http://reviews.llvm.org/D21162 Files: include/clang/Basic/BuiltinsNVPTX.def lib/Headers/__clang_cuda_intrinsics.h lib/Headers/__clang_cuda_runtime_wrappe

Re: [PATCH] D21162: [CUDA] Implement __shfl* intrinsics in clang headers.

2016-06-09 Thread Justin Lebar via cfe-commits
jlebar added a comment. Thank you for the reviews, Justin! http://reviews.llvm.org/D21162

Re: [PATCH] D21162: [CUDA] Implement __shfl* intrinsics in clang headers.

2016-06-09 Thread Justin Lebar via cfe-commits
jlebar added a comment. (Art, I would appreciate a second set of eyes on this one, as the last time I did this -- with ldg -- I messed up pretty badly.) http://reviews.llvm.org/D21162

[PATCH] D21162: [CUDA] Implement __shfl* intrinsics in clang headers.

2016-06-08 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added subscribers: cfe-commits, jholewinski. Clang changes to make use of the LLVM intrinsics added in D21160. http://reviews.llvm.org/D21162 Files: include/clang/Basic/BuiltinsNVPTX.def lib/Headers/__clang_cuda_intrinsics.h

Re: [PATCH] D20985: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-03 Thread Justin Lebar via cfe-commits
jlebar added inline comments. Comment at: lib/Sema/SemaDeclAttr.cpp:4079 @@ +4078,3 @@ + if (ValArg.isInvalid()) +return nullptr; + OK, so then we want an assert, not an if? http://reviews.llvm.org/D20985

Re: [PATCH] D20985: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-03 Thread Justin Lebar via cfe-commits
jlebar accepted this revision. This revision is now accepted and ready to land. Comment at: lib/Sema/SemaDeclAttr.cpp:4044 @@ +4043,3 @@ +// Checks whether an argument of launch_bounds attribute is +// acceptable, performs implicit conversion to Rvalue and returns +// non-nullptr

Re: [PATCH] D20985: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-03 Thread Justin Lebar via cfe-commits
jlebar added a comment. In http://reviews.llvm.org/D20985#448836, @tra wrote: > In http://reviews.llvm.org/D20985#448822, @jlebar wrote: > > > How is this different from test/SemaCUDA/launch_bounds.cu:27-28? It does > > > > const int constint = 512; > > __launch_bounds__(constint) void TestC

Re: [PATCH] D20985: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-03 Thread Justin Lebar via cfe-commits
jlebar added a comment. How is this different from test/SemaCUDA/launch_bounds.cu:27-28? It does const int constint = 512; __launch_bounds__(constint) void TestConstInt(void); which looks verbatim the same as this testcase. http://reviews.llvm.org/D20985

Re: r271336 - [CUDA] Conservatively mark inline asm as convergent.

2016-06-01 Thread Justin Lebar via cfe-commits
Thank you, Tom. I will have a look. On Wed, Jun 1, 2016 at 11:22 AM, Tom Stellard wrote: > On Tue, May 31, 2016 at 09:27:13PM -0000, Justin Lebar via cfe-commits wrote: >> Author: jlebar >> Date: Tue May 31 16:27:13 2016 >> New Revision: 271336 >> >> URL: http:/

Re: [PATCH] D20836: [CUDA] Conservatively mark inline asm as convergent.

2016-05-31 Thread Justin Lebar via cfe-commits
jlebar added a comment. In http://reviews.llvm.org/D20836#444911, @tra wrote: > I guess we would not be able to remove convergent from inline asm > automatically. Do we need a way to explicitly remove convergent from inline > asm? We can think about it. I'm not sure it will make a big differ

Re: [PATCH] D20836: [CUDA] Conservatively mark inline asm as convergent.

2016-05-31 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL271336: [CUDA] Conservatively mark inline asm as convergent. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D20836?vs=59130&id=59133#toc Repository: rL LLVM http://reviews.ll

r271336 - [CUDA] Conservatively mark inline asm as convergent.

2016-05-31 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Tue May 31 16:27:13 2016 New Revision: 271336 URL: http://llvm.org/viewvc/llvm-project?rev=271336&view=rev Log: [CUDA] Conservatively mark inline asm as convergent. Summary: This is particularly important because some convergent CUDA intrinsics (e.g. __shfl_down) are imple

[PATCH] D20836: [CUDA] Conservatively mark inline asm as convergent.

2016-05-31 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added a subscriber: cfe-commits. This is particularly important because some convergent CUDA intrinsics (e.g. __shfl_down) are implemented in terms of inline asm. http://reviews.llvm.org/D20836 Files: lib/CodeGen/CGStmt.cpp

Re: [PATCH] D20794: [CUDA] Fix order of vectorized ldg intrinsics' elements.

2016-05-30 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL271215: [CUDA] Fix order of vectorized ldg intrinsics' elements. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D20794?vs=58972&id=58976#toc Repository: rL LLVM http://review

r271215 - [CUDA] Fix order of vectorized ldg intrinsics' elements.

2016-05-30 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Mon May 30 12:12:55 2016 New Revision: 271215 URL: http://llvm.org/viewvc/llvm-project?rev=271215&view=rev Log: [CUDA] Fix order of vectorized ldg intrinsics' elements. Summary: The order is [x, y, z, w], not [w, x, y, z]. Subscribers: cfe-commits, tra Differential Revision

[PATCH] D20794: [CUDA] Fix order of vectorized ldg intrinsics' elements.

2016-05-30 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added subscribers: tra, cfe-commits. The order is [x, y, z, w], not [w, x, y, z]. http://reviews.llvm.org/D20794 Files: lib/Headers/__clang_cuda_intrinsics.h Index: lib/Headers/__clang_cuda_intrinsics.h =
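
To make the ordering concrete: when a four-element vector result is unpacked into a struct with x, y, z, w members, element 0 is x and element 3 is w. The sketch below uses a plain `std::array` and a hypothetical `Float4` struct. It does not reproduce the header's actual code, which is not shown in this snippet.

```cpp
#include <array>

// Hypothetical stand-in for a CUDA float4-style struct.
struct Float4 {
  float x, y, z, w;
};

// Unpack a 4-element vector in [x, y, z, w] order:
// element 0 -> x, 1 -> y, 2 -> z, 3 -> w.
Float4 fromVector(const std::array<float, 4> &V) {
  return Float4{V[0], V[1], V[2], V[3]};
}
```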

Re: [PATCH] D20493: [CUDA] Add -fcuda-approx-transcendentals flag.

2016-05-23 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL270484: [CUDA] Add -fcuda-approx-transcendentals flag. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D20493?vs=58123&id=58145#toc Repository: rL LLVM http://reviews.llvm.org

r270484 - [CUDA] Add -fcuda-approx-transcendentals flag.

2016-05-23 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Mon May 23 15:19:56 2016 New Revision: 270484 URL: http://llvm.org/viewvc/llvm-project?rev=270484&view=rev Log: [CUDA] Add -fcuda-approx-transcendentals flag. Summary: This lets us emit e.g. sin.approx.f32. See http://docs.nvidia.com/cuda/parallel-thread-execution/#floating-

Re: [PATCH] D20493: [CUDA] Add -fcuda-approx-transcendentals flag.

2016-05-23 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 58123. jlebar added a comment. More tightly scope the __USE_FAST_MATH__ macro. tra pointed out that device_functions.hpp uses __USE_FAST_MATH__ for its own purposes. For this CL, we only want to define __USE_FAST_MATH__ around math_functions.hpp. http://rev

Re: [PATCH] D20457: Update -ffast-math documentation to match reality.

2016-05-20 Thread Justin Lebar via cfe-commits
jlebar marked 2 inline comments as done. jlebar added a comment. Repository: rL LLVM http://reviews.llvm.org/D20457 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Re: [PATCH] D20457: Update -ffast-math documentation to match reality.

2016-05-20 Thread Justin Lebar via cfe-commits
jlebar added a comment. Thank you for the review! http://reviews.llvm.org/D20457

r270279 - Update -ffast-math documentation to match reality.

2016-05-20 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Fri May 20 16:33:01 2016 New Revision: 270279 URL: http://llvm.org/viewvc/llvm-project?rev=270279&view=rev Log: Update -ffast-math documentation to match reality. Reviewers: rsmith Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D20457 Modified:

Re: [PATCH] D20457: Update -ffast-math documentation to match reality.

2016-05-20 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL270279: Update -ffast-math documentation to match reality. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D20457?vs=57958&id=57995#toc Repository: rL LLVM http://reviews.llvm

[PATCH] D20493: [CUDA] Add -fcuda-approx-transcendentals flag.

2016-05-20 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: rnk. jlebar added subscribers: cfe-commits, tra. This lets us emit e.g. sin.approx.f32. See http://docs.nvidia.com/cuda/parallel-thread-execution/#floating-point-instructions-sin http://reviews.llvm.org/D20493 Files: include/clang/Basic/L

Re: [PATCH] D20481: [CUDA] Define __USE_FAST_MATH__ when __FAST_MATH__ is defined.

2016-05-20 Thread Justin Lebar via cfe-commits
jlebar abandoned this revision. jlebar added a comment. Actually, after talking offline with Chandler, I need something more complicated than this. I will send a new patch. Sorry for the noise. http://reviews.llvm.org/D20481 ___ cfe-commits maili

[PATCH] D20481: [CUDA] Define __USE_FAST_MATH__ when __FAST_MATH__ is defined.

2016-05-20 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: rsmith. jlebar added a subscriber: cfe-commits. The CUDA headers look for __USE_FAST_MATH__. http://reviews.llvm.org/D20481 Files: lib/Headers/__clang_cuda_runtime_wrapper.h Index: lib/Headers/__clang_cuda_runtime_wrapper.h ==

Re: [PATCH] D20457: Update -ffast-math documentation to match reality.

2016-05-20 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 57958. jlebar added a comment. Update per Richard's review. http://reviews.llvm.org/D20457 Files: docs/UsersManual.rst include/clang/Basic/LangOptions.def include/clang/Driver/Options.td Index: include/clang/Driver/Options.td =

[PATCH] D20457: Update -ffast-math documentation to match reality.

2016-05-19 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: rsmith. jlebar added a subscriber: cfe-commits. http://reviews.llvm.org/D20457 Files: include/clang/Basic/LangOptions.def include/clang/Driver/Options.td Index: include/clang/Driver/Options.td

Re: [PATCH] D19990: [CUDA] Implement __ldg using intrinsics.

2016-05-19 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL270150: [CUDA] Implement __ldg using intrinsics. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D19990?vs=56603&id=57873#toc Repository: rL LLVM http://reviews.llvm.org/D1999

r270150 - [CUDA] Implement __ldg using intrinsics.

2016-05-19 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu May 19 17:49:13 2016 New Revision: 270150 URL: http://llvm.org/viewvc/llvm-project?rev=270150&view=rev Log: [CUDA] Implement __ldg using intrinsics. Summary: Previously it was implemented as inline asm in the CUDA headers. This change allows us to use the [addr+imm] addr

Re: [PATCH] D20341: [CUDA] Enable fusing FP ops for CUDA by default.

2016-05-19 Thread Justin Lebar via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. Well, if the CUDA documentation says so...let's do it. :) Thanks for your patience, everyone. http://reviews.llvm.org/D20341 ___ cfe-commits m

Re: [PATCH] D20341: [CUDA] Enable fusing FP ops for CUDA by default.

2016-05-17 Thread Justin Lebar via cfe-commits
jlebar added a comment. > But people also don't expect IEEE compliance on GPUs Is that true? You have a lot more experience with this than I do, but my observation of nvidia's hardware is that it's moved to add *more* IEEE compliance as it's matured. For example, older hardware didn't suppor

Re: [PATCH] D20341: [CUDA] Enable fusing FP ops for CUDA by default.

2016-05-17 Thread Justin Lebar via cfe-commits
jlebar added a comment. I am not sure we want this? Although it matches nvcc, it does not match our floating-point behavior for C++ in general -- it makes us non-IEEE-whatever compliant by default. Although I agree that if we don't do this, lots of people are not going to pass -fp-contract=fa
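
The concern in this thread is that fusing a multiply and an add changes results, because the product is no longer rounded before the addition. A minimal host-side C++ sketch of that difference follows; it is an illustration only, not code from the patch, and the helper names `unfused_mul_add` and `fused_mul_add` are hypothetical.

```cpp
#include <cmath>

// Hypothetical helper: multiply, round the product to double, then add.
// The volatile store is there so the compiler cannot contract the two
// operations into a single fused multiply-add on its own.
double unfused_mul_add(double a, double b, double c) {
    volatile double product = a * b;  // first rounding happens here
    return product + c;               // second rounding happens here
}

// Fused multiply-add: a * b + c is computed with a single rounding.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}
```

With a = 1 + 2^-27, b = 1 - 2^-27 and c = -1, the exact product is 1 - 2^-54, which rounds to 1.0 in double precision. The unfused version therefore returns 0, while the fused version returns -2^-54. Neither answer is a bug; the two simply follow different rounding rules, which is why the choice of default matters.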

Re: [PATCH] D19990: [CUDA] Implement __ldg using intrinsics.

2016-05-17 Thread Justin Lebar via cfe-commits
jlebar added a comment. Friendly ping. This is a big help with some Tensorflow benchmarks. http://reviews.llvm.org/D19990

Re: [PATCH] D20034: [CUDA] Only __shared__ variables can be static local on device side.

2016-05-10 Thread Justin Lebar via cfe-commits
jlebar added a comment. This patch regresses Eigen, because it raises an error even on host+device functions. Repository: rL LLVM http://reviews.llvm.org/D20034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bi

Re: [PATCH] D20141: Check for nullptr argument.

2016-05-10 Thread Justin Lebar via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. OK, if the function explicitly says it accepts null values and if we check elsewhere in the function, I'm personally OK adding the checks. http://reviews.llvm.org/D20141 __

Re: [PATCH] D20141: Check for nullptr argument.

2016-05-10 Thread Justin Lebar via cfe-commits
jlebar added a comment. Can we have a test? http://reviews.llvm.org/D20141

Re: [PATCH] D20140: [CUDA] Do not allow non-empty destructors for global device-side variables.

2016-05-10 Thread Justin Lebar via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. lgtm, but I'd like Richard to sign off on this too. Comment at: lib/Sema/SemaDecl.cpp:10438 @@ -10437,1 +10437,3 @@ + // Also make sure that destructor, if there is one,
