hfinkel added inline comments.

================
Comment at: include/clang/Basic/AttrDocs.td:1330
@@ -1328,1 +1329,3 @@
+the loop unroller tries to unroll the loop by a power-of-two factor
+subject to a limit on unrolled code size.
 
----------------
meheff wrote:
> broune wrote:
> > Is #pragma unroll different from #pragma unroll(full) ? If not, why not 
> > make the two doc strings the same?
> In AttrDocs.td there is no documentation for "#pragma clang loop unroll".  
> Instead it just mentions "#pragma clang loop" and points to 
> LanguageExtensions for details.  Regarding LanguageExtensions documentation 
> of "#pragma clang loop unroll" and AttrDocs documentation for "#pragma 
> unroll", I believe they do both say the same thing (or at least aren't 
> conflicting with each other) with the LanguageExtensions documentation 
> providing more context and describing the limits on unrolling.
> 
> A little background here.  There are two forms of the loop unroll 
> pragmas:
> 
> #pragma unroll
> #pragma unroll N
> 
> and:
> 
> #pragma clang loop unroll(full)
> #pragma clang loop unroll_count(N)
> 
> The "#pragma unroll" forms are supported for compatibility with other 
> compilers (such as the nvcc CUDA compiler) where this form is common, while 
> "#pragma clang .." is a more canonical form specific to clang.  "#pragma 
> unroll" and "#pragma clang loop unroll(full)" are equivalent.  What I propose 
> in the related optimization patch (http://reviews.llvm.org/D10854) is that 
> "#pragma unroll" should enable loop unrolling with a runtime trip count of 
> the loop.  So if you have:
> 
> void foo(int n) {
> #pragma unroll
>   for (int i=0; i<n; i++) { ... }
> }
> 
> Then the optimizer will unroll the loop with some reasonable power-of-two 
> factor.  One potentially funny bit about this approach is that "#pragma clang 
> loop unroll(full)" will also enable unrolling of a loop with a runtime trip 
> count.  This may be surprising... or maybe not. Full unrolling is clearly not 
> possible in this case so just being more aggressive with unrolling may be the 
> sensible thing to do.
> One potentially funny bit about this approach is that "#pragma clang loop 
> unroll(full)" will also enable unrolling of a loop with a runtime trip count. 
> This may be surprising... or maybe not. Full unrolling is clearly not 
> possible in this case so just being more aggressive with unrolling may be the 
> sensible thing to do.

I really don't like this. If the user is requesting 'full' unrolling, I'd 
rather we do only that (at least up to some really high limit, as we do 
currently). Runtime unrolling adds extra expense in the low-trip-count case 
(because of the various checks involved) and has a different set of trade-offs 
than simpler unrolling.
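
To make the cost concrete, here is a hand-written sketch of what unrolling a 
runtime-trip-count loop by 4 amounts to. This is only a conceptual 
illustration, not the code the LLVM unroller actually emits, and the function 
name sum_unrolled_by_4 is hypothetical. The point is the extra guard and the 
remainder loop, which a short loop pays for without gaining anything:

```cpp
#include <cassert>

// Conceptual illustration only: sums n ints, unrolled by a factor of 4.
int sum_unrolled_by_4(const int* data, int n) {
  assert(data != nullptr || n <= 0);
  int total = 0;
  int i = 0;

  // Main unrolled loop: runs only while at least 4 elements remain.
  // For n < 4 this guard fails immediately and the loop is pure overhead.
  for (; n - i >= 4; i += 4) {
    total += data[i] + data[i + 1] + data[i + 2] + data[i + 3];
  }

  // Remainder loop: handles the last 0 to 3 elements one at a time.
  for (; i < n; ++i) {
    total += data[i];
  }
  return total;
}
```

With a compile-time trip count, full unrolling needs neither the guard nor the 
remainder loop, which is why the two cases have different trade-offs.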

I think that we should have 'full' unrolling add 
"llvm.loop.unroll.runtime.disable" metadata. We should add some other mode, 
"aggressive" for example, which is like full unrolling but also allows for 
runtime unrolling. I think the desired end state here is that -Rpass=unroll 
would provide some specific feedback in the case where full unrolling was 
requested, but not possible because of a runtime trip count.
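
For reference, the two situations being distinguished could look like this in 
source. This is a minimal sketch: the UNROLL_HINT and UNROLL_FULL macros and 
the function names are hypothetical, introduced only so the file also builds 
with compilers other than clang. The pragmas themselves are the ones discussed 
in this review.

```cpp
#include <cassert>

// Hypothetical helper macros: expand to the pragmas only under clang.
#if defined(__clang__)
#define UNROLL_HINT _Pragma("unroll")
#define UNROLL_FULL _Pragma("clang loop unroll(full)")
#else
#define UNROLL_HINT
#define UNROLL_FULL
#endif

// Runtime trip count: n is not known at compile time, so full unrolling
// is not possible here; this is the case under discussion.
int sum_runtime(const int* data, int n) {
  assert(data != nullptr || n <= 0);
  int total = 0;
  UNROLL_HINT
  for (int i = 0; i < n; i++) {
    total += data[i];
  }
  return total;
}

// Compile-time trip count of 8: full unrolling is possible.
int sum_fixed(const int (&data)[8]) {
  int total = 0;
  UNROLL_FULL
  for (int i = 0; i < 8; i++) {
    total += data[i];
  }
  return total;
}
```

Under the proposal above, a 'full' request on a loop like the one in 
sum_runtime would not fall back to runtime unrolling; it would instead be 
reported through -Rpass=unroll.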



rL LLVM

http://reviews.llvm.org/D10857




_______________________________________________
cfe-commits mailing list
cfe-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
