================ Comment at: include/clang/Basic/AttrDocs.td:1330 @@ -1328,1 +1329,3 @@ +the loop unroller tries to unroll the loop by a power-of-two factor +subject to a limit on unrolled code size. ---------------- hfinkel wrote: > meheff wrote: > > broune wrote: > > > Is #pragma unroll different from #pragma unroll(full) ? If not, why not > > > make the two doc strings the same? > > In AttrDocs.td there is not documentation for "#pragma clang loop unroll". > > Instead it just mentions "#pragma clang loop" and points to > > LanguageExtensions for details. Regarding LanguageExtensions documentation > > of "#pragma clang loop unroll" and AttrDocs documentation for "#pragma > > unroll", I believe they do both say the same thing (or at least aren't > > conflicting with each other) with the LanguageExtensions documentation > > providing more context and describing the limits on unrolling. > > > > Little bit of background here. There are two forms of the loop unroll > > pragmas: > > > > #pragma unroll > > #pragma unroll N > > > > and: > > > > #pragma clang loop unroll(full) > > #pragma clang loop unroll_count(N) > > > > The "#pragma unroll" forms are supported for compatibilty with other > > compilers (such as nvcc CUDA compiler) where this form is common while > > "#pragma clang .." is a more cannonical form specific to clang. "#pragma > > unroll" and "#pragma clang loop unroll(full)" are equivalent. What I > > propose in the related optimization patch (http://reviews.llvm.org/D10854) > > is that "#pragma unroll" should enable loop unrolling with a runtime trip > > count of the loop. So if you have: > > > > void foo(int n) { > > #pragma unroll > > for (int i=0; i<n; i++) { ... } > > } > > > > Then the optimizer will unroll the loop with some reasonable power-of-two > > factor. One potentially funny bit about this approach is that "#pragma > > clang loop unroll(full)" will also enable unrolling of a loop with a > > runtime trip count. This may be surprising... or maybe not. Full unrolling > > is clearly not possible in this case so just being more aggressive with > > unrolling may be the sensible thing to do. > > One potentially funny bit about this approach is that "#pragma clang loop > > unroll(full)" will also enable unrolling of a loop with a runtime trip > > count. This may be surprising... or maybe not. Full unrolling is clearly > > not possible in this case so just being more aggressive with unrolling may > > be the sensible thing to do. > > I really don't like this. If the user is requesting 'full' unrolling, I'd > rather we do only that (at least up to some really high limit, as we do > currently). Runtime unrolling adds extra expense in the low-trip-count case > (because of the various checks involved) and has a different set of > trade-offs than simpler unrolling. > > I think that we should have 'full' unrolling add > "llvm.loop.unroll.runtime.disable" metadata. We should add some other mode, > "aggressive" for example, which is like full unrolling but also allows for > runtime unrolling. I think the desired end state here is that -Rpass=unroll > would provide some specific feedback in the case where full unrolling was > requested, but not possible because of a runtime trip count. > I think that makes sense. The issue we were seeing was a loop with runtime trip count behaving badly because it was marked '#pragma unroll', had a runtime trip count, and we'd like to enable runtime unrolling for the NVPTX target (http://reviews.llvm.org/D10855). IIRC, the loop sometimes is compile-time trip count depending upon template magic. Anyway, your suggestion (unroll(full) implies llvm.loop.unroll.runtime.disable) fixes that issue. One question is how do you feel about enabling runtime unrolling with '#pragma unroll N"? This gives more benefit to loops with higher trip counts, but I could be convinced either way.
Repository: rL LLVM http://reviews.llvm.org/D10857 _______________________________________________ cfe-commits mailing list cfe-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits