On Mon, Sep 16, 2013 at 10:24 PM, Xinliang David Li <davi...@google.com> wrote: > On Mon, Sep 16, 2013 at 3:13 AM, Richard Biener > <richard.guent...@gmail.com> wrote: >> On Fri, Sep 13, 2013 at 5:16 PM, Xinliang David Li <davi...@google.com> >> wrote: >>> On Fri, Sep 13, 2013 at 1:30 AM, Richard Biener >>> <richard.guent...@gmail.com> wrote: >>>> On Thu, Sep 12, 2013 at 10:31 PM, Xinliang David Li <davi...@google.com> >>>> wrote: >>>>> Currently -ftree-vectorize turns on both loop and slp vectorizations, >>>>> but there is no simple way to turn on loop vectorization alone. The >>>>> logic for default O3 setting is also complicated. >>>>> >>>>> In this patch, two new options are introduced: >>>>> >>>>> 1) -ftree-loop-vectorize >>>>> >>>>> This option is used to turn on loop vectorization only. option >>>>> -ftree-slp-vectorize also becomes a first class citizen, and no funny >>>>> business of Init(2) is needed. With this change, -ftree-vectorize >>>>> becomes a simple alias to -ftree-loop-vectorize + >>>>> -ftree-slp-vectorize. >>>>> >>>>> For instance, to turn on only slp vectorize at O3, the old way is: >>>>> >>>>> -O3 -fno-tree-vectorize -ftree-slp-vectorize >>>>> >>>>> With the new change it becomes: >>>>> >>>>> -O3 -fno-loop-vectorize >>>>> >>>>> >>>>> To turn on only loop vectorize at O2, the old way is >>>>> >>>>> -O2 -ftree-vectorize -fno-slp-vectorize >>>>> >>>>> The new way is >>>>> >>>>> -O2 -ftree-loop-vectorize >>>>> >>>>> >>>>> >>>>> 2) -ftree-vect-loop-peeling >>>>> >>>>> This option is used to turn on/off loop peeling for alignment. In the >>>>> long run, this should be folded into the cheap cost model proposed by >>>>> Richard. This option is also useful in scenarios where peeling can >>>>> introduce runtime problems: >>>>> http://gcc.gnu.org/ml/gcc/2005-12/msg00390.html which happens to be >>>>> common in practice. >>>>> >>>>> >>>>> >>>>> Patch attached. Compiler boostrapped. Ok after testing? >>>> >>>> I'd like you to split 1) and 2), mainly because I agree on 1) but not on >>>> 2). >>> >>> Ok. Can you also comment on 2) ? >> >> I think we want to decide how granular we want to control the vectorizer >> and using which mechanism. My cost-model re-org makes >> ftree-vect-loop-version a no-op (basically removes it), so 2) looks like >> a step backwards in this context. > > Using cost model to do a coarse grain control/configuration is > certainly something we want, but having a fine grain control is still > useful. > >> >> So, can you summarize what pieces (including versioning) of the vectorizer >> you'd want to be able to disable separately? > > Loop peeling seems to be the main one. There is also a correctness > issue related. For instance, the following code is common in practice, > but loop peeling wrongly assumes initial base-alignment and generates > aligned mov instruction after peeling, leading to SEGV. Peeling is > not something we can blindly turned on -- even when it is on, there > should be a way to turn it off explicitly: > > char a[10000]; > > void foo(int n) > { > int* b = (int*)(a+n); > int i = 0; > for (; i < 1000; ++i) > b[i] = 1; > } > > int main(int argn, char** argv) > { > foo(argn); > }
But that's just a bug that should be fixed (looking into it). >> Just disabling peeling for >> alignment may get you into the versioning for alignment path (and thus >> an unvectorized loop at runtime). > > This is not true for target supporting mis-aligned access. I have not > seen a case where alignment driver loop version happens on x86. > >>Also it's know that the alignment peeling >> code needs some serious TLC (it's outcome depends on the order of DRs, >> the cost model it uses leaves to be desired as we cannot distinguish >> between unaligned load and store costs). > > Yet another reason to turn it off as it is not effective anyways? As said I'll disable all remains of -ftree-vect-loop-version with the cost model patch because it wasn't guarding versioning for aliasing but only versioning for alignment. We have to be consistent here - if we add a way to disable peeling for alignment then we certainly don't want to remove the ability to disable versioning for alignment, no? Richard. > > thanks, > > David > >> >> Richard. >> >>>> >>>> I've stopped a quick try doing 1) myself because >>>> >>>> @@ -1691,6 +1695,12 @@ common_handle_option (struct gcc_options >>>> opts->x_flag_ipa_reference = false; >>>> break; >>>> >>>> + case OPT_ftree_vectorize: >>>> + if (!opts_set->x_flag_tree_loop_vectorize) >>>> + opts->x_flag_tree_loop_vectorize = value; >>>> + if (!opts_set->x_flag_tree_slp_vectorize) >>>> + opts->x_flag_tree_slp_vectorize = value; >>>> + break; >>>> >>>> doesn't look obviously correct. Does that handle >>>> >>>> -ftree-vectorize -fno-tree-loop-vectorize -ftree-vectorize >>>> >>>> or >>>> >>>> -ftree-loop-vectorize -fno-tree-vectorize >>>> >>>> properly? Currently at least >>>> >>>> -ftree-slp-vectorize -fno-tree-vectorize >>>> >>>> doesn't "work". >>> >>> >>> Right -- same is true for -fprofile-use option. FDO enables some >>> passes, but can not re-enable them if they are flipped off before. >>> >>>> >>>> That said, the option machinery doesn't handle an option being an alias >>>> for two other options, so it's mechanism to contract positives/negatives >>>> doesn't work here and the override hooks do not work reliably for >>>> repeated options. >>>> >>>> Or am I wrong here? Should we care at all? Joseph? >>> >>> We should probably just document the behavior. Even better, we should >>> deprecate the old option. >>> >>> thanks, >>> >>> David >>> >>>> >>>> Thanks, >>>> Richard. >>>> >>>>> >>>>> thanks, >>>>> >>>>> David