On Mon, Sep 16, 2013 at 10:24 PM, Xinliang David Li <davi...@google.com> wrote:
> On Mon, Sep 16, 2013 at 3:13 AM, Richard Biener
> <richard.guent...@gmail.com> wrote:
>> On Fri, Sep 13, 2013 at 5:16 PM, Xinliang David Li <davi...@google.com> 
>> wrote:
>>> On Fri, Sep 13, 2013 at 1:30 AM, Richard Biener
>>> <richard.guent...@gmail.com> wrote:
>>>> On Thu, Sep 12, 2013 at 10:31 PM, Xinliang David Li <davi...@google.com> 
>>>> wrote:
>>>>> Currently -ftree-vectorize turns on both loop and slp vectorizations,
>>>>> but there is no simple way to turn on loop vectorization alone. The
>>>>> logic for default O3 setting is also complicated.
>>>>>
>>>>> In this patch, two new options are introduced:
>>>>>
>>>>> 1) -ftree-loop-vectorize
>>>>>
>>>>> This option is used to turn on loop vectorization only. option
>>>>> -ftree-slp-vectorize also becomes a first class citizen, and no funny
>>>>> business of Init(2) is needed.  With this change, -ftree-vectorize
>>>>> becomes a simple alias to -ftree-loop-vectorize +
>>>>> -ftree-slp-vectorize.
>>>>>
>>>>> For instance, to turn on only slp vectorize at O3, the old way is:
>>>>>
>>>>>      -O3 -fno-tree-vectorize -ftree-slp-vectorize
>>>>>
>>>>> With the new change it becomes:
>>>>>
>>>>>     -O3 -fno-loop-vectorize
>>>>>
>>>>>
>>>>> To turn on only loop vectorize at O2, the old way is
>>>>>
>>>>>     -O2 -ftree-vectorize -fno-slp-vectorize
>>>>>
>>>>> The new way is
>>>>>
>>>>>     -O2 -ftree-loop-vectorize
>>>>>
>>>>>
>>>>>
>>>>> 2) -ftree-vect-loop-peeling
>>>>>
>>>>> This option is used to turn on/off loop peeling for alignment.  In the
>>>>> long run, this should be folded into the cheap cost model proposed by
>>>>> Richard.  This option is also useful in scenarios where peeling can
>>>>> introduce runtime problems:
>>>>> http://gcc.gnu.org/ml/gcc/2005-12/msg00390.html  which happens to be
>>>>> common in practice.
>>>>>
>>>>>
>>>>>
>>>>> Patch attached. Compiler boostrapped. Ok after testing?
>>>>
>>>> I'd like you to split 1) and 2), mainly because I agree on 1) but not on 
>>>> 2).
>>>
>>> Ok. Can you also comment on 2) ?
>>
>> I think we want to decide how granular we want to control the vectorizer
>> and using which mechanism.  My cost-model re-org makes
>> ftree-vect-loop-version a no-op (basically removes it), so 2) looks like
>> a step backwards in this context.
>
> Using cost model to do a coarse grain control/configuration is
> certainly something we want, but having a fine grain control is still
> useful.
>
>>
>> So, can you summarize what pieces (including versioning) of the vectorizer
>> you'd want to be able to disable separately?
>
> Loop peeling seems to be the main one. There is also a correctness
> issue related. For instance, the following code is common in practice,
> but loop peeling wrongly assumes initial base-alignment and generates
> aligned mov instruction after peeling, leading to SEGV.  Peeling is
> not something we can blindly turned on -- even when it is on, there
> should be a way to turn it off explicitly:
>
> char a[10000];
>
> void foo(int n)
> {
>   int* b = (int*)(a+n);
>   int i = 0;
>   for (; i < 1000; ++i)
>     b[i] = 1;
> }
>
> int main(int argn, char** argv)
> {
>   foo(argn);
> }

But that's just a bug that should be fixed (looking into it).

>>  Just disabling peeling for
>> alignment may get you into the versioning for alignment path (and thus
>> an unvectorized loop at runtime).
>
> This is not true for target supporting mis-aligned access. I have not
> seen a case where alignment driver loop version happens on x86.
>
>>Also it's know that the alignment peeling
>> code needs some serious TLC (it's outcome depends on the order of DRs,
>> the cost model it uses leaves to be desired as we cannot distinguish
>> between unaligned load and store costs).
>
> Yet another reason to turn it off as it is not effective anyways?

As said I'll disable all remains of -ftree-vect-loop-version with the cost model
patch because it wasn't guarding versioning for aliasing but only versioning
for alignment.

We have to be consistent here - if we add a way to disable peeling for
alignment then we certainly don't want to remove the ability to disable
versioning for alignment, no?

Richard.

>
> thanks,
>
> David
>
>>
>> Richard.
>>
>>>>
>>>> I've stopped a quick try doing 1) myself because
>>>>
>>>> @@ -1691,6 +1695,12 @@ common_handle_option (struct gcc_options
>>>>          opts->x_flag_ipa_reference = false;
>>>>        break;
>>>>
>>>> +    case OPT_ftree_vectorize:
>>>> +      if (!opts_set->x_flag_tree_loop_vectorize)
>>>> + opts->x_flag_tree_loop_vectorize = value;
>>>> +      if (!opts_set->x_flag_tree_slp_vectorize)
>>>> + opts->x_flag_tree_slp_vectorize = value;
>>>> +      break;
>>>>
>>>> doesn't look obviously correct.  Does that handle
>>>>
>>>>   -ftree-vectorize -fno-tree-loop-vectorize -ftree-vectorize
>>>>
>>>> or
>>>>
>>>>   -ftree-loop-vectorize -fno-tree-vectorize
>>>>
>>>> properly?  Currently at least
>>>>
>>>>   -ftree-slp-vectorize -fno-tree-vectorize
>>>>
>>>> doesn't "work".
>>>
>>>
>>> Right -- same is true for -fprofile-use option. FDO enables some
>>> passes, but can not re-enable them if they are flipped off before.
>>>
>>>>
>>>> That said, the option machinery doesn't handle an option being an alias
>>>> for two other options, so it's mechanism to contract positives/negatives
>>>> doesn't work here and the override hooks do not work reliably for
>>>> repeated options.
>>>>
>>>> Or am I wrong here?  Should we care at all?  Joseph?
>>>
>>> We should probably just document the behavior. Even better, we should
>>> deprecate the old option.
>>>
>>> thanks,
>>>
>>> David
>>>
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>>>
>>>>> thanks,
>>>>>
>>>>> David

Reply via email to