Well, in your modified example, the cause is still jump threading: it produces
control flow that cannot be if-converted and vectorized, though this time it
happens in the tree-vrp pass.
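
For anyone following along, here is a rough source-level picture of what I mean by threading in this loop body. It is only a sketch: the real transformation happens on GIMPLE, and the function names (clamp_plain, clamp_threaded, count_mismatches) are made up for illustration. Once the first test has set the value to -15, the second test is known to be false, so that path can jump straight past it.

```c
/* Source-level approximation only; not taken from an actual GCC dump. */

/* Shape as written: two independent tests on the same value. */
signed char clamp_plain(signed char v)
{
    if (v < -15) v = -15;
    if (v > 15)  v = 15;
    return v;
}

/* Hypothetical rendering of the threaded shape: after v = -15 the
   test (v > 15) cannot be true, so that path skips it. One branch
   now jumps directly past the other test. */
signed char clamp_threaded(signed char v)
{
    if (v < -15) {
        v = -15;
        goto done;
    }
    if (v > 15)
        v = 15;
done:
    return v;
}

/* Returns the number of signed char inputs where the two shapes disagree. */
int count_mismatches(void)
{
    int bad = 0;
    for (int i = -128; i <= 127; i++)
        if (clamp_plain((signed char)i) != clamp_threaded((signed char)i))
            bad++;
    return bad;
}
```

Both shapes compute the same result; the difference is only in the control-flow structure that later passes get to see.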

Try this:
~/install-4.8/bin/gcc vect-ifconv-2.c -O2 -fdump-tree-ifcvt-details \
    -ftree-vectorize -save-temps -fno-tree-vrp

With VRP disabled, the code can be vectorized.

Grepping for "threading" in the GCC sources, it seems that the dom and vrp
passes are the two places that apply jump threading. Is there any other place?
I think we need a target hook to control it.

Thanks,
Bingfeng

-----Original Message-----
From: Andrew Pinski [mailto:pins...@gmail.com] 
Sent: 21 November 2013 21:26
To: Bingfeng Mei
Cc: gcc@gcc.gnu.org
Subject: Re: Jump threading in tree dom pass prevents if-conversion & following 
vectorization

On Thu, Nov 21, 2013 at 7:11 AM, Bingfeng Mei <b...@broadcom.com> wrote:
> Hi,
> I am investigating loops that can be vectorized by LLVM
> but not by GCC. One example is a loop that contains
> more than one if-else construct.
>
> typedef signed char int8;
> #define FFT         128
>
> typedef struct {
>     int8   exp[FFT];
> } feq_t;
>
> void test(feq_t *feq)
> {
>     int k;
>     int feqMinimum = 15;
>     int8 *exp = feq->exp;
>
>     for (k=0;k<FFT;k++) {
>         exp[k] -= feqMinimum;
>         if(exp[k]<-15) exp[k] = -15;
>         if(exp[k]>15) exp[k]  = 15;
>     }
> }
>
> Compile it with 4.8.2 on x86_64
> ~/install-4.8/bin/gcc ghs-algorithms_380.c -O2 -fdump-tree-ifcvt-details 
> -ftree-vectorize  -save-temps
>
> It is not vectorized because if-else constructs are not properly
> if-converted. Looking into .ifcvt file, I found the loop is not
> if-converted because of bad if-else structure. One branch jumps directly
> into another branch. Digging a bit deeper, I found such structure
> is generated by dom1 pass doing jump threading optimization.
> So recompile with
>
> ~/install-4.8/bin/gcc ghs-algorithms_380.c -O2 -fdump-tree-ifcvt-details 
> -ftree-vectorize  -save-temps -fno-tree-dominator-opts
>
> It is magically if-converted and vectorized! Same on our target,
> performance is improved greatly in this example.
>
> It seems to me that doing jump threading for architectures that
> support if-conversion is not a good idea. The original if-else structures
> are damaged so that if-conversion cannot proceed, and neither can
> vectorization and maybe other optimizations. Should we try to identify such
> "bad" jump threading and skip it for such architectures?

This is not a bad jump threading at all.  In fact I think this is just
a misoptimization exposed by DOM.  Rewriting it like:
#define FFT         128

typedef struct {
    signed char   exp[FFT];
} feq_t;

void test(feq_t *feq)
{
    int k;
    int feqMinimum = 15;
    signed char *exp = feq->exp;

    for (k=0;k<FFT;k++) {
        signed char temp = exp[k] - feqMinimum;
        if(temp<-15) temp = -15;
        if(temp>15) temp  = 15;
        exp[k] = temp;
    }
}

--- CUT ----
Also shows the issue even without any jump threading involved (turning
off DOM does not fix my example).  Please file a bug with both your
and my examples.

Also what DOM is doing is getting rid of the extra store to exp[k] in
some cases.
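
To make the target of if-conversion concrete, here is a hand-written sketch of the loop with each conditional assignment turned into a select and a single store per iteration. This is an illustration only: the names test_branches, test_selects and selects_match_branches are hypothetical, and I have not checked which GCC versions vectorize either form.

```c
#include <string.h>

#define FFT 128

typedef struct {
    signed char exp[FFT];
} feq_t;

/* Reference: the loop with a temporary and two if statements. */
void test_branches(feq_t *feq)
{
    int feqMinimum = 15;
    signed char *exp = feq->exp;

    for (int k = 0; k < FFT; k++) {
        signed char temp = exp[k] - feqMinimum;
        if (temp < -15) temp = -15;
        if (temp > 15)  temp = 15;
        exp[k] = temp;
    }
}

/* Hand if-converted sketch: no branches in the body, only selects,
   and exactly one store to exp[k] per iteration. */
void test_selects(feq_t *feq)
{
    int feqMinimum = 15;
    signed char *exp = feq->exp;

    for (int k = 0; k < FFT; k++) {
        signed char temp = exp[k] - feqMinimum;
        temp = (temp < -15) ? -15 : temp;
        temp = (temp > 15)  ?  15 : temp;
        exp[k] = temp;
    }
}

/* Returns 1 if both loops produce identical arrays for a spread of inputs. */
int selects_match_branches(void)
{
    feq_t a, b;

    for (int k = 0; k < FFT; k++)
        a.exp[k] = b.exp[k] = (signed char)(k * 2 - 128);

    test_branches(&a);
    test_selects(&b);
    return memcmp(a.exp, b.exp, FFT) == 0;
}
```

The single store is the point of the temporary: there is no conditional store to memory left for if-conversion to worry about.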


>
> Bingfeng Mei
> Broadcom UK
>
>
>
