http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56175



             Bug #: 56175

           Summary: Issue with combine phase on x86.

    Classification: Unclassified

           Product: gcc

           Version: 4.8.0

            Status: UNCONFIRMED

          Severity: normal

          Priority: P3

         Component: rtl-optimization

        AssignedTo: unassig...@gcc.gnu.org

        ReportedBy: ysrum...@gmail.com





While analyzing the performance of an important benchmark on x86 Atom in 32-bit
mode, we found that the code generated for the attached testcase is not optimal:
the inner loop contains 18 instructions instead of 12.

The problem is that 'combine' does not perform the desired substitution for the
following statement:



    t = (u8)((x & 1) ^ ((u8)y & 1));

Combine is not able to convert it to a more optimal form such as:

    t = (u8)((x ^ (u8)y ) & 1);
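
The two forms should be interchangeable because AND with a constant distributes
over XOR. As a rough sanity check, the sketch below compares them exhaustively.
The helper names (xor_of_masks, mask_of_xor, count_mismatches, self_check) and
the u8 typedef are made up for illustration and do not come from the attached
testcase; y is assumed to be a 16-bit value, as in the testcase further down.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t u8; /* assumed definition of the u8 type used above */

/* Form as written in the source: mask each operand, then XOR. */
u8 xor_of_masks(u8 x, uint16_t y)
{
    return (u8)((x & 1) ^ ((u8)y & 1));
}

/* Desired form: XOR first, then apply the mask once. */
u8 mask_of_xor(u8 x, uint16_t y)
{
    return (u8)((x ^ (u8)y) & 1);
}

/* Compare both forms over every (x, y) pair; returns the mismatch count. */
unsigned long count_mismatches(void)
{
    unsigned long bad = 0;
    for (unsigned x = 0; x <= 0xFF; x++)
        for (unsigned y = 0; y <= 0xFFFF; y++)
            if (xor_of_masks((u8)x, (uint16_t)y) !=
                mask_of_xor((u8)x, (uint16_t)y))
                bad++;
    return bad;
}

/* Aborts if the two forms ever disagree. */
void self_check(void)
{
    assert(count_mismatches() == 0);
}
```

Calling self_check() from a small driver confirms the equivalence. This only
shows that the rewrite is valid at the source level; it says nothing about
which RTL patterns combine will actually try.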



The issue can be illustrated by comparison with the following simpler testcase:



int foo(unsigned char x, unsigned short y)
{
  unsigned char z;
  if (x == 0 || y == 0)
    return 0;
  x >>= 1;
  y >>= 1;
  z = (unsigned char)((x & 1) ^ ((unsigned char)y & 1));
  if (z == 1)
    return 1;
  return 0;
}



For this case, combine performs the needed transformation and we get optimal
assembly:

...

    xorl    %edx, %eax

    andl    $1, %eax

    ret



Here, combine attempts the following substitution:



Trying 22, 20 -> 23:

Failed to match this instruction:

(parallel [

        (set (reg:QI 83 [ D.1758 ])

            (and:QI (xor:QI (reg:QI 79 [ x ])

                    (subreg:QI (reg:HI 81 [ y ]) 0))

                (const_int 1 [0x1])))

        (clobber (reg:CC 17 flags))

    ])

Failed to match this instruction:

(set (reg:QI 83 [ D.1758 ])

    (and:QI (xor:QI (reg:QI 79 [ x ])

            (subreg:QI (reg:HI 81 [ y ]) 0))

        (const_int 1 [0x1])))

Successfully matched this instruction:

(set (reg:QI 82 [ D.1760 ])

    (xor:QI (reg:QI 79 [ x ])

        (subreg:QI (reg:HI 81 [ y ]) 0)))

Successfully matched this instruction:

(set (reg:QI 83 [ D.1758 ])

    (and:QI (reg:QI 82 [ D.1760 ])

        (const_int 1 [0x1])))

where

(insn 20 19 21 4 (parallel [

            (set (reg:QI 80 [ D.1759 ])

                (and:QI (reg:QI 79 [ x ])

                    (const_int 1 [0x1])))

            (clobber (reg:CC 17 flags))

        ]) t.c:8 405 {*andqi_1}

     (expr_list:REG_DEAD (reg:QI 79 [ x ])

        (expr_list:REG_UNUSED (reg:CC 17 flags)

            (nil))))

(insn 22 21 23 4 (parallel [

            (set (reg:HI 82 [ D.1760 ])

                (and:HI (reg:HI 81 [ y ])

                    (const_int 1 [0x1])))

            (clobber (reg:CC 17 flags))

        ]) t.c:8 404 {*andhi_1}

     (expr_list:REG_DEAD (reg:HI 81 [ y ])

        (expr_list:REG_UNUSED (reg:CC 17 flags)

            (nil))))

(insn 23 22 24 4 (parallel [

            (set (reg:QI 83 [ D.1758 ])

                (xor:QI (reg:QI 80 [ D.1759 ])

                    (subreg:QI (reg:HI 82 [ D.1760 ]) 0)))

            (clobber (reg:CC 17 flags))

        ]) t.c:8 426 {*xorqi_1}

     (expr_list:REG_DEAD (reg:HI 82 [ D.1760 ])

        (expr_list:REG_DEAD (reg:QI 80 [ D.1759 ])

            (expr_list:REG_UNUSED (reg:CC 17 flags)

                (nil)))))

But for the more complicated attached test, combine attempts the same
substitution with the operands in reverse order, and it fails:



Trying 14, 13 -> 15:

Failed to match this instruction:

(parallel [

        (set (reg:QI 63 [ D.1770 ])

            (xor:QI (and:QI (reg/v:QI 72 [ x ])

                    (const_int 1 [0x1]))

                (and:QI (subreg:QI (reg/v:HI 74 [ y ]) 0)

                    (const_int 1 [0x1]))))

        (clobber (reg:CC 17 flags))

    ])

Failed to match this instruction:

(set (reg:QI 63 [ D.1770 ])

    (xor:QI (and:QI (reg/v:QI 72 [ x ])

            (const_int 1 [0x1]))

        (and:QI (subreg:QI (reg/v:HI 74 [ y ]) 0)

            (const_int 1 [0x1]))))

Successfully matched this instruction:

(set (reg:QI 77 [ D.1771 ])

    (and:QI (subreg:QI (reg/v:HI 74 [ y ]) 0)

        (const_int 1 [0x1])))

Failed to match this instruction:

(set (reg:QI 63 [ D.1770 ])

    (xor:QI (and:QI (reg/v:QI 72 [ x ])

            (const_int 1 [0x1]))

        (reg:QI 77 [ D.1771 ])))

where

(insn 13 12 14 3 (parallel [

            (set (reg:HI 76 [ D.1772 ])

                (and:HI (reg/v:HI 74 [ y ])

                    (const_int 1 [0x1])))

            (clobber (reg:CC 17 flags))

        ]) t1.c:9 404 {*andhi_1}

     (expr_list:REG_UNUSED (reg:CC 17 flags)

        (nil)))

(insn 14 13 15 3 (parallel [

            (set (reg:QI 77 [ D.1771 ])

                (and:QI (reg/v:QI 72 [ x ])

                    (const_int 1 [0x1])))

            (clobber (reg:CC 17 flags))

        ]) t1.c:9 405 {*andqi_1}

     (expr_list:REG_UNUSED (reg:CC 17 flags)

        (nil)))

(insn 15 14 16 3 (parallel [

            (set (reg:QI 63 [ D.1770 ])

                (xor:QI (reg:QI 77 [ D.1771 ])

                    (subreg:QI (reg:HI 76 [ D.1772 ]) 0)))

            (clobber (reg:CC 17 flags))

        ]) t1.c:9 426 {*xorqi_1}

     (expr_list:REG_DEAD (reg:QI 77 [ D.1771 ])

        (expr_list:REG_DEAD (reg:HI 76 [ D.1772 ])

            (expr_list:REG_UNUSED (reg:CC 17 flags)

                (nil)))))

It seems that combining 13, 14 -> 15 instead would succeed.

Note also that the order of these instructions after expand differs between the
two testcases.
