http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50038

             Bug #: 50038
           Summary: redundant zero extensions
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: tocarip.in...@gmail.com


The following code

void t_run_test(int Pels, unsigned char *ImageInPtr,
                unsigned char *ImageOutPtr)
{
        int i;
        unsigned char xr, xg;
        unsigned char xy = 0;

        for (i = 0; i < Pels; i++)
        {
                xr = *ImageInPtr++;
                xg = *ImageInPtr++;
                xy = (unsigned char) ((19595*xr + 38470*xg) >> 16);

                *ImageOutPtr++ = xy;
        }
}

is compiled at -O2 by both gcc 4.5.1

(Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-linker-build-id
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada,lto --enable-plugin
--enable-java-awt=gtk --disable-dssi
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib
--with-ppl --with-cloog --with-tune=generic --with-arch_32=i686
--build=x86_64-redhat-linux)  

and the trunk version

(Target: x86_64-unknown-linux-gnu
Configured with: ../configure --enable-languages=c --disable-bootsrap
Thread model: posix
gcc version 4.7.0 20110808 (experimental) (GCC) ) 

to 
        ...
        movzbl  (%rsi), %edi
        movzbl  1(%rsi), %eax
        movq    %rcx, %rsi
        movzbl  %dil, %edi      <- redundant
        movzbl  %al, %eax       <- redundant
        imull   $19595, %edi, %edi
        imull   $38470, %eax, %eax
        addl    %edi, %eax
        ...
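For comparison, icc generates
        ...
        movzbl    (%rsi), %ecx
        incl      %eax
        movzbl    1(%rsi), %r8d
        addq      $2, %rsi
        imull     $19595, %ecx, %r10d
        ...
without the unnecessary zero extensions. The byte loads (movzbl from memory)
already clear the upper bits of the destination register, so a second
register-to-register movzbl changes nothing.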
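
A smaller test case may be easier to inspect. The sketch below is not part of
the original report: the names weighted_pair and weighted_pair_self_check are
hypothetical, and I have not confirmed that this reduced form produces the
same redundant movzbl pair. It only isolates the arithmetic from the loop body
so the output of "gcc -O2 -S" is shorter to read.

```c
#include <assert.h>

/* Hypothetical reduction of the loop body from the report: load two
 * adjacent bytes, form the weighted sum, and keep bits 16..23.
 * The unsigned char operands are promoted to int before the multiply;
 * the largest possible sum is 58065 * 255, which fits in a 32-bit int. */
unsigned char weighted_pair(const unsigned char *p)
{
        unsigned char xr = p[0];
        unsigned char xg = p[1];

        return (unsigned char) ((19595*xr + 38470*xg) >> 16);
}

/* Sanity check of the arithmetic only; it says nothing about the
 * quality of the generated code. */
void weighted_pair_self_check(void)
{
        const unsigned char zero[2] = { 0, 0 };
        const unsigned char full[2] = { 255, 255 };

        assert(weighted_pair(zero) == 0);
        assert(weighted_pair(full) == 225);
}
```

Compiling this file with -O2 -S and searching the output for movzbl between
the loads and the imull instructions should show whether the extra
extensions are still emitted.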
