https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122586

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2026-01-14
     Ever confirmed|0                           |1
            Summary|[16 Regression] 4-5%        |[16 Regression] 4-5%
                   |slowdown of 538.imagick_r   |slowdown of 538.imagick_r
                   |on Intel Ice Lake (3rd      |on Intel Ice Lake (3rd
                   |generation Xeon)            |generation Xeon) by
                   |                            |r16-4576-gfe9f0719d8ebd2

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed on a Zen4 machine.  Reverting r16-4576-gfe9f0719d8ebd2 fixes it.

perf shows (GCC 15.2 vs. trunk):

Overhead  Samples  Command          Shared Object            Symbol
  25.52%   236921  imagick_r_peak.  imagick_r_peak.gcc7-m64  [.] GetVirtualPixelsFromNexus
  24.58%   228166  imagick_r_base.  imagick_r_base.gcc7-m64  [.] GetVirtualPixelsFromNexus
  16.56%   149491  imagick_r_peak.  imagick_r_peak.gcc7-m64  [.] MorphologyApply
  16.44%   147132  imagick_r_base.  imagick_r_base.gcc7-m64  [.] MorphologyApply
   7.51%    69689  imagick_r_peak.  imagick_r_peak.gcc7-m64  [.] MeanShiftImage
   6.34%    58862  imagick_r_base.  imagick_r_base.gcc7-m64  [.] MeanShiftImage
   0.66%     6104  imagick_r_peak.  imagick_r_peak.gcc7-m64  [.] GetOneCacheViewVirtualPixel
   0.36%     3347  imagick_r_base.  imagick_r_base.gcc7-m64  [.] GetOneCacheViewVirtualPixel

where MeanShiftImage has

                status=GetOneCacheViewVirtualPixel(pixel_view,(ssize_t)
                  MagickRound(mean_location.x+u),(ssize_t) MagickRound(
                  mean_location.y+v),&pixel,exception);

with

static inline double MagickRound(double x)             
{                 
  /*     
    Round the fraction to nearest integer.
  */              
  if ((x-floor(x)) < (ceil(x)-x)) 
    return(floor(x));
  return(ceil(x));       
}  

and the code generated is, old

        │       vrndscalesd  $0xa,%xmm0,%xmm0,%xmm3
     21 │       vrndscalesd  $0x9,%xmm0,%xmm0,%xmm1
      2 │       vsubsd       %xmm0,%xmm3,%xmm6
    306 │       vsubsd       %xmm1,%xmm0,%xmm2
    714 │       vmovsd       %xmm0,0xe0(%rsp)
        │1786 return(ceil(x));
        │       vcmpltsd     %xmm6,%xmm2,%xmm2
    664 │       vblendvpd    %xmm2,%xmm1,%xmm3,%xmm1

vs new

      2 │       vaddsd       _IO_stdin_used+0x13698,%xmm3,%xmm0
     15 │       vmovsd       %xmm3,0xe8(%rsp)
   1269 │       vrndscalesd  $0x9,%xmm0,%xmm0,%xmm0

It's not clear why the former is preferred.  Possibly this is not the only
place the pattern triggers.  GetVirtualPixelsFromNexus doesn't use floor
though.

Needs more analysis.
