[Bug middle-end/38671] [4.4 Regression] extra code for setting up loops (IV-opts and 32bits vs 64bits)

2009-01-05 Thread rguenth at gcc dot gnu dot org


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671



[Bug middle-end/38671] [4.4 Regression] extra code for setting up loops (IV-opts and 32bits vs 64bits)

2008-12-31 Thread tim at klingt dot org


--- Comment #6 from tim at klingt dot org  2008-12-31 09:20 ---
> sys_perf_counter_open always returns less than zero for me. 
> This is with:
> Linux gcc13 2.6.18-6-vserver-amd64 #1 SMP Sun Feb 10 17:55:04 UTC 2008 x86_64
> GNU/Linux
> 
> What system call is it trying to do and why?
> 

it is trying to open the performance counters
(http://lwn.net/Articles/310176/). it requires a patched kernel, though ...


(In reply to comment #3)
> t.cc: In function ‘float __vector__ nova::detail::gen_one()’:
> t.cc:34160: warning: ‘x’ is used uninitialized in this function
> 
> inline __m128 gen_one(void)
> {
> __m128i x;
> __m128i ones = _mm_cmpeq_epi32(x, x);
> return (__m128)_mm_slli_epi32 (_mm_srli_epi32(ones, 25), 23);
> }
> 
> Is undefined code I think.

this code is valid. the uninitialized xmm register x is compared with itself in
order to set every bit of the register ones to 1.
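
For readers following the bit manipulation in gen_one(), here is a minimal
scalar sketch of what one 32-bit lane goes through. It sets aside the question
raised above of whether reading the uninitialized x is well defined and only
models the arithmetic. The function names gen_one_lane_bits and gen_one_lane
are hypothetical and do not appear in the testcase.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Scalar model of a single 32-bit lane of gen_one() (illustrative only).
inline std::uint32_t gen_one_lane_bits(std::uint32_t x)
{
    // Models _mm_cmpeq_epi32(x, x): an integer always equals itself,
    // so the lane becomes all ones whatever bits x held.
    std::uint32_t ones = (x == x) ? 0xFFFFFFFFu : 0u;
    assert(ones == 0xFFFFFFFFu);

    std::uint32_t exponent = ones >> 25; // models _mm_srli_epi32: 0x0000007F
    return exponent << 23;               // models _mm_slli_epi32: 0x3F800000
}

// Reinterpret the lane bits as an IEEE-754 single-precision float.
inline float gen_one_lane(std::uint32_t x)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "32-bit float expected");
    std::uint32_t bits = gen_one_lane_bits(x);
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

0x3F800000 is the single-precision encoding of 1.0f (biased exponent 127, zero
mantissa), so each lane of the returned vector holds 1.0f.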


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671



[Bug middle-end/38671] [4.4 Regression] extra code for setting up loops (IV-opts and 32bits vs 64bits)

2008-12-31 Thread pinskia at gcc dot gnu dot org


--- Comment #5 from pinskia at gcc dot gnu dot org  2008-12-31 08:12 ---
Confirmed, though I don't have a fully reduced testcase yet.  Basically it
comes down to using unsigned int rather than size_t.  If you had used size_t as
the index, the code would have worked correctly.
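
A minimal sketch of the two index types being contrasted, assuming an LP64
target where unsigned int is 32 bits wide and size_t is 64 bits wide. This is
not the reporter's loop; the function names are hypothetical and the body is
only a stand-in.

```cpp
#include <cassert>
#include <cstddef>

// 32-bit index: when the compiler rewrites the loop in terms of 64-bit
// induction variables, it has to respect 32-bit modular arithmetic.
inline float sum_unsigned_index(const float* data, unsigned int n)
{
    assert(data != nullptr || n == 0);
    float acc = 0.0f;
    for (unsigned int i = 0; i != n; ++i)
        acc += data[i];
    return acc;
}

// Pointer-width index: the index and the address arithmetic already have
// the same width, so no widening conversion is involved.
inline float sum_size_t_index(const float* data, std::size_t n)
{
    assert(data != nullptr || n == 0);
    float acc = 0.0f;
    for (std::size_t i = 0; i != n; ++i)
        acc += data[i];
    return acc;
}
```

Both functions compute the same result; any difference would show up only in
the code generated for setting up the loop.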


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|-00-00 00:00:00             |2008-12-31 08:12:50
               date|                            |
            Summary|[4.4 Regression] extra code |[4.4 Regression] extra code
                   |for setting up loops        |for setting up loops (IV-opts
                   |                            |and 32bits vs 64bits)


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671



[Bug middle-end/38671] [4.4 Regression] extra code for setting up loops

2008-12-31 Thread pinskia at gcc dot gnu dot org


--- Comment #4 from pinskia at gcc dot gnu dot org  2008-12-31 08:10 ---
  D.45587 = VIEW_CONVERT_EXPR<__v4si>(x);
  D.45589 = __builtin_ia32_pcmpeqd128 (D.45587, D.45587);
  D.45591 = __builtin_ia32_psrldi128 (D.45589, 25);
  D.45594 = __builtin_ia32_pslldi128 (D.45591, 23);
  one = VIEW_CONVERT_EXPR<__m128>(VIEW_CONVERT_EXPR<__m128i>(D.45594));
  D.45644 = (long unsigned int) ((n >> 2) + 4294967295) + 1 * 16;
  ivtmp.516 = 0;


So the inner loop is not the issue, only the setup code.

The extra subtract/add comes from D.45644.
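
A small sketch of why the subtract/add pair may not simply cancel. It assumes
that n is a 32-bit unsigned value and that the dumped expression groups as
((long unsigned int) ((n >> 2) + 4294967295) + 1) * 16; the dump's printed
precedence is ambiguous, so that grouping is an assumption. The function names
are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// The bound as read from the dump: the "- 1" happens in 32 bits,
// the "+ 1" and the scaling by 16 happen after widening to 64 bits.
inline std::uint64_t bound_as_dumped(std::uint32_t n)
{
    std::uint32_t narrow = (n >> 2) + 4294967295u;  // wraps modulo 2^32
    assert(narrow == (n >> 2) - 1u);                // same as subtracting 1
    return (static_cast<std::uint64_t>(narrow) + 1u) * 16u;
}

// The form without the subtract/add, equivalent only when (n >> 2) != 0.
inline std::uint64_t bound_simplified(std::uint32_t n)
{
    return static_cast<std::uint64_t>(n >> 2) * 16u;
}
```

For n = 8 both forms give 32. For n < 4 the 32-bit subtraction wraps to
4294967295, so the dumped form gives 4294967296 * 16 while the simplified form
gives 0. Dropping the pair is therefore only safe if (n >> 2) is known to be
nonzero.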


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |middle-end
            Summary|[4.4 Regression] speed      |[4.4 Regression] extra code
                   |regression with sse         |for setting up loops
                   |intrinsics                  |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671