------- Comment #1 from falk at debian dot org  2006-09-24 19:52 -------
For this test case:

void f(double *pds, double *pdd, unsigned long len) {
  while (len >= 8*sizeof(double)) {
    register double r1,r2,r3,r4;
    r1 = *pds++;
    r2 = *pds++;
    r3 = *pds++;
    r4 = *pds++;
    *pdd++ = r1;
    *pdd++ = r2;
    *pdd++ = r3;
    *pdd++ = r4;
  }
}

gcc 4.0 and later produce this:

.L3:
        fldds -16(%r26),%fr22
        fldds -8(%r26),%fr23
        fldds 0(%r26),%fr24
        fldds 8(%r26),%fr25
        ldo 32(%r26),%r26
        fstds %fr22,-16(%r25)
        fstds %fr23,-8(%r25)
        fstds %fr24,0(%r25)
        fstds %fr25,8(%r25)
        b .L3

which I suspect is actually better, since it avoids dependencies between the
loads. But I'm not familiar with hppa; can anybody comment?
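
To make the comparison concrete, here is a rough C-level sketch of the two
shapes. It is only an illustration of how I read the assembly above, not
compiler output. The function names and the element count n are hypothetical,
and unlike the test case the loops decrement their counter so that they
terminate. copy_postinc mirrors the source as written, where every access
advances the pointer. copy_indexed mirrors the 4.0 code, where the four loads
use fixed offsets from one base and the base is bumped once per iteration.

```c
#include <stddef.h>

/* Hypothetical helper: each load bumps src, so every address depends on
   the pointer update made by the previous access.
   Assumption: n counts doubles; a remainder below 4 is left uncopied. */
void copy_postinc(const double *src, double *dst, size_t n) {
  while (n >= 4) {
    double r1 = *src++;
    double r2 = *src++;
    double r3 = *src++;
    double r4 = *src++;
    *dst++ = r1;
    *dst++ = r2;
    *dst++ = r3;
    *dst++ = r4;
    n -= 4;
  }
}

/* Hypothetical helper: all four loads address relative to the same base,
   and the base is advanced once per iteration (compare the single
   "ldo 32(%r26),%r26" in the listing above). */
void copy_indexed(const double *src, double *dst, size_t n) {
  while (n >= 4) {
    double r1 = src[0];
    double r2 = src[1];
    double r3 = src[2];
    double r4 = src[3];
    src += 4;
    dst[0] = r1;
    dst[1] = r2;
    dst[2] = r3;
    dst[3] = r4;
    dst += 4;
    n -= 4;
  }
}
```

Both functions copy the same data; they differ only in how the addresses are
formed. Whether the indexed form is really faster on hppa is exactly the open
question.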


-- 

falk at debian dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |3.4.2 4.1.2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17264
