https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77600

            Bug ID: 77600
           Summary: M68K: gcc generates rubbish with -mshort
           Product: gcc
           Version: 6.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dhowells at redhat dot com
  Target Milestone: ---

In certain cases gcc wants to generate the equivalent of:

    move.b (%a0,-1),foo

but instead of generating:

    moveq #-1,%d0
    move.b (%a0,%d0.l),foo

or similar, it generates the bogus sequence:

    moveq #0,%d0
    not.w %d0

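which would be the fast way of generating a 16-bit -1 value rather than a
32-bit one.  It leaves 0x0000ffff in %d0, which is only -1 if the register is
read as a word; read as a long, it is 65535.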
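
To make the difference concrete, here is a small C model of the two sequences.
It is only a sketch: it is not gcc code, the function names are made up for
the illustration, and it assumes nothing beyond the documented behaviour of
moveq (sign-extends its 8-bit immediate to 32 bits) and not.w (inverts only
the low word of the register).

```c
#include <assert.h>
#include <stdint.h>

/* Model of "moveq #0,%d0 ; not.w %d0": clear all 32 bits of the
 * register, then invert only its low 16 bits. */
uint32_t d0_moveq0_notw(void)
{
        uint32_t d0 = 0;                                /* moveq #0,%d0 */
        d0 = (d0 & 0xffff0000u) | (~d0 & 0x0000ffffu);  /* not.w %d0 */
        return d0;
}

/* Model of "moveq #-1,%d0": the immediate is sign-extended to 32 bits. */
uint32_t d0_moveq_minus1(void)
{
        return 0xffffffffu;
}

/* Index as seen by (%a1,%d0.l): all 32 bits, two's complement. */
int64_t index_long(uint32_t d0)
{
        return (d0 & 0x80000000u) ? (int64_t)d0 - 4294967296LL : (int64_t)d0;
}

/* Index as seen by (%a1,%d0.w): low 16 bits, sign-extended. */
int64_t index_word(uint32_t d0)
{
        int64_t w = d0 & 0xffffu;
        return (w & 0x8000) ? w - 65536 : w;
}

/* Values the two sequences produce under this model. */
void check_model(void)
{
        assert(d0_moveq0_notw() == 0x0000ffffu);
        assert(index_long(d0_moveq0_notw()) == 65535);  /* what the loop uses */
        assert(index_word(d0_moveq0_notw()) == -1);     /* what was presumably meant */
        assert(index_long(d0_moveq_minus1()) == -1);
}
```

Under this model the generated sequence only yields an index of -1 if the
register is used as %d0.w; with %d0.l the index is 65535.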

Compiling:

        void *my_memcpy(void *d, const void *s, long sz)
        {
                unsigned char *dp = d;
                const unsigned char *sp = s;
                while (sz--)
                        *dp++ = *sp++;
                return d;
        }

with:

    m68k-linux-gnu-gcc -m68000 -S /tmp/memcpy.c -o - -O2 -mshort

produces:

    my_memcpy:
        link.w %fp,#0
        move.l %a3,-(%sp)
        move.l %a2,-(%sp)
        move.l 8(%fp),%a0
        move.l 16(%fp),%d0
        jeq .L6
        move.l %a0,%a2
        ext.l %d0                      <--- isn't the size a 'long int'?
        lea (%a0,%d0.l),%a3
        move.l 12(%fp),%a1
        moveq #0,%d0                   <--- dodgy bit
        not.w %d0                      <---
    .L3:
        addq.l #1,%a1
        move.b (%a1,%d0.l),(%a2)+      <--- should this be %d0.w as the index?
        cmp.l %a2,%a3
        jne .L3
    .L6:
        move.l %a0,%d0
        move.l (%sp)+,%a2
        move.l (%sp)+,%a3
        unlk %fp
        rts

Dropping the -mshort, I see:

    my_memcpy:
        link.w %fp,#0
        move.l %a2,-(%sp)
        move.l 8(%fp),%a0
        move.l 12(%fp),%a1
        tst.l 16(%fp)
        jeq .L6
        move.l %a0,%a2
        move.l 16(%fp),%d0
        add.l %a1,%d0
    .L3:
        move.b (%a1)+,(%a2)+
        cmp.l %a1,%d0
        jne .L3
    .L6:
        move.l %a0,%d0
        move.l (%sp)+,%a2
        unlk %fp
        rts

which looks correct, though sz is tested (tst.l) and then loaded into %d0 by
two separate instructions; these could possibly be combined, since a move to a
data register also sets the condition codes.

I'm using a gcc-6.1.1 cross-compiler running on x86_64, but the problem also
shows up with gcc-5.3.1.  The gcc-6.1.1 compiler is svn rev 237634, dated
20160621.
