On Fri, Nov 12, 2010 at 21:29:58 +0000, Eduardo Horvath wrote:

> > On Fri, Nov 12, 2010 at 09:23:07PM +0000, Eduardo Horvath wrote:
> > > On Sat, 13 Nov 2010, Valeriy E. Ushakov wrote:
> > > 
> > > > On Fri, Nov 12, 2010 at 21:59:54 +0100, Joerg Sonnenberger wrote:
> > > > 
> > > > > On Fri, Nov 12, 2010 at 08:31:39PM +0000, Eduardo Horvath wrote:
> > > > > > The assignment:
> > > > > > 
> > > > > > foo.size = htole64(size);
> > > > > > 
> > > > > > Cannot be replaced with:
> > > > > > 
> > > > > > __inline __asm("stxa %1, [%0] ASI_LITLE" : &foo.size : size);
> > > > > 
> > > > > Actually, it should be possible to do exactly that if you allow the
> > > > > output register to be a memory location. Not sure how smart GCC is for
> > > > > this though.
> > > > 
> > > > Yeah, I don't see why memory operand constraint wouldn't work here.
> > > > SPARC 'T' constraint - memory address aligned at 8 bytes.
> > > 
> > > 'Cause you're storing to foo.size which is not a parameter in the macro.
> > 
> > Doesn't really matter, as the compiler can reorder that and with the
> > right constraints, it certainly should.
> 
> So instead of a single store you force a register spill, a load, and 
> another store?  That's even worse than 8 shifts, ands and ors.

    #define htole64(x)              \
    ({                              \
        uint64_t result;            \
                                    \
        asm("ldxa %1 ASI_LITLE, %0" \
            : "=r"(result)          \
            : "m"(x));              \
                                    \
        result;                     \
    })

    // just to exercise it a bit
    struct s {
        uint64_t foo[12];
    };

    uint64_t
    foo(struct s *s)
    {
        return s->foo[10] + htole64(s->foo[11]);
    }


compiles to pretty sane code:

foo:
        ldx     [%o0+80], %g1
        ldxa    [%o0+88] ASI_LITLE, %o0
        jmp     %o7+8
         add    %o0, %g1, %o0
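
For comparison, the generic fallback that the "8 shifts, ands and ors"
remark refers to might look roughly like the sketch below. The name
bswap64_generic is made up for illustration and is not taken from any
header in the thread; it is only meant to show the work that the single
ldxa above avoids on a big-endian machine.

```c
#include <stdint.h>

/*
 * Hypothetical helper (name is an assumption, not from the thread):
 * reverse the byte order of a 64-bit value with plain shifts, masks
 * and ors, i.e. the portable way to convert between big-endian and
 * little-endian representations without a special load/store.
 */
static uint64_t
bswap64_generic(uint64_t x)
{
    return ((x & 0x00000000000000ffULL) << 56) |
           ((x & 0x000000000000ff00ULL) << 40) |
           ((x & 0x0000000000ff0000ULL) << 24) |
           ((x & 0x00000000ff000000ULL) <<  8) |
           ((x & 0x000000ff00000000ULL) >>  8) |
           ((x & 0x0000ff0000000000ULL) >> 24) |
           ((x & 0x00ff000000000000ULL) >> 40) |
           ((x & 0xff00000000000000ULL) >> 56);
}
```

Applying it twice gives back the original value, which makes for a quick
sanity check.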

-uwe
