[Mono-dev] Mono.Simd AltiVec port

2010-02-02 Thread Sergei Dyshel
Hello all, I'm currently working on a PowerPC port of Mono that uses AltiVec SIMD instructions. During development I've encountered an alignment problem: as far as I understand from running Mono's JIT, stack-allocated Mono.Simd.Vector* types are always aligned on a 16-byte boundary, but global

Re: [Mono-dev] Mono.Simd AltiVec port

2010-02-03 Thread Rodrigo Kumpera
Hi Sergei,
On Tue, Feb 2, 2010 at 6:59 AM, Sergei Dyshel wrote:
> Hello all,
>
> I'm currently working on PowerPC port of Mono which utilizes AltiVec SIMD
> instructions. During the development I've encountered an alignment problem:
>
> As far as I understood from running Mono's JIT, stack-alloca

Re: [Mono-dev] Mono.Simd AltiVec port

2010-02-09 Thread Sergei Dyshel
Hi, Now I'm stuck on another problem on PPC. For float multiplication, AltiVec has only a fused multiply-add instruction, which computes a*b+c. So in order to implement OP_MULPS I need to ensure c==0. The only solution that comes to mind is:
    XZERO D
    MULADD D <= S1, S2, D
Where MULADD is the instruction and
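
The idea of getting a plain multiply out of a multiply-add with a zero addend can be sketched in portable C. This is only a scalar model, not AltiVec or Mono code: the type vec4f and the functions model_muladd, model_zero and model_mulps are hypothetical names invented for illustration, and each vector is modelled as four ordinary floats.

```c
#include <stddef.h>

#define LANES 4  /* an AltiVec vector holds four 32-bit floats */

/* Hypothetical stand-in for a 128-bit float vector register. */
typedef struct { float f[LANES]; } vec4f;

/* Models a per-lane multiply-add: d = a * b + c. */
vec4f model_muladd(vec4f a, vec4f b, vec4f c)
{
    vec4f d;
    for (size_t i = 0; i < LANES; i++)
        d.f[i] = a.f[i] * b.f[i] + c.f[i];
    return d;
}

/* Models the zeroing step: every lane set to +0.0. */
vec4f model_zero(void)
{
    vec4f z = { { 0.0f, 0.0f, 0.0f, 0.0f } };
    return z;
}

/* Multiply expressed as multiply-add with a zero addend.
 * Note: with a +0.0 addend, a product of -0.0 comes out as +0.0
 * under IEEE round-to-nearest; this model does not address that. */
vec4f model_mulps(vec4f a, vec4f b)
{
    return model_muladd(a, b, model_zero());
}
```

Calling model_mulps on two vectors gives the lane-wise product, at the cost of one extra zeroing step per multiply.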

Re: [Mono-dev] Mono.Simd AltiVec port

2010-02-11 Thread Rodrigo Kumpera
The way to handle those situations is to have an arch decomposition pass that converts MULPS into a VZERO + MULADD. For bonus points, you can add to the arch peephole code to fuse MULPS + ADDPS. For an example of that, take a look at mini-x86.c / mono_arch_decompose_opts. Rodrigo On Tue, Feb 9, 2
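
A decomposition pass of this kind might look roughly like the following toy sketch. It does not use Mono's real IR or the actual signature of mono_arch_decompose_opts: the ToyOpcode enum, the ToyIns struct and toy_decompose_mulps are all hypothetical, and the sketch assumes a supply of fresh temporary registers starting at first_tmp so that the zeroed register never aliases a source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes; these are not Mono's real definitions. */
typedef enum { TOY_MULPS, TOY_ADDPS, TOY_VZERO, TOY_MULADD } ToyOpcode;

typedef struct {
    ToyOpcode op;
    int dreg;   /* destination vector register */
    int sreg1;  /* first source, -1 if unused */
    int sreg2;  /* second source, -1 if unused */
    int sreg3;  /* addend, used only by TOY_MULADD, else -1 */
} ToyIns;

/* Copies `in` to `out`, rewriting each TOY_MULPS as
 *   TOY_VZERO  tmp
 *   TOY_MULADD dreg <= sreg1, sreg2, tmp
 * Each rewrite takes a fresh temporary (assumption: registers from
 * first_tmp upward are free). Returns the number of instructions
 * written, or SIZE_MAX if `out` is too small. */
size_t toy_decompose_mulps(const ToyIns *in, size_t n,
                           ToyIns *out, size_t cap, int first_tmp)
{
    size_t w = 0;
    int tmp = first_tmp;

    for (size_t i = 0; i < n; i++) {
        if (in[i].op == TOY_MULPS) {
            if (cap - w < 2)
                return SIZE_MAX;
            ToyIns zero = { TOY_VZERO, tmp, -1, -1, -1 };
            ToyIns madd = { TOY_MULADD, in[i].dreg,
                            in[i].sreg1, in[i].sreg2, tmp };
            out[w++] = zero;
            out[w++] = madd;
            tmp++;
        } else {
            if (cap - w < 1)
                return SIZE_MAX;
            out[w++] = in[i];  /* other opcodes pass through unchanged */
        }
    }
    return w;
}
```

Using a separate temporary for the zero keeps the sketch correct even when the destination is also one of the sources; a real backend would get that register from its own allocator.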

Re: [Mono-dev] Mono.Simd AltiVec port

2010-05-02 Thread Sergei Dyshel
Hello Rodrigo and all, Returning to my old problem, which deals with the alignment of vector variables. I noticed that on x86 vector locals are aligned on an 8-byte boundary instead of a 16-byte one, which forces the use of 'movups' instead of the much more efficient 'movaps'. On PowerPC there is no such "bug", so I tri
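
The alignment constraint can be illustrated with a small sketch: 'movaps' requires its memory operand to be 16-byte aligned, while 'movups' accepts any address. The helpers toy_align_up and toy_pick_move and the ToyMove enum are hypothetical names for illustration only, and are not how Mono's JIT lays out its stack frames.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the two x86 SSE move instructions. */
typedef enum { TOY_MOVUPS, TOY_MOVAPS } ToyMove;

/* Rounds `offset` up to the next multiple of `align`.
 * `align` must be a nonzero power of two. */
uint32_t toy_align_up(uint32_t offset, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Picks the move that is legal for a vector stored at `addr`:
 * the aligned form only when the address is a multiple of 16. */
ToyMove toy_pick_move(uintptr_t addr)
{
    return (addr % 16 == 0) ? TOY_MOVAPS : TOY_MOVUPS;
}
```

For example, a slot at offset 24 in a 16-byte-aligned frame is only 8-byte aligned and needs the unaligned move, whereas rounding the offset up with toy_align_up(24, 16) gives 32 and allows the aligned one.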