== Quote from Manu Evans (turkey...@gmail.com)'s article
> > > How can I do this in a nice way in D? I'm long sick of writing
> > > unsightly vector classes in C++, but fortunately using vendor
> > > specific compiler intrinsics usually leads to decent code
> > > generation. I can currently imagine an equally ugly (possibly worse)
> > > hardware vector library in D, if it's even possible. But perhaps
> > > I've missed something here?
> > Your C++ vector code should be amenable to translation to D, so that effort of
> > yours isn't lost, except that it'd have to be in inline asm rather than intrinsics.
> But sadly, in that case, it wouldn't work. Without an intrinsic hardware vector type, there's
> no way to pass vectors to functions in registers, and also, using explicit asm, you tend to
> end up with endless unnecessary loads and stores, and potentially a lot of redundant
> shuffling/permutation. This will differ radically between architectures too.
> I think I read in another post too that functions containing inline asm will not be inlined?
> How does the D compiler go at optimising code around inline asm blocks? Most compilers have a
> lot of trouble optimising around inline asm blocks, and many don't even attempt to do so...
> How does GDC compare to DMD? Does it do a good job?
> I really need to take the weekend and do a lot of experiments I think.

GDC is just the same as DMD (same runtime library implementation for vector array operations).

You can define vector types in the language through use of GCC's vector_size attribute though (it's a pragma in GDC), then use a union to interface between it and the corresponding static array. It's deliberately UGLY and PRONE to you hitting lots of brick walls if you don't handle them in a very specific way though. :~)

Stock example:

    // 16 bytes = four packed floats
    pragma(attribute, vector_size(16)) typedef float __v4sf_t;

    union __v4sf {
        float[4] f;
        __v4sf_t v;
    }

    __v4sf a = {[1,2,3,4]},
           b = {[1,2,3,4]},
           c;

    c.v = a.v + b.v;
    assert(c.f == [2,4,6,8]);

The assignment compiles down to ~5 instructions:

    movaps -0x88(%ebp),%xmm1
    movaps -0x78(%ebp),%xmm0
    addps  %xmm1,%xmm0
    movaps %xmm0,-0x68(%ebp)
    flds   -0x68(%ebp)

And is far quicker than c[] = a[] + b[] due to it being inlined, and not an external library call.

Regards
Iain
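
P.S. For comparison, the same union-plus-vector-type trick can be sketched in plain C with GCC's vector_size attribute, which is the feature the GDC pragma maps onto. This is only a rough sketch: the names v4sf_t, v4sf and v4sf_add are made up for illustration, and it assumes a GCC-compatible compiler (GCC or Clang) that supports vector extensions.

```c
#include <assert.h>

/* GCC vector extension: a 16-byte vector holding four packed floats.
   The type name v4sf_t is hypothetical, chosen for this sketch. */
typedef float v4sf_t __attribute__((vector_size(16)));

/* Union that overlays the vector with a plain float array,
   so individual lanes can be read and written by index. */
typedef union {
    float  f[4];
    v4sf_t v;
} v4sf;

/* Element-wise add of two vectors using the vector type's '+' operator.
   Whether this becomes a single SIMD add depends on target and flags. */
static v4sf v4sf_add(v4sf a, v4sf b)
{
    v4sf c;
    c.v = a.v + b.v;
    return c;
}
```

Calling v4sf_add on two unions initialised with {1, 2, 3, 4} should leave {2, 4, 6, 8} in the result's f array. On an SSE-capable x86 target with optimisation enabled I would expect the add to come out as an addps, but I haven't checked the generated code for this exact snippet.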