On 2/4/2012 7:37 PM, Martin Nowak wrote:
On 05.02.2012 02:13, Manu <turkey...@gmail.com> wrote:

On 5 February 2012 03:08, Martin Nowak <d...@dawgfoto.de> wrote:

Let me restate the main point.
Your approach to a higher level module wraps intrinsics with named
functions.
There is little gain in turning simd(AND, f, f2) into and(f, f2) when you
can easily take this to the level GLSL achieves.


What is missing to reach that level, in your opinion? I think I basically
offer that (with some more work).
It's not clear to me what you object to...
I'm not prohibiting the operators, just adding the explicit functions,
which may be more efficient in certain cases (they receive the version).

Also, the 'gains' of wrapping an intrinsic in an almost identical function
are portability and potential optimisation for hardware versioning. I'm
specifically trying to build something that's barely above the intrinsics
here, although a lot of the more arcane intrinsics are being collated into
their typically useful functionality.

Are you just focused on the primitive math ops, or something broader?

GLSL achieves very clear and simple-to-write construction and conversion
of values.

I think wrapping the core.simd vector types in an alias this struct makes
it a snap to define conversion through constructors and swizzling through
properties/opDispatch. Then you can overload operators to do the
implementation-specific stuff and add named methods for the rest.
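
For illustration, a minimal, untested sketch of that kind of wrapper (the Vec4 name is hypothetical, and the swizzle handling is deliberately simplified to 4-letter xyzw accesses only) could look roughly like this:

import core.simd;

// Sketch: wrap a core.simd float4, forward everything to the raw vector
// via 'alias this', and turn vec.wzyx-style member accesses into shuffles
// through opDispatch on the compile-time member name.
struct Vec4
{
    float4 v;
    alias v this;               // fall back to the raw vector for anything else

    // conversion through a constructor, as described above
    this(float x, float y, float z, float w)
    {
        v.array[0] = x; v.array[1] = y; v.array[2] = z; v.array[3] = w;
    }

    // vec.xyzw, vec.wzyx, ... (only 4-letter swizzles handled here)
    Vec4 opDispatch(string swiz)() const
        if (swiz.length == 4)
    {
        Vec4 r;
        foreach (i, c; swiz)
            r.v.array[i] = v.array[c == 'x' ? 0 :
                                   c == 'y' ? 1 :
                                   c == 'z' ? 2 : 3];
        return r;
    }
}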


The GLSL or HLSL syntax is fairly nice, but it enjoys a few advantages that are harder to come by with PC SIMD:

The hardware that runs HLSL can operate on data types 'smaller' than the register, either handled natively or by turning the instructions into a mass of scalar ops that are then run in parallel as best as possible. In SIMD land on CPUs the design is much more rigid: we are effectively stuck using float and float4 data types, and emulating float2 and float3. For a very long time there was not even a dot product instruction, as from Intel's point of view your data is transposed incorrectly if you need to do one (plus they would have to handle dot2, dot3, dot4, etc.).
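
To make that concrete, without a dedicated instruction a 4-wide dot product ends up as a lane-wise multiply followed by a horizontal add; a rough sketch (the dot4 helper name is hypothetical):

import core.simd;

// Sketch: dot product built from a per-lane multiply plus a horizontal
// add, since there is no single instruction to lean on.
float dot4(float4 a, float4 b)
{
    float4 p = a * b;   // per-lane multiply
    return p.array[0] + p.array[1] + p.array[2] + p.array[3];
}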

The cost of this emulation of float2 and float3 types is that we have to put 'some data' in the unused slots of the SIMD register on swizzle operations, which will usually lead to the SIMD instructions generating INFs and NaNs in those slots and hurting performance.
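
One common workaround (again just an illustrative sketch, with a hypothetical helper name) is to force the spare lane to a known value when a float3 is loaded into a float4 register, so later SIMD ops don't chew on garbage there:

import core.simd;

// Sketch: load a float3 into a float4 register with the unused lane
// zeroed, rather than leaving whatever garbage happens to be there.
float4 loadFloat3(const float[3] src)
{
    float4 r = 0;          // scalar init broadcasts 0 to every lane
    r.array[0] = src[0];
    r.array[1] = src[1];
    r.array[2] = src[2];
    return r;              // lane 3 is 0, not a potential INF/NaN
}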

The other major problem with the shader swizzle syntax is that it doesn't scale. If you are using a 128-bit register holding 8 shorts or 16 bytes, what are the letters here? Shaders assume 4 is the limit, so you have either xyzw or rgba. Then there are platform considerations (e.g. you can't swizzle 8-bit data on SSE, you have to use a series of pack/unpack and shuffles, but VMX can do it easily).

That said, shader swizzle syntax is very nice; it can certainly reduce the amount of code you write by a huge factor (though the codegen is another matter). Even silly tricks with swizzling literals in HLSL are useful, like the following code to sum up some numbers:

if (dot(a, 1.f.xxx) > 0)

