On 12/01/12 8:13 PM, Norbert Nemec wrote:
> Considering these hardware details of the SSE architecture alone, I fear
> that portable low-level support for SIMD is very hard to achieve. If you
> want to offer access to the raw power of each architecture, it might be
> simpler to have machine-specific language extensions for SIMD and leave
> the portability for a wrapper library with a common front-end and
> various back-ends for the different architectures.

You are right, but don't forget that the same is true of operations already in the language. For example, a variable shift such as (1 << x) is very slow on PPUs, because it is microcoded there.

It's simply not possible to be portable and also achieve maximum performance with any language feature, not just vectors. Algorithms have to be tuned for specific architectures inside version statements. However, you can get a decent baseline by providing the lowest common denominator of functionality. This v128 type (or whatever it ends up being called) does that.
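
To make the point concrete with the shift example, here is a rough sketch of a portable baseline plus a per-architecture override selected by a version statement. The names bitMask, maskTable, buildMaskTable and UseShiftTable are made up for illustration, and the table lookup is only an assumed tuning for PowerPC-style cores. Whether it actually beats the shift on a given chip has to be measured.

```d
// Select the tuned path on PowerPC targets (PPC and PPC64 are predefined
// D version identifiers). UseShiftTable is a hypothetical name.
version (PPC)   version = UseShiftTable;
version (PPC64) version = UseShiftTable;

// Build the 32 single-bit masks; evaluated at compile time below.
uint[32] buildMaskTable() pure nothrow @safe
{
    uint[32] t;
    foreach (i; 0 .. 32)
        t[i] = 1u << i;
    return t;
}

immutable uint[32] maskTable = buildMaskTable();

// Returns a mask with only bit x set; x is taken modulo 32.
uint bitMask(uint x) pure nothrow @nogc @safe
{
    version (UseShiftTable)
    {
        // Assumed tuning: avoid the variable shift by using a lookup.
        return maskTable[x & 31];
    }
    else
    {
        // Portable baseline: the plain shift is fine on most targets.
        return 1u << (x & 31);
    }
}
```

Callers just write bitMask(x) everywhere and get the same result on every target. Only the code inside the version blocks differs, which is the same arrangement a baseline vector type would allow for SIMD code.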
