Neat proposal! Some thoughts:
0) Seems like you need to add a notion of const expr to the type system for
this proposal, right? I started staring at that and it's pretty subtle
(though I may have been looking at it wrong).

1) Do Rust tuples actually map to the LLVM SIMD vector types?
2) So this would require some special syntax support, right? Could it be
prototyped with a procedural macro, with the shuffle AST code
generated internally?
3) Would the internal representation properly treat the shuffle mask ints
as part of the op itself, so that it won't get CSE'd or the like?
4) Would the syntax give a type / syntax error when you use a tuple position
index that's too large?

5) The LLVM shufflevector instruction takes two vectors of values and lets
you express interleaving them, not just rearranging a single one (which
seems to be a restriction in your example). Both styles matter, and they
correspond to different platform-specific shuffle instructions.
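
To make point 5 concrete, here is a rough sketch of the two-input indexing
rule, modeled on plain arrays rather than real SIMD types. The name `shuffle2`
is made up for illustration, and unlike the real instruction the mask here is
an ordinary run-time value, so this only shows the semantics, not the
const-ness problem. It also assumes a compiler with const generic parameters.

```rust
/// Two-input shuffle over fixed-size arrays (hypothetical helper).
/// Mask values 0..N pick from `a`, values N..2N pick from `b`,
/// following the indexing rule of LLVM's shufflevector.
fn shuffle2<const N: usize, const M: usize>(
    a: [f32; N],
    b: [f32; N],
    mask: [usize; M],
) -> [f32; M] {
    let mut out = [0.0f32; M];
    for (slot, &idx) in out.iter_mut().zip(mask.iter()) {
        *slot = if idx < N {
            a[idx]
        } else {
            // An index >= 2 * N panics here at run time; the proposal
            // would want this rejected at compile time instead.
            b[idx - N]
        };
    }
    out
}
```

With `a = [1, 2, 3, 4]` and `b = [5, 6, 7, 8]`, the mask `[0, 4, 1, 5]`
interleaves the low halves of both inputs, while a mask that only uses
0..3 (for example `[0, 0, 0, 0]`) is the single-vector rearrangement from
the original examples.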

I like the idea of something like this, and it's definitely simpler than
the shuffle proposals I've been trying to draft, though using a word like
"shuffle" may be. It doesn't, however, give a way to apply the same idea to
the platform-specific shuffle intrinsics that will hopefully be added
eventually. (Any such platform-specific intrinsic would be for a fixed
tuple size and type.)

*An alternative approach?*

One way around the const expr requirement for the type system that someone
suggested was pretty neat: expose the various platform-specific SIMD
shuffle ops, and have their shuffle mask int args actually be "type args".
Apparently there's some rudimentary support for type-level numbers because
of fixed-size vectors, and because Rust requires all generics to be
monomorphized, this would actually capture the right "constness at compile
time".

An example of this idea would be to take the Intel VSHUFPD instruction
and modify the intrinsic from the C version

(nb: __m256d == v4f64 in Rust parlance)
  __m256d _mm256_shuffle_pd (__m256d a, __m256d b, const int select);

into

  fn _mm256_shuffle_pd<const int select>(__m256d a, __m256d b) -> __m256d

I'm not sure how such a type-level int application would work out, but it
may be the nicest way to conservatively add type-safe SIMD shuffle primops
to Rust, though I could be completely wrong. (I was initially meh on this
type application idea, but it's grown on me; it exploits the way Rust
generics work very, very nicely!)
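
As a rough illustration of the "mask as a type arg" idea, here is a scalar
model of the 256-bit VSHUFPD selection rule with the selector passed as a
const generic parameter. This is only a sketch: `shuffle_pd` and `SELECT`
are made-up names, plain `[f64; 4]` arrays stand in for `__m256d`, and it
assumes a compiler that accepts const generic parameters.

```rust
/// Scalar model of VSHUFPD on 256-bit vectors (hypothetical helper).
/// The selector is a compile-time parameter, not a run-time argument.
fn shuffle_pd<const SELECT: u8>(a: [f64; 4], b: [f64; 4]) -> [f64; 4] {
    // Bit i of SELECT picks the low (0) or high (1) element of a
    // 128-bit lane for output position i.
    let bit = |i: u8| ((SELECT >> i) & 1) as usize;
    [
        a[bit(0)],     // low lane of a
        b[bit(1)],     // low lane of b
        a[2 + bit(2)], // high lane of a
        b[2 + bit(3)], // high lane of b
    ]
}
```

A call looks like `shuffle_pd::<0b0101>(a, b)`. Because each distinct
selector produces its own monomorphized copy of the function, passing a
`let` variable in the angle brackets is rejected by the compiler, which is
the "constness at compile time" property described above.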

*note* while exposing the architecture-specific intrinsics would be a bit
more work, it would also mean that the SIMD support in Rust would have a
more transparent mapping to the various architectures, allow better
architecture / CPU microarchitecture based tuning (writing a version of BLIS
http://code.google.com/p/blis/ in Rust might be a good stress test), and
be less coupled to the vagaries of how LLVM lowers the shuffle
instruction to the target architecture. This actually matters in the
context of writing code that uses the "optimal" instruction sequence by
detecting the CPU microarchitecture at runtime and branching to the tuned
variant internally, something OpenBLAS does very nicely; see here for
examples: https://github.com/xianyi/OpenBLAS/tree/develop/kernel/x86_64
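
The detect-then-branch pattern could look roughly like the sketch below.
All function names are made up for illustration, the "tuned" kernel is
ordinary Rust standing in for one that would use platform-specific
intrinsics, and the feature check assumes the standard library's
`is_x86_feature_detected!` macro on x86_64.

```rust
/// Portable fallback kernel.
fn dot_generic(x: &[f64], y: &[f64]) -> f64 {
    x.iter().zip(y).map(|(a, b)| a * b).sum()
}

/// Stand-in for a kernel tuned for AVX-capable CPUs (hypothetical: a real
/// one would use platform-specific intrinsics).
fn dot_avx_tuned(x: &[f64], y: &[f64]) -> f64 {
    assert_eq!(x.len(), y.len());
    // Four independent accumulators, the shape a 4 x f64 kernel would have.
    let mut acc = [0.0f64; 4];
    let mut xc = x.chunks_exact(4);
    let mut yc = y.chunks_exact(4);
    for (cx, cy) in xc.by_ref().zip(yc.by_ref()) {
        for i in 0..4 {
            acc[i] += cx[i] * cy[i];
        }
    }
    // Handle the elements left over after the last full chunk of four.
    let tail: f64 = xc
        .remainder()
        .iter()
        .zip(yc.remainder())
        .map(|(a, b)| a * b)
        .sum();
    acc.iter().sum::<f64>() + tail
}

/// Pick a kernel once, based on what the running CPU reports.
fn select_dot() -> fn(&[f64], &[f64]) -> f64 {
    #[cfg(target_arch = "x86_64")]
    {
        if std::is_x86_feature_detected!("avx") {
            return dot_avx_tuned;
        }
    }
    dot_generic
}
```

The caller would do `let dot = select_dot();` once at startup and then use
`dot(&x, &y)` everywhere, so the branch on CPU features is not paid per
call.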

That said, having a systematic way to support the LLVM shuffle instruction
in its full generality would be lovely; it's a much friendlier operation
that people can use to get started with SIMD.

Point being: I support there being better shuffle SIMD support / SIMD
support, period :), though how best to do it seems unclear to me (and
there are also a few ways that aren't good).
-Carter

On Tuesday, January 14, 2014, Richard Diamond wrote:

> Basically the idea here is to support shuffling for SIMD types in a way
> that can be easily lowered to IR (LLVM's shufflevector requires the mask be
> a vector of constants, so an intrinsic function is out of the question);
> however, I imagine this sugar could extend to tuples with multiple types.
>
> Some examples:
>
> let vec = (1.0f32, 2.0f32, 3.0f32, 4.0f32);
> // perhaps this should be "vec <- (0, 0, 0, 0)"?
> let all_x = vec -> (0, 0, 0, 0);
> assert_eq!(all_x, (1.0f32, 1.0f32, 1.0f32, 1.0f32));
> let single_x = vec -> (0);
> assert_eq!(single_x, (1.0f32));
>
> let mut vec = vec;
> vec <- (0) = 5.0f32; // set x only
> vec <- (1, 2) = (6.0f32, 7.0f32); // set y & z
> assert_eq!(vec, (5.0f32, 6.0f32, 7.0f32, 4.0f32));
>
> let vec = vec;
> // the mask may be arbitrarily long:
> assert_eq!(vec -> (0, 1, 2, 3, 0), (5.0f32, 6.0f32, 7.0f32, 4.0f32,
> 5.0f32));
>
> // leaves vec unchanged
> let functional_update = vec -> (0, 1, 3) .. (0.5f32, 1.0f32, 10.0f32);
> // functional_update would take its type from vec
> assert_eq!(vec, (5.0f32, 6.0f32, 7.0f32, 4.0f32));
> assert_eq!(functional_update, (0.5f32, 1.0f32, 7.0f32, 10.0f32));
>
> A couple of things would need to be disallowed, however:
>
> let mut vec = vec;
> // no duplicate assignments/functional updates:
> vec <- (0, 0) = (..);
> let _ = vec -> (0, 1, 2, 3, 0) .. (..);
> // no out-of-bounds:
> vec <- (5, 9000) = (..);
> let _ = vec -> (5, 9001);
> let _ = vec -> (5, 9002) .. (..);
> let _ = vec -> (0, 1, 2, 3, 4) .. (..);
> // all mask values must be a const expr:
> let mut non_const_expr = 15;
> vec <- (non_const_expr) = (..);
> let _ = vec -> (non_const_expr) .. (..);
> let _ = vec -> (non_const_expr);
> // mismatched tuple sizes:
> vec <- (0, 1) = (0.0f32, 0.0f32, 0.0f32);
> let _ = vec -> (0) .. (0.0f32, 0.0f32);
>
> AIUI, the notation would be:
> tuple_mask : '(' integer [ ',' integer ] * ')' ;
> tuple_expr : '(' expr [ ',' expr ] * ')' |
>                   tuple_expr "->" tuple_mask [ ".." tuple_expr ] ? ;
>
> I'm willing to write this myself, but I'd like some consensus/feedback
> regarding ze proposed sugar.
>
_______________________________________________
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev
