+VECT_VAR_DECL(expected,poly,8,16) [] = { 0x33, 0x33, 0x33, 0x33, + 0x33, 0x33, 0x33, 0x33, + 0x33, 0x33, 0x33, 0x33, + 0x33, 0x33, 0x33, 0x33 }; +VECT_VAR_DECL(expected,poly,16,8) [] = { 0x3333, 0x3333, 0x3333, 0x3333, + 0x3333, 0x3333, 0x3333, 0x3333 };
No poly ops for vmlx.
+VECT_VAR_DECL(expected,hfloat,32,4) [] = { 0x45f0ae15, 0x45f0b615, + 0x45f0be15, 0x45f0c615 }; +
These expected results are calculated using chained(as opposed to fused) float MACs, right?
Otherwise, LGTM. Tejas.