On Fri, Sep 27, 2013 at 04:19:45PM +0100, Vidya Praveen wrote:
> On Fri, Sep 27, 2013 at 03:50:08PM +0100, Vidya Praveen wrote:
> [...]
> > > > I can't really insist on the single lane load.. something like:
> > > >
> > > > vc:V4SI[0] = c
> > > > vt:V4SI = vec_duplicate:V4SI (vec_select:SI vc:V4SI 0)
> > > > va:V4SI = vb:V4SI <op> vt:V4SI
> > > >
> > > > Or is there any other way to do this?
> > >
> > > Can you elaborate on "I can't really insist on the single lane load"?
> > > What's the single lane load in your example?
> >
> > Loading just one lane of the vector like this:
> >
> > vc:V4SI[0] = c // from the above scalar example
> >
> > or
> >
> > vc:V4SI[0] = c[2]
> >
> > is what I meant by single lane load. In this example:
> >
> > t = c[2]
> > ...
> > vb:v4si = b[0:3]
> > vc:v4si = { t, t, t, t }
> > va:v4si = vb:v4si <op> vc:v4si
> >
> > If we are expanding the CONSTRUCTOR as vec_duplicate at vec_init, I cannot
> > insist 't' to be vector and t = c[2] to be vect_t[0] = c[2] (which could be
> > seen as vec_select:SI (vect_t 0) ).
> >
> > > I'd expect the instruction
> > > pattern as quoted to just work (and I hope we expand an uniform
> > > constructor { a, a, a, a } properly using vec_duplicate).
> >
> > As much as I went through the code, this is only done using vect_init. It is
> > not expanded as vec_duplicate from, for example, store_constructor() of
> > expr.c
>
> Do you see any issues if we expand such constructor as vec_duplicate directly
> instead of going through vect_init way?

Sorry, that was a bad question.

But here's what I would like to propose as a first step. Please tell me if this
is acceptable or if it makes sense:

- Introduce standard pattern names
"vmulim4" - vector multiply with an indexed element as the second operand
Example:
(define_insn "vmuliv4si4"
  [(set (match_operand:V4SI 0 "register_operand")
        (mult:V4SI (match_operand:V4SI 1 "register_operand")
                   (vec_duplicate:V4SI
                     (vec_select:SI
                       (match_operand:V4SI 2 "register_operand")
                       (match_operand:SI 3 "immediate_operand")))))]
  ...
)
"vlmovmn3" - move where one of the operands is a specific lane of a vector and
the other is a scalar.
Example:
(define_insn "vlmovv4sisi3"
  [(set (vec_select:SI (match_operand:V4SI 0 "register_operand")
                       (match_operand:SI 1 "immediate_operand"))
        (match_operand:SI 2 "memory_operand"))]
  ...
)
- Identify the following idiom and expand through the above standard patterns:
  t = c[m]
  vc[0:n] = { t, t, t, t }
  a[0:n] = b[0:n] * vc[0:n]
as
  (insn (set (vec_select:SI (reg:V4SI 0) 0) (mem:SI ... )))
  (insn (set (reg:V4SI 1)
             (mult:V4SI (reg:V4SI 2)
                        (vec_duplicate:V4SI (vec_select:SI (reg:V4SI 0) 0)))))
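
To make the idiom concrete, here is a minimal C sketch of the source form and
of what the two insns above would compute for one V4SI vector. The function
names and the LANES macro are made up for illustration and are not part of GCC.
The second function only models the intended semantics (load c[m] into lane 0,
then multiply every lane of b by that lane); it is not how the expander would
be written.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4 /* V4SI: four 32-bit integer lanes (illustrative name) */

/* Source-level form of the idiom: t = c[m]; a[0:n] = b[0:n] * t. */
void mul_by_element_scalar(int32_t *a, const int32_t *b,
                           const int32_t *c, int m, int n)
{
    assert(m >= 0 && n >= 0);
    int32_t t = c[m];
    for (int i = 0; i < n; i++)
        a[i] = b[i] * t;
}

/* Hypothetical model of the two proposed insns for a single vector:
   1. single lane load: vc[0] = c[m]; the other lanes are left unused
   2. multiply by the indexed element: a = b * dup(vc[0])  */
void mul_by_element_v4si(int32_t a[LANES], const int32_t b[LANES],
                         const int32_t *c, int m)
{
    assert(m >= 0);
    int32_t vc[LANES] = { 0, 0, 0, 0 };
    vc[0] = c[m];                 /* set (vec_select vc 0) (mem) */
    for (int lane = 0; lane < LANES; lane++)
        a[lane] = b[lane] * vc[0]; /* mult b, vec_duplicate (vec_select vc 0) */
}
```

For n equal to 4 both functions should produce the same result, which is the
equivalence the expansion relies on.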
If this path is acceptable, then I can extend this to support
"vmaddim4" - multiply and add (with indexed element as multiplier)
"vmsubim4" - multiply and subtract (with indexed element as multiplier)
Please let me know your thoughts.
Cheers
VP