On 10/17/2017 07:22 PM, Jan Hubicka wrote:
According to Agner's tables, gathers range from 12 ops (vgatherdpd)
to 66 ops (vpgatherdd). I assume that the CPU needs to do the following:
In our code, it is basically a "don't care" how much work a gather
instruction has to do.
Without
> Please look at the testsuite fallout in detail. Note that only
> testcases that do not disable the cost model should be affected
> (all vect.exp testcases disable the cost model for example).
>
> The patch itself looks mostly good, I suppose if we also have
> separate costs for float vs.
On Thu, 19 Oct 2017, Jan Hubicka wrote:
> Hi,
> this is a proof of concept patch for vectorizer costs to use the costs used for
> rtx_cost and register_move_cost, which are readily available in ix86_costs,
> instead of using its own set of random values. At least until we have proof of evidence
Hi,
this is a proof of concept patch for vectorizer costs to use the costs used for
rtx_cost and register_move_cost, which are readily available in ix86_costs,
instead of using its own set of random values. At least until we have proof of
evidence that vectorizer costs need to differ, I do not think
> > Those instructions seem similarly expensive in the Intel implementation.
> > http://users.atw.hu/instlatx64/GenuineIntel0050654_SkylakeXeon9_InstLatX64.txt
> > lists latencies ranging from 18 to 32 cycles.
> >
> > Of course it may also be the case that the utility is measuring gathers
> >
On Wed, 18 Oct 2017, Jan Hubicka wrote:
> > > According to Agner's tables, gathers range from 12 ops (vgatherdpd)
> > > to 66 ops (vpgatherdd). I assume that the CPU needs to do the following:
> > >
> > > 1) transfer the offsets sse->ALU unit for address generation (3 cycles
> > > each, 2 ops)
> > >
> > According to Agner's tables, gathers range from 12 ops (vgatherdpd)
> > to 66 ops (vpgatherdd). I assume that the CPU needs to do the following:
> >
> > 1) transfer the offsets sse->ALU unit for address generation (3 cycles
> > each, 2 ops)
> > 2) do the address calculation (2 ops, probably 4 ops
On Tue, 17 Oct 2017, Jan Hubicka wrote:
> > On Tue, 17 Oct 2017, Jan Hubicka wrote:
> >
> > > Hi,
> > > gather/scatter loads tend to be expensive (at least for x86) while we
> > > now account for them as vector loads/stores, which are cheap. This patch
> > > adds a vectorizer cost entry for
> On Tue, 17 Oct 2017, Jan Hubicka wrote:
>
> > Hi,
> > gather/scatter loads tend to be expensive (at least for x86) while we now
> > account for them as vector loads/stores, which are cheap. This patch adds
> > a vectorizer cost entry for these so this can be modelled more realistically.
>
On Tue, 17 Oct 2017, Jan Hubicka wrote:
> Hi,
> gather/scatter loads tend to be expensive (at least for x86) while we now
> account for them as vector loads/stores, which are cheap. This patch adds
> a vectorizer cost entry for these so this can be modelled more realistically.
>
>
Hi,
gather/scatter loads tend to be expensive (at least for x86) while we now
account for them as vector loads/stores, which are cheap. This patch adds
a vectorizer cost entry for these so this can be modelled more realistically.
Bootstrapped/regtested x86_64-linux, OK?
Honza
2017-10-17 Jan