Re: [HACKERS] asynchronous and vectorized execution

Pavel Stehule Mon, 09 May 2016 23:10:03 -0700

2016-05-10 8:05 GMT+02:00 David Rowley <david.row...@2ndquadrant.com>:


> On 10 May 2016 at 16:34, Greg Stark <st...@mit.edu> wrote:
> >
> > On 9 May 2016 8:34 pm, "David Rowley" <david.row...@2ndquadrant.com> wrote:
> >>
> >> This project does appear to require that we bloat the code with 100's
> >> of vector versions of each function. I'm not quite sure if there's a
> >> better way to handle this. The problem is that the fmgr is pretty much
> >> a barrier to SIMD operations, and this was the only idea that I've had
> >> so far about breaking through that barrier. So further ideas here are
> >> very welcome.
> >
> > Well yes and no. In practice I think you only need to worry about
> > vectorised versions of integer and possibly float. For other data types
> > there either aren't vectorised operators or there's little using them.
> >
> > And I'll make a bold claim here that the only operators I think really
> > matter are =
> >
> > The reason is that using SIMD instructions is a minor win if you have
> > any further work to do per tuple. The only time it's a big win is if
> > you're eliminating entire tuples from consideration efficiently. = is
> > going to do that often, other btree operator classes might be somewhat
> > useful, but things like + really only would come up in odd examples.
> >
> > But even that understates things. If you have column oriented storage
> > then = becomes even more important since every scan has a series of
> > implied equijoins to reconstruct the tuple. And the coup de grace is
> > that in a column oriented storage you try to store variable length data
> > as integer indexes into a dictionary of common values so *everything* is
> > an integer = operation.
> >
> > How to do this without punching right through the executor as an
> > abstraction and still supporting extensible data types and operators was
> > puzzling me already. I do think it involves having these vector operators
> > in the catalogue and also some kind of compression mapping to integer
> > indexes. But I'm not sure that's all that would be needed.
>
> Perhaps the first move to make on this front will be for aggregate
> functions. It might be quite simple to find out by experiment which
> functions will bring enough benefit. I imagined that even Datums where
> the type is not processor native might yield a small speedup, not from
> SIMD, but just from fewer calls through fmgr. Perhaps we'll realise
> that those are not worth the trouble, I've no idea at this stage.
>
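
As a rough illustration of the dictionary point quoted above: once
variable length values are replaced by integer codes, an equality filter
becomes a tight loop over an int32 array. This is only a sketch under
assumptions. The names dict_lookup and filter_eq_int32 are hypothetical
and are not PostgreSQL functions, and the selection-vector output is just
one possible design. The loop is written branch-free so a compiler has a
chance to auto-vectorise it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: map a string to its integer code in a small
 * dictionary. Returns -1 when the value is not in the dictionary.
 */
static int32_t
dict_lookup(const char *const *dict, size_t ndict, const char *value)
{
    for (size_t i = 0; i < ndict; i++)
    {
        if (strcmp(dict[i], value) == 0)
            return (int32_t) i;
    }
    return -1;
}

/*
 * Hypothetical vector operator: store the positions of all rows whose
 * code equals target into sel and return the number of matches.
 * sel must have room for nrows entries.
 */
static size_t
filter_eq_int32(const int32_t *codes, size_t nrows, int32_t target,
                uint32_t *sel)
{
    size_t nmatch = 0;

    for (size_t i = 0; i < nrows; i++)
    {
        /* Always write the position, only advance on a match. */
        sel[nmatch] = (uint32_t) i;
        nmatch += (codes[i] == target);
    }
    return nmatch;
}
```

The string comparison happens once, when the constant is translated to a
code. Every row after that costs one integer comparison.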

It can be reduced to sum and count in the first iteration. On the other
hand, a lot of OLAP reports are based on pretty complex expressions, and
there compilation is probably the better way.
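
A minimal sketch of what "sum and count" over a batch could look like,
next to a per-value version that goes through a function pointer. The
function pointer only stands in for the cost of an indirect call per
value; it is not the real fmgr interface. SumCountState and all function
names are hypothetical, and NULL handling is left out to keep it short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transition state shared by sum and count. */
typedef struct SumCountState
{
    int64_t sum;
    int64_t count;
} SumCountState;

/* Per-value transition function, called once for every input value. */
static void
sum_count_step(SumCountState *state, int32_t value)
{
    state->sum += value;
    state->count++;
}

typedef void (*step_fn) (SumCountState *state, int32_t value);

/* One indirect call per value, as a stand-in for a call through fmgr. */
static void
aggregate_per_value(SumCountState *state, step_fn fn,
                    const int32_t *values, size_t n)
{
    for (size_t i = 0; i < n; i++)
        fn(state, values[i]);
}

/* One call per batch; the inner loop is plain integer addition. */
static void
sum_count_batch(SumCountState *state, const int32_t *values, size_t n)
{
    int64_t sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += values[i];

    state->sum += sum;
    state->count += (int64_t) n;
}
```

Both variants produce the same state. The batch variant pays the call
overhead once per batch instead of once per value, which is the effect
David describes for types that are not processor native as well.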

Regards

Pavel


>
> --
>  David Rowley                   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
