2016-05-10 8:05 GMT+02:00 David Rowley <david.row...@2ndquadrant.com>:
> On 10 May 2016 at 16:34, Greg Stark <st...@mit.edu> wrote:
> >
> > On 9 May 2016 8:34 pm, "David Rowley" <david.row...@2ndquadrant.com>
> > wrote:
> >>
> >> This project does appear to require that we bloat the code with 100's
> >> of vector versions of each function. I'm not quite sure if there's a
> >> better way to handle this. The problem is that the fmgr is pretty much
> >> a barrier to SIMD operations, and this was the only idea that I've had
> >> so far about breaking through that barrier. So further ideas here are
> >> very welcome.
> >
> > Well yes and no. In practice I think you only need to worry about
> > vectorised versions of integer and possibly float. For other data types
> > there either aren't vectorised operators or there's little using them.
> >
> > And I'll make a bold claim here that the only operators I think really
> > matter are =
> >
> > The reason is that using SIMD instructions is a minor win if you have
> > any further work to do per tuple. The only time it's a big win is if
> > you're eliminating entire tuples from consideration efficiently. = is
> > going to do that often, other btree operator classes might be somewhat
> > useful, but things like + really only would come up in odd examples.
> >
> > But even that understates things. If you have column oriented storage
> > then = becomes even more important since every scan has a series of
> > implied equijoins to reconstruct the tuple. And the coup de grace is
> > that in a column oriented storage you try to store variable length data
> > as integer indexes into a dictionary of common values so *everything*
> > is an integer = operation.
> >
> > How to do this without punching right through the executor as an
> > abstraction and still supporting extensible data types and operators
> > was puzzling me already. I do think it involves having these vector
> > operators in the catalogue and also some kind of compression mapping to
> > integer indexes. But I'm not sure that's all that would be needed.
>
> Perhaps the first move to make on this front will be for aggregate
> functions. Experimentation might be quite simple to realise which
> functions will bring enough benefit. I imagined that even Datums where
> the type is not processor native might yield a small speedup, not from
> SIMD, but just from less calls through fmgr. Perhaps we'll realise
> that those are not worth the trouble, I've no idea at this stage.
>

It can be reduced to sum and count in the first iteration. On the other
hand, a lot of OLAP reports are based on pretty complex expressions, and
there compilation is probably the better way.

Regards

Pavel

>
> --
> David Rowley http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
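
To make the two ideas above a bit more concrete (an integer "=" over
dictionary-encoded columns, and starting with sum/count), here is a
minimal sketch in plain C. It is not PostgreSQL code and does not use fmgr
or Datum. The names dict_lookup, filter_eq_int32 and sum_selected_int32
are hypothetical, as is the flat int32 array layout for a column. The
point is only that one call handles a whole batch of values, and that the
filter loop is branch-free, which is the kind of loop a compiler may be
able to vectorise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical dictionary lookup: map a string to its integer code.
 * Returns -1 when the value is not in the dictionary.
 */
static int32_t
dict_lookup(const char *const *dict, size_t ndict, const char *key)
{
    for (size_t i = 0; i < ndict; i++)
    {
        if (strcmp(dict[i], key) == 0)
            return (int32_t) i;
    }
    return -1;
}

/*
 * Batch "=" over an int32 column (assumed layout: one flat array).
 * Writes the row numbers of matching rows into sel and returns how many
 * matched. sel must have room for n entries. The store is unconditional
 * and only the counter depends on the comparison, so the loop body has
 * no branch.
 */
static size_t
filter_eq_int32(const int32_t *col, size_t n, int32_t key, uint32_t *sel)
{
    size_t nsel = 0;

    assert(col != NULL && sel != NULL);
    for (size_t i = 0; i < n; i++)
    {
        sel[nsel] = (uint32_t) i;
        nsel += (col[i] == key);
    }
    return nsel;
}

/*
 * Batch sum over the rows picked by a selection vector. The count is
 * simply nsel, so sum and count both come out of one pass, with one
 * function call per batch rather than one per value.
 */
static int64_t
sum_selected_int32(const int32_t *vals, const uint32_t *sel, size_t nsel)
{
    int64_t sum = 0;

    for (size_t i = 0; i < nsel; i++)
        sum += vals[sel[i]];
    return sum;
}
```

With a dictionary {"red", "green", "blue"}, a code column {0,1,2,1,1,0}
and an amount column {10,20,30,40,50,60}, looking up "green" gives code 1,
the filter selects rows 1, 3 and 4, and the sum over those rows is 110
with a count of 3. A string comparison per row has been replaced by one
dictionary lookup plus integer comparisons. How such functions would be
exposed through the catalogue is left open here, as it is in the thread.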