On 2016-05-10 12:34:19 +1200, David Rowley wrote:
> a. Modify ScanAPI to allow batch tuple fetching in predefined batch sizes.
> b. Modify TupleTableSlot to allow > 1 tuple to be stored. Add flag to
> indicate if the struct contains a single or a multiple tuples.
> Multiple tuples may need to be deformed in a non-lazy fashion in order
> to prevent too many buffers from having to be pinned at once. Tuples
> will be deformed into arrays of each column rather than arrays for
> each tuple (this part is important to support the next sub-project)

FWIW, I don't think that's necessarily required, and it has the
potential to slow down some operations (like target list
processing/projections) considerably. By the time vectored execution for
postgres is ready, gather instructions should be common and fast enough
(IIRC they started to be ok with broadwells, and are better in skylake;
other archs had them for longer).

> c. Modify some nodes (perhaps start with nodeAgg.c) to allow them to
> process a batch TupleTableSlot. This will require some tight loop to
> aggregate the entire TupleTableSlot at once before returning.
> d. Add function in execAmi.c which returns true or false depending on
> if the node supports batch TupleTableSlots or not.
> e. At executor startup determine if the entire plan tree supports
> batch TupleTableSlots, if so enable batch scan mode.

It doesn't really need to be the entire tree. Even if you have a subtree
(say a parametrized index nested loop join) which doesn't support batch
mode, you'll likely still see performance benefits by building a batch
one layer above the non-batch-supporting node.

Greetings,

Andres Freund
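
The point about not needing the entire tree can be illustrated with a
rough sketch in C. Everything in it (ToyNode, ToyBatch, BATCH_SIZE,
toy_next_tuple, toy_fill_batch, toy_sum, toy_tree_supports_batch) is
hypothetical and invented for illustration; none of it is PostgreSQL's
executor API. It assumes a single-input plan chain with integer
"tuples": a batch-aware aggregate sits above a tuple-at-a-time child,
and a small adapter builds the batch one layer above that child.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH_SIZE 4            /* hypothetical; real batches would be larger */

/* Hypothetical toy plan node, NOT PostgreSQL's PlanState. */
typedef struct ToyNode
{
    bool            supports_batch; /* can this node work on batches? */
    struct ToyNode *child;          /* single input, NULL for a leaf */
    int             next_value;     /* leaf state: next value to emit */
    int             end_value;      /* leaf state: stop before this value */
} ToyNode;

/* Hypothetical batch: one int column, up to BATCH_SIZE values. */
typedef struct ToyBatch
{
    int     values[BATCH_SIZE];
    size_t  ntuples;
} ToyBatch;

/* All-or-nothing check in the spirit of item (e): every node must support batches. */
bool
toy_tree_supports_batch(const ToyNode *node)
{
    for (; node != NULL; node = node->child)
        if (!node->supports_batch)
            return false;
    return true;
}

/* Tuple-at-a-time interface of a non-batch leaf; false once exhausted. */
bool
toy_next_tuple(ToyNode *node, int *out)
{
    if (node->next_value >= node->end_value)
        return false;
    *out = node->next_value++;
    return true;
}

/* Adapter: build a batch one layer above a tuple-at-a-time child. */
size_t
toy_fill_batch(ToyNode *child, ToyBatch *batch)
{
    batch->ntuples = 0;
    while (batch->ntuples < BATCH_SIZE &&
           toy_next_tuple(child, &batch->values[batch->ntuples]))
        batch->ntuples++;
    return batch->ntuples;
}

/* Batch-aware aggregate: consumes each batch in a tight loop. */
long
toy_sum(ToyNode *agg)
{
    ToyBatch    batch;
    long        sum = 0;

    assert(agg->supports_batch && agg->child != NULL);
    while (toy_fill_batch(agg->child, &batch) > 0)
        for (size_t i = 0; i < batch.ntuples; i++)
            sum += batch.values[i];
    return sum;
}
```

With a non-batch leaf under a batch-aware aggregate, the all-or-nothing
check reports false, yet toy_sum() still runs its inner loop over whole
batches. Whether that pays off in a real executor depends on how
expensive the child is relative to the per-tuple overhead saved above it.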