On Thu, Mar 2, 2017 at 10:03 AM, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:

> On 12/20/16 23:14, Jim Nasby wrote:
> > I've been looking at the performance of SPI calls within plpython.
> > There's a roughly 1.5x difference from equivalent python code just in
> > pulling data out of the SPI tuplestore. Some of that is due to an
> > inefficiency in how plpython is creating result dictionaries, but fixing
> > that is ultimately a dead-end: if you're dealing with a lot of results
> > in python, you want a tuple of arrays, not an array of tuples.
>
> There is nothing that requires us to materialize the results into an
> actual list of actual rows. We could wrap the SPI_tuptable into a
> Python object and implement __getitem__ or __iter__ to emulate sequence
> or mapping access.
>

Python objects have a small (but non-zero) overhead in terms of both
memory and speed. A built-in dictionary is probably one of the
least-expensive (memory/cpu) choices, although how the dictionary is
constructed also impacts performance. Another choice is a tuple.

Avoiding Py_BuildValue(...) in exchange for a bit more complexity (via
PyTuple_New(...) and PyTuple_SetItem(...)) is also a nice performance win
in my experience.

-- Jon
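
To make the wrapper idea quoted above a bit more concrete, here is a rough pure-Python sketch. It does not touch the real SPI_tuptable or any plpython internals: ColumnarResult, its constructor arguments, and the column() helper are all hypothetical names chosen for illustration. The data is held column-wise (a tuple of lists), and a row dictionary is only built when a row is actually requested through __getitem__ or iteration.

```python
from collections.abc import Sequence


class ColumnarResult(Sequence):
    """Hypothetical stand-in for a wrapped result set (not plpython's API).

    Values are stored column-wise; a row dict is built only on access.
    """

    def __init__(self, colnames, columns):
        self.colnames = tuple(colnames)
        self.columns = tuple(columns)
        if len(self.colnames) != len(self.columns):
            raise ValueError("one name per column required")
        lengths = {len(col) for col in self.columns}
        if len(lengths) > 1:
            raise ValueError("columns must have equal length")
        self._nrows = lengths.pop() if lengths else 0

    def __len__(self):
        return self._nrows

    def __getitem__(self, index):
        # Slices return a plain list of row dicts.
        if isinstance(index, slice):
            return [self[i] for i in range(*index.indices(self._nrows))]
        if index < 0:
            index += self._nrows
        if not 0 <= index < self._nrows:
            raise IndexError("row index out of range")
        # Build the row lazily from the column arrays.
        return {name: col[index]
                for name, col in zip(self.colnames, self.columns)}

    def column(self, name):
        # Column access returns the stored list; no per-row objects are made.
        return self.columns[self.colnames.index(name)]


# Made-up sample data, only to exercise the class.
result = ColumnarResult(["id", "name"], [[1, 2, 3], ["a", "b", "c"]])
first_row = result[0]
ids = result.column("id")
```

Subclassing collections.abc.Sequence means __iter__, __contains__ and friends come for free from __len__ and __getitem__. Looping with "for row in result" creates one dict per step, while result.column("id") returns the whole column without creating any row objects, which is roughly the "tuple of arrays" access pattern Jim describes.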