Re: [HACKERS] Faster methods for getting SPI results (460% improvement)

Jim Nasby Tue, 24 Jan 2017 20:44:36 -0800

On 1/23/17 10:36 PM, Craig Ringer wrote:

which is currently returned as


[ {"a":1, "b":10}, {"a":2, "b":20} ]

instead as

{ "a": [1, 2], "b": [10, 20] }


Correct.

If so I see that as a lot more of a niche thing. I can see why it'd be
useful and would help performance, but it seems much more disruptive.
It requires users to discover it exists, actively adopt a different
style of ingesting data, etc. For a 10%-ish gain in a PL.

In data science, what we're doing now is actually the niche. All real
analytics happens with something like a Pandas DataFrame, which is
organized as a dict of lists.

This isn't just idle nomenclature either: organizing results in what
amounts to a column store provides a significant speed improvement for
most analytics, because you're working on an array of contiguous memory
(at least, when you're using more advanced types like DataFrames and
Series).
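
To make the two layouts concrete, here is a minimal sketch in plain
Python of pivoting the row-oriented result (a list of dicts) into the
column-oriented one (a dict of lists). The function name
rows_to_columns is hypothetical and this is not how the plpython patch
does it; it only illustrates the shapes being discussed.

```python
def rows_to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists.

    Hypothetical helper for illustration only; assumes every row
    carries the same keys, as a query result set would.
    """
    columns = {}
    for row in rows:
        for name, value in row.items():
            # Create the column list on first sight, then append.
            columns.setdefault(name, []).append(value)
    return columns


# Row-oriented: one dict per row, as currently returned.
rows = [{"a": 1, "b": 10}, {"a": 2, "b": 20}]

# Column-oriented: one list per column.
columns = rows_to_columns(rows)
```

Note that plain Python lists are not contiguous typed arrays, so this
conversion alone does not give the memory-layout benefit. The dict of
lists is simply the shape that something like pandas.DataFrame()
accepts directly.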

I strongly suggest making this design effort a separate thread, and
focusing on the SPI improvements that give "free" no-user-action
performance boosts here.

Fair enough. I posted the SPI portion of that yesterday. That should be
useful for pl/R and possibly pl/perl. pl/tcl could make use of it, but
it would end up executing arbitrary tcl code in the middle of portal
execution, which doesn't strike me as a great idea. Unfortunately, I
don't think plpgsql could make much use of this for similar reasons.


I'll post a plpython patch that doesn't add the output format control.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

