Hi,

On 2017-12-08 14:41:14 -0500, Tom Lane wrote:
> Yeah, if someone were holding a gun on me and saying "make that particular
> function faster", I'd think about a hash table rather than scanning a
> list. Perhaps a hash table with all the column names exposed by FROM,
> not one hash per RTE.
That sounds right.


> However, if you have a FROM that exposes a lot of column names, and
> then the query only looks up a few of them, you might come out behind
> by building a hash table :-(

Hm, I don't think that's that big a deal - you don't need many lookups
to make a hash table worthwhile if the alternative is exhaustive scans
through linked lists. I'd be more concerned about the pretty common
case we're hitting most of the time now, where just a handful of
columns is selected from about as many available ones; there the
additional allocations and such might show up.


> I'm still unconvinced that this is the first place to improve for
> wide tables, anyway.

I've run a few profiles of queries on wide tables lately, and on the
read-mostly side without prepared statements this is very commonly the
biggest entry, by a lot - over 90% of the time in one of them. If the
queries select a large number of the underlying columns, tupledesc
computations, and their syscache lookups, become the bottleneck. That's
partially what led me to micro-optimize syscache ~two months back.

The real solution there is going to be to move tupledesc computations
to the planner, but that's a bigger piece of work than I can take on
right now.

Greetings,

Andres Freund
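PS: To make the idea concrete, below is a standalone sketch of the
"one hash table over all column names exposed by FROM" approach - toy
code with made-up names, not anything from the tree. It maps each
exposed name to the range-table item and attribute exposing it, and
flags names exposed more than once as ambiguous, since a lookup has to
be able to report that:

/*
 * Standalone sketch, NOT PostgreSQL code: all names here are made up.
 * One hash table covers every column name exposed by FROM; each entry
 * remembers which range-table item and attribute expose the name, and
 * a duplicate insertion marks the name ambiguous.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 256			/* power of two, toy sizing */

typedef struct ColEntry
{
	const char *name;
	int			rtindex;		/* which FROM item exposes the column */
	int			attnum;			/* position within that item */
	int			ambiguous;		/* name exposed more than once? */
	struct ColEntry *next;		/* bucket chain */
} ColEntry;

static ColEntry *buckets[NBUCKETS];

/* djb2-style string hash, good enough for a sketch */
static unsigned
hash_name(const char *s)
{
	unsigned	h = 5381;

	while (*s)
		h = h * 33 + (unsigned char) *s++;
	return h & (NBUCKETS - 1);
}

/* Insert one exposed column; flag ambiguity on duplicate names. */
static void
add_column(const char *name, int rtindex, int attnum)
{
	unsigned	h = hash_name(name);
	ColEntry   *e;

	for (e = buckets[h]; e; e = e->next)
	{
		if (strcmp(e->name, name) == 0)
		{
			e->ambiguous = 1;
			return;
		}
	}
	e = malloc(sizeof(ColEntry));
	e->name = name;
	e->rtindex = rtindex;
	e->attnum = attnum;
	e->ambiguous = 0;
	e->next = buckets[h];
	buckets[h] = e;
}

/* O(1) expected lookup instead of scanning every RTE's column list. */
static ColEntry *
lookup_column(const char *name)
{
	ColEntry   *e;

	for (e = buckets[hash_name(name)]; e; e = e->next)
		if (strcmp(e->name, name) == 0)
			return e;
	return NULL;
}

int
main(void)
{
	/* pretend FROM exposes two tables' columns */
	add_column("id", 1, 1);
	add_column("payload", 1, 2);
	add_column("id", 2, 1);		/* duplicate -> ambiguous */
	add_column("created_at", 2, 2);

	ColEntry   *e = lookup_column("created_at");

	if (e && !e->ambiguous)
		printf("created_at -> rte %d, attnum %d\n", e->rtindex, e->attnum);
	if ((e = lookup_column("id")) && e->ambiguous)
		printf("id is ambiguous\n");
	return 0;
}

The table would be built once per query level; if it's sized from the
number of exposed columns and allocated in the per-query context, the
constant overhead ought to stay small even in the few-lookups case.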