I made a patch for "Cache last known per-tuple offsets to speed 
long tuple access" that is in TODO list.

This problem was discussed on hackers-list as "Terrible performance 
on wide selects".
The point of this problem is nocachegetattr() used from ExecEvalVar().
If tuple has many columns, and it has varlen column or null data,
time spent in nocachegetattr() is O(N^2) in the number of fields.

I referred URL below for implementation.
 http://archives.postgresql.org/pgsql-performance/2003-01/msg00262.php

The point of this patch is as follows:
(1)heap_deformtuple_incr() is added.
 This function can extract attributes of tupple, incrementally.

(2)The cache which keeps the result of heap_deformtuple_incr(), 
 is added inside TupleTableSlot.

(3)In ExecEvalVar(), heap_deformtuple_incr() is used in place of 
 nocachegetattr(). This would reduce the time from O(N^2) to O(N).

In order to measure the effect, I executed the test below.
-------------------
Table has 15,000tuples, 200columns. All data type is text. 
Table name is aaa. Column name is t001...t200.
Executed SQL is,
select t100, t110, t120, t130, t140, t150, 
       t160, t170, t180, t190, t200
from aaa;

The profile result of original code is as follows.
-------------------
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 70.05      1.31     1.31   163846     0.00     0.00  nocachegetattr
  8.02      1.46     0.15   163840     0.00     0.00  FunctionCall3
  1.87      1.50     0.04   397763     0.00     0.00  AllocSetFreeIndex
  1.60      1.52     0.03   163840     0.00     0.00  ExecEvalVar
  1.34      1.55     0.03   200414     0.00     0.00  AllocSetAlloc

The profile result after the patch applying is as follows.
-------------------
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 39.73      0.29     0.29   180224     0.00     0.00  heap_deformtuple_incr
  9.59      0.36     0.07   163840     0.00     0.00  FunctionCall3
  6.85      0.41     0.05    16384     0.00     0.02  ExecTargetList
  5.48      0.45     0.04    23477     0.00     0.00  hash_any
  4.11      0.48     0.03   200414     0.00     0.00  AllocSetAlloc

Regards,

--- Atsushi Ogawa ([EMAIL PROTECTED])

Attachment: deformtuple_cache.patch
Description: Binary data

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Reply via email to