RE: [WIP]Vertical Clustered Index (columnar store extension) - take2

Aya Iwata (Fujitsu) Fri, 13 Feb 2026 02:33:20 -0800

Hi,

I have revisited the pros/cons; see [1] below. One notable point for me
is that an index access method (IAM) lets us implement only the parts we
are interested in, which can keep the patch narrower. In contrast, a
table access method (TAM) must support everything heap can do. Also, with
a TAM, later developers may have to implement new code for both heap and
VCI, which may block upcoming features.


I know the current patch needs polishing. So what if we remove the
entanglement with the executor and some of the ad-hoc hooks? As I
understand it, the second barrier is data freshness, which you may be
referring to as "transaction machinery". E.g., VCI hacks tuple deletion
so that the corresponding VCI entries are deleted as well. This does not
follow the usual conventions for indexes.

Do you see other reasons why VCI should be a TAM? We want to understand
all of your concerns.
The best way would be to implement both styles, but that is quite
difficult and needs too much code for us.

[1]
Pros of IAM:
- VCI serves not only as a columnar storage format but also as an
"accelerator" that speeds up data access, which makes it well suited
to being implemented as an index.
-- It was implemented as an index because running analytics alongside
OLTP is important.
-- IAM can be implemented to support only specific workloads,
allowing partial implementation of VCI only where necessary.
- Vector processing enables fetching and computing multiple rows
simultaneously, achieving high performance.

Pros of TAM:
- Using TAM eliminates the need to reimplement heapam contents.
-- Given VCI's current implementation as IAM, it may be possible
to remove modifications made to PostgreSQL's heap code.
- If the advantage of VCI is to leverage storage for speed,
using TAM is preferable.

Cons of IAM:
- Doubts remain about whether VCI can be fully implemented as an IAM.
Currently, some modifications are being made directly to the core code.
-- In practice, VCI has "spilled over and impacted" the implementation of
PostgreSQL's heap code. For example, the current VCI adds implementation
within heapam.c.
-- An IAM is not informed of deletions or updates,
  so ad-hoc hooks are added to handle them.

Cons of TAM:
- It requires implementing the workload for all access methods.
- Current table access methods do not support vectorized I/O, which may
be key to achieving better performance.
- Moreover, current table access methods must read tuples one at a time
 via functions such as scan_getnextslot(), which may also degrade
 performance.
- Migrating VCI logic to TAM is extremely difficult.
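
To illustrate the concern about reading tuples one at a time, here is a
toy sketch in Python. It is not PostgreSQL code and does not reflect VCI's
actual layout: the column store, the function names, and the batch size
are all assumptions made up for illustration. It only contrasts an
interface that hands back one row per call (loosely analogous to fetching
one slot per scan_getnextslot() call) with one that hands back a batch of
values from a single column per call.

```python
from typing import Dict, Iterator, List, Tuple

# Toy column store: each column is a plain Python list.
# Hypothetical layout for illustration only, not VCI's real format.
Columns = Dict[str, List[int]]


def scan_rows(cols: Columns) -> Iterator[Dict[str, int]]:
    """Row-at-a-time interface: materialize one full row per call."""
    nrows = len(next(iter(cols.values()))) if cols else 0
    for i in range(nrows):
        # Every column is touched, even if the caller needs only one.
        yield {name: values[i] for name, values in cols.items()}


def scan_batches(cols: Columns, name: str, batch_size: int) -> Iterator[List[int]]:
    """Batch interface: return a slice of a single column per call."""
    values = cols[name]
    for start in range(0, len(values), batch_size):
        yield values[start:start + batch_size]


def sum_row_at_a_time(cols: Columns, name: str) -> Tuple[int, int]:
    """Sum one column through the row interface; also count calls."""
    total, calls = 0, 0
    for row in scan_rows(cols):
        total += row[name]
        calls += 1
    return total, calls


def sum_batched(cols: Columns, name: str, batch_size: int = 4) -> Tuple[int, int]:
    """Sum one column through the batch interface; also count calls."""
    total, calls = 0, 0
    for batch in scan_batches(cols, name, batch_size):
        total += sum(batch)
        calls += 1
    return total, calls
```

Both functions return the same sum, but the batched variant crosses the
scan interface once per batch rather than once per row and never touches
the columns it does not need. Whether a real TAM would see a comparable
benefit depends on details this toy model ignores.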


Best regards,
Aya Iwata
