Hi,

I have revisited the pros and cons; see [1] below. One notable point for me is that an IAM lets us implement only the parts we are interested in, which can keep the patch narrower. In contrast, a TAM must support everything heap can do. Also, later developers might have to implement code for both heap and VCI (TAM), which could block upcoming new features.

I know the current patch must be polished. So what if we drop the code entanglement with the executor and some of the ad-hoc hooks? As I understand it, the second barrier is the freshness of the data, which you may be referring to as "transaction machinery". E.g., VCI hooks into tuple deletion so that the corresponding VCI entries are deleted as well. This does not follow the usual conventions for indexes.

Do you know of other reasons why VCI should be a TAM? We want to understand all of the concerns on your side. The best way would be to implement both styles, but that is quite difficult and would need too much code for us.

[1]
Pros of IAM:
- VCI serves not only as a columnar storage format but also as an "accelerator" that speeds up data access, making it well-suited to an index implementation.
-- Because analytics during OLTP is important, it has been implemented as an index.
-- An IAM can be implemented to support only specific workloads, allowing VCI to be implemented partially, only where necessary.
- Vector processing enables fetching and computing multiple rows at once, achieving high performance.

Pros of TAM:
- Using TAM eliminates the need to reimplement heapam contents.
-- Given VCI's current implementation as an IAM, it may be possible to remove the modifications made to PostgreSQL's heap code.
- If the advantage of VCI is to leverage storage for speed, using TAM is preferable.

Cons of IAM:
- Doubts remain about whether VCI can be fully implemented as an IAM. Currently, some modifications are made directly to the core code.
-- In practice, VCI has "spilled over and impacted" the implementation of PostgreSQL's heap code. For example, the current VCI adds code within heapam.c.
-- An IAM has no knowledge of deletions or updates, so ad-hoc hooks are added to handle them.

Cons of TAM:
- It requires implementing every workload that an access method has to handle.
- Current table access methods do not support vectorized I/O, which may be the key to better performance.
- Moreover, current table access methods must read tuples one at a time via functions like scan_getnextslot(). This may also degrade performance.
- Migrating VCI logic to TAM is extremely difficult.

Best regards,
Aya Iwata
