My idea is that if an UPDATE places the new tuple on the same page as
the old tuple, it will not create new index entries for any indexes
where the key doesn't change.

Basically the idea behind preventing index bloat by updates is to have one index tuple point to several actual tuples having the same value.
        
        So : Index entry -> list of tuples having the same value -> actual 
tuples
        (-> represents an indirection)

I proposed to put the list of tuples in the index ; you propose to put it in data pages.

        I think both solutions have pros and cons :

        * List of tuples in the index :
        + reduces index size, makes cacheability in RAM more likely
        + speeds up index scans
        - complexity
        - slows down modifications to the index (a bit)

        * List of tuples in the page
        + simpler to implement
        + reduces index size, but less so than previous solution
        - useless if UPDATE puts the new tuple on a different page

        I guess the best solution would be a mix of both.

Also, I insist (again) that there is a lot to gain by using a bit of compression on the data pages, even if it's very simple compression like storing the new version of a row as a difference from the previous version (ie. only store the columns that changed). I think DB2 stores the latest version entirely, and stores the previous versions as a delta. This is more efficient.

In the case of tables containing TEXT values, these could also get TOASTed. When an update does not modify the TOASTed columns, it would be nice to simply be able to keep the reference to the TOASTed data instead of decompressing it and recompressing it. Or is it already the case ?
        

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

              http://www.postgresql.org/docs/faq

Reply via email to