Two options are on the table to attack the problem of updates causing write amplification: WARM and indirect indexes. They are completely different approaches but have overlapping effects on what scenarios are improved. Here's a recap of both features, with the intent that we make a well-considered decision about each.
The main effect of both features is that an updated tuple doesn't require updating indexes that are on unmodified columns. Indirect indexes are a completely new server feature which may enable other refinements later on; WARM is a targeted optimization on top of the HOT optimization. The big advantage of WARM is that it works automatically, like HOT: the user doesn't need to do anything different than today to get the benefit. With indirect indexes, the user needs to create the index as indirect explicitely. There are two big disadvantages to WARM (as to HOT): it cannot be applied when the heap page is full; and concurrent long-running transactions can spoil reclaimability of recently dead tuples in heap pages. There's a further disadvantage: currently, there can be only one WARM update in an update chain. (Pavan believes it is possible to allow multiple ones.) All those cases can benefit from indirect indexes. Another interesting case is a table with a primary key and a JSON object, on which you have a GIN index (or an int[] array, or text search). What happens if you modify the JSON? With WARM, this is just a normal index update. With indirect indexes, you may be able to skip inserting index entries for all the JSON elements except those which changed. (However, this is not implemented yet.) - When scanning a WARM-updated block starting from an index, you may need to do more work to walk the update chain until you find the visible tuple. Most of the time, HOT/WARM chains are very short thanks to HOT pruning, so this shouldn't be a problem. - Indirect indexes require a primary key to be present. If the PK is dropped, the IndIx must be dropped too. - Indirect indexes become larger if the primary key is wide. - Indirect indexes are not fully implemented yet (need to remove restriction of PK value being 6 bytes; also wholesale vacuuming, though there's no universal agreement that this is strictly necessary.) - An indirect index takes longer to read, since it needs to descend both the IndIx itself and the primary key index. -- Álvaro Herrera -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers