On L, 2005-07-09 at 16:50 -0400, Alvaro Herrera wrote: > On Sat, Jul 09, 2005 at 11:43:52PM +0300, Hannu Krosing wrote: > > > Could the new file not be made to cover the next available 1GB of file > > space, that is a new physical file ? > > > > This could made using of same kind of machinery my proposal for > > concurrent index does (i.e. locks that forbid putting new tuples in > > certain tuple ranges) > > I think your proposals are too "handwavy", but there is a similar > mechanism outlined for on-line index reorganizarion, whereby new tuples > can be inserted concurrently with the reorganization, being stored on a > "spill area". See
I try to state my reworked idea of concurrent indexing in a more clear way: The index build in done 2 transactions, need one new type of lock and a new system column in pg_class to tell planner not to use an incomplete index. This similar to vacuum in thet ot needs its own transactions and is not rollbackable. Perhaps the decision to use either this or current INDEX should be based on weather (RE)INDEX command is run in its own transaction. 1st transaction: ---------------- The index for the part of datafile that exists at the start of INDEX command, is created within the 1st transacton, in similar way we do now. "the part" is definded as all tuples with ctid below (<=) the max(ctid) stored at the start as MAX_CTID_1. To be sure that we cover all the tuples in range <= MAX_CTID_1, and no new tuples are stored there as the result of INSERT or UPDATE, we need a new type of lock (lets call it "TupleStoreRangeLock"), which prevents new tuples to be placed below MAX_CTID_1 and which is aquired before starting the initial build. After the initial build of index for tuples below MAX_CTID_1 is finished, it is made visible to the rest of the system by committing the transaction, but marking the index as "incomplete" (probably a new column in pg_class is needed for that), so that it will not be used by planner, but all new inerts/updates will see and use it. 2nd transaction --------------- After that we need to wait for all other running transactions to complete, so we can be sure that all other backends know about the new index. Once we are sure they do, we record the new max(ctid) as MAX_CTID_2. At this point we can release the TupleStoreRangeLock, to make it possible to place new tuples below MAX_CTID_1 As the final step we need to scan all tuples in range ( MAX_CTID_1 to MAX_CTID_2 ) and insert their corresponding index entries into the new index. If the entry already exists for exact same ctid, that is ok. After reaching MAX_CTID_2, the index is ready for use by planner as well and can be marked as "complete" in pg_class. In case of REINDEX, the new index can be made to replace the old one at this point. TODO: work out with which locks TupleStoreRangeLock conflicts and with which it can coexists. -- Hannu Krosing <[EMAIL PROTECTED]> ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend