Re: [HACKERS] CREATE INDEX CONCURRENTLY?
This just hit us today... Admittedly on an old cluster still running 9.2, though I can't see any mention of it being addressed since. Any chance of getting this on to to-do list? On Sat, 1 Nov 2014 at 07:45, Simon Riggs wrote: > On 31 October 2014 17:46, Michael Banck wrote: > > > I wonder whether that is pilot error (fair enough), or whether something > > could be done about this? > > When originally written the constraints were tighter, but have since > been relaxed. > > Even so a CIC waits until all snapshots that can see it have gone. So > what you observe is correct and known. > > > Can it be changed? Maybe. > > CREATE INDEX gets around the wait by using indcheckxmin to see whether > the row is usable. So the command completes, even if the index is not > usable by all current sessions. > > We perform the wait in a completely different way for CIC, for this > reason (in comments) > > We also need not set indcheckxmin during a concurrent index build, > because we won't set indisvalid true until all transactions that care > about the broken HOT chains are gone. > > Reading that again, I can't see why we do it that way. If CREATE INDEX > can exit once the index is built, so could CONCURRENTLY. > > ISTM that we could indcheckxmin into an Xid, not a boolean >For CREATE INDEX, set the indcheckxmin = xid of creating transaction >For CREATE INDEX CONCURRENTLY set the indcheckxmin = xid of the > completing transaction > > The apparent reason it does this is that the Xmin value used currently > is the Xmin of the index row. The index row is inserted prior to the > index being valid so that technique cannot work. So I am suggesting > for CIC that we use the xid of the transaction that completes the > index, not the xid that originally created the index row. Plus handle > the difference between valid and not. > > -- > Simon Riggs http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
Re: [HACKERS] CREATE INDEX CONCURRENTLY?
On 31 October 2014 17:46, Michael Banck wrote: > I wonder whether that is pilot error (fair enough), or whether something > could be done about this? When originally written the constraints were tighter, but have since been relaxed. Even so a CIC waits until all snapshots that can see it have gone. So what you observe is correct and known. Can it be changed? Maybe. CREATE INDEX gets around the wait by using indcheckxmin to see whether the row is usable. So the command completes, even if the index is not usable by all current sessions. We perform the wait in a completely different way for CIC, for this reason (in comments) We also need not set indcheckxmin during a concurrent index build, because we won't set indisvalid true until all transactions that care about the broken HOT chains are gone. Reading that again, I can't see why we do it that way. If CREATE INDEX can exit once the index is built, so could CONCURRENTLY. ISTM that we could indcheckxmin into an Xid, not a boolean For CREATE INDEX, set the indcheckxmin = xid of creating transaction For CREATE INDEX CONCURRENTLY set the indcheckxmin = xid of the completing transaction The apparent reason it does this is that the Xmin value used currently is the Xmin of the index row. The index row is inserted prior to the index being valid so that technique cannot work. So I am suggesting for CIC that we use the xid of the transaction that completes the index, not the xid that originally created the index row. Plus handle the difference between valid and not. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE INDEX CONCURRENTLY?
Am Freitag, den 31.10.2014, 14:43 + schrieb Greg Stark: > On Fri, Oct 31, 2014 at 2:28 PM, Mark Woodward > wrote: > > I have not kept up with PostgreSQL changes and have just been using it. A > > co-worker recently told me that you need to word "CONCURRENTLY" in "CREATE > > INDEX" to avoid table locking. I called BS on this because to my knowledge > > PostgreSQL does not lock tables. I referenced this page in the > > documentation: > > You can read from tables while a normal index build is in progress but > you can't insert, update, or delete from them. CREATE INDEX > CONCURRENTLY allows you to insert, update, and delete data while the > index build is running at the expense of having the index build take > longer. I believe there is one caveat: If there is an idle-in-transaction backend from before the start of CREATE INDEX CONCURRENTLY, it can hold up the index creation indefinitely as long as it doesn't commit. src/backend/access/heap/README.HOT mentions this WRT CIC: "Then we wait until every transaction that could have a snapshot older than the second reference snapshot is finished. This ensures that nobody is alive any longer who could need to see any tuples that might be missing from the index, as well as ensuring that no one can see any inconsistent rows in a broken HOT chain (the first condition is stronger than the second)." I have seen CIC stall at clients when there were (seemlingy) unrelated idle-in-transactions open (their locks even touching only other schemas). I believe it depends on the specific locks that the other backend acquired, but at least with a DECLARE CURSOR I can trivially reproduce it: first session: postgres=# CREATE SCHEMA foo1; CREATE SCHEMA postgres=# CREATE TABLE foo1.foo1 (id int); CREATE TABLE postgres=# CREATE SCHEMA foo2; CREATE SCHEMA postgres=# CREATE TABLE foo2.foo2 (id int); CREATE TABLE second session: postgres=# BEGIN; DECLARE c1 CURSOR FOR SELECT * FROM foo1.foo1; BEGIN DECLARE CURSOR first session: postgres=# CREATE INDEX CONCURRENTLY ixfoo2 ON foo2.foo2(id); (hangs) I wonder whether that is pilot error (fair enough), or whether something could be done about this? Michael -- Michael Banck Projektleiter / Berater Tel.: +49 (2161) 4643-171 Fax: +49 (2161) 4643-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Hohenzollernstr. 133, 41061 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE INDEX CONCURRENTLY?
On Fri, Oct 31, 2014 at 2:28 PM, Mark Woodward wrote: > I have not kept up with PostgreSQL changes and have just been using it. A > co-worker recently told me that you need to word "CONCURRENTLY" in "CREATE > INDEX" to avoid table locking. I called BS on this because to my knowledge > PostgreSQL does not lock tables. I referenced this page in the > documentation: You can read from tables while a normal index build is in progress but you can't insert, update, or delete from them. CREATE INDEX CONCURRENTLY allows you to insert, update, and delete data while the index build is running at the expense of having the index build take longer. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE INDEX CONCURRENTLY?
On 10/31/2014 10:28 AM, Mark Woodward wrote: I have not kept up with PostgreSQL changes and have just been using it. A co-worker recently told me that you need to word "CONCURRENTLY" in "CREATE INDEX" to avoid table locking. I called BS on this because to my knowledge PostgreSQL does not lock tables. I referenced this page in the documentation: http://www.postgresql.org/docs/9.3/static/locking-indexes.html That page refers to using the indexes, not creating them. However, I do see this sentence in the indexing page that was not in the docs prior to 8.0: "Creating an index can interfere with regular operation of a database. Normally PostgreSQL locks the table to be indexed against writes and performs the entire index build with a single scan of the table." Is this true? When/why the change? When we use "concurrently," it seems to hang. I am looking into it. Creating indexes always did lock tables. See for example http://www.postgresql.org/docs/7.4/static/explicit-locking.html#LOCKING-TABLES there CREATE INDEX is documented to take a SHARE lock on the table. CONCURRENTLY was an additional feature to allow you to get around this, at the possible cost of some extra processing. So we haven't made things harder, we've made them easier, and your understanding of old releases is incorrect. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] CREATE INDEX CONCURRENTLY?
I have not kept up with PostgreSQL changes and have just been using it. A co-worker recently told me that you need to word "CONCURRENTLY" in "CREATE INDEX" to avoid table locking. I called BS on this because to my knowledge PostgreSQL does not lock tables. I referenced this page in the documentation: http://www.postgresql.org/docs/9.3/static/locking-indexes.html However, I do see this sentence in the indexing page that was not in the docs prior to 8.0: "Creating an index can interfere with regular operation of a database. Normally PostgreSQL locks the table to be indexed against writes and performs the entire index build with a single scan of the table." Is this true? When/why the change? When we use "concurrently," it seems to hang. I am looking into it.
[HACKERS] CREATE INDEX CONCURRENTLY and HOT
Sorry to start another thread while we are still discussing CREATE INDEX design, but I need help/suggestions to finish the patch on time for 8.3 We earlier thought that CREATE INDEX CONCURRENTLY (CIC) would be simpler to do because of the existing waits in CIC. But one major problem with CIC is that UPDATEs are allowed while we are building the index and these UPDATEs can create HOT-chains which has different values for attributes on which we are building the new index. To keep the HOT-chain semantic consistent across old and new indexes, we might be forced to delete the old index entry and reinsert new one during the validate_index() phase. This is of course not easy. May I propose the following design which is less intrusive: We do CIC in three phases: In the first phase we just create the catalog entry for the new index, mark the index read-only and commit the transaction. By read-only, I mean that the index is not ready inserts, but is consulted during UPDATEs to decide whether to do HOT UPDATE or not (just like other existing indexes). We then wait for all transactions conflicting on ShareLock to complete. That would guarantee that all the existing transactions which can not see the new index catalog entry are finished. A new transaction is started. We then build the index just the way we do today. While we are building the index, no new HOT-chains are be created where the index keys do not match because the new index is consulted while deciding whether to do HOT UPDATE or not. At the end of this step, the index is marked ready for inserts, we once again wait for all transactions conflicting on ShareLock to finish and commit the transaction. In the third phase, we validate the index inserting any missing index entries for tuples which are not HEAP_ONLY. For HEAP_ONLY tuples we already have the index entry though it points to the root tuple. Thats OK because we guarantee that all tuples in the chain share the same key with respect to old as well as new indexes. We then mark the index "valid" and commit. In summary, this design introduces one more transaction and wait. But that should not be a problem because we would anyways wait for those transactions to finish today though a bit later in the process. Comments/suggestions ? Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com