On Wed, Jan 18, 2006 at 05:27:29PM -0500, Tom Lane wrote: > Right, the deadlock risk is exactly the reason you need some secret > sauce or other. Btree's page-level lock ensures that two insertions of > conflicting keys can't overlap (even if they ultimately get stored on > different pages). That's not the only way to fix this but it's a pretty > good way.
Ok, I'm not that great with locking issues, but how about this: 1. In the root page of the GiST index you store a counter, let's call it the Insertion ID (IID). 2. An index tuple has a state where it is not visible yet and contains an IID. 3. When you go to insert, you follow the normal GiST method all the way down to the leaf node. 4. Once you know where you're going to put it, take an exclusive lock on the page. 5. Now, take an exclusive lock on the root page, read the IID (this one is yours) and store the next one back in the root page. 6. Now write the index tuple to the leaf page recording your IID. 7. Release the root page lock, then the leaf page lock. The thing to note is that once someone has gotten a particular IID, all IIDs that are less than it will already have been written to the index leaf pages. Also, there should be no risk of deadlock since we lock the pages in a consistant order. 8. Do a normal index scan with your ~ operator. Ignore any provisional tuples with IID greater than yours. If a match has IID less than yours you may block on it using the same logic as for b-tree indexes. 9. Once the scan in completed and you know that there are no duplicates with your IID or less, make the tuple fully visible (for your transaction anyway). At this point the IID is no longer relevent and you can forget it. What I tried to acheive was avoiding holding any locks while doing scans, since for GiST indexes they may cover a lot of ground. Perhaps a better name for IID is Generation ID? Storing the IID could take extra space, but you could probably overlap it with the ctid since no-one's going to lookup the data tuple before it's fully visible. OTOH, if the backend dies before completing the process, how does one clean up the provisional index tuple? The ctid would provide a way for VACUUM to know when to remove it. > BTW, the deadlock risk also applies to deferred uniqueness checks. > Again, in btree it's possible to avoid this if you do a fresh indexscan > (and take a lock on the first scanned page while you do that). If you > try to do it without consulting the index then you need some other way > to break "ties". I think the above should work for deferred checks as well, as long as you store the list as "tuples to check" for each inserted tuple. These lists should form a directed acyclic graph so there should be no risk of deadlock. Conflicts with VACUUM is still an issue. Any thoughts? Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
signature.asc
Description: Digital signature