On Thu, Apr 30, 2015 at 9:46 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> As the index expression contain table columns and all the functions
> or operators used in expression must be IMMUTABLE, won't that
> guarantee to avoid such a situation?
The concern is that they might be labeled as immutable but not actually behave that way. The other, related problem is that the ordering operator might start to return different results than it did at index creation time.

For example, consider a btree index built on a text column. Now consider 'yum update'. glibc gets updated, the collation ordering of various strings changes, and now you've got tuples that are in the "wrong place" in the index, because when the index was built, we thought A < B, but now we think B < A.

You would think the glibc maintainers might avoid such changes in minor releases, or that the Red Hat guys would avoid packaging and shipping those changes in minor releases, but you'd be wrong. We've seen cases where the master and the standby were both running RHEL X.Y (same X.Y on both servers) but they didn't agree on the collation definitions, so queries that worked on the master failed on the standby.

> If not, then I think for such indexes we might need to search
> for tupleoffset in the entire index and mark it as Delete or may be
> for such indexes follow the current mechanism.

I think that *is* the current mechanism.

> I think it will still
> give us lot of benefit in more common cases.

It's hard to figure out exactly what can work here. Aside from correctness issues, the biggest problem with refinding the index tuples one by one and killing them is that it may suck for bulk operations. When you delete one tuple from the table, refinding the index tuples and killing them immediately is probably the smartest thing you can do. If you postpone it, somebody may be forced to split the page because it's full, whereas if you do it right away, the next guy who wants to put a tuple on that page may be able to make it fit. That's a big win. But if you want to delete 10 million tuples, doing an index scan for each one may induce a tremendous amount of random I/O on the index pages, possibly visiting and dirtying the same pages more than once.
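The collation hazard above can be sketched with a toy model (plain Python, not Postgres code; the names and the `swapcase` "collation" are made up for illustration): a "btree" is just a list sorted under one comparison rule, and a binary-search lookup that uses a different rule can miss entries that are physically present, just like an index probe after a glibc collation change.

```python
from bisect import bisect_left

# Model a collation as a sort key. new_key stands in for an
# "updated" glibc collation that orders some pairs differently.
old_key = lambda s: s              # plain codepoint order
new_key = lambda s: s.swapcase()   # hypothetical changed ordering

# "Index" built under the old rule; uppercase sorts first, so the
# on-disk order is: Banana, apple, cherry.
data = sorted(["apple", "Banana", "cherry"], key=old_key)

def index_lookup(sorted_data, key_fn, needle):
    """Binary search that trusts the list to be sorted by key_fn."""
    keys = [key_fn(s) for s in sorted_data]
    i = bisect_left(keys, key_fn(needle))
    return i < len(sorted_data) and sorted_data[i] == needle

# Probing with the rule the index was built under finds the tuple:
found_before = index_lookup(data, old_key, "Banana")

# After the "update", the same structure probed with the new rule
# descends to the wrong spot and misses a tuple that is really there:
found_after = index_lookup(data, new_key, "Banana")
```

`found_before` is True and `found_after` is False even though "Banana" never moved; the index isn't sorted at all under the new comparison rule, which is exactly why refinding a specific index tuple by key becomes unreliable.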
Scanning the whole index for tuples to kill avoids that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
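The I/O trade-off in the bulk-delete argument can be illustrated with a toy simulation (plain Python, not Postgres code; page size and counts are arbitrary assumptions): per-tuple probes touch one leaf page per deleted tuple and revisit the same pages over and over, while a single sequential scan dirties each page at most once.

```python
import random

PAGE_SIZE = 100      # index entries per leaf page (assumption)
N_ENTRIES = 10_000   # entries in the "index"
N_DELETES = 2_000    # tuples being bulk-deleted

random.seed(0)
victims = random.sample(range(N_ENTRIES), N_DELETES)

# Strategy 1: refind and kill each index tuple one by one.
# Every probe lands on (and dirties) the leaf page holding that
# entry, so one probe per deleted tuple, in random page order.
per_tuple_visits = [v // PAGE_SIZE for v in victims]

# Strategy 2: one sequential scan over the whole index.
# Each leaf page is read, and dirtied if needed, exactly once.
scan_visits = N_ENTRIES // PAGE_SIZE

print("per-tuple page visits:", len(per_tuple_visits))
print("distinct pages dirtied:", len(set(per_tuple_visits)))
print("full-scan page visits:", scan_visits)
```

With these numbers the per-tuple strategy makes 2,000 page visits against only 100 pages, so on average each page is hit about 20 times; the full scan visits each of the 100 pages once. The crossover is the point Robert describes: for a single delete the probe is clearly cheaper, but as the delete count grows past the page count, the sequential scan wins.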