Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: > I share the view that this would be very valuable, but the scope > far exceeds what can be done within a single GSoC project. But > maybe we could split that into multiple pieces, and Eric would > implement only the first piece? > > For example the 'is_stale' flag for a MV would be really useful, > making it possible to refresh only the MVs that actually need a > refresh.
You may be on to something there. Frankly, though, I'm not sure that we could even reach consensus within the community on a detailed design for how we intend to track staleness (that will hold up both now and once we have incremental maintenance of materialized views working) within the time frame of a GSoC project. This would need to be done with an eye toward how it might be used in direct references (will we allow a "staleness limit" on a reference from a query?), for use in a rewrite, and how it will interact with changes to base tables and with both REFRESH statements and incremental maintenance at various levels of "eagerness". I'm not sure that staleness management wouldn't be better left until we have some of those other parts for it to work with. Questions to consider: Some other products allow materialized views to be partitioned and staleness to be tracked by partition, and will check which partitions will be accessed in determining staleness. Is that something we want to allow for? Once we have incremental maintenance, an MV maintained in an "eager" fashion (changes are visible in the MV as soon as the transaction modifying the underlying table commit) could be accessed with a MVCC snapshots, with different snapshots seeing different versions. It seems pretty clear that such an MV would always be considered "fresh", so there would be no need to constantly flipping to stale and back again as the underlying table were changed and the changes were reflected in the MV. How do we handle that? If changes to an MV are less eager (they are queued for application after COMMIT, as time permits) would we want to track the xid of how far along they are, so that we can tell whether a particular snapshot is safe to use? Do we want to allow a non-MVCC snapshot that shows the latest version of each row? Only if staleness is minimal? What about MVs which don't have incremental maintenance? We can still determine what xid they are current "as of", from the creation or the latest refresh. Do we want to track that instead of a simple boolean flag? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers