* Kevin Grittner (kgri...@ymail.com) wrote:
> Stephen Frost <sfr...@snowman.net> wrote:
> > I worry that adding these will come back to bite us later
>
> How?

User misuse is certainly one consideration, but I wonder what's going
to happen if we change our internal representation of data (eg:
numerics get changed again), or when incremental matview maintenance
happens and we start looking at subsets of rows instead of the entire
query.  Will the first update of a matview after a change to numeric's
internal data structure cause the entire thing to be rewritten?

> > and that we're making promises we won't be able to keep.
>
> The promise that a concurrent refresh will produce the same set of
> rows as a non-concurrent one?

The promise that we'll always return the binary representation of the
data that we saw last.  When greatest(x,y) comes back 'false' for a
MAX(), we then have to go check "well, does the type consider them
equal?", because, if the type considers them equal, we then have to
decide if we should replace x with y anyway, since it's different at a
binary level.  That's what we're saying we'll always do now.
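To make the numeric case concrete, here's a quick illustration (plain
SQL, nothing to do with the patch itself): the type considers the two
values equal even though the representations we'd store differ.

    -- numeric comparison ignores display scale, so these are "equal"
    SELECT 1.0::numeric = 1.00::numeric;    -- true
    -- ...but the stored/displayed representations are not the same
    SELECT 1.0::numeric::text AS a,         -- '1.0'
           1.00::numeric::text AS b;        -- '1.00'

Which of those representations ends up in the matview is exactly the
kind of thing that can shift under us if numeric's internals change.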
We're also saying that we'll replace things based on plan differences
rather than based on whether the rows underneath actually changed at
all.  We could end up with material differences between the results of
matviews updated through incremental REFRESH and matviews updated
through actual incremental maintenance- and people may *care* about
those differences because we've told them (or they discover) that they
can depend on these types of changes being reflected in the result.

> > Trying to do this incremental-but-not-really maintenance where
> > the whole query is run but we try to skimp on what's actually
> > getting updated in the matview is a premature optimization, imv,
> > and one which may be less performant and more painful, with more
> > gotchas and challenges for our users, to deal with in the long
> > run.
>
> I have the evidence of a ten-fold performance improvement plus
> minimized WAL and replication work on my side.  What evidence do
> you have to back your assertions?  (Don't forget to work in bloat
> and vacuum truncation issues to the costs of your proposal.)

I don't doubt that there are cases in both directions, and I'm not
trying to argue that it'd always be faster, but I doubt it's always
slower.  I'm surprised you hit a case where the query was apparently
quite fast and the data set hardly changed, yet the result set was
very large- but I don't doubt that it happened.  What I was trying to
get at is that the delete/insert approach would be good enough in very
many cases, and it wouldn't have what look, to me anyway, like some
pretty ugly warts around these cases.
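To sketch the difference between the two approaches (with made-up
table names- mv for the matview's current contents, newdata for the
fresh query result- and a text comparison standing in for a
binary-level row comparison):

    -- (a) simple replace: no row comparisons, everything rewritten
    BEGIN;
    DELETE FROM mv;
    INSERT INTO mv SELECT * FROM newdata;
    COMMIT;

    -- (b) diff-based: only touch rows whose representation changed
    -- (run inside one transaction as well)
    DELETE FROM mv
     WHERE NOT EXISTS (SELECT 1 FROM newdata n
                        WHERE n.id = mv.id
                          AND n.val::text = mv.val::text);
    INSERT INTO mv
    SELECT * FROM newdata n
     WHERE NOT EXISTS (SELECT 1 FROM mv
                        WHERE mv.id = n.id
                          AND mv.val::text = n.val::text);

(a) rewrites the whole relation every time, with the bloat and WAL
costs that implies; (b) minimizes WAL and bloat when little has
changed, at the cost of the comparison semantics discussed above.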
Thanks,

	Stephen