On Mon, Jul 1, 2013 at 3:49 PM, Andres Freund <and...@2ndquadrant.com> wrote: >> I must be missing something. At that point, yes, you'd like to avoid >> re-toasting unnecessarily, but ISTM you've already bought the farm. >> Unless I'm misunderstanding the code as written, you'd just end up >> writing the indirect pointer back out to disk in that scenario. >> That's not gonna work... > > You didn't misunderstand anything unfortunately. I broke that somewhere > along the road, probably when factoring things out to > toast_datum_differs(). Fixed... > Which shows that we need tests. I've added some using a function in > regress.c that makes all datums indirect. Together with a triggers that > allows some testing.
OK, that's nifty. + elog(ERROR, "deleteing a tuple indirect datums doesn't make sense"); That's spelled wrong. + Assert(VARATT_IS_EXTERNAL_ONDISK(old_value)); + + /* fast path for the common case where we have the toast oid available */ + if (VARATT_IS_EXTERNAL_ONDISK(old_value) && + VARATT_IS_EXTERNAL_ONDISK(new_value)) + return memcmp((char *) old_value, (char *) new_value, + VARSIZE_EXTERNAL(old_value)) != 0; You've already asserted VARATT_IS_EXTERNAL_ONDISK(old_value), so there can't be any need to test that condition again. + changed = memcmp(VARDATA_ANY(old_data), VARDATA_ANY(new_data), + VARSIZE_ANY_EXHDR(old_data)) != 0; + if (!changed) + *needs_rewrite = true; + + /* heap_tuple_untoast_attr copied data */ + pfree(old_data); + if (new_value != new_data) + pfree(new_data); + return changed; So... basically, we always set needs_rewrite to !changed, except when VARATT_IS_EXTERNAL_ONDISK(new_value). In that latter case we return after doing memcmp(). That's kind of byzantine and could probably be simplified by pushing the first if-clause of this function up into the caller and then making the function's job to simply compare the datums on the assumption that the new datum is not external on-disk. Then you wouldn't need this funny three-way return value done as two Booleans. But backing up a minute, this is really a significant behavior change that is independent of the purpose of the rest of this patch. What you're proposing here is that every time we consider toasting a value on update, we should first check whether it's byte-for-byte equivalent to the old value. That may or may not be a good idea - it will suck if, for example, a user repeatedly updates a very long string by changing only the last character thereof, but it will win in other cases. Whether it's a good idea or not, I think it deserves to be a separate patch. For purposes of this patch, I think you should just assume that any external-indirect value needs to be retoasted, just as we currently assume for untoasted values. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers