> On Jun 15, 2022, at 7:30 PM, Andres Freund <and...@anarazel.de> wrote:
>
>> It's effectively a synonym which determines whether the "bool use_sort"
>> parameter of the table AM's relation_copy_for_cluster will be set. Heap-AM
>> plays along and sorts or not based on that.
>
> Hardly a synonym if it behaves differently?
I don't think this point is really worth arguing. We don't have to call it a
synonym, and the rest of the discussion remains the same.
>> But it's up to the TAM what it wants to do with that boolean, if in fact it
>> does anything at all based on that. A TAM could decide to do the exact
>> opposite of what Heap-AM does and instead sort on VACUUM FULL but not sort
>> on CLUSTER, or perhaps perform a randomized shuffle, or <insert your weird
>> behavior here>.
>
> That's bogus. Yes, an AM can do stupid stuff in a callback. But so what,
> that's possible with all extension APIs.
I don't think it's "stupid stuff". A major motivation, perhaps the only useful
motivation, for implementing a TAM is to get a non-trivial performance boost
(relative to heap) for some target workload, almost certainly at the expense of
worse performance for another workload. To optimize any particular workload
sufficiently to make it worth the bother, you've pretty much got to do
something meaningfully different than what heap does.
>> From the point-of-view of a TAM implementor, VACUUM FULL and CLUSTER are
>> synonyms. Or am I missing something?
>
> The callback gets passed use_sort. So just implement it use_sort = false and
> error out if use_sort = true?
I'm not going to say that your idea is unreasonable for a TAM that you might
choose to implement, but I don't see why that should be required of all TAMs
anybody might ever implement.
The callback that gets use_sort is called from copy_table_data(). By that
time, it's too late to avoid the
/*
* Open the relations we need.
*/
NewHeap = table_open(OIDNewHeap, AccessExclusiveLock);
OldHeap = table_open(OIDOldHeap, AccessExclusiveLock);
code that has already happened in cluster.c's copy_table_data() function, and
unless I raise an error, after returning from my TAM's callback, the cluster
code will replace the old table with the new one. I'm left no choices but to
copy my data over, loose my data, or abort the command. None of those are OK
options for me.
I'm open to different solutions. If a simple callback like
relation_supports_cluster(Relation rel) is too simplistic, and more parameters
with more context information is wanted, then fine, let's do that. But I can't
really complete my work with the interface as it stands now.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company