> On Jun 15, 2022, at 7:30 PM, Andres Freund <and...@anarazel.de> wrote:
> 
>> It's effectively a synonym which determines whether the "bool use_sort"
>> parameter of the table AM's relation_copy_for_cluster will be set.  Heap-AM
>> plays along and sorts or not based on that.
> 
> Hardly a synonym if it behaves differently?

I don't think this point is really worth arguing.  We don't have to call it a 
synonym, and the rest of the discussion remains the same.

>> But it's up to the TAM what it wants to do with that boolean, if in fact it
>> does anything at all based on that.  A TAM could decide to do the exact
>> opposite of what Heap-AM does and instead sort on VACUUM FULL but not sort
>> on CLUSTER, or perhaps perform a randomized shuffle, or <insert your weird
>> behavior here>.
> 
> That's bogus. Yes, an AM can do stupid stuff in a callback. But so what,
> that's possible with all extension APIs.

I don't think it's "stupid stuff".  A major motivation, perhaps the only useful 
motivation, for implementing a TAM is to get a non-trivial performance boost 
(relative to heap) for some target workload, almost certainly at the expense of 
worse performance for another workload.  To optimize any particular workload 
sufficiently to make it worth the bother, you've pretty much got to do 
something meaningfully different than what heap does.


>> From the point-of-view of a TAM implementor, VACUUM FULL and CLUSTER are
>> synonyms.  Or am I missing something?
> 
> The callback gets passed use_sort. So just implement it use_sort = false and
> error out if use_sort = true?

I'm not going to say that your idea is unreasonable for a TAM that you might 
choose to implement, but I don't see why that should be required of all TAMs 
anybody might ever implement.

The callback that gets use_sort is called from copy_table_data().  By that 
time, it's too late to avoid the

    /*
     * Open the relations we need.
     */
    NewHeap = table_open(OIDNewHeap, AccessExclusiveLock);
    OldHeap = table_open(OIDOldHeap, AccessExclusiveLock);

code that has already happened in cluster.c's copy_table_data() function, and 
unless I raise an error, after returning from my TAM's callback, the cluster 
code will replace the old table with the new one.  I'm left no choices but to 
copy my data over, loose my data, or abort the command.  None of those are OK 
options for me.

I'm open to different solutions.  If a simple callback like 
relation_supports_cluster(Relation rel) is too simplistic, and more parameters 
with more context information is wanted, then fine, let's do that.  But I can't 
really complete my work with the interface as it stands now.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Reply via email to