On Sat, Mar 14, 2026 at 12:36 AM Corey Huinker <[email protected]> wrote:
> On Fri, Mar 13, 2026 at 8:17 AM Etsuro Fujita <[email protected]> wrote:
>> On Fri, Mar 13, 2026 at 4:42 AM Corey Huinker <[email protected]> 
>> wrote:
>> >>
>> >> By the errors-to-warnings change, even when those libpq functions
>> >> fail, we fall back to the normal ANALYZE processing, but I don't think
>> >> that that is OK, because if those functions fail for some reasons
>> >> (network-related issues like network disconnection or memory-related
>> >> issues like out of memory), then libpq functions called later for the
>> >> normal processing would also be likely to fail for the same reasons,
>> >> causing the same failure again, which I don't think is good.  To avoid
>> >> that, when libpq functions in fetch_relstats() and fetch_attstats()
>> >> fail, shouldn't we just throw an error, as before?
>>
>> > Right now the code doesn't differentiate on the kind of error that was 
>> > encountered. If it was a network error, then not only do we expect the 
>> > fallback to fail, but simple FDW queries will fail as well. If, 
>> > however, it was a permission issue (no SELECT permission on pg_stats, for 
>> > instance, or pg_stats doesn't exist because the remote database is not a 
>> > regular Postgres, like Redshift or something), then I definitely want to 
>> > fall back. Currently, the code makes no judgement on the details of the 
>> > error; it just trusts that fallback-analyze will either succeed (because 
>> > the error was permissions-related) or quickly encounter the same 
>> > insurmountable error, and one extra PQsendQuery isn't that much overhead.
>>
>> I'm not sure that's really true; if the error is one that is quick to
>> detect, the extra function call probably wouldn't be a problem, but if
>> it is one that takes a long time to detect, the extra call would make
>> the situation worse.
>
> I'm struggling to think of what kind of error would take a long time to 
> detect. Network timeouts could be one, but that situation would affect 
> normal query operations as well, not just ANALYZE. If you have a specific 
> error in mind, please let me know.

I too was thinking of network timeouts (and in such a case I thought
that even if PQsendQuery ran successfully, the following PQgetResult
would have to block for a long time).  And yes, that situation can
occur for normal queries too, but for such a query the extra function
calls are made only when aborting the remote transaction, in a way
that libpq functions don't have to wait for a network timeout.

>> > If you think we should inspect the error we get and match it against a 
>> > list of errors that should skip the retry or skip the fallback, I'd be in 
>> > favor of that. It's probably easier to start with a list of error codes 
>> > that we feel are hopeless and should remain ERRORs, not WARNINGs.
>>
>> That would be an improvement, but I think it's nice-to-have, not
>> must-have.  As there's not much time left until the feature freeze,
>> I'd like to propose making the first cut as simple as possible and
>> leaving that for future work, if this is intended for v19.  If there
>> are no objections from you (or anyone else), I'd like to update the
>> patch as I proposed.
>
> I'd like to get this into 19 as well, so I'm likely to agree to your proposal.

OK, will update the patch.

>> Another thing I noticed is related to this comment in fetch_relstats():
>>
>> +   /*
>> +    * Before v14, a reltuples value of 0 was ambiguous: it could either mean
>> +    * the relation is empty, or it could mean that it hadn't yet been
>> +    * vacuumed or analyzed.  (Newer versions use -1 for the latter case).
>> +    *
>> +    * We can ignore this change, because if the remote table wasn't 
>> analyzed,
>> +    * then it would have no attribute stats, and thus we wouldn't have stats
>> +    * that we would try to import. So we can take the reltuples value as-is.
>> +    */
>>
>> I might have misread the patch, but for v14 or later the patch tries
>> to import remote stats even when we have reltuples = -1.  Is that
>> intentional?  In that case the remote table has never yet been
>> vacuumed/analyzed, so I think we should just fall back to the normal
>> processing.
>
> For cases where we affirmatively have -1, it makes sense to immediately go to 
> the ANALYZE option (if available) and fall back if not. I'll make that change.
>
>>   Also, for v13 or earlier it also tries to import remote
>> stats even when we have reltuples = 0, but in that case the remote
>> table might not have been vacuumed/analyzed, so to avoid possibly
>> useless processing, I think we should just fall back in that case as
>> well.
>
> +1

I forgot to mention this, but when we have reltuples = 0 for v14 or
later, the patch tries to import both relstats and attstats; in that
case I think it would be OK to import only the former, for consistency
with the normal processing.  What do you think about that?

Best regards,
Etsuro Fujita

