On Sat, Mar 14, 2026 at 12:36 AM Corey Huinker <[email protected]> wrote:
> On Fri, Mar 13, 2026 at 8:17 AM Etsuro Fujita <[email protected]> wrote:
>> On Fri, Mar 13, 2026 at 4:42 AM Corey Huinker <[email protected]> wrote:
>>
>> >> By the errors-to-warnings change, even when those libpq functions
>> >> fail, we fall back to the normal ANALYZE processing, but I don't
>> >> think that is OK: if those functions fail for some reason
>> >> (network-related issues like a disconnection, or memory-related
>> >> issues like out of memory), then libpq functions called later for
>> >> the normal processing would be likely to fail for the same reason,
>> >> causing the same failure again.  To avoid that, when libpq
>> >> functions in fetch_relstats() and fetch_attstats() fail, shouldn't
>> >> we just throw an error, as before?
>>
>> > Right now the code doesn't differentiate on the kind of error that
>> > was encountered.  If it was a network error, then not only do we
>> > expect fallback to also fail, but simple FDW queries will fail as
>> > well.  If, however, it were a permission issue (no SELECT permission
>> > on pg_stats, for instance, or pg_stats doesn't exist because the
>> > remote database is not a regular Postgres, like Redshift), then I
>> > definitely want to fall back.  Currently, the code makes no
>> > judgement on the details of the error; it just trusts that the
>> > fallback ANALYZE will either succeed (because the failure was
>> > permission-related) or quickly hit the same insurmountable error,
>> > and one extra PQsendQuery isn't that much overhead.
>>
>> I'm not sure that's really true; if the error is one that doesn't
>> take long to detect, the extra function call probably wouldn't be a
>> problem, but if the error is one that takes a long time to detect,
>> the extra call would make the situation worse.
>
> I'm struggling to think of what kind of error would take a long time
> to detect.  Network timeouts could be one thing, but such a situation
> would affect normal query operations as well, not just ANALYZE.  If
> you have a specific error in mind, please let me know.
I too was thinking of network timeouts (and in such a case I thought
that even if PQsendQuery ran successfully, the following PQgetResult
would have to block for a long time).  And yes, that situation can
occur for normal queries too, but there the extra function call is
done only when aborting the remote transaction, in a way that libpq
functions don't have to wait for a network timeout.

>> > If you think we should inspect the error we get and match it
>> > against a list of errors that should skip the retry or skip the
>> > fallback, I'd be in favor of that.  It's probably easier to start
>> > with a list of error codes that we feel are hopeless and should
>> > remain ERRORs, not WARNINGs.
>>
>> That's an improvement, but I think it's nice-to-have, not must-have.
>> As there's not much time left until the feature freeze, I'd like to
>> propose making the first cut as simple as possible and leaving that
>> for future work, if this is intended for v19.  If there are no
>> objections from you (or anyone else), I'd like to update the patch
>> as I proposed.
>
> I'd like to get this into 19 as well, so I'm likely to agree to your
> proposal.

Ok, will update the patch.

>> Another thing I noticed is related to this comment in fetch_relstats():
>>
>> +    /*
>> +     * Before v14, a reltuples value of 0 was ambiguous: it could either mean
>> +     * the relation is empty, or it could mean that it hadn't yet been
>> +     * vacuumed or analyzed.  (Newer versions use -1 for the latter case.)
>> +     *
>> +     * We can ignore this change, because if the remote table wasn't analyzed,
>> +     * then it would have no attribute stats, and thus we wouldn't have stats
>> +     * that we would try to import.  So we can take the reltuples value as-is.
>> +     */
>>
>> I might have misread the patch, but for v14 or later the patch tries
>> to import remote stats even when we have reltuples = -1.  Is that
>> intentional?
>> In that case the remote table has never yet been vacuumed/analyzed,
>> so I think we should just fall back to the normal processing.
>
> For cases where we affirmatively have -1, it makes sense to
> immediately go to the ANALYZE option (if available) and fall back if
> not.  I'll make that change.
>
>> Also, for v13 or earlier the patch tries to import remote stats even
>> when we have reltuples = 0, but in that case the remote table might
>> not have been vacuumed/analyzed, so to avoid possibly useless
>> processing, I think we should just fall back in that case as well.
>
> +1

I forgot to mention this: when we have reltuples = 0 for v14 or later,
the patch tries to import both relstats and attstats, but in that case
I think it would be OK to import only the former, for consistency with
the normal processing.  What do you think about that?

Best regards,
Etsuro Fujita
