>
> * I proposed the StatisticsAreImportable callback function, but the
> ImportStatistics callback function is now allowed to fall back to
> analyzing, so I don't think StatisticsAreImportable has much value
> anymore.  I'll withdraw my proposal.  Attached is an updated patch
> removing it, in which 0001 is merged into 0002 for ease of review
> (0001 can't be tested without 0002).
>

Thanks.


> * I modified analyze.c to start progress monitor a bit earlier to
> monitor the work of ImportStatistics (and AnalyzeForeignTable) as part
> of initialization.
>
> * I noticed this odd behavior:
>
> create table t1 (id int, data text);
> create foreign table ft1 (id int, data text) server loopback options
> (table_name 't1');
>
> alter foreign table ft1 options (add fetch_stats 'false');
> analyze ft1 (ctid);
> ERROR:  column "ctid" of relation "ft1" does not exist
>
> Looks good, but:
>
> alter foreign table ft1 options (set fetch_stats 'true');
> analyze ft1 (ctid);
> NOTICE:  remote table public.t1 has no statistics to import
> HINT:  Falling back to ANALYZE with regular row sampling.
> ERROR:  column "ctid" of relation "ft1" does not exist
>
> postgres_fdw is doing the remote access without any need to do so,
> which isn't good.
>
> The cause is that the patches fail to validate the va_cols list given
> to analyze_rel before importing stats.  To fix, I modified analyze.c
> to do so before calling ImportStatistics.  I think this is also useful
> when analyzing inheritance trees, as it avoids validating the list
> repeatedly in that case.
>

I see this change and I like it.


> * I separated a bit of code in examine_attribute that checks whether
> the given column is analyzable into a new extern function so that FDWs
> can use it.
>

Interesting. I wouldn't have dared make a change that affected regular
analyze, but it makes sense.

postgres_fdw side:
>
> * In fetch_remote_statistics, if we get reltuples=0 for v14 or later,
> I think we should update only the relation stats with that info, and
> avoid resorting to analyzing, for efficiency, as I proposed before.  I
> modified that function (and import_fetched_statistics) that way.
>

This will miss out on the case where the remote table did get analyzed
once, when empty, but now isn't empty. I realize that shouldn't happen very
often, but the cost of rowsampling a table that is empty is very low.


> The root cause is that passing NULL to pg_restore_attribute_stats
> leaves the stats unchanged.  To fix, I modified
> import_fetched_statistics to do pg_clear_attribute_stats before the
> restore function.
>

+1


> * The matching logic in match_attrmap seems correct to me, but I think
> it's still complicated than necessary: you are doing a merge join of
> the RemoteAttributeMapping array and the attstats set by placing the
> latter in the outer side, but it makes it simpler to do the join the
> opposite way (ie, the former in the outer side), like the attached,
> which doesn't need an inner for-loop anymore, as the inner side (ie,
> the attstats set) consists of *distinct* columns.
>

The new match_attrmap seems reasonable.


> * I moved table_has_extended_stats to extended_stats.c and renamed it
> to match other functions in it so that other FDWs too can use it.
>

+1


> * The 0002 patch is using different levels (LOG, NOTICE, and WARNING)
> for logging in postgresImportStatisitcs, but I think it's appropriate
> to use elevel or WARNING for it.  So I modified the patch to use
> WARNING.  I modified the wording as well to match analyze.c.
>

+1


> * I added at the top/end of postgresImportStatisitcs log messages that
> show the start/end of importing stats.
>


+1


> That's it.
>

I see that remote_analyze didn't make it as a part of this patch. Is that
something you'd repackaged as a follow-on patch, or are you just done with
it?

Reply via email to