On 2016/07/13 18:00, Ashutosh Bapat wrote:
    To fix the first, I'd like to propose (1) replacing the existing
    has_foreign_join flag in the CachedPlan data structure with a new
    flag, say uses_user_mapping, that indicates whether a cached plan
    uses any user mapping regardless of whether the cached plan has
    foreign joins or not (likewise, replace the hasForeignJoin flag in
    PlannedStmt), and (2) invalidating the cached plan if the
    uses_user_mapping flag is true.

That way we will have plan cache invalidations even when simple foreign
table scans (not joins) are involved, which means all the plans
involving any reference to a foreign table with a valid user mapping
associated with it. That can be a huge cost compared to the current
solution, where a sub-optimal plan will be used only when a user mapping
is changed while a statement has been prepared. That's a rare scenario
and somebody can work around that by preparing the statement again.

I'm not sure that's a good workaround. ISTM that people often don't pay much attention to plan changes, so they would execute the inefficient plan without realizing the plan had changed, notice that it takes long, start wondering what's happening, and only then find that the plan change is the cause. I think we should prevent that kind of trouble.

IIRC, we
had discussed this situation when implementing the cache invalidation
logic.

I didn't know that.  Sorry for speaking up late.

But there's no workaround for your solution.

As you said, this is a rare scenario; in many cases, people define user mappings properly beforehand. So, just invalidating all relevant plans on the syscache invalidation events would be fine. (I thought one possible improvement might be to track exactly the dependencies of plans on user mappings and invalidate just those plans that depend on the user mapping being modified, the same way as for user-defined functions, but I'm not sure it's worth complicating the code.)
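
To make the trade-off concrete, here is a rough toy sketch of the two strategies. It is not PostgreSQL code: ToyCachedPlan, its umid field, and both functions are made-up names for illustration, and it assumes each plan depends on at most one user mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

/* Toy stand-in for a cached plan; NOT PostgreSQL's CachedPlan struct. */
typedef struct ToyCachedPlan
{
    bool is_valid;           /* false means "replan before next use" */
    bool uses_user_mapping;  /* the flag proposed in this thread */
    Oid  umid;               /* hypothetical: the one mapping the plan used */
} ToyCachedPlan;

/* Coarse strategy: any user mapping change invalidates every flagged plan. */
int
toy_invalidate_all(ToyCachedPlan *plans, size_t nplans)
{
    int ninvalidated = 0;

    assert(plans != NULL || nplans == 0);
    for (size_t i = 0; i < nplans; i++)
    {
        if (plans[i].is_valid && plans[i].uses_user_mapping)
        {
            plans[i].is_valid = false;
            ninvalidated++;
        }
    }
    return ninvalidated;
}

/* Fine-grained strategy: invalidate only plans that used the changed mapping. */
int
toy_invalidate_one(ToyCachedPlan *plans, size_t nplans, Oid changed_umid)
{
    int ninvalidated = 0;

    assert(plans != NULL || nplans == 0);
    for (size_t i = 0; i < nplans; i++)
    {
        if (plans[i].is_valid && plans[i].uses_user_mapping &&
            plans[i].umid == changed_umid)
        {
            plans[i].is_valid = false;
            ninvalidated++;
        }
    }
    return ninvalidated;
}
```

The second function needs per-plan dependency bookkeeping (the umid field here), which is the extra complexity I'm not sure is worth it.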

    One benefit of using the proposed approach is that we could make the
    FDW's handling of user mappings in BeginForeignScan more efficient;
    currently, there is additional overhead caused by catalog re-lookups
    to obtain the user mapping information for the
    simple-foreign-table-scan case where user mappings mean something to
    the FDW as in postgres_fdw (probably, updates on the catalogs are
    infrequent, though), but we could improve the efficiency by using
    the validated user mapping information created at planning time for
    that case as well as for the foreign-join case.

postgres_fdw fetches the user mapping in some cases but never remembers
it. If what you are describing is a better way, it should have been
implemented before join pushdown was implemented. Refetching a user
mapping is not that expensive given that there is a high chance that it
will be found in the syscache, because it was accessed at the time of
planning.

That assumption is reasonably valid if execution is done immediately after planning, but that isn't necessarily the case.

The effect of plan cache invalidation is probably worse than
fetching the value from the syscache again.

As I said above, we could expect updates on pg_user_mapping to be infrequent, so the effect of the plan cache invalidation would be more limited than that of re-fetching user mappings during BeginForeignScan.

    I don't think the above change is sufficient to fix the second.  The
    root reason is that, since we currently allow the user mapping OID
    (rel->umid) to be InvalidOid in two cases: (1) user mappings mean
    something to the FDW but it can't get any user mapping at planning
    time and (2) user mappings are meaningless to the FDW, we cannot
    distinguish these two cases.

The way to differentiate between these two is to look at the serverid.
If server id is invalid it's the case 1,

Really? Maybe my explanation was not good, but consider a foreign join plan created through GetForeignJoinPaths, by an FDW to which user mappings are meaningless, like file_fdw. In that plan, the corresponding server id would be valid, not invalid. No?
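
A toy illustration of what I mean, under the assumption stated above that the join relation carries a valid server id in both cases. ToyJoinRel and toy_is_case1_by_serverid are made-up names, not planner structures or functions.

```c
#include <stdbool.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Toy stand-in for the two fields under discussion; NOT RelOptInfo. */
typedef struct ToyJoinRel
{
    Oid serverid;  /* foreign server of the join relation */
    Oid umid;      /* user mapping OID, InvalidOid if none was obtained */
} ToyJoinRel;

/* The check suggested upthread: an invalid server id identifies case (1). */
bool
toy_is_case1_by_serverid(const ToyJoinRel *rel)
{
    return rel->serverid == InvalidOid;
}
```

With a join where the FDW wanted a user mapping but found none, and a join by a file_fdw-like FDW, both relations would look like {valid serverid, InvalidOid umid}, so the check returns false for both and cannot tell them apart.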

    So, I'd like to introduce a new callback routine to specify that
    user mappings mean something to the FDW as proposed by Tom [2], and
    use that to reject the former case, which allows us to set the above
    uses_user_mapping flag appropriately, ie, set the flag to true only
    if user mapping changes require forcing a replan.

This routine is meaningless unless the core (or FDW) does not allow a
user mapping to be created for such FDWs. Without that, core code would
get confused as to what it should do when it sees a user mapping for an
FDW which says user mappings are meaningless.

The core wouldn't care about such a user mapping for the FDW; the core would just ignore the user mapping. No?
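
Roughly, I imagine something like the following toy sketch. The callback name, ToyFdwRoutine, and the rule that a missing callback means "user mappings are ignored" are all assumptions made for illustration; none of them is an existing FdwRoutine member or an agreed design.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: does a user mapping change affect this FDW's plans? */
typedef bool (*ToyUserMappingMatters_function) (void);

/* Toy stand-in for the FDW callback table; NOT PostgreSQL's FdwRoutine. */
typedef struct ToyFdwRoutine
{
    ToyUserMappingMatters_function UserMappingMatters;  /* may be NULL */
} ToyFdwRoutine;

/* What a postgres_fdw-like FDW might return. */
bool
toy_mappings_matter(void)
{
    return true;
}

/* What a file_fdw-like FDW might return. */
bool
toy_mappings_ignored(void)
{
    return false;
}

/*
 * Decide the proposed uses_user_mapping flag for a plan.  Assumption: an FDW
 * that does not provide the callback is treated as ignoring user mappings.
 */
bool
toy_plan_uses_user_mapping(const ToyFdwRoutine *fdw, bool plan_has_foreign_rel)
{
    if (!plan_has_foreign_rel)
        return false;
    if (fdw->UserMappingMatters == NULL)
        return false;
    return fdw->UserMappingMatters();
}
```

In this sketch the core never looks at the user mapping itself for an FDW that answers false, so a stray mapping defined for such an FDW would simply have no effect on replanning.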

Best regards,
Etsuro Fujita



