On 2016/07/13 18:00, Ashutosh Bapat wrote:
To fix the first, I'd like to propose (1) replacing the existing
has_foreign_join flag in the CachedPlan data structure with a new
flag, say uses_user_mapping, that indicates whether a cached plan
uses any user mapping regardless of whether the cached plan has
foreign joins or not (likewise, replace the hasForeignJoin flag in
PlannedStmt), and (2) invalidating the cached plan if the
uses_user_mapping flag is true.
That way we will have plan cache invalidations even when simple foreign
tables scans (not join) are involved, which means all the plans
involving any reference to a foreign table with valid user mapping
associated with it. That can be a huge cost as compared to the current
solution where sub-optimal plan will be used only when a user mapping is
changed while a statement has been prepared. That's a rare scenario and
somebody can work around that by preparing the statement again.
I'm not sure that's a good workaround. ISTM that people often don't pay
much attention to plan changes, so they would execute the inefficient
plan without realizing the plan change, it would take long, they would
start thinking what's happening there, and finally, they would find that
the reason for that is due to the plan change. I think we should
prevent such a trouble.
IIRC, we
had discussed this situation when implementing the cache invalidation
logic.
I didn't know that. Sorry for speaking up late.
But there's no workaround for your solution.
As you said, this is a rare scenario; in many cases, people define user
mappings properly beforehand. So, just invalidating all relevant plans
on the syscache invalidation events would be fine. (I thought one
possible improvement might be to track exactly the dependencies of plans
on user mappings and invalidate just those plans that depend on the user
mapping being modified the same way for user-defined functions, but I'm
not sure it's worth complicating the code.)
One benefit of using the proposed approach is that we could make the
FDW's handling of user mappings in BeginForeignScan more efficient;
currently, there is additional overhead caused by catalog re-lookups
to obtain the user mapping information for the
simple-foreign-table-scan case where user mappings mean something to
the FDW as in postgres_fdw (probably, updates on the catalogs are
infrequent, though), but we could improve the efficiency by using
the validated user mapping information created at planning time for
that case as well as for the foreign-join case.
postgres_fdw to fetches user mapping in some cases but never remembers
it. If what you are describing is a better way, it should have been
implemented before join pushdown was implemented. Refetching a user
mapping is not that expensive given that there is a high chance that it
will be found in the syscache, because it was accessed at the time of
planning.
That assumption is reasonably valid if execution is done immediately
after planning, but that doesn't necessarily follow.
Effect of plan cache invalidation is probably worse than
fetching the value from a sys cache again.
As I said above, we could expect updates on pg_user_mapping to be
infrequent, so the effect of the plan cache invalidation would be more
limited than that of re-fetching user mappings during BeginForeignScan.
I don't think the above change is sufficient to fix the second. The
root reason for that is that since currently, we allow the user
mapping OID (rel->umid) to be InvalidOid in two cases: (1) user
mappings mean something to the FDW but it can't get any user mapping
at planning time and (2) user mappings are meaningless to the FDW,
we cannot distinguish these two cases.
The way to differentiate between these two is to look at the serverid.
If server id is invalid it's the case 1,
Really? Maybe my explanation was not good, but consider a foreign join
plan created through GetForeignJoinPaths, by an FDW to which user
mappings are meaningless, like file_fdw. In that plan, the
corresponding server id would be valid, not invalid. No?
So, I'd like to introduce a new callback routine to specify that
user mappings mean something to the FDW as proposed by Tom [2], and
use that to reject the former case, which allows us to set the above
uses_user_mapping flag appropriately, ie, set the flag to true only
if user mapping changes require forcing a replan.
This routine is meaningless unless the core (or FDW) does not allow a
user mapping to be created for such FDWs. Without that, core code would
get confused as to what it should do when it sees a user mapping for an
FDW which says user mappings are meaningless.
The core wouldn't care about such a user mapping for the FDW; the core
would just ignore the user mapping. No?
Best regards,
Etsuro Fujita
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers