On Thu, Sep 11, 2014 at 8:40 PM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote: >> On Thu, Sep 11, 2014 at 11:24 AM, Kouhei Kaigai <kai...@ak.jp.nec.com> >> wrote: >> >> Don't the changes to src/backend/optimizer/plan/createplan.c belong >> >> in patch #2? >> >> >> > The borderline between #1 and #2 is little bit bogus. So, I moved most >> > of portion into #1, however, invocation of InitCustomScan (that is a >> > callback in CustomPlanMethod) in create_custom_plan() is still in #2. >> >> Eh, create_custom_scan() certainly looks like it is in #1 from here, or >> at least part of it is. It calculates tlist and clauses and then does >> nothing with them. That clearly can't be the right division. >> >> I think it would make sense to have create_custom_scan() compute tlist and >> clauses first, and then pass those to CreateCustomPlan(). Then you don't >> need a separate InitCustomScan() - which is misnamed anyway, since it has >> nothing to do with ExecInitCustomScan(). >> > The only reason why I put separate hooks here is, create_custom_scan() needs > to know exact size of the CustomScan node (including private fields), however, > it is helpful for extensions to kick its callback to initialize the node > next to the common initialization stuff.
Why does it need to know that? I don't see that it's doing anything that requires knowing the size of that node, and if it is, I think it shouldn't be. That should get delegated to the callback provided by the custom plan provider. > Regarding to the naming, how about GetCustomScan() instead of > InitCustomScan()? > It follows the manner in create_foreignscan_plan(). I guess that's a bit better, but come to think of it, I'd really like to avoid baking in the assumption that the custom path provider has to return any particular type of plan node. A good start would be to give it a name that doesn't imply that - e.g. PlanCustomPath(). >> > OK, I revised. Now custom-scan assumes it has a particular valid >> > relation to be scanned, so no code path with scanrelid == 0 at this moment. >> > >> > Let us revisit this scenario when custom-scan replaces relation-joins. >> > In this case, custom-scan will not be associated with a particular >> > base- relation, thus it needs to admit a custom-scan node with scanrelid >> == 0. >> >> Yeah, I guess the question there is whether we'll want let CustomScan have >> scanrelid == 0 or require that CustomJoin be used there instead. >> > Right now, I cannot imagine a use case that requires individual CustomJoin > node because CustomScan with scanrelid==0 (that performs like custom-plan > rather than custom-scan in actually) is sufficient. > > If a CustomScan gets chosen instead of built-in join logics, it shall looks > like a relation scan on the virtual one that is consists of two underlying > relation. Callbacks of the CustomScan has a responsibility to join underlying > relations; that is invisible from the core executor. > > It seems to me CustomScan with scanrelid==0 is sufficient to implement > an alternative logic on relation joins, don't need an individual node > from the standpoint of executor. That's valid logic, but it's not the only way to do it. If we have CustomScan and CustomJoin, either of them will require some adaption to handle this case. We can either allow a custom scan that isn't scanning any particular relation (i.e. scanrelid == 0), or we can allow a custom join that has no children. I don't know which way will come out cleaner, and I think it's good to leave that decision to one side for now. >> >> Why can't the Custom(GpuHashJoin) node build the hash table >> >> internally instead of using a separate node? >> >> >> > It's possible, however, it prevents to check sub-plans using EXPLAIN >> > if we manage inner-plans internally. So, I'd like to have a separate >> > node being connected to the inner-plan. >> >> Isn't that just a matter of letting the EXPLAIN code print more stuff? >> Why can't it? >> > My GpuHashJoin takes multiple relations to load them a hash-table. > On the other hand, Plan node can have two underlying relations at most > (inner/outer). Outer-side is occupied by the larger relation, so it > needs to make multiple relations visible using inner-branch. > If CustomScan can has a list of multiple underlying plan-nodes, like > Append node, it can represent the structure above in straightforward > way, but I'm uncertain which is the better design. Right. I think the key point is that it is *possible* to make this work without a multiexec interface, and it seems like we're agreed that it is. Now perhaps we will decide that there is enough benefit in having multiexec support that we want to do it anyway, but it's clearly not a hard requirement, because it can be done without that in the way you describe here. Let's leave to the future the decision as to how to proceed here; getting the basic thing done is hard enough. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers