Hi, On 2018-04-05 11:31:48 +0530, Pavan Deolasee wrote: > > +/*--------------------------------------------------------- > > ---------------- > > + * > > + * nodeMerge.c > > + * routines to handle Merge nodes relating to the MERGE command > > > > Isn't this file misnamed and it should be execMerge.c? The rest of the > > node*.c files are for nodes that are invoked via execProcnode / > > ExecProcNode(). This isn't an actual executor node. > > > > Makes sense. Done. (Now that the patch is committed, I don't know if there > would be a rethink about changing file names. May be not, just raising that > concern)
It absolutely definitely needed to be renamed. But that's been done, so ... > > What's the reasoning behind making this be an anomaluous type of > > executor node? > Didn't quite get that. I think naming of the file was bad (fixed now), but > I think it's a good idea to move the new code in a new file, from > maintainability as well as coverage perspective. If you've something very > different in mind, can you explain in more details? Well, it was kinda modeled as an executor node, including the file name. That's somewhat fixed now. I still am extremely suspicious of the codeflow here. My impression is that this simply shouldn't be going through nodeModifyTuple, but be it's own nodeMerge node. The trigger handling would need to be abstraced into execTrigger.c or such to avoid duplication. We're now going from nodeModifyTable.c:ExecModifyTable() into execMerge.c:ExecMerge(), back to nodeModifyTable.c:ExecUpdate/Insert(). To avoid ExecInsert() doing things that aren't appropriate for merge, we then pass an actionState, which neuters part of ExecUpdate/Insert(). This is just bad. I think this should be cleaned up before it can be released, which means this feature should be reverted and cleaned up nicely before being re-committed. Otherwise we'll sit on this bad architecture and it'll make future features harder. And if somebody cleans it up Simon will complain that things are needlessly being destabilized (hello xlog.c with it's 1000+ LOC functions). > > + /* > > + * If there are not WHEN MATCHED actions, we are done. > > + */ > > + if (mergeMatchedActionStates == NIL) > > + return true; > > > > Maybe I'm confused, but why is mergeMatchedActionStates attached to > > per-partition info? How can this differ in number between partitions, > > requiring us to re-check it below fetching the partition info above? > > > > > Because each partition may have a columns in different order, dropped > attributes etc. So we need to give treatment to the quals/targetlists. See > ON CONFLICT DO UPDATE for similar code. But the count wouldn't change, no? So we return before building the partition info if there's no MATCHED action? > > It seems fairly bad architecturally to me that we now have > > EvalPlanQual() loops in both this routine *and* > > ExecUpdate()/ExecDelete(). > > > > > This was done after review by Peter and I think I like the new way too. > Also keeps the regular UPDATE/DELETE code paths least changed and let Merge > handle concurrency issues specific to it. It also makes the whole code barely readable. Minimal amount of changes isn't a bad goal, but if the consequence of that is poor layering and repeated code it's bad as well. > > + /* > > + * Initialize hufdp. Since the caller is only interested in the failure > > + * status, initialize with the state that is used to indicate > > successful > > + * operation. > > + */ > > + if (hufdp) > > + hufdp->result = HeapTupleMayBeUpdated; > > + > > > > This signals for me that the addition of the result field to HUFD wasn't > > architecturally the right thing. HUFD is supposed to be supposed to be > > returned by heap_update(), reusing and setting it from other places > > seems like a layering violation to me. > I am not sure I agree. Sure we can keep adding more parameters to > ExecUpdate/ExecDelete and such routines, but I thought passing back all > information via a single struct makes more sense. You can just wrap HUFD in another struct that has the necessary information. > > + > > + /* > > + * Add a whole-row-Var entry to support references to "source.*". > > + */ > > + var = makeWholeRowVar(rt_rte, rt_rtindex, 0, false); > > + te = makeTargetEntry((Expr *) var, list_length(*mergeSourceTargetList) > > + 1, > > + NULL, true); > > > > Why can't this properly dealt with by transformWholeRowRef() etc? > > > > > I just followed ON CONFLICT style. May be there is a better way, but not > clear how. What code are you referring to? > > Why is this, and not building a proper executor node for merge that > > knows how to get the tuples, the right approach? We did a rough > > equivalent for matview updates, and I think it turned out to be a pretty > > bad plan. > > > > > I am still not sure why that would be any better. Can you explain in detail > what exactly you've in mind and how's that significantly better than what > we have today? Because the code flow would be understandable, we'd have proper parser locations, we'd not have to introduce a new query source type, the deparsing would be less painful, we'd not have to optimizer like hijinks in parse analysis etc. In my opinion you're attempting to do planner / optimizer stuff at the parse-analysis level, and that's just not a good idea. Greetings, Andres Freund