I'm still a bit confused and unhappy about the initialization of ResultRelInfos and the various fields in them. We've made progress in the previous patches, but it's still a bit messy.

                /*
                 * If transition tuples will be captured, initialize a map to 
convert
                 * child tuples into the format of the table mentioned in the 
query
                 * (root relation), because the transition tuple store can only 
store
                 * tuples in the root table format.  However for INSERT, the 
map is
                 * only initialized for a given partition when the partition 
itself is
                 * first initialized by ExecFindPartition.  Also, this map is 
also
                 * needed if an UPDATE ends up having to move tuples across
                 * partitions, because in that case the child tuple to be moved 
first
                 * needs to be converted into the root table's format.  In that 
case,
                 * we use GetChildToRootMap() to either create one from scratch 
if
                 * we didn't already create it here.
                 *
                 * Note: We cannot always initialize this map lazily, that is, 
use
                 * GetChildToRootMap(), because AfterTriggerSaveEvent(), which 
needs
                 * the map, doesn't have access to the "target" relation that is
                 * needed to create the map.
                 */
                if (mtstate->mt_transition_capture && operation != CMD_INSERT)
                {
                        Relation        relation = 
resultRelInfo->ri_RelationDesc;
                        Relation        targetRel = 
mtstate->rootResultRelInfo->ri_RelationDesc;

                        resultRelInfo->ri_ChildToRootMap =
                                
convert_tuples_by_name(RelationGetDescr(relation),
                                                                           
RelationGetDescr(targetRel));
                        /* First time creating the map for this result 
relation. */
                        Assert(!resultRelInfo->ri_ChildToRootMapValid);
                        resultRelInfo->ri_ChildToRootMapValid = true;
                }

The comment explains that AfterTriggerSaveEvent() cannot use GetChildToRootMap(), because it doesn't have access to the root target relation. But there is a field for that in ResultRelInfo: ri_PartitionRoot. However, that's only set up when we do partition routing.

How about we rename ri_PartitionRoot to e.g ri_RootTarget, and set it always, even for non-partition inheritance? We have that information available when we initialize the ResultRelInfo, so might as well.

Some code currently checks ri_PartitionRoot, to determine if a tuple that's been inserted, has been routed. For example:

                /*
                 * Also check the tuple against the partition constraint, if 
there is
                 * one; except that if we got here via tuple-routing, we don't 
need to
                 * if there's no BR trigger defined on the partition.
                 */
                if (resultRelationDesc->rd_rel->relispartition &&
                        (resultRelInfo->ri_PartitionRoot == NULL ||
                         (resultRelInfo->ri_TrigDesc &&
                          resultRelInfo->ri_TrigDesc->trig_insert_before_row)))
                        ExecPartitionCheck(resultRelInfo, slot, estate, true);

So if we set ri_PartitionRoot always, we would need some other way to determine if the tuple at hand has actually been routed or not. But wouldn't that be a good thing anyway? Isn't it possible that the same ResultRelInfo is sometimes used for routed tuples, and sometimes for tuples that have been inserted/updated "directly"? ExecLookupUpdateResultRelByOid() sets that field lazily, so I think it would be possible to get here with ri_PartitionRoot either set or not, depending on whether an earlier cross-partition update was routed to the table.

The above check is just an optimization, to skip unnecessary ExecPartitionCheck() calls, but I think this snippet in ExecConstraints() needs to get this right:

                                /*
                                 * If the tuple has been routed, it's been 
converted to the
                                 * partition's rowtype, which might differ from 
the root
                                 * table's.  We must convert it back to the 
root table's
                                 * rowtype so that val_desc shown error message 
matches the
                                 * input tuple.
                                 */
                                if (resultRelInfo->ri_PartitionRoot)
                                {
                                        AttrMap    *map;

                                        rel = resultRelInfo->ri_PartitionRoot;
                                        tupdesc = RelationGetDescr(rel);
                                        /* a reverse map */
                                        map = 
build_attrmap_by_name_if_req(orig_tupdesc,
                                                                                
                           tupdesc);

                                        /*
                                         * Partition-specific slot's tupdesc 
can't be changed, so
                                         * allocate a new one.
                                         */
                                        if (map != NULL)
                                                slot = 
execute_attr_map_slot(map, slot,
                                                                                    
                     MakeTupleTableSlot(tupdesc, &TTSOpsVirtual));
                                }

Is that an existing bug, or am I missing?

- Heikki


Reply via email to