[HACKERS] TupleDescs and refcounts and such, again

Tom Lane Tue, 26 Dec 2006 07:47:42 -0800

I looked into the bug reported by Jean-Pierre Pelletier here:
http://archives.postgresql.org/pgsql-bugs/2006-12/msg00163.php
The cause of the crash is a reference to an already-deallocated
TupleDesc.  The problem query has the structure of


        SELECT sum(x) FROM (complicated-subselect) GROUP BY ...

which gets planned as HashAggregate atop a SubqueryScan, and the
reason for the crash is this coding in nodeAgg.c:

    /* if first time through, initialize hashslot by cloning input slot */
    if (hashslot->tts_tupleDescriptor == NULL)
    {
        ExecSetSlotDescriptor(hashslot, inputslot->tts_tupleDescriptor);

This means the upper query's tupletable contains a reference to the
result tuple descriptor of the subquery, which has been allocated in a
separate memory context (because the subquery has its own ExecutorState
and hence its own es_query_cxt).  During plan shutdown, the sub-query's
memory is freed before the upper query's tupletable is deallocated.
In an assert-enabled build this reliably causes a failure like "tupdesc
reference 401901f8 is not owned by resource owner Portal", because the
lower tupdesc has been overwritten by the memory-clobber code, and that
makes it look like it should be reference-counted.  (Pre-8.2 it
accidentally failed to malfunction because tupletable shutdown didn't
touch the referenced tupdescs at all.)

I can see a couple of possibilities for fixing this:

1. The most localized fix would be to introduce a CreateTupleDescCopy()
call into the above ExecSetSlotDescriptor() call.  But I have zero
confidence in this way because of the likelihood that there are similar
usages elsewhere.  Moreover a large part of the point of the tupdesc
refcounting changes was to avoid extra tupdesc-copying steps --- if we
have to copy tupdescs anywhere they might have come from a subplan, that
patch is a failure.

2. Somehow persuade the subplan to allocate its result tupdesc in the
upper query's query context.  Could probably be done with localized
copying in nodeSubqueryscan, but seems like a wart.

3. Rejigger CreateExecutorState so that a subquery does not have its own
es_query_cxt but shares the parent's context.  Then anything created at
query lifespan in the subquery lives just as long as stuff created in
the upper query.  AFAICS this wouldn't make any practical difference in
memory lifespan since subqueries are only destroyed when their parent is
... but it would solve this particular problem as well as any related
ones.

As you can probably guess, I'm leaning to #3, but wanted to see if
anyone had an objection or a better idea.

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

[HACKERS] TupleDescs and refcounts and such, again

Reply via email to