On 12.3.2015 14:17, Alvaro Herrera wrote:
> Tomas Vondra wrote:
>> On 12.3.2015 03:16, Tom Lane wrote:
> 
>>> I agree though that it's worth considering defining 
>>> pg_attribute.attnum as the logical column position so as to minimize 
>>> the effects on client-side code. I doubt there is much stuff 
>>> client-side that cares about column creation order, but there is 
>>> plenty that cares about logical column order. OTOH this would 
>>> introduce confusion into the backend code, since Alvaro's definition 
>>> of attnum is what most of the backend should care about.
>>
>> IMHO reusing attnum for logical column order would actually make it more
>> complex, especially if we allow users to modify the logical order using
>> ALTER TABLE. Because if you change it, you have to walk through all the
>> places where it might be referenced and update those too (say, columns
>> referenced in indexes and such). Keeping attnum immutable makes this
>> much easier and simpler.
> 
> I think you're misunderstanding. The suggestion, as I understand it,
> is to rename the attnum column to something else (maybe, say,
> attidnum), and rename attlognum to attnum. That preserves the
> existing property that "ORDER BY attnum" gives you the correct view
> of the table from the point of view of the user. That's very useful
> because it means clients looking at pg_attribute need less changes,
> or maybe none at all.

Hmm ... I understood it as a suggestion to drop attlognum and just
define (attnum, attphysnum).

> I think this wouldn't be too difficult to implement, because there 
> aren't that many places that refer to the column-identity attribute
> by name; most of them just grab the TupleDesc->attrs array in
> whatever order is appropriate and scan that in a loop. Only a few of
> these use att->attnum inside the loop --- that's what would need to
> be changed, and it should be pretty mechanical.

I think it's way more complicated. We may fix all the pieces of the
code, but that's not all - attnum is referenced in various system views,
catalogs and such. For example pg_stats view does this:

  FROM pg_statistic s JOIN pg_class c ON (c.oid = s.starelid)
    JOIN pg_attribute a ON (c.oid = attrelid AND attnum = s.staattnum)
      LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace)
  WHERE NOT attisdropped
    AND has_column_privilege(c.oid, a.attnum, 'select');

information_schema also uses attnum on many places too.

I see the catalogs as a kind of public API, and redefining the meaning
of an existing column this way seems tricky, especially when we
reference it from other catalogs - I'm pretty sure there's plenty of SQL
queries in various tools that rely on this. Just google for "pg_indexes
indkeys unnest" and you'll find posts like this one from Craig:


http://stackoverflow.com/questions/18121103/how-to-get-the-index-column-orderasc-desc-nulls-first-from-postgresql

specifically tell people to do this:

SELECT
    ...
    FROM (
      SELECT
        pg_class.relname,
        ...
        unnest(pg_index.indkey) AS k
      FROM pg_index
      INNER JOIN pg_class ON pg_index.indexrelid = pg_class.oid
    ) i
    ...
    INNER JOIN pg_attribute ON (pg_attribute.attrelid = i.indrelid
                            AND pg_attribute.attnum = k);

which specifically tells people to match attnum vs. indkeys. If we
redefine the meaning of attnum, and instead match indkeys against a
different column (say, attidnum), all those queries will be broken.

Which actually breaks the catalog definition as specified here:

  http://www.postgresql.org/docs/devel/static/catalog-pg-index.html

which explicitly says that indkey references pg_attribute.attnum.

But maybe we don't really care about breaking this API and it is a good
approach - I need to think about it and try it.

-- 
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to