One of the friction points I have found in migrating from Oracle to PostgreSQL
is in the conversion of hierarchical queries from the Oracle START WITH/CONNECT
BY/ORDER SIBLINGS by pattern to using the ANSI recursive subquery form.
Once you wrap your head around it, the ANSI form is not so bad with one major
exception. In order to achieve the equivalent of Oracle’s ORDER SIBLINGS BY
clause, you need to add an additional column containing an array with the
accumulated path back to the root of the hierarchy for each row. The problem
with that is that it leaves you with an unfortunate choice: either accept the
inefficiency of returning the array with the path back to the client (which the
client doesn’t need or want), or requiring the application to explicitly list
all of the columns that it wants just to exclude the hierarchy column, which
can be hard to maintain, especially if your application needs to support both
databases. If you have a ORM model where there could be multiple queries that
share the same client code to read the result set, you might have to change
multiple queries when new columns are added to a table or view even though you
have centralized the processing of the result set.
The ideal solution for this would be for PostgreSQL to support the Oracle
syntax and internally convert it to the ANSI form. Failing that, I have a
modest suggestion that I would like to start a discussion around. What if you
could use the MINUS keyword in the column list of a select statement to remove
a column from the result set returned to the client? What I have in mind is
something like this:
To achieve the equivalent of the following Oracle query:
SELECT T.*
FROM T
START WITH T.ParentID IS NULL
CONNECT BY T.ParentID = PRIOR T.ID
ORDER SIBLINGS BY T.OrderVal
You could use
WITH RECURSIVE TT AS (
SELECT T0.*, ARRAY[]::INTEGER[] || T.OrderVal AS Sortable
FROM T T0
UNION ALL
SELECT T1.*, TT.Sortable || T1 AS Sortable
FROM TT
INNER JOIN T T1 ON (T1.ParentID = TT.ID)
)
SELECT TT.* MINUS TT.Sortable
FROM TT
ORDER BY TT.Sortable
Now the Sortable column can be used to order the result set but is not returned
to the client.
Not knowing the internals of the parser, I’m assuming that the use of MINUS in
this construct would be distinguishable from the set difference use case
because the expression being subtracted is a column (or perhaps even a lst of
columns) rather than a SELECT expression.