Re: Compatible defaults for LEAD/LAG

Pavel Stehule Sun, 30 Aug 2020 22:06:38 -0700

ne 30. 8. 2020 v 23:59 odesílatel Tom Lane <t...@sss.pgh.pa.us> napsal:


> Pavel Stehule <pavel.steh...@gmail.com> writes:
> > This is nice example of usage of anycompatible type (that is consistent
> > with other things in Postgres), but standard says something else.
> > It can be easily solved with https://commitfest.postgresql.org/28/2081/
> -
> > but Tom doesn't like this patch.
> > I am more inclined to think so this feature should be implemented
> > differently - there is no strong reason to go against standard or against
> > the implementations of other databases (and increase the costs of
> porting).
> > Now the implementation is limited, but allowed behaviour is 100% ANSI.
>
> I don't particularly buy this argument.  The case at hand is what to do
> if we have, say,
>
>         select lag(integer_column, 1, 1.2) over ...
>
> The proposed patch would result in the output being of type numeric,
> and any rows using the default would show "1.2".  The spec says that
> the right thing is to return integer, and we should round the default
> to "1" to make that work.  But
>
> (1) I doubt that anybody actually writes such things;
>
> (2) For anyone who does write it, the spec's behavior fails to meet
> the principle of least surprise.  It is not normally the case that
> any information-losing cast would be applied silently within an
> expression.
>

postgres=# create table foo(a int);
CREATE TABLE
postgres=# insert into foo values(1.1);
INSERT 0 1

postgres=# create table foo(a int default 1.1);
CREATE TABLE
postgres=# insert into foo values(default);
INSERT 0 1
postgres=# select * from foo;
┌───┐
│ a │
╞═══╡
│ 1 │
└───┘
(1 row)

It is maybe strange, but it is not an unusual pattern in SQL. I think it
can be analogy with default clause

DECLARE a int DEFAULT 1.2;

The default value doesn't change a type of variable. This is maybe too
stupid example. There can be other little bit more realistic

CREATE OR REPLACE FUNCTION foo(a numeric, b numeric, ...
DECLARE x int DEFAULT a;
BEGIN
  ...

I am afraid about performance - if default value can change type, then some
other things can stop work - like using index.

For *this* case we don't speak about some operations between two
independent operands or function arguments. We are speaking about
the value and about a *default* for the value.


> So this deviation from spec doesn't bother me; we have much bigger ones.
>

ok,  if it is acceptable for other people, I can accept it too - I
understand well so it is a corner case and there is some consistency with
other Postgres features.

Maybe this difference should be mentioned in documentation.


> My concern with this patch is what I said upthread: I'm not sure that
> we should be adjusting this behavior in a piecemeal fashion.  I looked
> through pg_proc to find all the functions that have more than one any*
> argument, and found these:
>
>                       oid                      | prorettype
> -----------------------------------------------+------------
>  lag(anyelement,integer,anyelement)            | anyelement
>  lead(anyelement,integer,anyelement)           | anyelement
>  width_bucket(anyelement,anyarray)             | integer
>  btarraycmp(anyarray,anyarray)                 | integer
>  array_eq(anyarray,anyarray)                   | boolean
>  array_ne(anyarray,anyarray)                   | boolean
>  array_lt(anyarray,anyarray)                   | boolean
>  array_gt(anyarray,anyarray)                   | boolean
>  array_le(anyarray,anyarray)                   | boolean
>  array_ge(anyarray,anyarray)                   | boolean
>  array_append(anyarray,anyelement)             | anyarray
>  array_prepend(anyelement,anyarray)            | anyarray
>  array_cat(anyarray,anyarray)                  | anyarray
>  array_larger(anyarray,anyarray)               | anyarray
>  array_smaller(anyarray,anyarray)              | anyarray
>  array_position(anyarray,anyelement)           | integer
>  array_position(anyarray,anyelement,integer)   | integer
>  array_positions(anyarray,anyelement)          | integer[]
>  array_remove(anyarray,anyelement)             | anyarray
>  array_replace(anyarray,anyelement,anyelement) | anyarray
>  arrayoverlap(anyarray,anyarray)               | boolean
>  arraycontains(anyarray,anyarray)              | boolean
>  arraycontained(anyarray,anyarray)             | boolean
>  elem_contained_by_range(anyelement,anyrange)  | boolean
>  range_contains_elem(anyrange,anyelement)      | boolean
>  range_eq(anyrange,anyrange)                   | boolean
>  range_ne(anyrange,anyrange)                   | boolean
>  range_overlaps(anyrange,anyrange)             | boolean
>  range_contains(anyrange,anyrange)             | boolean
>  range_contained_by(anyrange,anyrange)         | boolean
>  range_adjacent(anyrange,anyrange)             | boolean
>  range_before(anyrange,anyrange)               | boolean
>  range_after(anyrange,anyrange)                | boolean
>  range_overleft(anyrange,anyrange)             | boolean
>  range_overright(anyrange,anyrange)            | boolean
>  range_union(anyrange,anyrange)                | anyrange
>  range_merge(anyrange,anyrange)                | anyrange
>  range_intersect(anyrange,anyrange)            | anyrange
>  range_minus(anyrange,anyrange)                | anyrange
>  range_cmp(anyrange,anyrange)                  | integer
>  range_lt(anyrange,anyrange)                   | boolean
>  range_le(anyrange,anyrange)                   | boolean
>  range_ge(anyrange,anyrange)                   | boolean
>  range_gt(anyrange,anyrange)                   | boolean
>  range_gist_same(anyrange,anyrange,internal)   | internal
> (45 rows)
>
> Now, there's no point in changing range_eq and the later entries in this
> table (the ones taking two anyrange's); our rather lame definition of
> anycompatiblerange means that we'd get no benefit from doing so.  But
> I think there's a strong case for changing everything before range_eq.
> It'd be nice if something like
>
> SELECT array[1] < array[1.1];
>
> would work the same way that "SELECT 1 < 1.1" would.
>

There it has sense without any discussion. But it is a little bit different
than using the default value in the LAG function.


> I checked the other concern that I had about whether the more flexible
> polymorphic definitions would create any new ambiguities, and I don't
> see any problems with this list.  As functions, none of these names are
> overloaded, except with different numbers of arguments so there's no
> ambiguity.  Most of the array functions are also operators, but the
> competing operators do not take arrays, so it doesn't look like there's
> any issue on that side either.
>
> So I think we should be more ambitious and generalize all of these
> to use anycompatiblearray etc.
>

+1

Pavel

>
>                         regards, tom lane
>

Re: Compatible defaults for LEAD/LAG

Reply via email to