Re: poc - possibility to write window function in PL languages

2021-01-27 Thread Pavel Stehule
st 20. 1. 2021 v 21:14 odesílatel Pavel Stehule 
napsal:

>
>
> st 20. 1. 2021 v 21:07 odesílatel Tom Lane  napsal:
>
>> Pavel Stehule  writes:
>> > The second question is work with partition context value. This should be
>> > only one value, and of only one but of any type per function. In this
>> case
>> > we cannot use GET statements. I had an idea of enhancing declaration.
>> Some
>> > like
>>
>> > DECLARE
>> >   pcx PARTITION CONTEXT (int); -- read partition context
>> > BEGIN
>> >   pcx := 10; -- set partition context
>>
>> > What do you think about it?
>>
>> Uh, what?  I don't understand what this "partition context" is.
>>
>
> It was my name for an access to window partition local memory -
> WinGetPartitionLocalMemory
>
> We need some interface for this cache
>

I have to think more about declarative syntax. When I try to transform our
WindowObject API directly, then it looks like Cobol. It needs a different
concept to be user friendly.

Regards

Pavel


> Regards
>
> Pavel
>
>
>
>
>
>
>
>>
>> regards, tom lane
>>
>


Re: poc - possibility to write window function in PL languages

2021-01-20 Thread Pavel Stehule
st 20. 1. 2021 v 21:32 odesílatel Tom Lane  napsal:

> Pavel Stehule  writes:
> > st 20. 1. 2021 v 21:07 odesílatel Tom Lane  napsal:
> >> Uh, what?  I don't understand what this "partition context" is.
>
> > It was my name for an access to window partition local memory -
> > WinGetPartitionLocalMemory
>
> Ah.
>
> > We need some interface for this cache
>
> I'm not convinced we need to expose that, or that it'd be very
> satisfactory to plpgsql users if we did.  The fact that it's fixed-size
> and initializes to zeroes are both things that are okay for C programmers
> but might be awkward to deal with in plpgsql code.  At the very least it
> would greatly constrain what data types you could usefully store.
>
> So I'd be inclined to leave that out, at least for the first version.
>

I think this functionality is relatively important. If somebody tries to
implement own window function, then he starts with some variation of the
row_num function.

We can support only types of fixed length to begin.

Regards

Pavel



>
> regards, tom lane
>


Re: poc - possibility to write window function in PL languages

2021-01-20 Thread Tom Lane
Pavel Stehule  writes:
> st 20. 1. 2021 v 21:07 odesílatel Tom Lane  napsal:
>> Uh, what?  I don't understand what this "partition context" is.

> It was my name for an access to window partition local memory -
> WinGetPartitionLocalMemory

Ah.

> We need some interface for this cache

I'm not convinced we need to expose that, or that it'd be very
satisfactory to plpgsql users if we did.  The fact that it's fixed-size
and initializes to zeroes are both things that are okay for C programmers
but might be awkward to deal with in plpgsql code.  At the very least it
would greatly constrain what data types you could usefully store.

So I'd be inclined to leave that out, at least for the first version.

regards, tom lane




Re: poc - possibility to write window function in PL languages

2021-01-20 Thread Pavel Stehule
st 20. 1. 2021 v 21:07 odesílatel Tom Lane  napsal:

> Pavel Stehule  writes:
> > The second question is work with partition context value. This should be
> > only one value, and of only one but of any type per function. In this
> case
> > we cannot use GET statements. I had an idea of enhancing declaration.
> Some
> > like
>
> > DECLARE
> >   pcx PARTITION CONTEXT (int); -- read partition context
> > BEGIN
> >   pcx := 10; -- set partition context
>
> > What do you think about it?
>
> Uh, what?  I don't understand what this "partition context" is.
>

It was my name for an access to window partition local memory -
WinGetPartitionLocalMemory

We need some interface for this cache

Regards

Pavel







>
> regards, tom lane
>


Re: poc - possibility to write window function in PL languages

2021-01-20 Thread Tom Lane
Pavel Stehule  writes:
> The second question is work with partition context value. This should be
> only one value, and of only one but of any type per function. In this case
> we cannot use GET statements. I had an idea of enhancing declaration. Some
> like

> DECLARE
>   pcx PARTITION CONTEXT (int); -- read partition context
> BEGIN
>   pcx := 10; -- set partition context

> What do you think about it?

Uh, what?  I don't understand what this "partition context" is.

regards, tom lane




Re: poc - possibility to write window function in PL languages

2021-01-20 Thread Pavel Stehule
Hi

so 16. 1. 2021 v 0:09 odesílatel Tom Lane  napsal:

> Pavel Stehule  writes:
> > [ plpgsql-window-functions-20210104.patch.gz ]
>
> I spent some time looking at this patch.  It would certainly be
> appealing to have some ability to write custom window functions
> without descending into C; but I'm not very happy about the details.
>
> I'm okay with the idea of having a special variable of a new pseudotype.
> That's not exactly pretty, but it descends directly from how we handle
> the arguments of trigger functions, so at least there's precedent.
> What's bugging me though is the "typedvalue" stuff.  That seems like a
> conceptual mess, a performance loss, and a permanent maintenance time
> sink.  To avoid performance complaints, eventually this hard-wired set
> of conversions would have to bloom to cover every built-in cast, and
> as for extension types, you're just out of luck.
>

I invited typed values with an idea of larger usability. With this type we
can implement dynamic iteration over records better than now, when the
fields of records should be cast to text or json before operation. With
this type I can hold typed value longer time and I can do some like:

DECLARE var typedvalue;

var := fx(..);
IF var IS OF integer THEN
  var_int := CAST(var AS int);
ELSEIF var IS OF date THEN
  var_date := CAST(var AS date);
ELSE
  var_text := CAST(var AS text);
END;

Sometimes (when you process some external data) this late (lazy) cast can
be better and allows you to use typed values. When I read external data,
sometimes I don't know types of these data before reading. I would like to
inject a possibility of more dynamic work with values and variables (but
still cleanly and safely). It should be more safe and faster than now, when
people should use the "text" type.

But I understand and I agree with your objections. Probably a lot of people
will use this type badly.



> One way to avoid that would be to declare the argument-fetching
> functions as polymorphics with a dummy argument that just provides
> the expected result type.  So users would write something like
>
> create function pl_lag(x numeric)
>   ...
>   v := get_input_value_in_partition(windowobject, x, 1, -1,
> 'seek_current', false);
>
> where the argument-fetching function is declared
>
>get_input_value_in_partition(windowobject, anyelement, int, ...)
>returns anyelement
>
> and internally it could verify that the n'th window function argument
> matches the type of its second argument.  While this could be made
> to work, it's kind of unsatisfying because the argument number "1" is
> so obviously redundant with the reference to "x".  Ideally one should
> only have to write "x".  I don't quite see how to make that work,
> but maybe there's a way?
>
> On the whole though, I think your original idea of bespoke plpgsql
> syntax is better, ie let's write something like
>
>GET WINDOW VALUE v := x AT PARTITION CURRENT(-1);
>
> and hide all the mechanism behind that.  The reference to "x" is enough
> to provide the argument number and type, and the window object doesn't
> have to be explicitly visible at all.
>

yes, this syntax looks well.

The second question is work with partition context value. This should be
only one value, and of only one but of any type per function. In this case
we cannot use GET statements. I had an idea of enhancing declaration. Some
like

DECLARE
  pcx PARTITION CONTEXT (int); -- read partition context
BEGIN
  pcx := 10; -- set partition context

What do you think about it?

Regards

Pavel










> Yeah, this will mean that anybody who wants to provide equivalent
> functionality in some other PL will have to do more work.  But it's
> not like it was going to be zero effort for them before.  Furthermore,
> it's not clear to me that other PLs would want to adopt your current
> design anyway.  For example, I bet PL/R would like to somehow make
> window arguments map into vectors on the R side, but there's no chance
> of that with this SQL layer in between.
>
> regards, tom lane
>


Re: poc - possibility to write window function in PL languages

2021-01-15 Thread Tom Lane
Pavel Stehule  writes:
> [ plpgsql-window-functions-20210104.patch.gz ]

I spent some time looking at this patch.  It would certainly be
appealing to have some ability to write custom window functions
without descending into C; but I'm not very happy about the details.

I'm okay with the idea of having a special variable of a new pseudotype.
That's not exactly pretty, but it descends directly from how we handle
the arguments of trigger functions, so at least there's precedent.
What's bugging me though is the "typedvalue" stuff.  That seems like a
conceptual mess, a performance loss, and a permanent maintenance time
sink.  To avoid performance complaints, eventually this hard-wired set
of conversions would have to bloom to cover every built-in cast, and
as for extension types, you're just out of luck.

One way to avoid that would be to declare the argument-fetching
functions as polymorphics with a dummy argument that just provides
the expected result type.  So users would write something like

create function pl_lag(x numeric)
  ...
  v := get_input_value_in_partition(windowobject, x, 1, -1,
'seek_current', false);

where the argument-fetching function is declared

   get_input_value_in_partition(windowobject, anyelement, int, ...)
   returns anyelement

and internally it could verify that the n'th window function argument
matches the type of its second argument.  While this could be made
to work, it's kind of unsatisfying because the argument number "1" is
so obviously redundant with the reference to "x".  Ideally one should
only have to write "x".  I don't quite see how to make that work,
but maybe there's a way?

On the whole though, I think your original idea of bespoke plpgsql
syntax is better, ie let's write something like

   GET WINDOW VALUE v := x AT PARTITION CURRENT(-1);

and hide all the mechanism behind that.  The reference to "x" is enough
to provide the argument number and type, and the window object doesn't
have to be explicitly visible at all.

Yeah, this will mean that anybody who wants to provide equivalent
functionality in some other PL will have to do more work.  But it's
not like it was going to be zero effort for them before.  Furthermore,
it's not clear to me that other PLs would want to adopt your current
design anyway.  For example, I bet PL/R would like to somehow make
window arguments map into vectors on the R side, but there's no chance
of that with this SQL layer in between.

regards, tom lane




Re: poc - possibility to write window function in PL languages

2021-01-04 Thread Zhihong Yu
Hi, Pavel:
Thanks for the update.

I don't have other comment.

Cheers

On Mon, Jan 4, 2021 at 3:15 AM Pavel Stehule 
wrote:

> Hi
>
> pá 1. 1. 2021 v 18:57 odesílatel Zhihong Yu  napsal:
>
>> Hi, Pavel:
>> Happy New Year.
>>
>> +   command with clause WINDOW. The specific feature of
>> +   this functions is a possibility to two special storages with
>>
>> this functions -> this function
>>
>> possibility to two special storages: there is no verb.
>>
>> 'store with stored one value': store is repeated.
>>
>> + * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
>>
>> It would be better to change 2020 to 2021 in the new files.
>>
>
> fixed
>
>>
>> For some functions, such as windowobject_get_func_arg_frame, it would be
>> better to add comment explaining their purposes.
>>
>
> It is commented before. These functions just call WinAPI functions
>
> /*
>  * High level access function. These functions are wrappers for windows API
>  * for PL languages based on usage WindowObjectProxy.
>  */
>
>
>
>> For estimate_partition_context_size():
>> +errmsg("size of value is greather than limit (1024
>> bytes)")));
>>
>> Please include the value of typlen in the message. There is similar error
>> message in the else block where value of size should be included.
>>
>> +   return *realsize;
>> +   }
>> +   else
>>
>> The 'else' is not needed since the if block ends with return.
>>
>
> yes, but it is there for better readability (symmetry)
>
>>
>> +   size += size / 3;
>>
>> Please add a comment for the choice of constant 3.
>>
>> +   /* by default we allocate 30 bytes */
>> +   *realsize = 0;
>>
>> The value 30 may not be accurate - from the caller:
>>
>> +   if (PG_ARGISNULL(2))
>> +   minsize = VARLENA_MINSIZE;
>> +   else
>> +   minsize = PG_GETARG_INT32(2);
>>
>> VARLENA_MINSIZE is 32.
>>
>> Cheers
>>
>> On Fri, Jan 1, 2021 at 3:29 AM Pavel Stehule 
>> wrote:
>>
>>> Hi
>>>
>>> rebase
>>>
>>> Regards
>>>
>>> Pavel
>>>
>>
> I am sending updated patch
>
> Thank you for comments
>
> Regards
>
> Pavel
>


Re: poc - possibility to write window function in PL languages

2021-01-04 Thread Pavel Stehule
Hi

pá 1. 1. 2021 v 18:57 odesílatel Zhihong Yu  napsal:

> Hi, Pavel:
> Happy New Year.
>
> +   command with clause WINDOW. The specific feature of
> +   this functions is a possibility to two special storages with
>
> this functions -> this function
>
> possibility to two special storages: there is no verb.
>
> 'store with stored one value': store is repeated.
>
> + * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
>
> It would be better to change 2020 to 2021 in the new files.
>

fixed

>
> For some functions, such as windowobject_get_func_arg_frame, it would be
> better to add comment explaining their purposes.
>

It is commented before. These functions just call WinAPI functions

/*
 * High level access function. These functions are wrappers for windows API
 * for PL languages based on usage WindowObjectProxy.
 */



> For estimate_partition_context_size():
> +errmsg("size of value is greather than limit (1024
> bytes)")));
>
> Please include the value of typlen in the message. There is similar error
> message in the else block where value of size should be included.
>
> +   return *realsize;
> +   }
> +   else
>
> The 'else' is not needed since the if block ends with return.
>

yes, but it is there for better readability (symmetry)

>
> +   size += size / 3;
>
> Please add a comment for the choice of constant 3.
>
> +   /* by default we allocate 30 bytes */
> +   *realsize = 0;
>
> The value 30 may not be accurate - from the caller:
>
> +   if (PG_ARGISNULL(2))
> +   minsize = VARLENA_MINSIZE;
> +   else
> +   minsize = PG_GETARG_INT32(2);
>
> VARLENA_MINSIZE is 32.
>
> Cheers
>
> On Fri, Jan 1, 2021 at 3:29 AM Pavel Stehule 
> wrote:
>
>> Hi
>>
>> rebase
>>
>> Regards
>>
>> Pavel
>>
>
I am sending updated patch

Thank you for comments

Regards

Pavel


plpgsql-window-functions-20210104.patch.gz
Description: application/gzip


Re: poc - possibility to write window function in PL languages

2021-01-01 Thread Zhihong Yu
Hi, Pavel:
Happy New Year.

+   command with clause WINDOW. The specific feature of
+   this functions is a possibility to two special storages with

this functions -> this function

possibility to two special storages: there is no verb.

'store with stored one value': store is repeated.

+ * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group

It would be better to change 2020 to 2021 in the new files.

For some functions, such as windowobject_get_func_arg_frame, it would be
better to add comment explaining their purposes.

For estimate_partition_context_size():
+errmsg("size of value is greather than limit (1024
bytes)")));

Please include the value of typlen in the message. There is similar error
message in the else block where value of size should be included.

+   return *realsize;
+   }
+   else

The 'else' is not needed since the if block ends with return.

+   size += size / 3;

Please add a comment for the choice of constant 3.

+   /* by default we allocate 30 bytes */
+   *realsize = 0;

The value 30 may not be accurate - from the caller:

+   if (PG_ARGISNULL(2))
+   minsize = VARLENA_MINSIZE;
+   else
+   minsize = PG_GETARG_INT32(2);

VARLENA_MINSIZE is 32.

Cheers

On Fri, Jan 1, 2021 at 3:29 AM Pavel Stehule 
wrote:

> Hi
>
> rebase
>
> Regards
>
> Pavel
>
>


Re: poc - possibility to write window function in PL languages

2021-01-01 Thread Pavel Stehule
Hi

rebase

Regards

Pavel
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index 11246aa653..89a07678ee 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -4606,6 +4606,99 @@ CREATE EVENT TRIGGER snitch ON ddl_command_start EXECUTE FUNCTION snitch();
 
   
 
+ 
+  Window Functions
+
+  
+   window
+   in PL/pgSQL
+  
+
+  
+   PL/pgSQL can be used to define window
+   functions. A window function is created with the CREATE FUNCTION
+   command with clause WINDOW. The specific feature of
+   this functions is a possibility to two special storages with 
+   sorted values of window function arguments and store with stored
+   one value of any type for currently processed partition (of window
+   function).
+  
+
+  
+   Access to both storages is done with special internal variable
+   WINDOWOBJECT. This variable is declared implicitly,
+   and it is available only in window functions.
+
+
+CREATE OR REPLACE FUNCTION plpgsql_rownum() RETURNS int8
+LANGUAGE plpgsql WINDOW
+AS $$
+DECLARE pos int8
+BEGIN
+pos := get_current_position(WINDOWOBJECT);
+pos := pos + 1;
+PERFORM set_mark_position(WINDOWOBJECT, pos);
+RETURN pos;
+$$;
+
+SELECT plpgsql_rownum() OVER (), * FROM tab;
+
+  
+
+  
+   The arguments of window function cannot be accessed directly. The special
+   functions should be used. With these functions we can choose a scope of
+   buffered arguments, we can choose a wanted position against first, current, or
+   last row. The implementation of lag can looks like
+   (the window functions in plpgsql can use polymorphic types too):
+
+
+CREATE OR REPLACE FUNCTION plpgsql_lag(anyelement) RETURNS anyelement
+LANGUAGE plpgsql WINDOW
+AS $$
+BEGIN
+RETURN
+  get_input_value_in_partition(WINDOWOBJECT,
+   1, -1,
+   'seek_current',
+   false);
+END;
+$$;
+
+SELECT v, plpgsql_lag(v) FROM generate_series(1, 10) g(v);
+
+
+  
+
+  
+   Second buffer that can be used in window function is a buffer for one value
+   assigned to partition. The content of this buffer can be read by function
+   get_partition_context_value or modified by function
+   set_partition_context_value. Next function replaces
+   missing values by previous non NULL value:
+
+
+CREATE OR REPLACE FUNCTION plpgsql_replace_missing(numeric) RETURNS numeric
+LANGUAGE plpgsql WINDOW
+AS $$
+DECLATE
+v numeric;
+BEGIN
+v := get_input_value_for_row(WINDOWOBJECT, 1);
+IF v IS NULL THEN
+v := get_partition_context_value(WINDOWOBJECT, NULL::numeric);
+ELSE
+PERFORM set_partition_context_value(WINDOWOBJECT, v);
+END IF;
+RETURN v;
+END;
+$$;
+
+
+  
+
+  
+
   
PL/pgSQL under the Hood
 
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index b140c210bc..37d38d6e84 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1442,6 +1442,18 @@ LANGUAGE internal
 STRICT IMMUTABLE PARALLEL SAFE
 AS 'unicode_is_normalized';
 
+CREATE OR REPLACE FUNCTION
+  get_partition_context_value(windowobjectproxy, anyelement, int4 DEFAULT NULL)
+  RETURNS anyelement
+LANGUAGE internal
+AS 'windowobject_get_partition_context_value';
+
+CREATE OR REPLACE FUNCTION
+  set_partition_context_value(windowobjectproxy, anyelement, int4 DEFAULT NULL)
+  RETURNS void
+LANGUAGE internal
+AS 'windowobject_set_partition_context_value';
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 82732146d3..4e10e2bde7 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -105,6 +105,7 @@ OBJS = \
 	tsvector.o \
 	tsvector_op.o \
 	tsvector_parser.o \
+	typedvalue.o \
 	uuid.o \
 	varbit.o \
 	varchar.o \
diff --git a/src/backend/utils/adt/pseudotypes.c b/src/backend/utils/adt/pseudotypes.c
index 99a93271fe..3745cc6515 100644
--- a/src/backend/utils/adt/pseudotypes.c
+++ b/src/backend/utils/adt/pseudotypes.c
@@ -372,6 +372,17 @@ pg_node_tree_send(PG_FUNCTION_ARGS)
 PSEUDOTYPE_DUMMY_IO_FUNCS(pg_ddl_command);
 PSEUDOTYPE_DUMMY_BINARY_IO_FUNCS(pg_ddl_command);
 
+/*
+ * windowobjectproxy
+ *
+ * This type is pointer to WindowObjectProxyData. It is communication
+ * mechanism between PL environment and WinFuncArgs functions. Due
+ * performance reason I prefer using indirect result processing against
+ * using function returning polymorphic composite value. The indirect
+ * mechanism is implemented with proxy object represented by type
+ * WindowObjectProxyData.
+ */
+PSEUDOTYPE_DUMMY_IO_FUNCS(windowobjectproxy);
 
 /*
  * Dummy I/O functions for various other pseudotypes.
diff --git a/src/backend/utils/adt/typedvalue.c b/src/backend/utils/adt/typedvalue.c
new file mode 100644
index 00..370804a05a

Re: poc - possibility to write window function in PL languages

2020-08-28 Thread Pavel Stehule
pá 28. 8. 2020 v 8:14 odesílatel Pavel Stehule 
napsal:

>
>
> st 26. 8. 2020 v 17:06 odesílatel Pavel Stehule 
> napsal:
>
>> Hi
>>
>> I simplified access to results of  winfuncargs functions by proxy type
>> "typedvalue". This type can hold any Datum value, and allows fast cast to
>> basic buildin types or it can use (slower) generic cast functions. It is
>> used in cooperation with a plpgsql assign statement that can choose the
>> correct cast implicitly. When the winfuncarg function returns a value of
>> the same type, that is expected by the variable on the left side of the
>> assign statement, then (for basic types), the value is just copied without
>> casts. With this proxy type is not necessary to have special statement for
>> assigning returned value from winfuncargs functions, so source code of
>> window function in plpgsql looks intuitive to me.
>>
>> Example - implementation of "lag" function in plpgsql
>>
>> create or replace function pl_lag(numeric)
>> returns numeric as $$
>> declare v numeric;
>> begin
>>   v := get_input_value_in_partition(windowobject, 1, -1, 'seek_current',
>> false);
>>   return v;
>> end;
>> $$ language plpgsql window;
>>
>> I think this code is usable, and I assign this patch to commitfest.
>>
>> Regards
>>
>> Pavel
>>
>
> fix regress tests and some doc
>

update - refactored implementation typedvalue type
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index 815912666d..5b4d5bbac4 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -4568,6 +4568,99 @@ CREATE EVENT TRIGGER snitch ON ddl_command_start EXECUTE FUNCTION snitch();
 
   
 
+ 
+  Window Functions
+
+  
+   window
+   in PL/pgSQL
+  
+
+  
+   PL/pgSQL can be used to define window
+   functions. A window function is created with the CREATE FUNCTION
+   command with clause WINDOW. The specific feature of
+   this functions is a possibility to two special storages with 
+   sorted values of window function arguments and store with stored
+   one value of any type for currently processed partition (of window
+   function).
+  
+
+  
+   Access to both storages is done with special internal variable
+   WINDOWOBJECT. This variable is declared implicitly,
+   and it is available only in window functions.
+
+
+CREATE OR REPLACE FUNCTION plpgsql_rownum() RETURNS int8
+LANGUAGE plpgsql WINDOW
+AS $$
+DECLARE pos int8
+BEGIN
+pos := get_current_position(WINDOWOBJECT);
+pos := pos + 1;
+PERFORM set_mark_position(WINDOWOBJECT, pos);
+RETURN pos;
+$$;
+
+SELECT plpgsql_rownum() OVER (), * FROM tab;
+
+  
+
+  
+   The arguments of window function cannot be accessed directly. The special
+   functions should be used. With these functions we can choose a scope of
+   buffered arguments, we can choose a wanted position against first, current, or
+   last row. The implementation of lag can looks like
+   (the window functions in plpgsql can use polymorphic types too):
+
+
+CREATE OR REPLACE FUNCTION plpgsql_lag(anyelement) RETURNS anyelement
+LANGUAGE plpgsql WINDOW
+AS $$
+BEGIN
+RETURN
+  get_input_value_in_partition(WINDOWOBJECT,
+   1, -1,
+   'seek_current',
+   false);
+END;
+$$;
+
+SELECT v, plpgsql_lag(v) FROM generate_series(1, 10) g(v);
+
+
+  
+
+  
+   Second buffer that can be used in window function is a buffer for one value
+   assigned to partition. The content of this buffer can be read by function
+   get_partition_context_value or modified by function
+   set_partition_context_value. Next function replaces
+   missing values by previous non NULL value:
+
+
+CREATE OR REPLACE FUNCTION plpgsql_replace_missing(numeric) RETURNS numeric
+LANGUAGE plpgsql WINDOW
+AS $$
+DECLATE
+v numeric;
+BEGIN
+v := get_input_value_for_row(WINDOWOBJECT, 1);
+IF v IS NULL THEN
+v := get_partition_context_value(WINDOWOBJECT, NULL::numeric);
+ELSE
+PERFORM set_partition_context_value(WINDOWOBJECT, v);
+END IF;
+RETURN v;
+END;
+$$;
+
+
+  
+
+  
+
   
PL/pgSQL under the Hood
 
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a2d61302f9..f926f2e386 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1421,6 +1421,18 @@ LANGUAGE internal
 STRICT IMMUTABLE PARALLEL SAFE
 AS 'unicode_is_normalized';
 
+CREATE OR REPLACE FUNCTION
+  get_partition_context_value(windowobjectproxy, anyelement, int4 DEFAULT NULL)
+  RETURNS anyelement
+LANGUAGE internal
+AS 'windowobject_get_partition_context_value';
+
+CREATE OR REPLACE FUNCTION
+  set_partition_context_value(windowobjectproxy, anyelement, int4 DEFAULT NULL)
+  RETURNS void
+LANGUAGE internal
+AS 'windowobject_set_partition_context_value';
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather

Re: poc - possibility to write window function in PL languages

2020-08-28 Thread Pavel Stehule
st 26. 8. 2020 v 17:06 odesílatel Pavel Stehule 
napsal:

> Hi
>
> I simplified access to results of  winfuncargs functions by proxy type
> "typedvalue". This type can hold any Datum value, and allows fast cast to
> basic buildin types or it can use (slower) generic cast functions. It is
> used in cooperation with a plpgsql assign statement that can choose the
> correct cast implicitly. When the winfuncarg function returns a value of
> the same type, that is expected by the variable on the left side of the
> assign statement, then (for basic types), the value is just copied without
> casts. With this proxy type is not necessary to have special statement for
> assigning returned value from winfuncargs functions, so source code of
> window function in plpgsql looks intuitive to me.
>
> Example - implementation of "lag" function in plpgsql
>
> create or replace function pl_lag(numeric)
> returns numeric as $$
> declare v numeric;
> begin
>   v := get_input_value_in_partition(windowobject, 1, -1, 'seek_current',
> false);
>   return v;
> end;
> $$ language plpgsql window;
>
> I think this code is usable, and I assign this patch to commitfest.
>
> Regards
>
> Pavel
>

fix regress tests and some doc
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index 815912666d..5b4d5bbac4 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -4568,6 +4568,99 @@ CREATE EVENT TRIGGER snitch ON ddl_command_start EXECUTE FUNCTION snitch();
 
   
 
+ 
+  Window Functions
+
+  
+   window
+   in PL/pgSQL
+  
+
+  
+   PL/pgSQL can be used to define window
+   functions. A window function is created with the CREATE FUNCTION
+   command with clause WINDOW. The specific feature of
+   this functions is a possibility to two special storages with 
+   sorted values of window function arguments and store with stored
+   one value of any type for currently processed partition (of window
+   function).
+  
+
+  
+   Access to both storages is done with special internal variable
+   WINDOWOBJECT. This variable is declared implicitly,
+   and it is available only in window functions.
+
+
+CREATE OR REPLACE FUNCTION plpgsql_rownum() RETURNS int8
+LANGUAGE plpgsql WINDOW
+AS $$
+DECLARE pos int8
+BEGIN
+pos := get_current_position(WINDOWOBJECT);
+pos := pos + 1;
+PERFORM set_mark_position(WINDOWOBJECT, pos);
+RETURN pos;
+$$;
+
+SELECT plpgsql_rownum() OVER (), * FROM tab;
+
+  
+
+  
+   The arguments of window function cannot be accessed directly. The special
+   functions should be used. With these functions we can choose a scope of
+   buffered arguments, we can choose a wanted position against first, current, or
+   last row. The implementation of lag can looks like
+   (the window functions in plpgsql can use polymorphic types too):
+
+
+CREATE OR REPLACE FUNCTION plpgsql_lag(anyelement) RETURNS anyelement
+LANGUAGE plpgsql WINDOW
+AS $$
+BEGIN
+RETURN
+  get_input_value_in_partition(WINDOWOBJECT,
+   1, -1,
+   'seek_current',
+   false);
+END;
+$$;
+
+SELECT v, plpgsql_lag(v) FROM generate_series(1, 10) g(v);
+
+
+  
+
+  
+   Second buffer that can be used in window function is a buffer for one value
+   assigned to partition. The content of this buffer can be read by function
+   get_partition_context_value or modified by function
+   set_partition_context_value. Next function replaces
+   missing values by previous non NULL value:
+
+
+CREATE OR REPLACE FUNCTION plpgsql_replace_missing(numeric) RETURNS numeric
+LANGUAGE plpgsql WINDOW
+AS $$
+DECLATE
+v numeric;
+BEGIN
+v := get_input_value_for_row(WINDOWOBJECT, 1);
+IF v IS NULL THEN
+v := get_partition_context_value(WINDOWOBJECT, NULL::numeric);
+ELSE
+PERFORM set_partition_context_value(WINDOWOBJECT, v);
+END IF;
+RETURN v;
+END;
+$$;
+
+
+  
+
+  
+
   
PL/pgSQL under the Hood
 
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a2d61302f9..f926f2e386 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1421,6 +1421,18 @@ LANGUAGE internal
 STRICT IMMUTABLE PARALLEL SAFE
 AS 'unicode_is_normalized';
 
+CREATE OR REPLACE FUNCTION
+  get_partition_context_value(windowobjectproxy, anyelement, int4 DEFAULT NULL)
+  RETURNS anyelement
+LANGUAGE internal
+AS 'windowobject_get_partition_context_value';
+
+CREATE OR REPLACE FUNCTION
+  set_partition_context_value(windowobjectproxy, anyelement, int4 DEFAULT NULL)
+  RETURNS void
+LANGUAGE internal
+AS 'windowobject_set_partition_context_value';
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 54d5c37947..84da7222d9 100644
--- a/src/backend/utils/adt/Makefile
+++ 

Re: poc - possibility to write window function in PL languages

2020-08-26 Thread Pavel Stehule
Hi

I simplified access to results of  winfuncargs functions by proxy type
"typedvalue". This type can hold any Datum value, and allows fast cast to
basic buildin types or it can use (slower) generic cast functions. It is
used in cooperation with a plpgsql assign statement that can choose the
correct cast implicitly. When the winfuncarg function returns a value of
the same type, that is expected by the variable on the left side of the
assign statement, then (for basic types), the value is just copied without
casts. With this proxy type is not necessary to have special statement for
assigning returned value from winfuncargs functions, so source code of
window function in plpgsql looks intuitive to me.

Example - implementation of "lag" function in plpgsql

create or replace function pl_lag(numeric)
returns numeric as $$
declare v numeric;
begin
  v := get_input_value_in_partition(windowobject, 1, -1, 'seek_current',
false);
  return v;
end;
$$ language plpgsql window;

I think this code is usable, and I assign this patch to commitfest.

Regards

Pavel
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index ba5a23ac25..fdc364c05e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1418,6 +1418,18 @@ LANGUAGE internal
 STRICT IMMUTABLE PARALLEL SAFE
 AS 'unicode_is_normalized';
 
+CREATE OR REPLACE FUNCTION
+  get_partition_context_value(windowobjectproxy, anyelement, int4 DEFAULT NULL)
+  RETURNS anyelement
+LANGUAGE internal
+AS 'windowobject_get_partition_context_value';
+
+CREATE OR REPLACE FUNCTION
+  set_partition_context_value(windowobjectproxy, anyelement, int4 DEFAULT NULL)
+  RETURNS void
+LANGUAGE internal
+AS 'windowobject_set_partition_context_value';
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 5d2aca8cfe..1974cfd7e6 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -101,6 +101,7 @@ OBJS = \
 	tsvector.o \
 	tsvector_op.o \
 	tsvector_parser.o \
+	typedvalue.o \
 	uuid.o \
 	varbit.o \
 	varchar.o \
diff --git a/src/backend/utils/adt/pseudotypes.c b/src/backend/utils/adt/pseudotypes.c
index 3d6b2f9093..ebb2a16572 100644
--- a/src/backend/utils/adt/pseudotypes.c
+++ b/src/backend/utils/adt/pseudotypes.c
@@ -334,6 +334,17 @@ pg_node_tree_send(PG_FUNCTION_ARGS)
 PSEUDOTYPE_DUMMY_IO_FUNCS(pg_ddl_command);
 PSEUDOTYPE_DUMMY_BINARY_IO_FUNCS(pg_ddl_command);
 
+/*
+ * windowobjectproxy
+ *
+ * This type is pointer to WindowObjectProxyData. It is communication
+ * mechanism between PL environment and WinFuncArgs functions. Due
+ * performance reason I prefer using indirect result processing against
+ * using function returning polymorphic composite value. The indirect
+ * mechanism is implemented with proxy object represented by type
+ * WindowObjectProxyData.
+ */
+PSEUDOTYPE_DUMMY_IO_FUNCS(windowobjectproxy);
 
 /*
  * Dummy I/O functions for various other pseudotypes.
diff --git a/src/backend/utils/adt/typedvalue.c b/src/backend/utils/adt/typedvalue.c
new file mode 100644
index 00..d2d5dc34c0
--- /dev/null
+++ b/src/backend/utils/adt/typedvalue.c
@@ -0,0 +1,630 @@
+/*-
+ *
+ * typedvalue.c
+ *
+ * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/utils/adt/typedvalue.c
+ *
+ *-
+ */
+
+#include "postgres.h"
+
+#include "catalog/pg_type.h"
+#include "utils/builtins.h"
+#include "utils/datum.h"
+#include "utils/lsyscache.h"
+#include "utils/numeric.h"
+#include "utils/typedvalue.h"
+#include "fmgr.h"
+
+
+static Datum
+TypedValueGetDatum(TypedValue tv)
+{
+	if (tv->typbyval)
+		return *((Datum *) tv->data);
+	else
+		return PointerGetDatum(tv->data);
+}
+
+Datum
+typedvalue_in(PG_FUNCTION_ARGS)
+{
+	char	   *str =  PG_GETARG_CSTRING(0);
+	int			len;
+	Size		size;
+	TypedValue	tv;
+	text	   *txt;
+
+	len = strlen(str);
+
+	size = MAXALIGN(offsetof(TypedValueData, data) + len + VARHDRSZ);
+
+	tv = (TypedValue) palloc(size);
+	SET_VARSIZE(tv, size);
+
+	txt = (text *) tv->data;
+
+	SET_VARSIZE(txt, VARHDRSZ + len);
+	memcpy(VARDATA(txt), str, len);
+
+	tv->typid = TEXTOID;
+	tv->typbyval = false;
+	tv->typlen = -1;
+
+	PG_RETURN_POINTER(tv);
+}
+
+Datum
+typedvalue_out(PG_FUNCTION_ARGS)
+{
+	Oid			typOutput;
+	bool		isVarlena;
+	char	   *str;
+	TypedValue	tv;
+	Datum		value;
+
+	tv = (TypedValue) PG_GETARG_POINTER(0);
+
+	getTypeOutputInfo(tv->typid, , );
+
+	value = TypedValueGetDatum(tv);
+
+	str = OidOutputFunctionCall(typOutput, value);
+
+	PG_RETURN_CSTRING(str);
+}
+
+Datum
+makeTypedValue(Datum value, Oid typid, int16 typlen, bool typbyval)
+{
+	TypedValue		tv;
+	Size			

poc - possibility to write window function in PL languages

2020-08-24 Thread Pavel Stehule
Hi

I wrote a proof concept for the support window function from plpgsql.

Window function API - functions named WinFuncArg* are polymorphic and it is
not easy to wrap these functions for usage from SQL level. I wrote an
enhancement of the GET statement - for this case GET WINDOW_CONTEXT, that
allows safe and fast access to the result of these functions.

Custom variant of row_number can look like:

create or replace function pl_row_number()
returns bigint as $$
declare pos int8;
begin
  pos := get_current_position(windowobject);
  pos := pos + 1;
  perform set_mark_position(windowobject, pos);
  return pos;
end
$$
language plpgsql window;

Custom variant of lag function can look like:

create or replace function pl_lag(numeric)
returns numeric as $$
declare
  v numeric;
begin
  perform get_input_value_in_partition(windowobject, 1, -1, 'seek_current',
false);
  get pg_window_context v = PG_INPUT_VALUE;
  return v;
end;
$$ language plpgsql window;

Custom window functions can be used for generating missing data in time
series

create table test_missing_values(id int, v integer);
insert into test_missing_values
values(1,10),(2,11),(3,12),(4,null),(5,null),(6,15),(7,16);

create or replace function pl_pcontext_test(numeric)
returns numeric as $$
declare
  n numeric;
  v numeric;
begin
  perform get_input_value_for_row(windowobject, 1);
  get pg_window_context v = PG_INPUT_VALUE;
  if v is null then
v := get_partition_context_value(windowobject, null::numeric);
  else
perform set_partition_context_value(windowobject, v);
  end if;
  return v;
end
$$
language plpgsql window;

select id, v, pl_pcontext_test(v) over (order by id) from
test_missing_values;
 id | v  | pl_pcontext_test.
++--
  1 | 10 |   10
  2 | 11 |   11
  3 | 12 |   12
  4 ||   12
  5 ||   12
  6 | 15 |   15
  7 | 16 |   16
(7 rows)

I think about another variant for WinFuncArg functions where polymorphic
argument is used similarly like in get_partition_context_value - this patch
is prototype, but it works and I think so support of custom window
functions in PL languages is possible and probably useful.

Comments, notes, ideas, objections?

Regards

Pavel
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index ba5a23ac25..fdc364c05e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1418,6 +1418,18 @@ LANGUAGE internal
 STRICT IMMUTABLE PARALLEL SAFE
 AS 'unicode_is_normalized';
 
+CREATE OR REPLACE FUNCTION
+  get_partition_context_value(windowobjectproxy, anyelement, int4 DEFAULT NULL)
+  RETURNS anyelement
+LANGUAGE internal
+AS 'windowobject_get_partition_context_value';
+
+CREATE OR REPLACE FUNCTION
+  set_partition_context_value(windowobjectproxy, anyelement, int4 DEFAULT NULL)
+  RETURNS void
+LANGUAGE internal
+AS 'windowobject_set_partition_context_value';
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
diff --git a/src/backend/utils/adt/pseudotypes.c b/src/backend/utils/adt/pseudotypes.c
index 3d6b2f9093..ebb2a16572 100644
--- a/src/backend/utils/adt/pseudotypes.c
+++ b/src/backend/utils/adt/pseudotypes.c
@@ -334,6 +334,17 @@ pg_node_tree_send(PG_FUNCTION_ARGS)
 PSEUDOTYPE_DUMMY_IO_FUNCS(pg_ddl_command);
 PSEUDOTYPE_DUMMY_BINARY_IO_FUNCS(pg_ddl_command);
 
+/*
+ * windowobjectproxy
+ *
+ * This type is pointer to WindowObjectProxyData. It is communication
+ * mechanism between PL environment and WinFuncArgs functions. Due
+ * performance reason I prefer using indirect result processing against
+ * using function returning polymorphic composite value. The indirect
+ * mechanism is implemented with proxy object represented by type
+ * WindowObjectProxyData.
+ */
+PSEUDOTYPE_DUMMY_IO_FUNCS(windowobjectproxy);
 
 /*
  * Dummy I/O functions for various other pseudotypes.
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index f0c8ae686d..f18495b228 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -14,6 +14,8 @@
 #include "postgres.h"
 
 #include "utils/builtins.h"
+#include "utils/datum.h"
+#include "utils/lsyscache.h"
 #include "windowapi.h"
 
 /*
@@ -35,6 +37,20 @@ typedef struct
 	int64		remainder;		/* (total rows) % (bucket num) */
 } ntile_context;
 
+#define PROXY_CONTEXT_MAGIC 19730715
+
+typedef struct
+{
+	int			magic;
+	Oid			typid;
+	int16		typlen;
+	bool		typbyval;
+	int			allocsize;
+	bool		isnull;
+	Datum		value;
+	char		data[FLEXIBLE_ARRAY_MEMBER];
+} proxy_context;
+
 static bool rank_up(WindowObject winobj);
 static Datum leadlag_common(FunctionCallInfo fcinfo,
 			bool forward, bool withoffset, bool withdefault);
@@ -472,3 +488,485 @@ window_nth_value(PG_FUNCTION_ARGS)
 
 	PG_RETURN_DATUM(result);
 }
+
+/*
+ * High level