Re: range_agg with multirange inputs

2022-03-30 Thread Peter Eisentraut

This patch has been committed.  I split it into a few pieces.

On 12.03.22 04:18, Paul Jungwirth wrote:

On 3/10/22 14:07, Chapman Flack wrote:

When I apply this patch, I get a func.sgml with two entries for
range_intersect_agg(anymultirange).


Arg, fixed.


In range_agg_transfn, you've changed the message in the "must be called
with a range or multirange"; that seems like another good candidate to
be an elog.


Agreed. Updated here.


I kept those messages as "range" or "multirange" separately, instead of 
"range or multirange".  This way, we don't have to update all the 
messages of this kind when a new function is added.  Since these are 
only internal messages anyway, I opted for higher maintainability.





Re: range_agg with multirange inputs

2022-03-28 Thread Greg Stark
Fwiw the cfbot is failing due to a duplicate OID. Traditionally we
didn't treat duplicate OIDs as reason to reject a patch because
they're inevitable as other patches get committed and the committer
can just renumber them.

I think the cfbot kind of changes this calculus since it's a pain lose
the visibility into whether the rest of the tests are passing that the
cfbot normally gives us.

If it's not to much of a hassle could you renumber resubmit the patch
with an updated OID?

[10:54:57.606] su postgres -c "make -s -j${BUILD_JOBS} world-bin"
[10:54:57.927] Duplicate OIDs detected:
[10:54:57.927] 8000




Re: range_agg with multirange inputs

2022-03-12 Thread Chapman Flack
On 03/11/22 22:18, Paul Jungwirth wrote:
> Arg, fixed.
> 
>> In range_agg_transfn, you've changed the message in the "must be called
>> with a range or multirange"; that seems like another good candidate to
>> be an elog.
> 
> Agreed. Updated here.

This looks good to me and passes installcheck-world, so I'll push
the RfC button.

> Sharing a prosrc seems even less remarkable than sharing an aggfinalfn.
> You're right there are no cases for other finalfns yet, but I don't think
> there is anything special about finalfns that would make this a weirder
> thing to do there than with ordinary functions.

You sent me back to look at how many of those there are. I get 42 cases
of shared prosrc (43 now).

The chief subgroup of those looks to involve sharing between parameter
signatures where the types have identical layouts and the semantic
differences are unimportant to the function in question (comparisons
between bit or between varbit, overlaps taking timestamp or timestamptz,
etc.).

The other prominent group is range and multirange constructors, where
the C function has an obviously generic name like range_constructor2
and gets shared by a bunch of SQL declarations.

I think here we've added the first instance where the C function is
shared by SQL-declared functions accepting two different polymorphic
pseudotypes. But it's clearly simple and works, nothing objectionable
about it.

I had experimented with renaming multirange_agg_finalfn to just
range_agg_finalfn so it would just look like two overloads of one
function sharing a prosrc, and ultimately gave up because genbki.pl
couldn't resolve the OID where the name is used in pg_aggregate.dat.

That's why it surprised me to see three instances where other functions
(overlaps, isfinite, name) do use the same SQL name for different
overloads, but the explanation seems to be that nothing else at genbki
time refers to those, so genbki's unique-name limitation doesn't affect
them.

Neither here nor there for this patch, but an interesting new thing
I learned while reviewing it.

Regards,
-Chap




Re: range_agg with multirange inputs

2022-03-11 Thread Paul Jungwirth

On 3/10/22 14:07, Chapman Flack wrote:

When I apply this patch, I get a func.sgml with two entries for
range_intersect_agg(anymultirange).


Arg, fixed.


In range_agg_transfn, you've changed the message in the "must be called
with a range or multirange"; that seems like another good candidate to
be an elog.


Agreed. Updated here.


I think your query finds aggregate declarations that share the same SQL
function declaration as their finalizer functions. That seems to be more
common.

The query I used looks for cases where different SQL-declared functions
appear as finalizers of aggregates, but the different SQL declared functions
share the same internal C implementation.


Okay, I see. I believe that is quite common for ordinary SQL functions. 
Sharing a prosrc seems even less remarkable than sharing an aggfinalfn. 
You're right there are no cases for other finalfns yet, but I don't 
think there is anything special about finalfns that would make this a 
weirder thing to do there than with ordinary functions. Still, noting it 
with a comment does seem helpful. I've updated the remark to match what 
you suggested.


Thank you again for the review, and sorry for so many iterations! :-)

Yours,

--
Paul  ~{:-)
p...@illuminatedcomputing.comFrom b409d7333132e34a928ec8cc0ecca0a3421b7268 Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth" 
Date: Fri, 10 Dec 2021 16:04:57 -0800
Subject: [PATCH v4] Add range_agg with multirange inputs

---
 doc/src/sgml/func.sgml|  30 ++
 src/backend/utils/adt/multirangetypes.c   |  69 ++--
 src/include/catalog/pg_aggregate.dat  |   3 +
 src/include/catalog/pg_proc.dat   |  11 ++
 src/test/regress/expected/multirangetypes.out | 100 ++
 src/test/regress/expected/opr_sanity.out  |   1 +
 src/test/regress/sql/multirangetypes.sql  |  21 
 src/test/regress/sql/opr_sanity.sql   |   1 +
 8 files changed, 230 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index df3cd5987b..ab6c4f0093 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -20012,8 +20012,23 @@ SELECT NULLIF(value, '(none)') ...

No
   
 
+  
+   
+
+ range_agg
+
+range_agg ( value
+ anymultirange )
+anymultirange
+   
+   
+Computes the union of the non-null input values.
+   
+   No
+  
+
   

 
  range_intersect_agg
@@ -20027,8 +20042,23 @@ SELECT NULLIF(value, '(none)') ...

No
   
 
+  
+   
+
+ range_intersect_agg
+
+range_intersect_agg ( value
+ anymultirange )
+anymultirange
+   
+   
+Computes the intersection of the non-null input values.
+   
+   No
+  
+
   

 
  string_agg
diff --git a/src/backend/utils/adt/multirangetypes.c b/src/backend/utils/adt/multirangetypes.c
index c474b24431..a6d376b083 100644
--- a/src/backend/utils/adt/multirangetypes.c
+++ b/src/backend/utils/adt/multirangetypes.c
@@ -1344,11 +1344,9 @@ range_agg_transfn(PG_FUNCTION_ARGS)
 		elog(ERROR, "range_agg_transfn called in non-aggregate context");
 
 	rngtypoid = get_fn_expr_argtype(fcinfo->flinfo, 1);
 	if (!type_is_range(rngtypoid))
-		ereport(ERROR,
-(errcode(ERRCODE_DATATYPE_MISMATCH),
- errmsg("range_agg must be called with a range")));
+		elog(ERROR, "range_agg must be called with a range or multirange");
 
 	if (PG_ARGISNULL(0))
 		state = initArrayResult(rngtypoid, aggContext, false);
 	else
@@ -1362,8 +1360,11 @@ range_agg_transfn(PG_FUNCTION_ARGS)
 }
 
 /*
  * range_agg_finalfn: use our internal array to merge touching ranges.
+ *
+ * Shared by range_agg_finalfn(anyrange) and
+ * multirange_agg_finalfn(anymultirange).
  */
 Datum
 range_agg_finalfn(PG_FUNCTION_ARGS)
 {
@@ -1397,8 +1398,66 @@ range_agg_finalfn(PG_FUNCTION_ARGS)
 
 	PG_RETURN_MULTIRANGE_P(make_multirange(mltrngtypoid, typcache->rngtype, range_count, ranges));
 }
 
+/*
+ * multirange_agg_transfn: combine adjacent/overlapping multiranges.
+ *
+ * All we do here is gather the input multiranges' ranges into an array
+ * so that the finalfn can sort and combine them.
+ */
+Datum
+multirange_agg_transfn(PG_FUNCTION_ARGS)
+{
+	MemoryContext aggContext;
+	Oid			mltrngtypoid;
+	TypeCacheEntry *typcache;
+	TypeCacheEntry *rngtypcache;
+	ArrayBuildState *state;
+	MultirangeType *current;
+	int32		range_count;
+	RangeType **ranges;
+	int32		i;
+
+	if (!AggCheckCallContext(fcinfo, &aggContext))
+		elog(ERROR, "multirange_agg_transfn called in non-aggregate context");
+
+	mltrngtypoid = get_fn_expr_argtype(fcinfo->flinfo, 1);
+	if (!type_is_multirange(mltrngtypoid))
+		elog(ERROR, "range_agg must be called with a range or multirange");
+
+	typcache = multirange_get_typcache(fcinfo, mltrngtypoid);
+	rngtypcache = typcache->rngtype;
+
+	if (PG_ARGISNULL

Re: range_agg with multirange inputs

2022-03-10 Thread Chapman Flack
On 03/05/22 15:53, Paul Jungwirth wrote:
> On 3/1/22 13:33, Chapman Flack wrote:
>> I think the 4 lines should suffice, but it looks like this patch was
>> generated from a rebase of the old one (with three lines) that ended up
>> putting the new 'range_agg' entry ahead of 'max' in func.sgml, which
>> position is now baked into the 4 lines of context. :)
> 
> You're right, my last rebase messed up the docs. Here it is fixed. Sorry
> about that!

When I apply this patch, I get a func.sgml with two entries for
range_intersect_agg(anymultirange).

> I like the elog solution. I've changed them in both places.

It looks like you've now got elog in three places: the "must be called
with a range or multirange" in multirange_agg_transfn and
multirange_intersect_agg_transfn, and the "called in non-aggregate
context" in multirange_agg_transfn.

I think that last is also ok, given that its state type is internal,
so it shouldn't be reachable in a user call.

In range_agg_transfn, you've changed the message in the "must be called
with a range or multirange"; that seems like another good candidate to
be an elog.

> I see 13 other shared finalfns (using select array_agg(aggfnoid::regproc) as
> procs, array_agg(aggtransfn) as transfns, aggfinalfn from pg_aggregate where
> aggfinalfn is distinct from 0 group by aggfinalfn having count(*) > 1;) but
> a comment can't hurt! Added.

I think your query finds aggregate declarations that share the same SQL
function declaration as their finalizer functions. That seems to be more
common.

The query I used looks for cases where different SQL-declared functions
appear as finalizers of aggregates, but the different SQL declared functions
share the same internal C implementation. That's the query where this seems
to be the unique result.

WITH
 finals(regp) AS (
  SELECT DISTINCT
CAST(aggfinalfn AS regprocedure)
   FROM
pg_aggregate
   WHERE
aggfinalfn <> 0 -- InvalidOid
 )
SELECT
  prosrc, array_agg(regp)
 FROM
  pg_proc, finals
 WHERE
  oid = regp AND prolang = 12 -- INTERNALlanguageId
 GROUP BY
  prosrc
 HAVING
  count(*) > 1;

In other words, I think the interesting thing to say in the C comment
is not "shared by range_agg(anyrange) and range_agg(anymultirange)", but
"shared by range_agg_finalfn(internal,anyrange) and
multirange_agg_finalfn(internal,anymultirange)".

It seems a little extra surprising to have one C function declared in SQL
with two different names and parameter signatures. It ends up working
out because it relies on get_fn_expr_rettype, which can do its job for
either polymorphic type it might find in the parameter declaration.
But that's a bit subtle. :)

Regards,
-Chap




Re: range_agg with multirange inputs

2022-03-05 Thread Paul Jungwirth

On 3/1/22 13:33, Chapman Flack wrote:

I think the 4 lines should suffice, but it looks like this patch was
generated from a rebase of the old one (with three lines) that ended up
putting the new 'range_agg' entry ahead of 'max' in func.sgml, which
position is now baked into the 4 lines of context. :)


You're right, my last rebase messed up the docs. Here it is fixed. Sorry 
about that!



I would not change them to actual Assert, which would blow up the whole
process on failure. If it's a genuine "not expected to happen" case,
maybe changing it to elog (or ereport with errmsg_internal) would save
a little workload for translators.


I like the elog solution. I've changed them in both places.


I did a small double-take seeing the C range_agg_finalfn being shared
by the SQL range_agg_finalfn and multirange_agg_finalfn. I infer that
the reason it works is get_fn_expr_rettype works equally well with
either parameter type.

Do you think it would be worth adding a comment at the C function
explaining that? In a quick query I just did, I found no other aggregate
final functions sharing a C function that way, so this could be the first.


I see 13 other shared finalfns (using select 
array_agg(aggfnoid::regproc) as procs, array_agg(aggtransfn) as 
transfns, aggfinalfn from pg_aggregate where aggfinalfn is distinct from 
0 group by aggfinalfn having count(*) > 1;) but a comment can't hurt! Added.


Thanks,

--
Paul  ~{:-)
p...@illuminatedcomputing.comFrom 1e7a96668b0ad341d29281055927ad693c656843 Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth" 
Date: Fri, 10 Dec 2021 16:04:57 -0800
Subject: [PATCH v3] Add range_agg with multirange inputs

---
 doc/src/sgml/func.sgml|  45 
 src/backend/utils/adt/multirangetypes.c   |  66 +++-
 src/include/catalog/pg_aggregate.dat  |   3 +
 src/include/catalog/pg_proc.dat   |  11 ++
 src/test/regress/expected/multirangetypes.out | 100 ++
 src/test/regress/expected/opr_sanity.out  |   1 +
 src/test/regress/sql/multirangetypes.sql  |  21 
 src/test/regress/sql/opr_sanity.sql   |   1 +
 8 files changed, 244 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index df3cd5987b..fb1963a687 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -20012,8 +20012,23 @@ SELECT NULLIF(value, '(none)') ...

No
   
 
+  
+   
+
+ range_agg
+
+range_agg ( value
+ anymultirange )
+anymultirange
+   
+   
+Computes the union of the non-null input values.
+   
+   No
+  
+
   

 
  range_intersect_agg
@@ -20027,8 +20042,38 @@ SELECT NULLIF(value, '(none)') ...

No
   
 
+  
+   
+
+ range_intersect_agg
+
+range_intersect_agg ( value
+ anymultirange )
+anymultirange
+   
+   
+Computes the intersection of the non-null input values.
+   
+   No
+  
+
+  
+   
+
+ range_intersect_agg
+
+range_intersect_agg ( value
+ anymultirange )
+anymultirange
+   
+   
+Computes the intersection of the non-null input values.
+   
+   No
+  
+
   

 
  string_agg
diff --git a/src/backend/utils/adt/multirangetypes.c b/src/backend/utils/adt/multirangetypes.c
index c474b24431..5a4c00457a 100644
--- a/src/backend/utils/adt/multirangetypes.c
+++ b/src/backend/utils/adt/multirangetypes.c
@@ -1346,9 +1346,9 @@ range_agg_transfn(PG_FUNCTION_ARGS)
 	rngtypoid = get_fn_expr_argtype(fcinfo->flinfo, 1);
 	if (!type_is_range(rngtypoid))
 		ereport(ERROR,
 (errcode(ERRCODE_DATATYPE_MISMATCH),
- errmsg("range_agg must be called with a range")));
+ errmsg("range_agg must be called with a range or multirange")));
 
 	if (PG_ARGISNULL(0))
 		state = initArrayResult(rngtypoid, aggContext, false);
 	else
@@ -1362,8 +1362,10 @@ range_agg_transfn(PG_FUNCTION_ARGS)
 }
 
 /*
  * range_agg_finalfn: use our internal array to merge touching ranges.
+ *
+ * Shared by range_agg(anyrange) and range_agg(anymultirange).
  */
 Datum
 range_agg_finalfn(PG_FUNCTION_ARGS)
 {
@@ -1397,8 +1399,66 @@ range_agg_finalfn(PG_FUNCTION_ARGS)
 
 	PG_RETURN_MULTIRANGE_P(make_multirange(mltrngtypoid, typcache->rngtype, range_count, ranges));
 }
 
+/*
+ * multirange_agg_transfn: combine adjacent/overlapping multiranges.
+ *
+ * All we do here is gather the input multiranges' ranges into an array
+ * so that the finalfn can sort and combine them.
+ */
+Datum
+multirange_agg_transfn(PG_FUNCTION_ARGS)
+{
+	MemoryContext aggContext;
+	Oid			mltrngtypoid;
+	TypeCacheEntry *typcache;
+	TypeCacheEntry *rngtypcache;
+	ArrayBuildState *state;
+	MultirangeType *current;
+	int32		range_count;
+	RangeType **ranges;
+	int32		i;
+
+	if (!AggCheckCallContext(f

Re: range_agg with multirange inputs

2022-03-01 Thread Chapman Flack
On 02/28/22 23:31, Paul Jungwirth wrote:
> On 2/26/22 17:13, Chapman Flack wrote:
>> (I think generating
>> the patch with 4 lines of context would be enough to keep that from being
>> a recurring issue.)
> 
> Thank you for the review and the tip re 4 lines of context! Rebase attached.

I think the 4 lines should suffice, but it looks like this patch was
generated from a rebase of the old one (with three lines) that ended up
putting the new 'range_agg' entry ahead of 'max' in func.sgml, which
position is now baked into the 4 lines of context. :)

So I think it needs a bit of manual attention to get the additions back
in the right places, and then a 4-context-lines patch generated from that.

> I changed the message to "range_agg must be called
> with a range or multirange". How does that seem?

That works for me.

>> I kind of wonder whether either message is really reachable, at least
>> through the aggregate machinery in the expected way. Won't that machinery
>> ensure that it is calling the right transfn with the right type of
>> argument? If that's the case, maybe the messages could only be seen
>> by someone calling the transfn directly ... which also seems ruled out
>> for these transfns because the state type is internal. Is this whole test
>> more of the nature of an assertion?
> 
> I don't think they are reachable, so perhaps they are more like asserts. Do
> you think I should change it? It seems like a worthwhile check in any case.

I would not change them to actual Assert, which would blow up the whole
process on failure. If it's a genuine "not expected to happen" case,
maybe changing it to elog (or ereport with errmsg_internal) would save
a little workload for translators. But as you were copying an existing
ereport with a translatable message, there's also an argument for sticking
to that style, and maybe mentioning the question to an eventual committer
who might have a stronger opinion.

I did a small double-take seeing the C range_agg_finalfn being shared
by the SQL range_agg_finalfn and multirange_agg_finalfn. I infer that
the reason it works is get_fn_expr_rettype works equally well with
either parameter type.

Do you think it would be worth adding a comment at the C function
explaining that? In a quick query I just did, I found no other aggregate
final functions sharing a C function that way, so this could be the first.

Regards,
-Chap




Re: range_agg with multirange inputs

2022-02-28 Thread Paul Jungwirth

On 2/26/22 17:13, Chapman Flack wrote:

This applies (with some fuzz) and passes installcheck-world, but a rebase
is needed, because 3 lines of context aren't enough to get the doc changes
in the right place in the aggregate function table. (I think generating
the patch with 4 lines of context would be enough to keep that from being
a recurring issue.)


Thank you for the review and the tip re 4 lines of context! Rebase attached.


One thing that seems a bit funny is this message in the new
multirange_agg_transfn:

+   if (!type_is_multirange(mltrngtypoid))
+   ereport(ERROR,
+   (errcode(ERRCODE_DATATYPE_MISMATCH),
+errmsg("range_agg must be called with a multirange")));


I agree it would be more helpful to users to let them know we can take 
either kind of argument. I changed the message to "range_agg must be 
called with a range or multirange". How does that seem?



I kind of wonder whether either message is really reachable, at least
through the aggregate machinery in the expected way. Won't that machinery
ensure that it is calling the right transfn with the right type of
argument? If that's the case, maybe the messages could only be seen
by someone calling the transfn directly ... which also seems ruled out
for these transfns because the state type is internal. Is this whole test
more of the nature of an assertion?


I don't think they are reachable, so perhaps they are more like asserts. 
Do you think I should change it? It seems like a worthwhile check in any 
case.


Yours,

--
Paul  ~{:-)
p...@illuminatedcomputing.comFrom a6689485aab9b1aaa6e866f2a577368c7a0e324e Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth" 
Date: Fri, 10 Dec 2021 16:04:57 -0800
Subject: [PATCH v2] Add range_agg with multirange inputs

---
 doc/src/sgml/func.sgml|  30 ++
 src/backend/utils/adt/multirangetypes.c   |  62 ++-
 src/include/catalog/pg_aggregate.dat  |   3 +
 src/include/catalog/pg_proc.dat   |  11 ++
 src/test/regress/expected/multirangetypes.out | 100 ++
 src/test/regress/expected/opr_sanity.out  |   1 +
 src/test/regress/sql/multirangetypes.sql  |  21 
 src/test/regress/sql/opr_sanity.sql   |   1 +
 8 files changed, 228 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index df3cd5987b..6a72785327 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -19959,8 +19959,23 @@ SELECT NULLIF(value, '(none)') ...

No
   
 
+  
+   
+
+ range_agg
+
+range_agg ( value
+ anymultirange )
+anymultirange
+   
+   
+Computes the union of the non-null input values.
+   
+   No
+  
+
   

 
  max
@@ -20012,8 +20027,23 @@ SELECT NULLIF(value, '(none)') ...

No
   
 
+  
+   
+
+ range_intersect_agg
+
+range_intersect_agg ( value
+ anymultirange )
+anymultirange
+   
+   
+Computes the intersection of the non-null input values.
+   
+   No
+  
+
   

 
  range_intersect_agg
diff --git a/src/backend/utils/adt/multirangetypes.c b/src/backend/utils/adt/multirangetypes.c
index c474b24431..0efef8cf35 100644
--- a/src/backend/utils/adt/multirangetypes.c
+++ b/src/backend/utils/adt/multirangetypes.c
@@ -1346,9 +1346,9 @@ range_agg_transfn(PG_FUNCTION_ARGS)
 	rngtypoid = get_fn_expr_argtype(fcinfo->flinfo, 1);
 	if (!type_is_range(rngtypoid))
 		ereport(ERROR,
 (errcode(ERRCODE_DATATYPE_MISMATCH),
- errmsg("range_agg must be called with a range")));
+ errmsg("range_agg must be called with a range or multirange")));
 
 	if (PG_ARGISNULL(0))
 		state = initArrayResult(rngtypoid, aggContext, false);
 	else
@@ -1397,8 +1397,68 @@ range_agg_finalfn(PG_FUNCTION_ARGS)
 
 	PG_RETURN_MULTIRANGE_P(make_multirange(mltrngtypoid, typcache->rngtype, range_count, ranges));
 }
 
+/*
+ * multirange_agg_transfn: combine adjacent/overlapping multiranges.
+ *
+ * All we do here is gather the input multiranges' ranges into an array
+ * so that the finalfn can sort and combine them.
+ */
+Datum
+multirange_agg_transfn(PG_FUNCTION_ARGS)
+{
+	MemoryContext aggContext;
+	Oid			mltrngtypoid;
+	TypeCacheEntry *typcache;
+	TypeCacheEntry *rngtypcache;
+	ArrayBuildState *state;
+	MultirangeType *current;
+	int32		range_count;
+	RangeType **ranges;
+	int32		i;
+
+	if (!AggCheckCallContext(fcinfo, &aggContext))
+		elog(ERROR, "multirange_agg_transfn called in non-aggregate context");
+
+	mltrngtypoid = get_fn_expr_argtype(fcinfo->flinfo, 1);
+	if (!type_is_multirange(mltrngtypoid))
+		ereport(ERROR,
+(errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("range_agg must be called with a range or multirange")));
+
+	typcache = multirange_get_typcache(fcinfo, mltrngtypoid);
+	rngtypcache = typcache->rngtyp

Re: range_agg with multirange inputs

2022-02-26 Thread Chapman Flack
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

This applies (with some fuzz) and passes installcheck-world, but a rebase
is needed, because 3 lines of context aren't enough to get the doc changes
in the right place in the aggregate function table. (I think generating
the patch with 4 lines of context would be enough to keep that from being
a recurring issue.)

One thing that seems a bit funny is this message in the new
multirange_agg_transfn:

+   if (!type_is_multirange(mltrngtypoid))
+   ereport(ERROR,
+   (errcode(ERRCODE_DATATYPE_MISMATCH),
+errmsg("range_agg must be called with a multirange")));

It's clearly copied from the corresponding test and message in
range_agg_transfn. They both say "range_agg must be called ...", which
makes perfect sense, as from the user's perspective both messages come
from (different overloads of) a function named range_agg.

Still, it could be odd to have (again from the user's perspective)
a function named range_agg that sometimes says "range_agg must be
called with a range" and other times says "range_agg must be called
with a multirange".

I'm not sure how to tweak the wording (of either message or both) to
make that less weird, but there's probably a way.

I kind of wonder whether either message is really reachable, at least
through the aggregate machinery in the expected way. Won't that machinery
ensure that it is calling the right transfn with the right type of
argument? If that's the case, maybe the messages could only be seen
by someone calling the transfn directly ... which also seems ruled out
for these transfns because the state type is internal. Is this whole test
more of the nature of an assertion?

Regards,
-Chap

The new status of this patch is: Waiting on Author


Re: range_agg

2020-12-27 Thread Alexander Korotkov
On Sun, Dec 27, 2020 at 9:07 PM David Fetter  wrote:
> On Sun, Dec 27, 2020 at 09:53:13AM -0800, Zhihong Yu wrote:
> > This is not an ideal way to index multirages, but something we can
> > easily have.
>
> What sort of indexing improvements do you have in mind?

Approximation of multirange as a range can cause false positives.
It's good if gaps are small, but what if they aren't.

Ideally, we should split multirange to the ranges and index them
separately.  So, we would need a GIN-like index.  The problem is that
the GIN entry tree is a B-tree, which is not very useful for searching
for ranges.  If we could replace the GIN entry tree with GiST or
SP-GiST, that should be good.  We could index multirage parts
separately and big gaps wouldn't be a problem.  Similar work was
already prototyped (it was prototyped under the name "vodka", but I'm
not a big fan of this name).  FWIW, such a new access method would
need a lot of work to bring it to commit.  I don't think it would be
reasonable, before multiranges get popular.

Regarding the GiST opclass, it seems the best we can do in GiST.

--
Regards,
Alexander Korotkov




Re: range_agg

2020-12-27 Thread Alexander Korotkov
On Sun, Dec 27, 2020 at 8:52 PM Zhihong Yu  wrote:
> This is not an ideal way to index multirages, but something we can easily 
> have.
>
> typo: multiranges

Thanks for catching.  I will revise the commit message before committing.

--
Regards,
Alexander Korotkov




Re: range_agg

2020-12-27 Thread David Fetter
On Sun, Dec 27, 2020 at 09:53:13AM -0800, Zhihong Yu wrote:
> Hi,
> 
> This is not an ideal way to index multirages, but something we can
> easily have.

What sort of indexing improvements do you have in mind?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: range_agg

2020-12-27 Thread Zhihong Yu
Hi,

This is not an ideal way to index multirages, but something we can easily
have.

typo: multiranges

Cheers

On Sun, Dec 27, 2020 at 1:50 AM Alexander Korotkov 
wrote:

> On Thu, Dec 17, 2020 at 10:10 PM Alexander Korotkov
>  wrote:
> >
> > I think this patch is very close to committable.  I'm going to spend
> > some more time further polishing it and commit (if I don't find a
> > major issue or face objections).
>
> The main patch is committed.  I've prepared a set of improvements.
> 0001 Fixes bug in bsearch comparison functions
> 0002 Implements missing @> (range,multirange) operator and its commutator
> 0003 Does refactors signatures of *_internal() multirange functions
> 0004 Adds cross-type (range, multirange) operators handling to
> existing range GiST opclass
> 0005 Adds support for GiST multirange indexing by approximation of
> multirange as the union range with no gaps
>
> The patchset is quite trivial.  I'm going to push it if there are no
> objections.
>
> The SP-GiST handling is more tricky and requires substantial work.
>
> --
> Regards,
> Alexander Korotkov
>


Re: range_agg

2020-12-27 Thread Alexander Korotkov
On Thu, Dec 17, 2020 at 10:10 PM Alexander Korotkov
 wrote:
>
> I think this patch is very close to committable.  I'm going to spend
> some more time further polishing it and commit (if I don't find a
> major issue or face objections).

The main patch is committed.  I've prepared a set of improvements.
0001 Fixes bug in bsearch comparison functions
0002 Implements missing @> (range,multirange) operator and its commutator
0003 Does refactors signatures of *_internal() multirange functions
0004 Adds cross-type (range, multirange) operators handling to
existing range GiST opclass
0005 Adds support for GiST multirange indexing by approximation of
multirange as the union range with no gaps

The patchset is quite trivial.  I'm going to push it if there are no objections.

The SP-GiST handling is more tricky and requires substantial work.

--
Regards,
Alexander Korotkov


0001-Fix-bugs-in-comparison-functions-for-multirange_bsea.patch
Description: Binary data


0002-Implement-operators-for-checking-if-the-range-contai.patch
Description: Binary data


0003-Improve-the-signature-of-internal-multirange-functio.patch
Description: Binary data


0005-Add-GiST-indexes-for-multiranges.patch
Description: Binary data


0004-Add-support-of-multirange-matching-to-the-existing-r.patch
Description: Binary data


Re: range_agg

2020-12-16 Thread Alexander Korotkov
On Thu, Dec 17, 2020 at 2:37 AM Zhihong Yu  wrote:
> Letting user manually name the multirange (after a few automatic attempts) 
> seems reasonable.

Accepted.  Thank you for your feedback.

--
Regards,
Alexander Korotkov




Re: range_agg

2020-12-16 Thread Zhihong Yu
Letting user manually name the multirange (after a few automatic attempts)
seems reasonable.

Cheers

On Wed, Dec 16, 2020 at 3:34 PM Alexander Korotkov 
wrote:

> On Thu, Dec 17, 2020 at 1:03 AM Alexander Korotkov 
> wrote:
> > On Thu, Dec 17, 2020 at 12:54 AM Zhihong Yu  wrote:
> > > +* The idea is to prepend underscores as needed until we make a
> name that
> > > +* doesn't collide with anything ...
> > >
> > > I wonder if other characters (e.g. [a-z0-9]) can be used so that name
> without collision can be found without calling truncate_identifier().
> >
> > Probably.  But multiranges just shares naming logic already existing
> > in arrays.  If we're going to change this, I think we should change
> > this for arrays too.  And this change shouldn't be part of multirange
> > patch.
>
> I gave this another thought.  Now we have facility to name multirange
> types manually.  I think we should give up with underscore naming
> completely.  If both replacing "range" with "mutlirange" in the
> typename and appending "_multirange" to the type name failed (very
> unlikely), then let user manually name the multirange.  Any thoughts?
>
> --
> Regards,
> Alexander Korotkov
>


Re: range_agg

2020-12-16 Thread Alexander Korotkov
On Thu, Dec 17, 2020 at 1:03 AM Alexander Korotkov  wrote:
> On Thu, Dec 17, 2020 at 12:54 AM Zhihong Yu  wrote:
> > +* The idea is to prepend underscores as needed until we make a name 
> > that
> > +* doesn't collide with anything ...
> >
> > I wonder if other characters (e.g. [a-z0-9]) can be used so that name 
> > without collision can be found without calling truncate_identifier().
>
> Probably.  But multiranges just shares naming logic already existing
> in arrays.  If we're going to change this, I think we should change
> this for arrays too.  And this change shouldn't be part of multirange
> patch.

I gave this another thought.  Now we have facility to name multirange
types manually.  I think we should give up with underscore naming
completely.  If both replacing "range" with "mutlirange" in the
typename and appending "_multirange" to the type name failed (very
unlikely), then let user manually name the multirange.  Any thoughts?

--
Regards,
Alexander Korotkov




Re: range_agg

2020-12-16 Thread Alexander Korotkov
On Thu, Dec 17, 2020 at 12:54 AM Zhihong Yu  wrote:
> +* The idea is to prepend underscores as needed until we make a name that
> +* doesn't collide with anything ...
>
> I wonder if other characters (e.g. [a-z0-9]) can be used so that name without 
> collision can be found without calling truncate_identifier().

Probably.  But multiranges just shares naming logic already existing
in arrays.  If we're going to change this, I think we should change
this for arrays too.  And this change shouldn't be part of multirange
patch.

> +   else if (strcmp(defel->defname, "multirange_type_name") == 0)
> +   {
> +   if (multirangeTypeName != NULL)
> +   ereport(ERROR,
> +   (errcode(ERRCODE_SYNTAX_ERROR),
> +errmsg("conflicting or redundant options")));
>
> Maybe make the error message a bit different from occurrences of similar 
> error message (such as including multirangeTypeName) ?

This is again isn't an invention of multirange.  We use this message
many times in DefineRange() and other places.  From the first glance,
I've nothing against changing this to a more informative message, but
that should be done globally.  And this change isn't directly related
to multirage.  Feel free to propose a patch improving this.

--
Regards,
Alexander Korotkov




Re: range_agg

2020-12-07 Thread Alvaro Herrera
On 2020-Dec-08, Alexander Korotkov wrote:

> I also found a problem in multirange types naming logic.  Consider the
> following example.
> 
> create type a_multirange AS (x float, y float);
> create type a as range(subtype=text, collation="C");
> create table tbl (x __a_multirange);
> drop type a_multirange;
> 
> If you dump this database, the dump couldn't be restored.  The
> multirange type is named __a_multirange, because the type named
> a_multirange already exists.  However, it might appear that
> a_multirange type is already deleted.  When the dump is restored, a
> multirange type is named a_multirange, and the corresponding table
> fails to be created.  The same thing doesn't happen with arrays,
> because arrays are not referenced in dumps by their internal names.
> 
> I think we probably should add an option to specify multirange type
> names while creating a range type.  Then dump can contain exact type
> names used in the database, and restore wouldn't have a names
> collision.

Hmm, good point.  I agree that a dump must preserve the name, since once
created it is user-visible.  I had not noticed this problem, but it's
obvious in retrospect.

> In general, I wonder if we can make the binary format of multiranges
> more efficient.  It seems that every function involving multiranges
> from multirange_deserialize().  I think we can make functions like
> multirange_contains_elem() much more efficient.  Multirange is
> basically an array of ranges.  So we can pack it as follows.
> 1. Typeid and rangecount
> 2. Tightly packed array of flags (1-byte for each range)
> 3. Array of indexes of boundaries (4-byte for each range).  Or even
> better we can combine offsets and lengths to be compression-friendly
> like jsonb JEntry's do.
> 4. Boundary values
> Using this format, we can implement multirange_contains_elem(),
> multirange_contains_range() without deserialization and using binary
> search.  That would be much more efficient.  What do you think?

I also agree.  I spent some time staring at the I/O code a couple of
months back but was unable to focus on it for long enough.  I don't know
JEntry's format, but I do remember that the storage format for JSONB was
widely discussed back then; it seems wise to apply similar logic or at
least similar reasoning.




Re: range_agg

2020-12-07 Thread Alexander Korotkov
On Mon, Nov 30, 2020 at 11:39 PM Alexander Korotkov
 wrote:
> On Mon, Nov 30, 2020 at 10:35 PM Alexander Korotkov
>  wrote:
> > On Sun, Nov 29, 2020 at 11:53 PM Paul A Jungwirth
> >  wrote:
> > >
> > > On Sun, Nov 29, 2020 at 11:43 AM Alexander Korotkov
> > >  wrote:
> > > > Thank you.  Could you please, update doc/src/sgml/catalogs.sgml,
> > > > because pg_type and pg_range catalogs are updated.
> > >
> > > Attached! :-)
> >
> > You're quick, thank you.  Please, also take a look at cfbot failure
> > https://travis-ci.org/github/postgresql-cfbot/postgresql/builds/746623942
> > I've tried to reproduce it, but didn't manage yet.
>
> Got it.  type_sanity test fails on any platform, you just need to
> repeat "make check" till it fails.
>
> The failed query checked consistency of range types, but it didn't
> take into account ranges of domains and ranges of records, which are
> exercised by multirangetypes test running in parallel.  We could teach
> this query about such kinds of ranges, but I think that would be
> overkill, because we're not going to introduce such builtin ranges
> yet.  So, I'm going to just move multirangetypes test into another
> group of parallel tests.

I also found a problem in multirange types naming logic.  Consider the
following example.

create type a_multirange AS (x float, y float);
create type a as range(subtype=text, collation="C");
create table tbl (x __a_multirange);
drop type a_multirange;

If you dump this database, the dump couldn't be restored.  The
multirange type is named __a_multirange, because the type named
a_multirange already exists.  However, it might appear that
a_multirange type is already deleted.  When the dump is restored, a
multirange type is named a_multirange, and the corresponding table
fails to be created.  The same thing doesn't happen with arrays,
because arrays are not referenced in dumps by their internal names.

I think we probably should add an option to specify multirange type
names while creating a range type.  Then dump can contain exact type
names used in the database, and restore wouldn't have a names
collision.

Another thing that worries me is the multirange serialization format.

typedef struct
{
int32   vl_len_;/* varlena header */
charflags;  /* range flags */
char_padding;   /* Bounds must be aligned */
/* Following the header are zero to two bound values. */
} ShortRangeType;

Comment says this structure doesn't contain a varlena header, while
structure obviously has it.

In general, I wonder if we can make the binary format of multiranges
more efficient.  It seems that every function involving multiranges
from multirange_deserialize().  I think we can make functions like
multirange_contains_elem() much more efficient.  Multirange is
basically an array of ranges.  So we can pack it as follows.
1. Typeid and rangecount
2. Tightly packed array of flags (1-byte for each range)
3. Array of indexes of boundaries (4-byte for each range).  Or even
better we can combine offsets and lengths to be compression-friendly
like jsonb JEntry's do.
4. Boundary values
Using this format, we can implement multirange_contains_elem(),
multirange_contains_range() without deserialization and using binary
search.  That would be much more efficient.  What do you think?

--
Regards,
Alexander Korotkov




Re: range_agg

2020-11-30 Thread Alexander Korotkov
On Mon, Nov 30, 2020 at 10:35 PM Alexander Korotkov
 wrote:
> On Sun, Nov 29, 2020 at 11:53 PM Paul A Jungwirth
>  wrote:
> >
> > On Sun, Nov 29, 2020 at 11:43 AM Alexander Korotkov
> >  wrote:
> > > Thank you.  Could you please, update doc/src/sgml/catalogs.sgml,
> > > because pg_type and pg_range catalogs are updated.
> >
> > Attached! :-)
>
> You're quick, thank you.  Please, also take a look at cfbot failure
> https://travis-ci.org/github/postgresql-cfbot/postgresql/builds/746623942
> I've tried to reproduce it, but didn't manage yet.

Got it.  type_sanity test fails on any platform, you just need to
repeat "make check" till it fails.

The failed query checked consistency of range types, but it didn't
take into account ranges of domains and ranges of records, which are
exercised by multirangetypes test running in parallel.  We could teach
this query about such kinds of ranges, but I think that would be
overkill, because we're not going to introduce such builtin ranges
yet.  So, I'm going to just move multirangetypes test into another
group of parallel tests.

--
Regards,
Alexander Korotkov




Re: range_agg

2020-11-30 Thread Alexander Korotkov
On Sun, Nov 29, 2020 at 11:53 PM Paul A Jungwirth
 wrote:
>
> On Sun, Nov 29, 2020 at 11:43 AM Alexander Korotkov
>  wrote:
> > Thank you.  Could you please, update doc/src/sgml/catalogs.sgml,
> > because pg_type and pg_range catalogs are updated.
>
> Attached! :-)

You're quick, thank you.  Please, also take a look at cfbot failure
https://travis-ci.org/github/postgresql-cfbot/postgresql/builds/746623942
I've tried to reproduce it, but didn't manage yet.

--
Regards,
Alexander Korotkov




Re: range_agg

2020-11-29 Thread Alexander Korotkov
On Sun, Nov 29, 2020 at 8:11 PM Paul A Jungwirth
 wrote:
> On Fri, Nov 27, 2020 at 12:35 AM Alexander Korotkov
>  wrote:
> > I'd like to review this patch.  Could you please rebase it once again?  
> > Thanks.
>
> Thanks! Here is a rebased version. It also includes one more cleanup
> commit from Alvaro since the last one.

Thank you.  Could you please, update doc/src/sgml/catalogs.sgml,
because pg_type and pg_range catalogs are updated.

--
Regards,
Alexander Korotkov




Re: range_agg

2020-11-27 Thread Alexander Korotkov
Hi!

On Thu, Sep 24, 2020 at 3:05 AM Paul A Jungwirth
 wrote:
> On Sun, Aug 16, 2020 at 12:55 PM Paul A Jungwirth
>  wrote:
> > This is rebased on the current master, including some changes to doc
> > tables and pg_upgrade handling of type oids.
>
> Here is a rebased version of this patch, including a bunch of cleanup
> from Alvaro. (Thanks Alvaro!)

I'd like to review this patch.  Could you please rebase it once again?  Thanks.

--
Regards,
Alexander Korotkov




Re: range_agg

2020-10-08 Thread Pavel Stehule
čt 24. 9. 2020 v 2:05 odesílatel Paul A Jungwirth <
p...@illuminatedcomputing.com> napsal:

> On Sun, Aug 16, 2020 at 12:55 PM Paul A Jungwirth
>  wrote:
> > This is rebased on the current master, including some changes to doc
> > tables and pg_upgrade handling of type oids.
>
> Here is a rebased version of this patch, including a bunch of cleanup
> from Alvaro. (Thanks Alvaro!)
>

I tested this patch and It looks well, I have not any objections

1. there are not new warnings
2. make check-world passed
3. build doc without problems
4. doc is enough, regress tests too
5. there was not objection against this feature in discussion, and I think
it is interesting and useful feature - good additional to arrays

Regards

Pavel




> Paul
>


Re: range_agg

2020-07-05 Thread Paul A Jungwirth
On Sun, Jul 5, 2020 at 12:11 PM Paul A Jungwirth
 wrote:
>
> Just knowing that arrays are
> something we do this for is enough to hunt for clues, but if anyone
> can point me more directly to code that will help me do it for
> multiranges, I'd be appreciative.

It looks like expandeddatum.h is where I should be looking. . . .

Paul




Re: range_agg

2020-07-05 Thread Justin Pryzby
On Sat, Apr 11, 2020 at 09:36:37AM -0700, Paul A Jungwirth wrote:
> On Fri, Apr 10, 2020 at 8:44 PM Paul A Jungwirth 
>  wrote:
> > Thanks, and thanks for your v17 also. Here is a patch building on that
> > and adding support for anycompatiblemultirange.
> 
> Here is a v19 that just moved the multirange tests to a new parallel
> group to avoid a max-20-tests error. Sorry about that!

This needs to be rebased ; set cfbot to "waiting".

-- 
Justin




Re: range_agg

2020-04-11 Thread Alvaro Herrera
On 2020-Apr-11, Paul A Jungwirth wrote:

> On Sat, Apr 11, 2020 at 9:36 AM Paul A Jungwirth
>  wrote:
> > Btw I'm working on typanalyze + selectivity, and it seems like the
> > test suite doesn't run those things?
> 
> Nevermind, I just had to add `analyze numrange_test` to
> src/test/regress/sql/rangetypes.sql. :-) Do you want a separate patch
> for that? Or maybe it should go in sql/vacuum.sql?

Dunno, it seems fine in rangetypes.sql.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: range_agg

2020-04-11 Thread Paul A Jungwirth
On Sat, Apr 11, 2020 at 9:36 AM Paul A Jungwirth
 wrote:
> Btw I'm working on typanalyze + selectivity, and it seems like the
> test suite doesn't run those things?

Nevermind, I just had to add `analyze numrange_test` to
src/test/regress/sql/rangetypes.sql. :-) Do you want a separate patch
for that? Or maybe it should go in sql/vacuum.sql?

Regards,
Paul




Re: range_agg

2020-04-05 Thread Alvaro Herrera
v17 is a rebase fixing a minor parse_coerce.c edit; v16 lasted little
:-(  I chose to change the wording of the conflicting comment in
enforce_generic_type_consistency():

 * 3) Similarly, if return type is ANYRANGE or ANYMULTIRANGE, and any
 *argument is ANYRANGE or ANYMULTIRANGE, use that argument's
 *actual type, range type or multirange type as the function's return
 *type.

This wording is less precise, in that it doesn't say exactly which of
the three types is the actual result for each of the possible four cases
(r->r, r->m, m->m, m->r) but I think it should be straightforward.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: range_agg

2020-03-23 Thread Paul Jungwirth

Thanks Alvaro!

On Mon, Mar 23, 2020 at 4:33 PM Alvaro Herrera 
 wrote:

>
> Thinking about the on-disk representation, can we do better than putting
> the contained ranges in long-varlena format, including padding; also we
> include the type OID with each element.  Sounds wasteful.  A more
> compact representation might be to allow short varlenas and doing away
> with the alignment padding, put the the type OID just once.  This is
> important because we cannot change it later.

Can you give me some guidance on this? I don't know how to make the 
on-disk format different from the in-memory format. (And for the 
in-memory format, I think it's important to have actual RangeTypes 
inside the multirange.) Is there something in the documentation, or a 
README in the repo, or even another type I can follow?


> I'm also wondering if multirange_in() is the right strategy.  Would 
it> be sensible to give each range to range_parse or range_parse_bounde, so

> that it determines where each range starts and ends?  Then that function
> doesn't have to worry about each quote and escape, duplicating range
> parsing code.  (This will probably require changing signature of the
> rangetypes.c function, and exporting it; for example have
> range_parse_bound allow bound_str to be NULL and in that case don't mess
> with the StringInfo and just return the end position of the parsed
> bound.)

Yeah, I really wanted to do it that way originally too. As you say it 
would require passing back more information from the range-parsing code. 
I can take a stab at making the necessary changes. I'm a bit more 
confident now than I was then in changing the range code we have already.


Regards,

--
Paul  ~{:-)
p...@illuminatedcomputing.com




Re: range_agg

2020-03-23 Thread Alvaro Herrera
On 2020-Mar-19, Paul A Jungwirth wrote:

> On Thu, Mar 19, 2020 at 1:43 PM Paul A Jungwirth
>  wrote:
> > On Thu, Mar 19, 2020 at 1:42 PM Alvaro Herrera  
> > wrote:
> > > There's been another flurry of commits in the polymorphic types area.
> > > Can you please rebase again?
> >
> > I noticed that too. :-) I'm about halfway through a rebase right now.
> > I can probably finish it up tonight.
> 
> Here is that patch. I should probably add an anycompatiblemultirange
> type now too? I'll get started on that tomorrow.

Thanks for the new version.  Here's a few minor adjustments while I
continue to read through it.

Thinking about the on-disk representation, can we do better than putting
the contained ranges in long-varlena format, including padding; also we
include the type OID with each element.  Sounds wasteful.  A more
compact representation might be to allow short varlenas and doing away
with the alignment padding, put the the type OID just once.  This is
important because we cannot change it later.

I'm also wondering if multirange_in() is the right strategy.  Would it
be sensible to give each range to range_parse or range_parse_bounde, so
that it determines where each range starts and ends?  Then that function
doesn't have to worry about each quote and escape, duplicating range
parsing code.  (This will probably require changing signature of the
rangetypes.c function, and exporting it; for example have
range_parse_bound allow bound_str to be NULL and in that case don't mess
with the StringInfo and just return the end position of the parsed
bound.)

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From bd8b814426bef6f5a620351557cdc953ba0ae22e Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 23 Mar 2020 18:50:02 -0300
Subject: [PATCH 1/5] Fix typo

---
 src/backend/utils/adt/multirangetypes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/multirangetypes.c b/src/backend/utils/adt/multirangetypes.c
index bd3cf3dc93..6b3cde8eca 100644
--- a/src/backend/utils/adt/multirangetypes.c
+++ b/src/backend/utils/adt/multirangetypes.c
@@ -79,7 +79,7 @@ static int32 multirange_canonicalize(TypeCacheEntry *rangetyp, int32 input_range
  * to range_in, but we have to detect quoting and backslash-escaping
  * which can happen for range bounds.
  * Backslashes can escape something inside or outside a quoted string,
- * and a quoted string can escape quote marks either either backslashes
+ * and a quoted string can escape quote marks with either backslashes
  * or double double-quotes.
  */
 Datum
-- 
2.20.1

>From afd017faab3d6aff3ca8618d95a6f9c918ab8cca Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 23 Mar 2020 18:50:19 -0300
Subject: [PATCH 2/5] Remove trailing useless ;

---
 src/backend/utils/adt/multirangetypes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/multirangetypes.c b/src/backend/utils/adt/multirangetypes.c
index 6b3cde8eca..6e9cf77651 100644
--- a/src/backend/utils/adt/multirangetypes.c
+++ b/src/backend/utils/adt/multirangetypes.c
@@ -131,7 +131,7 @@ multirange_in(PG_FUNCTION_ARGS)
 			input_str),
 	 errdetail("Unexpected end of input.")));
 
-		 /* skip whitespace */ ;
+		/* skip whitespace */
 		if (isspace((unsigned char) ch))
 			continue;
 
-- 
2.20.1

>From d7c1ea171de018d025ab5de57f572a9353b02fdf Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 23 Mar 2020 18:50:40 -0300
Subject: [PATCH 3/5] reduce palloc+strlcpy to pnstrdup

---
 src/backend/utils/adt/multirangetypes.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/adt/multirangetypes.c b/src/backend/utils/adt/multirangetypes.c
index 6e9cf77651..5b57416cfd 100644
--- a/src/backend/utils/adt/multirangetypes.c
+++ b/src/backend/utils/adt/multirangetypes.c
@@ -167,9 +167,8 @@ multirange_in(PG_FUNCTION_ARGS)
 	parse_state = MULTIRANGE_IN_RANGE_ESCAPED;
 else if (ch == ']' || ch == ')')
 {
-	range_str_len = ptr - range_str + 2;
-	range_str_copy = palloc0(range_str_len);
-	strlcpy(range_str_copy, range_str, range_str_len);
+	range_str_len = ptr - range_str + 1;
+	range_str_copy = pnstrdup(range_str, range_str_len);
 	if (range_capacity == range_count)
 	{
 		range_capacity *= 2;
-- 
2.20.1

>From 7698f57d1f5ab870c89a84da7bea24bef241a458 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 23 Mar 2020 18:50:59 -0300
Subject: [PATCH 4/5] silence compiler warning

---
 src/backend/utils/fmgr/funcapi.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index ab24359981..3bfefcf48a 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -508,10 +508,13 @@ resolve_anyelement_from_others(polymorphic_actuals *actuals)
 	else if (OidIsVal

Re: range_agg

2020-03-20 Thread Alvaro Herrera
On 2020-Mar-14, Paul A Jungwirth wrote:

> On Fri, Mar 13, 2020 at 2:39 PM Alvaro Herrera  
> wrote:
> > Here's the rebased version.
> >
> > I just realized I didn't include the API change I proposed in
> > https://postgr.es/m/20200306200343.GA625@alvherre.pgsql ...
> 
> Thanks for your help with this Alvaro!
> 
> I was just adding your changes to my own branch and I noticed your
> v12-0001 has different parameter names here:
> 
>  static MultirangeIOData *
> -get_multirange_io_data(FunctionCallInfo fcinfo, Oid mltrngtypid,
> IOFuncSelector func)
> +get_multirange_io_data(FunctionCallInfo fcinfo, Oid rngtypid,
> IOFuncSelector func)

> I'm pretty sure mltrngtypid is the correct name here. Right? Let me
> know if I'm missing something. :-)

Heh.  The intention here was to abbreviate to "typid", but if you want
to keep the longer name, it's OK too.  I don't think that name is
particularly critical, since it should be obvious that it must be a
multirange type.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: range_agg

2020-03-19 Thread Paul A Jungwirth
On Thu, Mar 19, 2020 at 1:42 PM Alvaro Herrera  wrote:
>
> On 2020-Mar-16, Paul A Jungwirth wrote:
>
> > On Sat, Mar 14, 2020 at 11:13 AM Paul A Jungwirth
> >  wrote:
> > > I think that should fix the cfbot failure.
> >
> > I saw this patch was failing to apply again. There was some
> > refactoring to how polymorphic types are determined. I added my
> > changes for anymultirange to that new approach, and things should be
> > passing again.
>
> There's been another flurry of commits in the polymorphic types area.
> Can you please rebase again?

I noticed that too. :-) I'm about halfway through a rebase right now.
I can probably finish it up tonight.

Paul




Re: range_agg

2020-03-19 Thread Alvaro Herrera
On 2020-Mar-16, Paul A Jungwirth wrote:

> On Sat, Mar 14, 2020 at 11:13 AM Paul A Jungwirth
>  wrote:
> > I think that should fix the cfbot failure.
> 
> I saw this patch was failing to apply again. There was some
> refactoring to how polymorphic types are determined. I added my
> changes for anymultirange to that new approach, and things should be
> passing again.

There's been another flurry of commits in the polymorphic types area.
Can you please rebase again?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: range_agg

2020-03-14 Thread Paul A Jungwirth
On Fri, Mar 13, 2020 at 2:39 PM Alvaro Herrera  wrote:
> Here's the rebased version.
>
> I just realized I didn't include the API change I proposed in
> https://postgr.es/m/20200306200343.GA625@alvherre.pgsql ...

Thanks for your help with this Alvaro!

I was just adding your changes to my own branch and I noticed your
v12-0001 has different parameter names here:

diff --git a/src/backend/utils/adt/multirangetypes.c
b/src/backend/utils/adt/multirangetypes.c
index f9dd0378cc..0c9afd5448 100644
--- a/src/backend/utils/adt/multirangetypes.c
+++ b/src/backend/utils/adt/multirangetypes.c
@@ -376,11 +375,11 @@ multirange_typanalyze(PG_FUNCTION_ARGS)
  * pointer to a type cache entry.
  */
 static MultirangeIOData *
-get_multirange_io_data(FunctionCallInfo fcinfo, Oid mltrngtypid,
IOFuncSelector func)
+get_multirange_io_data(FunctionCallInfo fcinfo, Oid rngtypid,
IOFuncSelector func)
 {
 MultirangeIOData *cache = (MultirangeIOData *) fcinfo->flinfo->fn_extra;

-if (cache == NULL || cache->typcache->type_id != mltrngtypid)
+if (cache == NULL || cache->typcache->type_id != rngtypid)
 {
 int16   typlen;
 booltypbyval;
@@ -389,9 +388,9 @@ get_multirange_io_data(FunctionCallInfo fcinfo,
Oid mltrngtypid, IOFuncSelector

 cache = (MultirangeIOData *)
MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,

sizeof(MultirangeIOData));
-cache->typcache = lookup_type_cache(mltrngtypid,
TYPECACHE_MULTIRANGE_INFO);
+cache->typcache = lookup_type_cache(rngtypid,
TYPECACHE_MULTIRANGE_INFO);
 if (cache->typcache->rngtype == NULL)
-elog(ERROR, "type %u is not a multirange type", mltrngtypid);
+elog(ERROR, "type %u is not a multirange type", rngtypid);

 /* get_type_io_data does more than we need, but is convenient */
 get_type_io_data(cache->typcache->rngtype->type_id,

I'm pretty sure mltrngtypid is the correct name here. Right? Let me
know if I'm missing something. :-)

Yours,
Paul




Re: range_agg

2020-03-13 Thread Tom Lane
Paul A Jungwirth  writes:
> On Wed, Mar 11, 2020 at 4:39 PM Paul A Jungwirth
>  wrote:
>> Oh, my last email left out the most important part. :-) Is this
>> failure online somewhere so I can take a look at it and fix it?

Look for your patch(es) at

http://commitfest.cputube.org

Right now it's not even applying, presumably because Alvaro already
pushed some pieces, so you need to rebase.  But when it was applying,
one or both of the test builds was failing.

regards, tom lane




Re: range_agg

2020-03-13 Thread Paul A Jungwirth
On Wed, Mar 11, 2020 at 4:39 PM Paul A Jungwirth
 wrote:
>
> On Sat, Mar 7, 2020 at 12:20 PM Tom Lane  wrote:
> > Alvaro Herrera  writes:
> > > [ v11 patches ]
> > The cfbot isn't too happy with this; it's getting differently-ordered
> > results than you apparently did for the list of owned objects in
> > dependency.out's DROP OWNED BY test.  Not sure why that should be ---
> > it seems like af6550d34 should have ensured that there's only one
> > possible ordering.
>
> Oh, my last email left out the most important part. :-) Is this
> failure online somewhere so I can take a look at it and fix it?

Looks like I sent this just to Tom before. This is something I need to
fix, right?

Regards,
Paul




Re: range_agg

2020-03-13 Thread Paul A Jungwirth
On Thu, Mar 12, 2020 at 5:38 AM Alvaro Herrera  wrote:
> ... thinking about gist+spgist, I think they could be written
> identically to those for ranges, using the lowest (first) lower bound
> and the higher (last) upper bound.
>
> ... thinking about selectivity, I think the way to write that is to
> first compute the selectivity for the range across the first lower bound
> and the last upper bound, and then subtract that for the "negative"
> space between the contained ranges.
>
> I have no immediate thoughts about typanalyze.  I suppose it should be
> somehow based on the implementation for ranges ... maybe a first-cut is
> to construct fake ranges covering the whole multirange (as above) and
> just use the ranges implementation (compute_range_stats).

Thanks, this is pretty much what I was thinking too, but I'm really
glad to have someone who knows better confirm it. I can get started on
these right away, and I'll let folks know if I need any help. When I
looked at this last fall there was a lot I didn't understand. More or
less using the existing ranges implementation should be a big help
though.

Paul




Re: range_agg

2020-03-12 Thread Alvaro Herrera


Hello Paul, thanks for the thorough response to all these points.

Regarding the merge of multiranges with ranges, I had also thought of
that at some point and was leaning towards doing that, but after the
latest responses I think the arguments against it are sensible; and now
there's a clear majority for keeping them separate.

I'll be posting an updated version of the patch later today.

I was a bit scared bit this part:

On 2020-Mar-11, Paul A Jungwirth wrote:

> Finally, I think I mentioned this a long time ago, but I'm still not
> sure if this patch needs work around these things:
> 
> - gist opclass
> - spgist opclass
> - typanalyze
> - selectivity
> 
> I'd love for a real Postgres expert to tell me "No, we can add that
> later" or "Yes, you have to add that now."

While I think that the gist and spgist opclass are in the "very nice to
have but still optional" category, the other two items seem mandatory
(but I'm not 100% certain about that, TBH).  I'm not sure we have time
to get those ready during this commitfest.


... thinking about gist+spgist, I think they could be written
identically to those for ranges, using the lowest (first) lower bound
and the higher (last) upper bound.

... thinking about selectivity, I think the way to write that is to
first compute the selectivity for the range across the first lower bound
and the last upper bound, and then subtract that for the "negative"
space between the contained ranges.

I have no immediate thoughts about typanalyze.  I suppose it should be
somehow based on the implementation for ranges ... maybe a first-cut is
to construct fake ranges covering the whole multirange (as above) and
just use the ranges implementation (compute_range_stats).

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: range_agg

2020-03-11 Thread Paul A Jungwirth
Thanks everyone for offering some thoughts on this!

Tom Lane  wrote:
> have you given any thought to just deciding that ranges and
> multiranges are the same type?

I can see how it might be nice to have just one type to think about.
Still I think keeping them separate makes sense. Other folks have
brought up several reasons already. Just to chime in:

Tom Lane  wrote:
> Isaac Morland  writes:
> > Definitely agreed that range and multirange (or whatever it's called)
> > should be different. In the work I do I have a number of uses for ranges,
> > but not (yet) for multiranges. I want to be able to declare a column as
> > range and be sure that it is just a single range, and then call lower() and
> > upper() on it and be sure to get just one value in each case; and if I
> > accidentally try to take the union of ranges where the union isn’t another
> > range, I want to get an error rather than calculate some weird (in my
> > context) multirange.
>
> I do not find that argument convincing at all.  Surely you could put
> that constraint on your column using "CHECK (numranges(VALUE) <= 1)"
> or some such notation.

A check constraint works for columns, but there are other contexts
where you'd like to restrict things to just a contiguous range, e.g.
user-defined functions and intermediate results in queries. Basic
ranges seem a lot simpler to think about, so I can appreciate how
letting any range be a multirange adds a heavy cognitive burden. I
think a lot of people will share Isaac's opinion here.

Tom Lane  wrote:
> Also, this would allow us to remove at least one ugly misfeature:
>
> regression=# select '[1,2]'::int4range + '[3,10)'::int4range;
>  ?column?
>  --
>   [1,10)
>   (1 row)
>
>   regression=# select '[1,2]'::int4range + '[4,10)'::int4range;
>   ERROR:  result of range union would not be contiguous

Because of backwards compatibility we can't really change +/-/* not to
raise (right?), so if we joined ranges and multiranges we'd need to
add operators with a different name. I was calling those @+/@-/@*
before, but that was considered too unintuitive and undiscoverable.
Having two types lets us use the nicer operator names.

Tom Lane  wrote:
> it seems like we could consider the traditional
> range functions like lower() and upper() to report on the first or last
> range bound in a multirange

I tried to keep functions/operators similar, so already lower(mr) =
lower(r) and upper(mr) = upper(r).
I think *conceptually* it's good to make ranges & multiranges as
interchangable as possible, but that doesn't mean they have to be the
same type.

Adding multiranges-as-ranges also raises questions about their string
format. If a multirange is {[1,2), [4,5)} would you only print the
curly braces when there is more than one element?

I don't *think* allowing non-contiguous ranges would break how we use
them in GiST indexes or exclusion constraints, but maybe someone can
think of some problem I can't. It's one place to be wary anyway. At
the very least it would make those things slower I expect.

On a few other issues people have raised recently:

Alvaro Herrera  writes:
> I wonder what's the point of multirange arrays. Is there a reason we
> create those?

We have arrays of everything else, so why not have them for
multiranges? We don't have to identify specific use cases here,
although I can see how you'd want to call array_agg/UNNEST on some
multiranges, e.g. (Actually I really want to add an UNNEST that
*takes* a multirange, but that could be a follow-on commit.) If
nothing else I think omitting arrays of multiranges would be a strange
irregularity in the type system.

David G. Johnston  wrote:
> In the tests there is:
>
> +select '{[a,a],[b,b]}'::textmultirange;
> + textmultirange
> +
> + {[a,a],[b,b]}
> +(1 row)
> +
> +-- without canonicalization, we can't join these:
> +select '{[a,a], [b,b]}'::textmultirange;
> + textmultirange
> +
> + {[a,a],[b,b]}
> +(1 row)
> +
>
> Aside from the comment they are identical so I'm confused as to why both 
> tests exist - though I suspect it has to do with the fact that the expected 
> result would be {[a,b]} since text is discrete.

Those tests are for basic string parsing (multirange_in), so one is
testing {A,B} and the other {A, B} (with a space after the comma).
(There are some tests right above those that also have blank spaces,
but they only output a single element in the multirange result.)

David G. Johnston  wrote:
> Also, the current patch set seems a bit undecided on whether it wants to be 
> truly a multi-range or a range that can report non-contiguous components.  
> Specifically,
>
> +select '{[a,d), [b,f]}'::textmultirange;
> + textmultirange
> +
> + {[a,f]}
> +(1 row)

Without a canonicalization function, we can't know that [a,a] touches
[b,b], but we *can* know that [a,d) touches [b,f). Or even:

regression=# select '{[a,b), [b,b]}'::textmultirange;
 textmultirange
 
  {[a,b]}
  (1 r

Re: range_agg

2020-03-10 Thread David Fetter
On Mon, Mar 09, 2020 at 06:34:04PM -0700, Jeff Davis wrote:
> On Sat, 2020-03-07 at 16:06 -0500, Tom Lane wrote:
> > Actually ... have you given any thought to just deciding that ranges
> > and
> > multiranges are the same type? 
> 
> It has come up in a number of conversations, but I'm not sure if it was
> discussed on this list.
> 
> > I think on the whole the advantages win,
> > and I feel like that might also be the case here.
> 
> Some things to think about:
> 
> 1. Ranges are common -- at least implicitly -- in a lot of
> applications/systems. It's pretty easy to represent extrernal data as
> ranges in postgres, and also to represent postgres ranges in external
> systems. But I can see multiranges causing friction around a lot of
> common tasks, like displaying in a UI. If you only expect ranges, you
> can add a CHECK constraint, so this is annoying but not necessarily a
> deal-breaker.

It could become well and truly burdensome in a UI or an API. The
difference between one, as ranges are now, and many, as multi-ranges
would be if we shoehorn them into the range type, are pretty annoying
to deal with.

> 2. There are existing client libraries[1] that support range types and
> transform them to types within the host language. Obviously, those
> would need to be updated to expect multiple ranges.

The type systems that would support such types might get unhappy with
us if we started messing with some of the properties like
contiguousness.

> 3. It seems like we would want some kind of base "range" type. When you
> try to break a multirange down into constituent ranges, what type would
> those pieces be? (Aside: how do you get the constituent ranges?)
> 
> I'm thinking more about casting to see if there's a possible compromise
> there.

I think the right compromise is to recognize that the closure of a set
(ranges) over an operation (set union) may well be a different set
(multi-ranges). Other operations have already been proposed, complete
with concrete use cases that could really make PostgreSQL stand out.

That we don't have an obvious choice of "most correct" operation over
which to close ranges makes it even bigger a potential foot-gun
when we choose one arbitrarily and declare it to be the canonical one.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: range_agg

2020-03-09 Thread Jeff Davis
On Sat, 2020-03-07 at 16:06 -0500, Tom Lane wrote:
> Actually ... have you given any thought to just deciding that ranges
> and
> multiranges are the same type? 

It has come up in a number of conversations, but I'm not sure if it was
discussed on this list.

> I think on the whole the advantages win,
> and I feel like that might also be the case here.

Some things to think about:

1. Ranges are common -- at least implicitly -- in a lot of
applications/systems. It's pretty easy to represent extrernal data as
ranges in postgres, and also to represent postgres ranges in external
systems. But I can see multiranges causing friction around a lot of
common tasks, like displaying in a UI. If you only expect ranges, you
can add a CHECK constraint, so this is annoying but not necessarily a
deal-breaker.

2. There are existing client libraries[1] that support range types and
transform them to types within the host language. Obviously, those
would need to be updated to expect multiple ranges.

3. It seems like we would want some kind of base "range" type. When you
try to break a multirange down into constituent ranges, what type would
those pieces be? (Aside: how do you get the constituent ranges?)

I'm thinking more about casting to see if there's a possible compromise
there.

Regards,
Jeff Davis

[1] 
https://sfackler.github.io/rust-postgres-range/doc/v0.8.2/postgres_range/





Re: range_agg

2020-03-09 Thread Tom Lane
Alvaro Herrera  writes:
> I wonder what's the point of multirange arrays.  Is there a reason we
> create those?

That's what we thought about arrays of composites to start with,
too.

regards, tom lane




Re: range_agg

2020-03-09 Thread Alvaro Herrera
I wonder what's the point of multirange arrays.  Is there a reason we
create those?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: range_agg

2020-03-09 Thread Robert Haas
On Sat, Mar 7, 2020 at 4:06 PM Tom Lane  wrote:
> It's possible that this is a bad idea.  It bears a lot of similarity,
> I guess, to the way that Postgres doesn't consider arrays of different
> dimensionality to be distinct types.  That has some advantages but it
> surely also has downsides.  I think on the whole the advantages win,
> and I feel like that might also be the case here.

Personally, I'm pretty unhappy with the fact that the array system
conflates arrays with different numbers of dimensions. Like, you end
up having to write array_upper(X, 1) instead of just array_upper(X),
and then you're still left wondering whether whatever you wrote is
going to blow up if somebody sneaks a multidimensional array in there,
or for that matter, an array with a non-standard lower bound. There's
lots of little things like that, where the decision to decorate the
array type with these extra frammishes makes it harder to use for
everybody even though most people don't use (or even want) those
features.

So count me as +1 for keeping range and multirange separate.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: range_agg

2020-03-08 Thread David G. Johnston
On Fri, Dec 20, 2019 at 10:43 AM Alvaro Herrera 
wrote:

> I took the liberty of rebasing this series on top of recent branch
> master.
>

In the tests there is:

+select '{[a,a],[b,b]}'::textmultirange;
+ textmultirange
+
+ {[a,a],[b,b]}
+(1 row)
+
+-- without canonicalization, we can't join these:
+select '{[a,a], [b,b]}'::textmultirange;
+ textmultirange
+
+ {[a,a],[b,b]}
+(1 row)
+

Aside from the comment they are identical so I'm confused as to why both
tests exist - though I suspect it has to do with the fact that the expected
result would be {[a,b]} since text is discrete.

Also, the current patch set seems a bit undecided on whether it wants to be
truly a multi-range or a range that can report non-contiguous components.
Specifically,

+select '{[a,d), [b,f]}'::textmultirange;
+ textmultirange
+
+ {[a,f]}
+(1 row)

There is a an argument that a multi-range should output {[a,d),[b,f]}.  IMO
its arguable that a multi-range container should not try and reduce the
number of contained ranges at all.  If that is indeed a desire, which seems
like it is, that feature alone goes a long way to support wanting to just
merge the desired functionality into the existing range type, where the
final output has the minimum number of contiguous ranges possible, rather
than having a separate multirange type.

David J.


Re: range_agg

2020-03-08 Thread Tom Lane
Isaac Morland  writes:
>> so 7. 3. 2020 v 22:20 odesílatel Tom Lane  napsal:
>>> Actually ... have you given any thought to just deciding that ranges and
>>> multiranges are the same type?  That is, any range can now potentially
>>> contain multiple segments?

> Definitely agreed that range and multirange (or whatever it's called)
> should be different. In the work I do I have a number of uses for ranges,
> but not (yet) for multiranges. I want to be able to declare a column as
> range and be sure that it is just a single range, and then call lower() and
> upper() on it and be sure to get just one value in each case; and if I
> accidentally try to take the union of ranges where the union isn’t another
> range, I want to get an error rather than calculate some weird (in my
> context) multirange.

I do not find that argument convincing at all.  Surely you could put
that constraint on your column using "CHECK (numranges(VALUE) <= 1)"
or some such notation.

Also, you're attacking a straw man with respect to lower() and upper();
I did not suggest changing them to return arrays, but rather interpreting
them as returning the lowest or highest endpoint, which I think would be
transparent in most cases.  (There would obviously need to be some other
functions that could dissect a multirange more completely.)

The real problem with the proposal as it stands, I think, is exactly
that range union has failure conditions and you have to use some other
operator if you want to get a successful result always.  That's an
enormously ugly kluge, and if we'd done it right the first time nobody
would have objected.

Bottom line is that I don't think that we should add a pile of new moving
parts to the type system just because people are afraid of change;
arguably, that's *more* change (and more risk of bugs), not less.
Unifying the types would, for example, get rid of the pesky question
of what promoting a range to multirange should look like exactly,
because it'd be a no-op.

regards, tom lane




Re: range_agg

2020-03-07 Thread Isaac Morland
On Sat, 7 Mar 2020 at 16:27, Pavel Stehule  wrote:

>
> so 7. 3. 2020 v 22:20 odesílatel Tom Lane  napsal:
>
>> I wrote:
>> > Actually ... have you given any thought to just deciding that ranges and
>> > multiranges are the same type?  That is, any range can now potentially
>> > contain multiple segments?  That would eliminate a whole lot of the
>> > tedious infrastructure hacking involved in this patch, and let you focus
>> > on the actually-useful functionality.
>>
>>

> I think this behave is correct. Sometimes you should to get only one range
> - and this check is a protection against not continuous range.
>
> if you expect multirange, then do
>
> select '[1,2]'::int4range::multirange + '[4,10)'::int4range;
>

Definitely agreed that range and multirange (or whatever it's called)
should be different. In the work I do I have a number of uses for ranges,
but not (yet) for multiranges. I want to be able to declare a column as
range and be sure that it is just a single range, and then call lower() and
upper() on it and be sure to get just one value in each case; and if I
accidentally try to take the union of ranges where the union isn’t another
range, I want to get an error rather than calculate some weird (in my
context) multirange.

On a related note, I was thinking about this and I don’t think I like
range_agg as a name at all. I know we have array_agg and string_agg but
surely shouldn’t this be called union_agg, and shouldn’t there also be an
intersect_agg? I mean, taking the union isn’t the only possible aggregate
on ranges or multiranges.


Re: range_agg

2020-03-07 Thread David Fetter
On Sat, Mar 07, 2020 at 06:45:44PM -0500, Tom Lane wrote:
> David Fetter  writes:
> > There's another use case not yet covered here that could make this
> > even more complex, we should probably plan for it: multi-ranges
> > with weights.
> 
> I'm inclined to reject that as completely out of scope.  The core
> argument for unifying multiranges with ranges, if you ask me, is to
> make the data type closed under union.  Weights are from some other
> universe.

I don't think they are. SQL databases are super useful because they do
bags in addition to sets, so set union isn't the only, or maybe even
the most important, operation over which ranges ought to be closed.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: range_agg

2020-03-07 Thread Tom Lane
David Fetter  writes:
> There's another use case not yet covered here that could make this
> even more complex, we should probably plan for it: multi-ranges with
> weights.

I'm inclined to reject that as completely out of scope.  The core
argument for unifying multiranges with ranges, if you ask me, is
to make the data type closed under union.  Weights are from some
other universe.

regards, tom lane




Re: range_agg

2020-03-07 Thread David Fetter
On Sat, Mar 07, 2020 at 04:06:32PM -0500, Tom Lane wrote:
> I wrote:
> > However, what I'm on about right at the moment is that I don't think
> > there should be any delta in that test at all.  As far as I can see,
> > the design idea here is that multiranges will be automatically created
> > over range types, and the user doesn't need to do that.  To my mind,
> > that means that they're an implementation detail and should not show up as
> > separately-owned objects, any more than an autogenerated array type does.
> 
> Actually ... have you given any thought to just deciding that ranges and
> multiranges are the same type?  That is, any range can now potentially
> contain multiple segments?  That would eliminate a whole lot of the
> tedious infrastructure hacking involved in this patch, and let you focus
> on the actually-useful functionality.

If we're changing range types rather than constructing a new
multi-range layer atop them, I think it would be helpful to have some
way to figure out quickly whether this new range type was contiguous.
One way to do that would be to include a "range cardinality" in the
data structure which be the number of left ends in it.

One of the things I'd pictured doing with multiranges was along the
lines of a "full coverage" constraint like "During a shift, there can
be no interval that's not covered," which would correspond to a "range
cardinality" of 1.

I confess I'm getting a little twitchy about the idea of eliding the
cases of "one" and "many", though.

> Assuming that that's ok, it seems like we could consider the traditional
> range functions like lower() and upper() to report on the first or last
> range bound in a multirange --- essentially, they ignore any "holes"
> that exist inside the range.  And the new functions for multiranges
> act much like array slicing, in that they give you back pieces of a range
> that aren't actually of a distinct type.

So new functions along the lines of lowers(), uppers(), opennesses(),
etc.? I guess this could be extended as needs emerge.

There's another use case not yet covered here that could make this
even more complex, we should probably plan for it: multi-ranges with
weights.

For example,

SELECT weighted_range_union(r)
FROM (VALUES('[0,1)'::float8range), ('[0,3)'), '('[2,5)')) AS t(r)

would yield something along the lines of:

(([0,1),1), ([1,3),2), ([3,5),1))

and wedging that into the range type seems messy. Each range would
then have a cardinality, and each range within would have a weight,
all of which would be an increasingly heavy burden on the common case
where there's just a single range.

Enhancing a separate multirange type to have weights seems like a
cleaner path forward.

Given that, I'm -1 on mushing multi-ranges into a special case of
ranges, or /vice versa/.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: range_agg

2020-03-07 Thread Pavel Stehule
so 7. 3. 2020 v 22:20 odesílatel Tom Lane  napsal:

> I wrote:
> > Actually ... have you given any thought to just deciding that ranges and
> > multiranges are the same type?  That is, any range can now potentially
> > contain multiple segments?  That would eliminate a whole lot of the
> > tedious infrastructure hacking involved in this patch, and let you focus
> > on the actually-useful functionality.
>
> Also, this would allow us to remove at least one ugly misfeature:
>
> regression=# select '[1,2]'::int4range + '[3,10)'::int4range;
>  ?column?
> --
>  [1,10)
> (1 row)
>
> regression=# select '[1,2]'::int4range + '[4,10)'::int4range;
> ERROR:  result of range union would not be contiguous
>
> If the result of range_union can be a multirange as easily as not,
> we would no longer have to throw an error here.
>

I think this behave is correct. Sometimes you should to get only one range
- and this check is a protection against not continuous range.

if you expect multirange, then do

select '[1,2]'::int4range::multirange + '[4,10)'::int4range;

Regards

Pavel


>
> regards, tom lane
>


Re: range_agg

2020-03-07 Thread Tom Lane
I wrote:
> Actually ... have you given any thought to just deciding that ranges and
> multiranges are the same type?  That is, any range can now potentially
> contain multiple segments?  That would eliminate a whole lot of the
> tedious infrastructure hacking involved in this patch, and let you focus
> on the actually-useful functionality.

Also, this would allow us to remove at least one ugly misfeature:

regression=# select '[1,2]'::int4range + '[3,10)'::int4range;
 ?column? 
--
 [1,10)
(1 row)

regression=# select '[1,2]'::int4range + '[4,10)'::int4range;
ERROR:  result of range union would not be contiguous

If the result of range_union can be a multirange as easily as not,
we would no longer have to throw an error here.

regards, tom lane




Re: range_agg

2020-03-07 Thread Tom Lane
I wrote:
> However, what I'm on about right at the moment is that I don't think
> there should be any delta in that test at all.  As far as I can see,
> the design idea here is that multiranges will be automatically created
> over range types, and the user doesn't need to do that.  To my mind,
> that means that they're an implementation detail and should not show up as
> separately-owned objects, any more than an autogenerated array type does.

Actually ... have you given any thought to just deciding that ranges and
multiranges are the same type?  That is, any range can now potentially
contain multiple segments?  That would eliminate a whole lot of the
tedious infrastructure hacking involved in this patch, and let you focus
on the actually-useful functionality.

It's possible that this is a bad idea.  It bears a lot of similarity,
I guess, to the way that Postgres doesn't consider arrays of different
dimensionality to be distinct types.  That has some advantages but it
surely also has downsides.  I think on the whole the advantages win,
and I feel like that might also be the case here.

The gating requirement for this would be to make sure that a plain
range and a multirange can be told apart by contents.  The first idea that
comes to mind is to repurpose the allegedly-unused RANGE_xB_NULL bits in
the flag byte at the end of the datum.  If one of them is set, then it's a
multirange, and we use a different interpretation of the bytes between the
type OID and the flag byte.

Assuming that that's ok, it seems like we could consider the traditional
range functions like lower() and upper() to report on the first or last
range bound in a multirange --- essentially, they ignore any "holes"
that exist inside the range.  And the new functions for multiranges
act much like array slicing, in that they give you back pieces of a range
that aren't actually of a distinct type.

regards, tom lane




Re: range_agg

2020-03-07 Thread Tom Lane
Alvaro Herrera  writes:
> [ v11 patches ]

The cfbot isn't too happy with this; it's getting differently-ordered
results than you apparently did for the list of owned objects in
dependency.out's DROP OWNED BY test.  Not sure why that should be ---
it seems like af6550d34 should have ensured that there's only one
possible ordering.

However, what I'm on about right at the moment is that I don't think
there should be any delta in that test at all.  As far as I can see,
the design idea here is that multiranges will be automatically created
over range types, and the user doesn't need to do that.  To my mind,
that means that they're an implementation detail and should not show up as
separately-owned objects, any more than an autogenerated array type does.
So somewhere there's a missing bit of code, or more than one missing bit,
to make multiranges act as derived types, the way arrays are.

regards, tom lane




Re: range_agg

2020-03-04 Thread Paul Jungwirth

Thanks for looking at this again!

On 3/4/20 1:33 PM, Alvaro Herrera wrote:

I came across an interesting thing, namely multirange_canonicalize()'s
use of qsort_arg with a callback of range_compare().  range_compare()
calls range_deserialize() (non-trivial parsing) for each input range;
multirange_canonicalize() later does a few extra deserialize calls of
its own.  Call me a premature optimization guy if you will, but I think
it makes sense to have a different struct (let's call it
"InMemoryRange") which stores the parsed representation of each range;
then we can deserialize all ranges up front, and use that as many times
as needed, without having to deserialize each range every time.


I don't know, this sounds like a drastic change. I agree that 
multirange_deserialize and range_deserialize do a lot of copying (not 
really any parsing though, and they both assume their inputs are already 
de-TOASTED). But they are used very extensively, so if you wanted to 
remove them you'd have to rewrite a lot.


I interpreted the intention of range_deserialize to be a way to keep the 
range struct fairly "private" and give a standard interface to 
extracting its attributes. Its motive seems akin to deconstruct_array. 
So I wrote multirange_deserialize to follow that principle. Both 
functions also handle memory alignment issues for you. With 
multirange_deserialize, there isn't actually much structure (just the 
list of ranges), so perhaps you could more easily omit it and give 
callers direct access into the multirange contents. That still seems 
risky though, and less well encapsulated.


My preference would be to see if these functions are really a 
performance problem first, and only redo the in-memory structures if 
they are. Also that seems like something you could do as a separate 
project. (I wouldn't mind working on it myself, although I'd prefer to 
do actual temporal database features first.) There are no 
backwards-compatibility concerns to changing the in-memory structure, 
right? (Even if there are, it's too late to avoid them for ranges.)



While I'm at this, why not name the new file simply multiranges.c
instead of multirangetypes.c?


As someone who doesn't do a lot of Postgres hacking, I tried to follow 
the approach in rangetypes.c as closely as I could, especially for 
naming things. So I named the file multirangetypes.c because there was 
already rangetypes.c. But also I can see how the "types" emphasizes that 
ranges and multiranges are not concrete types themselves, but more like 
abstract data types or generics (like arrays).


Yours,

--
Paul  ~{:-)
p...@illuminatedcomputing.com




Re: range_agg

2020-03-04 Thread Alvaro Herrera
I came across an interesting thing, namely multirange_canonicalize()'s
use of qsort_arg with a callback of range_compare().  range_compare()
calls range_deserialize() (non-trivial parsing) for each input range;
multirange_canonicalize() later does a few extra deserialize calls of
its own.  Call me a premature optimization guy if you will, but I think
it makes sense to have a different struct (let's call it
"InMemoryRange") which stores the parsed representation of each range;
then we can deserialize all ranges up front, and use that as many times
as needed, without having to deserialize each range every time.

While I'm at this, why not name the new file simply multiranges.c
instead of multirangetypes.c?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: range_agg

2020-01-23 Thread Pavel Stehule
Hi

st 22. 1. 2020 v 0:55 odesílatel Paul A Jungwirth <
p...@illuminatedcomputing.com> napsal:

> On Sun, Jan 19, 2020 at 9:57 PM Paul A Jungwirth
>  wrote:
> > On Sun, Jan 19, 2020 at 4:38 PM Tom Lane  wrote:
> > > True for casts involving concrete types, mainly because we'd like
> > > the identity "value::typename == typename(value)" to hold without
> > > too much worry about whether the latter is a plain function call
> > > or a special case.  Not sure whether it makes as much sense for
> > > polymorphics, since casting to a polymorphic type is pretty silly:
> > > we do seem to allow you to do that, but it's a no-op.
> > >
> > > ...
> > >
> > > Alternatively, consider this: a cast from some concrete multirange type
> > > to anymultirange is a no-op, while any other sort of cast probably
> ought
> > > to be casting to some particular concrete multirange type.  That would
> > > line up with the existing operations for plain ranges.
> >
> > I agree you wouldn't actually cast by saying x::anymultirange, and the
> > casts we define are already concrete, so instead you'd say
> > x::int4multirange. But I think having a polymorphic function to
> > convert from an anyrange to an anymultirange is useful so you can
> > write generic functions. I can see how calling it "anymultirange" may
> > be preferring the implementor perspective over the user perspective
> > though, and how simply "multirange" would be more empathetic. I don't
> > mind taking that approach.
>
> Here is a patch with anymultirange(anyrange) renamed to
> multirange(anyrange). I also rebased on the latest master, added
> documentation about the multirange(anyrange) function, and slightly
> adjusted the formatting of the range functions table.
>

I think so this patch is ready for commiter.

All tests passed, the doc is good enough (the chapter name "Range functions
and Operators" should be renamed to "Range/multirange functions and
Operators"
The code formatting and comments looks well


Thank you for your work

Regards

Pavel


> Thanks,
> Paul
>


Re: range_agg

2020-01-19 Thread Paul A Jungwirth
On Sun, Jan 19, 2020 at 4:38 PM Tom Lane  wrote:
> True for casts involving concrete types, mainly because we'd like
> the identity "value::typename == typename(value)" to hold without
> too much worry about whether the latter is a plain function call
> or a special case.  Not sure whether it makes as much sense for
> polymorphics, since casting to a polymorphic type is pretty silly:
> we do seem to allow you to do that, but it's a no-op.
>
> ...
>
> Alternatively, consider this: a cast from some concrete multirange type
> to anymultirange is a no-op, while any other sort of cast probably ought
> to be casting to some particular concrete multirange type.  That would
> line up with the existing operations for plain ranges.

I agree you wouldn't actually cast by saying x::anymultirange, and the
casts we define are already concrete, so instead you'd say
x::int4multirange. But I think having a polymorphic function to
convert from an anyrange to an anymultirange is useful so you can
write generic functions. I can see how calling it "anymultirange" may
be preferring the implementor perspective over the user perspective
though, and how simply "multirange" would be more empathetic. I don't
mind taking that approach.

Yours,
Paul




Re: range_agg

2020-01-19 Thread Pavel Stehule
po 20. 1. 2020 v 1:38 odesílatel Tom Lane  napsal:

> Paul A Jungwirth  writes:
> > On Sun, Jan 19, 2020 at 12:10 AM Pavel Stehule 
> wrote:
> >> Now, I think so name "anymultirange" is not good. Maybe better name is
> just "multirange"
>
> > Are you sure? This function exists to be a cast to an anymultirange,
> > and I thought the convention was to name cast functions after their
> > destination type.
>
> True for casts involving concrete types, mainly because we'd like
> the identity "value::typename == typename(value)" to hold without
> too much worry about whether the latter is a plain function call
> or a special case.  Not sure whether it makes as much sense for
> polymorphics, since casting to a polymorphic type is pretty silly:
> we do seem to allow you to do that, but it's a no-op.
>
> I'm a little troubled by the notion that what you're talking about
> here is not a no-op (if it were, you wouldn't need a function).
> That seems like there's something fundamentally not quite right
> either with the design or with how you're thinking about it.
>

I thinking about completeness of operations

I can to write

CREATE OR REPLACE FUNCTION fx(anyarray, anyelement)
RETURNS anyarray AS $$
SELECT $1 || ARRAY[$2]
$$ LANGUAGE sql;

I need to some functionality for moving a value to different category (it
is more generic than casting to specific type (that can hold category)

CREATE OR REPLACE FUNCTION fx(anymultirange, anyrange)
RETURNS anyrage AS $$
SELECT $1 + multirange($1)
$$ LANGUAGE sql;

is just a analogy.

Regards

Pavel


> As a comparison point, we sometimes describe subscripting as
> being a polymorphic operation like
>
> subscript(anyarray, integer) returns anyelement
>
> It would be completely unhelpful to call that anyelement().
> I feel like you might be making a similar mistake here.
>
> Alternatively, consider this: a cast from some concrete multirange type
> to anymultirange is a no-op, while any other sort of cast probably ought
> to be casting to some particular concrete multirange type.  That would
> line up with the existing operations for plain ranges.
>





> regards, tom lane
>


Re: range_agg

2020-01-19 Thread Tom Lane
Paul A Jungwirth  writes:
> On Sun, Jan 19, 2020 at 12:10 AM Pavel Stehule  
> wrote:
>> Now, I think so name "anymultirange" is not good. Maybe better name is just 
>> "multirange"

> Are you sure? This function exists to be a cast to an anymultirange,
> and I thought the convention was to name cast functions after their
> destination type.

True for casts involving concrete types, mainly because we'd like
the identity "value::typename == typename(value)" to hold without
too much worry about whether the latter is a plain function call
or a special case.  Not sure whether it makes as much sense for
polymorphics, since casting to a polymorphic type is pretty silly:
we do seem to allow you to do that, but it's a no-op.

I'm a little troubled by the notion that what you're talking about
here is not a no-op (if it were, you wouldn't need a function).
That seems like there's something fundamentally not quite right
either with the design or with how you're thinking about it.

As a comparison point, we sometimes describe subscripting as
being a polymorphic operation like

subscript(anyarray, integer) returns anyelement

It would be completely unhelpful to call that anyelement().
I feel like you might be making a similar mistake here.

Alternatively, consider this: a cast from some concrete multirange type
to anymultirange is a no-op, while any other sort of cast probably ought
to be casting to some particular concrete multirange type.  That would
line up with the existing operations for plain ranges.

regards, tom lane




Re: range_agg

2020-01-19 Thread Paul A Jungwirth
On Sun, Jan 19, 2020 at 12:10 AM Pavel Stehule  wrote:
> Now, I think so name "anymultirange" is not good. Maybe better name is just 
> "multirange"

Are you sure? This function exists to be a cast to an anymultirange,
and I thought the convention was to name cast functions after their
destination type. I can change it, but in my opinion anymultirange
follows the Postgres conventions better. But just let me know and I'll
do "multirange" instead!

Yours,
Paul




Re: range_agg

2020-01-19 Thread Pavel Stehule
so 18. 1. 2020 v 17:35 odesílatel Pavel Stehule 
napsal:

>
>
> so 18. 1. 2020 v 17:07 odesílatel Paul A Jungwirth <
> p...@illuminatedcomputing.com> napsal:
>
>> On Sat, Jan 18, 2020 at 7:20 AM Pavel Stehule 
>> wrote:
>> > Can be nice to have a polymorphic function
>> >
>> > multirange(anymultirange, anyrange) returns anymultirange. This
>> functions should to do multirange from $2 to type $1
>> >
>> > It can enhance to using polymorphic types and simplify casting.
>>
>> Thanks for taking another look! I actually have that already but it is
>> named anymultirange:
>>
>> regression=# select anymultirange(int4range(1,2));
>>  anymultirange
>> ---
>>  {[1,2)}
>> (1 row)
>>
>> Will that work for you?
>>
>
> It's better than I though
>
>
>> I think I only wrote that to satisfy some requirement of having an
>> anymultirange type, but I agree it could be useful. (I even used it in
>> the regress tests.) Maybe it's worth documenting too?
>>
>
Now, I think so name "anymultirange" is not good. Maybe better name is just
"multirange"



> yes
>
>
>> > when I tried to write this function in plpgsql I got
>> >
>> > create or replace function multirange(anymultirange, anyrange) returns
>> anymultirange as $$
>> > begin
>> >   execute format('select $2::%I', pg_typeof($1)) into $1;
>> >   return $1;
>> > end;
>> > $$ language plpgsql immutable strict;
>> >
>> > ERROR:  unrecognized typtype: 109
>> > CONTEXT:  compilation of PL/pgSQL function "multirange" near line 1
>>
>> Hmm, I'll add a test for that and see if I can find the problem.
>>
>
> ok
>
>
>> Thanks!
>> Paul
>>
>


Re: range_agg

2020-01-18 Thread Pavel Stehule
so 18. 1. 2020 v 17:07 odesílatel Paul A Jungwirth <
p...@illuminatedcomputing.com> napsal:

> On Sat, Jan 18, 2020 at 7:20 AM Pavel Stehule 
> wrote:
> > Can be nice to have a polymorphic function
> >
> > multirange(anymultirange, anyrange) returns anymultirange. This
> functions should to do multirange from $2 to type $1
> >
> > It can enhance to using polymorphic types and simplify casting.
>
> Thanks for taking another look! I actually have that already but it is
> named anymultirange:
>
> regression=# select anymultirange(int4range(1,2));
>  anymultirange
> ---
>  {[1,2)}
> (1 row)
>
> Will that work for you?
>

It's better than I though


> I think I only wrote that to satisfy some requirement of having an
> anymultirange type, but I agree it could be useful. (I even used it in
> the regress tests.) Maybe it's worth documenting too?
>

yes


> > when I tried to write this function in plpgsql I got
> >
> > create or replace function multirange(anymultirange, anyrange) returns
> anymultirange as $$
> > begin
> >   execute format('select $2::%I', pg_typeof($1)) into $1;
> >   return $1;
> > end;
> > $$ language plpgsql immutable strict;
> >
> > ERROR:  unrecognized typtype: 109
> > CONTEXT:  compilation of PL/pgSQL function "multirange" near line 1
>
> Hmm, I'll add a test for that and see if I can find the problem.
>

ok


> Thanks!
> Paul
>


Re: range_agg

2020-01-18 Thread Paul A Jungwirth
On Sat, Jan 18, 2020 at 7:20 AM Pavel Stehule  wrote:
> Can be nice to have a polymorphic function
>
> multirange(anymultirange, anyrange) returns anymultirange. This functions 
> should to do multirange from $2 to type $1
>
> It can enhance to using polymorphic types and simplify casting.

Thanks for taking another look! I actually have that already but it is
named anymultirange:

regression=# select anymultirange(int4range(1,2));
 anymultirange
---
 {[1,2)}
(1 row)

Will that work for you?

I think I only wrote that to satisfy some requirement of having an
anymultirange type, but I agree it could be useful. (I even used it in
the regress tests.) Maybe it's worth documenting too?

> when I tried to write this function in plpgsql I got
>
> create or replace function multirange(anymultirange, anyrange) returns 
> anymultirange as $$
> begin
>   execute format('select $2::%I', pg_typeof($1)) into $1;
>   return $1;
> end;
> $$ language plpgsql immutable strict;
>
> ERROR:  unrecognized typtype: 109
> CONTEXT:  compilation of PL/pgSQL function "multirange" near line 1

Hmm, I'll add a test for that and see if I can find the problem.

Thanks!
Paul




Re: range_agg

2020-01-18 Thread Pavel Stehule
pá 17. 1. 2020 v 21:08 odesílatel Paul A Jungwirth <
p...@illuminatedcomputing.com> napsal:

> On Fri, Jan 10, 2020 at 1:38 AM Pavel Stehule 
> wrote:
> >> This still leaves the question of how best to format the docs for
> >> these operators. I like being able to combine all the <@ variations
> >> (e.g.) into one table row, but if that is too ugly I could give them
> >> separate rows instead. Giving them all their own row consumes a lot of
> >> vertical space though, and to me that makes the docs more tedious to
> >> read & browse, so it's harder to grasp all the available range-related
> >> operations at a glance.
> >
> >
> > I have similar opinion - maybe is better do documentation for range and
> multirange separately. Sometimes there are still removed operators @+
>
> I like keeping the range/multirange operators together since they are
> so similar for both types, but if others disagree I'd be grateful for
> more feedback.
>

ok

>
> You're right that I left in a few references to the old @+ style
> operators in the examples; I've fixed those.
>
> > If you can share TYPTYPE_RANGE in code for multiranges, then it should
> be 'r'. If not, then it needs own value.
>
> Okay. I think a new 'm' value is warranted because they are not
> interchangeable.
>
> >> I experimented with setting pg_type.typelem to the multirange's range
> >> type, but it seemed to break a lot of things, and reading the code I
> >> saw some places that treat a non-zero typelem as synonymous with being
> >> an array. So I'm reluctant to make this change also, especially when
> >> it is just as easy to query pg_range to get a multirange's range type.
> >
> >
> > ok, it is unhappy, but it is true. This note should be somewhere in
> code, please
>
> I've added a comment about this. I put it at the top of DefineRange
> but let me know if that's the wrong place.
>
> The attached file is also rebased on currrent master.
>

Can be nice to have a polymorphic function

multirange(anymultirange, anyrange) returns anymultirange. This functions
should to do multirange from $2 to type $1

It can enhance to using polymorphic types and simplify casting.

Usage

CREATE OR REPLACE FUNCTION diff(anymultirange, anyrange)
RETURNS anymultirange AS $$
  SELECT $1 - multirange($1, $2)
$$ LANGUAGE sql;

when I tried to write this function in plpgsql I got

create or replace function multirange(anymultirange, anyrange) returns
anymultirange as $$
begin
  execute format('select $2::%I', pg_typeof($1)) into $1;
  return $1;
end;
$$ language plpgsql immutable strict;

ERROR:  unrecognized typtype: 109
CONTEXT:  compilation of PL/pgSQL function "multirange" near line 1

So probably some support in PL is missing

But all others looks very well

Regards

Pavel




>
> Thanks!
> Paul
>


Re: range_agg

2020-01-10 Thread Pavel Stehule
Hi

so 4. 1. 2020 v 6:29 odesílatel Paul A Jungwirth <
p...@illuminatedcomputing.com> napsal:

> On Fri, Dec 20, 2019 at 11:29 AM Alvaro Herrera
>  wrote:
> > > Should I just give up on implicit casts and require you to specify
> > > one? That makes it a little more annoying to mix range & multirange
> > > types, but personally I'm okay with that. This is my preferred
> > > approach.
> >
> > +1
>
> Here is a patch adding the casts, rebasing on the latest master, and
> incorporating Alvaro's changes. Per his earlier suggestion I combined
> it all into one patch file, which also makes it easier to apply
> rebases & updates.
>

This patch was applied cleanly and all tests passed



>
> My work on adding casts also removes the @+ / @- / @* operators and
> adds + / - / * operators where both parameters are multiranges. I
> retained other operators with mixed range/multirange parameters, both
> because there are already range operators with mixed range/scalar
> parameters (e.g. <@), and because it seemed like the objection to @+ /
> @- / @* was not mixed parameters per se, but rather their
> unguessability. Since the other operators are the same as the existing
> range operators, they don't share that problem.
>

looks well


>
> This still leaves the question of how best to format the docs for
> these operators. I like being able to combine all the <@ variations
> (e.g.) into one table row, but if that is too ugly I could give them
> separate rows instead. Giving them all their own row consumes a lot of
> vertical space though, and to me that makes the docs more tedious to
> read & browse, so it's harder to grasp all the available range-related
> operations at a glance.
>

I have similar opinion - maybe is better do documentation for range and
multirange separately. Sometimes there are still removed operators @+


> I'm skeptical of changing pg_type.typtype from 'm' to 'r'. A
> multirange isn't a range, so why should we give it the same type? Also
> won't this break any queries that are using that column to find range
> types? What is the motivation to use the same typtype for both ranges
> and multiranges? (There is plenty I don't understand here, e.g. why we
> have both typtype and typcategory, so maybe there is a good reason I'm
> missing.)
>

If you can share TYPTYPE_RANGE in code for multiranges, then it should be
'r'. If not, then it needs own value.


> I experimented with setting pg_type.typelem to the multirange's range
> type, but it seemed to break a lot of things, and reading the code I
> saw some places that treat a non-zero typelem as synonymous with being
> an array. So I'm reluctant to make this change also, especially when
> it is just as easy to query pg_range to get a multirange's range type.
>

ok, it is unhappy, but it is true. This note should be somewhere in code,
please


> Also range types themselves don't set typelem to their base type, and
> it seems like we'd want to treat ranges and multiranges the same way
> here.
>
> Alvaro also suggested renaming pg_range.mltrngtypid to
> pg_range.rngmultitypid, so it shares the same "rng" prefix as the
> other columns in this table. Having a different prefix does stand out.
> I've included that change in this patch too.
>

Personally I have not any comments to implemented functionality.

>
> Yours,
> Paul
>


Re: range_agg

2019-12-20 Thread Tom Lane
Alvaro Herrera  writes:
> On 2019-Dec-20, Paul A Jungwirth wrote:
>> Is it permitted to add casts with polymorphic inputs & outputs? Is
>> that something that we would actually want to do? I'd probably need
>> both the polymorphic and concrete casts so that you could still say
>> `int4range(1,2)::int4multirange`.

> I'm embarrased to admit that I don't grok the type system well enough
> (yet) to answer this question.

I would say no; if you want behavior like that you'd have to add code for
it into the coercion machinery, much like the casts around, say, types
record and record[] versus named composites and arrays of same.  Expecting
the generic cast machinery to do the right thing would be foolhardy.

In any case, even if it did do the right thing, you'd still need
some additional polymorphic type to express the behavior you wanted,
no?  So it's not clear there'd be any net savings of effort.

regards, tom lane




Re: range_agg

2019-12-20 Thread Alvaro Herrera
On 2019-Dec-20, Paul A Jungwirth wrote:

> Is it permitted to add casts with polymorphic inputs & outputs? Is
> that something that we would actually want to do? I'd probably need
> both the polymorphic and concrete casts so that you could still say
> `int4range(1,2)::int4multirange`.

I'm embarrased to admit that I don't grok the type system well enough
(yet) to answer this question.

> Should I change the coerce code to look for casts among concrete types
> when the function has polymorphic types? I'm pretty scared to do
> something like that though, both because of the complexity and lest I
> cause unintended effects.

Yeah, I suggest to stay away from that.  I think this multirange thing
is groundbreaking enough that we don't need to cause additional
potential breakage.

> Should I just give up on implicit casts and require you to specify
> one? That makes it a little more annoying to mix range & multirange
> types, but personally I'm okay with that. This is my preferred
> approach.

+1

> I have some time over the holidays to work on the other changes Alvaro
> has suggested.

I hope not to have made things worse by posting a rebase.  Anyway,
that's the reason I posted my other changes separately.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: range_agg

2019-12-20 Thread Paul A Jungwirth
On Fri, Dec 20, 2019 at 10:19 AM Pavel Stehule  wrote:
> I had a talk with Paul about possible simplification of designed operators. 
> Last message from Paul was - he is working on new version.

Thanks Alvaro & Pavel for helping move this forward. I've added the
casts but they aren't used automatically for things like
`range_union(r, mr)` or `mr + r`, even though they are implicit.
That's because the casts are for concrete types (e.g. int4range ->
int4multirange) but the functions & operators are for polymorphic
types (anymultirange + anymultirange). So I'd like to get some
feedback about the best way to proceed.

Is it permitted to add casts with polymorphic inputs & outputs? Is
that something that we would actually want to do? I'd probably need
both the polymorphic and concrete casts so that you could still say
`int4range(1,2)::int4multirange`.

Should I change the coerce code to look for casts among concrete types
when the function has polymorphic types? I'm pretty scared to do
something like that though, both because of the complexity and lest I
cause unintended effects.

Should I just give up on implicit casts and require you to specify
one? That makes it a little more annoying to mix range & multirange
types, but personally I'm okay with that. This is my preferred
approach.

I have some time over the holidays to work on the other changes Alvaro
has suggested.

Thanks,
Paul




Re: range_agg

2019-12-20 Thread Alvaro Herrera
On 2019-Dec-20, Alvaro Herrera wrote:

> I am not convinced that adding TYPTYPE_MULTIRANGE is really necessary.
> Why can't we just treat those types as TYPTYPE_RANGE and distinguish
> them using TYPCATEGORY_MULTIRANGE?  That's what we do for arrays.  I'll
> try to do that next.

I think this can be simplified if we make the the multirange's
pg_type.typelem carry the base range's OID (the link in the other
direction already appears as pg_range.mltrngtypid, though I'd recommend
renaming that to pg_range.rngmultitypid to maintain the "rng" prefix
convention).  Then we can distinguish a multirange from a plain range
easily, both of which have typtype as TYPTYPE_RANGE, because typelem !=
0 in a multi.  That knowledge can be encapsulated easily in
type_is_multirange and pg_dump's getTypes.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: range_agg

2019-12-20 Thread Pavel Stehule
pá 20. 12. 2019 v 18:43 odesílatel Alvaro Herrera 
napsal:

> I took the liberty of rebasing this series on top of recent branch
> master.  The first four are mostly Paul's originals, except for conflict
> fixes; the rest are changes I'm proposing as I go along figuring out the
> whole thing.  (I would post just my proposed changes, if it weren't for
> the rebasing; apologies for the messiness.)
>
> I am not convinced that adding TYPTYPE_MULTIRANGE is really necessary.
> Why can't we just treat those types as TYPTYPE_RANGE and distinguish
> them using TYPCATEGORY_MULTIRANGE?  That's what we do for arrays.  I'll
> try to do that next.
>
> I think the algorithm for coming up with the multirange name is
> suboptimal.  It works fine with the name is short enough that we can add
> a few extra letters, but otherwise the result look pretty silly.  I
> think we can still improve on that.  I propose to make
> makeUniqueTypeName accept a suffix, and truncate the letters that appear
> *before* the suffix rather than truncating after it's been appended.
>
> There's a number of ereport() calls that should become elog(); and a
> bunch of others that should probably acquire errcode() and be
> reformatted per our style.
>
>
> Regarding Pavel's documentation markup issue,
>
> > I am not sure how much is correct to use  class="monospaced">
> > in doc. It is used for ranges, and multiranges, but no in other places
>
> I looked at the generated PDF and the table looks pretty bad; the words
> in those entries overlap the words in the cell to their right.  But that
> also happens with entries that do not use !
> See [1] for an example of the existing docs being badly formatted.  The
> docbook documentation [2] seems to suggest that what Paul used is the
> appropriate way to do this.
>
> Maybe a way is to make each entry have more than one row -- so the
> example would appear below the other three fields in its own row, and
> would be able to use the whole width of the table.
>

I had a talk with Paul about possible simplification of designed operators.
Last message from Paul was - he is working on new version.

Regards

Pavel



> [1] https://twitter.com/alvherre/status/1205563468595781633
> [2] https://tdg.docbook.org/tdg/5.1/literallayout.html
>
> --
> Álvaro Herrerahttps://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: range_agg

2019-11-29 Thread Alvaro Herrera
Hi Paul

I'm starting to look at this again.  Here's a few proposed changes to
the current code as I read along.

I noticed that 0001 does not compile on its own.  It works as soon as I
add 0002.  What this is telling me is that the current patch split is
not serving any goals; I think it's okay to merge them all into a single
commit.  If you want to split it in smaller pieces, it needs to be stuff
that can be committed separately.

One possible candidate for that is the new makeUniqueTypeName function
you propose.  I added this comment to explain what it does:

 /*
- * makeUniqueTypeName: Prepend underscores as needed until we make a name that
- * doesn't collide with anything. Tries the original typeName if requested.
+ * makeUniqueTypeName
+ * Generate a unique name for a prospective new type
+ *
+ * Given a typeName of length namelen, produce a new name into dest (an output
+ * buffer allocated by caller, which must of length NAMEDATALEN) by prepending
+ * underscores, until a non-conflicting name results.
+ *
+ * If tryOriginalName, first try with zero underscores.
  *
  * Returns the number of underscores added.
  */

This seems a little too strange; why not have the function allocate its
output buffer instead, and return it?  In order to support the case of
it failing to find an appropriate name, have it return NULL, for caller
to throw the "could not form ... name" error.

The attached 0001 simplifies makeMultirangeConstructors; I think it was
too baroque just because of it trying to imitate makeRangeConstructors;
it seems easier to just repeat the ProcedureCreate call than trying to
be clever.  (makeRangeConstructors's comment is lying about the number
of constructors it creates after commit df73584431e7, BTW.  But note
that the cruft in it to support doing it twice is not as much as in the
new code).

The other patches should be self-explanatory (I already submitted 0002
previously.)

I'll keep going over the rest of it.  Thanks!

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 6a5b230f741fb4272770d09750b57fdad225027e Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 28 Nov 2019 19:08:43 -0300
Subject: [PATCH 1/6] Simplify makeMultirangeConstructors

---
 src/backend/commands/typecmds.c | 134 +++-
 1 file changed, 80 insertions(+), 54 deletions(-)

diff --git a/src/backend/commands/typecmds.c b/src/backend/commands/typecmds.c
index 38948a049b..1b012c9cad 100644
--- a/src/backend/commands/typecmds.c
+++ b/src/backend/commands/typecmds.c
@@ -1784,70 +1784,96 @@ static void
 makeMultirangeConstructors(const char *name, Oid namespace,
 		   Oid multirangeOid, Oid rangeArrayOid)
 {
-	static const char *const prosrc[2] = {"multirange_constructor0",
-	"multirange_constructor1"};
-	static const int pronargs[2] = {0, 1};
-
-	Oid			constructorArgTypes = rangeArrayOid;
 	ObjectAddress myself,
 referenced;
-	int			i;
-
-	Datum		allParamTypes[1] = {ObjectIdGetDatum(rangeArrayOid)};
-	ArrayType  *allParameterTypes = construct_array(allParamTypes, 1, OIDOID,
-	sizeof(Oid), true, 'i');
-	Datum		constructorAllParamTypes[2] = {PointerGetDatum(NULL), PointerGetDatum(allParameterTypes)};
-
-	Datum		paramModes[1] = {CharGetDatum(FUNC_PARAM_VARIADIC)};
-	ArrayType  *parameterModes = construct_array(paramModes, 1, CHAROID,
- 1, true, 'c');
-	Datum		constructorParamModes[2] = {PointerGetDatum(NULL), PointerGetDatum(parameterModes)};
+	oidvector  *argtypes;
+	Datum		allParamTypes;
+	ArrayType  *allParameterTypes;
+	Datum		paramModes;
+	ArrayType  *parameterModes;
 
 	referenced.classId = TypeRelationId;
 	referenced.objectId = multirangeOid;
 	referenced.objectSubId = 0;
 
-	for (i = 0; i < lengthof(prosrc); i++)
-	{
-		oidvector  *constructorArgTypesVector;
+	argtypes = buildoidvector(NULL, 0);
+	myself = ProcedureCreate(name,	/* name: same as multirange type */
+			 namespace,
+			 false, /* replace */
+			 false, /* returns set */
+			 multirangeOid, /* return type */
+			 BOOTSTRAP_SUPERUSERID, /* proowner */
+			 INTERNALlanguageId,	/* language */
+			 F_FMGR_INTERNAL_VALIDATOR,
+			 "multirange_constructor0", /* prosrc */
+			 NULL,	/* probin */
+			 PROKIND_FUNCTION,
+			 false, /* security_definer */
+			 false, /* leakproof */
+			 false, /* isStrict */
+			 PROVOLATILE_IMMUTABLE, /* volatility */
+			 PROPARALLEL_SAFE,	/* parallel safety */
+			 argtypes,	/* parameterTypes */
+			 PointerGetDatum(NULL), /* allParameterTypes */
+			 PointerGetDatum(NULL), /* parameterModes */
+			 PointerGetDatum(NULL), /* parameterNames */
+			 NIL,	/* parameterDefaults */
+			 PointerGetDatum(NULL), /* trftypes */
+			 PointerGetDatum(NULL), /* proconfig */
+			 InvalidOid,	/* prosupport */
+			 1.0,	/* procost */
+			 0.0);	/* prorows */
 
-		constructorArgTypesVector

Re: range_agg

2019-11-22 Thread Pavel Stehule
pá 22. 11. 2019 v 17:25 odesílatel Paul A Jungwirth <
p...@illuminatedcomputing.com> napsal:

> On Thu, Nov 21, 2019 at 9:21 PM Pavel Stehule 
> wrote:
> > I though about it, and I think so cast from multirange to range is
> useless, minimally it should be explicit.
>
> I agree: definitely not implicit. If I think of a good reason for it
> I'll add it, but otherwise I'll leave it out.
>
> > On second hand - from range to multirange should be implicit.
>
> Okay.
>
> > The original patch did
> >
> > 1. MR @x MR = MR
> > 2. R @x R = MR
> > 3. MR @x R = MR
> >
> > I think so @1 & @3 has sense, but without introduction of special
> operator. @2 is bad and can be solved by cast one or second operand.
>
> Yes. I like how #2 follows the int/numeric analogy: if you want a
> numeric result from `int / int` you can say `int::numeric / int`.
>
> So my understanding is that conventionally cast functions are named
> after the destination type, e.g. int8multirange(int8range) would be
> the function to cast an int8range to an int8multirange. And
> int8range(int8multirange) would go the other way (if we do that). We
> already use these names for the "constructor" functions, but I think
> that is actually okay. For the multirange->range cast, the parameter
> type & number are different, so there is no real conflict. For the
> range->multirange cast, the parameter type is the same, and the
> constructor function is variadic---but I think that's fine, because
> the semantics are the same: build a multirange whose only element is
> the given range:
>
> regression=# select int8multirange(int8range(1,2));
>  int8multirange
> 
>  {[1,2)}
> (1 row)
>
> Even the NULL handling is already what we want:
>
> regression=# select int8multirange(null);
>  int8multirange
> 
>  NULL
> (1 row)
>
> So I think it's fine, but I'm curious whether you see any problems
> there? (I guess if there is a problem it's no big deal to name the
> function something else)
>

It looks well now. I am not sure about benefit of cast from MR to R if MR
has more than one values. But it can be there for completeness.

I think in this moment is not important to implement all functionality -
for start is good to implement basic functionality that can be good. It can
be enhanced step by step in next versions.

Pavel

>
> Thanks,
> Paul
>


Re: range_agg

2019-11-22 Thread Paul A Jungwirth
On Thu, Nov 21, 2019 at 9:21 PM Pavel Stehule  wrote:
> I though about it, and I think so cast from multirange to range is useless, 
> minimally it should be explicit.

I agree: definitely not implicit. If I think of a good reason for it
I'll add it, but otherwise I'll leave it out.

> On second hand - from range to multirange should be implicit.

Okay.

> The original patch did
>
> 1. MR @x MR = MR
> 2. R @x R = MR
> 3. MR @x R = MR
>
> I think so @1 & @3 has sense, but without introduction of special operator. 
> @2 is bad and can be solved by cast one or second operand.

Yes. I like how #2 follows the int/numeric analogy: if you want a
numeric result from `int / int` you can say `int::numeric / int`.

So my understanding is that conventionally cast functions are named
after the destination type, e.g. int8multirange(int8range) would be
the function to cast an int8range to an int8multirange. And
int8range(int8multirange) would go the other way (if we do that). We
already use these names for the "constructor" functions, but I think
that is actually okay. For the multirange->range cast, the parameter
type & number are different, so there is no real conflict. For the
range->multirange cast, the parameter type is the same, and the
constructor function is variadic---but I think that's fine, because
the semantics are the same: build a multirange whose only element is
the given range:

regression=# select int8multirange(int8range(1,2));
 int8multirange

 {[1,2)}
(1 row)

Even the NULL handling is already what we want:

regression=# select int8multirange(null);
 int8multirange

 NULL
(1 row)

So I think it's fine, but I'm curious whether you see any problems
there? (I guess if there is a problem it's no big deal to name the
function something else)

Thanks,
Paul




Re: range_agg

2019-11-21 Thread Pavel Stehule
čt 21. 11. 2019 v 21:15 odesílatel Paul Jungwirth <
p...@illuminatedcomputing.com> napsal:

> On 11/21/19 1:06 AM, Pavel Stehule wrote:
> > 2. I don't like introduction "safe" operators - now the basic operators
> > are doubled, and nobody without documentation will use @* operators.
> >
> > It is not intuitive. I think is better to map this functionality to
> > basic operators +- * and implement it just for pairs (Multirange,
> > Multirange) and (Multirange, Range) if it is possible
> >
> > It's same relation line Numeric X integer. There should not be
> > introduced new operators. If somebody need it for ranges, then he can
> > use cast to multirange, and can continue.
>  > [snip]
> > 3. There are not prepared casts -
> >
> > postgres=# select int8range(10,15)::int8multirange;
> > ERROR:  cannot cast type int8range to int8multirange
> > LINE 1: select int8range(10,15)::int8multirange;
> > ^
> > There should be some a) fully generic solution, or b) possibility to
> > build implicit cast when any multirange type is created.
>
> Okay, I like the idea of just having `range + range` and `multirange +
> multirange`, then letting you cast between ranges and multiranges. The
> analogy to int/numeric seems strong. I guess if you cast a multirange
> with more than one element to a range it will raise an error. That will
> let me clean up the docs a lot too.
>

I though about it, and I think so cast from multirange to range is useless,
minimally it should be explicit.

On second hand - from range to multirange should be implicit.

The original patch did

1. MR @x MR = MR
2. R @x R = MR
3. MR @x R = MR

I think so @1 & @3 has sense, but without introduction of special operator.
@2 is bad and can be solved by cast one or second operand.

Pavel


> Thanks!
>
> --
> Paul  ~{:-)
> p...@illuminatedcomputing.com
>


Re: range_agg

2019-11-21 Thread Paul Jungwirth

On 11/21/19 1:06 AM, Pavel Stehule wrote:
2. I don't like introduction "safe" operators - now the basic operators 
are doubled, and nobody without documentation will use @* operators.


It is not intuitive. I think is better to map this functionality to 
basic operators +- * and implement it just for pairs (Multirange, 
Multirange) and (Multirange, Range) if it is possible


It's same relation line Numeric X integer. There should not be 
introduced new operators. If somebody need it for ranges, then he can 
use cast to multirange, and can continue.

> [snip]

3. There are not prepared casts -

postgres=# select int8range(10,15)::int8multirange;
ERROR:  cannot cast type int8range to int8multirange
LINE 1: select int8range(10,15)::int8multirange;
                                ^
There should be some a) fully generic solution, or b) possibility to 
build implicit cast when any multirange type is created.


Okay, I like the idea of just having `range + range` and `multirange + 
multirange`, then letting you cast between ranges and multiranges. The 
analogy to int/numeric seems strong. I guess if you cast a multirange 
with more than one element to a range it will raise an error. That will 
let me clean up the docs a lot too.


Thanks!

--
Paul  ~{:-)
p...@illuminatedcomputing.com




Re: range_agg

2019-11-21 Thread Pavel Stehule
st 20. 11. 2019 v 20:32 odesílatel Paul A Jungwirth <
p...@illuminatedcomputing.com> napsal:

> On Tue, Nov 19, 2019 at 9:49 PM Paul A Jungwirth
>  wrote:
> >
> > On Tue, Nov 19, 2019 at 1:17 AM Pavel Stehule 
> wrote:
> > > Hi
> > > I tested last patches. I found some issues
> >
> > Thank you for the review!
>
> Here is an updated patch series fixing the problems from that last
> review. I would still like some direction about the doc formatting:
>
>
yes, these bugs are fixed

there are not compilation's issues
tests passed
doc is ok

I have notes

1. the chapter should be renamed to "Range Functions and Operators" to
"Range and Multirange Functions and Operators"

But now the doc is not well readable - there is not clean, what functions
are for range type, what for multirange and what for both

2. I don't like introduction "safe" operators - now the basic operators are
doubled, and nobody without documentation will use @* operators.

It is not intuitive. I think is better to map this functionality to basic
operators +- * and implement it just for pairs (Multirange, Multirange) and
(Multirange, Range) if it is possible

It's same relation line Numeric X integer. There should not be introduced
new operators. If somebody need it for ranges, then he can use cast to
multirange, and can continue.

The "safe" operators can be implement on user space - but should not be
default solution.

3. There are not prepared casts -

postgres=# select int8range(10,15)::int8multirange;
ERROR:  cannot cast type int8range to int8multirange
LINE 1: select int8range(10,15)::int8multirange;
   ^
There should be some a) fully generic solution, or b) possibility to build
implicit cast when any multirange type is created.

Regards

Pavel




> > > I am not sure how much is correct to use  class="monospaced"> in doc. It is used for ranges, and multiranges, but no
> in other places
> >
> > I could use some advice here. Many operators seem best presented in
> > groups of four, where only their parameter types change, for example:
> >
> > int8range < int8range
> > int8range < int8multirange
> > int8multirange < int8range
> > int8multirange < int8multirange
> >
> > All I really want is to show those separated by line breaks. I
> > couldn't find any other examples of that happening inside a table cell
> > though (i.e. inside ). What is the best way
> > to do that?
>

Personally I think it should be cleaned. Mainly if there is not visible
differences. But range related doc it uses, so it is consistent with it.
And then this is not big issue.



> Thanks,
> Paul
>


Re: range_agg

2019-11-19 Thread Paul A Jungwirth
On Tue, Nov 19, 2019 at 1:17 AM Pavel Stehule  wrote:
> Hi
> I tested last patches. I found some issues

Thank you for the review!

> 1. you should not to try patch catversion.

I've seen discussion on pgsql-hackers going both ways, but I'll leave
it out of future patches. :-)

> 2. there is warning
>
> parse_coerce.c: In function ‘enforce_generic_type_consistency’:
> parse_coerce.c:1975:11: warning: ‘range_typelem’ may be used uninitialized in 
> this function [-Wmaybe-uninitialized]
>  1975 |   else if (range_typelem != elem_typeid)

Fixed locally, will include in my next patch.

> 3. there are problems with pg_upgrade. Regress tests fails
> . . .
> pg_restore: while PROCESSING TOC:
> pg_restore: from TOC entry 1653; 1247 17044 TYPE arrayrange pavel
> pg_restore: error: could not execute query: ERROR:  pg_type array OID value 
> not set when in binary upgrade mode

I see what's going on here. (Sorry if this verbose explanation is
obvious; it's as much for me as for anyone.) With pg_upgrade the
values of pg_type.oid and pg_type.typarray must be the same before &
after. For built-in types there's no problem, because those are fixed
by pg_type.dat. But for user-defined types we have to take extra steps
to make sure they don't change. CREATE TYPE always uses two oids: one
for the type and one for the type's array type. But now when you
create a range type we use *four*: the range type, the array of that
range, the multirange type, and the array of that multirange.
Currently when you run pg_dump in "binary mode" (i.e. as part of
pg_upgrade) it includes calls to special functions to set the next oid
to use for pg_type.oid and pg_type.typarray. Then CREATE TYPE also has
special "binary mode" code to check those variables and use those oids
(e.g. AssignTypeArrayOid). After using them once it sets them back to
InvalidOid so it doesn't keep using them. So I guess I need to add
code to pg_dump so that it also outputs calls to two new special
functions that similarly set the oid to use for the next multirange
and multirange[]. For v12->v13 it will chose high-enough oids like we
do already for arrays of domains. (For other upgrades it will use the
existing value.) And then I can change the CREATE TYPE code to check
those pre-set values when obtaining the next oid. Does that sound like
the right approach here?

> 4. there is a problem with doc
>
> extend.sgml:281: parser error : Opening and ending tag mismatch: para line 
> 270 and type
>  type of the ranges in an anymultirange.

Hmm, yikes, I'll fix that!

> I am not sure how much is correct to use  
> in doc. It is used for ranges, and multiranges, but no in other places

I could use some advice here. Many operators seem best presented in
groups of four, where only their parameter types change, for example:

int8range < int8range
int8range < int8multirange
int8multirange < int8range
int8multirange < int8multirange

All I really want is to show those separated by line breaks. I
couldn't find any other examples of that happening inside a table cell
though (i.e. inside ). What is the best way
to do that?

Thanks,
Paul




Re: range_agg

2019-11-19 Thread Pavel Stehule
Hi

čt 7. 11. 2019 v 3:36 odesílatel Paul A Jungwirth <
p...@illuminatedcomputing.com> napsal:

> On Wed, Nov 6, 2019 at 3:02 PM Paul A Jungwirth
>  wrote:
> > On Thu, Sep 26, 2019 at 2:13 PM Alvaro Herrera 
> wrote:
> > > Hello Paul, I've started to review this patch.  Here's a few minor
> > > things I ran across -- mostly compiler warnings (is my compiler too
> > > ancient?).
> > I just opened this thread to post a rebased set patches (especially
> > because of the `const` additions to range functions). Maybe it's not
> > that helpful since they don't include your changes yet but here they
> > are anyway. I'll post some more with your changes shortly.
>
> Here is another batch of patches incorporating your improvements. It
> seems like almost all the warnings were about moving variable
> declarations above any other statements. For some reason I don't get
> warnings about that on my end (compiling on OS X):
>
> platter:postgres paul$ gcc --version
> Configured with:
> --prefix=/Applications/Xcode.app/Contents/Developer/usr
> --with-gxx-include-dir=/usr/include/c++/4.2.1
> Apple clang version 11.0.0 (clang-1100.0.33.12)
> Target: x86_64-apple-darwin18.6.0
> Thread model: posix
> InstalledDir:
> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
>
> For configure I'm saying this:
>
> ./configure CFLAGS=-ggdb 5-Og -g3 -fno-omit-frame-pointer
> --enable-tap-tests --enable-cassert --enable-debug
> --prefix=/Users/paul/local
>
> Any suggestions to get better warnings? On my other patch I got
> feedback about the very same kind. I could just compile on Linux but
> it's nice to work on this away from my desk on the laptop. Maybe
> installing a real gcc is the way to go.
>

I tested last patches. I found some issues

1. you should not to try patch catversion.

2. there is warning

parse_coerce.c: In function ‘enforce_generic_type_consistency’:
parse_coerce.c:1975:11: warning: ‘range_typelem’ may be used uninitialized
in this function [-Wmaybe-uninitialized]
 1975 |   else if (range_typelem != elem_typeid)

3. there are problems with pg_upgrade. Regress tests fails

command:
"/home/pavel/src/postgresql.master/tmp_install/usr/local/pgsql/bin/pg_restore"
--host /home/pavel/src/postgresql.master/src/b
pg_restore: connecting to database for restore
pg_restore: creating DATABASE "regression"
pg_restore: connecting to new database "regression"
pg_restore: connecting to database "regression" as user "pavel"
pg_restore: creating DATABASE PROPERTIES "regression"
pg_restore: connecting to new database "regression"
pg_restore: connecting to database "regression" as user "pavel"
pg_restore: creating pg_largeobject "pg_largeobject"
pg_restore: creating SCHEMA "fkpart3"
pg_restore: creating SCHEMA "fkpart4"
pg_restore: creating SCHEMA "fkpart5"
pg_restore: creating SCHEMA "fkpart6"
pg_restore: creating SCHEMA "mvtest_mvschema"
pg_restore: creating SCHEMA "regress_indexing"
pg_restore: creating SCHEMA "regress_rls_schema"
pg_restore: creating SCHEMA "regress_schema_2"
pg_restore: creating SCHEMA "testxmlschema"
pg_restore: creating TRANSFORM "TRANSFORM FOR integer LANGUAGE "sql""
pg_restore: creating TYPE "public.aggtype"
pg_restore: creating TYPE "public.arrayrange"
pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 1653; 1247 17044 TYPE arrayrange pavel
pg_restore: error: could not execute query: ERROR:  pg_type array OID value
not set when in binary upgrade mode
Command was:.
-- For binary upgrade, must preserve pg_type oid
SELECT
pg_catalog.binary_upgrade_set_next_pg_type_oid('17044'::pg_catalog.oid);


-- For binary upgrade, must preserve pg_type array oid
SELECT
pg_catalog.binary_upgrade_set_next_array_pg_type_oid('17045'::pg_catalog.oid);

CREATE TYPE "public"."arrayrange" AS RANGE (
subtype = integer[]
);

4. there is a problem with doc

  echo ""; \
  echo ""; \
} > version.sgml
'/usr/bin/perl' ./mk_feature_tables.pl YES
../../../src/backend/catalog/sql_feature_packages.txt
../../../src/backend/catalog/sql_features.txt > features-supported.sgml
'/usr/bin/perl' ./mk_feature_tables.pl NO
../../../src/backend/catalog/sql_feature_packages.txt
../../../src/backend/catalog/sql_features.txt > features-unsupported.sgml
'/usr/bin/perl' ./generate-errcodes-table.pl
../../../src/backend/utils/errcodes.txt > errcodes-table.sgml
'/usr/bin/perl' ./generate-keywords-table.pl . > keywords-table.sgml
/usr/bin/xmllint --path . --noout --valid postgres.sgml
extend.sgml:281: parser error : Opening and ending tag mismatch: para line
270 and type
 type of the ranges in an anymultirange.
 ^
extend.sgml:281: parser error : Opening and ending tag mismatch: sect2 line
270 and type
 type of the ranges in an anymultirange.
 ^
extend.sgml:282: parser error : Opening and ending tag mismatch: sect1 line
270 and para

   ^
extend.sgml:324: parser error : Opening and ending tag mismatch: chapt

Re: range_agg

2019-09-26 Thread Alvaro Herrera
Hello Paul, I've started to review this patch.  Here's a few minor
things I ran across -- mostly compiler warnings (is my compiler too
ancient?).  You don't have to agree with every fix -- feel free to use
different fixes if you have them.  Also, feel free to squash them onto
whatever commit you like (I think they all belong onto 0001 except the
last which seems to be for 0002).

Did you not push your latest version to your github repo?  I pulled from
there and branch 'multirange' does not seem to match what you posted.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 6d849beafea1e4129f044f8dd038933d6ea72b54 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 26 Sep 2019 16:56:26 -0300
Subject: [PATCH 1/9] Remove unused variable

---
 src/backend/catalog/pg_type.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/backend/catalog/pg_type.c b/src/backend/catalog/pg_type.c
index 08552850c8..ea5b161c4d 100644
--- a/src/backend/catalog/pg_type.c
+++ b/src/backend/catalog/pg_type.c
@@ -876,7 +876,6 @@ makeMultirangeTypeName(const char *rangeTypeName, Oid typeNamespace)
 	char		mltrng[NAMEDATALEN];
 	char	   *mltrngunique = (char *) palloc(NAMEDATALEN);
 	int			namelen = strlen(rangeTypeName);
-	int			rangelen;
 	char	   *rangestr;
 	int			rangeoffset;
 	int			underscores;
@@ -892,7 +891,6 @@ makeMultirangeTypeName(const char *rangeTypeName, Oid typeNamespace)
 	if (rangestr)
 	{
 		rangeoffset = rangestr - rangeTypeName;
-		rangelen = strlen(rangestr);
 		strlcpy(mltrng + rangeoffset, "multi", NAMEDATALEN - rangeoffset);
 		strlcpy(mltrng + rangeoffset + 5, rangestr, NAMEDATALEN - rangeoffset - 5);
 		namelen += 5;
-- 
2.17.1

>From 63bd9723a45c33cccd11704817fd954af5abaf31 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 26 Sep 2019 17:39:54 -0300
Subject: [PATCH 2/9] Silence compiler warning

---
 src/backend/parser/parse_coerce.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c
index f0b1f6f831..aab2348952 100644
--- a/src/backend/parser/parse_coerce.c
+++ b/src/backend/parser/parse_coerce.c
@@ -1919,6 +1919,8 @@ enforce_generic_type_consistency(const Oid *actual_arg_types,
 			   format_type_be(elem_typeid;
 		}
 	}
+	else
+		range_typelem = InvalidOid;
 
 	/* Get the range type based on the multirange type, if we have one */
 	if (OidIsValid(multirange_typeid))
-- 
2.17.1

>From c4e4b1a97136f7a2047f8be6608cac0782dd6d12 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 26 Sep 2019 17:42:22 -0300
Subject: [PATCH 3/9] Silence 'mixed declarations and code' compiler warning

---
 src/backend/commands/typecmds.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/src/backend/commands/typecmds.c b/src/backend/commands/typecmds.c
index 14a6857062..26ed3e4c76 100644
--- a/src/backend/commands/typecmds.c
+++ b/src/backend/commands/typecmds.c
@@ -1782,14 +1782,10 @@ makeMultirangeConstructors(const char *name, Oid namespace,
 	static const char *const prosrc[2] = {"multirange_constructor0",
 	"multirange_constructor1"};
 	static const int pronargs[2] = {0, 1};
-
-	Oid			constructorArgTypes[0];
+	Oid			constructorArgTypes = rangeArrayOid;
 	ObjectAddress myself,
 referenced;
 	int			i;
-
-	constructorArgTypes[0] = rangeArrayOid;
-
 	Datum		allParamTypes[1] = {ObjectIdGetDatum(rangeArrayOid)};
 	ArrayType  *allParameterTypes = construct_array(allParamTypes, 1, OIDOID,
 	sizeof(Oid), true, 'i');
@@ -1808,7 +1804,7 @@ makeMultirangeConstructors(const char *name, Oid namespace,
 	{
 		oidvector  *constructorArgTypesVector;
 
-		constructorArgTypesVector = buildoidvector(constructorArgTypes,
+		constructorArgTypesVector = buildoidvector(&constructorArgTypes,
    pronargs[i]);
 
 		myself = ProcedureCreate(name,	/* name: same as multirange type */
-- 
2.17.1

>From af414286417d4fddd078c7ffac087482b3ba959d Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 26 Sep 2019 17:49:19 -0300
Subject: [PATCH 4/9] Silence 'mixed declarations and code' compiler warnings

---
 src/backend/utils/fmgr/funcapi.c | 98 +++-
 1 file changed, 59 insertions(+), 39 deletions(-)

diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index e6e82fda63..701664bd3b 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -541,18 +541,21 @@ resolve_polymorphic_tupdesc(TupleDesc tupdesc, oidvector *declared_args,
 
 		if (OidIsValid(anymultirange_type))
 		{
-			Oid			rngtype = resolve_generic_type(ANYRANGEOID,
-	   anymultirange_type,
-	   ANYMULTIRANGEOID);
+			Oid			rngtype;
+			Oid			subtype;
+
+			rngtype = resolve_generic_type(ANYRANGEOID,
+		   anymultirange_type,
+		   ANYMULTIRANGEOID);
 
 			/* check for inconsistent range and multirange results */
 			if (OidIsVa

Re: range_agg

2019-09-05 Thread Jeff Davis
On Thu, 2019-09-05 at 10:45 -0700, Paul A Jungwirth wrote:
> Right now I'm planning to do all that before sending a patch. I'm
> happy to send something something in-progress too, but I don't want
> to
> waste any reviewers' time. If folks want an early peak though let me
> know. (You can also find my messy progress at
> https://github.com/pjungwir/postgresql/tree/multirange)

Sounds good. The rule I use is: "will the feedback I get be helpful, or
just tell me about obvious problems I already know about".

Regards,
Jeff Davis






Re: range_agg

2019-09-05 Thread Paul A Jungwirth
On Thu, Sep 5, 2019 at 10:15 AM Jeff Davis  wrote:
>
> On Sun, 2019-09-01 at 06:26 -0700, Paul A Jungwirth wrote:
> > @+ and @- and @* (I dunno why but I kind of like it. We already have
> > @> and <@.)
>
> I think I like this proposal best; it reminds me of perl. Though some
> might say that's an argument against it.

Thanks Jeff, it's my favorite too. :-) Strangely it feels the hardest
to justify. Right now I have + and - and * implemented but I'll change
them to @+ and @- and @* so that I can support `range R range =
multirange` too.

Btw is there any reason to send a "preview" patch with my current
progress, since we're starting a new commit fest? Here is what I have
left to do:

- Change those three operators.
- Write range_intersect_agg. (range_agg is done but needs some tests
before I commit it.)
- Write documentation.
- Add multiranges to resolve_generic_type, and figure out how to test
that (see the other thread about two latent range-related bugs there).
- Rebase on current master. (It should be just a few weeks behind right now.)
- Run pgindent to make sure I'm conforming to whitespace/style guidelines.
- Split it up into a few separate patch files.

Right now I'm planning to do all that before sending a patch. I'm
happy to send something something in-progress too, but I don't want to
waste any reviewers' time. If folks want an early peak though let me
know. (You can also find my messy progress at
https://github.com/pjungwir/postgresql/tree/multirange)

Also here are some other items that won't be in my next patch, but
should probably be done (maybe by someone else but I'm happy to figure
it out too) before this is really committed:

- typanalyze
- selectivity
- gist support
- spgist support

If anyone would like to help with those, let me know. :-)

Yours,
Paul




Re: range_agg

2019-09-05 Thread Jeff Davis
On Sun, 2019-09-01 at 06:26 -0700, Paul A Jungwirth wrote:
> @+ and @- and @* (I dunno why but I kind of like it. We already have
> @> and <@.)

I think I like this proposal best; it reminds me of perl. Though some
might say that's an argument against it.

Regards,
Jeff Davis






Re: range_agg

2019-09-01 Thread Paul A Jungwirth
> > Btw I have working multirange_{send,recv,in,out} now. . . .

Just about all the other operators are done too, but I wonder what
symbols people like for union and minus? Range uses + for union. I
have working code and tests that adds this:

r + mr = mr
mr + r = mr
mr + mr = mr

But I would like to use a different symbol instead, like ++, so I can
have all four:

r ++ r = mr
r ++ mr = mr
mr ++ r = mr
mr ++ mr = mr

(The existing r + r operator throws an error if the inputs have a gap.)

The trouble is that ++ isn't allowed. (Neither is --.) From
https://www.postgresql.org/docs/11/sql-createoperator.html :

> A multicharacter operator name cannot end in + or -, unless the name also 
> contains at least one of these characters:
> ~ ! @ # % ^ & | ` ?

So are there any other suggestions? I'm open to arguments that I
should just use +, but I think having a way to add two simple ranges
and get a multirange would be nice too, so my preference is to find a
new operator. It should work with minus and intersection (* for
ranges) too. Some proposals:

+* and -* and ** (* as in regex "zero or many" reminds me of how a
multirange holds zero or many ranges. ** is confusing though because
it's like exponentiation.)

@+ and @- and @* (I dunno why but I kind of like it. We already have @> and <@.)

<+> and <-> and <*> (I was hoping for (+) etc for the math connection
to circled operators, but this is close. Maybe this would be stronger
if multirange_{in,out} used <> delims instead of {}, although I also
like how {} is consistent with arrays.)

Anyone else have any thoughts?

Thanks,
Paul




Re: range_agg

2019-08-26 Thread Jeff Davis
On Wed, 2019-08-21 at 21:54 -0700, Paul A Jungwirth wrote:
> Sometimes I think about having a maybe type instead of null/not
> null. SQL nulls are already very "monadic" I think but nullability
> doesn't travel.

Yeah, that would be a great direction, but there is some additional
complexity that we'd need to deal with that a "normal" compiler does
not:

  * handling both existing global types (like int) as well as on-the-
fly types like Maybe>
  * types need to do more things, like have serialized representations,
interface with indexing strategies, and present the optimizer with
choices that may influence which indexes can be used or not
  * at some point needs to work with normal SQL types and NULL
  * there are a lot of times we care not just whether a type is
sortable, but we actually care about the way it's sorted (e.g.
localization). typeclasses+newtype would probably be unacceptable for
trying to match SQL behavior here.

I'm all in favor of pursuing this, but it's not going to bear fruit
very soon.

> Btw I have working multirange_{send,recv,in,out} now, and I
> automatically create a multirange type and its array type when
> someone
> creates a new range type. I have a decent start on passing tests and
> no compiler warnings. I also have a start on anymultirange and
> anyrangearray. (I think I need the latter to support a range-like
> constructor function, so you can say `int4multirange(int4range(1,4),
> int4range(8,10))`.) I want to get the any* types done and improve the
> test coverage, and then I'll probably be ready to share a patch.

Awesome!

> Here are a couple other questions:
> 
> - Does anyone have advice for the typanalyze function? I feel pretty
> out of my depth there (although I haven't looked into typanalyze
> stuff
> very deeply yet). I can probably get some inspiration from
> range_typanalyze and array_typanalyze, but those are both quite
> detailed (their statistics functions that is).

I think Alexander Korotkov did a lot of the heavy lifting here, perhaps
he has a comment? I'd keep it simple for now if you can, and we can try
to improve it later.

> - What should a multirange do if you give it an empty range? I'm
> thinking it should just ignore it, but then `'{}'::int4multirange =
> '{empty}'::int4multirange`. Does that seem okay? (It does to me
> actually, if we think of `empty` as the additive identity. Similarly
> mr + empty = mr.

I agree. Multiranges are more than just an array of ranges, so they
coalesce into some canonical form.

> - What should a multirange do if you give it a null, like
> `int4multirange(int4range(1,4), null)`. I'm thinking it should be
> null, just like mr + null = null. Right?

Yes. NULL is for the overall multirange datum (that is, a multirange
column can be NULL), but I don't think individual parts of a datatype
make much sense as NULL. So, I agree that mr + null = null. (Note that
arrays and records can have NULL parts, but I don't see a reason we
should follow those examples for multiranges.)

Regards,
Jeff Davis






Re: range_agg

2019-08-21 Thread Paul A Jungwirth
On Tue, Aug 20, 2019 at 10:33 PM Jeff Davis  wrote:
> > Is there any historical discussion around
> > typemods on range types?
>
> I did find a few references:

Thanks for looking those up! It's very interesting to see some of the
original discussion around range types.

Btw this is true of so much, isn't it?:

> It's more a property of the
> column than the type.

Sometimes I think about having a maybe type instead of null/not
null. SQL nulls are already very "monadic" I think but nullability
doesn't travel. I feel like someone ought to write a paper about that,
but I don't know of any. This is tantalizingly close (and a fun read)
but really about something else:
https://www.researchgate.net/publication/266657590_Incomplete_data_what_went_wrong_and_how_to_fix_it
Since you're getting into Rust maybe you can update the wiki page
mentioned in those threads about refactoring the type system. :-)
Anyway sorry for the digression. . . .

> So, I wouldn't spend a lot of time on typmod for multiranges.

Okay, thanks! There is plenty else to do. I think I'm already
supporting it as much as range types do.

Btw I have working multirange_{send,recv,in,out} now, and I
automatically create a multirange type and its array type when someone
creates a new range type. I have a decent start on passing tests and
no compiler warnings. I also have a start on anymultirange and
anyrangearray. (I think I need the latter to support a range-like
constructor function, so you can say `int4multirange(int4range(1,4),
int4range(8,10))`.) I want to get the any* types done and improve the
test coverage, and then I'll probably be ready to share a patch.

Here are a couple other questions:

- Does anyone have advice for the typanalyze function? I feel pretty
out of my depth there (although I haven't looked into typanalyze stuff
very deeply yet). I can probably get some inspiration from
range_typanalyze and array_typanalyze, but those are both quite
detailed (their statistics functions that is).

- What should a multirange do if you give it an empty range? I'm
thinking it should just ignore it, but then `'{}'::int4multirange =
'{empty}'::int4multirange`. Does that seem okay? (It does to me
actually, if we think of `empty` as the additive identity. Similarly
mr + empty = mr.

- What should a multirange do if you give it a null, like
`int4multirange(int4range(1,4), null)`. I'm thinking it should be
null, just like mr + null = null. Right?

Thanks!
Paul




Re: range_agg

2019-08-20 Thread Jeff Davis
On Sat, 2019-08-17 at 10:47 -0700, Paul A Jungwirth wrote:
> So I'm wondering how seriously I should take this for multiranges? I
> guess if a range type did support typmods, it would just delegate to
> the underlying element type for their meaning, and so a multirange
> should delegate it too? Is there any historical discussion around
> typemods on range types?

I did find a few references:


https://www.postgresql.org/message-id/1288029716.8645.4.camel%40jdavis-ux.asterdata.local

https://www.postgresql.org/message-id/2011091334.GB11603%40fetter.org
https://www.postgresql.org/message-id/1296974485.27157.136.camel@jdavis

I'd be interested in ways that we can use a typmod-like concept to
improve the type system. Unfortunately, typmod is just not
sophisticated enough to do very much because it's lost through function
calls. Improving that would be a separate and challenging project.

So, I wouldn't spend a lot of time on typmod for multiranges.

Regards,
Jeff Davis






Re: range_agg

2019-08-17 Thread Paul A Jungwirth
On Mon, Jul 8, 2019 at 9:46 AM Paul A Jungwirth
 wrote:
> - A multirange type is an extra thing you get when you define a range
> (just like how you get a tstzrange[]). Therefore

I've been able to make a little more progress on multiranges the last
few days, but it reminded me of an open question I've had for awhile:
typmods! I see places in the range code that gesture toward supporting
typmods, but none of the existing range types permit them. For
example:

postgres=# select '5'::numeric(4,2);
 numeric
-
5.00
(1 row)

postgres=# select '[1,4)'::numrange(4,2);
ERROR:  type modifier is not allowed for type "numrange"
LINE 1: select '[1,4)'::numrange(4,2);

So I'm wondering how seriously I should take this for multiranges? I
guess if a range type did support typmods, it would just delegate to
the underlying element type for their meaning, and so a multirange
should delegate it too? Is there any historical discussion around
typemods on range types?

Thanks!
Paul




Re: range_agg

2019-08-01 Thread Thomas Munro
On Wed, Jul 24, 2019 at 5:13 PM Paul A Jungwirth
 wrote:
> On Tue, Jul 23, 2019 at 3:32 PM Alvaro Herrera  
> wrote:
> > Just checking if you've had a chance to make progress on this.
>
> Not a lot. :-) But I should have more time for it the next few weeks
> than I did the last few.  ...

Hi Paul,

I didn't follow this thread, but as the CF is coming to a close, I'm
interpreting the above to mean that this is being worked on and there
is a good chance of a new patch in time for September.  Therefore I
have moved this entry to that 'fest.

Thanks,

-- 
Thomas Munro
https://enterprisedb.com




Re: range_agg

2019-07-23 Thread Paul A Jungwirth
On Tue, Jul 23, 2019 at 3:32 PM Alvaro Herrera  wrote:
> Just checking if you've had a chance to make progress on this.

Not a lot. :-) But I should have more time for it the next few weeks
than I did the last few. I do have some code for creating concrete
multirange types (used when you create a concrete range type) and
filling in a TypeCacheEntry based on the range type oid---which I know
is all very modest progress. I've been working on a multirange_in
function and mostly just learning about Postgres varlena and TOASTed
objects by reading the code for range_in & array_in.

Here is something from my multirangetypes.h:

/*
 * Multiranges are varlena objects, so must meet the varlena convention that
 * the first int32 of the object contains the total object size in bytes.
 * Be sure to use VARSIZE() and SET_VARSIZE() to access it, though!
 */
typedef struct
{
int32   vl_len_;/* varlena header (do not touch
directly!) */
Oid multirangetypid;/* multirange type's own OID */
/*
 * Following the OID are the range objects themselves.
 * Note that ranges are varlena too,
 * depending on whether they have lower/upper bounds
 * and because even their base types can be varlena.
 * So we can't really index into this list.
 */
} MultirangeType;

I'm working on parsing a multirange much like we parse an array,
although it's a lot simpler because it's a single dimension and there
are no nulls.

I know that's not much to go on, but let me know if any of it worries you. :-)

Paul




Re: range_agg

2019-07-23 Thread Alvaro Herrera
Hi Paul,

Just checking if you've had a chance to make progress on this.

Thanks,

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: range_agg

2019-07-10 Thread Paul Jungwirth

On 7/9/19 11:24 PM, David Fetter wrote:

I seem to recall that the usual convention (at least in math) is
to use intervals that are generally represented as open on the
infinity side, but that might not fit how we do things.


I think it does, unless I'm misunderstanding?


Oh, I was just wondering about the square bracket on the left side of
[null, 1).  It's not super important.


Ah, I understand now. Just a typo on my part. Thanks for catching it, 
and sorry for the confusion!



!mr , perhaps?


I like that suggestion. Honestly I'm not sure we even want an inverse,
but it's so important theoretically we should at least consider
whether it is appropriate here. Or maybe "inverse" is the wrong word
for this, or there is a different meaning it should have.


Jeff's suggestion of ~ for complement is better.


Okay, thanks. I like it better too.

Yours,

--
Paul  ~{:-)
p...@illuminatedcomputing.com




Re: range_agg

2019-07-09 Thread David Fetter
On Tue, Jul 09, 2019 at 09:40:59AM -0700, Paul A Jungwirth wrote:
> On Tue, Jul 9, 2019 at 8:51 AM David Fetter  wrote:
> > > - A multirange type is an extra thing you get when you define a range
> > > (just like how you get a tstzrange[]). Therefore
> > > - I don't need separate commands to add/drop multirange types. You get
> > > one when you define a range type, and if you drop a range type it gets
> > > dropped automatically.
> >
> > Yay for fewer manual steps!
> 
> Thanks for taking a look and sharing your thoughts!
> 
> > > - You can have a multirange[].
> >
> > I can see how that would fall out of this, but I'm a little puzzled as
> > to what people might use it for. Aggregates, maybe?
> 
> I don't know either, but I thought it was standard to define a T[] for
> every T. Anyway it doesn't seem difficult.
> 
> > > - You can cast from a multirange to an array. The individual ranges
> > > are always sorted in the result array.
> >
> > Is this so people can pick individual ranges out of the multirange,
> > or...?
> 
> Yes. I want this for foreign keys actually, where I construct a
> multirange and ask for just its first range.

I'm sure I'll understand this better once I get my head around
temporal foreign keys.

> > Speaking of casts, it's possible that a multirange is also a
> > range. Would it make sense to have a cast from multirange to range?
> 
> Hmm, that seems strange to me. You don't cast from an array to one of
> its elements. If we have subscripting, why use casting to get the
> first element?

Excellent point.

> > > - Interesting functions:
> > >   - multirange_length
> >
> > Is that the sum of the lengths of the ranges?  Are we guaranteeing a
> > measure in addition to ordering on ranges now?
> 
> Just the number of disjoint ranges in the multirange.

Thanks for clarifying.

> > > - You can subscript a multirange like you do an array (? This could be
> > > a function instead.)
> >
> > How would this play with the generic subscripting patch in flight?
> 
> I'm not aware of that patch but I guess I better check it out. :-)

Looks like I'm the second to mention it. Worth a review?

> > >   - inverse operator?:
> > > - the inverse of {"[1,2)"} would be {"[null, 1)", "[2, null)"}.
> >
> > Is that the same as ["(∞, ∞)"] - {"[1,2)"}?
> 
> Yes.
> 
> > I seem to recall that the usual convention (at least in math) is
> > to use intervals that are generally represented as open on the
> > infinity side, but that might not fit how we do things.
> 
> I think it does, unless I'm misunderstanding?

Oh, I was just wondering about the square bracket on the left side of
[null, 1).  It's not super important.

> > > - not sure we want this or what the symbol should be. I don't like
> > > -mr as an inverse because then mr - mr != mr ++ -mr.
> >
> > !mr , perhaps?
> 
> I like that suggestion. Honestly I'm not sure we even want an inverse,
> but it's so important theoretically we should at least consider
> whether it is appropriate here. Or maybe "inverse" is the wrong word
> for this, or there is a different meaning it should have.

Jeff's suggestion of ~ for complement is better.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: range_agg

2019-07-09 Thread Pavel Stehule
st 10. 7. 2019 v 6:26 odesílatel Paul A Jungwirth <
p...@illuminatedcomputing.com> napsal:

> On Tue, Jul 9, 2019 at 12:24 PM Pavel Stehule 
> wrote:
> > út 9. 7. 2019 v 21:10 odesílatel Pavel Stehule 
> napsal:
> >> I afraid so with generic multiragetype there lot of array
> infrastructure will be duplicated
> >
> > on second hand - it is true so classic array concat is not optimal for
> set of ranges, so some functionality should be redefined every time.
> >
> > I don't know what is possible, but for me - multiranges is special kind
> (subset) of arrays and can be implement as subset of arrays. I remember
> other possible kind of arrays - "sets" without duplicates. It is similar
> case, I think.
> >
> > Maybe introduction of multirages as new generic type is bad direction,
> and can be better and more enhanceable in future to introduce some like
> special kinds of arrays. So for example - unnest can be used directly for
> arrays and multiranges too - because there will be common base.
>
> Well I'm afraid of that too a bit, although I also agree it may be an
> opportunity to share some common behavior and implementation. For
> example in the discussion about string syntax, I think keeping it the
> same as arrays is nicer for people and lets us share more between the
> two types.
>
> That said I don't think a multirange is a subtype of arrays (speaking
> as a traditional object-oriented subtype), just something that shares
> a lot of the same behavior. I'm inclined to maximize the overlap where
> feasible though, e.g. string syntax, UNNEST, indexing, function naming
> (`range_length`), etc. Something like Rust traits (or Java interfaces)
> seems a closer mental model, not that we have to formalize that
> somehow, particularly up front.
>

A introduction of new generic type can have some other impacts - there can
be necessary special support for PL languages.

I understand so it is hard to decide - because we miss some more generic
base "sets".

Probably we cannot to think more about it now, and we have to wait to some
patches. Later we can see how much code is duplicated and if it is a
problem or not.

Regards

Pavel


> Yours,
> Paul
>


Re: range_agg

2019-07-09 Thread Paul A Jungwirth
On Tue, Jul 9, 2019 at 12:24 PM Pavel Stehule  wrote:
> út 9. 7. 2019 v 21:10 odesílatel Pavel Stehule  
> napsal:
>> I afraid so with generic multiragetype there lot of array infrastructure 
>> will be duplicated
>
> on second hand - it is true so classic array concat is not optimal for set of 
> ranges, so some functionality should be redefined every time.
>
> I don't know what is possible, but for me - multiranges is special kind 
> (subset) of arrays and can be implement as subset of arrays. I remember other 
> possible kind of arrays - "sets" without duplicates. It is similar case, I 
> think.
>
> Maybe introduction of multirages as new generic type is bad direction, and 
> can be better and more enhanceable in future to introduce some like special 
> kinds of arrays. So for example - unnest can be used directly for arrays and 
> multiranges too - because there will be common base.

Well I'm afraid of that too a bit, although I also agree it may be an
opportunity to share some common behavior and implementation. For
example in the discussion about string syntax, I think keeping it the
same as arrays is nicer for people and lets us share more between the
two types.

That said I don't think a multirange is a subtype of arrays (speaking
as a traditional object-oriented subtype), just something that shares
a lot of the same behavior. I'm inclined to maximize the overlap where
feasible though, e.g. string syntax, UNNEST, indexing, function naming
(`range_length`), etc. Something like Rust traits (or Java interfaces)
seems a closer mental model, not that we have to formalize that
somehow, particularly up front.

Yours,
Paul




Re: range_agg

2019-07-09 Thread Paul A Jungwirth
On Tue, Jul 9, 2019 at 12:02 PM Jeff Davis  wrote:
> > - Multirange in/out work just like arrays, e.g. '{"[1,3)", "[5,6)"}'
>
> It would be cool to have a better text representation. We could go
> simple like:
>
>'[1,3) [5,6)'

Will that work with all ranges, even user-defined ones? With a
tstzrange[] there is a lot of quoting:

=# select array[tstzrange(now(), now() + interval '1 hour')];
   array
---
 {"[\"2019-07-09 12:40:20.794054-07\",\"2019-07-09 13:40:20.794054-07\")"}

I'm more inclined to follow the array syntax both because it will be
familiar & consistent to users (no need to remember any differences)
and because it's already built so we can use it and know it won't have
gotchas.

> I think "complement" might be a better name than "inverse".
>
> m1 - m2 = m1 * complement(m2)
>
> What about "~"?

That seems like the right term and a good symbol.

Paul




Re: range_agg

2019-07-09 Thread Pavel Stehule
út 9. 7. 2019 v 21:10 odesílatel Pavel Stehule 
napsal:

>
>
> út 9. 7. 2019 v 20:25 odesílatel Jeff Davis  napsal:
>
>> On Tue, 2019-07-09 at 07:08 +0200, Pavel Stehule wrote:
>> >
>> > I am not against a multirange type, but I miss a explanation why you
>> > introduce new kind of types and don't use just array of ranges.
>> >
>> > Introduction of new kind of types is not like introduction new type.
>>
>> The biggest benefit, in my opinion, is that it means you can define
>> functions/operators that take an "anyrange" and return an
>> "anymultirange". That way you don't have to define different functions
>> for int4 ranges, date ranges, etc.
>>
>>
> I am not sure how strong is this argument.
>
> I think so introduction of anyrangearray polymorphic type and enhancing
> some type deduction can do same work.
>
> It starts to get even more complex when you want to add opclasses, etc.
>>
>> Ranges and arrays are effectively generic types that need a type
>> parameter to become a concrete type. Ideally, we'd have first-class
>> support for generic types, but I think that's a different topic ;-)
>
>
> I afraid so with generic multiragetype there lot of array infrastructure
> will be duplicated
>

on second hand - it is true so classic array concat is not optimal for set
of ranges, so some functionality should be redefined every time.

I don't know what is possible, but for me - multiranges is special kind
(subset) of arrays and can be implement as subset of arrays. I remember
other possible kind of arrays - "sets" without duplicates. It is similar
case, I think.

Maybe introduction of multirages as new generic type is bad direction, and
can be better and more enhanceable in future to introduce some like special
kinds of arrays. So for example - unnest can be used directly for arrays
and multiranges too - because there will be common base.

Regards

Pavel



> Regards
>
> Pavel
>
>
>> Regards,
>> Jeff Davis
>>
>>
>>


Re: range_agg

2019-07-09 Thread Pavel Stehule
út 9. 7. 2019 v 20:25 odesílatel Jeff Davis  napsal:

> On Tue, 2019-07-09 at 07:08 +0200, Pavel Stehule wrote:
> >
> > I am not against a multirange type, but I miss a explanation why you
> > introduce new kind of types and don't use just array of ranges.
> >
> > Introduction of new kind of types is not like introduction new type.
>
> The biggest benefit, in my opinion, is that it means you can define
> functions/operators that take an "anyrange" and return an
> "anymultirange". That way you don't have to define different functions
> for int4 ranges, date ranges, etc.
>
>
I am not sure how strong is this argument.

I think so introduction of anyrangearray polymorphic type and enhancing
some type deduction can do same work.

It starts to get even more complex when you want to add opclasses, etc.
>
> Ranges and arrays are effectively generic types that need a type
> parameter to become a concrete type. Ideally, we'd have first-class
> support for generic types, but I think that's a different topic ;-)


I afraid so with generic multiragetype there lot of array infrastructure
will be duplicated

Regards

Pavel


> Regards,
> Jeff Davis
>
>
>


  1   2   >