Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-10-05 Thread Tom Lane
I wrote:
> I'm inclined to just go ahead and apply/backpatch this.  It's certainly
> possible that more bugs remain to be found, but I have no good ideas
> about how to search for them, and in any case that wouldn't invalidate
> the patch as it stands.

And done.  If anyone can think of additional ways to test or search for
more instances of the same bug, please do.  In the meantime, I've
configured buildfarm member longfin to define STRESS_SORT_INT_MIN,
so it should help prevent introduction of new instances.

regards, tom lane



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-10-05 Thread Tom Lane
I wrote:
> Here's a draft patch against HEAD for this.
> I looked for problem spots by (a) testing with the STRESS_SORT_INT_MIN
> option I added in nbtcompare.c, (b) grepping for "x = -x" type code,
> and (c) grepping for "return -x" type code.  (b) and (c) found several
> places that (a) didn't, which does not give me a warm feeling about
> whether I have found quite everything.

I thought of another, uglier way to stress things: make wrappers around
memcmp, strcmp, and strncmp to force those to return INT_MIN/0/INT_MAX,
thereby modeling what we see is happening on Mark's machine.

This successfully exposed the bug I'd already found by grepping in
pg_rewind/filemap.c, as well as some astonishingly unportable code
in contrib/ltree.

The attached update incorporates Thomas' suggestion for the macro
name, as well as the ltree fix.  For completeness, I also show the
very quick-hacky way I changed memcmp et al, but I don't propose
committing that.

I'm inclined to just go ahead and apply/backpatch this.  It's certainly
possible that more bugs remain to be found, but I have no good ideas
about how to search for them, and in any case that wouldn't invalidate
the patch as it stands.

regards, tom lane

diff --git a/contrib/ltree/ltree_op.c b/contrib/ltree/ltree_op.c
index 759f1f8..df61c63 100644
--- a/contrib/ltree/ltree_op.c
+++ b/contrib/ltree/ltree_op.c
@@ -45,17 +45,24 @@ ltree_compare(const ltree *a, const ltree *b)
 	ltree_level *bl = LTREE_FIRST(b);
 	int			an = a->numlevel;
 	int			bn = b->numlevel;
-	int			res = 0;
 
 	while (an > 0 && bn > 0)
 	{
+		int			res;
+
 		if ((res = memcmp(al->name, bl->name, Min(al->len, bl->len))) == 0)
 		{
 			if (al->len != bl->len)
 return (al->len - bl->len) * 10 * (an + 1);
 		}
 		else
+		{
+			if (res < 0)
+res = -1;
+			else
+res = 1;
 			return res * 10 * (an + 1);
+		}
 
 		an--;
 		bn--;
@@ -146,7 +153,7 @@ inner_isparent(const ltree *c, const ltree *p)
 	{
 		if (cl->len != pl->len)
 			return false;
-		if (memcmp(cl->name, pl->name, cl->len))
+		if (memcmp(cl->name, pl->name, cl->len) != 0)
 			return false;
 
 		pn--;
diff --git a/contrib/pgcrypto/imath.c b/contrib/pgcrypto/imath.c
index cd528bf..b94a51b 100644
--- a/contrib/pgcrypto/imath.c
+++ b/contrib/pgcrypto/imath.c
@@ -1254,11 +1254,9 @@ mp_int_compare(mp_int a, mp_int b)
 		 * If they're both zero or positive, the normal comparison applies; if
 		 * both negative, the sense is reversed.
 		 */
-		if (sa == MP_ZPOS)
-			return cmp;
-		else
-			return -cmp;
-
+		if (sa != MP_ZPOS)
+			INVERT_COMPARE_RESULT(cmp);
+		return cmp;
 	}
 	else
 	{
@@ -1314,10 +1312,9 @@ mp_int_compare_value(mp_int z, int value)
 	{
 		cmp = s_vcmp(z, value);
 
-		if (vsign == MP_ZPOS)
-			return cmp;
-		else
-			return -cmp;
+		if (vsign != MP_ZPOS)
+			INVERT_COMPARE_RESULT(cmp);
+		return cmp;
 	}
 	else
 	{
diff --git a/doc/src/sgml/btree.sgml b/doc/src/sgml/btree.sgml
index 8bd0bad..c16825e 100644
--- a/doc/src/sgml/btree.sgml
+++ b/doc/src/sgml/btree.sgml
@@ -228,11 +228,8 @@
   B, A
   = B,
   or A >
-  B, respectively.  The function must not
-  return INT_MIN for the A
-  < B case,
-  since the value may be negated before being tested for sign.  A null
-  result is disallowed, too.
+  B, respectively.
+  A null result is disallowed: all values of the data type must be comparable.
   See src/backend/access/nbtree/nbtcompare.c for
   examples.
  
diff --git a/src/backend/access/nbtree/nbtcompare.c b/src/backend/access/nbtree/nbtcompare.c
index b1855e8..6f2ad23 100644
--- a/src/backend/access/nbtree/nbtcompare.c
+++ b/src/backend/access/nbtree/nbtcompare.c
@@ -22,11 +22,10 @@
  *
  *	The result is always an int32 regardless of the input datatype.
  *
- *	Although any negative int32 (except INT_MIN) is acceptable for reporting
- *	"<", and any positive int32 is acceptable for reporting ">", routines
+ *	Although any negative int32 is acceptable for reporting "<",
+ *	and any positive int32 is acceptable for reporting ">", routines
  *	that work on 32-bit or wider datatypes can't just return "a - b".
- *	That could overflow and give the wrong answer.  Also, one must not
- *	return INT_MIN to report "<", since some callers will negate the result.
+ *	That could overflow and give the wrong answer.
  *
  *	NOTE: it is critical that the comparison function impose a total order
  *	on all non-NULL values of the data type, and that the datatype's
@@ -44,13 +43,31 @@
  *	during an index access won't be recovered till end of query.  This
  *	primarily affects comparison routines for toastable datatypes;
  *	they have to be careful to free any detoasted copy of an input datum.
+ *
+ *	NOTE: we used to forbid comparison functions from returning INT_MIN,
+ *	but that proves to be too error-prone because some platforms' versions
+ *	of memcmp() etc can return INT_MIN.  As a means of stress-testing
+ *	callers, this file can be compiled with STRESS_SORT_INT_MIN defined
+ *	to cause many of these functi

Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-10-04 Thread Tom Lane
Thomas Munro  writes:
> I suppose someone might mistake this for a function that converts -42
> to 42... would something like INVERT_COMPARE_RESULT() be better?

I have no particular allegiance to the macro name; it's just the
first idea that came to mind.

regards, tom lane



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-10-04 Thread Thomas Munro
On Fri, Oct 5, 2018 at 3:12 PM Tom Lane  wrote:
> Here's a draft patch against HEAD for this.

+ * Invert the sign of a qsort-style comparison result, ie, exchange negative
+ * and positive integer values, being careful not to get the wrong answer
+ * for INT_MIN.  The argument should be an integral variable.
+ */
+#define INVERT_SIGN(var) \
+((var) = ((var) < 0) ? 1 : -(var))

I suppose someone might mistake this for a function that converts -42
to 42... would something like INVERT_COMPARE_RESULT() be better?

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-10-04 Thread Tom Lane
I wrote:
> Andres Freund  writes:
>> On 2018-10-01 12:13:57 -0400, Tom Lane wrote:
>>> (2) Drop the restriction.  This'd require at least changing the
>>> DESC correction, and maybe other things.  I'm not sure what the
>>> odds would be of finding everyplace we need to check.

>> (2) seems more maintainable to me (or perhaps less unmaintainable). It's
>> infrastructure, rather than every datatype + support out there...

> I guess we could set up some testing infrastructure: hack int4cmp
> and/or a couple other popular comparators so that they *always*
> return INT_MIN, 0, or INT_MAX, and then see what falls over.

Here's a draft patch against HEAD for this.

I looked for problem spots by (a) testing with the STRESS_SORT_INT_MIN
option I added in nbtcompare.c, (b) grepping for "x = -x" type code,
and (c) grepping for "return -x" type code.  (b) and (c) found several
places that (a) didn't, which does not give me a warm feeling about
whether I have found quite everything.

I changed a couple of places where things might've been safe but
I didn't feel like chasing the calls to prove it (e.g. imath.c),
and contrariwise I left a *very* small number of places alone
because they were inverting the result of a specific function
that is defined to return 1/0/-1 and nothing else.

regards, tom lane

diff --git a/contrib/pgcrypto/imath.c b/contrib/pgcrypto/imath.c
index cd528bf..ca4995a 100644
--- a/contrib/pgcrypto/imath.c
+++ b/contrib/pgcrypto/imath.c
@@ -1254,11 +1254,9 @@ mp_int_compare(mp_int a, mp_int b)
 		 * If they're both zero or positive, the normal comparison applies; if
 		 * both negative, the sense is reversed.
 		 */
-		if (sa == MP_ZPOS)
-			return cmp;
-		else
-			return -cmp;
-
+		if (sa != MP_ZPOS)
+			INVERT_SIGN(cmp);
+		return cmp;
 	}
 	else
 	{
@@ -1314,10 +1312,9 @@ mp_int_compare_value(mp_int z, int value)
 	{
 		cmp = s_vcmp(z, value);
 
-		if (vsign == MP_ZPOS)
-			return cmp;
-		else
-			return -cmp;
+		if (vsign != MP_ZPOS)
+			INVERT_SIGN(cmp);
+		return cmp;
 	}
 	else
 	{
diff --git a/doc/src/sgml/btree.sgml b/doc/src/sgml/btree.sgml
index 8bd0bad..c16825e 100644
--- a/doc/src/sgml/btree.sgml
+++ b/doc/src/sgml/btree.sgml
@@ -228,11 +228,8 @@
   B, A
   = B,
   or A >
-  B, respectively.  The function must not
-  return INT_MIN for the A
-  < B case,
-  since the value may be negated before being tested for sign.  A null
-  result is disallowed, too.
+  B, respectively.
+  A null result is disallowed: all values of the data type must be comparable.
   See src/backend/access/nbtree/nbtcompare.c for
   examples.
  
diff --git a/src/backend/access/nbtree/nbtcompare.c b/src/backend/access/nbtree/nbtcompare.c
index b1855e8..6f2ad23 100644
--- a/src/backend/access/nbtree/nbtcompare.c
+++ b/src/backend/access/nbtree/nbtcompare.c
@@ -22,11 +22,10 @@
  *
  *	The result is always an int32 regardless of the input datatype.
  *
- *	Although any negative int32 (except INT_MIN) is acceptable for reporting
- *	"<", and any positive int32 is acceptable for reporting ">", routines
+ *	Although any negative int32 is acceptable for reporting "<",
+ *	and any positive int32 is acceptable for reporting ">", routines
  *	that work on 32-bit or wider datatypes can't just return "a - b".
- *	That could overflow and give the wrong answer.  Also, one must not
- *	return INT_MIN to report "<", since some callers will negate the result.
+ *	That could overflow and give the wrong answer.
  *
  *	NOTE: it is critical that the comparison function impose a total order
  *	on all non-NULL values of the data type, and that the datatype's
@@ -44,13 +43,31 @@
  *	during an index access won't be recovered till end of query.  This
  *	primarily affects comparison routines for toastable datatypes;
  *	they have to be careful to free any detoasted copy of an input datum.
+ *
+ *	NOTE: we used to forbid comparison functions from returning INT_MIN,
+ *	but that proves to be too error-prone because some platforms' versions
+ *	of memcmp() etc can return INT_MIN.  As a means of stress-testing
+ *	callers, this file can be compiled with STRESS_SORT_INT_MIN defined
+ *	to cause many of these functions to return INT_MIN or INT_MAX instead of
+ *	their customary -1/+1.  For production, though, that's not a good idea
+ *	since users or third-party code might expect the traditional results.
  *-
  */
 #include "postgres.h"
 
+#include 
+
 #include "utils/builtins.h"
 #include "utils/sortsupport.h"
 
+#ifdef STRESS_SORT_INT_MIN
+#define A_LESS_THAN_B		INT_MIN
+#define A_GREATER_THAN_B	INT_MAX
+#else
+#define A_LESS_THAN_B		(-1)
+#define A_GREATER_THAN_B	1
+#endif
+
 
 Datum
 btboolcmp(PG_FUNCTION_ARGS)
@@ -95,11 +112,11 @@ btint4cmp(PG_FUNCTION_ARGS)
 	int32		b = PG_GETARG_INT32(1);
 
 	if (a > b)
-		PG_RETURN_INT32(1);
+		PG_RETURN_INT32(A_GREATER_THAN_B);
 	else if (a == b)
 		PG_RETURN_INT32(0);
 	else
-		PG_RETURN_INT32(-1);
+		PG_RET

Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-10-01 Thread Thomas Munro
On Tue, Oct 2, 2018 at 10:55 AM Andrew Dunstan
 wrote:
> On 10/01/2018 12:50 PM, Tom Lane wrote:
> > Andres Freund  writes:
> >> On 2018-10-01 12:13:57 -0400, Tom Lane wrote:
> >>> Yeah.  So our choices are
> >>>
> >>> (1) Retain the current restriction on what sort comparators can
> >>> produce.  Find all the places where memcmp's result is returned
> >>> directly, and fix them.  (I wonder if strcmp has same issue.)
> >>>
> >>> (2) Drop the restriction.  This'd require at least changing the
> >>> DESC correction, and maybe other things.  I'm not sure what the
> >>> odds would be of finding everyplace we need to check.
> >>>
> >>> Neither one is sounding very pleasant, or maintainable.
> >> (2) seems more maintainable to me (or perhaps less unmaintainable). It's
> >> infrastructure, rather than every datatype + support out there...
> > I guess we could set up some testing infrastructure: hack int4cmp
> > and/or a couple other popular comparators so that they *always*
> > return INT_MIN, 0, or INT_MAX, and then see what falls over.
> >
> > I'm fairly sure that btree, as well as the sort code proper,
> > has got an issue here.
> >
> >
>
>
> I agree option 2 seems less unmaintainable. (Nice use of litotes there?)

+1 for option 2.  It seems to me that it should ideally be the job of
the code that is negating the value to deal with this edge case.  I
see that the restriction is clearly documented at the top of
src/backend/access/nbtree/nbtcompare.c even though it directly returns
strncmp() results.  This was quite a surprising thread.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-10-01 Thread Andrew Dunstan




On 10/01/2018 12:50 PM, Tom Lane wrote:

Andres Freund  writes:

On 2018-10-01 12:13:57 -0400, Tom Lane wrote:

Yeah.  So our choices are

(1) Retain the current restriction on what sort comparators can
produce.  Find all the places where memcmp's result is returned
directly, and fix them.  (I wonder if strcmp has same issue.)

(2) Drop the restriction.  This'd require at least changing the
DESC correction, and maybe other things.  I'm not sure what the
odds would be of finding everyplace we need to check.

Neither one is sounding very pleasant, or maintainable.

(2) seems more maintainable to me (or perhaps less unmaintainable). It's
infrastructure, rather than every datatype + support out there...

I guess we could set up some testing infrastructure: hack int4cmp
and/or a couple other popular comparators so that they *always*
return INT_MIN, 0, or INT_MAX, and then see what falls over.

I'm fairly sure that btree, as well as the sort code proper,
has got an issue here.





I agree option 2 seems less unmaintainable. (Nice use of litotes there?)

cheers

andrew

--
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-10-01 Thread Larry Rosenman
On Mon, Oct 01, 2018 at 05:11:02PM -0400, Tom Lane wrote:
> Andrew Dunstan  writes:
> > On 10/01/2018 11:58 AM, Tom Lane wrote:
> >> Oooh ... apparently, on that platform, memcmp() is willing to produce
> >> INT_MIN in some cases.  That's not a safe value for a sort comparator
> >> to produce --- we explicitly say that somewhere, IIRC.  I think we
> >> implement DESC by negating the comparator's result, which explains
> >> why only the DESC case fails.
> 
> > Is there a standard that forbids this, or have we just been lucky up to now?
> 
> We've been lucky; POSIX just says the value is less than, equal to,
> or greater than zero.
> 
> In practice, a memcmp that operates byte-at-a-time would not likely
> return anything outside +-255.  But on a big-endian machine you could
> easily optimize to use word-wide operations to compare 4 bytes at a
> time, and I suspect that's what's happening here.  Or maybe there's
> just some weird architecture-specific reason that makes it cheap
> for them to return INT_MIN rather than some other value?
> 
as a former S3[79]x assembler programmer, they probably do it in
registers or using TRT.  All of which could be word wise. 


>   regards, tom lane
> 

-- 
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106


signature.asc
Description: PGP signature


Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-10-01 Thread Tom Lane
Andrew Dunstan  writes:
> On 10/01/2018 11:58 AM, Tom Lane wrote:
>> Oooh ... apparently, on that platform, memcmp() is willing to produce
>> INT_MIN in some cases.  That's not a safe value for a sort comparator
>> to produce --- we explicitly say that somewhere, IIRC.  I think we
>> implement DESC by negating the comparator's result, which explains
>> why only the DESC case fails.

> Is there a standard that forbids this, or have we just been lucky up to now?

We've been lucky; POSIX just says the value is less than, equal to,
or greater than zero.

In practice, a memcmp that operates byte-at-a-time would not likely
return anything outside +-255.  But on a big-endian machine you could
easily optimize to use word-wide operations to compare 4 bytes at a
time, and I suspect that's what's happening here.  Or maybe there's
just some weird architecture-specific reason that makes it cheap
for them to return INT_MIN rather than some other value?

regards, tom lane



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-10-01 Thread Andrew Dunstan




On 10/01/2018 11:58 AM, Tom Lane wrote:

Mark Wong  writes:

   a   |  a   | 
 uuid_cmp
--+--+-
  ---- | ---- | 
  0
  ---- | ---- | 
-2147483648
  ---- | 3f3e3c3b-3a30-3938-3736-353433a2313e | 
-2147483648
  ---- | ---- | 
  1
  ---- | ---- | 
  0
  ---- | 3f3e3c3b-3a30-3938-3736-353433a2313e | 
-2147483648
  3f3e3c3b-3a30-3938-3736-353433a2313e | ---- | 
  1
  3f3e3c3b-3a30-3938-3736-353433a2313e | ---- | 
  1
  3f3e3c3b-3a30-3938-3736-353433a2313e | 3f3e3c3b-3a30-3938-3736-353433a2313e | 
  0
(9 rows)


Oooh ... apparently, on that platform, memcmp() is willing to produce
INT_MIN in some cases.  That's not a safe value for a sort comparator
to produce --- we explicitly say that somewhere, IIRC.  I think we
implement DESC by negating the comparator's result, which explains
why only the DESC case fails.





Is there a standard that forbids this, or have we just been lucky up to now?

cheers

andrew

--
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-10-01 Thread Tom Lane
Andres Freund  writes:
> On 2018-10-01 12:13:57 -0400, Tom Lane wrote:
>> Yeah.  So our choices are
>> 
>> (1) Retain the current restriction on what sort comparators can
>> produce.  Find all the places where memcmp's result is returned
>> directly, and fix them.  (I wonder if strcmp has same issue.)
>> 
>> (2) Drop the restriction.  This'd require at least changing the
>> DESC correction, and maybe other things.  I'm not sure what the
>> odds would be of finding everyplace we need to check.
>> 
>> Neither one is sounding very pleasant, or maintainable.

> (2) seems more maintainable to me (or perhaps less unmaintainable). It's
> infrastructure, rather than every datatype + support out there...

I guess we could set up some testing infrastructure: hack int4cmp
and/or a couple other popular comparators so that they *always*
return INT_MIN, 0, or INT_MAX, and then see what falls over.

I'm fairly sure that btree, as well as the sort code proper,
has got an issue here.

regards, tom lane



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-10-01 Thread Andres Freund
On 2018-10-01 12:13:57 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2018-10-01 11:58:51 -0400, Tom Lane wrote:
> >> Oooh ... apparently, on that platform, memcmp() is willing to produce
> >> INT_MIN in some cases.  That's not a safe value for a sort comparator
> >> to produce --- we explicitly say that somewhere, IIRC.
> 
> > Hm, that'd be pretty painful - memcmp() isn't guaranteed to return
> > anything smaller. And we use memcmp in a fair number of comparators.
> 
> Yeah.  So our choices are
> 
> (1) Retain the current restriction on what sort comparators can
> produce.  Find all the places where memcmp's result is returned
> directly, and fix them.  (I wonder if strcmp has same issue.)
> 
> (2) Drop the restriction.  This'd require at least changing the
> DESC correction, and maybe other things.  I'm not sure what the
> odds would be of finding everyplace we need to check.
> 
> Neither one is sounding very pleasant, or maintainable.

(2) seems more maintainable to me (or perhaps less unmaintainable). It's
infrastructure, rather than every datatype + support out there...

Greetings,

Andres Freund



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-10-01 Thread Tom Lane
Andres Freund  writes:
> On 2018-10-01 11:58:51 -0400, Tom Lane wrote:
>> Oooh ... apparently, on that platform, memcmp() is willing to produce
>> INT_MIN in some cases.  That's not a safe value for a sort comparator
>> to produce --- we explicitly say that somewhere, IIRC.

> Hm, that'd be pretty painful - memcmp() isn't guaranteed to return
> anything smaller. And we use memcmp in a fair number of comparators.

Yeah.  So our choices are

(1) Retain the current restriction on what sort comparators can
produce.  Find all the places where memcmp's result is returned
directly, and fix them.  (I wonder if strcmp has same issue.)

(2) Drop the restriction.  This'd require at least changing the
DESC correction, and maybe other things.  I'm not sure what the
odds would be of finding everyplace we need to check.

Neither one is sounding very pleasant, or maintainable.

regards, tom lane



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-10-01 Thread Andres Freund
On 2018-10-01 11:58:51 -0400, Tom Lane wrote:
> Mark Wong  writes:
> >   a   |  a  
> >  |  uuid_cmp
> > --+--+-
> >  ---- | 
> > ---- |   0
> >  ---- | 
> > ---- | -2147483648
> >  ---- | 
> > 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648
> >  ---- | 
> > ---- |   1
> >  ---- | 
> > ---- |   0
> >  ---- | 
> > 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648
> >  3f3e3c3b-3a30-3938-3736-353433a2313e | 
> > ---- |   1
> >  3f3e3c3b-3a30-3938-3736-353433a2313e | 
> > ---- |   1
> >  3f3e3c3b-3a30-3938-3736-353433a2313e | 
> > 3f3e3c3b-3a30-3938-3736-353433a2313e |   0
> > (9 rows)
> 
> 
> Oooh ... apparently, on that platform, memcmp() is willing to produce
> INT_MIN in some cases.  That's not a safe value for a sort comparator
> to produce --- we explicitly say that somewhere, IIRC.

Hm, that'd be pretty painful - memcmp() isn't guaranteed to return
anything smaller. And we use memcmp in a fair number of comparators.


> I think we implement DESC by negating the comparator's result, which
> explains why only the DESC case fails.

That makes sense.

Greetings,

Andres Freund



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-10-01 Thread Tom Lane
Mark Wong  writes:
>   a   |  a   
> |  uuid_cmp
> --+--+-
>  ---- | ---- 
> |   0
>  ---- | ---- 
> | -2147483648
>  ---- | 3f3e3c3b-3a30-3938-3736-353433a2313e 
> | -2147483648
>  ---- | ---- 
> |   1
>  ---- | ---- 
> |   0
>  ---- | 3f3e3c3b-3a30-3938-3736-353433a2313e 
> | -2147483648
>  3f3e3c3b-3a30-3938-3736-353433a2313e | ---- 
> |   1
>  3f3e3c3b-3a30-3938-3736-353433a2313e | ---- 
> |   1
>  3f3e3c3b-3a30-3938-3736-353433a2313e | 3f3e3c3b-3a30-3938-3736-353433a2313e 
> |   0
> (9 rows)


Oooh ... apparently, on that platform, memcmp() is willing to produce
INT_MIN in some cases.  That's not a safe value for a sort comparator
to produce --- we explicitly say that somewhere, IIRC.  I think we
implement DESC by negating the comparator's result, which explains
why only the DESC case fails.

regards, tom lane



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-10-01 Thread Mark Wong
On Sun, Sep 30, 2018 at 12:38:46AM +0100, Andrew Gierth wrote:
> > "Andrew" == Andrew Dunstan  writes:
> 
>  >> What is the size of a C "int" on this platform?
> 
>  Andrew> 4.
> 
> Hmm.
> 
> Because int being more than 32 bits is the simplest explanation for this
> difference.
> 
> How about the output of this query:
> 
> with d(a) as (values ('----'::uuid),
>  ('----'::uuid),
>  ('3f3e3c3b-3a30-3938-3736-353433a2313e'::uuid))
>   select d1.a, d2.a, uuid_cmp(d1.a,d2.a) from d d1, d d2
>order by d1.a asc, d2.a desc;

That also appears to produce the same results:

With 9.4:

postgres=# select version();
  version
---
 PostgreSQL 9.4.19 on s390x-ibm-linux-gnu, compiled by clang version 5.0.1 
(tags/RELEASE_501/final 312548), 64-bit
(1 row)

...

  a   |  a   |  
uuid_cmp
--+--+-
 ---- | ---- |  
 0
 ---- | ---- | 
-2147483648
 ---- | 3f3e3c3b-3a30-3938-3736-353433a2313e | 
-2147483648
 ---- | ---- |  
 1
 ---- | ---- |  
 0
 ---- | 3f3e3c3b-3a30-3938-3736-353433a2313e | 
-2147483648
 3f3e3c3b-3a30-3938-3736-353433a2313e | ---- |  
 1
 3f3e3c3b-3a30-3938-3736-353433a2313e | ---- |  
 1
 3f3e3c3b-3a30-3938-3736-353433a2313e | 3f3e3c3b-3a30-3938-3736-353433a2313e |  
 0
(9 rows)


Then with HEAD:

postgres=# select version();
  version

 PostgreSQL 12devel on s390x-ibm-linux-gnu, compiled by clang version 5.0.1 
(tags/RELEASE_501/final 312548), 64-bit
(1 row)

...

  a   |  a   |  
uuid_cmp
--+--+-
 ---- | ---- |  
 0
 ---- | ---- | 
-2147483648
 ---- | 3f3e3c3b-3a30-3938-3736-353433a2313e | 
-2147483648
 ---- | ---- |  
 1
 ---- | ---- |  
 0
 ---- | 3f3e3c3b-3a30-3938-3736-353433a2313e | 
-2147483648
 3f3e3c3b-3a30-3938-3736-353433a2313e | ---- |  
 1
 3f3e3c3b-3a30-3938-3736-353433a2313e | ---- |  
 1
 3f3e3c3b-3a30-3938-3736-353433a2313e | 3f3e3c3b-3a30-3938-3736-353433a2313e |  
 0
(9 rows)


Regards,
Mark

-- 
Mark Wonghttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-09-29 Thread Andrew Gierth
> "Tom" == Tom Lane  writes:

 > Andrew Gierth  writes:
 >> Because int being more than 32 bits is the simplest explanation for
 >> this difference.

 Tom> Curious to hear your reasoning behind that statement? I hadn't
 Tom> gotten further than "memcmp is broken" ... and neither of those
 Tom> theories is tenable, because if they were true then a lot more
 Tom> things besides uuid sorting would be falling over.

memcmp() returns an int, and guarantees only the sign of the result, so
((int32) memcmp()) may have the wrong value if int is wider than int32.

But yeah, it seems unlikely that it would break for uuid but not bytea
(or text in collate C).

-- 
Andrew (irc:RhodiumToad)



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-09-29 Thread Tom Lane
Andrew Gierth  writes:
> Because int being more than 32 bits is the simplest explanation for this
> difference.

Curious to hear your reasoning behind that statement?  I hadn't gotten
further than "memcmp is broken" ... and neither of those theories is
tenable, because if they were true then a lot more things besides uuid
sorting would be falling over.

regards, tom lane



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-09-29 Thread Andrew Gierth
> "Andrew" == Andrew Dunstan  writes:

 >> What is the size of a C "int" on this platform?

 Andrew> 4.

Hmm.

Because int being more than 32 bits is the simplest explanation for this
difference.

How about the output of this query:

with d(a) as (values ('----'::uuid),
 ('----'::uuid),
 ('3f3e3c3b-3a30-3938-3736-353433a2313e'::uuid))
  select d1.a, d2.a, uuid_cmp(d1.a,d2.a) from d d1, d d2
   order by d1.a asc, d2.a desc;

-- 
Andrew (irc:RhodiumToad)



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-09-29 Thread Andrew Dunstan




On 09/29/2018 01:36 AM, Andrew Gierth wrote:

  Mark> What should I try next?

What is the size of a C "int" on this platform?



4.

cheers

andrew

--
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-09-28 Thread Andrew Gierth
> "Mark" == Mark Wong  writes:

 Mark> What should I try next?

What is the size of a C "int" on this platform?

-- 
Andrew (irc:RhodiumToad)



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-09-28 Thread Mark Wong
Hi Andres,

On Fri, Sep 28, 2018 at 03:41:27PM -0700, Andres Freund wrote:
> On 2018-09-28 15:22:23 -0700, Mark Wong wrote:
> > On Fri, Sep 28, 2018 at 11:52:15AM -0700, Andres Freund wrote:
> > > Mark, is there anything odd for specific branches?
> > 
> > No... I don't have anything in the config that would be applied to
> > specific branches...
> 
> Could you perhaps do some manual debugging on that machine?
> 
> Maybe starting with manually running something like:
> 
> SELECT uuid_cmp('----'::uuid, 
> '----'::uuid);
> SELECT uuid_cmp('----'::uuid, 
> '----'::uuid);
> SELECT uuid_cmp('----'::uuid, 
> '1113----'::uuid);
> SELECT uuid_cmp('----'::uuid, 
> '1110----'::uuid);
> 
> on both master and one of the failing branches?

I've attached the output for head and the 9.4 stable branch.  It appears
they are returning the same results.

I built them both by:

CC=/usr/bin/clang ./configure --enable-cassert --enable-debug \
--enable-nls --with-perl --with-python --with-tcl \
--with-tclconfig=/usr/lib64 --with-gssapi --with-openssl \
--with-ldap --with-libxml --with-libxslt

What should I try next?

Regards,
Mark

--
Mark Wonghttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services
psql (12devel)
Type "help" for help.

postgres=# SELECT uuid_cmp('----'::uuid, 
'----'::uuid);
  uuid_cmp
-
 -2147483648
(1 row)

postgres=# SELECT uuid_cmp('----'::uuid, 
'----'::uuid);
 uuid_cmp
--
0
(1 row)

postgres=# SELECT uuid_cmp('----'::uuid, 
'1113----'::uuid);
  uuid_cmp
-
 -2147483648
(1 row)

postgres=# SELECT uuid_cmp('----'::uuid, 
'1110----'::uuid);
 uuid_cmp
--
1
(1 row)

psql (9.4.19)
Type "help" for help.

postgres=# SELECT uuid_cmp('----'::uuid, 
'----'::uuid);
  uuid_cmp
-
 -2147483648
(1 row)

postgres=# SELECT uuid_cmp('----'::uuid, 
'----'::uuid);
 uuid_cmp
--
0
(1 row)

postgres=# SELECT uuid_cmp('----'::uuid, 
'1113----'::uuid);
  uuid_cmp
-
 -2147483648
(1 row)

postgres=# SELECT uuid_cmp('----'::uuid, 
'1110----'::uuid);
 uuid_cmp
--
1
(1 row)




Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-09-28 Thread Andres Freund
Hi,

On 2018-09-28 15:22:23 -0700, Mark Wong wrote:
> On Fri, Sep 28, 2018 at 11:52:15AM -0700, Andres Freund wrote:
> > Mark, is there anything odd for specific branches?
> 
> No... I don't have anything in the config that would be applied to
> specific branches...

Could you perhaps do some manual debugging on that machine?

Maybe starting with manually running something like:

SELECT uuid_cmp('----'::uuid, 
'----'::uuid);
SELECT uuid_cmp('----'::uuid, 
'----'::uuid);
SELECT uuid_cmp('----'::uuid, 
'1113----'::uuid);
SELECT uuid_cmp('----'::uuid, 
'1110----'::uuid);

on both master and one of the failing branches?

Greetings,

Andres Freund



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-09-28 Thread Mark Wong
On Fri, Sep 28, 2018 at 11:52:15AM -0700, Andres Freund wrote:
> Mark, is there anything odd for specific branches?

No... I don't have anything in the config that would be applied to
specific branches...

Regards,
Mark

-- 
Mark Wonghttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services



Re: Odd 9.4, 9.3 buildfarm failure on s390x

2018-09-28 Thread Tom Lane
Andres Freund  writes:
> It seems Mark started a new buildfarm animal on s390x. It shows a pretty
> odd failure on 9.3 and 9.4, but *not* on newer animals:

No, lumpsucker is showing the same failure on 9.5 as well.  I suspect
that the reason 9.6 and up are OK is that 9.6 is where we introduced
the abbreviated-sort-key machinery.  IOW, the problem exists in the
old-style UUID sort comparator but not the new one.  Which is pretty
darn odd, because the old-style comparator is just memcmp().  How
could that be broken without causing lots more issues?

regards, tom lane



Odd 9.4, 9.3 buildfarm failure on s390x

2018-09-28 Thread Andres Freund
Hi,

It seems Mark started a new buildfarm animal on s390x. It shows a pretty
odd failure on 9.3 and 9.4, but *not* on newer animals:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lumpsucker&dt=2018-09-26%2020%3A30%3A58

== pgsql.build/src/test/regress/regression.diffs 
===
*** 
/home/linux1/build-farm-8-clang/buildroot/REL9_4_STABLE/pgsql.build/src/test/regress/expected/uuid.out
  Mon Sep 24 17:49:30 2018
--- 
/home/linux1/build-farm-8-clang/buildroot/REL9_4_STABLE/pgsql.build/src/test/regress/results/uuid.out
   Wed Sep 26 16:31:31 2018
***
*** 64,72 
  SELECT guid_field FROM guid1 ORDER BY guid_field DESC;
guid_field
  --
-  3f3e3c3b-3a30-3938-3736-353433a2313e
-  ----
   ----
  (3 rows)

  -- = operator test
--- 64,72 
  SELECT guid_field FROM guid1 ORDER BY guid_field DESC;
guid_field
  --
   ----
+  ----
+  3f3e3c3b-3a30-3938-3736-353433a2313e
  (3 rows)

  -- = operator test

==

Mark, is there anything odd for specific branches?

I don't see anything immediately suspicious in the relevant comparator
code...

Greetings,

Andres Freund