Re: [PATCH] Add function to_oct

2023-08-23 Thread Nathan Bossart
On Tue, Aug 22, 2023 at 04:57:02PM +0200, Peter Eisentraut wrote:
> On 22.08.23 16:26, Nathan Bossart wrote:
>> > I don't understand the reason for this handling of negative values.  I 
>> > would
>> > expect that, say, to_hex(-1234) would return '-' || to_hex(1234).
>> For this patch set, I was trying to retain the current behavior, which is
>> to return the two's complement representation.  I'm open to changing this
>> if there's agreement on what the output should be.
> 
> Ah, if there is existing behavior then we should probably keep it.

Committed.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Add function to_oct

2023-08-22 Thread Peter Eisentraut

On 22.08.23 16:26, Nathan Bossart wrote:

I don't understand the reason for this handling of negative values.  I would
expect that, say, to_hex(-1234) would return '-' || to_hex(1234).

For this patch set, I was trying to retain the current behavior, which is
to return the two's complement representation.  I'm open to changing this
if there's agreement on what the output should be.


Ah, if there is existing behavior then we should probably keep it.




Re: [PATCH] Add function to_oct

2023-08-22 Thread Nathan Bossart
On Tue, Aug 22, 2023 at 04:20:02PM +0200, Peter Eisentraut wrote:
> On 20.08.23 17:25, Nathan Bossart wrote:
>> > Doing a quick test, shows that this changes the current behaviour,
>> > because all inputs are now treated as 64-bit:
>> > 
>> > HEAD:
>> > 
>> > select to_hex((-1234)::int);
>> >to_hex
>> > --
>> >   fb2e
>> > 
>> > With patch:
>> > 
>> > select to_hex((-1234)::int);
>> >to_hex
>> > --
>> >   fb2e
>> Good catch.  In v8, I fixed this by first casting the input to uint32 for
>> the 32-bit versions of the functions.  This prevents the conversion to
>> uint64 from setting the rest of the bits.  AFAICT this behavior is pretty
>> well defined in the standard.
> 
> What standard?

C99

> I don't understand the reason for this handling of negative values.  I would
> expect that, say, to_hex(-1234) would return '-' || to_hex(1234).

For this patch set, I was trying to retain the current behavior, which is
to return the two's complement representation.  I'm open to changing this
if there's agreement on what the output should be.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Add function to_oct

2023-08-22 Thread Peter Eisentraut

On 20.08.23 17:25, Nathan Bossart wrote:

Doing a quick test, shows that this changes the current behaviour,
because all inputs are now treated as 64-bit:

HEAD:

select to_hex((-1234)::int);
   to_hex
--
  fb2e

With patch:

select to_hex((-1234)::int);
   to_hex
--
  fb2e

Good catch.  In v8, I fixed this by first casting the input to uint32 for
the 32-bit versions of the functions.  This prevents the conversion to
uint64 from setting the rest of the bits.  AFAICT this behavior is pretty
well defined in the standard.


What standard?

I don't understand the reason for this handling of negative values.  I 
would expect that, say, to_hex(-1234) would return '-' || to_hex(1234).






Re: [PATCH] Add function to_oct

2023-08-21 Thread John Naylor
On Tue, Aug 22, 2023 at 3:10 AM Dean Rasheed 
wrote:
>
> OK, cool. This looks good to me.

LGTM too.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: [PATCH] Add function to_oct

2023-08-21 Thread Dean Rasheed
On Mon, 21 Aug 2023 at 20:15, Nathan Bossart  wrote:
>
> I added some examples for negative inputs.
>
> I renamed it to to_bin().
>
> I reordered the functions in the docs.
>

OK, cool. This looks good to me.

Regards,
Dean




Re: [PATCH] Add function to_oct

2023-08-21 Thread Nathan Bossart
On Mon, Aug 21, 2023 at 09:31:37AM +0100, Dean Rasheed wrote:
> Hmm, I think just including the doc text update, without the examples
> of positive and negative inputs, might not be sufficient to make the
> meaning clear to everyone.

I added some examples for negative inputs.

> Something else that bothers me slightly is the function naming --
> "hexadecimal" gets abbreviated to "hex", "octal" gets abbreviated to
> "oct", but "binary" is left as-is. I think it ought to be "to_bin()"
> on consistency grounds, even though I understand the words "to bin"
> could be interpreted differently. (Looking elsewhere for precedents,
> Python has bin(), oct() and hex() functions.)

I renamed it to to_bin().

> Also, I think the convention is to always list functions
> alphabetically, so to_oct() should really come after to_hex().

I reordered the functions in the docs.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 131ee7fcfc62040572425cf1a090c0a6967d7559 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Mon, 21 Aug 2023 11:29:51 -0700
Subject: [PATCH v9 1/1] add to_bin() and to_oct()

---
 doc/src/sgml/func.sgml| 59 +-
 src/backend/utils/adt/varlena.c   | 86 +++
 src/include/catalog/pg_proc.dat   | 12 
 src/test/regress/expected/strings.out | 62 ++-
 src/test/regress/sql/strings.sql  | 15 -
 5 files changed, 204 insertions(+), 30 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index be2f54c914..7a0d4b9134 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3737,6 +3737,32 @@ repeat('Pg', 4) PgPgPgPg

   
 
+  
+   
+
+ to_bin
+
+to_bin ( integer )
+text
+   
+   
+to_bin ( bigint )
+text
+   
+   
+Converts the number to its equivalent two's complement binary
+representation.
+   
+   
+to_bin(2147483647)
+111
+   
+   
+to_bin(-1234)
+101100101110
+   
+  
+
   

 
@@ -3750,11 +3776,42 @@ repeat('Pg', 4) PgPgPgPg
 text


-Converts the number to its equivalent hexadecimal representation.
+Converts the number to its equivalent two's complement hexadecimal
+representation.


 to_hex(2147483647)
 7fff
+   
+   
+to_hex(-1234)
+fb2e
+   
+  
+
+  
+   
+
+ to_oct
+
+to_oct ( integer )
+text
+   
+   
+to_oct ( bigint )
+text
+   
+   
+Converts the number to its equivalent two's complement octal
+representation.
+   
+   
+to_oct(2147483647)
+177
+   
+   
+to_oct(-1234)
+3775456

   
 
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index b1ec5c32ce..72e1e24fe0 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -4919,53 +4919,87 @@ array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v,
 	return result;
 }
 
-#define HEXBASE 16
 /*
- * Convert an int32 to a string containing a base 16 (hex) representation of
- * the number.
+ * Workhorse for to_bin, to_oct, and to_hex.  Note that base must be > 1 and <=
+ * 16.
  */
-Datum
-to_hex32(PG_FUNCTION_ARGS)
+static inline text *
+convert_to_base(uint64 value, int base)
 {
-	uint32		value = (uint32) PG_GETARG_INT32(0);
-	char	   *ptr;
 	const char *digits = "0123456789abcdef";
-	char		buf[32];		/* bigger than needed, but reasonable */
 
-	ptr = buf + sizeof(buf) - 1;
-	*ptr = '\0';
+	/* We size the buffer for to_bin's longest possible return value. */
+	char		buf[sizeof(uint64) * BITS_PER_BYTE];
+	char	   *const end = buf + sizeof(buf);
+	char	   *ptr = end;
+
+	Assert(base > 1);
+	Assert(base <= 16);
 
 	do
 	{
-		*--ptr = digits[value % HEXBASE];
-		value /= HEXBASE;
+		*--ptr = digits[value % base];
+		value /= base;
 	} while (ptr > buf && value);
 
-	PG_RETURN_TEXT_P(cstring_to_text(ptr));
+	return cstring_to_text_with_len(ptr, end - ptr);
+}
+
+/*
+ * Convert an integer to a string containing a base-2 (binary) representation
+ * of the number.
+ */
+Datum
+to_bin32(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint32) PG_GETARG_INT32(0);
+
+	PG_RETURN_TEXT_P(convert_to_base(value, 2));
+}
+Datum
+to_bin64(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT64(0);
+
+	PG_RETURN_TEXT_P(convert_to_base(value, 2));
 }
 
 /*
- * Convert an int64 to a string containing a base 16 (hex) representation of
+ * Convert an integer to a string containing a base-8 (oct) representation of
  * the number.
  */
 Datum
-to_hex64(PG_FUNCTION_ARGS)
+to_oct32(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint32) PG_GETARG_INT32(0);
+
+	PG_RETURN_TEXT_P(con

Re: [PATCH] Add function to_oct

2023-08-21 Thread Dean Rasheed
On Sun, 20 Aug 2023 at 16:25, Nathan Bossart  wrote:
>
> On Sat, Aug 19, 2023 at 08:35:46AM +0100, Dean Rasheed wrote:
>
> > The way that negative inputs are handled really should be documented,
> > or at least it should include a couple of examples.
>
> I used your suggestion and noted that the output is the two's complement
> representation [0].
>

Hmm, I think just including the doc text update, without the examples
of positive and negative inputs, might not be sufficient to make the
meaning clear to everyone.

Something else that bothers me slightly is the function naming --
"hexadecimal" gets abbreviated to "hex", "octal" gets abbreviated to
"oct", but "binary" is left as-is. I think it ought to be "to_bin()"
on consistency grounds, even though I understand the words "to bin"
could be interpreted differently. (Looking elsewhere for precedents,
Python has bin(), oct() and hex() functions.)

Also, I think the convention is to always list functions
alphabetically, so to_oct() should really come after to_hex().

Regards,
Dean




Re: [PATCH] Add function to_oct

2023-08-20 Thread Nathan Bossart
On Sat, Aug 19, 2023 at 08:35:46AM +0100, Dean Rasheed wrote:
> I note that there are no tests for negative inputs.

I added some in v8.

> Doing a quick test, shows that this changes the current behaviour,
> because all inputs are now treated as 64-bit:
> 
> HEAD:
> 
> select to_hex((-1234)::int);
>   to_hex
> --
>  fb2e
> 
> With patch:
> 
> select to_hex((-1234)::int);
>   to_hex
> --
>  fb2e

Good catch.  In v8, I fixed this by first casting the input to uint32 for
the 32-bit versions of the functions.  This prevents the conversion to
uint64 from setting the rest of the bits.  AFAICT this behavior is pretty
well defined in the standard.

> The way that negative inputs are handled really should be documented,
> or at least it should include a couple of examples.

I used your suggestion and noted that the output is the two's complement
representation [0].

[0] 
https://postgr.es/m/CAEZATCVbkL1ynqpsKiTDpch34%3DSCr5nnau%3DnfNmiy2nM3SJHtw%40mail.gmail.com

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From a03c06d0b26ccbd49bda55c2efab565ab9fa90db Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 25 Jul 2023 16:09:01 -0700
Subject: [PATCH v8 1/1] add to_binary() and to_oct()

---
 doc/src/sgml/func.sgml| 47 ++-
 src/backend/utils/adt/varlena.c   | 86 +++
 src/include/catalog/pg_proc.dat   | 12 
 src/test/regress/expected/strings.out | 62 ++-
 src/test/regress/sql/strings.sql  | 15 -
 5 files changed, 192 insertions(+), 30 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index be2f54c914..835c08905e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3737,6 +3737,50 @@ repeat('Pg', 4) PgPgPgPg

   
 
+  
+   
+
+ to_binary
+
+to_binary ( integer )
+text
+   
+   
+to_binary ( bigint )
+text
+   
+   
+Converts the number to its equivalent two's complement binary
+representation.
+   
+   
+to_binary(2147483647)
+111
+   
+  
+
+  
+   
+
+ to_oct
+
+to_oct ( integer )
+text
+   
+   
+to_oct ( bigint )
+text
+   
+   
+Converts the number to its equivalent two's complement octal
+representation.
+   
+   
+to_oct(2147483647)
+177
+   
+  
+
   

 
@@ -3750,7 +3794,8 @@ repeat('Pg', 4) PgPgPgPg
 text


-Converts the number to its equivalent hexadecimal representation.
+Converts the number to its equivalent two's complement hexadecimal
+representation.


 to_hex(2147483647)
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index b1ec5c32ce..cac3577480 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -4919,53 +4919,87 @@ array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v,
 	return result;
 }
 
-#define HEXBASE 16
 /*
- * Convert an int32 to a string containing a base 16 (hex) representation of
- * the number.
+ * Workhorse for to_binary, to_oct, and to_hex.  Note that base must be > 1 and
+ * <= 16.
  */
-Datum
-to_hex32(PG_FUNCTION_ARGS)
+static inline text *
+convert_to_base(uint64 value, int base)
 {
-	uint32		value = (uint32) PG_GETARG_INT32(0);
-	char	   *ptr;
 	const char *digits = "0123456789abcdef";
-	char		buf[32];		/* bigger than needed, but reasonable */
 
-	ptr = buf + sizeof(buf) - 1;
-	*ptr = '\0';
+	/* We size the buffer for to_binary's longest possible return value. */
+	char		buf[sizeof(uint64) * BITS_PER_BYTE];
+	char	   *const end = buf + sizeof(buf);
+	char	   *ptr = end;
+
+	Assert(base > 1);
+	Assert(base <= 16);
 
 	do
 	{
-		*--ptr = digits[value % HEXBASE];
-		value /= HEXBASE;
+		*--ptr = digits[value % base];
+		value /= base;
 	} while (ptr > buf && value);
 
-	PG_RETURN_TEXT_P(cstring_to_text(ptr));
+	return cstring_to_text_with_len(ptr, end - ptr);
+}
+
+/*
+ * Convert an integer to a string containing a base-2 (binary) representation
+ * of the number.
+ */
+Datum
+to_binary32(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint32) PG_GETARG_INT32(0);
+
+	PG_RETURN_TEXT_P(convert_to_base(value, 2));
+}
+Datum
+to_binary64(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT64(0);
+
+	PG_RETURN_TEXT_P(convert_to_base(value, 2));
 }
 
 /*
- * Convert an int64 to a string containing a base 16 (hex) representation of
+ * Convert an integer to a string containing a base-8 (oct) representation of
  * the number.
  */
 Datum
-to_hex64(PG_FUNCTION_ARGS)
+to_oct32(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint32) PG_GETARG_INT32(0);
+
+	PG_RETURN_TEXT_P(convert_to_base(value, 8));
+}
+Datum
+to_oct64(PG_FUNCTION_ARGS)
 {
 	uint64		

Re: [PATCH] Add function to_oct

2023-08-20 Thread Nathan Bossart
On Sat, Aug 19, 2023 at 11:41:56AM +0700, John Naylor wrote:
> This looks nicer, but still doesn't save the starting pointer, and so needs
> to lug around that big honking macro. This is what I mean:
> 
> static inline text *
> convert_to_base(uint64 value, int base)
> {
>   const char *digits = "0123456789abcdef";
>   /* We size the buffer for to_binary's longest possible return value. */
>   charbuf[sizeof(uint64) * BITS_PER_BYTE];
>   char * const end =  buf + sizeof(buf);
>   char *ptr = end;
> 
>   Assert(base > 1);
>   Assert(base <= 16);
> 
>   do
>   {
> *--ptr = digits[value % base];
> value /= base;
>   } while (ptr > buf && value);
> 
>   return cstring_to_text_with_len(ptr,  end - ptr);
> }

I will use this in v8.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Add function to_oct

2023-08-19 Thread Dean Rasheed
On Thu, 17 Aug 2023 at 16:26, Nathan Bossart  wrote:
>
> Works for me.  I did it that way in v7.
>

I note that there are no tests for negative inputs.

Doing a quick test, shows that this changes the current behaviour,
because all inputs are now treated as 64-bit:

HEAD:

select to_hex((-1234)::int);
  to_hex
--
 fb2e

With patch:

select to_hex((-1234)::int);
  to_hex
--
 fb2e

The way that negative inputs are handled really should be documented,
or at least it should include a couple of examples.

Regards,
Dean




Re: [PATCH] Add function to_oct

2023-08-18 Thread John Naylor
On Thu, Aug 17, 2023 at 10:26 PM Nathan Bossart 
wrote:
>
> On Thu, Aug 17, 2023 at 12:35:54PM +0700, John Naylor wrote:
> > That makes it a lexically-scoped global variable, which we don't need
> > either. Can we have the internal function allocate on the stack, then
> > call cstring_to_text() on that, returning the text result? That does its
> > own palloc.
> >
> > Or maybe better, save the starting pointer, compute the length at the
end,
> > and call cstring_to_text_with_len()?  (It seems we wouldn't need
> > the nul-terminator then, either.)
>
> Works for me.  I did it that way in v7.

This looks nicer, but still doesn't save the starting pointer, and so needs
to lug around that big honking macro. This is what I mean:

static inline text *
convert_to_base(uint64 value, int base)
{
  const char *digits = "0123456789abcdef";
  /* We size the buffer for to_binary's longest possible return value. */
  charbuf[sizeof(uint64) * BITS_PER_BYTE];
  char * const end =  buf + sizeof(buf);
  char *ptr = end;

  Assert(base > 1);
  Assert(base <= 16);

  do
  {
*--ptr = digits[value % base];
value /= base;
  } while (ptr > buf && value);

  return cstring_to_text_with_len(ptr,  end - ptr);
}

--
John Naylor
EDB: http://www.enterprisedb.com


Re: [PATCH] Add function to_oct

2023-08-17 Thread Nathan Bossart
On Thu, Aug 17, 2023 at 12:35:54PM +0700, John Naylor wrote:
> That makes it a lexically-scoped global variable, which we don't need
> either. Can we have the internal function allocate on the stack, then
> call cstring_to_text() on that, returning the text result? That does its
> own palloc.
> 
> Or maybe better, save the starting pointer, compute the length at the end,
> and call cstring_to_text_with_len()?  (It seems we wouldn't need
> the nul-terminator then, either.)

Works for me.  I did it that way in v7.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From b7eccfe08150ec90ca4b48fe69c2b37453c633b2 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 25 Jul 2023 16:09:01 -0700
Subject: [PATCH v7 1/1] add to_binary() and to_oct()

---
 doc/src/sgml/func.sgml| 42 
 src/backend/utils/adt/varlena.c   | 92 +++
 src/include/catalog/pg_proc.dat   | 12 
 src/test/regress/expected/strings.out | 26 +++-
 src/test/regress/sql/strings.sql  |  9 ++-
 5 files changed, 153 insertions(+), 28 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index be2f54c914..23b5ac7457 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3737,6 +3737,48 @@ repeat('Pg', 4) PgPgPgPg

   
 
+  
+   
+
+ to_binary
+
+to_binary ( integer )
+text
+   
+   
+to_binary ( bigint )
+text
+   
+   
+Converts the number to its equivalent binary representation.
+   
+   
+to_binary(2147483647)
+111
+   
+  
+
+  
+   
+
+ to_oct
+
+to_oct ( integer )
+text
+   
+   
+to_oct ( bigint )
+text
+   
+   
+Converts the number to its equivalent octal representation.
+   
+   
+to_oct(2147483647)
+177
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index b1ec5c32ce..58c0cbcdd8 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -4919,53 +4919,95 @@ array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v,
 	return result;
 }
 
-#define HEXBASE 16
 /*
- * Convert an int32 to a string containing a base 16 (hex) representation of
- * the number.
+ * We size the buffer for to_binary's longest possible return value.
  */
-Datum
-to_hex32(PG_FUNCTION_ARGS)
+#define CONVERT_TO_BASE_BUFSIZE		(sizeof(uint64) * BITS_PER_BYTE)
+
+/*
+ * Workhorse for to_binary, to_oct, and to_hex.  Note that base must be > 1 and
+ * <= 16.
+ */
+static inline text *
+convert_to_base(uint64 value, int base)
 {
-	uint32		value = (uint32) PG_GETARG_INT32(0);
 	char	   *ptr;
 	const char *digits = "0123456789abcdef";
-	char		buf[32];		/* bigger than needed, but reasonable */
+	char		buf[CONVERT_TO_BASE_BUFSIZE];
+	int			len;
+
+	Assert(base > 1);
+	Assert(base <= 16);
 
-	ptr = buf + sizeof(buf) - 1;
-	*ptr = '\0';
+	ptr = buf + sizeof(buf);
 
 	do
 	{
-		*--ptr = digits[value % HEXBASE];
-		value /= HEXBASE;
+		*--ptr = digits[value % base];
+		value /= base;
 	} while (ptr > buf && value);
 
-	PG_RETURN_TEXT_P(cstring_to_text(ptr));
+	len = CONVERT_TO_BASE_BUFSIZE - (ptr - buf);
+	return cstring_to_text_with_len(ptr, len);
+}
+
+#undef CONVERT_TO_BASE_BUFSIZE
+
+/*
+ * Convert an integer to a string containing a base-2 (binary) representation
+ * of the number.
+ */
+Datum
+to_binary32(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT32(0);
+
+	PG_RETURN_TEXT_P(convert_to_base(value, 2));
+}
+Datum
+to_binary64(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT64(0);
+
+	PG_RETURN_TEXT_P(convert_to_base(value, 2));
 }
 
 /*
- * Convert an int64 to a string containing a base 16 (hex) representation of
+ * Convert an integer to a string containing a base-8 (oct) representation of
  * the number.
  */
 Datum
-to_hex64(PG_FUNCTION_ARGS)
+to_oct32(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT32(0);
+
+	PG_RETURN_TEXT_P(convert_to_base(value, 8));
+}
+Datum
+to_oct64(PG_FUNCTION_ARGS)
 {
 	uint64		value = (uint64) PG_GETARG_INT64(0);
-	char	   *ptr;
-	const char *digits = "0123456789abcdef";
-	char		buf[32];		/* bigger than needed, but reasonable */
 
-	ptr = buf + sizeof(buf) - 1;
-	*ptr = '\0';
+	PG_RETURN_TEXT_P(convert_to_base(value, 8));
+}
 
-	do
-	{
-		*--ptr = digits[value % HEXBASE];
-		value /= HEXBASE;
-	} while (ptr > buf && value);
+/*
+ * Convert an integer to a string containing a base-16 (hex) representation of
+ * the number.
+ */
+Datum
+to_hex32(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT32(0);
+
+	PG_RETURN_TEXT_P(convert_to_base(value, 16));
+}
+Datum
+to_hex64(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT64(0);
 
-	PG_RETURN_TEXT_P(cstring_to_text(ptr));
+	PG_RETURN_TEXT_P(c

Re: [PATCH] Add function to_oct

2023-08-16 Thread John Naylor
On Wed, Aug 16, 2023 at 9:24 PM Nathan Bossart 
wrote:
>
> On Wed, Aug 16, 2023 at 10:35:27AM +0700, John Naylor wrote:

> > Now I'm struggling to understand why each and every instance has its own
> > nominal buffer, passed down to the implementation. All we care about is
the
> > result -- is there some reason not to confine the buffer declaration to
the
> > general implementation?
>
> We can do that if we use a static variable, which is what I've done in v6.

That makes it a lexically-scoped global variable, which we don't need
either. Can we have the internal function allocate on the stack, then
call cstring_to_text() on that, returning the text result? That does its
own palloc.

Or maybe better, save the starting pointer, compute the length at the end,
and call cstring_to_text_with_len()?  (It seems we wouldn't need
the nul-terminator then, either.)

--
John Naylor
EDB: http://www.enterprisedb.com


Re: [PATCH] Add function to_oct

2023-08-16 Thread Nathan Bossart
On Wed, Aug 16, 2023 at 10:35:27AM +0700, John Naylor wrote:
> ```
> *ptr = '\0';
> 
> do
> ```
> 
> to
> 
> ```
> *ptr = '\0';
> do
> ```

Oh, I misunderstood.  I thought you meant that there might be a whitespace
change on that line, not the surrounding ones.  This is fixed in v6.

> Now I'm struggling to understand why each and every instance has its own
> nominal buffer, passed down to the implementation. All we care about is the
> result -- is there some reason not to confine the buffer declaration to the
> general implementation?

We can do that if we use a static variable, which is what I've done in v6.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From dce8c16d55f1fe91747be45abe155b5d7d1f4274 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 25 Jul 2023 16:09:01 -0700
Subject: [PATCH v6 1/1] add to_binary() and to_oct()

---
 doc/src/sgml/func.sgml| 42 +
 src/backend/utils/adt/varlena.c   | 90 ---
 src/include/catalog/pg_proc.dat   | 12 
 src/test/regress/expected/strings.out | 26 +++-
 src/test/regress/sql/strings.sql  |  9 ++-
 5 files changed, 153 insertions(+), 26 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index be2f54c914..23b5ac7457 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3737,6 +3737,48 @@ repeat('Pg', 4) PgPgPgPg

   
 
+  
+   
+
+ to_binary
+
+to_binary ( integer )
+text
+   
+   
+to_binary ( bigint )
+text
+   
+   
+Converts the number to its equivalent binary representation.
+   
+   
+to_binary(2147483647)
+111
+   
+  
+
+  
+   
+
+ to_oct
+
+to_oct ( integer )
+text
+   
+   
+to_oct ( bigint )
+text
+   
+   
+Converts the number to its equivalent octal representation.
+   
+   
+to_oct(2147483647)
+177
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index b1ec5c32ce..2330f86c09 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -4919,53 +4919,97 @@ array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v,
 	return result;
 }
 
-#define HEXBASE 16
 /*
- * Convert an int32 to a string containing a base 16 (hex) representation of
- * the number.
+ * Workhorse for to_binary, to_oct, and to_hex.  Note that base must be > 1 and
+ * <= 16.
  */
-Datum
-to_hex32(PG_FUNCTION_ARGS)
+static inline char *
+convert_to_base(uint64 value, int base)
 {
-	uint32		value = (uint32) PG_GETARG_INT32(0);
+	/*
+	 * We size the buffer for to_binary's longest possible return value,
+	 * including the terminating '\0'.
+	 */
+	static char buf[sizeof(uint64) * BITS_PER_BYTE + 1];
 	char	   *ptr;
 	const char *digits = "0123456789abcdef";
-	char		buf[32];		/* bigger than needed, but reasonable */
+
+	Assert(base > 1);
+	Assert(base <= 16);
 
 	ptr = buf + sizeof(buf) - 1;
 	*ptr = '\0';
 
 	do
 	{
-		*--ptr = digits[value % HEXBASE];
-		value /= HEXBASE;
+		*--ptr = digits[value % base];
+		value /= base;
 	} while (ptr > buf && value);
 
-	PG_RETURN_TEXT_P(cstring_to_text(ptr));
+	return ptr;
 }
 
 /*
- * Convert an int64 to a string containing a base 16 (hex) representation of
+ * Convert an integer to a string containing a base-2 (binary) representation
+ * of the number.
+ */
+Datum
+to_binary32(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT32(0);
+	char	   *result = convert_to_base(value, 2);
+
+	PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+Datum
+to_binary64(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT64(0);
+	char	   *result = convert_to_base(value, 2);
+
+	PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/*
+ * Convert an integer to a string containing a base-8 (oct) representation of
  * the number.
  */
 Datum
-to_hex64(PG_FUNCTION_ARGS)
+to_oct32(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT32(0);
+	char	   *result = convert_to_base(value, 8);
+
+	PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+Datum
+to_oct64(PG_FUNCTION_ARGS)
 {
 	uint64		value = (uint64) PG_GETARG_INT64(0);
-	char	   *ptr;
-	const char *digits = "0123456789abcdef";
-	char		buf[32];		/* bigger than needed, but reasonable */
+	char	   *result = convert_to_base(value, 8);
 
-	ptr = buf + sizeof(buf) - 1;
-	*ptr = '\0';
+	PG_RETURN_TEXT_P(cstring_to_text(result));
+}
 
-	do
-	{
-		*--ptr = digits[value % HEXBASE];
-		value /= HEXBASE;
-	} while (ptr > buf && value);
+/*
+ * Convert an integer to a string containing a base-16 (hex) representation of
+ * the number.
+ */
+Datum
+to_hex32(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT32(0);
+	char	   *result = convert_to_base(value, 16);
+
+	PG_RETURN_TEXT_P(cstring_to_text(

Re: [PATCH] Add function to_oct

2023-08-15 Thread John Naylor
On Wed, Aug 16, 2023 at 12:17 AM Nathan Bossart 
wrote:
>
> On Tue, Aug 15, 2023 at 01:53:25PM +0700, John Naylor wrote:

> > - *ptr = '\0';
> > + Assert(base == 2 || base == 8 || base == 16);
> >
> > + *ptr = '\0';
> >
> > Spurious whitespace change?
>
> I think this might just be a weird artifact of the diff algorithm.

Don't believe everything you think. :-)

```
*ptr = '\0';

do
```

to

```
*ptr = '\0';
do
```

> > - char buf[32]; /* bigger than needed, but reasonable */
> > + char   *buf = palloc(sizeof(uint64) * BITS_PER_BYTE + 1);
> >
> > Why is this no longer allocated on the stack? Maybe needs a comment
about
> > the size calculation.
>
> It really should be.  IIRC I wanted to avoid passing a pre-allocated
buffer
> to convert_to_base(), but I don't remember why.  I fixed this in v5.

Now I'm struggling to understand why each and every instance has its own
nominal buffer, passed down to the implementation. All we care about is the
result -- is there some reason not to confine the buffer declaration to the
general implementation?

--
John Naylor
EDB: http://www.enterprisedb.com


Re: [PATCH] Add function to_oct

2023-08-15 Thread Nathan Bossart
On Tue, Aug 15, 2023 at 01:53:25PM +0700, John Naylor wrote:
> If we're not going to have a general SQL conversion function, here are some
> comments on the current patch.

I appreciate the review.

> +static char *convert_to_base(uint64 value, int base);
> 
> Not needed if the definition is above the callers.

Done.

> + * Workhorse for to_binary, to_oct, and to_hex.  Note that base must be
> either
> + * 2, 8, or 16.
> 
> Why wouldn't it work with any base <= 16?

You're right.  I changed this in v5.

> - *ptr = '\0';
> + Assert(base == 2 || base == 8 || base == 16);
> 
> + *ptr = '\0';
> 
> Spurious whitespace change?

I think this might just be a weird artifact of the diff algorithm.

> - char buf[32]; /* bigger than needed, but reasonable */
> + char   *buf = palloc(sizeof(uint64) * BITS_PER_BYTE + 1);
> 
> Why is this no longer allocated on the stack? Maybe needs a comment about
> the size calculation.

It really should be.  IIRC I wanted to avoid passing a pre-allocated buffer
to convert_to_base(), but I don't remember why.  I fixed this in v5.

> +static char *
> +convert_to_base(uint64 value, int base)
> 
> On my machine this now requires a function call and a DIV instruction, even
> though the patch claims not to support anything but a power of two. It's
> tiny enough to declare it inline so the compiler can specialize for each
> call site.

This was on my list of things to check before committing.  I assumed that
it would be automatically inlined, but given your analysis, I went ahead
and added the inline keyword.  My compiler took the hint and inlined the
function, and it used SHR instead of DIV instructions.  The machine code
for to_hex32/64 is still a couple of instructions longer than before
(probably because of the uint64 casts), but I don't know if we need to
worry about micro-optimizing these functions any further.

> +{ oid => '5101', descr => 'convert int4 number to binary',
> 
> Needs to be over 8000.

Done.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 2ad83f8781a013ef5e5e5a62a422f3b73c6c5f2d Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 25 Jul 2023 16:09:01 -0700
Subject: [PATCH v5 1/1] add to_binary() and to_oct()

---
 doc/src/sgml/func.sgml|  42 +++
 src/backend/utils/adt/varlena.c   | 103 +++---
 src/include/catalog/pg_proc.dat   |  12 +++
 src/test/regress/expected/strings.out |  26 ++-
 src/test/regress/sql/strings.sql  |   9 ++-
 5 files changed, 163 insertions(+), 29 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index be2f54c914..23b5ac7457 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3737,6 +3737,48 @@ repeat('Pg', 4) PgPgPgPg

   
 
+  
+   
+
+ to_binary
+
+to_binary ( integer )
+text
+   
+   
+to_binary ( bigint )
+text
+   
+   
+Converts the number to its equivalent binary representation.
+   
+   
+to_binary(2147483647)
+111
+   
+  
+
+  
+   
+
+ to_oct
+
+to_oct ( integer )
+text
+   
+   
+to_oct ( bigint )
+text
+   
+   
+Converts the number to its equivalent octal representation.
+   
+   
+to_oct(2147483647)
+177
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index b1ec5c32ce..b955e54611 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -4919,54 +4919,105 @@ array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v,
 	return result;
 }
 
-#define HEXBASE 16
 /*
- * Convert an int32 to a string containing a base 16 (hex) representation of
- * the number.
+ * We size the buffer for to_binary's longest possible return value, including
+ * the terminating '\0'.
  */
-Datum
-to_hex32(PG_FUNCTION_ARGS)
+#define CONVERT_TO_BASE_BUFSIZE		(sizeof(uint64) * BITS_PER_BYTE + 1)
+
+/*
+ * Workhorse for to_binary, to_oct, and to_hex.  Note that base must be > 1 and
+ * <= 16.  buf must be at least CONVERT_TO_BASE_BUFSIZE bytes.
+ */
+static inline char *
+convert_to_base(uint64 value, int base, char *buf)
 {
-	uint32		value = (uint32) PG_GETARG_INT32(0);
-	char	   *ptr;
 	const char *digits = "0123456789abcdef";
-	char		buf[32];		/* bigger than needed, but reasonable */
+	char	   *ptr = buf + CONVERT_TO_BASE_BUFSIZE - 1;
 
-	ptr = buf + sizeof(buf) - 1;
-	*ptr = '\0';
+	Assert(base > 1);
+	Assert(base <= 16);
 
+	*ptr = '\0';
 	do
 	{
-		*--ptr = digits[value % HEXBASE];
-		value /= HEXBASE;
+		*--ptr = digits[value % base];
+		value /= base;
 	} while (ptr > buf && value);
 
-	PG_RETURN_TEXT_P(cstring_to_text(ptr));
+	return ptr;
+}
+
+/*
+ * Convert an integer to a string containing a base-2 (binary) representation
+ * of the number.
+ 

Re: [PATCH] Add function to_oct

2023-08-15 Thread Nathan Bossart
On Tue, Aug 15, 2023 at 07:58:17AM +0200, Vik Fearing wrote:
> On 8/15/23 06:11, Nathan Bossart wrote:
>> If there are no objections, I'd like to commit this patch soon.
> 
> I just took a look at this (and the rest of the thread).  I am a little bit
> disappointed that we don't have a generic base conversion function, but even
> if we did I think these specialized functions would still be useful.
> 
> No objection from me.

Thanks for taking a look.  I don't mean for this to preclude a generic base
conversion function that would supersede the functions added by this patch.
However, I didn't want to hold up $SUBJECT because of something that may or
may not happen after lots of discussion.  If it does happen within the v17
development cycle, we could probably just remove to_oct/binary.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Add function to_oct

2023-08-14 Thread John Naylor
On Tue, Jul 25, 2023 at 05:16:56PM -0700, Nathan Bossart wrote:
> [v4]

If we're not going to have a general SQL conversion function, here are some
comments on the current patch.

+static char *convert_to_base(uint64 value, int base);

Not needed if the definition is above the callers.

+ * Workhorse for to_binary, to_oct, and to_hex.  Note that base must be
either
+ * 2, 8, or 16.

Why wouldn't it work with any base <= 16?

- *ptr = '\0';
+ Assert(base == 2 || base == 8 || base == 16);

+ *ptr = '\0';

Spurious whitespace change?

- char buf[32]; /* bigger than needed, but reasonable */
+ char   *buf = palloc(sizeof(uint64) * BITS_PER_BYTE + 1);

Why is this no longer allocated on the stack? Maybe needs a comment about
the size calculation.

+static char *
+convert_to_base(uint64 value, int base)

On my machine this now requires a function call and a DIV instruction, even
though the patch claims not to support anything but a power of two. It's
tiny enough to declare it inline so the compiler can specialize for each
call site.

+{ oid => '5101', descr => 'convert int4 number to binary',

Needs to be over 8000.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: [PATCH] Add function to_oct

2023-08-14 Thread Vik Fearing

On 8/15/23 06:11, Nathan Bossart wrote:

On Tue, Jul 25, 2023 at 08:29:17PM -0700, Nathan Bossart wrote:

Here's a new version of the patch with the silly mistakes fixed.


If there are no objections, I'd like to commit this patch soon.


I just took a look at this (and the rest of the thread).  I am a little 
bit disappointed that we don't have a generic base conversion function, 
but even if we did I think these specialized functions would still be 
useful.


No objection from me.
--
Vik Fearing





Re: [PATCH] Add function to_oct

2023-08-14 Thread Nathan Bossart
On Tue, Jul 25, 2023 at 08:29:17PM -0700, Nathan Bossart wrote:
> Here's a new version of the patch with the silly mistakes fixed.

If there are no objections, I'd like to commit this patch soon.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Add function to_oct

2023-07-25 Thread Nathan Bossart
On Tue, Jul 25, 2023 at 05:16:56PM -0700, Nathan Bossart wrote:
> On Tue, Jul 25, 2023 at 04:24:26PM -0700, Nathan Bossart wrote:
>> I started taking a look at this and ended up adding to_binary() and a
>> bigint version of to_oct() for completeness.  While I was at it, I moved
>> the base-conversion logic out to a separate static function that
>> to_binary/oct/hex all use.
> 
> Bleh, this patch seems to fail horribly on 32-bit builds.  I'll look into
> it soon.

Here's a new version of the patch with the silly mistakes fixed.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 1a729185b7fb77dbfee10f9a715cef6c3cd7e74b Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 25 Jul 2023 16:09:01 -0700
Subject: [PATCH v4 1/1] add to_binary() and to_oct()

---
 doc/src/sgml/func.sgml| 42 +
 src/backend/utils/adt/varlena.c   | 86 +++
 src/include/catalog/pg_proc.dat   | 12 
 src/test/regress/expected/strings.out | 26 +++-
 src/test/regress/sql/strings.sql  |  9 ++-
 5 files changed, 148 insertions(+), 27 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index b94827674c..79334ea145 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3737,6 +3737,48 @@ repeat('Pg', 4) PgPgPgPg

   
 
+  
+   
+
+ to_binary
+
+to_binary ( integer )
+text
+   
+   
+to_binary ( bigint )
+text
+   
+   
+Converts the number to its equivalent binary representation.
+   
+   
+to_binary(2147483647)
+111
+   
+  
+
+  
+   
+
+ to_oct
+
+to_oct ( integer )
+text
+   
+   
+to_oct ( bigint )
+text
+   
+   
+Converts the number to its equivalent octal representation.
+   
+   
+to_oct(2147483647)
+177
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index b1ec5c32ce..b8903f338f 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -165,6 +165,7 @@ static void text_format_string_conversion(StringInfo buf, char conversion,
 		  int flags, int width);
 static void text_format_append_string(StringInfo buf, const char *str,
 	  int flags, int width);
+static char *convert_to_base(uint64 value, int base);
 
 
 /*
@@ -4919,53 +4920,90 @@ array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v,
 	return result;
 }
 
-#define HEXBASE 16
 /*
- * Convert an int32 to a string containing a base 16 (hex) representation of
- * the number.
+ * Convert an integer to a string containing a base-2 (binary) representation
+ * of the number.
  */
 Datum
-to_hex32(PG_FUNCTION_ARGS)
+to_binary32(PG_FUNCTION_ARGS)
 {
-	uint32		value = (uint32) PG_GETARG_INT32(0);
-	char	   *ptr;
-	const char *digits = "0123456789abcdef";
-	char		buf[32];		/* bigger than needed, but reasonable */
+	uint64		value = (uint64) PG_GETARG_INT32(0);
+	char	   *result = convert_to_base(value, 2);
 
-	ptr = buf + sizeof(buf) - 1;
-	*ptr = '\0';
+	PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+Datum
+to_binary64(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT64(0);
+	char	   *result = convert_to_base(value, 2);
 
-	do
-	{
-		*--ptr = digits[value % HEXBASE];
-		value /= HEXBASE;
-	} while (ptr > buf && value);
+	PG_RETURN_TEXT_P(cstring_to_text(result));
+}
 
-	PG_RETURN_TEXT_P(cstring_to_text(ptr));
+/*
+ * Convert an integer to a string containing a base-8 (oct) representation of
+ * the number.
+ */
+Datum
+to_oct32(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT32(0);
+	char	   *result = convert_to_base(value, 8);
+
+	PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+Datum
+to_oct64(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT64(0);
+	char	   *result = convert_to_base(value, 8);
+
+	PG_RETURN_TEXT_P(cstring_to_text(result));
 }
 
 /*
- * Convert an int64 to a string containing a base 16 (hex) representation of
+ * Convert an integer to a string containing a base-16 (hex) representation of
  * the number.
  */
 Datum
+to_hex32(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT32(0);
+	char	   *result = convert_to_base(value, 16);
+
+	PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+Datum
 to_hex64(PG_FUNCTION_ARGS)
 {
 	uint64		value = (uint64) PG_GETARG_INT64(0);
-	char	   *ptr;
+	char	   *result = convert_to_base(value, 16);
+
+	PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/*
+ * Workhorse for to_binary, to_oct, and to_hex.  Note that base must be either
+ * 2, 8, or 16.
+ */
+static char *
+convert_to_base(uint64 value, int base)
+{
 	const char *digits = "0123456789abcdef";
-	char		buf[32];		/* bigger than needed, but r

Re: [PATCH] Add function to_oct

2023-07-25 Thread Nathan Bossart
On Tue, Jul 25, 2023 at 04:24:26PM -0700, Nathan Bossart wrote:
> I started taking a look at this and ended up adding to_binary() and a
> bigint version of to_oct() for completeness.  While I was at it, I moved
> the base-conversion logic out to a separate static function that
> to_binary/oct/hex all use.

Bleh, this patch seems to fail horribly on 32-bit builds.  I'll look into
it soon.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Add function to_oct

2023-07-25 Thread Nathan Bossart
On Tue, Apr 04, 2023 at 08:45:36PM -0400, Kirk Wolak wrote:
> Marked Ready for Committer.

I started taking a look at this and ended up adding to_binary() and a
bigint version of to_oct() for completeness.  While I was at it, I moved
the base-conversion logic out to a separate static function that
to_binary/oct/hex all use.

>From the other discussion referenced upthread, it sounds like we might want
to replace to_binary/oct/hex with a more generic base-conversion function.
Maybe we should try to do that instead.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From a8cac3f3de2394a6128515b6efc2ef3d0a92bce5 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 25 Jul 2023 16:09:01 -0700
Subject: [PATCH v3 1/1] add to_binary() and to_oct()

---
 doc/src/sgml/func.sgml| 42 +
 src/backend/utils/adt/varlena.c   | 86 +++
 src/include/catalog/pg_proc.dat   | 12 
 src/test/regress/expected/strings.out | 26 +++-
 src/test/regress/sql/strings.sql  |  9 ++-
 5 files changed, 148 insertions(+), 27 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index b94827674c..79334ea145 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3737,6 +3737,48 @@ repeat('Pg', 4) PgPgPgPg

   
 
+  
+   
+
+ to_binary
+
+to_binary ( integer )
+text
+   
+   
+to_binary ( bigint )
+text
+   
+   
+Converts the number to its equivalent binary representation.
+   
+   
+to_binary(2147483647)
+111
+   
+  
+
+  
+   
+
+ to_oct
+
+to_oct ( integer )
+text
+   
+   
+to_oct ( bigint )
+text
+   
+   
+Converts the number to its equivalent octal representation.
+   
+   
+to_oct(2147483647)
+177
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index b1ec5c32ce..b14f995fad 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -165,6 +165,7 @@ static void text_format_string_conversion(StringInfo buf, char conversion,
 		  int flags, int width);
 static void text_format_append_string(StringInfo buf, const char *str,
 	  int flags, int width);
+static char *convert_to_base(uint64 value, int base);
 
 
 /*
@@ -4919,53 +4920,90 @@ array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v,
 	return result;
 }
 
-#define HEXBASE 16
 /*
- * Convert an int32 to a string containing a base 16 (hex) representation of
- * the number.
+ * Convert an integer to a string containing a base-2 (binary) representation
+ * of the number.
  */
 Datum
-to_hex32(PG_FUNCTION_ARGS)
+to_binary32(PG_FUNCTION_ARGS)
 {
-	uint32		value = (uint32) PG_GETARG_INT32(0);
-	char	   *ptr;
-	const char *digits = "0123456789abcdef";
-	char		buf[32];		/* bigger than needed, but reasonable */
+	uint64		value = (uint64) PG_GETARG_INT32(0);
+	char	   *result = convert_to_base(value, 2);
 
-	ptr = buf + sizeof(buf) - 1;
-	*ptr = '\0';
+	PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+Datum
+to_binary64(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT64(0);
+	char	   *result = convert_to_base(value, 2);
 
-	do
-	{
-		*--ptr = digits[value % HEXBASE];
-		value /= HEXBASE;
-	} while (ptr > buf && value);
+	PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/*
+ * Convert an integer to a string containing a base-8 (oct) representation of
+ * the number.
+ */
+Datum
+to_oct32(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT32(0);
+	char	   *result = convert_to_base(value, 8);
+
+	PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+Datum
+to_oct64(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT64(0);
+	char	   *result = convert_to_base(value, 8);
 
-	PG_RETURN_TEXT_P(cstring_to_text(ptr));
+	PG_RETURN_TEXT_P(cstring_to_text(result));
 }
 
 /*
- * Convert an int64 to a string containing a base 16 (hex) representation of
+ * Convert an integer to a string containing a base-16 (hex) representation of
  * the number.
  */
 Datum
+to_hex32(PG_FUNCTION_ARGS)
+{
+	uint64		value = (uint64) PG_GETARG_INT64(0);
+	char	   *result = convert_to_base(value, 16);
+
+	PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+Datum
 to_hex64(PG_FUNCTION_ARGS)
 {
 	uint64		value = (uint64) PG_GETARG_INT64(0);
-	char	   *ptr;
+	char	   *result = convert_to_base(value, 16);
+
+	PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
+/*
+ * Workhorse for to_binary, to_oct, and to_hex.  Note that base must be either
+ * 2, 8, or 16.
+ */
+static char *
+convert_to_base(uint64 value, int base)
+{
 	const char *digits = "0123456789abcdef";
-	char		buf[32];		/* bigger than needed, but reasonable */
+

Re: [PATCH] Add function to_oct

2023-04-04 Thread Kirk Wolak
On Thu, Feb 23, 2023 at 6:32 AM Peter Eisentraut <
peter.eisentr...@enterprisedb.com> wrote:

> On 20.12.22 23:08, Eric Radman wrote:
> > This patch is a new function based on the implementation of to_hex(int).
> >
> > Since support for octal integer literals was added, to_oct(int) allows
> > octal values to be easily stored and returned in query results.
> >
> >to_oct(0o755) = '755'
> >
> > This is probably most useful for storing file system permissions.
>
> Note this subsequent discussion about the to_hex function:
>
> https://www.postgresql.org/message-id/flat/CAEZATCVbkL1ynqpsKiTDpch34%3DSCr5nnau%3DnfNmiy2nM3SJHtw%40mail.gmail.com
>
> Also, I think there is no "to binary" function, so perhaps if we're
> going down this road one way or the other, we should probably complete
> the set.
>
> The code reads clearly.  It works as expected (being an old PDP-11
guy!)... And the docs make sense and build as well.
Nothing larger than an int gets in.  I was "missing" the bigint version,
but read through ALL of the messages to see (and agree)
That that's okay.
Marked Ready for Committer.

Thanks, Kirk


Re: [PATCH] Add function to_oct

2023-02-23 Thread Peter Eisentraut

On 20.12.22 23:08, Eric Radman wrote:

This patch is a new function based on the implementation of to_hex(int).

Since support for octal integer literals was added, to_oct(int) allows
octal values to be easily stored and returned in query results.

   to_oct(0o755) = '755'

This is probably most useful for storing file system permissions.


Note this subsequent discussion about the to_hex function: 
https://www.postgresql.org/message-id/flat/CAEZATCVbkL1ynqpsKiTDpch34%3DSCr5nnau%3DnfNmiy2nM3SJHtw%40mail.gmail.com


Also, I think there is no "to binary" function, so perhaps if we're 
going down this road one way or the other, we should probably complete 
the set.






Re: [PATCH] Add function to_oct

2023-01-07 Thread Tom Lane
vignesh C  writes:
> Few suggestions
> 1) We could use to_oct instead of to_oct32 as we don't have multiple
> implementations for to_oct

That seems (a) shortsighted and (b) inconsistent with the naming
pattern used for to_hex, so I doubt it'd be an improvement.

regards, tom lane




Re: [PATCH] Add function to_oct

2023-01-07 Thread vignesh C
On Thu, 22 Dec 2022 at 23:11, Eric Radman  wrote:
>
> On Thu, Dec 22, 2022 at 10:08:17AM +, Dag Lem wrote:
> >
> > The calculation of quotient and remainder can be replaced by less costly 
> > masking and shifting.
> >
> > Defining
> >
> > #define OCT_DIGIT_BITS 3
> > #define OCT_DIGIT_BITMASK 0x7
> >
> > the content of the loop can be replaced by
> >
> >   *--ptr = digits[value & OCT_DIGIT_BITMASK];
> >   value >>= OCT_DIGIT_BITS;
> >
> > Also, the check for ptr > buf in the while loop can be removed. The
> > check is superfluous, since buf cannot possibly be exhausted by a 32
> > bit integer (the maximum octal number being 377).
>
> I integrated these suggestions in the attached -v2 patch and tested
> range of values manually.
>
> Also picked an OID > 8000 as suggested by unused_oids.

Few suggestions
1) We could use to_oct instead of to_oct32 as we don't have multiple
implementations for to_oct
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 98d90d9338..fde0b24563 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -3687,6 +3687,9 @@
 { oid => '2090', descr => 'convert int8 number to hex',
   proname => 'to_hex', prorettype => 'text', proargtypes => 'int8',
   prosrc => 'to_hex64' },
+{ oid => '8335', descr => 'convert int4 number to oct',
+  proname => 'to_oct', prorettype => 'text', proargtypes => 'int4',
+  prosrc => 'to_oct32' },

2) Similarly we could change this to "to_oct"
+/*
+ * Convert an int32 to a string containing a base 8 (oct) representation of
+ * the number.
+ */
+Datum
+to_oct32(PG_FUNCTION_ARGS)
+{
+   uint32  value = (uint32) PG_GETARG_INT32(0);

3) I'm not sure if  AS "" is required, but I also noticed it
is done similarly in to_hex too:
+--
+-- test to_oct
+--
+select to_oct(256*256*256 - 1) AS "";
+ 
+--
+ 
+(1 row)

4) You could add a commit message for the patch

Regards,
Vignesh




Re: [PATCH] Add function to_oct

2022-12-22 Thread Eric Radman
On Thu, Dec 22, 2022 at 10:08:17AM +, Dag Lem wrote:
> 
> The calculation of quotient and remainder can be replaced by less costly 
> masking and shifting.
> 
> Defining
> 
> #define OCT_DIGIT_BITS 3
> #define OCT_DIGIT_BITMASK 0x7
> 
> the content of the loop can be replaced by
> 
>   *--ptr = digits[value & OCT_DIGIT_BITMASK];
>   value >>= OCT_DIGIT_BITS;
> 
> Also, the check for ptr > buf in the while loop can be removed. The
> check is superfluous, since buf cannot possibly be exhausted by a 32
> bit integer (the maximum octal number being 377).

I integrated these suggestions in the attached -v2 patch and tested
range of values manually.

Also picked an OID > 8000 as suggested by unused_oids.

..Eric
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1f63dc6dba..b539e0dc0b 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3699,6 +3699,23 @@ repeat('Pg', 4) PgPgPgPg

   
 
+  
+   
+
+ to_oct
+
+to_oct ( integer )
+text
+   
+   
+Converts the number to its equivalent octal representation.
+   
+   
+to_oct(32)
+40
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 1c52deec55..586693ad8f 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -5229,6 +5229,32 @@ to_hex64(PG_FUNCTION_ARGS)
PG_RETURN_TEXT_P(cstring_to_text(ptr));
 }
 
+#define OCT_DIGIT_BITS 3
+#define OCT_DIGIT_BITMASK 0x7
+/*
+ * Convert an int32 to a string containing a base 8 (oct) representation of
+ * the number.
+ */
+Datum
+to_oct32(PG_FUNCTION_ARGS)
+{
+   uint32  value = (uint32) PG_GETARG_INT32(0);
+   char*ptr;
+   const char *digits = "01234567";
+   charbuf[32];/* bigger than needed, but 
reasonable */
+
+   ptr = buf + sizeof(buf) - 1;
+   *ptr = '\0';
+
+   do
+   {
+   *--ptr = digits[value & OCT_DIGIT_BITMASK];
+   value >>= OCT_DIGIT_BITS;
+   } while (value);
+
+   PG_RETURN_TEXT_P(cstring_to_text(ptr));
+}
+
 /*
  * Return the size of a datum, possibly compressed
  *
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 98d90d9338..fde0b24563 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -3687,6 +3687,9 @@
 { oid => '2090', descr => 'convert int8 number to hex',
   proname => 'to_hex', prorettype => 'text', proargtypes => 'int8',
   prosrc => 'to_hex64' },
+{ oid => '8335', descr => 'convert int4 number to oct',
+  proname => 'to_oct', prorettype => 'text', proargtypes => 'int4',
+  prosrc => 'to_oct32' },
 
 # for character set encoding support
 
diff --git a/src/test/regress/expected/strings.out 
b/src/test/regress/expected/strings.out
index f028c1f10f..5ebaeb4a80 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -2143,6 +2143,15 @@ select 
to_hex(256::bigint*256::bigint*256::bigint*256::bigint - 1) AS ""
  
 (1 row)
 
+--
+-- test to_oct
+--
+select to_oct(256*256*256 - 1) AS "";
+  
+--
+ 
+(1 row)
+
 --
 -- SHA-2
 --
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index 932f71cbca..c4c40949e7 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -691,6 +691,11 @@ select to_hex(256*256*256 - 1) AS "ff";
 
 select to_hex(256::bigint*256::bigint*256::bigint*256::bigint - 1) AS 
"";
 
+--
+-- test to_oct
+--
+select to_oct(256*256*256 - 1) AS "";
+
 --
 -- SHA-2
 --


Re: [PATCH] Add function to_oct

2022-12-22 Thread Dag Lem
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

This is a mini-review of to_oct :-)

The function seems useful to me, and is obviously correct.

I don't know whether optimization at such a low level is considered in 
PostgreSQL, but here goes.

The calculation of quotient and remainder can be replaced by less costly 
masking and shifting.

Defining

#define OCT_DIGIT_BITS 3
#define OCT_DIGIT_BITMASK 0x7

the content of the loop can be replaced by

*--ptr = digits[value & OCT_DIGIT_BITMASK];
value >>= OCT_DIGIT_BITS;

Also, the check for ptr > buf in the while loop can be removed. The check is 
superfluous, since buf cannot possibly be exhausted by a 32 bit integer (the 
maximum octal number being 377).


Best regards

Dag Lem

The new status of this patch is: Waiting on Author


Re: [PATCH] Add function to_oct

2022-12-20 Thread Ian Lawrence Barwick
2022年12月21日(水) 10:42 Eric Radman :
>
> On Wed, Dec 21, 2022 at 08:36:40AM +0900, Ian Lawrence Barwick wrote:
> > 2022年12月21日(水) 7:08 Eric Radman :>
> > > Hello!
> > >
> > > This patch is a new function based on the implementation of to_hex(int).
> > >
> > > Since support for octal integer literals was added, to_oct(int) allows
> > > octal values to be easily stored and returned in query results.
> > >
> > >   to_oct(0o755) = '755'
> > >
> > > This is probably most useful for storing file system permissions.
> >
> > Seems like it would be convenient to have. Any reason why there's
> > no matching "to_oct(bigint)" version?
>
> I couldn't think of a reason someone might want an octal
> representation of a bigint.  Certainly it would be easy to add
> if there is value in supporting all of the same argument types.

Yeah, I am struggling to think of a practical application other than
symmetry with to_hex().


Regards

Ian Barwick




Re: [PATCH] Add function to_oct

2022-12-20 Thread Eric Radman
On Wed, Dec 21, 2022 at 08:36:40AM +0900, Ian Lawrence Barwick wrote:
> 2022年12月21日(水) 7:08 Eric Radman :>
> > Hello!
> >
> > This patch is a new function based on the implementation of to_hex(int).
> >
> > Since support for octal integer literals was added, to_oct(int) allows
> > octal values to be easily stored and returned in query results.
> >
> >   to_oct(0o755) = '755'
> >
> > This is probably most useful for storing file system permissions.
> 
> Seems like it would be convenient to have. Any reason why there's
> no matching "to_oct(bigint)" version?

I couldn't think of a reason someone might want an octal
representation of a bigint.  Certainly it would be easy to add
if there is value in supporting all of the same argument types.




Re: [PATCH] Add function to_oct

2022-12-20 Thread Ian Lawrence Barwick
2022年12月21日(水) 7:08 Eric Radman :>
> Hello!
>
> This patch is a new function based on the implementation of to_hex(int).
>
> Since support for octal integer literals was added, to_oct(int) allows
> octal values to be easily stored and returned in query results.
>
>   to_oct(0o755) = '755'
>
> This is probably most useful for storing file system permissions.

Seems like it would be convenient to have. Any reason why there's
no matching "to_oct(bigint)" version?

Patch has been added to the next commitfest [1], thanks.

[1] https://commitfest.postgresql.org/41/4071/

Regards

Ian Barwick




[PATCH] Add function to_oct

2022-12-20 Thread Eric Radman
Hello!

This patch is a new function based on the implementation of to_hex(int).

Since support for octal integer literals was added, to_oct(int) allows
octal values to be easily stored and returned in query results.

  to_oct(0o755) = '755'

This is probably most useful for storing file system permissions.

--
Eric Radman
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1f63dc6dba..b539e0dc0b 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3699,6 +3699,23 @@ repeat('Pg', 4) PgPgPgPg

   
 
+  
+   
+
+ to_oct
+
+to_oct ( integer )
+text
+   
+   
+Converts the number to its equivalent octal representation.
+   
+   
+to_oct(32)
+40
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 1c52deec55..16929ddeed 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -5229,6 +5229,31 @@ to_hex64(PG_FUNCTION_ARGS)
PG_RETURN_TEXT_P(cstring_to_text(ptr));
 }
 
+#define OCTBASE 8
+/*
+ * Convert an int32 to a string containing a base 8 (oct) representation of
+ * the number.
+ */
+Datum
+to_oct32(PG_FUNCTION_ARGS)
+{
+   uint32  value = (uint32) PG_GETARG_INT32(0);
+   char   *ptr;
+   const char *digits = "01234567";
+   charbuf[32];/* bigger than needed, but 
reasonable */
+
+   ptr = buf + sizeof(buf) - 1;
+   *ptr = '\0';
+
+   do
+   {
+   *--ptr = digits[value % OCTBASE];
+   value /= OCTBASE;
+   } while (ptr > buf && value);
+
+   PG_RETURN_TEXT_P(cstring_to_text(ptr));
+}
+
 /*
  * Return the size of a datum, possibly compressed
  *
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 98d90d9338..824df6fdc8 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -3687,6 +3687,9 @@
 { oid => '2090', descr => 'convert int8 number to hex',
   proname => 'to_hex', prorettype => 'text', proargtypes => 'int8',
   prosrc => 'to_hex64' },
+{ oid => '2173', descr => 'convert int4 number to oct',
+  proname => 'to_oct', prorettype => 'text', proargtypes => 'int4',
+  prosrc => 'to_oct32' },
 
 # for character set encoding support
 
diff --git a/src/test/regress/expected/strings.out 
b/src/test/regress/expected/strings.out
index f028c1f10f..5ebaeb4a80 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -2143,6 +2143,15 @@ select 
to_hex(256::bigint*256::bigint*256::bigint*256::bigint - 1) AS ""
  
 (1 row)
 
+--
+-- test to_oct
+--
+select to_oct(256*256*256 - 1) AS "";
+  
+--
+ 
+(1 row)
+
 --
 -- SHA-2
 --
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index 932f71cbca..c4c40949e7 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -691,6 +691,11 @@ select to_hex(256*256*256 - 1) AS "ff";
 
 select to_hex(256::bigint*256::bigint*256::bigint*256::bigint - 1) AS 
"";
 
+--
+-- test to_oct
+--
+select to_oct(256*256*256 - 1) AS "";
+
 --
 -- SHA-2
 --