Re: [PATCH] Add function to_oct
On Tue, Aug 22, 2023 at 04:57:02PM +0200, Peter Eisentraut wrote: > On 22.08.23 16:26, Nathan Bossart wrote: >> > I don't understand the reason for this handling of negative values. I >> > would >> > expect that, say, to_hex(-1234) would return '-' || to_hex(1234). >> For this patch set, I was trying to retain the current behavior, which is >> to return the two's complement representation. I'm open to changing this >> if there's agreement on what the output should be. > > Ah, if there is existing behavior then we should probably keep it. Committed. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: [PATCH] Add function to_oct
On 22.08.23 16:26, Nathan Bossart wrote: I don't understand the reason for this handling of negative values. I would expect that, say, to_hex(-1234) would return '-' || to_hex(1234). For this patch set, I was trying to retain the current behavior, which is to return the two's complement representation. I'm open to changing this if there's agreement on what the output should be. Ah, if there is existing behavior then we should probably keep it.
Re: [PATCH] Add function to_oct
On Tue, Aug 22, 2023 at 04:20:02PM +0200, Peter Eisentraut wrote: > On 20.08.23 17:25, Nathan Bossart wrote: >> > Doing a quick test, shows that this changes the current behaviour, >> > because all inputs are now treated as 64-bit: >> > >> > HEAD: >> > >> > select to_hex((-1234)::int); >> >to_hex >> > -- >> > fb2e >> > >> > With patch: >> > >> > select to_hex((-1234)::int); >> >to_hex >> > -- >> > fb2e >> Good catch. In v8, I fixed this by first casting the input to uint32 for >> the 32-bit versions of the functions. This prevents the conversion to >> uint64 from setting the rest of the bits. AFAICT this behavior is pretty >> well defined in the standard. > > What standard? C99 > I don't understand the reason for this handling of negative values. I would > expect that, say, to_hex(-1234) would return '-' || to_hex(1234). For this patch set, I was trying to retain the current behavior, which is to return the two's complement representation. I'm open to changing this if there's agreement on what the output should be. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: [PATCH] Add function to_oct
On 20.08.23 17:25, Nathan Bossart wrote: Doing a quick test, shows that this changes the current behaviour, because all inputs are now treated as 64-bit: HEAD: select to_hex((-1234)::int); to_hex -- fb2e With patch: select to_hex((-1234)::int); to_hex -- fb2e Good catch. In v8, I fixed this by first casting the input to uint32 for the 32-bit versions of the functions. This prevents the conversion to uint64 from setting the rest of the bits. AFAICT this behavior is pretty well defined in the standard. What standard? I don't understand the reason for this handling of negative values. I would expect that, say, to_hex(-1234) would return '-' || to_hex(1234).
Re: [PATCH] Add function to_oct
On Tue, Aug 22, 2023 at 3:10 AM Dean Rasheed wrote: > > OK, cool. This looks good to me. LGTM too. -- John Naylor EDB: http://www.enterprisedb.com
Re: [PATCH] Add function to_oct
On Mon, 21 Aug 2023 at 20:15, Nathan Bossart wrote: > > I added some examples for negative inputs. > > I renamed it to to_bin(). > > I reordered the functions in the docs. > OK, cool. This looks good to me. Regards, Dean
Re: [PATCH] Add function to_oct
On Mon, Aug 21, 2023 at 09:31:37AM +0100, Dean Rasheed wrote: > Hmm, I think just including the doc text update, without the examples > of positive and negative inputs, might not be sufficient to make the > meaning clear to everyone. I added some examples for negative inputs. > Something else that bothers me slightly is the function naming -- > "hexadecimal" gets abbreviated to "hex", "octal" gets abbreviated to > "oct", but "binary" is left as-is. I think it ought to be "to_bin()" > on consistency grounds, even though I understand the words "to bin" > could be interpreted differently. (Looking elsewhere for precedents, > Python has bin(), oct() and hex() functions.) I renamed it to to_bin(). > Also, I think the convention is to always list functions > alphabetically, so to_oct() should really come after to_hex(). I reordered the functions in the docs. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From 131ee7fcfc62040572425cf1a090c0a6967d7559 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Mon, 21 Aug 2023 11:29:51 -0700 Subject: [PATCH v9 1/1] add to_bin() and to_oct() --- doc/src/sgml/func.sgml| 59 +- src/backend/utils/adt/varlena.c | 86 +++ src/include/catalog/pg_proc.dat | 12 src/test/regress/expected/strings.out | 62 ++- src/test/regress/sql/strings.sql | 15 - 5 files changed, 204 insertions(+), 30 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index be2f54c914..7a0d4b9134 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -3737,6 +3737,32 @@ repeat('Pg', 4) PgPgPgPg + + + + to_bin + +to_bin ( integer ) +text + + +to_bin ( bigint ) +text + + +Converts the number to its equivalent two's complement binary +representation. + + +to_bin(2147483647) +111 + + +to_bin(-1234) +101100101110 + + + @@ -3750,11 +3776,42 @@ repeat('Pg', 4) PgPgPgPg text -Converts the number to its equivalent hexadecimal representation. +Converts the number to its equivalent two's complement hexadecimal +representation. to_hex(2147483647) 7fff + + +to_hex(-1234) +fb2e + + + + + + + to_oct + +to_oct ( integer ) +text + + +to_oct ( bigint ) +text + + +Converts the number to its equivalent two's complement octal +representation. + + +to_oct(2147483647) +177 + + +to_oct(-1234) +3775456 diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index b1ec5c32ce..72e1e24fe0 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -4919,53 +4919,87 @@ array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v, return result; } -#define HEXBASE 16 /* - * Convert an int32 to a string containing a base 16 (hex) representation of - * the number. + * Workhorse for to_bin, to_oct, and to_hex. Note that base must be > 1 and <= + * 16. */ -Datum -to_hex32(PG_FUNCTION_ARGS) +static inline text * +convert_to_base(uint64 value, int base) { - uint32 value = (uint32) PG_GETARG_INT32(0); - char *ptr; const char *digits = "0123456789abcdef"; - char buf[32]; /* bigger than needed, but reasonable */ - ptr = buf + sizeof(buf) - 1; - *ptr = '\0'; + /* We size the buffer for to_bin's longest possible return value. */ + char buf[sizeof(uint64) * BITS_PER_BYTE]; + char *const end = buf + sizeof(buf); + char *ptr = end; + + Assert(base > 1); + Assert(base <= 16); do { - *--ptr = digits[value % HEXBASE]; - value /= HEXBASE; + *--ptr = digits[value % base]; + value /= base; } while (ptr > buf && value); - PG_RETURN_TEXT_P(cstring_to_text(ptr)); + return cstring_to_text_with_len(ptr, end - ptr); +} + +/* + * Convert an integer to a string containing a base-2 (binary) representation + * of the number. + */ +Datum +to_bin32(PG_FUNCTION_ARGS) +{ + uint64 value = (uint32) PG_GETARG_INT32(0); + + PG_RETURN_TEXT_P(convert_to_base(value, 2)); +} +Datum +to_bin64(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT64(0); + + PG_RETURN_TEXT_P(convert_to_base(value, 2)); } /* - * Convert an int64 to a string containing a base 16 (hex) representation of + * Convert an integer to a string containing a base-8 (oct) representation of * the number. */ Datum -to_hex64(PG_FUNCTION_ARGS) +to_oct32(PG_FUNCTION_ARGS) +{ + uint64 value = (uint32) PG_GETARG_INT32(0); + + PG_RETURN_TEXT_P(con
Re: [PATCH] Add function to_oct
On Sun, 20 Aug 2023 at 16:25, Nathan Bossart wrote: > > On Sat, Aug 19, 2023 at 08:35:46AM +0100, Dean Rasheed wrote: > > > The way that negative inputs are handled really should be documented, > > or at least it should include a couple of examples. > > I used your suggestion and noted that the output is the two's complement > representation [0]. > Hmm, I think just including the doc text update, without the examples of positive and negative inputs, might not be sufficient to make the meaning clear to everyone. Something else that bothers me slightly is the function naming -- "hexadecimal" gets abbreviated to "hex", "octal" gets abbreviated to "oct", but "binary" is left as-is. I think it ought to be "to_bin()" on consistency grounds, even though I understand the words "to bin" could be interpreted differently. (Looking elsewhere for precedents, Python has bin(), oct() and hex() functions.) Also, I think the convention is to always list functions alphabetically, so to_oct() should really come after to_hex(). Regards, Dean
Re: [PATCH] Add function to_oct
On Sat, Aug 19, 2023 at 08:35:46AM +0100, Dean Rasheed wrote: > I note that there are no tests for negative inputs. I added some in v8. > Doing a quick test, shows that this changes the current behaviour, > because all inputs are now treated as 64-bit: > > HEAD: > > select to_hex((-1234)::int); > to_hex > -- > fb2e > > With patch: > > select to_hex((-1234)::int); > to_hex > -- > fb2e Good catch. In v8, I fixed this by first casting the input to uint32 for the 32-bit versions of the functions. This prevents the conversion to uint64 from setting the rest of the bits. AFAICT this behavior is pretty well defined in the standard. > The way that negative inputs are handled really should be documented, > or at least it should include a couple of examples. I used your suggestion and noted that the output is the two's complement representation [0]. [0] https://postgr.es/m/CAEZATCVbkL1ynqpsKiTDpch34%3DSCr5nnau%3DnfNmiy2nM3SJHtw%40mail.gmail.com -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From a03c06d0b26ccbd49bda55c2efab565ab9fa90db Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Tue, 25 Jul 2023 16:09:01 -0700 Subject: [PATCH v8 1/1] add to_binary() and to_oct() --- doc/src/sgml/func.sgml| 47 ++- src/backend/utils/adt/varlena.c | 86 +++ src/include/catalog/pg_proc.dat | 12 src/test/regress/expected/strings.out | 62 ++- src/test/regress/sql/strings.sql | 15 - 5 files changed, 192 insertions(+), 30 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index be2f54c914..835c08905e 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -3737,6 +3737,50 @@ repeat('Pg', 4) PgPgPgPg + + + + to_binary + +to_binary ( integer ) +text + + +to_binary ( bigint ) +text + + +Converts the number to its equivalent two's complement binary +representation. + + +to_binary(2147483647) +111 + + + + + + + to_oct + +to_oct ( integer ) +text + + +to_oct ( bigint ) +text + + +Converts the number to its equivalent two's complement octal +representation. + + +to_oct(2147483647) +177 + + + @@ -3750,7 +3794,8 @@ repeat('Pg', 4) PgPgPgPg text -Converts the number to its equivalent hexadecimal representation. +Converts the number to its equivalent two's complement hexadecimal +representation. to_hex(2147483647) diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index b1ec5c32ce..cac3577480 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -4919,53 +4919,87 @@ array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v, return result; } -#define HEXBASE 16 /* - * Convert an int32 to a string containing a base 16 (hex) representation of - * the number. + * Workhorse for to_binary, to_oct, and to_hex. Note that base must be > 1 and + * <= 16. */ -Datum -to_hex32(PG_FUNCTION_ARGS) +static inline text * +convert_to_base(uint64 value, int base) { - uint32 value = (uint32) PG_GETARG_INT32(0); - char *ptr; const char *digits = "0123456789abcdef"; - char buf[32]; /* bigger than needed, but reasonable */ - ptr = buf + sizeof(buf) - 1; - *ptr = '\0'; + /* We size the buffer for to_binary's longest possible return value. */ + char buf[sizeof(uint64) * BITS_PER_BYTE]; + char *const end = buf + sizeof(buf); + char *ptr = end; + + Assert(base > 1); + Assert(base <= 16); do { - *--ptr = digits[value % HEXBASE]; - value /= HEXBASE; + *--ptr = digits[value % base]; + value /= base; } while (ptr > buf && value); - PG_RETURN_TEXT_P(cstring_to_text(ptr)); + return cstring_to_text_with_len(ptr, end - ptr); +} + +/* + * Convert an integer to a string containing a base-2 (binary) representation + * of the number. + */ +Datum +to_binary32(PG_FUNCTION_ARGS) +{ + uint64 value = (uint32) PG_GETARG_INT32(0); + + PG_RETURN_TEXT_P(convert_to_base(value, 2)); +} +Datum +to_binary64(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT64(0); + + PG_RETURN_TEXT_P(convert_to_base(value, 2)); } /* - * Convert an int64 to a string containing a base 16 (hex) representation of + * Convert an integer to a string containing a base-8 (oct) representation of * the number. */ Datum -to_hex64(PG_FUNCTION_ARGS) +to_oct32(PG_FUNCTION_ARGS) +{ + uint64 value = (uint32) PG_GETARG_INT32(0); + + PG_RETURN_TEXT_P(convert_to_base(value, 8)); +} +Datum +to_oct64(PG_FUNCTION_ARGS) { uint64
Re: [PATCH] Add function to_oct
On Sat, Aug 19, 2023 at 11:41:56AM +0700, John Naylor wrote: > This looks nicer, but still doesn't save the starting pointer, and so needs > to lug around that big honking macro. This is what I mean: > > static inline text * > convert_to_base(uint64 value, int base) > { > const char *digits = "0123456789abcdef"; > /* We size the buffer for to_binary's longest possible return value. */ > charbuf[sizeof(uint64) * BITS_PER_BYTE]; > char * const end = buf + sizeof(buf); > char *ptr = end; > > Assert(base > 1); > Assert(base <= 16); > > do > { > *--ptr = digits[value % base]; > value /= base; > } while (ptr > buf && value); > > return cstring_to_text_with_len(ptr, end - ptr); > } I will use this in v8. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: [PATCH] Add function to_oct
On Thu, 17 Aug 2023 at 16:26, Nathan Bossart wrote: > > Works for me. I did it that way in v7. > I note that there are no tests for negative inputs. Doing a quick test, shows that this changes the current behaviour, because all inputs are now treated as 64-bit: HEAD: select to_hex((-1234)::int); to_hex -- fb2e With patch: select to_hex((-1234)::int); to_hex -- fb2e The way that negative inputs are handled really should be documented, or at least it should include a couple of examples. Regards, Dean
Re: [PATCH] Add function to_oct
On Thu, Aug 17, 2023 at 10:26 PM Nathan Bossart wrote: > > On Thu, Aug 17, 2023 at 12:35:54PM +0700, John Naylor wrote: > > That makes it a lexically-scoped global variable, which we don't need > > either. Can we have the internal function allocate on the stack, then > > call cstring_to_text() on that, returning the text result? That does its > > own palloc. > > > > Or maybe better, save the starting pointer, compute the length at the end, > > and call cstring_to_text_with_len()? (It seems we wouldn't need > > the nul-terminator then, either.) > > Works for me. I did it that way in v7. This looks nicer, but still doesn't save the starting pointer, and so needs to lug around that big honking macro. This is what I mean: static inline text * convert_to_base(uint64 value, int base) { const char *digits = "0123456789abcdef"; /* We size the buffer for to_binary's longest possible return value. */ charbuf[sizeof(uint64) * BITS_PER_BYTE]; char * const end = buf + sizeof(buf); char *ptr = end; Assert(base > 1); Assert(base <= 16); do { *--ptr = digits[value % base]; value /= base; } while (ptr > buf && value); return cstring_to_text_with_len(ptr, end - ptr); } -- John Naylor EDB: http://www.enterprisedb.com
Re: [PATCH] Add function to_oct
On Thu, Aug 17, 2023 at 12:35:54PM +0700, John Naylor wrote: > That makes it a lexically-scoped global variable, which we don't need > either. Can we have the internal function allocate on the stack, then > call cstring_to_text() on that, returning the text result? That does its > own palloc. > > Or maybe better, save the starting pointer, compute the length at the end, > and call cstring_to_text_with_len()? (It seems we wouldn't need > the nul-terminator then, either.) Works for me. I did it that way in v7. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From b7eccfe08150ec90ca4b48fe69c2b37453c633b2 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Tue, 25 Jul 2023 16:09:01 -0700 Subject: [PATCH v7 1/1] add to_binary() and to_oct() --- doc/src/sgml/func.sgml| 42 src/backend/utils/adt/varlena.c | 92 +++ src/include/catalog/pg_proc.dat | 12 src/test/regress/expected/strings.out | 26 +++- src/test/regress/sql/strings.sql | 9 ++- 5 files changed, 153 insertions(+), 28 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index be2f54c914..23b5ac7457 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -3737,6 +3737,48 @@ repeat('Pg', 4) PgPgPgPg + + + + to_binary + +to_binary ( integer ) +text + + +to_binary ( bigint ) +text + + +Converts the number to its equivalent binary representation. + + +to_binary(2147483647) +111 + + + + + + + to_oct + +to_oct ( integer ) +text + + +to_oct ( bigint ) +text + + +Converts the number to its equivalent octal representation. + + +to_oct(2147483647) +177 + + + diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index b1ec5c32ce..58c0cbcdd8 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -4919,53 +4919,95 @@ array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v, return result; } -#define HEXBASE 16 /* - * Convert an int32 to a string containing a base 16 (hex) representation of - * the number. + * We size the buffer for to_binary's longest possible return value. */ -Datum -to_hex32(PG_FUNCTION_ARGS) +#define CONVERT_TO_BASE_BUFSIZE (sizeof(uint64) * BITS_PER_BYTE) + +/* + * Workhorse for to_binary, to_oct, and to_hex. Note that base must be > 1 and + * <= 16. + */ +static inline text * +convert_to_base(uint64 value, int base) { - uint32 value = (uint32) PG_GETARG_INT32(0); char *ptr; const char *digits = "0123456789abcdef"; - char buf[32]; /* bigger than needed, but reasonable */ + char buf[CONVERT_TO_BASE_BUFSIZE]; + int len; + + Assert(base > 1); + Assert(base <= 16); - ptr = buf + sizeof(buf) - 1; - *ptr = '\0'; + ptr = buf + sizeof(buf); do { - *--ptr = digits[value % HEXBASE]; - value /= HEXBASE; + *--ptr = digits[value % base]; + value /= base; } while (ptr > buf && value); - PG_RETURN_TEXT_P(cstring_to_text(ptr)); + len = CONVERT_TO_BASE_BUFSIZE - (ptr - buf); + return cstring_to_text_with_len(ptr, len); +} + +#undef CONVERT_TO_BASE_BUFSIZE + +/* + * Convert an integer to a string containing a base-2 (binary) representation + * of the number. + */ +Datum +to_binary32(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT32(0); + + PG_RETURN_TEXT_P(convert_to_base(value, 2)); +} +Datum +to_binary64(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT64(0); + + PG_RETURN_TEXT_P(convert_to_base(value, 2)); } /* - * Convert an int64 to a string containing a base 16 (hex) representation of + * Convert an integer to a string containing a base-8 (oct) representation of * the number. */ Datum -to_hex64(PG_FUNCTION_ARGS) +to_oct32(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT32(0); + + PG_RETURN_TEXT_P(convert_to_base(value, 8)); +} +Datum +to_oct64(PG_FUNCTION_ARGS) { uint64 value = (uint64) PG_GETARG_INT64(0); - char *ptr; - const char *digits = "0123456789abcdef"; - char buf[32]; /* bigger than needed, but reasonable */ - ptr = buf + sizeof(buf) - 1; - *ptr = '\0'; + PG_RETURN_TEXT_P(convert_to_base(value, 8)); +} - do - { - *--ptr = digits[value % HEXBASE]; - value /= HEXBASE; - } while (ptr > buf && value); +/* + * Convert an integer to a string containing a base-16 (hex) representation of + * the number. + */ +Datum +to_hex32(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT32(0); + + PG_RETURN_TEXT_P(convert_to_base(value, 16)); +} +Datum +to_hex64(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT64(0); - PG_RETURN_TEXT_P(cstring_to_text(ptr)); + PG_RETURN_TEXT_P(c
Re: [PATCH] Add function to_oct
On Wed, Aug 16, 2023 at 9:24 PM Nathan Bossart wrote: > > On Wed, Aug 16, 2023 at 10:35:27AM +0700, John Naylor wrote: > > Now I'm struggling to understand why each and every instance has its own > > nominal buffer, passed down to the implementation. All we care about is the > > result -- is there some reason not to confine the buffer declaration to the > > general implementation? > > We can do that if we use a static variable, which is what I've done in v6. That makes it a lexically-scoped global variable, which we don't need either. Can we have the internal function allocate on the stack, then call cstring_to_text() on that, returning the text result? That does its own palloc. Or maybe better, save the starting pointer, compute the length at the end, and call cstring_to_text_with_len()? (It seems we wouldn't need the nul-terminator then, either.) -- John Naylor EDB: http://www.enterprisedb.com
Re: [PATCH] Add function to_oct
On Wed, Aug 16, 2023 at 10:35:27AM +0700, John Naylor wrote: > ``` > *ptr = '\0'; > > do > ``` > > to > > ``` > *ptr = '\0'; > do > ``` Oh, I misunderstood. I thought you meant that there might be a whitespace change on that line, not the surrounding ones. This is fixed in v6. > Now I'm struggling to understand why each and every instance has its own > nominal buffer, passed down to the implementation. All we care about is the > result -- is there some reason not to confine the buffer declaration to the > general implementation? We can do that if we use a static variable, which is what I've done in v6. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From dce8c16d55f1fe91747be45abe155b5d7d1f4274 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Tue, 25 Jul 2023 16:09:01 -0700 Subject: [PATCH v6 1/1] add to_binary() and to_oct() --- doc/src/sgml/func.sgml| 42 + src/backend/utils/adt/varlena.c | 90 --- src/include/catalog/pg_proc.dat | 12 src/test/regress/expected/strings.out | 26 +++- src/test/regress/sql/strings.sql | 9 ++- 5 files changed, 153 insertions(+), 26 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index be2f54c914..23b5ac7457 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -3737,6 +3737,48 @@ repeat('Pg', 4) PgPgPgPg + + + + to_binary + +to_binary ( integer ) +text + + +to_binary ( bigint ) +text + + +Converts the number to its equivalent binary representation. + + +to_binary(2147483647) +111 + + + + + + + to_oct + +to_oct ( integer ) +text + + +to_oct ( bigint ) +text + + +Converts the number to its equivalent octal representation. + + +to_oct(2147483647) +177 + + + diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index b1ec5c32ce..2330f86c09 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -4919,53 +4919,97 @@ array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v, return result; } -#define HEXBASE 16 /* - * Convert an int32 to a string containing a base 16 (hex) representation of - * the number. + * Workhorse for to_binary, to_oct, and to_hex. Note that base must be > 1 and + * <= 16. */ -Datum -to_hex32(PG_FUNCTION_ARGS) +static inline char * +convert_to_base(uint64 value, int base) { - uint32 value = (uint32) PG_GETARG_INT32(0); + /* + * We size the buffer for to_binary's longest possible return value, + * including the terminating '\0'. + */ + static char buf[sizeof(uint64) * BITS_PER_BYTE + 1]; char *ptr; const char *digits = "0123456789abcdef"; - char buf[32]; /* bigger than needed, but reasonable */ + + Assert(base > 1); + Assert(base <= 16); ptr = buf + sizeof(buf) - 1; *ptr = '\0'; do { - *--ptr = digits[value % HEXBASE]; - value /= HEXBASE; + *--ptr = digits[value % base]; + value /= base; } while (ptr > buf && value); - PG_RETURN_TEXT_P(cstring_to_text(ptr)); + return ptr; } /* - * Convert an int64 to a string containing a base 16 (hex) representation of + * Convert an integer to a string containing a base-2 (binary) representation + * of the number. + */ +Datum +to_binary32(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT32(0); + char *result = convert_to_base(value, 2); + + PG_RETURN_TEXT_P(cstring_to_text(result)); +} +Datum +to_binary64(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT64(0); + char *result = convert_to_base(value, 2); + + PG_RETURN_TEXT_P(cstring_to_text(result)); +} + +/* + * Convert an integer to a string containing a base-8 (oct) representation of * the number. */ Datum -to_hex64(PG_FUNCTION_ARGS) +to_oct32(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT32(0); + char *result = convert_to_base(value, 8); + + PG_RETURN_TEXT_P(cstring_to_text(result)); +} +Datum +to_oct64(PG_FUNCTION_ARGS) { uint64 value = (uint64) PG_GETARG_INT64(0); - char *ptr; - const char *digits = "0123456789abcdef"; - char buf[32]; /* bigger than needed, but reasonable */ + char *result = convert_to_base(value, 8); - ptr = buf + sizeof(buf) - 1; - *ptr = '\0'; + PG_RETURN_TEXT_P(cstring_to_text(result)); +} - do - { - *--ptr = digits[value % HEXBASE]; - value /= HEXBASE; - } while (ptr > buf && value); +/* + * Convert an integer to a string containing a base-16 (hex) representation of + * the number. + */ +Datum +to_hex32(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT32(0); + char *result = convert_to_base(value, 16); + + PG_RETURN_TEXT_P(cstring_to_text(
Re: [PATCH] Add function to_oct
On Wed, Aug 16, 2023 at 12:17 AM Nathan Bossart wrote: > > On Tue, Aug 15, 2023 at 01:53:25PM +0700, John Naylor wrote: > > - *ptr = '\0'; > > + Assert(base == 2 || base == 8 || base == 16); > > > > + *ptr = '\0'; > > > > Spurious whitespace change? > > I think this might just be a weird artifact of the diff algorithm. Don't believe everything you think. :-) ``` *ptr = '\0'; do ``` to ``` *ptr = '\0'; do ``` > > - char buf[32]; /* bigger than needed, but reasonable */ > > + char *buf = palloc(sizeof(uint64) * BITS_PER_BYTE + 1); > > > > Why is this no longer allocated on the stack? Maybe needs a comment about > > the size calculation. > > It really should be. IIRC I wanted to avoid passing a pre-allocated buffer > to convert_to_base(), but I don't remember why. I fixed this in v5. Now I'm struggling to understand why each and every instance has its own nominal buffer, passed down to the implementation. All we care about is the result -- is there some reason not to confine the buffer declaration to the general implementation? -- John Naylor EDB: http://www.enterprisedb.com
Re: [PATCH] Add function to_oct
On Tue, Aug 15, 2023 at 01:53:25PM +0700, John Naylor wrote: > If we're not going to have a general SQL conversion function, here are some > comments on the current patch. I appreciate the review. > +static char *convert_to_base(uint64 value, int base); > > Not needed if the definition is above the callers. Done. > + * Workhorse for to_binary, to_oct, and to_hex. Note that base must be > either > + * 2, 8, or 16. > > Why wouldn't it work with any base <= 16? You're right. I changed this in v5. > - *ptr = '\0'; > + Assert(base == 2 || base == 8 || base == 16); > > + *ptr = '\0'; > > Spurious whitespace change? I think this might just be a weird artifact of the diff algorithm. > - char buf[32]; /* bigger than needed, but reasonable */ > + char *buf = palloc(sizeof(uint64) * BITS_PER_BYTE + 1); > > Why is this no longer allocated on the stack? Maybe needs a comment about > the size calculation. It really should be. IIRC I wanted to avoid passing a pre-allocated buffer to convert_to_base(), but I don't remember why. I fixed this in v5. > +static char * > +convert_to_base(uint64 value, int base) > > On my machine this now requires a function call and a DIV instruction, even > though the patch claims not to support anything but a power of two. It's > tiny enough to declare it inline so the compiler can specialize for each > call site. This was on my list of things to check before committing. I assumed that it would be automatically inlined, but given your analysis, I went ahead and added the inline keyword. My compiler took the hint and inlined the function, and it used SHR instead of DIV instructions. The machine code for to_hex32/64 is still a couple of instructions longer than before (probably because of the uint64 casts), but I don't know if we need to worry about micro-optimizing these functions any further. > +{ oid => '5101', descr => 'convert int4 number to binary', > > Needs to be over 8000. Done. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From 2ad83f8781a013ef5e5e5a62a422f3b73c6c5f2d Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Tue, 25 Jul 2023 16:09:01 -0700 Subject: [PATCH v5 1/1] add to_binary() and to_oct() --- doc/src/sgml/func.sgml| 42 +++ src/backend/utils/adt/varlena.c | 103 +++--- src/include/catalog/pg_proc.dat | 12 +++ src/test/regress/expected/strings.out | 26 ++- src/test/regress/sql/strings.sql | 9 ++- 5 files changed, 163 insertions(+), 29 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index be2f54c914..23b5ac7457 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -3737,6 +3737,48 @@ repeat('Pg', 4) PgPgPgPg + + + + to_binary + +to_binary ( integer ) +text + + +to_binary ( bigint ) +text + + +Converts the number to its equivalent binary representation. + + +to_binary(2147483647) +111 + + + + + + + to_oct + +to_oct ( integer ) +text + + +to_oct ( bigint ) +text + + +Converts the number to its equivalent octal representation. + + +to_oct(2147483647) +177 + + + diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index b1ec5c32ce..b955e54611 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -4919,54 +4919,105 @@ array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v, return result; } -#define HEXBASE 16 /* - * Convert an int32 to a string containing a base 16 (hex) representation of - * the number. + * We size the buffer for to_binary's longest possible return value, including + * the terminating '\0'. */ -Datum -to_hex32(PG_FUNCTION_ARGS) +#define CONVERT_TO_BASE_BUFSIZE (sizeof(uint64) * BITS_PER_BYTE + 1) + +/* + * Workhorse for to_binary, to_oct, and to_hex. Note that base must be > 1 and + * <= 16. buf must be at least CONVERT_TO_BASE_BUFSIZE bytes. + */ +static inline char * +convert_to_base(uint64 value, int base, char *buf) { - uint32 value = (uint32) PG_GETARG_INT32(0); - char *ptr; const char *digits = "0123456789abcdef"; - char buf[32]; /* bigger than needed, but reasonable */ + char *ptr = buf + CONVERT_TO_BASE_BUFSIZE - 1; - ptr = buf + sizeof(buf) - 1; - *ptr = '\0'; + Assert(base > 1); + Assert(base <= 16); + *ptr = '\0'; do { - *--ptr = digits[value % HEXBASE]; - value /= HEXBASE; + *--ptr = digits[value % base]; + value /= base; } while (ptr > buf && value); - PG_RETURN_TEXT_P(cstring_to_text(ptr)); + return ptr; +} + +/* + * Convert an integer to a string containing a base-2 (binary) representation + * of the number. +
Re: [PATCH] Add function to_oct
On Tue, Aug 15, 2023 at 07:58:17AM +0200, Vik Fearing wrote: > On 8/15/23 06:11, Nathan Bossart wrote: >> If there are no objections, I'd like to commit this patch soon. > > I just took a look at this (and the rest of the thread). I am a little bit > disappointed that we don't have a generic base conversion function, but even > if we did I think these specialized functions would still be useful. > > No objection from me. Thanks for taking a look. I don't mean for this to preclude a generic base conversion function that would supersede the functions added by this patch. However, I didn't want to hold up $SUBJECT because of something that may or may not happen after lots of discussion. If it does happen within the v17 development cycle, we could probably just remove to_oct/binary. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: [PATCH] Add function to_oct
On Tue, Jul 25, 2023 at 05:16:56PM -0700, Nathan Bossart wrote: > [v4] If we're not going to have a general SQL conversion function, here are some comments on the current patch. +static char *convert_to_base(uint64 value, int base); Not needed if the definition is above the callers. + * Workhorse for to_binary, to_oct, and to_hex. Note that base must be either + * 2, 8, or 16. Why wouldn't it work with any base <= 16? - *ptr = '\0'; + Assert(base == 2 || base == 8 || base == 16); + *ptr = '\0'; Spurious whitespace change? - char buf[32]; /* bigger than needed, but reasonable */ + char *buf = palloc(sizeof(uint64) * BITS_PER_BYTE + 1); Why is this no longer allocated on the stack? Maybe needs a comment about the size calculation. +static char * +convert_to_base(uint64 value, int base) On my machine this now requires a function call and a DIV instruction, even though the patch claims not to support anything but a power of two. It's tiny enough to declare it inline so the compiler can specialize for each call site. +{ oid => '5101', descr => 'convert int4 number to binary', Needs to be over 8000. -- John Naylor EDB: http://www.enterprisedb.com
Re: [PATCH] Add function to_oct
On 8/15/23 06:11, Nathan Bossart wrote: On Tue, Jul 25, 2023 at 08:29:17PM -0700, Nathan Bossart wrote: Here's a new version of the patch with the silly mistakes fixed. If there are no objections, I'd like to commit this patch soon. I just took a look at this (and the rest of the thread). I am a little bit disappointed that we don't have a generic base conversion function, but even if we did I think these specialized functions would still be useful. No objection from me. -- Vik Fearing
Re: [PATCH] Add function to_oct
On Tue, Jul 25, 2023 at 08:29:17PM -0700, Nathan Bossart wrote: > Here's a new version of the patch with the silly mistakes fixed. If there are no objections, I'd like to commit this patch soon. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: [PATCH] Add function to_oct
On Tue, Jul 25, 2023 at 05:16:56PM -0700, Nathan Bossart wrote: > On Tue, Jul 25, 2023 at 04:24:26PM -0700, Nathan Bossart wrote: >> I started taking a look at this and ended up adding to_binary() and a >> bigint version of to_oct() for completeness. While I was at it, I moved >> the base-conversion logic out to a separate static function that >> to_binary/oct/hex all use. > > Bleh, this patch seems to fail horribly on 32-bit builds. I'll look into > it soon. Here's a new version of the patch with the silly mistakes fixed. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From 1a729185b7fb77dbfee10f9a715cef6c3cd7e74b Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Tue, 25 Jul 2023 16:09:01 -0700 Subject: [PATCH v4 1/1] add to_binary() and to_oct() --- doc/src/sgml/func.sgml| 42 + src/backend/utils/adt/varlena.c | 86 +++ src/include/catalog/pg_proc.dat | 12 src/test/regress/expected/strings.out | 26 +++- src/test/regress/sql/strings.sql | 9 ++- 5 files changed, 148 insertions(+), 27 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index b94827674c..79334ea145 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -3737,6 +3737,48 @@ repeat('Pg', 4) PgPgPgPg + + + + to_binary + +to_binary ( integer ) +text + + +to_binary ( bigint ) +text + + +Converts the number to its equivalent binary representation. + + +to_binary(2147483647) +111 + + + + + + + to_oct + +to_oct ( integer ) +text + + +to_oct ( bigint ) +text + + +Converts the number to its equivalent octal representation. + + +to_oct(2147483647) +177 + + + diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index b1ec5c32ce..b8903f338f 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -165,6 +165,7 @@ static void text_format_string_conversion(StringInfo buf, char conversion, int flags, int width); static void text_format_append_string(StringInfo buf, const char *str, int flags, int width); +static char *convert_to_base(uint64 value, int base); /* @@ -4919,53 +4920,90 @@ array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v, return result; } -#define HEXBASE 16 /* - * Convert an int32 to a string containing a base 16 (hex) representation of - * the number. + * Convert an integer to a string containing a base-2 (binary) representation + * of the number. */ Datum -to_hex32(PG_FUNCTION_ARGS) +to_binary32(PG_FUNCTION_ARGS) { - uint32 value = (uint32) PG_GETARG_INT32(0); - char *ptr; - const char *digits = "0123456789abcdef"; - char buf[32]; /* bigger than needed, but reasonable */ + uint64 value = (uint64) PG_GETARG_INT32(0); + char *result = convert_to_base(value, 2); - ptr = buf + sizeof(buf) - 1; - *ptr = '\0'; + PG_RETURN_TEXT_P(cstring_to_text(result)); +} +Datum +to_binary64(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT64(0); + char *result = convert_to_base(value, 2); - do - { - *--ptr = digits[value % HEXBASE]; - value /= HEXBASE; - } while (ptr > buf && value); + PG_RETURN_TEXT_P(cstring_to_text(result)); +} - PG_RETURN_TEXT_P(cstring_to_text(ptr)); +/* + * Convert an integer to a string containing a base-8 (oct) representation of + * the number. + */ +Datum +to_oct32(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT32(0); + char *result = convert_to_base(value, 8); + + PG_RETURN_TEXT_P(cstring_to_text(result)); +} +Datum +to_oct64(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT64(0); + char *result = convert_to_base(value, 8); + + PG_RETURN_TEXT_P(cstring_to_text(result)); } /* - * Convert an int64 to a string containing a base 16 (hex) representation of + * Convert an integer to a string containing a base-16 (hex) representation of * the number. */ Datum +to_hex32(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT32(0); + char *result = convert_to_base(value, 16); + + PG_RETURN_TEXT_P(cstring_to_text(result)); +} +Datum to_hex64(PG_FUNCTION_ARGS) { uint64 value = (uint64) PG_GETARG_INT64(0); - char *ptr; + char *result = convert_to_base(value, 16); + + PG_RETURN_TEXT_P(cstring_to_text(result)); +} + +/* + * Workhorse for to_binary, to_oct, and to_hex. Note that base must be either + * 2, 8, or 16. + */ +static char * +convert_to_base(uint64 value, int base) +{ const char *digits = "0123456789abcdef"; - char buf[32]; /* bigger than needed, but r
Re: [PATCH] Add function to_oct
On Tue, Jul 25, 2023 at 04:24:26PM -0700, Nathan Bossart wrote: > I started taking a look at this and ended up adding to_binary() and a > bigint version of to_oct() for completeness. While I was at it, I moved > the base-conversion logic out to a separate static function that > to_binary/oct/hex all use. Bleh, this patch seems to fail horribly on 32-bit builds. I'll look into it soon. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: [PATCH] Add function to_oct
On Tue, Apr 04, 2023 at 08:45:36PM -0400, Kirk Wolak wrote: > Marked Ready for Committer. I started taking a look at this and ended up adding to_binary() and a bigint version of to_oct() for completeness. While I was at it, I moved the base-conversion logic out to a separate static function that to_binary/oct/hex all use. >From the other discussion referenced upthread, it sounds like we might want to replace to_binary/oct/hex with a more generic base-conversion function. Maybe we should try to do that instead. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From a8cac3f3de2394a6128515b6efc2ef3d0a92bce5 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Tue, 25 Jul 2023 16:09:01 -0700 Subject: [PATCH v3 1/1] add to_binary() and to_oct() --- doc/src/sgml/func.sgml| 42 + src/backend/utils/adt/varlena.c | 86 +++ src/include/catalog/pg_proc.dat | 12 src/test/regress/expected/strings.out | 26 +++- src/test/regress/sql/strings.sql | 9 ++- 5 files changed, 148 insertions(+), 27 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index b94827674c..79334ea145 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -3737,6 +3737,48 @@ repeat('Pg', 4) PgPgPgPg + + + + to_binary + +to_binary ( integer ) +text + + +to_binary ( bigint ) +text + + +Converts the number to its equivalent binary representation. + + +to_binary(2147483647) +111 + + + + + + + to_oct + +to_oct ( integer ) +text + + +to_oct ( bigint ) +text + + +Converts the number to its equivalent octal representation. + + +to_oct(2147483647) +177 + + + diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index b1ec5c32ce..b14f995fad 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -165,6 +165,7 @@ static void text_format_string_conversion(StringInfo buf, char conversion, int flags, int width); static void text_format_append_string(StringInfo buf, const char *str, int flags, int width); +static char *convert_to_base(uint64 value, int base); /* @@ -4919,53 +4920,90 @@ array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v, return result; } -#define HEXBASE 16 /* - * Convert an int32 to a string containing a base 16 (hex) representation of - * the number. + * Convert an integer to a string containing a base-2 (binary) representation + * of the number. */ Datum -to_hex32(PG_FUNCTION_ARGS) +to_binary32(PG_FUNCTION_ARGS) { - uint32 value = (uint32) PG_GETARG_INT32(0); - char *ptr; - const char *digits = "0123456789abcdef"; - char buf[32]; /* bigger than needed, but reasonable */ + uint64 value = (uint64) PG_GETARG_INT32(0); + char *result = convert_to_base(value, 2); - ptr = buf + sizeof(buf) - 1; - *ptr = '\0'; + PG_RETURN_TEXT_P(cstring_to_text(result)); +} +Datum +to_binary64(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT64(0); + char *result = convert_to_base(value, 2); - do - { - *--ptr = digits[value % HEXBASE]; - value /= HEXBASE; - } while (ptr > buf && value); + PG_RETURN_TEXT_P(cstring_to_text(result)); +} + +/* + * Convert an integer to a string containing a base-8 (oct) representation of + * the number. + */ +Datum +to_oct32(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT32(0); + char *result = convert_to_base(value, 8); + + PG_RETURN_TEXT_P(cstring_to_text(result)); +} +Datum +to_oct64(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT64(0); + char *result = convert_to_base(value, 8); - PG_RETURN_TEXT_P(cstring_to_text(ptr)); + PG_RETURN_TEXT_P(cstring_to_text(result)); } /* - * Convert an int64 to a string containing a base 16 (hex) representation of + * Convert an integer to a string containing a base-16 (hex) representation of * the number. */ Datum +to_hex32(PG_FUNCTION_ARGS) +{ + uint64 value = (uint64) PG_GETARG_INT64(0); + char *result = convert_to_base(value, 16); + + PG_RETURN_TEXT_P(cstring_to_text(result)); +} +Datum to_hex64(PG_FUNCTION_ARGS) { uint64 value = (uint64) PG_GETARG_INT64(0); - char *ptr; + char *result = convert_to_base(value, 16); + + PG_RETURN_TEXT_P(cstring_to_text(result)); +} + +/* + * Workhorse for to_binary, to_oct, and to_hex. Note that base must be either + * 2, 8, or 16. + */ +static char * +convert_to_base(uint64 value, int base) +{ const char *digits = "0123456789abcdef"; - char buf[32]; /* bigger than needed, but reasonable */ +
Re: [PATCH] Add function to_oct
On Thu, Feb 23, 2023 at 6:32 AM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > On 20.12.22 23:08, Eric Radman wrote: > > This patch is a new function based on the implementation of to_hex(int). > > > > Since support for octal integer literals was added, to_oct(int) allows > > octal values to be easily stored and returned in query results. > > > >to_oct(0o755) = '755' > > > > This is probably most useful for storing file system permissions. > > Note this subsequent discussion about the to_hex function: > > https://www.postgresql.org/message-id/flat/CAEZATCVbkL1ynqpsKiTDpch34%3DSCr5nnau%3DnfNmiy2nM3SJHtw%40mail.gmail.com > > Also, I think there is no "to binary" function, so perhaps if we're > going down this road one way or the other, we should probably complete > the set. > > The code reads clearly. It works as expected (being an old PDP-11 guy!)... And the docs make sense and build as well. Nothing larger than an int gets in. I was "missing" the bigint version, but read through ALL of the messages to see (and agree) That that's okay. Marked Ready for Committer. Thanks, Kirk
Re: [PATCH] Add function to_oct
On 20.12.22 23:08, Eric Radman wrote: This patch is a new function based on the implementation of to_hex(int). Since support for octal integer literals was added, to_oct(int) allows octal values to be easily stored and returned in query results. to_oct(0o755) = '755' This is probably most useful for storing file system permissions. Note this subsequent discussion about the to_hex function: https://www.postgresql.org/message-id/flat/CAEZATCVbkL1ynqpsKiTDpch34%3DSCr5nnau%3DnfNmiy2nM3SJHtw%40mail.gmail.com Also, I think there is no "to binary" function, so perhaps if we're going down this road one way or the other, we should probably complete the set.
Re: [PATCH] Add function to_oct
vignesh C writes: > Few suggestions > 1) We could use to_oct instead of to_oct32 as we don't have multiple > implementations for to_oct That seems (a) shortsighted and (b) inconsistent with the naming pattern used for to_hex, so I doubt it'd be an improvement. regards, tom lane
Re: [PATCH] Add function to_oct
On Thu, 22 Dec 2022 at 23:11, Eric Radman wrote: > > On Thu, Dec 22, 2022 at 10:08:17AM +, Dag Lem wrote: > > > > The calculation of quotient and remainder can be replaced by less costly > > masking and shifting. > > > > Defining > > > > #define OCT_DIGIT_BITS 3 > > #define OCT_DIGIT_BITMASK 0x7 > > > > the content of the loop can be replaced by > > > > *--ptr = digits[value & OCT_DIGIT_BITMASK]; > > value >>= OCT_DIGIT_BITS; > > > > Also, the check for ptr > buf in the while loop can be removed. The > > check is superfluous, since buf cannot possibly be exhausted by a 32 > > bit integer (the maximum octal number being 377). > > I integrated these suggestions in the attached -v2 patch and tested > range of values manually. > > Also picked an OID > 8000 as suggested by unused_oids. Few suggestions 1) We could use to_oct instead of to_oct32 as we don't have multiple implementations for to_oct diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 98d90d9338..fde0b24563 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -3687,6 +3687,9 @@ { oid => '2090', descr => 'convert int8 number to hex', proname => 'to_hex', prorettype => 'text', proargtypes => 'int8', prosrc => 'to_hex64' }, +{ oid => '8335', descr => 'convert int4 number to oct', + proname => 'to_oct', prorettype => 'text', proargtypes => 'int4', + prosrc => 'to_oct32' }, 2) Similarly we could change this to "to_oct" +/* + * Convert an int32 to a string containing a base 8 (oct) representation of + * the number. + */ +Datum +to_oct32(PG_FUNCTION_ARGS) +{ + uint32 value = (uint32) PG_GETARG_INT32(0); 3) I'm not sure if AS "" is required, but I also noticed it is done similarly in to_hex too: +-- +-- test to_oct +-- +select to_oct(256*256*256 - 1) AS ""; + +-- + +(1 row) 4) You could add a commit message for the patch Regards, Vignesh
Re: [PATCH] Add function to_oct
On Thu, Dec 22, 2022 at 10:08:17AM +, Dag Lem wrote: > > The calculation of quotient and remainder can be replaced by less costly > masking and shifting. > > Defining > > #define OCT_DIGIT_BITS 3 > #define OCT_DIGIT_BITMASK 0x7 > > the content of the loop can be replaced by > > *--ptr = digits[value & OCT_DIGIT_BITMASK]; > value >>= OCT_DIGIT_BITS; > > Also, the check for ptr > buf in the while loop can be removed. The > check is superfluous, since buf cannot possibly be exhausted by a 32 > bit integer (the maximum octal number being 377). I integrated these suggestions in the attached -v2 patch and tested range of values manually. Also picked an OID > 8000 as suggested by unused_oids. ..Eric diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 1f63dc6dba..b539e0dc0b 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -3699,6 +3699,23 @@ repeat('Pg', 4) PgPgPgPg + + + + to_oct + +to_oct ( integer ) +text + + +Converts the number to its equivalent octal representation. + + +to_oct(32) +40 + + + diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index 1c52deec55..586693ad8f 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -5229,6 +5229,32 @@ to_hex64(PG_FUNCTION_ARGS) PG_RETURN_TEXT_P(cstring_to_text(ptr)); } +#define OCT_DIGIT_BITS 3 +#define OCT_DIGIT_BITMASK 0x7 +/* + * Convert an int32 to a string containing a base 8 (oct) representation of + * the number. + */ +Datum +to_oct32(PG_FUNCTION_ARGS) +{ + uint32 value = (uint32) PG_GETARG_INT32(0); + char*ptr; + const char *digits = "01234567"; + charbuf[32];/* bigger than needed, but reasonable */ + + ptr = buf + sizeof(buf) - 1; + *ptr = '\0'; + + do + { + *--ptr = digits[value & OCT_DIGIT_BITMASK]; + value >>= OCT_DIGIT_BITS; + } while (value); + + PG_RETURN_TEXT_P(cstring_to_text(ptr)); +} + /* * Return the size of a datum, possibly compressed * diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 98d90d9338..fde0b24563 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -3687,6 +3687,9 @@ { oid => '2090', descr => 'convert int8 number to hex', proname => 'to_hex', prorettype => 'text', proargtypes => 'int8', prosrc => 'to_hex64' }, +{ oid => '8335', descr => 'convert int4 number to oct', + proname => 'to_oct', prorettype => 'text', proargtypes => 'int4', + prosrc => 'to_oct32' }, # for character set encoding support diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out index f028c1f10f..5ebaeb4a80 100644 --- a/src/test/regress/expected/strings.out +++ b/src/test/regress/expected/strings.out @@ -2143,6 +2143,15 @@ select to_hex(256::bigint*256::bigint*256::bigint*256::bigint - 1) AS "" (1 row) +-- +-- test to_oct +-- +select to_oct(256*256*256 - 1) AS ""; + +-- + +(1 row) + -- -- SHA-2 -- diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql index 932f71cbca..c4c40949e7 100644 --- a/src/test/regress/sql/strings.sql +++ b/src/test/regress/sql/strings.sql @@ -691,6 +691,11 @@ select to_hex(256*256*256 - 1) AS "ff"; select to_hex(256::bigint*256::bigint*256::bigint*256::bigint - 1) AS ""; +-- +-- test to_oct +-- +select to_oct(256*256*256 - 1) AS ""; + -- -- SHA-2 --
Re: [PATCH] Add function to_oct
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation:not tested This is a mini-review of to_oct :-) The function seems useful to me, and is obviously correct. I don't know whether optimization at such a low level is considered in PostgreSQL, but here goes. The calculation of quotient and remainder can be replaced by less costly masking and shifting. Defining #define OCT_DIGIT_BITS 3 #define OCT_DIGIT_BITMASK 0x7 the content of the loop can be replaced by *--ptr = digits[value & OCT_DIGIT_BITMASK]; value >>= OCT_DIGIT_BITS; Also, the check for ptr > buf in the while loop can be removed. The check is superfluous, since buf cannot possibly be exhausted by a 32 bit integer (the maximum octal number being 377). Best regards Dag Lem The new status of this patch is: Waiting on Author
Re: [PATCH] Add function to_oct
2022年12月21日(水) 10:42 Eric Radman : > > On Wed, Dec 21, 2022 at 08:36:40AM +0900, Ian Lawrence Barwick wrote: > > 2022年12月21日(水) 7:08 Eric Radman :> > > > Hello! > > > > > > This patch is a new function based on the implementation of to_hex(int). > > > > > > Since support for octal integer literals was added, to_oct(int) allows > > > octal values to be easily stored and returned in query results. > > > > > > to_oct(0o755) = '755' > > > > > > This is probably most useful for storing file system permissions. > > > > Seems like it would be convenient to have. Any reason why there's > > no matching "to_oct(bigint)" version? > > I couldn't think of a reason someone might want an octal > representation of a bigint. Certainly it would be easy to add > if there is value in supporting all of the same argument types. Yeah, I am struggling to think of a practical application other than symmetry with to_hex(). Regards Ian Barwick
Re: [PATCH] Add function to_oct
On Wed, Dec 21, 2022 at 08:36:40AM +0900, Ian Lawrence Barwick wrote: > 2022年12月21日(水) 7:08 Eric Radman :> > > Hello! > > > > This patch is a new function based on the implementation of to_hex(int). > > > > Since support for octal integer literals was added, to_oct(int) allows > > octal values to be easily stored and returned in query results. > > > > to_oct(0o755) = '755' > > > > This is probably most useful for storing file system permissions. > > Seems like it would be convenient to have. Any reason why there's > no matching "to_oct(bigint)" version? I couldn't think of a reason someone might want an octal representation of a bigint. Certainly it would be easy to add if there is value in supporting all of the same argument types.
Re: [PATCH] Add function to_oct
2022年12月21日(水) 7:08 Eric Radman :> > Hello! > > This patch is a new function based on the implementation of to_hex(int). > > Since support for octal integer literals was added, to_oct(int) allows > octal values to be easily stored and returned in query results. > > to_oct(0o755) = '755' > > This is probably most useful for storing file system permissions. Seems like it would be convenient to have. Any reason why there's no matching "to_oct(bigint)" version? Patch has been added to the next commitfest [1], thanks. [1] https://commitfest.postgresql.org/41/4071/ Regards Ian Barwick
[PATCH] Add function to_oct
Hello! This patch is a new function based on the implementation of to_hex(int). Since support for octal integer literals was added, to_oct(int) allows octal values to be easily stored and returned in query results. to_oct(0o755) = '755' This is probably most useful for storing file system permissions. -- Eric Radman diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 1f63dc6dba..b539e0dc0b 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -3699,6 +3699,23 @@ repeat('Pg', 4) PgPgPgPg + + + + to_oct + +to_oct ( integer ) +text + + +Converts the number to its equivalent octal representation. + + +to_oct(32) +40 + + + diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index 1c52deec55..16929ddeed 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -5229,6 +5229,31 @@ to_hex64(PG_FUNCTION_ARGS) PG_RETURN_TEXT_P(cstring_to_text(ptr)); } +#define OCTBASE 8 +/* + * Convert an int32 to a string containing a base 8 (oct) representation of + * the number. + */ +Datum +to_oct32(PG_FUNCTION_ARGS) +{ + uint32 value = (uint32) PG_GETARG_INT32(0); + char *ptr; + const char *digits = "01234567"; + charbuf[32];/* bigger than needed, but reasonable */ + + ptr = buf + sizeof(buf) - 1; + *ptr = '\0'; + + do + { + *--ptr = digits[value % OCTBASE]; + value /= OCTBASE; + } while (ptr > buf && value); + + PG_RETURN_TEXT_P(cstring_to_text(ptr)); +} + /* * Return the size of a datum, possibly compressed * diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 98d90d9338..824df6fdc8 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -3687,6 +3687,9 @@ { oid => '2090', descr => 'convert int8 number to hex', proname => 'to_hex', prorettype => 'text', proargtypes => 'int8', prosrc => 'to_hex64' }, +{ oid => '2173', descr => 'convert int4 number to oct', + proname => 'to_oct', prorettype => 'text', proargtypes => 'int4', + prosrc => 'to_oct32' }, # for character set encoding support diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out index f028c1f10f..5ebaeb4a80 100644 --- a/src/test/regress/expected/strings.out +++ b/src/test/regress/expected/strings.out @@ -2143,6 +2143,15 @@ select to_hex(256::bigint*256::bigint*256::bigint*256::bigint - 1) AS "" (1 row) +-- +-- test to_oct +-- +select to_oct(256*256*256 - 1) AS ""; + +-- + +(1 row) + -- -- SHA-2 -- diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql index 932f71cbca..c4c40949e7 100644 --- a/src/test/regress/sql/strings.sql +++ b/src/test/regress/sql/strings.sql @@ -691,6 +691,11 @@ select to_hex(256*256*256 - 1) AS "ff"; select to_hex(256::bigint*256::bigint*256::bigint*256::bigint - 1) AS ""; +-- +-- test to_oct +-- +select to_oct(256*256*256 - 1) AS ""; + -- -- SHA-2 --