Re: Add support for unit "B" to pg_size_pretty()
On Wed, 8 Mar 2023 at 09:22, Peter Eisentraut wrote:
> Ok, I have fixed the original documentation to that effect and
> backpatched it.

Thanks for fixing that.

David
Re: Add support for unit "B" to pg_size_pretty()
On 06.03.23 09:27, David Rowley wrote:
> On Mon, 6 Mar 2023 at 21:13, Peter Eisentraut wrote:
> >
> > On 02.03.23 20:58, David Rowley wrote:
> > > I think I'd prefer to see the size_bytes_unit_alias struct have an
> > > index into size_pretty_units[] array. i.e:
> >
> > Ok, done that way. (I had thought about that, but I was worried that
> > that would be too error-prone to maintain. But I suppose the tables
> > don't change that often, and test cases would easily catch mistakes.)
>
> Patch looks pretty good. I just see a small spelling mistake in:
>
> +/* Additional unit aliases acceted by pg_size_bytes */
>
> > I also updated the documentation a bit more.
>
> I see I must have forgotten to add PB to the docs when pg_size_pretty
> had that unit added. I guess you added the "etc" to fix that? I'm
> wondering if that's the right choice. You modified the comment above
> size_pretty_units[] to remind us to update the docs when adding units,
> but the docs now say "etc", so do we need to? I'd likely have gone
> with just adding "PB" to the docs, that way it's pretty clear that new
> units need to be mentioned in the docs.

Ok, I have fixed the original documentation to that effect and
backpatched it. The remaining patch has been updated accordingly and
committed also.
Re: Add support for unit "B" to pg_size_pretty()
On Mon, 6 Mar 2023 at 21:13, Peter Eisentraut wrote:
>
> On 02.03.23 20:58, David Rowley wrote:
> > I think I'd prefer to see the size_bytes_unit_alias struct have an
> > index into size_pretty_units[] array. i.e:
>
> Ok, done that way. (I had thought about that, but I was worried that
> that would be too error-prone to maintain. But I suppose the tables
> don't change that often, and test cases would easily catch mistakes.)

Patch looks pretty good. I just see a small spelling mistake in:

+/* Additional unit aliases acceted by pg_size_bytes */

> I also updated the documentation a bit more.

I see I must have forgotten to add PB to the docs when pg_size_pretty
had that unit added. I guess you added the "etc" to fix that? I'm
wondering if that's the right choice. You modified the comment above
size_pretty_units[] to remind us to update the docs when adding units,
but the docs now say "etc", so do we need to? I'd likely have gone
with just adding "PB" to the docs, that way it's pretty clear that new
units need to be mentioned in the docs.

David
Re: Add support for unit "B" to pg_size_pretty()
On 02.03.23 20:58, David Rowley wrote:
> On Mon, 27 Feb 2023 at 21:34, Peter Eisentraut wrote:
> >
> > On 22.02.23 03:39, David Rowley wrote:
> > > I think you'll need to find another way to make the aliases work.
> > > Maybe another array with the name and an int to reference the
> > > corresponding index in size_pretty_units.
> >
> > Ok, here is a new patch with a separate table of aliases. (Might look
> > like overkill, but I think the "PiB" etc. example you had could actually
> > be a good use case for this as well.)
>
> I think I'd prefer to see the size_bytes_unit_alias struct have an
> index into size_pretty_units[] array. i.e:

Ok, done that way. (I had thought about that, but I was worried that
that would be too error-prone to maintain. But I suppose the tables
don't change that often, and test cases would easily catch mistakes.)

I also updated the documentation a bit more.

From bb0fb6eb3364195838a9c7e387ee4237c8cd30b4 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Mon, 6 Mar 2023 09:10:50 +0100
Subject: [PATCH v4] Add support for unit "B" to pg_size_bytes()

This makes it consistent with the units support in GUC.

Discussion: https://www.postgresql.org/message-id/flat/0106914a-9eb5-22be-40d8-652cc88c827d%40enterprisedb.com
---
 doc/src/sgml/func.sgml               |  9 +---
 src/backend/utils/adt/dbsize.c       | 33
 src/test/regress/expected/dbsize.out | 15 +++--
 src/test/regress/sql/dbsize.sql      |  2 +-
 4 files changed, 44 insertions(+), 15 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 97b3f1c1a6..fa5f60cf4c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27166,8 +27166,11 @@ Database Object Size Functions
 bigint
-Converts a size in human-readable format (as returned
-by pg_size_pretty) into bytes.
+Converts a size in human-readable format (as returned by
+pg_size_pretty) into bytes. Valid units are
+bytes, B, kB,
+MB, GB, TB,
+and PB.
@@ -27185,7 +27188,7 @@ Database Object Size Functions
 Converts a size in bytes into a more easily human-readable format with
-size units (bytes, kB, MB, GB or TB as appropriate). Note that the
+size units (bytes, kB, MB, GB, TB, etc. as appropriate). Note that the
 units are powers of 2 rather than powers of 10, so 1kB is 1024 bytes,
 1MB is 1024^2 = 1048576 bytes, and so on.
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index dbd404101f..8d5ca41c8b 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -46,7 +46,7 @@ struct size_pretty_unit
                                  * unit */
 };

-/* When adding units here also update the error message in pg_size_bytes */
+/* When adding units here also update the docs and the error message in pg_size_bytes */
 static const struct size_pretty_unit size_pretty_units[] = {
     {"bytes", 10 * 1024, false, 0},
     {"kB", 20 * 1024 - 1, true, 10},
@@ -57,6 +57,19 @@ static const struct size_pretty_unit size_pretty_units[] = {
     {NULL, 0, false, 0}
 };

+/* Additional unit aliases acceted by pg_size_bytes */
+struct size_bytes_unit_alias
+{
+    const char *alias;
+    int unit_index; /* corresponding size_pretty_units element */
+};
+
+/* When adding units here also update the docs and the error message in pg_size_bytes */
+static const struct size_bytes_unit_alias size_bytes_aliases[] = {
+    {"B", 0},
+    {NULL}
+};
+
 /* Return physical size of directory contents, or 0 if dir doesn't exist */
 static int64
 db_dir_size(const char *path)
@@ -801,9 +814,19 @@ pg_size_bytes(PG_FUNCTION_ARGS)
         {
             /* Parse the unit case-insensitively */
             if (pg_strcasecmp(strptr, unit->name) == 0)
-            {
-                multiplier = ((int64) 1) << unit->unitbits;
                 break;
+        }
+
+        /* If not found, look in table of aliases */
+        if (unit->name == NULL)
+        {
+            for (const struct size_bytes_unit_alias *a = size_bytes_aliases; a->alias != NULL; a++)
+            {
+                if (pg_strcasecmp(strptr, a->alias) == 0)
+                {
+                    unit = &size_pretty_units[a->unit_index];
+                    break;
+                }
             }
         }

@@ -813,7 +836,9 @@ pg_size_bytes(PG_FUNCTION_ARGS)
                     (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                      errmsg("invalid size: \"%s\"", text_to_cstring(arg)),
Re: Add support for unit "B" to pg_size_pretty()
On Fri, 3 Mar 2023 at 11:23, David Rowley wrote:
>
> On Fri, 3 Mar 2023 at 09:32, Dean Rasheed wrote:
> > Hmm, I think it would be easier to just have a separate table for
> > pg_size_bytes(), rather than reusing pg_size_pretty()'s table.
>
> Maybe that's worthwhile if we were actually thinking of adding any
> non-base 2 units in the future, but if we're not, perhaps it's better
> just to have the smaller alias array which for Peter's needs will just
> require 1 element + the NULL one instead of 6 + NULL.
>

Maybe. It's the tradeoff between having a smaller array and more code
(2 loops) vs a larger array and less code (1 loop).

> In any case, I'm not really sure I see what the path forward would be
> to add something like base-10 units would be for pg_size_bytes(). If
> we were to change MB to mean 10^6 rather than 2^20 I think many people
> would get upset.
>

Yeah, that's probably true. Given the way this and configuration
parameters currently work, I think we're stuck with 1MB meaning 2^20
bytes.

Regards,
Dean
Re: Add support for unit "B" to pg_size_pretty()
On Fri, 3 Mar 2023 at 09:32, Dean Rasheed wrote:
> Hmm, I think it would be easier to just have a separate table for
> pg_size_bytes(), rather than reusing pg_size_pretty()'s table. I.e.,
> size_bytes_units[], which would only need name and multiplier columns
> (not round and limit). Done that way, it would be easier to add other
> units later (e.g., non-base-2 units).

Maybe that's worthwhile if we were actually thinking of adding any
non-base 2 units in the future, but if we're not, perhaps it's better
just to have the smaller alias array which for Peter's needs will just
require 1 element + the NULL one instead of 6 + NULL.

In any case, I'm not really sure I see what the path forward would be
to add something like base-10 units would be for pg_size_bytes(). If
we were to change MB to mean 10^6 rather than 2^20 I think many people
would get upset.

David
Re: Add support for unit "B" to pg_size_pretty()
On Thu, 2 Mar 2023 at 19:58, David Rowley wrote:
>
> I think I'd prefer to see the size_bytes_unit_alias struct have an
> index into size_pretty_units[] array. i.e:
>
> struct size_bytes_unit_alias
> {
>     const char *alias; /* aliased unit name */
>     const int unit_index; /* corresponding size_pretty_units element */
> };
>
> then the pg_size_bytes code can be simplified to:
>
> /* If not found, look in the table of aliases */
> if (unit->name == NULL)
> {
>     for (const struct size_bytes_unit_alias *a = size_bytes_aliases;
>          a->alias != NULL; a++)
>     {
>         if (pg_strcasecmp(strptr, a->alias) == 0)
>         {
>             unit = &size_pretty_units[a->unit_index];
>             break;
>         }
>     }
> }
>
> which saves having to have the additional and slower nested loop code.
>

Hmm, I think it would be easier to just have a separate table for
pg_size_bytes(), rather than reusing pg_size_pretty()'s table. I.e.,
size_bytes_units[], which would only need name and multiplier columns
(not round and limit). Done that way, it would be easier to add other
units later (e.g., non-base-2 units).

Also, it looks to me as though the doc change is for pg_size_pretty()
instead of pg_size_bytes().

Regards,
Dean
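For readers following along, Dean's alternative (a dedicated parse-only table with just a name and a multiplier) might look roughly like the sketch below. Nothing here comes from a posted patch: the row layout of size_bytes_units, the helper lookup_multiplier(), and the use of POSIX strcasecmp() in place of pg_strcasecmp() are all assumptions made so the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <strings.h>    /* strcasecmp (POSIX), standing in for pg_strcasecmp */

/* Hypothetical parse-only table: a name and a multiplier, nothing else. */
struct size_bytes_unit
{
    const char *name;
    int64_t     multiplier;
};

static const struct size_bytes_unit size_bytes_units[] = {
    {"bytes", 1},
    {"B", 1},                       /* alias row; harmless in a parse-only table */
    {"kB", INT64_C(1) << 10},
    {"MB", INT64_C(1) << 20},
    {"GB", INT64_C(1) << 30},
    {"TB", INT64_C(1) << 40},
    {"PB", INT64_C(1) << 50},
    {NULL, 0}
};

/*
 * Hypothetical helper: return the multiplier for a unit name, matched
 * case-insensitively, or 0 if the unit is not in the table.
 */
static int64_t
lookup_multiplier(const char *unitname)
{
    assert(unitname != NULL);

    for (const struct size_bytes_unit *u = size_bytes_units; u->name != NULL; u++)
    {
        if (strcasecmp(unitname, u->name) == 0)
            return u->multiplier;
    }
    return 0;
}
```

This is the "larger array, one loop" side of the tradeoff: every unit is listed again, but the lookup is a single pass and an alias is just another row.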
Re: Add support for unit "B" to pg_size_pretty()
On Mon, 27 Feb 2023 at 21:34, Peter Eisentraut wrote:
>
> On 22.02.23 03:39, David Rowley wrote:
> > I think you'll need to find another way to make the aliases work.
> > Maybe another array with the name and an int to reference the
> > corresponding index in size_pretty_units.
>
> Ok, here is a new patch with a separate table of aliases. (Might look
> like overkill, but I think the "PiB" etc. example you had could actually
> be a good use case for this as well.)

I think I'd prefer to see the size_bytes_unit_alias struct have an
index into size_pretty_units[] array. i.e:

struct size_bytes_unit_alias
{
    const char *alias; /* aliased unit name */
    const int unit_index; /* corresponding size_pretty_units element */
};

then the pg_size_bytes code can be simplified to:

/* If not found, look in the table of aliases */
if (unit->name == NULL)
{
    for (const struct size_bytes_unit_alias *a = size_bytes_aliases;
         a->alias != NULL; a++)
    {
        if (pg_strcasecmp(strptr, a->alias) == 0)
        {
            unit = &size_pretty_units[a->unit_index];
            break;
        }
    }
}

which saves having to have the additional and slower nested loop code.

Apart from that, the patch looks fine.

David
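A standalone version of the index-based lookup proposed above could be sketched as follows. The struct and table names follow the thread, but the field types, the helper unit_multiplier(), and the use of POSIX strcasecmp() instead of pg_strcasecmp() are assumptions made so the fragment builds outside the PostgreSQL tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <strings.h>    /* strcasecmp (POSIX), standing in for pg_strcasecmp */

/* Field names as used in the thread; the types are assumed. */
struct size_pretty_unit
{
    const char *name;
    uint32_t    limit;
    bool        round;
    uint8_t     unitbits;
};

static const struct size_pretty_unit size_pretty_units[] = {
    {"bytes", 10 * 1024, false, 0},
    {"kB", 20 * 1024 - 1, true, 10},
    {"MB", 20 * 1024 - 1, true, 20},
    {"GB", 20 * 1024 - 1, true, 30},
    {"TB", 20 * 1024 - 1, true, 40},
    {"PB", 20 * 1024 - 1, true, 50},
    {NULL, 0, false, 0}
};

struct size_bytes_unit_alias
{
    const char *alias;      /* aliased unit name */
    int         unit_index; /* corresponding size_pretty_units element */
};

static const struct size_bytes_unit_alias size_bytes_aliases[] = {
    {"B", 0},
    {NULL, 0}
};

/*
 * Hypothetical helper: resolve a unit string to its multiplier, trying the
 * main table first and the alias table second.  Returns 0 for an unknown unit.
 */
static int64_t
unit_multiplier(const char *strptr)
{
    const struct size_pretty_unit *unit;

    assert(strptr != NULL);

    for (unit = size_pretty_units; unit->name != NULL; unit++)
    {
        if (strcasecmp(strptr, unit->name) == 0)
            break;
    }

    /* If not found, look in the table of aliases */
    if (unit->name == NULL)
    {
        for (const struct size_bytes_unit_alias *a = size_bytes_aliases;
             a->alias != NULL; a++)
        {
            if (strcasecmp(strptr, a->alias) == 0)
            {
                unit = &size_pretty_units[a->unit_index];
                break;
            }
        }
    }

    if (unit->name == NULL)
        return 0;

    return INT64_C(1) << unit->unitbits;
}
```

The alias match is a direct array index, so there is no second scan of size_pretty_units by name. The cost is that unit_index must be kept in sync by hand if rows are ever reordered.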
Re: Add support for unit "B" to pg_size_pretty()
On 22.02.23 03:39, David Rowley wrote:
> hmm. I didn't really code pg_size_pretty with aliases in mind. I don't
> think you can do this. There's code in pg_size_pretty() and
> pg_size_pretty_numeric() that'll not work correctly. We look ahead to
> the next unit to check if there is one so we know we must use this
> unit if there are no other units to convert to.

> I think you'll need to find another way to make the aliases work.
> Maybe another array with the name and an int to reference the
> corresponding index in size_pretty_units.

Ok, here is a new patch with a separate table of aliases. (Might look
like overkill, but I think the "PiB" etc. example you had could actually
be a good use case for this as well.)

From 4e493128adddc2656f3f139b2ca402f0d13721ba Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Mon, 27 Feb 2023 09:28:25 +0100
Subject: [PATCH v3] Add support for unit "B" to pg_size_bytes()

This makes it consistent with the units support in GUC.

Discussion: https://www.postgresql.org/message-id/flat/0106914a-9eb5-22be-40d8-652cc88c827d%40enterprisedb.com
---
 doc/src/sgml/func.sgml               |  2 +-
 src/backend/utils/adt/dbsize.c       | 34 +---
 src/test/regress/expected/dbsize.out | 15 ++--
 src/test/regress/sql/dbsize.sql      |  2 +-
 4 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 0cbdf63632..718d0cb550 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27176,7 +27176,7 @@ Database Object Size Functions
 Converts a size in bytes into a more easily human-readable format with
-size units (bytes, kB, MB, GB or TB as appropriate). Note that the
+size units (bytes, B, kB, MB, GB or TB as appropriate). Note that the
 units are powers of 2 rather than powers of 10, so 1kB is 1024 bytes,
 1MB is 1024^2 = 1048576 bytes, and so on.
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index dbd404101f..338e990aeb 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -57,6 +57,19 @@ static const struct size_pretty_unit size_pretty_units[] = {
     {NULL, 0, false, 0}
 };

+/* Additional unit aliases acceted by pg_size_bytes */
+struct size_bytes_unit_alias
+{
+    const char *alias;
+    const char *base;
+};
+
+/* When adding units here also update the error message in pg_size_bytes */
+static const struct size_bytes_unit_alias size_bytes_aliases[] = {
+    {"B", "bytes"},
+    {NULL, NULL}
+};
+
 /* Return physical size of directory contents, or 0 if dir doesn't exist */
 static int64
 db_dir_size(const char *path)
@@ -801,9 +814,22 @@ pg_size_bytes(PG_FUNCTION_ARGS)
         {
             /* Parse the unit case-insensitively */
             if (pg_strcasecmp(strptr, unit->name) == 0)
-            {
-                multiplier = ((int64) 1) << unit->unitbits;
                 break;
+        }
+
+        /* If not found, look in table of aliases */
+        if (unit->name == NULL)
+        {
+            for (const struct size_bytes_unit_alias *a = size_bytes_aliases; a->alias != NULL; a++)
+            {
+                if (pg_strcasecmp(strptr, a->alias) == 0)
+                {
+                    for (unit = size_pretty_units; unit->name != NULL; unit++)
+                    {
+                        if (pg_strcasecmp(a->base, unit->name) == 0)
+                            break;
+                    }
+                }
             }
         }

@@ -813,7 +839,9 @@ pg_size_bytes(PG_FUNCTION_ARGS)
                     (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                      errmsg("invalid size: \"%s\"", text_to_cstring(arg)),
                      errdetail("Invalid size unit: \"%s\".", strptr),
-                     errhint("Valid units are \"bytes\", \"kB\", \"MB\", \"GB\", \"TB\", and \"PB\".")));
+                     errhint("Valid units are \"bytes\", \"B\", \"kB\", \"MB\", \"GB\", \"TB\", and \"PB\".")));
+
+        multiplier = ((int64) 1) << unit->unitbits;

         if (multiplier > 1)
         {
diff --git a/src/test/regress/expected/dbsize.out b/src/test/regress/expected/dbsize.out
index d8d6686b5f..f1121a87aa 100644
--- a/src/test/regress/expected/dbsize.out
+++ b/src/test/regress/expected/dbsize.out
@@ -81,12 +81,13 @@ SELECT size, pg_size_pretty(size), pg_size_pretty(-1 * size) FROM
 -- pg_size_bytes() tests
 SELECT size, pg_size_bytes(size) FROM
-(VALUES ('1'), ('123bytes'), ('1kB'),
Re: Add support for unit "B" to pg_size_pretty()
On Wed, 22 Feb 2023 at 12:47, Peter Eisentraut wrote:
> >> diff --git a/src/backend/utils/adt/dbsize.c
> >> b/src/backend/utils/adt/dbsize.c
> >> index dbd404101f..9ecd5428c3 100644
> >> --- a/src/backend/utils/adt/dbsize.c
> >> +++ b/src/backend/utils/adt/dbsize.c
> >> @@ -49,6 +49,7 @@ struct size_pretty_unit
> >> /* When adding units here also update the error message in pg_size_bytes
> >> */
> >> static const struct size_pretty_unit size_pretty_units[] = {
> >> {"bytes", 10 * 1024, false, 0},
> >> +{"B", 10 * 1024, false, 0},
> >
> > This adds a duplicate line (unitbits=0) where no other existing line
> > uses duplicates. If that's intentional, I think it deserves a comment
> > highlighting that it's an /*alias*/, and about why that does the right
> > thing, either here about or in the commit message.
>
> I have added a comment about that.

hmm. I didn't really code pg_size_pretty with aliases in mind. I don't
think you can do this. There's code in pg_size_pretty() and
pg_size_pretty_numeric() that'll not work correctly. We look ahead to
the next unit to check if there is one so we know we must use this
unit if there are no other units to convert to.

Let's assume someone in the future reads your comment about aliases
and thinks we can just go and add an alias for any unit. Here we'll
add PiB for PB.

diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index dbd404101f..8e22969a76 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -54,6 +54,7 @@ static const struct size_pretty_unit size_pretty_units[] = {
     {"GB", 20 * 1024 - 1, true, 30},
     {"TB", 20 * 1024 - 1, true, 40},
     {"PB", 20 * 1024 - 1, true, 50},
+    {"PiB", 20 * 1024 - 1, true, 50},
     {NULL, 0, false, 0}
 };

testing it, I see:

postgres=# select pg_size_pretty(1::numeric * 1024*1024*1024*1024*1024);
 pg_size_pretty
----------------
 1 PB
(1 row)

postgres=# select pg_size_pretty(2::numeric * 1024*1024*1024*1024*1024);
 pg_size_pretty
----------------
 2 PiB
(1 row)

I think we'll likely get complaints about PB being used sometimes and
PiB being used at other times.

I think you'll need to find another way to make the aliases work.
Maybe another array with the name and an int to reference the
corresponding index in size_pretty_units.

David
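The look-ahead problem described above can be reproduced outside the server with a cut-down model. The sketch below is an assumption-laden stand-in, not the dbsize.c source: size_pretty_sketch(), half_rounded(), the two toy tables, and the "KiB" alias row are all hypothetical, and the tables are shortened to two real units so that every value fits comfortably in int64_t.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>     /* strcmp, handy when checking the results */

/* Field names as used in the thread; the types are assumed. */
struct size_pretty_unit
{
    const char *name;
    uint32_t    limit;      /* upper bound, in (half-)units, for using this row */
    bool        round;      /* value is kept in half-units and rounded on output */
    uint8_t     unitbits;   /* log2 of the unit size in bytes */
};

/* Toy table without an alias row. */
static const struct size_pretty_unit toy_units_plain[] = {
    {"bytes", 10 * 1024, false, 0},
    {"kB", 20 * 1024 - 1, true, 10},
    {NULL, 0, false, 0}
};

/* Same table with a hypothetical "KiB" alias appended after "kB". */
static const struct size_pretty_unit toy_units_with_alias[] = {
    {"bytes", 10 * 1024, false, 0},
    {"kB", 20 * 1024 - 1, true, 10},
    {"KiB", 20 * 1024 - 1, true, 10},
    {NULL, 0, false, 0}
};

/* Halve a value, rounding away from zero. */
static int64_t
half_rounded(int64_t x)
{
    return (x + (x < 0 ? -1 : 1)) / 2;
}

/* Format size using the given table; returns a pointer to a static buffer. */
static const char *
size_pretty_sketch(int64_t size, const struct size_pretty_unit *units)
{
    static char buf[64];

    assert(units != NULL);
    buf[0] = '\0';

    for (const struct size_pretty_unit *unit = units; unit->name != NULL; unit++)
    {
        int64_t abs_size = size < 0 ? -size : size;
        int     bits;

        /*
         * The look-ahead: a row is forced only when the NEXT row is the
         * terminator.  An alias row after "kB" defeats that test.
         */
        if (unit[1].name == NULL || abs_size < unit->limit)
        {
            if (unit->round)
                size = half_rounded(size);
            snprintf(buf, sizeof(buf), "%" PRId64 " %s", size, unit->name);
            break;
        }

        /* Scale down to the next row, allowing for half-unit rounding. */
        bits = unit[1].unitbits - unit->unitbits
            - (unit[1].round ? 1 : 0) + (unit->round ? 1 : 0);
        size /= INT64_C(1) << bits;
    }

    return buf;
}
```

With these toy tables, 10000 kB worth of bytes formats as "10000 kB" either way, but 20000 kB worth of bytes formats as "20000 kB" with the plain table and "20000 KiB" once the alias row is present: the value is over the kB limit, the next row exists, the shift between the two rows is zero bits, and the alias ends up being the last unit. That is the same inconsistency reported above for PB and PiB.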
Re: Add support for unit "B" to pg_size_pretty()
On 20.02.23 15:34, Justin Pryzby wrote:
> On Mon, Feb 20, 2023 at 07:44:15AM +0100, Peter Eisentraut wrote:
> > This patch adds support for the unit "B" to pg_size_pretty(). This makes it
>
> It seems like what it actually does is to support "B" in pg_size_bytes()
> - is that what you meant ?

yes

> pg_size_pretty() already supports "bytes", so this doesn't actually make
> sizes any more pretty, or evidently change its output at all.

Right, this is for the input side.

> > diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
> > index dbd404101f..9ecd5428c3 100644
> > --- a/src/backend/utils/adt/dbsize.c
> > +++ b/src/backend/utils/adt/dbsize.c
> > @@ -49,6 +49,7 @@ struct size_pretty_unit
> >  /* When adding units here also update the error message in pg_size_bytes */
> >  static const struct size_pretty_unit size_pretty_units[] = {
> >      {"bytes", 10 * 1024, false, 0},
> > +    {"B", 10 * 1024, false, 0},
>
> This adds a duplicate line (unitbits=0) where no other existing line
> uses duplicates. If that's intentional, I think it deserves a comment
> highlighting that it's an /*alias*/, and about why that does the right
> thing, either here about or in the commit message.

I have added a comment about that.

From 6b3a155260e2da5338f7cb6a1d729a0d34e3935a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Wed, 22 Feb 2023 00:44:45 +0100
Subject: [PATCH v2] Add support for unit "B" to pg_size_bytes()

This makes it consistent with the units support in GUC.

Discussion: https://www.postgresql.org/message-id/flat/0106914a-9eb5-22be-40d8-652cc88c827d%40enterprisedb.com
---
 doc/src/sgml/func.sgml               |  2 +-
 src/backend/utils/adt/dbsize.c       | 10 --
 src/test/regress/expected/dbsize.out | 15 ---
 src/test/regress/sql/dbsize.sql      |  2 +-
 4 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e09e289a43..15a5a98b0a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27162,7 +27162,7 @@ Database Object Size Functions
 Converts a size in bytes into a more easily human-readable format with
-size units (bytes, kB, MB, GB or TB as appropriate). Note that the
+size units (bytes, B, kB, MB, GB or TB as appropriate). Note that the
 units are powers of 2 rather than powers of 10, so 1kB is 1024 bytes,
 1MB is 1024^2 = 1048576 bytes, and so on.
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index dbd404101f..cab7834e8a 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -46,9 +46,15 @@ struct size_pretty_unit
                                  * unit */
 };

-/* When adding units here also update the error message in pg_size_bytes */
+/*
+ * When adding units here also update the error message in pg_size_bytes.
+ *
+ * Aliases (with the same unitbits) are allowed. pg_size_pretty uses the
+ * first one among them.
+ */
 static const struct size_pretty_unit size_pretty_units[] = {
     {"bytes", 10 * 1024, false, 0},
+    {"B", 10 * 1024, false, 0},
     {"kB", 20 * 1024 - 1, true, 10},
     {"MB", 20 * 1024 - 1, true, 20},
     {"GB", 20 * 1024 - 1, true, 30},
@@ -813,7 +819,7 @@ pg_size_bytes(PG_FUNCTION_ARGS)
                     (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                      errmsg("invalid size: \"%s\"", text_to_cstring(arg)),
                      errdetail("Invalid size unit: \"%s\".", strptr),
-                     errhint("Valid units are \"bytes\", \"kB\", \"MB\", \"GB\", \"TB\", and \"PB\".")));
+                     errhint("Valid units are \"bytes\", \"B\", \"kB\", \"MB\", \"GB\", \"TB\", and \"PB\".")));

         if (multiplier > 1)
         {
diff --git a/src/test/regress/expected/dbsize.out b/src/test/regress/expected/dbsize.out
index d8d6686b5f..f1121a87aa 100644
--- a/src/test/regress/expected/dbsize.out
+++ b/src/test/regress/expected/dbsize.out
@@ -81,12 +81,13 @@ SELECT size, pg_size_pretty(size), pg_size_pretty(-1 * size) FROM
 -- pg_size_bytes() tests
 SELECT size, pg_size_bytes(size) FROM
-(VALUES ('1'), ('123bytes'), ('1kB'), ('1MB'), (' 1 GB'), ('1.5 GB '),
+(VALUES ('1'), ('123bytes'), ('256 B'), ('1kB'), ('1MB'), (' 1 GB'), ('1.5 GB '),
             ('1TB'), ('3000 TB'), ('1e6 MB'), ('99 PB')) x(size);
    size   |   pg_size_bytes
 ----------+--------------------
  1        |                  1
  123bytes |                123
+ 256 B    |                256
  1kB      |               1024
  1MB      |            1048576
   1 GB    |         1073741824
@@ -95,7 +96,7 @@ SELECT size, pg_size_bytes(size) FROM
  3000 TB  |   3298534883328000
  1e6 MB   |      1048576000000
  99 PB    | 111464090777419776
-(10 rows)
+(11 rows)

 -- case-insensitive units are supported
Re: Add support for unit "B" to pg_size_pretty()
On Mon, Feb 20, 2023 at 07:44:15AM +0100, Peter Eisentraut wrote:
> This patch adds support for the unit "B" to pg_size_pretty(). This makes it

It seems like what it actually does is to support "B" in pg_size_bytes()
- is that what you meant ?

pg_size_pretty() already supports "bytes", so this doesn't actually make
sizes any more pretty, or evidently change its output at all.

> diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
> index dbd404101f..9ecd5428c3 100644
> --- a/src/backend/utils/adt/dbsize.c
> +++ b/src/backend/utils/adt/dbsize.c
> @@ -49,6 +49,7 @@ struct size_pretty_unit
>  /* When adding units here also update the error message in pg_size_bytes */
>  static const struct size_pretty_unit size_pretty_units[] = {
>      {"bytes", 10 * 1024, false, 0},
> +    {"B", 10 * 1024, false, 0},

This adds a duplicate line (unitbits=0) where no other existing line
uses duplicates. If that's intentional, I think it deserves a comment
highlighting that it's an /*alias*/, and about why that does the right
thing, either here about or in the commit message.

>      {"kB", 20 * 1024 - 1, true, 10},
>      {"MB", 20 * 1024 - 1, true, 20},
>      {"GB", 20 * 1024 - 1, true, 30},

--
Justin