[
https://issues.apache.org/jira/browse/CASSANDRA-19546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Wang updated CASSANDRA-19546:
-----------------------------------
Description:
There are cases (e.g in our system_views tables but might be applicable for
user tables as well) when a column is of a type which represents number of
bytes. However, it is quite hard to parse a value for a human to have some
estimation what that value is. This function looks at values in a column as if
it was in bytes, and it will convert it to whatever a user pleases. Supported
units are: {{{}B{}}}, {{{}KiB{}}}, {{MiB}} and {{{}GiB{}}}. The result will be
rounded to two decimal places.
I propose this:
{code:java}
CREATE TABLE ks.tb (
id int PRIMARY KEY,
val bigint
) {code}
{code:java}
cqlsh> select * from ks.tb;
id | val
----+----------------
5 | 60000
1 | 1234234
2 | 12342341234234
4 | 60001
7 | null
6 | 43
3 | 123423 {code}
{code:java}
cqlsh> select format_bytes(val) from ks.tb;
system.format_bytes(val)
--------------------------
58.59 KiB
1.18 MiB
11494.7 GiB
58.59 KiB
null
43 B
120.53 KiB{code}
I also propose that this format_bytes function (name of it might be indeed
discussed and it is just a suggestion) should be only applicable for int,
smallint, bigint and varint types. I am not sure how to apply this to e.g.
"float" or similar. As I mentioned, it is meant to convert just number of
bytes, which is just some number, to a string representation of that and I do
not think that applying that function to anything else but these types makes
sense.
The second way to call {{format_bytes}} functions is to specify into what size
unit we would like to see all values to be converted to. For example, we want
all size to be represented in mebibytes, hence we do:
{code:java}
cqlsh> select format_bytes(val, 'MiB') from ks.tb;
system.format_bytes(val, 'MiB')
----------------------------------
0.06 MiB
1.18 MiB
11770573.84 MiB
0.06 MiB
null
0 MiB
0.12 MiB {code}
Lastly, we can specify a source unit and a target unit. A source unit tells
what unit that column is logically of, the target unit tells what unit we want
these values to be converted to. For example, if we know that our column is
logically in kibibytes and we want them to be converted into mebibytes, we
would do:
{code:java}
cqlsh> select format_bytes(val, 'Kib', 'MiB') from ks.tb;
system.format_bytes(val, 'KiB', 'MiB')
----------------------------------------
58.59 MiB
1205.31 MiB
12053067611.56 MiB
58.59 MiB
null
0.04 MiB
120.53 MiB {code}
Similarly to {*}{{format_bytes}}{*}, we can do transformations on duration-like
columns, namely {*}format_time{*}.
Supported units are: {{{}d{}}}, {{{}h{}}}, {{{}m{}}}, {{{}s{}}}, {{{}ms{}}},
{{{}us{}}}, {{{}µs{}}}, {{{}ns{}}}. Supported column types on which this
function is possible to be applied: {{{}INT{}}}, {{{}TINYINT{}}},
{{{}SMALLINT{}}}, {{{}BIGINT{}}}, {{{}VARINT{}}}, {{{}ASCII{}}}, {{{}TEXT{}}}.
For {{ASCII}} and {{TEXT}} types, text of such column has to be a non-negative
number. The result will be rounded to two decimal places.
{code:java}
cqlsh> select format_time(val) from ks.tb;
system.format_time(val)
-------------------------
1 m
20.57 m
142851.17 d
1 m
null
43 ms
2.06 m {code}
We may specify what unit we want that value to be converted to, give the
column’s values are in milliseconds:
{code:java}
cqlsh> select format_time(val, 'm') from ks.tb;
system.format_time(val, 'm')
------------------------------
1 m
20.57 m
205705687.24 m
1 m
null
0 m
2.06 m {code}
Lastly, we can specify both source and target values:
{code:java}
cqlsh> select format_time(val, 's', 'h') from ks.tb;
system.format_time(val, 's', 'h')
-----------------------------------
16.67 h
342.84 h
3428428120.62 h
16.67 h
null
0.01 h
34.28 h {code}
was:
There are cases (e.g in our system_views tables but might be applicable for
user tables as well) when a column is of a type which represents number of
bytes. However, it is quite hard to parse a value for a human to have some
estimation what that value is.
I propose this:
{code:java}
cqlsh> select * from myks.mytb ;
id | col1 | col2 | col3 | col4
----+------+------+------+----------
1 | 100 | 200 | 300 | 32432423
(1 rows)
cqlsh> select format_bytes(col4) from myks.mytb where id = 1;
system.format_bytes(col4)
----------------------
30.93 MiB
(1 rows)
cqlsh> select format_bytes(col4,0) from myks.mytb where id = 1;
system.to_human_size(col4, 0)
-------------------------
31 MiB
(1 rows)
cqlsh> select to_human_size(col4,1) from myks.mytb where id = 1;
system.to_human_size(col4, 1)
-------------------------
30.9 MiB
(1 rows)
{code}
The second argument is optional and represents the number of decimal places (at
most) to use. Without the second argument, it will default to FileUtils.df
which is "#.##" format.
{code:java}
cqlsh> DESCRIBE myks.mytb ;
CREATE TABLE myks.mytb (
id int PRIMARY KEY,
col1 int,
col2 smallint,
col3 bigint,
col4 varint,
)
{code}
I also propose that this to_human_size function (name of it might be indeed
discussed and it is just a suggestion) should be only applicable for int,
smallint, bigint and varint types. I am not sure how to apply this to e.g.
"float" or similar. As I mentioned, it is meant to convert just number of
bytes, which is just some number, to a string representation of that and I do
not think that applying that function to anything else but these types makes
sense.
> Add format_bytes and format_time functions
> ------------------------------------------
>
> Key: CASSANDRA-19546
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19546
> Project: Apache Cassandra
> Issue Type: New Feature
> Components: Legacy/CQL
> Reporter: Stefan Miklosovic
> Assignee: Cheng Wang
> Priority: Normal
> Fix For: 5.x
>
> Time Spent: 3h 50m
> Remaining Estimate: 0h
>
> There are cases (e.g in our system_views tables but might be applicable for
> user tables as well) when a column is of a type which represents number of
> bytes. However, it is quite hard to parse a value for a human to have some
> estimation what that value is. This function looks at values in a column as
> if it was in bytes, and it will convert it to whatever a user pleases.
> Supported units are: {{{}B{}}}, {{{}KiB{}}}, {{MiB}} and {{{}GiB{}}}. The
> result will be rounded to two decimal places.
> I propose this:
>
> {code:java}
> CREATE TABLE ks.tb (
> id int PRIMARY KEY,
> val bigint
> ) {code}
>
> {code:java}
> cqlsh> select * from ks.tb;
> id | val
> ----+----------------
> 5 | 60000
> 1 | 1234234
> 2 | 12342341234234
> 4 | 60001
> 7 | null
> 6 | 43
> 3 | 123423 {code}
> {code:java}
> cqlsh> select format_bytes(val) from ks.tb;
> system.format_bytes(val)
> --------------------------
> 58.59 KiB
> 1.18 MiB
> 11494.7 GiB
> 58.59 KiB
> null
> 43 B
> 120.53 KiB{code}
> I also propose that this format_bytes function (name of it might be indeed
> discussed and it is just a suggestion) should be only applicable for int,
> smallint, bigint and varint types. I am not sure how to apply this to e.g.
> "float" or similar. As I mentioned, it is meant to convert just number of
> bytes, which is just some number, to a string representation of that and I do
> not think that applying that function to anything else but these types makes
> sense.
> The second way to call {{format_bytes}} functions is to specify into what
> size unit we would like to see all values to be converted to. For example, we
> want all size to be represented in mebibytes, hence we do:
> {code:java}
> cqlsh> select format_bytes(val, 'MiB') from ks.tb;
> system.format_bytes(val, 'MiB')
> ----------------------------------
> 0.06 MiB
> 1.18 MiB
> 11770573.84 MiB
> 0.06 MiB
> null
> 0 MiB
> 0.12 MiB {code}
> Lastly, we can specify a source unit and a target unit. A source unit tells
> what unit that column is logically of, the target unit tells what unit we
> want these values to be converted to. For example, if we know that our column
> is logically in kibibytes and we want them to be converted into mebibytes, we
> would do:
>
> {code:java}
> cqlsh> select format_bytes(val, 'Kib', 'MiB') from ks.tb;
> system.format_bytes(val, 'KiB', 'MiB')
> ----------------------------------------
> 58.59 MiB
> 1205.31 MiB
> 12053067611.56 MiB
> 58.59 MiB
> null
> 0.04 MiB
> 120.53 MiB {code}
>
>
> Similarly to {*}{{format_bytes}}{*}, we can do transformations on
> duration-like columns, namely {*}format_time{*}.
> Supported units are: {{{}d{}}}, {{{}h{}}}, {{{}m{}}}, {{{}s{}}}, {{{}ms{}}},
> {{{}us{}}}, {{{}µs{}}}, {{{}ns{}}}. Supported column types on which this
> function is possible to be applied: {{{}INT{}}}, {{{}TINYINT{}}},
> {{{}SMALLINT{}}}, {{{}BIGINT{}}}, {{{}VARINT{}}}, {{{}ASCII{}}},
> {{{}TEXT{}}}. For {{ASCII}} and {{TEXT}} types, text of such column has to be
> a non-negative number. The result will be rounded to two decimal places.
> {code:java}
> cqlsh> select format_time(val) from ks.tb;
> system.format_time(val)
> -------------------------
> 1 m
> 20.57 m
> 142851.17 d
> 1 m
> null
> 43 ms
> 2.06 m {code}
> We may specify what unit we want that value to be converted to, give the
> column’s values are in milliseconds:
> {code:java}
> cqlsh> select format_time(val, 'm') from ks.tb;
> system.format_time(val, 'm')
> ------------------------------
> 1 m
> 20.57 m
> 205705687.24 m
> 1 m
> null
> 0 m
> 2.06 m {code}
> Lastly, we can specify both source and target values:
> {code:java}
> cqlsh> select format_time(val, 's', 'h') from ks.tb;
> system.format_time(val, 's', 'h')
> -----------------------------------
> 16.67 h
> 342.84 h
> 3428428120.62 h
> 16.67 h
> null
> 0.01 h
> 34.28 h {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]