[ 
https://issues.apache.org/jira/browse/CASSANDRA-19546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842437#comment-17842437
 ] 

Brad Schoening commented on CASSANDRA-19546:
--------------------------------------------

[~smiklosovic] I like the intent here and think this will be a nice 
enhancement.  More direct function names instead of a generic catch-all would 
be preferable in my view.  I looked at some other DBs which support something 
like this:

[format_bytes()|https://dev.mysql.com/doc/refman/5.7/en/sys-format-bytes.html]
{quote}Given a byte count, converts it to human-readable format and returns a 
string consisting of a value and a units indicator. Depending on the size of 
the value, the units part is {{bytes}}, {{KiB}} (kibibytes), {{MiB}} 
(mebibytes), {{GiB}} (gibibytes), {{TiB}} (tebibytes), or {{PiB}} (pebibytes).
{quote}
[format_time()|https://dev.mysql.com/doc/refman/8.0/en/sys-format-time.html]
{quote}Depending on the size of the value, the units part is {{ps}} 
(picoseconds), {{ns}} (nanoseconds), {{us}} (microseconds), {{ms}} 
(milliseconds), {{s}} (seconds), {{m}} (minutes), {{h}} (hours), {{d}} (days), 
or {{w}} (weeks).
{quote}
[format()|https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_format]
{quote}Formats the number _{{X}}_ to a format like {{'#,###,###.##'}}, 
rounded to _{{D}}_ decimal places, and returns the result as a string.
{quote}
MySQL, [BigQuery|https://cloud.google.com/bigquery/docs/reference/standard-sql/format-elements]
and [Azure Data Explorer|https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/format-bytes-function]
use the format_\{type} approach.

Oracle, Postgres and Databricks use a single function, 
[to_char()|https://www.postgresql.org/docs/current/functions-formatting.html].
One could envision formatting percentages and currency at some future point. 

Instead of to_human_size, using either the explicit type naming (format_bytes) 
or to_char() would be a good convention to follow.
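
To illustrate what an explicitly named format_bytes-style function would do, here is a minimal Java sketch. It is not the code from the patch: the class name {{HumanSize}}, the method name {{formatBytes}} and the choice of {{Locale.ROOT}} are assumptions made for the example, and only the "#.##" default and the optional decimal-places argument come from the issue description.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of Cassandra.
final class HumanSize {
    // Binary units, as listed in the MySQL format_bytes() documentation quoted above.
    private static final String[] UNITS = {"bytes", "KiB", "MiB", "GiB", "TiB", "PiB"};

    // Default of two optional decimal places, i.e. the "#.##" pattern.
    static String formatBytes(long bytes) {
        return formatBytes(bytes, 2);
    }

    // maxDecimals is the maximum number of decimal places to print.
    static String formatBytes(long bytes, int maxDecimals) {
        if (bytes < 0 || maxDecimals < 0) {
            throw new IllegalArgumentException("bytes and maxDecimals must be non-negative");
        }
        double value = bytes;
        int unit = 0;
        // Divide by 1024 until the value fits the current unit or units run out.
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024;
            unit++;
        }
        // Build "#", "#.#", "#.##", ... depending on the requested precision.
        String pattern = maxDecimals == 0 ? "#" : "#." + "#".repeat(maxDecimals);
        // Locale.ROOT keeps '.' as the decimal separator regardless of the JVM locale.
        DecimalFormat df = new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(Locale.ROOT));
        return df.format(value) + " " + UNITS[unit];
    }
}
```

With the value from the issue description, {{HumanSize.formatBytes(32432423L)}} yields 30.93 MiB, and passing 0 or 1 as the second argument yields 31 MiB and 30.9 MiB respectively.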

Would the time / duration format apply to both Cassandra's integer and duration 
types?  The code otherwise looks fine.
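
For the integer case, a format_time-style function could look roughly like the Java sketch below. Everything in it is an assumption for illustration: the names {{HumanDuration}} and {{formatNanos}}, the choice of nanoseconds as the input unit, and the unit list, which is a subset of the one in the MySQL documentation quoted above. How Cassandra's duration type (months, days, nanoseconds) would map onto this is exactly the open question.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of Cassandra.
final class HumanDuration {
    private static final String[] UNITS = {"ns", "us", "ms", "s", "m", "h", "d"};
    // FACTORS[i] is how many of UNITS[i] make up one UNITS[i + 1].
    private static final long[] FACTORS = {1000, 1000, 1000, 60, 60, 24};

    // Assumes the integer input is a count of nanoseconds.
    static String formatNanos(long nanos) {
        if (nanos < 0) {
            throw new IllegalArgumentException("nanos must be non-negative");
        }
        double value = nanos;
        int unit = 0;
        // Move to the next larger unit while the value is at least one of it.
        while (unit < FACTORS.length && value >= FACTORS[unit]) {
            value /= FACTORS[unit];
            unit++;
        }
        // Same "#.##" default as in the size example from the issue description.
        DecimalFormat df = new DecimalFormat("#.##", DecimalFormatSymbols.getInstance(Locale.ROOT));
        return df.format(value) + " " + UNITS[unit];
    }
}
```

Under these assumptions, {{HumanDuration.formatNanos(1_500_000L)}} gives 1.5 ms and {{HumanDuration.formatNanos(90_000_000_000L)}} gives 1.5 m.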

> Add to_human_size and to_human_duration function
> ------------------------------------------------
>
>                 Key: CASSANDRA-19546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19546
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Legacy/CQL
>            Reporter: Stefan Miklosovic
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> There are cases (e.g. in our system_views tables, but possibly in user 
> tables as well) where a column holds a number of bytes. However, it is hard 
> for a human to read such a raw value and estimate how large it actually is.
> I propose this:
> {code:java}
> cqlsh> select * from myks.mytb ;
>  id | col1 | col2 | col3 | col4     
> ----+------+------+------+----------
>   1 |  100 |  200 |  300 | 32432423 
> (1 rows)
> cqlsh> select to_human_size(col4) from myks.mytb where id = 1;
>  system.to_human_size(col4)
> ----------------------
>             30.93 MiB
> (1 rows)
> cqlsh> select to_human_size(col4,0) from myks.mytb where id = 1;
>  system.to_human_size(col4, 0)
> -------------------------
>                   31 MiB
> (1 rows)
> cqlsh> select to_human_size(col4,1) from myks.mytb where id = 1;
>  system.to_human_size(col4, 1)
> -------------------------
>                 30.9 MiB
> (1 rows)
> {code}
> The second argument is optional and represents the maximum number of decimal 
> places to use. Without the second argument, the function defaults to the 
> format used by FileUtils.df, which is "#.##".
> {code}
> cqlsh> DESCRIBE myks.mytb ;
> CREATE TABLE myks.mytb (
> id int PRIMARY KEY,
> col1 int,
> col2 smallint,
> col3 bigint,
> col4 varint
> )
> {code}
> I also propose that this to_human_size function (the name is only a 
> suggestion and is open to discussion) should be applicable only to the int, 
> smallint, bigint and varint types. I am not sure how to apply it to e.g. 
> "float" or similar. As I mentioned, it is meant to convert a number of bytes, 
> which is just some number, to its string representation, and I do not think 
> that applying the function to any other type makes sense.



