[ https://issues.apache.org/jira/browse/CASSANDRA-19546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842437#comment-17842437 ]
Brad Schoening commented on CASSANDRA-19546:
--------------------------------------------

[~smiklosovic] I like the intent here and think this will be a nice enhancement. In my view, more direct function names would be preferable to a generic catch-all.

I looked at some other DBs which support something like this:

[format_bytes()|https://dev.mysql.com/doc/refman/5.7/en/sys-format-bytes.html]
{quote}Given a byte count, converts it to human-readable format and returns a string consisting of a value and a units indicator. Depending on the size of the value, the units part is {{bytes}}, {{KiB}} (kibibytes), {{MiB}} (mebibytes), {{GiB}} (gibibytes), {{TiB}} (tebibytes), or {{PiB}} (pebibytes).
{quote}

[format_time()|https://dev.mysql.com/doc/refman/8.0/en/sys-format-time.html]
{quote}Depending on the size of the value, the units part is {{ps}} (picoseconds), {{ns}} (nanoseconds), {{us}} (microseconds), {{ms}} (milliseconds), {{s}} (seconds), {{m}} (minutes), {{h}} (hours), {{d}} (days), or {{w}} (weeks).
{quote}

[format()|https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_format]
{quote}Formats the number _{{X}}_ to a format like {{'#,###,###.##'}}, rounded to _{{D}}_ decimal places, and returns the result as a string.
{quote}

MySQL, [BigQuery|https://cloud.google.com/bigquery/docs/reference/standard-sql/format-elements] and [Azure Data Explorer|https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/format-bytes-function] use the format_\{type} approach. Oracle, Postgres and Databricks use a single function, [to_char()|https://www.postgresql.org/docs/current/functions-formatting.html].

One could envision formatting percentages and currency at some future point. Instead of to_human_size, using either the explicit type naming (format_bytes) or to_char() would be a good convention to follow.

Would the time / duration format apply to both Cassandra's integer and duration types?

The code otherwise looks fine.

> Add to_human_size and to_human_duration function
> ------------------------------------------------
>
>                 Key: CASSANDRA-19546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19546
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Legacy/CQL
>            Reporter: Stefan Miklosovic
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> There are cases (e.g. in our system_views tables but might be applicable for
> user tables as well) when a column is of a type which represents number of
> bytes. However, it is quite hard to parse a value for a human to have some
> estimation what that value is.
> I propose this:
> {code:java}
> cqlsh> select * from myks.mytb ;
>  id | col1 | col2 | col3 | col4
> ----+------+------+------+----------
>   1 |  100 |  200 |  300 | 32432423
> (1 rows)
> cqlsh> select to_human_size(col4) from myks.mytb where id = 1;
>  system.to_human_size(col4)
> ----------------------
>  30.93 MiB
> (1 rows)
> cqlsh> select to_human_size(col4,0) from myks.mytb where id = 1;
>  system.to_human_size(col4, 0)
> -------------------------
>  31 MiB
> (1 rows)
> cqlsh> select to_human_size(col4,1) from myks.mytb where id = 1;
>  system.to_human_size(col4, 1)
> -------------------------
>  30.9 MiB
> (1 rows)
> {code}
> The second argument is optional and represents the number of decimal places
> (at most) to use. Without the second argument, it will default to
> FileUtils.df which is "#.##" format.
> {code}
> cqlsh> DESCRIBE myks.mytb ;
> CREATE TABLE myks.mytb (
>     id int PRIMARY KEY,
>     col1 int,
>     col2 smallint,
>     col3 bigint,
>     col4 varint,
> )
> {code}
> I also propose that this to_human_size function (name of it might be indeed
> discussed and it is just a suggestion) should be only applicable for int,
> smallint, bigint and varint types. I am not sure how to apply this to e.g.
> "float" or similar. As I mentioned, it is meant to convert just number of
> bytes, which is just some number, to a string representation of that and I do
> not think that applying that function to anything else but these types makes
> sense.
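
For readers who want to check the arithmetic behind the proposed output, here is a minimal Python sketch of the conversion. It is not the Cassandra implementation: the standalone function `to_human_size`, the `UNITS` list (including the `bytes` label for the smallest unit) and the cap at PiB are assumptions made for illustration. The "at most N decimal places" behaviour imitates a `#.##`-style pattern by rounding half-even (the default rounding mode of Java's `DecimalFormat`) and then dropping trailing zeros.

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Assumed unit labels for this sketch; binary (1024-based) steps.
UNITS = ["bytes", "KiB", "MiB", "GiB", "TiB", "PiB"]


def to_human_size(num_bytes: int, max_decimals: int = 2) -> str:
    """Render an integer byte count with at most `max_decimals` decimals."""
    if num_bytes < 0 or max_decimals < 0:
        raise ValueError("num_bytes and max_decimals must be non-negative")

    # Divide by 1024 until the value fits the unit (or units run out).
    value = Decimal(num_bytes)
    unit = 0
    while value >= 1024 and unit < len(UNITS) - 1:
        value /= 1024
        unit += 1

    # Round to the requested precision, e.g. 0.01 for two decimals.
    quantum = Decimal(1).scaleb(-max_decimals)
    text = format(value.quantize(quantum, rounding=ROUND_HALF_EVEN), "f")

    # "#.##" means *at most* two decimals, so strip trailing zeros.
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return f"{text} {UNITS[unit]}"


if __name__ == "__main__":
    # Same value as col4 in the example table of the issue description.
    print(to_human_size(32432423))     # 30.93 MiB
    print(to_human_size(32432423, 0))  # 31 MiB
    print(to_human_size(32432423, 1))  # 30.9 MiB
```

The three calls reproduce the values shown in the cqlsh session of the issue description (32432423 / 1024 / 1024 is roughly 30.92997). Restricting the input to integers mirrors the proposal to accept only int, smallint, bigint and varint.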