Hi all,

I've tried to take all of the comments into account. I've reviewed OpenBSD's
fmt_scaled() and NetBSD's humanize_number(), and tried to work out all the
useful things this function could do.

Nothing is final yet, but I don't think it will go into printf: printf is used
everywhere, so changing it would be very dangerous, and programs written
against the new printf would not be portable to other *nix systems. Maybe I'll
have to reconsider the printf() option, but for now I've tried a different
approach.

I've called my function humanize_number(), as in NetBSD. The name is of course
open to discussion; I haven't given it much thought yet. Reusing
humanize_number() would let some developers (those coming from NetBSD) feel at
home on OpenSolaris.

Concerning where it should be done, I think libutil is a good choice.

Here is a draft of the function's man page. This is a first, preliminary,
pre-alpha version, and since my English is not perfect, it may not be very
clear ... Just tell me what is incomprehensible!

============================================================
Man page draft.

 NAME
 humanize_number : converts a number of bytes into a human readable string

 SYNOPSIS
 #include <util.h>

 int humanize_number(int64_t value, char *buffer, size_t buf_len,
        int precision, u_int32_t flags, const char **cust_units,
        const char *cust_suffix, const char *cust_sep1k,
        const char *cust_sepdec);
  
 DESCRIPTION
 The humanize_number() function converts a number of bytes <value> into a
 human-readable format. The written string can represent a floating point
 value. <value> can be negative.
 The converted string is written into <buffer>, whose length is <buf_len>.
 The <precision> parameter sets the precision of the result string. By
 default, <precision> only applies to the decimal part of the result. You can,
 for example, choose to have 1, 2, or 3 digits after the dot, or none.
 Example :
 char buffer[32];
 humanize_number(123456789LL,buffer,32,3,0,NULL,NULL,NULL,NULL);
 will put into buffer "117.737 MB"
 <flags> is a set of flags that can be combined with a bitwise OR. They change
 the formatting of the result.
 
 The flags are :
 
 .SCALE FLAGS
 
 - FLG_SCALE_1000
   if this flag is specified, the conversion divides the value by a power of
   1000. Thus the result will be equal to (value/1000^X)
 
 - FLG_SCALE_1024
   if this flag is specified, the conversion divides the value by a power of
   1024. Thus the result will be equal to (value/1024^X)
 
   If neither flag is specified, FLG_SCALE_1024 is used; it is the default.
   If both are specified, FLG_SCALE_1024 is used.
 
 .UNIT FLAGS 

 - FLG_UNIT_AUTO
   if this flag is specified, the unit of the result string is chosen
   automatically: it is the one that fits best. In other words, <value> is
   divided by the divider (which can be 1000 or 1024) until the result is
   smaller than the divider.
   
 - FLG_UNIT_BYTE
   if this flag is specified, the result's unit will be 'BYTE'.
   Thus the result is <value>/divider^0 (no conversion is really done in this
   case)
 - FLG_UNIT_KILO
   if this flag is specified, the result's unit will be 'KILO'.
   Thus the result is <value>/divider^1
 - FLG_UNIT_MEGA 
   if this flag is specified, the result's unit will be 'MEGA'.
   Thus the result is <value>/divider^2
 - FLG_UNIT_GIGA
   if this flag is specified, the result's unit will be 'GIGA'.
   Thus the result is <value>/divider^3
 - FLG_UNIT_TERA
   if this flag is specified, the result's unit will be 'TERA'.
   Thus the result is <value>/divider^4
 - FLG_UNIT_PETA
   if this flag is specified, the result's unit will be 'PETA'.
   Thus the result is <value>/divider^5
 - FLG_UNIT_EXA
   if this flag is specified, the result's unit will be 'EXA'.
   Thus the result is <value>/divider^6
 
    If none of the above unit flags is specified, FLG_UNIT_AUTO is used.
    If several unit flags are specified, the first one is used. The order is :
    AUTO / BYTE / KILO / MEGA / GIGA / TERA / PETA / EXA
    Thus if FLG_UNIT_KILO and FLG_UNIT_TERA are both specified, FLG_UNIT_KILO
    is used.
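
A minimal sketch of the FLG_UNIT_AUTO rule described above, assuming the unit
is identified by its exponent (0 for BYTE up to 6 for EXA). The helper names
auto_unit_exponent() and unit_divisor() are hypothetical and are not part of
the draft; this only illustrates the "divide until smaller than the divider"
idea, not the actual code.

```c
#include <stdint.h>

/* Hypothetical helper: returns the unit exponent (0 = BYTE ... 6 = EXA)
 * obtained by dividing by `divider` (1000 or 1024) until the quotient is
 * smaller than `divider`. */
static int auto_unit_exponent(int64_t value, int64_t divider)
{
    /* Work on the magnitude so that negative values pick the same unit. */
    uint64_t v = value < 0 ? 0 - (uint64_t)value : (uint64_t)value;
    int exp = 0;

    while (v >= (uint64_t)divider && exp < 6) {
        v /= (uint64_t)divider;
        exp++;
    }
    return exp;
}

/* Hypothetical helper: returns divider^exp, the divisor for a given unit.
 * 1024^6 and 1000^6 both fit in an int64_t. */
static int64_t unit_divisor(int64_t divider, int exp)
{
    int64_t d = 1;

    while (exp-- > 0)
        d *= divider;
    return d;
}
```

With these helpers, 123456789 and a divider of 1024 select exponent 2 (MEGA),
and 123456789 / unit_divisor(1024, 2) yields 117 as the integer part.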
 
    
------------------------------------------------------------------------------------
    NOTE to myself : another flag could be used, FLG_UNIT_DESC (descending),
    so that if several unit flags are specified, the order is
    EXA / PETA / TERA / GIGA / MEGA / KILO / BYTE / AUTO
    The behaviour could be, if FLG_UNIT_AUTO is used with some other unit
    flags, to automatically choose the best unit among those specified. Thus,
    one could restrict the choice to the best of KILO, MEGA, GIGA (I don't
    want TERA or PETA or EXA results) with
    FLG_UNIT_AUTO|FLG_UNIT_KILO|FLG_UNIT_MEGA|FLG_UNIT_GIGA
    
------------------------------------------------------------------------------------

 .PRECISION FLAGS

 - FLG_PREC_DEC
   if this flag is used, the <precision> parameter applies only to the decimal
   part of the result: it determines the number of digits the decimal part
   will have. There will never be more than 3 digits after the dot in the
   result.
   If <precision> is 2, and FLG_PREC_DEC is used, then the result will have 2
   digits after the dot.
   Thus, if the accurate result would be 10.012, the string result will be 10.01
   If the accurate result would be 13.1, the string result will be 13.10
   Since the result will never have more than 3 digits in its decimal part, if
   <precision> is greater than 3 and FLG_PREC_DEC is used, humanize_number()
   will use 3 as the <precision> parameter.

 - FLG_PREC_TOT
   if this flag is used, the <precision> parameter applies to ALL digits in
   the result string (integer part and decimal part). That means that if the
   integer part already has <precision> or more digits, the decimal part will
   have 0 digits (and thus won't be displayed). Otherwise, the decimal part
   will have ( <precision> - [num_of_digits_in_integer_part] ) digits.
 Example (with FLG_PREC_TOT used)
 precision=3 number=12.1111 => result=12.1
 precision=4 number=12.1111 => result=12.11
 precision=5 number=12.1111 => result=12.111
 precision=6 number=12.1111 => result=12.111 (NEVER more than 3 digits in the
 decimal part)

 precision=3 number=23.4 => result=23.4
 precision=2 number=23.4 => result=23
 precision=1 number=23.4 => result=23
 precision=0 number=23.4 => result=23

 If neither or both of these flags are specified, FLG_PREC_DEC is used.
 If <precision> is 0 (zero), the result is the same whichever precision flag
 is used (no decimal part).
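
The two precision modes can be summarised as a small rule for the number of
decimal digits. The sketch below assumes a hypothetical helper
decimal_digits() (not part of the draft) whose `total` argument is non-zero
for FLG_PREC_TOT and zero for FLG_PREC_DEC; it only restates the rules above.

```c
/* Hypothetical helper: number of digits written after the decimal separator.
 * precision:  the <precision> parameter
 * int_digits: number of digits in the integer part of the scaled value
 * total:      non-zero for FLG_PREC_TOT, zero for FLG_PREC_DEC */
static int decimal_digits(int precision, int int_digits, int total)
{
    int n;

    if (precision < 0)
        precision = 0;
    /* In "total" mode the integer digits are counted against the precision. */
    n = total ? precision - int_digits : precision;
    if (n < 0)
        n = 0;
    /* The draft never prints more than 3 decimal digits. */
    if (n > 3)
        n = 3;
    return n;
}
```

For 12.1111 (two integer digits) in FLG_PREC_TOT mode, precisions 3, 4, 5 and
6 give 1, 2, 3 and 3 decimal digits, matching the examples above.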

 .ROUNDING FLAGS

 - FLG_ROUND_FLOOR
   if this flag is used, the rounding policy is to "cut" (truncate) the number.
   e.g. 12999 with precision 2 and FLG_SCALE_1000|FLG_PREC_DEC will give 12.99 KB
 - FLG_ROUND_CEIL
   if this flag is used, the rounding policy is to round the number up.
   e.g. 12999 with precision 2 and FLG_SCALE_1000|FLG_PREC_DEC will give 13.00 KB
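
A minimal sketch of the two rounding policies using integer arithmetic only.
The helper scale_fixed() is hypothetical (not part of the draft); it assumes a
non-negative value and ignores overflow for very large inputs, which real code
would have to handle with wider arithmetic.

```c
#include <stdint.h>

/* Hypothetical helper: divides `value` by `divisor` and returns the result as
 * an integer number of 10^-digits units (digits in 0..3), so 1299 with
 * digits == 2 stands for 12.99.
 * round_up == 0 truncates (FLG_ROUND_FLOOR), otherwise rounds up
 * (FLG_ROUND_CEIL). Assumes value >= 0 and divisor > 0. */
static int64_t scale_fixed(int64_t value, int64_t divisor, int digits,
                           int round_up)
{
    int64_t pow10 = 1;
    int64_t whole, num, frac;
    int i;

    for (i = 0; i < digits; i++)
        pow10 *= 10;

    whole = value / divisor;
    /* NOTE: this product may overflow for very large divisors. */
    num = (value % divisor) * pow10;
    frac = num / divisor;
    /* Round up only if something was actually cut off. */
    if (round_up && num % divisor != 0)
        frac++;

    /* If frac reached pow10, the addition carries into the integer part. */
    return whole * pow10 + frac;
}
```

With value 12999, divisor 1000 and 2 digits, truncation returns 1299 (12.99)
and rounding up returns 1300 (13.00), as in the two examples above.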

 .SUFFIXES FLAGS

 - FLG_NO_SPC
   No ' ' (space char) will be inserted between the number and the unit
   e.g. 12999 with precision 2 and FLG_SCALE_1000|FLG_PREC_DEC|FLG_NO_SPC will
   give 13.00KB
 - FLG_NO_SUFFIX
   No 'B' (which stands for Byte) will be written after the unit
   e.g. 12999 with precision 2 and FLG_SCALE_1000|FLG_PREC_DEC|FLG_NO_SUFFIX
   will give 13.00 K
 - FLG_NO_UNIT
   No unit char will be written to the result
   e.g. 12999 with precision 2 and FLG_SCALE_1000|FLG_PREC_DEC|FLG_NO_UNIT will
   give 13.00 B
   Of course, if this option is used, FLG_NO_SUFFIX should also be used,
   otherwise the result seems to be in bytes.

 If the 3 flags above are combined, the result string will contain only the
 number, with no extra characters.


 String parameters :
 _______________________
 const char **cust_units
 if not NULL, this array provides custom unit names.
 e.g :
 const char *u[] = { "","Kilo","Mega","Giga","Tera","Peta","Exa"};
 char buf[64];
 humanize_number(12999LL,buf,64,3,0,u,NULL,NULL,NULL);
 will write "12.694 KiloB" into buf
 u is an array that must contain AT LEAST 7 strings.

 If FLG_NO_UNIT is set, these custom units won't be used.

 _______________________
 const char *cust_suffix
 if not NULL, this string is used as a custom suffix.
 e.g :
 const char *s = " Bytes";
 humanize_number(12999LL,buf,64,3,0,u,s,NULL,NULL);
 will write "12.694 Kilo Bytes" into buf

 If FLG_NO_SUFFIX is set, this custom suffix won't be used.

 _______________________
 const char *cust_sep1k
 if not NULL, this string is used as a thousands separator.
 e.g :
 const char *sep1k = ",";
 humanize_number(1299999999LL,buf,64,3,FLG_UNIT_KILO,u,s,sep1k,NULL);
  
 will write "1,269,531.249 Kilo Bytes" into buf
  
 humanize_number(1299999999LL,buf,64,3,FLG_UNIT_KILO,u,s,", ",NULL);
  
 will write "1, 269, 531.249 Kilo Bytes" into buf
  
 _______________________
 const char *cust_sepdec
 if not NULL, this string is used as the decimal separator.
 e.g :
 const char *sepdec = ",";
 sep1k = " "; 
 humanize_number(1299999999LL,buf,64,3,FLG_UNIT_KILO,u,s,sep1k,sepdec);
 will print "1 269 531,249 Kilo Bytes" into buf
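
The thousands separator handling can be sketched as follows. The helper
group_thousands() is hypothetical (not part of the draft) and assumes the
integer part has already been converted to a string of decimal digits; it
only illustrates how a multi-character separator such as ", " fits in.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the digit string `digits` into `out`,
 * inserting `sep` between groups of three digits counted from the right.
 * Returns 0 on success, -1 if `out` (of size out_len) is too small. */
static int group_thousands(const char *digits, const char *sep,
                           char *out, size_t out_len)
{
    size_t n = strlen(digits);
    size_t seplen = strlen(sep);
    size_t groups = n ? (n - 1) / 3 : 0;   /* number of separators needed */
    size_t need = n + groups * seplen + 1; /* +1 for the terminating NUL */
    size_t i, pos = 0;

    if (need > out_len)
        return -1;

    for (i = 0; i < n; i++) {
        size_t remaining = n - 1 - i;      /* digits still to be copied */

        out[pos++] = digits[i];
        /* A separator follows whenever a whole number of groups remains. */
        if (remaining > 0 && remaining % 3 == 0) {
            memcpy(out + pos, sep, seplen);
            pos += seplen;
        }
    }
    out[pos] = '\0';
    return 0;
}
```

Called with "1269531" and ",", this produces "1,269,531"; with " " it produces
"1 269 531", to which the decimal separator and decimal part would then be
appended.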
 
 More examples
 const char *u[] =
     {"","thousands","millions","billions","trillions","quadrillion","quintillion"};
 const char *s = " dollars";
 int64_t price = 1234567891011LL;
 char buf[64];
 humanize_number(price,buf,64,3,FLG_SCALE_1000|FLG_UNIT_MEGA,u,s,",",NULL);
 /* an int64_t needs %lld (with a cast), not %d */
 printf("The price is %lld$ (%s)\n",(long long)price,buf);

 $ The price is 1234567891011$ (1,234,567.891 millions dollars)
 

 RETURN VALUE
 On success, it returns 0.
 On error, it returns a negative value.
 The cause of the error can be determined from the returned value:
 if ((-rc) & ERR_BUFFER_NULL) {
        fprintf(stderr,"You provided a NULL buffer !!!\n");
 } 

 if ((-rc) & ERR_GLOBAL_ERROR) {
        fprintf(stderr,"An unknown error occurred !!!\n");
 } 
   
 if ((-rc) & WARN_BUFF_TOO_SHORT) {
        fprintf(stderr,"You provided a buffer which is too short !!!\n");
 } 
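
The checks above imply that the return value is the negation of a set of OR'd
bits. A minimal sketch of that encoding follows; the numeric values of the
constants and the helpers make_rc() and rc_has() are assumptions made for
illustration, since the draft does not define them.

```c
/* Assumed values: the draft names these constants but does not give their
 * values. Each one must be a distinct bit so that they can be combined. */
enum {
    ERR_BUFFER_NULL     = 0x01,
    ERR_GLOBAL_ERROR    = 0x02,
    WARN_BUFF_TOO_SHORT = 0x04
};

/* Hypothetical helper: builds the return value from a set of OR'd bits.
 * No bits set gives 0 (success); otherwise the result is negative. */
static int make_rc(int bits)
{
    return -bits;
}

/* Hypothetical helper: tests whether a given bit is set in a return code. */
static int rc_has(int rc, int bit)
{
    return rc < 0 && ((-rc) & bit) != 0;
}
```

This lets one call report several conditions at once, for example
make_rc(ERR_BUFFER_NULL | WARN_BUFF_TOO_SHORT), while 0 still means success.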
   
 
 If you specify the flag FLG_GET_BUF_LEN, the humanize_number() function
 returns a positive value, which is the minimal length the buffer needs in
 order to hold the result with your options (in this case, only the
 <cust_units>, <cust_suffix>, <cust_sep1k> and <cust_sepdec> parameters are
 used to determine the maximum length the result could have).

 If the FLG_BUF_LEN_POW2 flag is used with FLG_GET_BUF_LEN, the returned
 length is a power of 2.

 e.g.
 
 const char *u[] =
     {"","thousands","millions","billions","trillions","quadrillion","quintillion"};
 const char *s = " dollars";
 int64_t price = 1234567891011LL;
 char *buf;
 int len;
 /* first query: minimal buffer length for these custom strings */
 len = humanize_number(0,NULL,0,0,FLG_GET_BUF_LEN,u,s,",",NULL);
 printf("The buffer is %d chars long.\n",len);
 /* second query: the same length, as a power of 2 */
 len = humanize_number(0,NULL,0,0,FLG_GET_BUF_LEN|FLG_BUF_LEN_POW2,u,s,",",NULL);
 if (len <= 0) {
        // bark
 }
 buf=(char*)malloc(len);
 printf("The buffer is %d chars long.\n",len);
 /* pass the allocated length rather than a hard-coded 64 */
 humanize_number(price,buf,len,3,FLG_SCALE_1000|FLG_UNIT_MEGA,u,s,",",NULL);
 /* an int64_t needs %lld (with a cast), not %d */
 printf("The price is %lld$ (%s)\n",(long long)price,buf);
 
 $ The buffer is 48 chars long.
 $ The buffer is 64 chars long. 
 $ The price is 1234567891011$ (1,234,567.891 millions dollars)
 
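 As long as you don't change the <cust_units>, <cust_suffix>, <cust_sep1k> and
 <cust_sepdec> parameters, the length returned with FLG_GET_BUF_LEN stays
 valid, so the same buffer can be reused for later calls.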
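
One possible reading of FLG_BUF_LEN_POW2 is "round the minimal length up to
the next power of 2", which is consistent with the 48 and 64 shown in the
example output. The helper round_up_pow2() below is hypothetical and only
sketches that reading.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the smallest power of 2 that is >= n.
 * For n == 0 it returns 1. If n is larger than the biggest power of 2 that
 * fits in a size_t, that biggest power is returned instead. */
static size_t round_up_pow2(size_t n)
{
    size_t p = 1;

    /* Stop before the shift would overflow to 0. */
    while (p < n && p <= SIZE_MAX / 2)
        p <<= 1;
    return p;
}
```

Under this assumption, a minimal length of 48 becomes 64, and a length that is
already a power of 2 is returned unchanged.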
    
------------------------------------------------------------------------------------
    Note to myself : a flag could be used to determine whether trailing zeros
    in the decimal part are displayed or not.
    
------------------------------------------------------------------------------------
============================================================

I hope that was not too boring ;-)

Some comments (my own ones)

* As I said before, I've tried to find the useful things this could do. Maybe
I've included some options that won't be useful ...

* I did not want the function to take too many parameters, which is why I've
used flags. But there are a lot of flags, and that could be tedious to use ...
I thought about a scaled_string struct that would contain all the values and
be the only parameter to the function. What do you think of this?
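
To make the struct idea concrete, here is one possible layout. Everything in
it is an assumption for discussion: the field names simply mirror the current
parameters, and scaled_string_init() is a hypothetical helper that sets the
defaults so that callers only fill in what they need.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: one field per parameter of the current prototype. */
struct scaled_string {
    int64_t      value;       /* number of bytes to convert */
    char        *buffer;      /* destination buffer */
    size_t       buf_len;     /* size of the destination buffer */
    int          precision;   /* see the precision flags */
    uint32_t     flags;       /* OR'd FLG_* values, 0 for the defaults */
    const char **cust_units;  /* NULL for the default units */
    const char  *cust_suffix; /* NULL for the default suffix */
    const char  *cust_sep1k;  /* NULL for no thousands separator */
    const char  *cust_sepdec; /* NULL for the default decimal separator */
};

/* Hypothetical helper: fills in the mandatory fields and resets every
 * optional field to its default. */
static void scaled_string_init(struct scaled_string *s, int64_t value,
                               char *buffer, size_t buf_len)
{
    s->value = value;
    s->buffer = buffer;
    s->buf_len = buf_len;
    s->precision = 0;
    s->flags = 0;
    s->cust_units = NULL;
    s->cust_suffix = NULL;
    s->cust_sep1k = NULL;
    s->cust_sepdec = NULL;
}
```

The function would then take a single pointer to this struct. The trade-off is
that simple calls need a few extra lines of setup, while calls with many
options become easier to read.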

My Solaris box died last week ... so I've written the code on my Linux box.
All of the features in the draft are implemented. It is about 200 to 250 lines
of C code, and it seems to be quite fast.

If you have any idea/comment/advice ... I'd be happy to hear from you.

Regards

Yann
This message posted from opensolaris.org