Hi guys, I am sending a patch that extends the support for printing floating point numbers provided by Adam Saleh. The attached printf prints doubles and floats in styles %e, %f, %g with any modifiers basically to spec [1].
Sorry about the patch length. Unfortunately, converting a binary floating
point representation into a decimal string is far from trivial.
RFC
---
The code uses uint64_t arithmetic (but no division). I assumed all platforms
supported by HelenOS provide such functionality.
Moreover the code that decodes the IEEE 754 floating point number binary
representation assumes that the endianness of integers and doubles is the
same (see extract_ieee_double() in uspace/lib/c/generic/ieee_double.c). I
am not sure if this is true for all platforms.
Furthermore I am assuming doubles are 64 bits long.
Lastly I was not sure where to place the needed abs() macro. I decided
to add it to uspace/lib/c/include/macros.h.
Algorithm
---------
I researched the available literature [2,3,4,5] on converting floating points
from a binary to a decimal representation and I chose [5] from 2010.
Pros of [5]:
+ Accurate, ie converting the produced strings back to binary would yield
the same binary double.
+ Does not need bignums like the others (ie no need for malloc!)
+ No special cases; simpler implementation.
+ Fast
Cons:
- Produces an optimal string representation in only over 99% of cases.
An optimal string is the shortest accurate/correct decimal string that
is also the closest to the binary form out of all such shortest accurate
decimal strings. Unlike [2,3,4] a few doubles are converted with an extra
digit or two. Not really an issue due to the printf specification (see
below).
The algorithm uses a pregenerated table of powers of 10. I wrote the
attached gen_pow10.cpp to generate such a table with GMP. It is my first
and only GMP program to date; therefore, it is incredibly clumsy. However
I am confident the table was generated correctly.
To reproduce the table, build:
$ g++ gen_pow10.cpp -lgmpxx -lgmp
And run:
$ ./a.out -368 376 8 struct
Deviation from std
------------------
I deviated from the standard for the %g specifier if no precision is supplied.
The standard demands to default precision to 6. Doing so would however make
it difficult for users to request an accurate string representation that
at the same time is not 20 digits long (because the last digit must be rounded
and may therefore become inaccurate/incorrect). Always specifying the precision
forces the implementation to generate as many digits. It does not leave any
room for the implementation to figure out how many digits are really needed
for an accurate conversion.
In short if no precision is specified for %g the shortest string is printed
instead of mucking around with rounding a number of at least 6 digits.
%e and %f work per spec, ie if no precision is given it defaults to 6.
Testing
-------
The code was tested only mildly but it seems to give correct results even
for the more obscure format strings (see uspace/app/tester/print/print6.c).
Unsupported
-----------
Long doubles are not supported. Adding such support is possible but it would
require checking some math and working with two uint64_t instead of one.
Adding support for %a should be simple. I leave it as an excercise for the
reader ;-).
Conclusion
----------
I am looking forward to any bug fixes, comments or improvement suggestions.
Adam Hraska
PS: My mail bounced back due to size => patch is gziped
[1] c99 standard, latest updated draft, 2007
www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
[2] How to print floating-point numbers accurately;
Steele, White, 1990
[3] Correctly rounded binary-decimal and decimal-binary conversions;
Gay, 1990
[4] Printing Floating-Point Numbers Quickly and Accurately;
Burger, Dybvig, 1996
[5] Printing floating-point numbers quickly and accurately with integers;
Loitsch, 2010
printf-double.patch.gz
Description: GNU Zip compressed data
#include <iostream>
#include <string>
#include <sstream>
#include <cmath>
#include <cassert>
#include <stdint.h>
#include <gmpxx.h>
using namespace std;
const int double_max_exp = 1023;
const int double_min_exp = -1022;
const int double_bin_exp_range = double_max_exp - double_min_exp + 1;
// Add a leeway of at least 2*64 bits just to be safe.
const int leeway = 200;
const mp_bitcnt_t precision_bits = double_bin_exp_range + leeway;
typedef void (*pfunc_print_t)(uint64_t ms64bits, int bin_exp, int dec_exp);
// Returns the most significant normalized 64 bits of f.
// f == ms64bits * 2^bin_exp, where true == ms64bits & (1 << 63)
void most_significant_64(const mpf_class & f, bool round_up, uint64_t & ms64bits,
int & bin_exp)
{
const int u64_hex_len = 2 * sizeof(uint64_t);
const int garbage_len = 2 * sizeof(uint64_t);
// One extra hex digits so we can normalize (and round).
// FIXME: Garbage digits will hopefully prevent the gmp's
// rounding from reaching our extra digit.
const int ms_u64_hex_len = u64_hex_len + 1 + garbage_len;
mp_exp_t hex_exp;
// One extra hex digits so we can normalize (and round).
// FIXME: this is buggy: gmp will round the result so we
// are hoping not all garbage digits at the end will be 0xf
// and rounding will not propagate to our extra digit.
string hex_str = f.get_str(hex_exp, 16, ms_u64_hex_len);
// Pad with trailing zeros to full length.
if (hex_str.length() < ms_u64_hex_len) {
assert(0 < hex_str.length());
hex_str.insert(hex_str.end(), ms_u64_hex_len - hex_str.length(), '0');
}
// Separate the 64 bits from the remaining 4 bits with a space.
hex_str.insert(u64_hex_len, " ");
// Remove the garbage digits that gmp may have clobbered by rounding.
// u64_hex_len + space + one extra digit
hex_str.erase(hex_str.begin() + u64_hex_len + 2, hex_str.end());
uint64_t high_part = 0;
uint32_t low_part = 0;
const int low_msbit_shift = 3;
stringstream ss(hex_str);
ss >> hex >> high_part >> hex >> low_part;
assert(!ss.bad() && high_part != 0);
int shift = 0;
// Normalize (ie, most-significant bit should be set).
while (0 == (high_part >> 63) & 1) {
++shift;
assert(shift < 4);
high_part = (high_part << 1) | (low_part >> low_msbit_shift);
low_part <<= 1;
}
if (round_up) {
// Round up (ie away from zero since it is a positive number).
high_part += 1 & (low_part >> low_msbit_shift);
// high_part was 0xff..ff so rounding up caused an overflow.
if (0 == high_part) {
ms64bits = 1ULL << 63;
--shift;
assert(-1 <= shift && shift <= 3);
} else {
// No overflow, properly rounded.
ms64bits = high_part;
assert(0 <= shift && shift <= 3);
}
} else {
// Truncate.
ms64bits = high_part;
}
const int u64_width = 64;
bin_exp = 4 * ((int)hex_exp) - shift - u64_width;
}
void print_normalized_top64(int from_dec_exp, int to, int step,
bool round_up, pfunc_print_t print_num)
{
mpf_class ten(10.0, precision_bits);
mpf_class f(0.0, precision_bits);
uint64_t ms64bits;
int bin_exp;
// Print negatives using reciprocals.
if (from_dec_exp < 0) {
for (int i = from_dec_exp; i <= min(to, -1); i += step) {
// f = 10^|i|
mpf_pow_ui(f.get_mpf_t(), ten.get_mpf_t(), abs(i));
f = 1/f;
most_significant_64(f, round_up, ms64bits, bin_exp);
print_num(ms64bits, bin_exp, i);
}
}
// Print positives.
if (0 <= to) {
int pos_offset = (from_dec_exp % step);
if (pos_offset < 0) {
pos_offset += step;
}
int pos_start = max(pos_offset, from_dec_exp);
// f = 10^|pos_start|
mpf_pow_ui(f.get_mpf_t(), ten.get_mpf_t(), pos_start);
for (int i = pos_start; i <= to; i += step) {
most_significant_64(f, round_up, ms64bits, bin_exp);
print_num(ms64bits, bin_exp, i);
// Precision is large enough; no errors are introduced.
for (int k = 0; k < step; ++k) {
f *= 10;
}
}
}
}
void print_nicely(uint64_t ms64bits, int bin_exp, int dec_exp)
{
cout << "10^" << dec_exp << "\t= 0x"
<< hex << ms64bits << dec << " * 2^" << bin_exp << "\n";
}
void print_as_c_struct(uint64_t ms64bits, int bin_exp, int dec_exp)
{
cout << "{ 0x" << hex << ms64bits << "ULL, "
<< dec << bin_exp << ", " << dec_exp << " },\n";
}
int main(int argc, char **argv)
{
pfunc_print_t print_func = print_nicely;
int from = -4, to = 20, step = 1;
bool round_up = true;
if (3 <= argc) {
stringstream ss(argv[1]);
ss >> from;
ss.clear();
ss.str(argv[2]);
ss >> to;
if (4 <= argc) {
ss.clear();
ss.str(argv[3]);
ss >> step;
}
for (int i = 4; i < argc; ++i) {
if (string(argv[i]) == "struct") {
print_func = print_as_c_struct;
}
if (string(argv[i]) == "trunc") {
round_up = false;
}
}
}
cout << "gmp version: " << gmp_version << "\n";
cout << "printing " << from << ".." << to << ", step " << step << "\n";
cout << "rounding mode: " <<
(round_up ? "up (away from zero)" : "truncation (toward zero)") << "\n";
print_normalized_top64(from, to, step, round_up, print_func);
return 0;
}
_______________________________________________ HelenOS-devel mailing list [email protected] http://lists.modry.cz/cgi-bin/listinfo/helenos-devel
