Christophe de Dinechin <dinec...@redhat.com> writes: >> On 5 Dec 2019, at 16:29, Markus Armbruster <arm...@redhat.com> wrote: >> >> Tao Xu <tao3...@intel.com> writes: >> >>> Parse input string both as a double and as a uint64_t, then use the >>> method which consumes more characters. Update the related test cases. >>> >>> Signed-off-by: Tao Xu <tao3...@intel.com> >>> --- >> [...] >>> diff --git a/util/cutils.c b/util/cutils.c >>> index 77acadc70a..b08058c57c 100644 >>> --- a/util/cutils.c >>> +++ b/util/cutils.c >>> @@ -212,24 +212,43 @@ static int do_strtosz(const char *nptr, const char >>> **end, >>> const char default_suffix, int64_t unit, >>> uint64_t *result) >>> { >>> - int retval; >>> - const char *endptr; >>> + int retval, retd, retu; >>> + const char *suffix, *suffixd, *suffixu; >>> unsigned char c; >>> int mul_required = 0; >>> - double val, mul, integral, fraction; >>> + bool use_strtod; >>> + uint64_t valu; >>> + double vald, mul, integral, fraction; >> >> Note for later: @mul is double. >> >>> + >>> + retd = qemu_strtod_finite(nptr, &suffixd, &vald); >>> + retu = qemu_strtou64(nptr, &suffixu, 0, &valu); >>> + use_strtod = strlen(suffixd) < strlen(suffixu); >>> + >>> + /* >>> + * Parse @nptr both as a double and as a uint64_t, then use the method >>> + * which consumes more characters. >>> + */ >> >> The comment is in a funny place. I'd put it right before the >> qemu_strtod_finite() line. >> >>> + if (use_strtod) { >>> + suffix = suffixd; >>> + retval = retd; >>> + } else { >>> + suffix = suffixu; >>> + retval = retu; >>> + } >>> >>> - retval = qemu_strtod_finite(nptr, &endptr, &val); >>> if (retval) { >>> goto out; >>> } >> >> This is even more subtle than it looks. > > But why it is even necessary? > > The “contract” for the function used to be that it returned rounded values > beyond 2^53, which in itself is curious. > > But now it’s a 6-dimensional matrix of hell with NaNs and barfnots, when the > name implies it’s simply doing a text to u64 conversion… > > There is certainly a reason, but I’m really curious what it is :-)
It all goes back to commit 9f9b17a4f0 "Introduce strtosz() library function to convert a string to a byte count.". To support "convenient" usage like "1.5G", it parses the number part with strtod(). This limits us to 53 bits of precision. Larger sizes get rounded. I guess the excuse for this was that when you're dealing with sizes that large (petabytes!), your least significant bits are zero anyway. Regardless, the interface is *awful*. We should've forced the author to spell it out in all its glory in a proper function contract. That tends to cool the enthusiasm for "convenient" syntax amazingly fast. The awful interface has been confusing people for close to a decade now. What to do? Tao Xu's patch tries to make the function do what its users expect, namely parse a bleepin' 64 bit integer, without breaking any of the "convenience" syntax. Turns out that's amazingly subtle. Are we making things less confusing or more?