> On 17 Dec 2019, at 15:08, Markus Armbruster <arm...@redhat.com> wrote:
>
> Christophe de Dinechin <dinec...@redhat.com> writes:
>
>>> On 5 Dec 2019, at 16:29, Markus Armbruster <arm...@redhat.com> wrote:
>>>
>>> Tao Xu <tao3...@intel.com> writes:
>>>
>>>> Parse input string both as a double and as a uint64_t, then use the
>>>> method which consumes more characters. Update the related test cases.
>>>>
>>>> Signed-off-by: Tao Xu <tao3...@intel.com>
>>>> ---
>>> [...]
>>>> diff --git a/util/cutils.c b/util/cutils.c
>>>> index 77acadc70a..b08058c57c 100644
>>>> --- a/util/cutils.c
>>>> +++ b/util/cutils.c
>>>> @@ -212,24 +212,43 @@ static int do_strtosz(const char *nptr, const char
>>>> **end,
>>>> const char default_suffix, int64_t unit,
>>>> uint64_t *result)
>>>> {
>>>> - int retval;
>>>> - const char *endptr;
>>>> + int retval, retd, retu;
>>>> + const char *suffix, *suffixd, *suffixu;
>>>> unsigned char c;
>>>> int mul_required = 0;
>>>> - double val, mul, integral, fraction;
>>>> + bool use_strtod;
>>>> + uint64_t valu;
>>>> + double vald, mul, integral, fraction;
>>>
>>> Note for later: @mul is double.
>>>
>>>> +
>>>> + retd = qemu_strtod_finite(nptr, &suffixd, &vald);
>>>> + retu = qemu_strtou64(nptr, &suffixu, 0, &valu);
>>>> + use_strtod = strlen(suffixd) < strlen(suffixu);
>>>> +
>>>> + /*
>>>> + * Parse @nptr both as a double and as a uint64_t, then use the method
>>>> + * which consumes more characters.
>>>> + */
>>>
>>> The comment is in a funny place. I'd put it right before the
>>> qemu_strtod_finite() line.
>>>
>>>> + if (use_strtod) {
>>>> + suffix = suffixd;
>>>> + retval = retd;
>>>> + } else {
>>>> + suffix = suffixu;
>>>> + retval = retu;
>>>> + }
>>>>
>>>> - retval = qemu_strtod_finite(nptr, &endptr, &val);
>>>> if (retval) {
>>>> goto out;
>>>> }
>>>
>>> This is even more subtle than it looks.
>>
>> But why it is even necessary?
>>
>> The “contract” for the function used to be that it returned rounded values
>> beyond 2^53, which in itself is curious.
>>
>> But now it’s a 6-dimensional matrix of hell with NaNs and barfnots, when the
>> name implies it’s simply doing a text to u64 conversion…
>>
>> There is certainly a reason, but I’m really curious what it is :-)
>
> It all goes back to commit 9f9b17a4f0 "Introduce strtosz() library
> function to convert a string to a byte count.". To support "convenient"
> usage like "1.5G", it parses the number part with strtod(). This limits
> us to 53 bits of precision. Larger sizes get rounded.
>
> I guess the excuse for this was that when you're dealing with sizes that
> large (petabytes!), your least significant bits are zero anyway.
>
> Regardless, the interface is *awful*. We should've forced the author to
> spell it out in all its glory in a proper function contract. That tends
> to cool the enthusiasm for "convenient" syntax amazingly fast.
>
> The awful interface has been confusing people for close to a decade now.
>
> What to do?
I see. Thanks for the rationale. I knew it had to make sense :-)
I’d probably avoid strtod even with the convenient syntax above.
Do you want 1.33e-6M to be allowed? Do we want to ever
accept or generate NaN or Inf values?
>
> Tao Xu's patch tries to make the function do what its users expect,
> namely parse a bleepin' 64 bit integer, without breaking any of the
> "convenience" syntax. Turns out that's amazingly subtle. Are we making
> things less confusing or more?