Re: [PATCH v2] string_helpers: fix precision loss for some inputs
On Wed, Nov 04 2015, James Bottomley wrote: > On Wed, 2015-11-04 at 00:26 +0100, Rasmus Villemoes wrote: >> On Tue, Nov 03 2015, James Bottomley >> wrote: >> >> Please spell it U32_MAX >> > >> > Why? there's no reason not to use the arithmetic UINT_MAX here. Either >> > works, of course but UINT_MAX is standard. >> >> We're dealing with explicitly sized integers > > An integer is explicitly sized: it's 32 bits. That's why UINT_MAX is > a universal constant. In the Linux universe, yes. It's kind of amusing how you try to argue based on the UINT_MAX name being (defined in a) standard while at the same time very much rely on sizeof(int) having a value which is not specified by the standard. I repeat: >> U32_MAX is the natural name for the appropriate constant. (and it's defined right next to UINT_MAX in kernel.h, so it's not like you'd have to introduce that macro). >> Also, you could do > U32_MAX instead of >= U32_MAX, but that's unlikely >> to make any difference (well, except it might generate slightly better >> code, since it would allow gcc to just test the upper half for being 0, >> which might be cheaper on some architectures than comparing to a >> literal). > > Heh if we're going to be that concerned about the code generation, then > we should just tell gcc exactly how to do it instead of hoping it can > work it out for itself, so > > while (blk_size >> 32) { > ... Nah, that would still require the compiler to be able to transform that to the other form, which apparently it isn't. On x86_64, the simplest is to load U32_MAX once into a register and then do r/r comparisons, but when it's possible to directly test the upper half (e.g. when the 64 bit value is represented in a pair of 32 bit registers) that's much simpler. gcc generates good code for 'blk_size > U32_MAX' on both x86_64 and x86_32, but ends up doing an extra cmp on x86_32 for >=, and ends up doing mov,shift,test inside the loop on x86_64 for 'blk_size >> 32'. Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] string_helpers: fix precision loss for some inputs
On Wed, 2015-11-04 at 00:26 +0100, Rasmus Villemoes wrote: > On Tue, Nov 03 2015, James Bottomley > wrote: > > > On Tue, 2015-11-03 at 23:13 +0100, Rasmus Villemoes wrote: > >> On Tue, Nov 03 2015, James Bottomley > >> wrote: > >> > >> > From: James Bottomley > >> > > >> > It was noticed that we lose precision in the final calculation for some > >> > inputs. The most egregious example is size=3000 blk_size=1900 in units > >> > of 10 > >> > should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because > >> > the > >> > current algorithm doesn't correctly account for all the remainders in the > >> > logarithms. Fix this by doing a correct calculation in the remainders > >> > based > >> > on napier's algorithm. Additionally, now we have the correct result, we > >> > have > >> > to account for arithmetic rounding because we're printing 3 digits of > >> > precision. This means that if the fourth digit is five or greater, we > >> > have to > >> > round up, so add a section to ensure correct rounding. Finally account > >> > for > >> > all possible inputs correctly, including zero for block size. > >> > > >> > Reported-by: Vitaly Kuznetsov > >> > Cc: sta...@vger.kernel.org # delay backport by two months for > >> > testing > >> > Fixes: b9f28d863594c429e1df35a0474d2663ca28b307 > >> > Signed-off-by: James Bottomley > >> > > >> > -- > >> > > >> > v2: updated with a recommendation from Rasmus Villemoes to truncate the > >> > initial precision at just under 32 bits > >> > > >> > diff --git a/lib/string_helpers.c b/lib/string_helpers.c > >> > index 5939f63..363faca 100644 > >> > --- a/lib/string_helpers.c > >> > +++ b/lib/string_helpers.c > >> > @@ -43,38 +43,40 @@ void string_get_size(u64 size, u64 blk_size, const > >> > enum string_size_units units, > >> > [STRING_UNITS_10] = 1000, > >> > [STRING_UNITS_2] = 1024, > >> > }; > >> > -int i, j; > >> > -u32 remainder = 0, sf_cap, exp; > >> > +static const unsigned int rounding[] = { 500, 50, 5, 0}; > >> > >> j necessarily ends up being 0, 1 or 2. Any reason to include the last > >> entry? > > > > No reason beyond a vague worry someone might try to increase the printed > > precision by one digit. > > But that would seem to require prepending 5000 to that array and > changing various constants below to 1 (aside from checking all > callers to see if they pass a sufficient buffer size) - the 0 doesn't > serve any purpose in that scenario either. > > >> > + > >> > +while (blk_size >= UINT_MAX) > >> > i++; > >> > -} > >> > > >> > -exp = divisor[units] / (u32)blk_size; > >> > -/* > >> > - * size must be strictly greater than exp here to ensure that > >> > remainder > >> > - * is greater than divisor[units] coming out of the if below. > >> > - */ > >> > -if (size > exp) { > >> > -remainder = do_div(size, divisor[units]); > >> > -remainder *= blk_size; > >> > +while (size >= UINT_MAX) > >> > i++; > >> > >> Please spell it U32_MAX > > > > Why? there's no reason not to use the arithmetic UINT_MAX here. Either > > works, of course but UINT_MAX is standard. > > We're dealing with explicitly sized integers An integer is explicitly sized: it's 32 bits. That's why UINT_MAX is a universal constant. > and the comment even says > that we're reducing till we fit in 32 bits, so that we can do a > 32x32->64 multiplication. U32_MAX is the natural name for the > appropriate constant. > > Also, you could do > U32_MAX instead of >= U32_MAX, but that's unlikely > to make any difference (well, except it might generate slightly better > code, since it would allow gcc to just test the upper half for being 0, > which might be cheaper on some architectures than comparing to a > literal). Heh if we're going to be that concerned about the code generation, then we should just tell gcc exactly how to do it instead of hoping it can work it out for itself, so while (blk_size >> 32) { ... James > Rasmus > -- > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] string_helpers: fix precision loss for some inputs
On Tue, Nov 03 2015, James Bottomley wrote: > On Tue, 2015-11-03 at 23:13 +0100, Rasmus Villemoes wrote: >> On Tue, Nov 03 2015, James Bottomley >> wrote: >> >> > From: James Bottomley >> > >> > It was noticed that we lose precision in the final calculation for some >> > inputs. The most egregious example is size=3000 blk_size=1900 in units of >> > 10 >> > should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the >> > current algorithm doesn't correctly account for all the remainders in the >> > logarithms. Fix this by doing a correct calculation in the remainders >> > based >> > on napier's algorithm. Additionally, now we have the correct result, we >> > have >> > to account for arithmetic rounding because we're printing 3 digits of >> > precision. This means that if the fourth digit is five or greater, we >> > have to >> > round up, so add a section to ensure correct rounding. Finally account for >> > all possible inputs correctly, including zero for block size. >> > >> > Reported-by: Vitaly Kuznetsov >> > Cc: sta...@vger.kernel.org # delay backport by two months for testing >> > Fixes: b9f28d863594c429e1df35a0474d2663ca28b307 >> > Signed-off-by: James Bottomley >> > >> > -- >> > >> > v2: updated with a recommendation from Rasmus Villemoes to truncate the >> > initial precision at just under 32 bits >> > >> > diff --git a/lib/string_helpers.c b/lib/string_helpers.c >> > index 5939f63..363faca 100644 >> > --- a/lib/string_helpers.c >> > +++ b/lib/string_helpers.c >> > @@ -43,38 +43,40 @@ void string_get_size(u64 size, u64 blk_size, const >> > enum string_size_units units, >> >[STRING_UNITS_10] = 1000, >> >[STRING_UNITS_2] = 1024, >> >}; >> > - int i, j; >> > - u32 remainder = 0, sf_cap, exp; >> > + static const unsigned int rounding[] = { 500, 50, 5, 0}; >> >> j necessarily ends up being 0, 1 or 2. Any reason to include the last entry? > > No reason beyond a vague worry someone might try to increase the printed > precision by one digit. But that would seem to require prepending 5000 to that array and changing various constants below to 1 (aside from checking all callers to see if they pass a sufficient buffer size) - the 0 doesn't serve any purpose in that scenario either. >> > + >> > + while (blk_size >= UINT_MAX) >> >i++; >> > - } >> > >> > - exp = divisor[units] / (u32)blk_size; >> > - /* >> > - * size must be strictly greater than exp here to ensure that remainder >> > - * is greater than divisor[units] coming out of the if below. >> > - */ >> > - if (size > exp) { >> > - remainder = do_div(size, divisor[units]); >> > - remainder *= blk_size; >> > + while (size >= UINT_MAX) >> >i++; >> >> Please spell it U32_MAX > > Why? there's no reason not to use the arithmetic UINT_MAX here. Either > works, of course but UINT_MAX is standard. We're dealing with explicitly sized integers, and the comment even says that we're reducing till we fit in 32 bits, so that we can do a 32x32->64 multiplication. U32_MAX is the natural name for the appropriate constant. Also, you could do > U32_MAX instead of >= U32_MAX, but that's unlikely to make any difference (well, except it might generate slightly better code, since it would allow gcc to just test the upper half for being 0, which might be cheaper on some architectures than comparing to a literal). Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] string_helpers: fix precision loss for some inputs
On Tue, 2015-11-03 at 23:13 +0100, Rasmus Villemoes wrote: > On Tue, Nov 03 2015, James Bottomley > wrote: > > > From: James Bottomley > > > > It was noticed that we lose precision in the final calculation for some > > inputs. The most egregious example is size=3000 blk_size=1900 in units of > > 10 > > should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the > > current algorithm doesn't correctly account for all the remainders in the > > logarithms. Fix this by doing a correct calculation in the remainders based > > on napier's algorithm. Additionally, now we have the correct result, we > > have > > to account for arithmetic rounding because we're printing 3 digits of > > precision. This means that if the fourth digit is five or greater, we have > > to > > round up, so add a section to ensure correct rounding. Finally account for > > all possible inputs correctly, including zero for block size. > > > > Reported-by: Vitaly Kuznetsov > > Cc: sta...@vger.kernel.org # delay backport by two months for testing > > Fixes: b9f28d863594c429e1df35a0474d2663ca28b307 > > Signed-off-by: James Bottomley > > > > -- > > > > v2: updated with a recommendation from Rasmus Villemoes to truncate the > > initial precision at just under 32 bits > > > > diff --git a/lib/string_helpers.c b/lib/string_helpers.c > > index 5939f63..363faca 100644 > > --- a/lib/string_helpers.c > > +++ b/lib/string_helpers.c > > @@ -43,38 +43,40 @@ void string_get_size(u64 size, u64 blk_size, const enum > > string_size_units units, > > [STRING_UNITS_10] = 1000, > > [STRING_UNITS_2] = 1024, > > }; > > - int i, j; > > - u32 remainder = 0, sf_cap, exp; > > + static const unsigned int rounding[] = { 500, 50, 5, 0}; > > j necessarily ends up being 0, 1 or 2. Any reason to include the last entry? No reason beyond a vague worry someone might try to increase the printed precision by one digit. > > + > > + while (blk_size >= UINT_MAX) > > i++; > > - } > > > > - exp = divisor[units] / (u32)blk_size; > > - /* > > -* size must be strictly greater than exp here to ensure that remainder > > -* is greater than divisor[units] coming out of the if below. > > -*/ > > - if (size > exp) { > > - remainder = do_div(size, divisor[units]); > > - remainder *= blk_size; > > + while (size >= UINT_MAX) > > i++; > > Please spell it U32_MAX Why? there's no reason not to use the arithmetic UINT_MAX here. Either works, of course but UINT_MAX is standard. > . Also, it's not clear why you left out the > do_divs ;-) Over reduction. James > Rasmus > -- > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] string_helpers: fix precision loss for some inputs
On Tue, 2015-11-03 at 13:37 -0800, Bart Van Assche wrote: > On 11/03/2015 01:21 PM, James Bottomley wrote: > > + while (blk_size >= UINT_MAX) > > i++; > > (reduced CC-list) Let's keep at least the lists in cc. > Hello James, > > Is the above loop an infinite loop if blk_size >= UINT_MAX ? Yes, a little too much truncation over the original. I'll post a v3. James -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] string_helpers: fix precision loss for some inputs
On Tue, Nov 03 2015, James Bottomley wrote: > From: James Bottomley > > It was noticed that we lose precision in the final calculation for some > inputs. The most egregious example is size=3000 blk_size=1900 in units of 10 > should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the > current algorithm doesn't correctly account for all the remainders in the > logarithms. Fix this by doing a correct calculation in the remainders based > on napier's algorithm. Additionally, now we have the correct result, we have > to account for arithmetic rounding because we're printing 3 digits of > precision. This means that if the fourth digit is five or greater, we have to > round up, so add a section to ensure correct rounding. Finally account for > all possible inputs correctly, including zero for block size. > > Reported-by: Vitaly Kuznetsov > Cc: sta...@vger.kernel.org# delay backport by two months for testing > Fixes: b9f28d863594c429e1df35a0474d2663ca28b307 > Signed-off-by: James Bottomley > > -- > > v2: updated with a recommendation from Rasmus Villemoes to truncate the > initial precision at just under 32 bits > > diff --git a/lib/string_helpers.c b/lib/string_helpers.c > index 5939f63..363faca 100644 > --- a/lib/string_helpers.c > +++ b/lib/string_helpers.c > @@ -43,38 +43,40 @@ void string_get_size(u64 size, u64 blk_size, const enum > string_size_units units, > [STRING_UNITS_10] = 1000, > [STRING_UNITS_2] = 1024, > }; > - int i, j; > - u32 remainder = 0, sf_cap, exp; > + static const unsigned int rounding[] = { 500, 50, 5, 0}; j necessarily ends up being 0, 1 or 2. Any reason to include the last entry? > + > + while (blk_size >= UINT_MAX) > i++; > - } > > - exp = divisor[units] / (u32)blk_size; > - /* > - * size must be strictly greater than exp here to ensure that remainder > - * is greater than divisor[units] coming out of the if below. > - */ > - if (size > exp) { > - remainder = do_div(size, divisor[units]); > - remainder *= blk_size; > + while (size >= UINT_MAX) > i++; Please spell it U32_MAX. Also, it's not clear why you left out the do_divs ;-) Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] string_helpers: fix precision loss for some inputs
From: James Bottomley It was noticed that we lose precision in the final calculation for some inputs. The most egregious example is size=3000 blk_size=1900 in units of 10 should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the current algorithm doesn't correctly account for all the remainders in the logarithms. Fix this by doing a correct calculation in the remainders based on napier's algorithm. Additionally, now we have the correct result, we have to account for arithmetic rounding because we're printing 3 digits of precision. This means that if the fourth digit is five or greater, we have to round up, so add a section to ensure correct rounding. Finally account for all possible inputs correctly, including zero for block size. Reported-by: Vitaly Kuznetsov Cc: sta...@vger.kernel.org # delay backport by two months for testing Fixes: b9f28d863594c429e1df35a0474d2663ca28b307 Signed-off-by: James Bottomley -- v2: updated with a recommendation from Rasmus Villemoes to truncate the initial precision at just under 32 bits diff --git a/lib/string_helpers.c b/lib/string_helpers.c index 5939f63..363faca 100644 --- a/lib/string_helpers.c +++ b/lib/string_helpers.c @@ -43,38 +43,40 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units, [STRING_UNITS_10] = 1000, [STRING_UNITS_2] = 1024, }; - int i, j; - u32 remainder = 0, sf_cap, exp; + static const unsigned int rounding[] = { 500, 50, 5, 0}; + int i = 0, j; + u32 remainder = 0, sf_cap; char tmp[8]; const char *unit; tmp[0] = '\0'; - i = 0; - if (!size) + + if (blk_size == 0) + size = 0; + if (size == 0) goto out; - while (blk_size >= divisor[units]) { - remainder = do_div(blk_size, divisor[units]); + /* This is napier's algorithm. Reduce the original block size to +* +* coefficient * divisor[units]^i +* +* we do the reduction so both coefficients are just under 32 bits so +* that multiplying them together won't overflow 64 bits and we keep +* as much precision as possible in the numbers +*/ + + while (blk_size >= UINT_MAX) i++; - } - exp = divisor[units] / (u32)blk_size; - /* -* size must be strictly greater than exp here to ensure that remainder -* is greater than divisor[units] coming out of the if below. -*/ - if (size > exp) { - remainder = do_div(size, divisor[units]); - remainder *= blk_size; + while (size >= UINT_MAX) i++; - } else { - remainder *= size; - } + + /* now perform the actual multiplication keeping i as the sum of the +* two logarithms */ size *= blk_size; - size += remainder / divisor[units]; - remainder %= divisor[units]; + /* and logarithmically reduce it until it's just under the divisor */ while (size >= divisor[units]) { remainder = do_div(size, divisor[units]); i++; @@ -84,9 +86,20 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units, for (j = 0; sf_cap*10 < 1000; j++) sf_cap *= 10; + /* express the remainder as a decimal (it's currently the numerator of +* a fraction whose denominator is divisor[units]) */ + remainder *= 1000; + remainder /= divisor[units]; + + /* add a 5 to the digit below what will be printed to ensure +* an arithmetical round up and carry it through to size */ + remainder += rounding[j]; + if (remainder >= 1000) { + remainder -= 1000; + size += 1; + } + if (j) { - remainder *= 1000; - remainder /= divisor[units]; snprintf(tmp, sizeof(tmp), ".%03u", remainder); tmp[j+1] = '\0'; } -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html